Supradeep Danturti

I want Agents to do everything.


Dec 2024 - Present
Developed AI systems using Google ADK with specialized agents for automated evaluation and decision-making, enabling workflow automation and improving operational efficiency. Built web automation systems with Playwright, implementing complex DOM navigation and element detection to extract data reliably from dynamic applications and to automate collection from previously inaccessible sources.

Architected scalable microservices platforms with FastAPI backends, integrating real-time data processing and APIs to support high-volume concurrent operations. Designed document processing pipelines with automated text extraction, semantic chunking, and handwritten text recognition across multiple formats, which significantly reduced manual processing time and added support for complex document structures.

Engineered production-ready systems with containerized deployment, comprehensive error handling, and monitoring to ensure reliability, high availability, and performance under demanding workloads. Implemented semantic search and vector matching solutions with RAG evaluation to deliver accurate content discovery, similarity matching, and high-performance retrieval.

Used Azure AI Foundry and Prompt Flow to orchestrate AI workflows and to optimize LLM prompts through systematic experimentation, benchmarking, and evaluation. Built agentic AI systems with multi-step reasoning, tool usage, and human-in-the-loop evaluation. Used Azure OpenAI evaluation metrics such as accuracy, ROUGE, and BLEU to ensure compliance, production-grade performance, and enterprise readiness.

Technologies: Python, FastAPI, React, LangChain, Playwright, PostgreSQL, Qdrant Vector DB, Docker, OpenAI API, Azure, Azure AI Foundry
May - Aug 2023
I was at GeoComply as a Data Analyst Intern for the summer of 2023. Working with the Release Management Team, I monitored 110+ deployments using Elasticsearch, Kibana, and Grafana. Built time series forecasting models, including ARIMA and SARIMA models and a stacked LSTM-based deep neural network, reaching 96% accuracy. Automated the creation of several documents using Python and SQL.
2022 - 2024
Pursuing Master of Applied Computer Science.
2022 - 2024
As a Machine Learning Engineer in the Product R&D Team, I worked on a face recognition app that automates attendance in real time with 98% accuracy. Contributed to the development of an ALPR (Automatic License Plate Detection and Recognition) system that uses computer vision and neural networks to detect and recognize license plates.
May - Dec 2021
I was a Machine Learning Engineer Intern on the MID Labs Innovation Team. Built a POC to automate the hiring process using deep neural networks and computer vision; it extracts facial information, speech patterns, and background. It also matches a candidate's resume against the job description and returns the match percentage.
2018 - 2022
B.Tech in Computer Science and Engineering. During this time I worked on projects using machine learning and deep neural networks. Built a surveillance system using deep neural networks that can detect specific situations and send alerts. More info below 👇

Certifications

Oracle Cloud Infrastructure 2024 Generative AI Professional

Oracle
Issued Jul 2024
Credential ID: OC5126228

Applied Data Science - Hitachi Vantara

2022

Microsoft Certified: Azure Fundamentals

2022 - 2023

Projects & Learnings

Overlap Speaker Counter

This project addresses the challenge of accurately counting speakers in meeting recordings where speech may overlap. This is essential for improving the accuracy of automated meeting transcriptions. To generate realistic training data, a simulator was developed that combines clean speech (LibriSpeech-clean-100) with noise and reverberation effects (Open-RIR dataset).
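The mixing step of such a simulator can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names (`convolve`, `power`, `mix_at_snr`), the toy signals, and the 10 dB SNR target are all assumptions, and plain Python lists stand in for audio loaded from LibriSpeech and Open-RIR.

```python
import math


def convolve(signal, impulse_response):
    """Add reverberation by convolving a signal with a room impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise ratio equals snr_db."""
    noise = noise[: len(speech)]
    # Noise power that yields the requested SNR in decibels.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy stand-ins for real audio (hypothetical values).
clean = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
rir = [1.0, 0.0, 0.5, 0.0, 0.25]  # direct path plus two echoes
noise = [(-1) ** t * 0.3 for t in range(200)]

reverberant = convolve(clean, rir)
noisy = mix_at_snr(reverberant, noise, snr_db=10)
```

Overlapping speech could be simulated in the same spirit by summing several reverberant utterances before adding noise, with the number of summed speakers serving as the training label.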

Two established speaker recognition models (x-vector and ECAPA-TDNN) were tested alongside a novel approach that combines a pretrained Wav2Vec 2.0 model with a linear classifier and x-vector. The system analyzes short audio segments and reports timestamps together with the detected number of speakers.
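The segment-level output described above can be pictured with a small sketch. Everything here is hypothetical: `count_speakers_per_segment`, the dictionary keys, and the `energy_stub` classifier are illustrative names, and the stub only separates silence from activity in place of a trained speaker-counting model.

```python
def count_speakers_per_segment(samples, sample_rate, segment_seconds, classify):
    """Split audio into fixed-length segments and label each with a speaker count."""
    segment_len = int(segment_seconds * sample_rate)
    results = []
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        results.append({
            "start": start / sample_rate,                # seconds
            "end": (start + len(segment)) / sample_rate,  # seconds
            "speakers": classify(segment),
        })
    return results


def energy_stub(segment, threshold=0.05):
    """Stand-in for a trained model: 0 for silence, 1 for any activity."""
    return 1 if any(abs(x) > threshold for x in segment) else 0


# Toy signal at a 4 Hz "sample rate": one silent second, then activity.
samples = [0, 0, 0, 0, 0.5, -0.5, 0.5, -0.5, 0.1, 0.2]
results = count_speakers_per_segment(samples, sample_rate=4,
                                     segment_seconds=1, classify=energy_stub)
```

A real classifier would return counts above one for overlapped speech; the windowing and timestamp logic would stay the same.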

The Wav2Vec 2.0 hybrid model significantly outperformed the other two approaches, which suggests it copes well with complex meeting environments. The work contributes a speaker counting tool to the SpeechBrain project that can benefit a wide range of speech-related applications.

Age Classification

Age Classification Using Convolutional Neural Networks

A learning project exploring how different neural network architectures and hyperparameter choices perform on a dataset. Related to COMP 6721 Applied AI.

Surveillance System

A mini version of the patent, reflecting the first step towards it :)

FIFA World Cup EDA

EDA on FIFA World Cup

A project built while learning exploratory data analysis and dashboarding with Flask and Python.

The Lost Mayan T

A short video made in Unity, featuring basic level design, Cinemachine, and particle system animation. It's not the best but you can watch it here :)

Gastrointestinal Cancer Classification

An attempt to extract features from, classify, and understand pathology images.

Deep Learning Using PyTorch

Deep Learning Labs

A collection of educational Jupyter Notebook exercises focused on deep learning concepts using PyTorch. The labs progress from foundational deep learning topics through advanced concepts, including integration with HuggingFace transformers for state-of-the-art NLP and machine learning applications.

Understanding SpeechBrain

An educational collection of hands-on lab notebooks (ConversationalAI-Labs) focused on speech and audio processing using the SpeechBrain framework. The materials progress from fundamental concepts like audio classification and CNNs through advanced techniques including transformers, speaker identification, and pre-trained models for speech recognition and generative language models.