Siddharth Srivastav
@siddharthsrivastav
Data Scientist | 3+ YOE | ML, GenAI, LLMs & RAG | Building scalable AI pipelines, analytics solutions & intelligent systems.
What I'm looking for
I’m a Data Scientist focused on turning complex, messy data into reliable, production-ready AI/ML and Generative AI workflows. I specialize in building scalable data pipelines, optimizing preprocessing systems, improving model performance, and deploying intelligent AI solutions that deliver consistent, interpretable outputs for real-world applications.
In my most recent role, I built and improved preprocessing workflows for high-volume datasets and supported algorithm-integrated systems for automated classification, extraction, and report generation. I collaborated cross-functionally to productionize AI-driven workflows by integrating ML models with scalable backend services, which improved automation efficiency and reduced manual review effort.
I have hands-on experience with Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Agentic AI systems, including building intelligent workflows that combine LLM reasoning, vector databases, semantic search, prompt engineering, and automated decision-making agents. I have worked on optimizing domain-specific LLM applications, improving retrieval accuracy, enhancing context generation, and designing AI pipelines capable of handling complex user queries and multi-step tasks.
Previously, I worked on predictive healthcare modeling, improving risk prediction accuracy by ~20% using Random Forest and Logistic Regression models. I performed exploratory data analysis on 10M+ records using Python (Pandas, NumPy) and SQL, developed interactive dashboards using Python and Power BI across multiple projects, and achieved AUC scores of 0.85+ through model optimization, feature engineering, and evaluation of supervised and unsupervised learning models.
My experience also includes fine-tuning domain-specific LLMs on large-scale datasets, optimizing NLP pipelines, and reducing model training time through efficient preprocessing and scalable ML workflows. I enjoy research-driven problem solving, whether that means developing Agentic AI assistants, implementing RAG-based knowledge systems, experimenting with reinforcement learning models, or creating automated feature engineering pipelines using Scikit-learn and modern ML frameworks.
Experience
Work history, roles, and key accomplishments
Data Scientist
KD Aerospace Systems Pvt. Ltd.
Apr 2025 - Present (1 year 2 months)
• Built and optimized scalable data preprocessing workflows for high-volume AI/ML datasets
• Developed automated classification, extraction, and reporting pipelines for production systems
• Improved data quality, validation, and consistency of model outputs through transformation techniques
• Collaborated with teams to deploy ML workflows and reduce manual effort through automation
Data Scientist
Atrobot Drones Pvt. Ltd.
Nov 2023 - Mar 2025 (1 year 4 months)
• Built predictive healthcare ML models, improving risk prediction accuracy by ~20% using Random Forest & Logistic Regression
• Analyzed 10M+ clinical records using Python, SQL, Pandas & NumPy to generate actionable insights
• Created Power BI dashboards across 3+ projects for data-driven decisions
• Tuned ML models to achieve 0.85+ AUC and automated preprocessing pipelines
Data Analyst Intern
Ameriprise Financial
Feb 2023 - Aug 2023 (6 months)
Analyzed the effects of a major event on company sales, AUM, and advisor redemptions using 6 months of pre- and post-event data. Automated data processing and reporting tasks and built a Python module to support causal inference when control groups were not properly sampled.
Education
Degrees, certifications, and relevant coursework
Jaypee Institute of Information Technology
Bachelor of Technology, Computer Science and Engineering
2019 - 2023
Grade: 7.6 (CGPA)
Earned a Bachelor of Technology in Computer Science and Engineering (CSE) at Jaypee Institute of Information Technology from 2019 to 2023, with a CGPA of 7.6.
