I’m looking for a team where I can build production-ready ML/LLM systems—especially RAG, real-time data pipelines, and scalable workflows—while using strong evaluation to drive measurable business impact and continuous model iteration.
Sanskar Srivastava
@sanskarsrivastava
I build data science and ML systems for real-world AI impact.
What I'm looking for
I’m a Data Scientist and ML Engineer building AI systems, RAG, and deep learning models, with a strong focus on turning messy, unstructured data into structured insights. At Indiana University, I developed a dual-pipeline LLM + OCR/computer-vision redaction detection system for 40,000+ legal PDFs, reaching 92% combined accuracy, and I architected GPU-accelerated pipelines that reduced processing time by 10x.
In parallel, I extended transformer-based computational modeling for mental health discourse, using SBERT and 500K+ Reddit posts to generate triplet training data and to fine-tune models that map cognitive beliefs into belief networks. My work consistently emphasizes production readiness—scalable architectures, rigorous evaluation, and measurable outcomes—from causal churn modeling with uplift strategies to low-latency analytics and multimodal e-commerce systems.
Experience
Work history, roles, and key accomplishments
Machine Learning Engineer
Indiana University
Jan 2026 - Present (5 months)
Developed a dual-pipeline redaction detection system for 40,000+ legal PDFs, combining a Qwen-based LLM for contextual analysis with OCR/computer vision, and achieving 92% combined accuracy. Architected GPU-accelerated HPC pipelines that convert unstructured legal documents into structured datasets, reducing processing time by 10x and enabling analysis of court confidentiality practices.
LLM Engineer
Soda Labs
Sep 2025 - Present (9 months)
Extended transformer-based computational modeling of mental-health discourse, using SBERT on 500K+ Reddit posts and generating triplet training data to capture semantic patterns in discussions of depression. Fine-tuned models to map cognitive beliefs into belief networks, identifying the cognitive structures underlying mental health conditions.
Education
Degrees, certifications, and relevant coursework
Indiana University Bloomington
Master’s in Data Science
2024 - 2026
Grade: GPA 3.94
Coursework included Applied Machine Learning, Data Mining, Advanced Database, Big Data principles, and Intro/Elements of AI and LLMs.
Vellore Institute of Technology
B. Tech in Computer Science and Engineering
2020 - 2024
Grade: GPA 8.7
