Looking for a job

Sanskar Srivastava

@sanskarsrivastava

I build data science and ML systems for real-world AI impact.

United States

What I'm looking for

I’m looking for a team where I can build production-ready ML/LLM systems—especially RAG, real-time data pipelines, and scalable workflows—while using strong evaluation to drive measurable business impact and continuous model iteration.

I’m a Data Scientist and ML Engineer building AI systems, RAG, and deep learning models, with a strong focus on turning messy, unstructured data into structured insights. At Indiana University, I developed a dual-pipeline LLM + OCR/computer-vision redaction detection system for 40,000+ legal PDFs, reaching 92% combined accuracy, and I architected GPU-accelerated pipelines that reduced processing time by 10x.

In parallel, I extended transformer-based computational modeling for mental health discourse, using SBERT and 500K+ Reddit posts to generate triplet training data and to fine-tune models that map cognitive beliefs into belief networks. My work consistently emphasizes production readiness—scalable architectures, rigorous evaluation, and measurable outcomes—from causal churn modeling with uplift strategies to low-latency analytics and multimodal e-commerce systems.

Experience

Work history, roles, and key accomplishments

Current

Machine Learning Engineer

Indiana University

Jan 2026 - Present (5 months)

Developed a dual-pipeline redaction detection system for 40,000+ legal PDFs using a Qwen-based LLM contextual approach and OCR/computer vision, achieving 92% combined accuracy. Architected GPU-accelerated HPC pipelines to convert unstructured legal documents into structured datasets, reducing processing time by 10x and enabling analysis of court confidentiality practices.

Current

LLM Engineer

Soda Labs

Sep 2025 - Present (9 months)

Extended transformer-based computational modeling of mental-health discourse: used SBERT and 500K+ Reddit posts to generate triplet training data that captures semantic patterns in discussions of depression. Fine-tuned models to map cognitive beliefs into belief networks, identifying patterns and underlying cognitive structures in mental health conditions.

Education

Degrees, certifications, and relevant coursework

Indiana University Bloomington

Master’s in Data Science

2024 - 2026

Grade: GPA 3.94

Master’s in Data Science (GPA 3.94) with coursework including Applied Machine Learning, Data Mining, Advanced Database, Big Data principles, and Intro/Elements of AI and LLMs.

Vellore Institute of Technology

B. Tech in Computer Science and Engineering

2020 - 2024

Grade: GPA 8.7

B. Tech in Computer Science and Engineering (GPA 8.7) from Vellore Institute of Technology, completed over 2020–2024.
