Open to opportunities

Vamsi Sai Garapati

@vamsisaigarapati

Senior data engineer building scalable Spark/Databricks pipelines and AI/ML vector RAG systems across AWS and Azure.

United States

What I'm looking for

I’m looking for a role where I can own cloud-scale data platforms—building reliable pipelines with strong SLAs, measurable cost savings, and production-grade AI/ML data workflows for RAG and analytics.

I’m a senior data engineer with 5+ years of experience building Big Data platforms and scalable end-to-end data pipelines on Spark, Databricks, and Snowflake across AWS and Azure. I’ve delivered $650K in cloud savings, maintained a 99% SLA on 200M+ records/day, and led a Teradata-to-Delta Lake migration to modernize enterprise analytics.

I also engineer AI/ML data pipelines for production RAG systems—chunking, embedding, and persisting into vector databases for low-latency semantic retrieval—while orchestrating retraining and re-embedding workflows with Airflow and CI/CD-deployed pipelines. Across my roles, I’ve owned streaming and batch ingestion, built data-quality frameworks (bronze/silver/gold with alerting and lineage), operationalized 120+ Airflow/Databricks workloads, and automated releases with Jenkins on Kubernetes for faster cycles and fewer incidents.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer (RAG Pipelines)

Saayam for All

Feb 2026 - Present (4 months)

Built and streamlined end-to-end data pipelines for production RAG systems, ingesting PostgreSQL and Snowflake sources and persisting embeddings to ChromaDB for low-latency semantic retrieval. Orchestrated retraining and re-embedding workflows with Airflow, using CI/CD-deployed, versioned pipelines to enable weekly vector index updates.


Data Engineer

Infosys

Nov 2020 - Aug 2024 (3 years 9 months)

Owned two enterprise data programs for Nike NA and China, delivering self-serve insights across 15+ ETL/ELT pipelines and sustaining 99% SLA on 200M+ records/day. Migrated legacy Teradata workloads to a Databricks Delta Lake medallion architecture with CDC, dbt models, and column-level security, accelerating delivery timelines by 30%, while cutting annual cloud spend by $650K and reducing operatio

Education

Degrees, certifications, and relevant coursework


University at Buffalo

Master of Science, Data Science

2024 - 2025

Grade: 3.9/4.0

Earned a Master of Science in Data Science at the University at Buffalo (Aug 2024–Dec 2025), achieving a 3.9/4.0 GPA.


Andhra University

Bachelor of Technology, Information Technology

2016 - 2020

Grade: 3.7/4.0

Earned a B.Tech in Information Technology at Andhra University (Jun 2016–Apr 2020), achieving a 3.7/4.0 GPA.
