Open to opportunities

Jairaj Jagarapu

@jairajjagarapu

I’m a data engineer with 4+ years building AWS/Databricks/Snowflake pipelines and AI-ready data platforms.

United States

What I'm looking for

I’m looking to build reliable, scalable data platforms that improve pipeline success rates, cost efficiency, and performance, while expanding AI-ready capabilities like RAG, embeddings, and semantic search on cloud lakehouse stacks.

I’m a results-driven Data Engineer with 4+ years of experience designing and delivering enterprise-grade data platforms on AWS, Databricks, and Snowflake. I focus on building pipelines that are reliable in production, measurable in performance, and ready for analytics at scale.

In my current role at Capgemini, I engineered a fault-tolerant AWS ingestion platform integrating 10+ databases, REST APIs, and third-party sources into a centralized cloud data lake. I redesigned Apache Airflow DAG architecture with SLA alerts, retry logic, and dead-letter queue handling—reducing pipeline failures by 50% and improving reliability for daily workflows.

I optimize for both speed and cost: I tuned PySpark executor memory allocation, shuffle partitions, and broadcast join thresholds on Databricks for multi-terabyte workloads within strict SLA windows. I also implemented watermark-based CDC incremental ingestion to eliminate costly full-table scans, improving processing turnaround by 40% while reducing AWS Glue compute costs.

I also build AI-ready data infrastructure, including RAG pipelines, vector database workflows, and LLM embedding orchestration using LangChain and the OpenAI API. I’ve contributed to semantic search initiatives that use LangChain and Pinecone embedding workflows to improve contextual retrieval quality across internal knowledge platforms.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer

Jan 2025 - Present (1 year 5 months)

Engineered a fault-tolerant AWS ingestion platform integrating 10+ data sources, enabling self-service analytics for product and finance teams. Reduced pipeline failures by 50% through an improved Airflow DAG design and cut processing turnaround by 40% by implementing watermark-based CDC incremental ingestion.

Education

Degrees, certifications, and relevant coursework


Webster University

Master of Science, Information Technology Management

Master of Science in Information Technology Management at Webster University in Saint Louis, Missouri.


Methodist College of Engineering and Technology

Bachelor of Technology, Electronics and Communication Engineering

Bachelor of Technology in Electronics and Communication Engineering from Methodist College of Engineering and Technology in Hyderabad, India.
