Jairaj Jagarapu
@jairajjagarapu
I’m a data engineer with 4+ years building AWS/Databricks/Snowflake pipelines and AI-ready data platforms.
What I'm looking for
I’m a results-driven Data Engineer with 4+ years of experience designing and delivering enterprise-grade data platforms on AWS, Databricks, and Snowflake. I focus on building pipelines that are reliable in production, measurable in performance, and ready for analytics at scale.
In my current role at Capgemini, I engineered a fault-tolerant AWS ingestion platform integrating 10+ databases, REST APIs, and third-party sources into a centralized cloud data lake. I redesigned the Apache Airflow DAG architecture with SLA alerts, retry logic, and dead-letter queue handling, reducing pipeline failures by 50% and improving reliability for daily workflows.
I optimize for both speed and cost: I tuned PySpark executor memory allocation, shuffle partitions, and broadcast join thresholds on Databricks for multi-terabyte workloads within strict SLA windows. I also implemented watermark-based CDC incremental ingestion to eliminate costly full-table scans, improving processing turnaround by 40% while reducing AWS Glue compute costs.
I also build AI-ready data infrastructure, including RAG pipelines, vector database workflows, and LLM embedding orchestration using LangChain and the OpenAI API. I’ve contributed to semantic search initiatives using LangChain and Pinecone embedding workflows to improve contextual retrieval quality across internal knowledge platforms.
Experience
Work history, roles, and key accomplishments
Engineered a fault-tolerant AWS ingestion platform integrating 10+ data sources, enabling self-service analytics for product and finance teams. Reduced pipeline failures by 50% through an improved Airflow DAG design and cut processing turnaround by 40% by implementing watermark-based CDC incremental ingestion.
Architected and delivered 15+ scalable batch data pipelines consolidating SAP ERP and Oracle data into Hadoop HDFS and cloud storage for analytics. Migrated workflows to Airflow and reduced manual monitoring by 48%, while improving BI query runtimes by 44% using Parquet and performance tuning.
Education
Degrees, certifications, and relevant coursework
Webster University
Master of Science, Information Technology Management
Master of Science in Information Technology Management at Webster University in Saint Louis, Missouri.
Methodist College of Engineering and Technology
Bachelor of Technology, Electronics and Communication Engineering
Bachelor of Technology in Electronics and Communication Engineering from Methodist College of Engineering and Technology in Hyderabad, India.
