Open to opportunities

Rahul Sahu

@rahulsahu2

Senior Data Engineer focused on cloud data pipelines, Snowflake, and real-time analytics optimization.

India

What I'm looking for

I’m looking for a growth-oriented team where I can architect scalable cloud data pipelines, optimize Snowflake and data warehouse performance, and deliver reliable real-time analytics with strong data governance, turning data into decisions with measurable impact.

I’m a Data Engineer with 6+ years of experience building and optimizing large-scale data pipelines across retail, finance, and education. I bring deep expertise in cloud data engineering and data warehousing to help teams turn complex datasets into reliable business insights.

At Tripadvisor, I architected and optimized Snowflake pipelines that ingest and transform diverse travel datasets, processing over 500 GB of daily incremental updates. I engineered a unified 360-degree traveler view by merging on-platform interactions and off-platform bookings, improving personalized travel recommendations by 15%.

I also focused on real-time impact: I used Snowflake Streams and Tasks for CDC and Snowpark for complex Python-based transformations, reducing end-to-end data latency from 4 hours to under 30 minutes. To keep analytics trustworthy, I implemented data governance and quality frameworks (99.9% data accuracy) and optimized warehouse performance and cost, cutting monthly Snowflake credits by 20%.

Earlier, at Credit Saison, I reduced Athena query scan costs by 97% through S3 partitioning and Glue metadata management, and built PySpark transformation jobs for financial datasets. At Embibe, I developed batch and streaming pipelines using Spark and Kafka and built real-time ranking with Kafka, Spark, and Redis. That work grounded my engineering style in performance, correctness, and practical delivery.

Experience

Work history, roles, and key accomplishments

Data Engineer 2

Tripadvisor

Feb 2025 - Dec 2025 (10 months)

Architected and optimized Snowflake pipelines ingesting and transforming diverse travel datasets with 500 GB/day of incremental updates, enabling a 360-degree traveler view that improved personalized travel recommendations by 15%. Reduced end-to-end data latency from 4 hours to under 30 minutes using CDC, and cut monthly Snowflake credits by 20% through warehouse performance and cost tuning.

Snowflake, SQL, Python, Data Warehousing, Data Modeling, Performance Tuning

Data Engineer 2

Tripadvisor

May 2024 - Jan 2025 (8 months)

Built and managed Whampipe-orchestrated ETL/ELT workflows using advanced SQL to automate travel data movement across multi-cloud environments. Tuned query execution and partitioning to cut peak-season ETL processing time by 25% and added automated SQL validations to detect schema drift and data anomalies.

Snowflake, SQL, Whampipe, ETL, Window Functions, Amazon Redshift, Data Validation

Data Engineer 2

Blackbuck Insights

Feb 2022 - May 2024 (2 years 3 months)

Developed ingestion pipelines to load SFTP and GCS files into BigQuery for centralized processing, supporting batch analytics and CDP data structures. Built Airflow (Composer) batch ETL for CSV/JSON/Parquet into BigQuery and used SparkSQL/PySpark with Dataproc to explore datasets for ad performance and segmentation.

BigQuery, Airflow, Python, SparkSQL, PySpark, Data Ingestion

Data Engineer 2

Credit Saison

Nov 2020 - Jan 2022 (1 year 2 months)

Managed an AWS data lake on S3 with metadata cataloging in AWS Glue and crawling via Lambda to support NBFC financial analytics. Implemented PySpark transformation jobs and reduced Athena query scan costs by 97% using S3 partitioning strategies.

AWS S3, AWS Glue, AWS Lambda, Amazon Athena, PySpark, SQL, Data Lake, Partitioning, ETL Development

Data Engineer

Embibe

Jul 2019 - Nov 2020 (1 year 4 months)

Built batch and streaming pipelines using Spark (Scala) to process examination data and integrated real-time ingestion with Kafka. Developed student ranking pipelines using Kafka, Spark, and Redis and delivered exam trend insights to support business decision-making.