Rahul Lohia

Open to opportunities

@rahullohia

Senior Data Engineer optimizing real-time and batch pipelines for scalable, cost-efficient platforms.

What I'm looking for

I want to build and harden scalable data platforms, both real-time and batch, with strong observability, data quality, and cost/performance optimization. I’m excited by roles where I can own pipelines end-to-end and deliver measurable impact.

I’m a Senior Data Engineer with 5+ years designing and optimizing production-scale data pipelines and big data platforms across travel and telecom domains. I focus on ETL/ELT engineering that improves reliability, performance, and operational clarity.

In my current role, I reduced a data platform’s monthly infrastructure cost from $85K to $4K (~95%, ~$972K/year saved) by rewriting Spark execution plans, right-sizing EMR clusters, and automating S3 storage lifecycle management. I’ve also owned production RCA and delivered permanent fixes, cutting Spark job compute and runtime by up to 80% through broadcast join tuning, partition optimization, and cluster-level configuration.

I build end-to-end pipelines with Apache Spark (Scala/PySpark) for both batch and real-time workloads. I orchestrate workflows via Apache Airflow and connect observability and alerting through Datadog, including SLA breach detection and anomaly monitoring to keep pipelines dependable.

My technical depth spans Kafka, Spark Structured Streaming, Hive, HBase, and AWS, plus data governance and quality through Collibra Data Quality (CDQ). I’ve delivered outcomes like sub-minute end-to-end latency for high-volume event streams and automated Kafka-vs-Hive reconciliation with HTML audit reporting to proactively detect data loss incidents.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Affine Analytics

Apr 2024 - Present (2 years 3 months)

Reduced a travel data platform’s monthly infrastructure cost from $85K to $4K (~95%, ~$972K/year saved) by rewriting Spark execution plans, right-sizing EMR clusters, and automating S3 storage lifecycle policies. Owned end-to-end Spark-Scala ETL to deliver curated datasets and implemented a Collibra CDQ framework (50+ validations) with Airflow orchestration and Datadog observability to improve rel

PySpark, Spark Structured Streaming, Kafka, Hive, HBase, Datadog, S3, Airflow

Data Engineer

Cognizant

Feb 2021 - Apr 2024 (3 years 2 months)

Built production Kafka and Spark Structured Streaming pipelines for telecom event ingestion, achieving sub-minute end-to-end latency on high-volume streams. Reduced Hive batch/streaming runtimes by 30–40% using dynamic partitioning, bucketing, and query optimization, and delivered secure access-management and Kafka-vs-Hive reconciliation workflows with automated HTML audit reporting.

Kafka, Spark Structured Streaming, Hive, HBase, Data Reconciliation, Query Optimization, CI/CD Pipelines, HTML Audit Reporting

Education

Degrees, certifications, and relevant coursework

Maulana Abul Kalam Azad University of Technology (MAKAUT)

Bachelor of Technology (B.Tech), Computer Science & Engineering

2017 - 2021

Grade: DGPA 8.91 / 10

Earned a B.Tech in Computer Science & Engineering at MAKAUT, Kolkata (2017–2021) with a DGPA of 8.91/10.

Tech stack

Software and tools used professionally

Splunk

Apache Spark

GitHub

Jenkins

GitHub Actions

Salesforce

PySpark

MySQL

MongoDB

Hadoop

HBase

Databricks

Slack

Python

Java

Kafka

Datadog

Docker

Airflow

SQL

Apache Iceberg

Delta Lake

Collibra

Availability

Open to opportunities

Location

India

Salary expectations

45k-80k USD

Job categories

Data Engineer, Data Quality Engineer, ETL Developer, Big Data Architect, Data Developer
