Open to opportunities

Anupam Srivastava

@anupamsrivastava

Data Engineer specializing in scalable batch and near real-time pipelines on AWS.

What I'm looking for

I’m looking for a role where I can build trusted, scalable batch and streaming data pipelines, optimize Spark performance and cloud cost, and deliver high-impact analytics. I want ownership across ETL, data quality, and explainability/GenAI workflows.

I’m a Data Engineer with 4+ years of experience building scalable batch and near real-time data pipelines using PySpark, SQL, AWS, and Snowflake. I focus on ETL optimization, dimensional modeling, and analytical dataset design that helps teams make data-driven decisions with confidence.

At Piramal Finance, I architected the migration of 10+ production ETL pipelines from Python to PySpark on AWS Glue, improving fault tolerance and reducing processing latency by 83% (from 2 hours to 20 minutes). I also built near real-time streaming pipelines using Apache Kafka on Amazon EMR, ingesting 2M+ events daily into Snowflake and feeding 5+ dashboards.

I’ve consistently improved reliability and efficiency, delivering $36K+ in annual cloud cost savings by profiling and optimizing Spark execution plans, refining partitioning strategies, and right-sizing AWS Glue clusters. I also developed an LLM-powered explainability workflow using Amazon Bedrock (Claude 3.5 Sonnet) to automatically identify and explain incentive payout factors for 15,000+ employees.

I design pipelines with lineage, quality, and performance in mind, including a multi-stage incentive computation pipeline that uses Snowflake stored procedures with daily snapshot versioning and an AWS Glue delivery workflow. I’ve mentored 2 engineers and reduced production incidents by 30% through validation checks, performance tuning, and close collaboration with business stakeholders.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer

Piramal Finance

Mar 2024 - Present (2 years 4 months)

Architected the migration of 10+ production ETL pipelines from Python to PySpark on AWS Glue, improving fault tolerance and reducing processing latency by 83% (2 hours to 20 minutes). Built Kafka-to-Snowflake near real-time ingestion (2M+ events/day) feeding 5+ dashboards. Delivered $36K+ in annual cloud cost savings and implemented Bedrock-powered explainability for 15,000+ employees.

Kafka, PySpark, AWS Glue, Amazon EMR, Snowflake, AWS, AI, Bedrock, Airflow, Power BI

Associate Data Engineer

Piramal Finance

Jun 2022 - Mar 2024 (1 year 9 months)

Designed and developed PySpark and AWS Glue data pipelines orchestrated via Airflow DAGs, processing 10M+ records daily to power analytics and reporting. Maintained Snowflake source-of-truth datasets using SCD Type 2 (70M+ records) and reduced production data defects by 40% by implementing SPC-based (statistical process control) data quality checks and anomaly detection across 10 tables.

Airflow, PySpark, AWS Glue, Snowflake, ETL, ELT, Data Quality, Power BI

Software Development Intern

Piramal Finance

Feb 2022 - Jun 2022 (4 months)

Converted and validated 25+ SQL scripts during data migration from PostgreSQL to Snowflake, troubleshooting processing issues to ensure data consistency across environments. Reduced manual reporting effort by 30% by enabling self-serve analytics through Power BI dashboards.

SQL, Snowflake, PostgreSQL, Data Migration, Data Validation, ETL, ELT, Power BI, Debugging

Education

Degrees, certifications, and relevant coursework

Indian Institute of Information Technology, Surat

Bachelor of Technology, Computer Science and Engineering

2018 - 2022

Bachelor of Technology in Computer Science and Engineering from IIIT Surat (2018–2022).

Tech stack

Software and tools used professionally

Apache Spark

AWS Glue

GitHub

PySpark

PostgreSQL

MongoDB

Gmail

Databricks

Kafka

Airflow

s3-lambda

SQL

Delta Lake

Availability

Open to opportunities

Location

India

Authorized to work in

Job categories

Data Engineer, Big Data Engineer, Streaming Data Engineer, ETL Engineer, Analytics Engineering, Data Engineering, Data Engineering Specialist, Data Engineering Positions, Data
