Open to opportunities

Akash Soni

@akashsoni3

Senior Data Engineer crafting scalable AWS streaming and ETL pipelines to cut costs and accelerate real-time analytics.

India

What I'm looking for

I’m looking for a Senior Data Engineering role where I can build reliable AWS streaming/ETL systems, own CDC and data lake patterns, and deliver measurable cost and performance wins with strong monitoring, automation, and engineering collaboration.

I’m a results-driven Senior Data Engineer with 8.5+ years of experience designing, building, and optimizing scalable data pipelines, streaming solutions, and cloud-based ETL workflows. I specialize in AWS Glue, Apache Spark (PySpark), Snowflake, and data lake architectures, with a strong focus on reliability and end-to-end performance.

I’ve delivered measurable outcomes, including cutting processing time by about 30% and cloud costs by about 25% through Spark optimization, partitioning strategies, and warehouse right-sizing. I build real-time CDC pipelines and handle schema evolution to support high-availability enterprise environments and trustworthy analytics.

Most recently, I designed a real-time CDC streaming pipeline for transaction monitoring using Amazon MSK (Kafka) and AWS Glue, capturing changes from PostgreSQL and landing them in S3. I also implemented schema evolution handling with AWS Glue Schema Registry and used Apache Hudi to provide ACID-compliant, efficient upserts for downstream Athena queries.

Previously, I owned AWS-based pipeline architecture for large-scale BI workloads, building Glue ETL integrated with Snowflake and Amazon Athena. I’ve also led platform modernization efforts: migrating Hadoop to AWS, implementing CI/CD and workflow automation with Control-M and Apache Airflow, and strengthening monitoring with CloudWatch and Datadog to improve SLA adherence.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Worldpay Pvt Ltd

Jan 2025 - Present (1 year 5 months)

Designed and implemented a real-time CDC streaming pipeline using Amazon MSK (Kafka) and AWS Glue to capture PostgreSQL changes into S3 for near real-time transaction analytics. Built ACID-compliant data lake upserts with Apache Hudi, added schema evolution handling with AWS Glue Schema Registry, and improved reliability through Airflow automation plus CloudWatch/Datadog monitoring.

Senior Data Engineer

Advance Auto Parts

Dec 2021 - Dec 2024 (3 years)

Owned end-to-end AWS data pipeline architecture for large-scale BI workloads using AWS Glue ETL with Snowflake and Amazon Athena. Reduced end-to-end job processing time by ~30% via PySpark optimization and incremental/dynamic partitioning, and cut overall cloud costs by ~25% through S3 storage optimization, Snowflake warehouse right-sizing, and query tuning.

Data Engineer

Alten India

Apr 2021 - Nov 2021 (7 months)

Migrated aircraft maintenance data from on-premises Hadoop to AWS S3 to improve scalability and reduce infrastructure overhead. Built PySpark transformation pipelines, deployed automated CI/CD for ETL jobs, and scheduled and monitored production workflows with Control-M.

Data Engineer

Ezyloads Technology

Sep 2017 - Dec 2018 (1 year 3 months)

Developed data pipelines for truck tracking and logistics analytics by integrating multiple sources into Hadoop and MySQL. Applied PySpark for real-time fleet data processing, built SQL/Python-based driver performance and incentive models, and optimized data warehouse query performance to improve report generation efficiency.

Education

Degrees, certifications, and relevant coursework

Greater Noida Institute of Technology

Bachelor of Technology, Computer Science & Engineering

Earned a B.Tech in Computer Science & Engineering from Greater Noida Institute of Technology (AKTU).
