Open to opportunities

Souptik Bose

@souptikbose

I’m a Senior Data Engineer modernizing large-scale AWS distributed data platforms using PySpark and Databricks.

India

What I'm looking for

I’m looking for a role where I can own AWS-based distributed data platforms: building reliable ETL/ELT pipelines, tuning Spark performance, and improving orchestration and monitoring, while partnering with stakeholders on analytics and cloud modernization.

Senior Data Engineer with 5+ years of experience designing, optimizing, and modernizing large-scale distributed data platforms on AWS cloud environments. I specialize in turning enterprise needs into high-performance analytics infrastructure, including processing enterprise datasets exceeding 150TB.

I bring deep hands-on expertise across PySpark, Spark SQL, AWS EMR, AWS Glue, Redshift, and Databricks. I architected and optimized scalable AWS-based distributed ETL pipelines, built cloud data warehousing using Amazon Redshift and S3, and led modernization efforts to migrate legacy systems to cloud-native architectures.

I automate ingestion and ETL using PySpark, Spark SQL, and AWS Glue, reducing manual processing and improving reliability. I also strengthen end-to-end operations with workflow orchestration (Azkaban, cron jobs), CI/CD-driven deployment, and Spark performance tuning through partitioning, shuffle optimization, caching, and query and cluster optimization.

On the data layer, I design dimensional models and implement Slowly Changing Dimensions (SCD) to support scalable business intelligence. I focus on stakeholder collaboration, governance, privacy, and compliance, and have been recognized for outstanding contribution, improved ETL efficiency, and enterprise-scale operational excellence.

Experience

Work history, roles, and key accomplishments

Senior Data Engineer

ZS Associates India Pvt. Ltd.

Architected and optimized AWS-based distributed ETL pipelines processing 150TB+ enterprise datasets for global pharmaceutical clients. Applied Spark performance tuning and dimensional data modeling (SCD) to improve execution efficiency and analytics reliability, alongside CI/CD-enabled deployments and job orchestration.

Education

Degrees, certifications, and relevant coursework

Maulana Abul Kalam Azad University of Technology (MAKAUT)

Bachelor of Engineering, Information Technology

2016 - 2020

Bachelor of Engineering in Information Technology from MAKAUT. Completed the program from 2016 to 2020.

West Bengal Council of Higher Secondary Education (WBCHSE)

Higher Secondary Education

2014 - 2016

Completed higher secondary education via WBCHSE from 2014 to 2016.

West Bengal Board of Secondary Education (WBBSE)

Secondary Education

Completed secondary education through WBBSE.
