Open to opportunities

Dipesh Adhikari

@dipeshadhikari

Experienced data engineer specializing in scalable data solutions.

United States

What I'm looking for

I seek a role that fosters innovation, collaboration, and growth in data engineering.

I have over 7 years of experience designing and implementing scalable data pipelines and distributed systems on Azure, AWS, and GCP. My expertise lies in building Spark-based data workflows in Databricks, where I reduced compute costs by 30% while enabling both batch and real-time processing at scale.

I have developed modular PySpark ETL pipelines orchestrated with Apache Airflow, improving reusability and cutting onboarding time by 40%. I have also architected cross-cloud ingestion frameworks with Kafka, Spark Streaming, and Delta Lake that support low-latency, multi-region analytics. I am passionate about building machine learning pipelines with SageMaker and MLflow to make data processing more efficient.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Goldman Sachs

Apr 2022 - Present (4 years 4 months)

Developed cross-cloud Spark SQL pipelines in Databricks, reducing compute cost by 30% through cluster tuning. Built scalable PySpark and Scala ETL pipelines orchestrated with Airflow, supporting both batch and real-time workflows across Azure, AWS, and GCP.

Databricks PySpark Scala ETL Airflow Azure AWS GCP Kafka Delta Lake Kubernetes Docker Jenkins Snowflake SQL Grafana Splunk MLflow Spark NLP Polars Pandas JavaScript Python JSON Datadog GitHub Actions XML

Data Engineer

Change Healthcare

Oct 2019 - Present (6 years 10 months)

Built and optimized Spark-based ETL workflows using Scala, PySpark, Spark SQL, and Spark Streaming across batch and real-time environments. Automated and orchestrated data pipelines using Airflow, shell scripting, and Jenkins, reducing production incidents and improving deployment velocity by 25%.

ETL Developer

Barclays

Sep 2016 - Present (9 years 11 months)

Partnered with DevOps to architect hybrid cloud workflows across Azure, AWS, and GCP, provisioning infrastructure with Terraform and YAML. Migrated legacy Hadoop/HDFS pipelines and their Scala-based Spark jobs to Azure Databricks, using a medallion architecture and Delta Lake.

Education

Degrees, certifications, and relevant coursework

Sam Houston State University

Master's, Data Science and Statistics

Completed a Master's degree in Data Science and Statistics. Gained expertise in statistical analysis, machine learning, and data modeling techniques relevant to modern data challenges.