Looking for a job

Pemba Moktan

@pembamoktan

Senior Data Engineer with 6+ years of experience in cloud, ETL, ML pipelines, and data compliance.

United States

What I'm looking for

I'm seeking a Senior Data Engineer role focused on building scalable cloud-based data platforms using Spark, Airflow, dbt, and Databricks. I value teams that prioritize clean architecture, automation, and governance. I'm especially interested in impactful domains like healthcare or finance, and I enjoy mentoring, learning, and taking ownership of end-to-end data solutions.

I'm a Senior Data Engineer with over 6 years of experience designing and deploying cloud-native data solutions across AWS, Azure, and GCP. My background spans healthcare, insurance, and financial domains, where I’ve led the development of secure, scalable, and high-performance ETL and real-time ML pipelines using tools like Apache Spark, Airflow, Kafka, dbt, and Databricks.

One of my proudest achievements was modernizing a legacy data platform at DaVita, which reduced processing time by 50% and increased SLA compliance by 30%. I’ve also built analytics environments compliant with HIPAA, GDPR, and SOX, enhancing data security and governance across organizations.

Beyond hands-on development, I enjoy mentoring junior engineers, improving CI/CD workflows, and contributing to architectural decisions that align data strategies with business goals. I'm deeply passionate about building systems that are not just technically sound, but also drive real business impact.

Outside of work, I enjoy exploring emerging data tools, contributing to open-source projects, and staying active in the data engineering community.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Johnson & Johnson

Mar 2023 - Present (3 years 4 months)

Designed and deployed cross-cloud data pipelines across Azure, AWS, and GCP, supporting batch and streaming workloads for advanced analytics, machine learning inference, and real-time business reporting, enabling 24/7 availability across multi-region deployments. Engineered Spark SQL pipelines in Databricks to process diverse data formats (JSON, Parquet, Avro).

Senior Data Engineer

Goldman Sachs

Jul 2020 - Jan 2023 (2 years 6 months)

● Architected hybrid cloud analytics workflows on Azure, provisioning infrastructure as code using Terraform and YAML to support scalable, governed pipelines across enterprise datasets.
● Modernized legacy Hadoop pipelines by replatforming Scala-based Spark jobs to Azure Databricks using Delta Lake and Medallion Architecture, improving data reliability and developer velocity.

Senior Data Engineer

DaVita Kidney Care

Jan 2018 - Apr 2020 (2 years 3 months)

● Gathered business requirements, performed business analysis, and designed various data products.
● Built and automated Spark-based ETL workflows across legacy (Informatica + Hive) and modern (Talend + Snowflake/SQL Server) environments, using Airflow and shell scripting to streamline daily production processes.