Open to opportunities

Ali Shahid

@alishahid1

Senior Data Engineer specializing in cloud-scale Lakehouse architectures, streaming systems, and data reliability.

United States
What I'm looking for

I seek a senior data platform role building scalable, governed Lakehouse and streaming systems with strong DataOps, observability, and cost-efficiency at enterprise scale.

I am a Senior Data Engineer with 13 years of experience designing, building, and running cloud-scale data platforms across AWS, Azure, and GCP. I specialize in Lakehouse architectures, scalable batch and streaming pipelines (Spark, Flink, Kafka), CDC, and data governance to deliver reliable, well-governed data for analytics and ML.

In recent roles, I architected cloud-native Lakehouse platforms using Delta Lake and Apache Iceberg, built real-time ingestion and CDC pipelines with Flink, Kafka, and Debezium, and implemented metadata-driven orchestration with Airflow and Dagster. I have also driven performance optimizations for Spark workloads, standardized platform infrastructure with Terraform and Kubernetes, and established observability with OpenTelemetry, Prometheus, and Grafana.

I bring a strong programming background in Python, Scala, SQL, Go, and Rust, and a practical focus on DataOps automation, data quality, cost-efficient platform design, and federated analytics. I seek to apply these skills to build reliable, scalable data platforms that empower analytics, real-time reporting, and machine learning.

Experience

Work history, roles, and key accomplishments

Current

Staff Data Engineer

Datafold

Sep 2021 - Present (4 years 9 months)

Led design and evolution of a cloud-native Lakehouse platform across AWS and Azure, built real-time CDC and ingestion pipelines with Flink, Kafka, and Debezium, and implemented metadata-driven orchestration and observability to reduce incidents and optimize compute costs.

Data Engineer

AlphaSense

Jul 2015 - Mar 2019 (3 years 8 months)

Built and maintained large-scale ETL pipelines on AWS EMR and Azure HDInsight using PySpark/Scala, implemented CDC with Kafka Connect/Debezium, and automated data quality checks to improve pipeline reliability and reduce compute costs.

Junior Data Engineer

Enigma Technologies

May 2012 - May 2015 (3 years)

Developed foundational ETL pipelines with Talend, Python, and SQL to ingest ERP/CRM data into PostgreSQL and Hadoop, designed dimensional models for BI, and supported migration of on-prem Hadoop workloads to AWS S3/EMR.

Education

Degrees, certifications, and relevant coursework

Punjab University

Bachelor of Science, Computer Science

Completed a Bachelor of Science in Computer Science focused on core computing principles and software development.
