Muhammad Khan
@muhammadkhan12
Lead Data Engineer and Cloud Data Architect building scalable lakehouse and real-time data platforms.
What I'm looking for
I am a Lead Data Engineer and Cloud Data Architect with 9+ years designing scalable enterprise data platforms across healthcare, finance, SaaS, and IoT domains. I specialize in cloud-native lakehouse architectures, real-time streaming pipelines, and modern ETL/ELT ecosystems on AWS, Azure, and GCP.
I've delivered high-availability systems that process hundreds of millions of events daily, improved performance and governance, and reduced cloud costs through migrations and optimization. I built HIPAA-compliant pipelines and promoted data quality and lineage for regulated environments.
My hands-on technical work includes Databricks, Snowflake, Spark, Kafka, dbt, Airflow, Terraform, and CI/CD practices, enabling analytics, AI/ML, and real-time decisioning. I led multi-cloud modernization projects that cut transformation time and infrastructure costs significantly.
I am a mentor and technical leader who modernizes legacy platforms, establishes best practices for cloud architecture and DevOps automation, and enables teams to deliver production-ready, secure data solutions that drive executive decision-making.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
DataFold
Jun 2024 - Present (1 year 9 months)
Architected enterprise-scale cloud-native data platforms on AWS and Azure processing millions of records daily; built real-time streaming ingestion and Snowflake/Databricks lakehouse architectures that reduced data latency from hours to minutes and enabled ML workloads.
Senior Data Engineer
MedeAnalytics
Oct 2019 - Dec 2023 (4 years 2 months)
Engineered batch and streaming pipelines across multi-terabyte datasets and optimized Snowflake and Azure Synapse environments to support finance and operational reporting, significantly reducing query runtimes and modernizing legacy ETL.
ETL & Data Warehouse Engineer
Pact-One Solutions
Mar 2017 - Sep 2019 (2 years 6 months)
Developed enterprise ETL pipelines and dimensional models using SQL Server, Python, SSIS, and Azure Data Factory; implemented near real-time ingestion with Kafka and Spark and improved reporting accuracy via validation and monitoring frameworks.
Education
Degrees, certifications, and relevant coursework
Muhammad hasn't added their education
Tech stack
Software and tools used professionally
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
GitHub
GitLab
Kubernetes
Jenkins
GitHub Actions
GitLab CI
dbt
PostgreSQL
MongoDB
Cassandra
Hadoop
Gmail
Databricks
Redis
Terraform
Pulumi
Java
TensorFlow
MLflow
scikit-learn
Kafka
Apache NiFi
Grafana
Prometheus
Datadog
Vercel
Kafka Streams
Airflow
Apache Beam
Time Analytics
SQL
Apache Iceberg
Datafold
Monte Carlo
Delta Lake
Great Expectations
ArgoCD
Apache Hudi
Collibra
Bash
Lakehouse.ai
Unity Catalog
Factory
Beam
Portfolio
khan-portfolio-ten.vercel.app
