Open to opportunities

Hamza Khal

@hamzakhal

Staff Data Engineer scaling real-time lakehouse and distributed data platforms.

United States

What I'm looking for

I want to build and modernize AI-ready, governed data platforms: leading lakehouse and real-time streaming systems on AWS and Azure, and owning reliability, cost optimization, and compliance as a technical leader.

I’m a Staff Data Engineer with 10+ years of experience architecting and scaling distributed data platforms across fintech, healthcare, SaaS, telecom, and energy. I specialize in real-time streaming systems, cloud-native lakehouse architectures, and large-scale batch processing handling billions of events daily. I focus on governance, compliance, and high-availability distributed systems while delivering measurable reliability and cost optimization.

I’ve led enterprise platform modernization end-to-end: architecting multi-region AWS and Azure lakehouse platforms that process 8B+ transactions and IoT events daily, and designing Kafka and Flink event-driven streaming for sub-second fraud detection. At Consensus, I led an on-prem Hadoop to Databricks Lakehouse migration that reduced infrastructure costs by $2.4M annually, built a disaster recovery framework that achieved a 99.99% SLA across mission-critical workloads, and implemented data mesh governance across 14 domains to improve ownership and compliance. I’ve also optimized Spark workloads, improving performance by 60%, and enabled self-service analytics for 200+ analysts and ML engineers. This is the kind of AI-ready data foundation I’m proud to own.

Experience

Work history, roles, and key accomplishments

Current

Staff Data Engineer

Consensus

Jun 2023 - Present (3 years 1 month)

Architected multi-region AWS/Azure lakehouse platform processing 8B+ financial transactions and IoT events daily. Designed Kafka/Flink event-driven streaming for sub-second fraud detection and migrated from on-prem Hadoop to Databricks Lakehouse, cutting infrastructure costs by $2.4M annually while achieving 99.99% SLA.

AWS, Azure, Kafka, Apache Flink, Apache Spark, Disaster Recovery, Unity Catalog, Kubernetes

Senior Data Engineer

CorroHealth

Oct 2021 - Apr 2023 (1 year 6 months)

Built HIPAA-compliant ingestion pipelines integrating HL7 and FHIR streams processing 2TB+ daily. Developed Snowflake/Spark ELT and Kafka/Debezium CDC for near real-time reporting, reducing Snowflake compute costs by 35% and improving reliability via automated data validation.

Snowflake, Apache Spark, Python, Kafka, Debezium, CDC, ELT, HL7, FHIR, Data Validation

Data Engineer

Sikich

Nov 2016 - Aug 2020 (3 years 9 months)

Developed Hadoop ETL pipelines processing 4B+ telecom records daily and designed churn analytics data marts for retention modeling. Implemented Kafka ingestion to reduce batch latency by 8 hours and improved Hive query performance by 40% through strategy optimization.

Hadoop, Hive, Kafka, ETL, Data Marts, Churn Modeling, Performance Optimization, Batch Processing, SQL, Linux, Unix

Junior Data Engineer

Calendly

Mar 2015 - Sep 2016 (1 year 6 months)

Built SSIS and Informatica workflows supporting enterprise reporting and designed dimensional models and PL/SQL procedures for financial systems. Automated Unix batch processing to improve job reliability and execution consistency.