Hamza Khal
@hamzakhal
Staff Data Engineer scaling real-time lakehouse and distributed data platforms.
What I'm looking for
I’m a Staff Data Engineer with 10+ years of experience architecting and scaling distributed data platforms across fintech, healthcare, SaaS, telecom, and energy. I specialize in real-time streaming systems, cloud-native lakehouse architectures, and large-scale batch processing handling billions of events daily. I focus on governance, compliance, and high-availability distributed systems while delivering measurable reliability and cost optimization.
I’ve led enterprise platform modernization end to end: architecting multi-region AWS and Azure lakehouse platforms that process 8B+ transactions and IoT events daily, and designing Kafka + Flink event-driven streaming for sub-second fraud detection. At Consensus, I led an on-prem Hadoop to Databricks Lakehouse migration that reduced infrastructure costs by $2.4M annually, built a disaster recovery framework achieving a 99.99% SLA across mission-critical workloads, and implemented data mesh governance across 14 domains to improve ownership and compliance. I’ve also optimized Spark workloads, improving performance by 60%, and enabled self-service analytics for 200+ analysts and ML engineers, building the kind of AI-ready data foundation I’m proud to own.
Experience
Work history, roles, and key accomplishments
Staff Data Engineer
Consensus
Jun 2023 - Present (2 years 10 months)
Architected multi-region AWS/Azure lakehouse platform processing 8B+ financial transactions and IoT events daily. Designed Kafka/Flink event-driven streaming for sub-second fraud detection and migrated from on-prem Hadoop to Databricks Lakehouse, cutting infrastructure costs by $2.4M annually while achieving 99.99% SLA.
Senior Data Engineer
CorroHealth
Oct 2021 - Apr 2023 (1 year 6 months)
Built HIPAA-compliant ingestion pipelines integrating HL7 and FHIR streams processing 2TB+ daily. Developed Snowflake/Spark ELT and Kafka/Debezium CDC for near real-time reporting, reducing Snowflake compute costs by 35% and improving reliability via automated data validation.
Data Engineer
Sikich
Nov 2016 - Aug 2020 (3 years 9 months)
Developed Hadoop ETL pipelines processing 4B+ telecom records daily and designed churn analytics data marts for retention modeling. Implemented Kafka ingestion to reduce batch latency by 8 hours and improved Hive query performance by 40% through strategy optimization.
Junior Data Engineer
Calendly
Mar 2015 - Sep 2016 (1 year 6 months)
Built SSIS and Informatica workflows supporting enterprise reporting and designed dimensional models and PL/SQL procedures for financial systems. Automated Unix batch processing to improve job reliability and execution consistency.
Education
Degrees, certifications, and relevant coursework
Hamza hasn't added their education
Tech stack
Software and tools used professionally
Amazon Redshift
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
GitHub
Kubernetes
PySpark
Debezium
dbt
Sqoop
MySQL
PostgreSQL
Microsoft SQL Server
Hadoop
HBase
Gmail
Databricks
Redis
Terraform
MLflow
Kafka
Apache Pulsar
Istio
Grafana
Prometheus
OpenTelemetry
Amazon Kinesis
Kafka Streams
Airflow
SQL
Calendly
ClickHouse
Apache Iceberg
Apache Arrow
Pinecone
Monte Carlo
Feast
Delta Lake
OpenAI API
Great Expectations
OpenMetadata
Trino
Apache Hudi
dbt Cloud
Bash
OpenLineage
Unity Catalog
Factory
