Open to opportunities

Fahad Chaudhary

@fahadchaudhary

I build scalable real-time lakehouse platforms as a Staff Data Engineer.

Canada

What I'm looking for

I’m looking to build cloud-native, real-time lakehouse data platforms with strong governance, data quality, and streaming reliability, supported by modern orchestration and CI/CD. I want a team where I can lead architecture decisions and mentor engineers.

I’m a Staff Data Engineer with over a decade of experience designing and building scalable data pipelines, ETL/ELT workflows, and real-time data systems in cloud environments. I specialize in SQL, Python, and PySpark, with distributed processing on Apache Spark.

At Aya Health Technologies, I designed and led end-to-end data platforms (ingestion → transformation → modeling → serving) for large-scale batch and streaming workloads. I architected modern ecosystems using Snowflake, Apache Spark, and dbt, and exposed data products through backend services, APIs, and microservices (Python/Java).

I’ve built high-throughput real-time processing pipelines with tools like Apache Kafka and Kinesis, and I established strong data quality, validation, observability, and governance frameworks (including compliance and PII masking). I lead technical architecture decisions, design reviews, and engineering standards to improve scalability, performance, and cost efficiency.

Previously at Vision Critical and Sensibill, I migrated legacy ETL to cloud-native platforms, built and optimized ETL/ELT pipelines with Snowflake and Databricks, and orchestrated workflows with Apache Airflow, dbt, and Dagster. Across healthcare, fintech, SaaS, and e-commerce projects, I consistently deliver reliable data platforms with CI/CD, Infrastructure as Code, and a focus on governance and data integrity.

Experience

Work history, roles, and key accomplishments

Current

Staff Data Engineer

Aya Health Technologies

Jan 2022 - Present (4 years 6 months)

Designed and led end-to-end batch, streaming, and real-time data platforms for large-scale healthcare datasets, covering ingestion through transformation, modeling, and serving. Implemented streaming pipelines with Kafka/Kinesis, established data quality and governance, and introduced DevOps/CI/CD practices and monitoring for reliable deployments.

Amazon Kinesis, Apache Spark, Snowflake, dbt, Python, PySpark, Data Governance, CI/CD, Kafka

Senior Data Engineer

Vision Critical

Jul 2018 - Dec 2021 (3 years 5 months)

Led migration of legacy ETL (e.g., Ab Initio) to cloud-native data integration using Informatica Cloud and lake/warehouse architectures. Built and optimized scalable ETL/ELT pipelines with Snowflake and Databricks, orchestrated workflows with Airflow/dbt/Dagster, and implemented data validation, reconciliation, CI/CD, and infrastructure-as-code for improved reliability.

Snowflake, Databricks, Informatica, dbt, Dagster, Python, PySpark, SQL, CI/CD, Airflow

Data Engineer

Sensibill

Sep 2016 - May 2018 (1 year 8 months)

Designed and maintained scalable ETL/ELT data pipelines across multiple sources using Databricks, Snowflake, Python, and PySpark for high-performance analytics. Automated orchestration with Airflow and CI/CD, implemented data modeling and quality/governance checks, and provided technical leadership while troubleshooting complex data issues.