Mark Fahad
@markfahad
Staff data engineer designing secure batch/streaming data platforms that cut cloud costs.
What I'm looking for
I’m a Staff Data Engineer with 10+ years of experience architecting data platforms across healthcare, financial services, ecommerce, and consulting. I’ve delivered batch and streaming systems that process 5M+ events per day and run 50+ production workflows, while reducing cloud infrastructure costs by 45% and Snowflake spend by 30%.
I specialize in Python, SQL, Spark/Databricks, Snowflake, and orchestration with Airflow and dbt, with Kafka and CDC at the core of my streaming and operational integrations. As a technical lead, I define architecture, strengthen governance and reliability, and deliver secure data products for analytics and AI, including feature store and ML data pipelines.
I also focus on production excellence: workspace governance, Unity Catalog, cluster policies, cost monitoring, testing, and infrastructure-as-code with Terraform and CI/CD. Across roles, I’ve built governed warehouses, CDC-driven analytics, HIPAA-compliant pipelines, and reusable data models, and I mentor teams to standardize documentation, lineage, and support.
Experience
Work history, roles, and key accomplishments
Staff Data Engineer
WestonChaseTechnologies
Apr 2023 - Present (3 years 2 months)
Led architecture and delivery of cloud batch, streaming, and ML data solutions, reducing infrastructure spend by 45% through an AWS migration and workload redesign. Built Kafka and Spark Structured Streaming pipelines processing 5M events/day and governed Databricks operations for 50+ production workflows using Terraform and monitoring.
Senior Data Engineer
SundaysforDogs
Mar 2020 - Mar 2023 (3 years)
Architected a Snowflake analytics warehouse using Kimball dimensional models for a platform serving 2M users and operated Fivetran, dbt, and Airflow ELT pipelines ingesting 20+ sources. Reduced Snowflake spend by 30% via rightsizing, clustering, incremental models, and SQL optimization, and improved stock visibility via Kafka-based inventory event pipelines.
Data Engineer
BlackbirdHealth
Jun 2017 - Feb 2020 (2 years 8 months)
Deployed HIPAA-compliant AWS pipelines and an encrypted S3 data lake to transform claims and EHR data into trusted analytics products. Orchestrated 50+ daily Airflow workflows and reduced report generation time by 50% through dimensional dataset tuning and PostgreSQL optimization, while decreasing manual data errors by 40% using reconciliation controls.
ETL Developer
LionTree
Aug 2015 - May 2017 (1 year 9 months)
Authored Python and SQL ETL pipelines integrating API, fixed-width, delimited, JSON, and XML financial/transaction data for risk analysis, reconciliation, and regulatory reporting. Supported migrations of data workloads from on-prem to AWS and improved performance via indexing and SQL tuning, while streamlining manual checks and improving operational visibility.
Tech stack
Software and tools used professionally
Fivetran
Azure Synapse
AWS Glue
Apache Flink
GitHub
GitLab
Kubernetes
Jenkins
GitHub Actions
GitLab CI
Pandas
PySpark
Debezium
dbt
MySQL
PostgreSQL
MongoDB
Gmail
Databricks
Neo4j
Terraform
Java
JSON
PowerShell
XML
MLflow
Kafka
Grafana
Prometheus
Datadog
GraphQL
Avro
Vercel
Airflow
SQL
ClickHouse
Dagster
Apache Iceberg
Hightouch
Delta Lake
Great Expectations
Bash
OpenLineage
Unity Catalog
Factory
Website
markfahad.vercel.app
