Open to opportunities

Noah Anwar

@noahanwar

Principal Data Engineer building streaming-first, lakehouse data platforms for real-time analytics and ML.

United States

What I'm looking for

I’m looking to lead streaming-first data platform work—ETL/ELT, Lakehouse, DataOps, and governance—while mentoring engineers and delivering real-time analytics and AI/ML that drive measurable business outcomes.

I’m a Principal/Senior Data Engineer with 11+ years of experience building and scaling cloud-native, data-intensive platforms across AWS, Azure, and GCP. I focus on streaming-first architectures, real-time data pipelines, and modern Lakehouse solutions using Apache Spark, Kafka, Flink, and Snowflake.

In my most recent work, I architected a streaming-first platform using Apache Kafka, Kafka Connect, and Apache Flink, then unified streaming and historical data with a Lakehouse approach (Delta Lake and Snowflake). I developed end-to-end machine learning pipelines in Python with Scikit-learn, TensorFlow, and MLflow, delivering demand forecasting models that reduced stock-outs by 20%.

I also lead with DataOps and governance—establishing data quality validation, lineage tracking, observability, and CI/CD with infrastructure-as-code. I enjoy translating complex business requirements into scalable, efficient, high-performance solutions, and mentoring engineers while aligning the data platform strategy to business objectives and key performance indicators.

Experience

Work history, roles, and key accomplishments

Principal Data Integration Engineer

Falkonry

Aug 2021 - Present (4 years 8 months)

Architected a streaming-first data platform using Kafka, Kafka Connect, and Flink, and built a Lakehouse on Delta Lake and Snowflake (bronze/silver/gold) to unify real-time and historical analytics. Developed Python ML pipelines with Scikit-learn, TensorFlow, and MLflow, delivering demand forecasting models that reduced stock-outs by 20% while deploying cloud-native infrastructure with Terraform.

Data Engineering Team Lead

Current Health

May 2018 - Jul 2021 (3 years 2 months)

Led a real-time health data platform ingesting streaming IoT and wearable data, implementing event-driven pipelines with Kafka and Spark Structured Streaming for low-latency, fault-tolerant processing. Built scalable GCP-based lake and analytics foundations (GCS, Dataflow, BigQuery) and introduced DataOps practices (CI/CD, automated testing) to improve deployment efficiency and reduce failures.

Data Engineer

Seeq Corporation

Feb 2015 - Apr 2018 (3 years 2 months)

Built real-time ingestion and processing systems with Kafka and Flink to deliver high-throughput, low-latency data pipelines, and engineered scalable storage and ETL workflows using Parquet and Python/Spark. Implemented OCR/document processing pipelines and established data quality, validation, and monitoring, deploying reproducible environments via Terraform and AWS CloudFormation.

Education

Degrees, certifications, and relevant coursework

University of the Punjab

Bachelor of Science, Computer Science

2010 - 2014

Grade: 3.7

Availability

Open to opportunities

Location

United States

Salary expectations

100k-600k USD
