Open to opportunities

Sai Asfar

@saiasfar

Principal Data Engineer building scalable cloud-native data platforms for real-time analytics and AI/ML-ready insights.

United States

What I'm looking for

I’m looking for a Principal Data Engineering role where I can architect secure, cloud-native batch and real-time data platforms, strengthen DataOps/CI/CD, and enable AI/ML-ready analytics with strong governance, quality, and collaboration.

I’m a Principal / Senior Data Engineer with 9+ years of experience designing and building scalable cloud-native data platforms across AWS, Azure, and GCP. I specialize in Apache Spark, Kafka, Flink, Snowflake, Databricks, and modern Lakehouse architectures, with a strong focus on ETL/ELT development, real-time processing, and analytics enablement.

At Neudesic, I’ve architected and delivered batch and real-time data pipelines using Apache Spark, Kafka, and Flink, improving reliability and performance at enterprise scale. I modernized legacy ecosystems by migrating to cloud Lakehouse architectures, built robust ETL/ELT frameworks with Azure Data Factory and Airflow/dbt, and enabled AI/ML-ready data foundations through close collaboration with data scientists and business stakeholders. I also established DataOps best practices using CI/CD automation and Infrastructure as Code (Terraform/CloudFormation) to increase deployment efficiency and platform stability.

Earlier at Aunalytics, I served as Senior Data Engineer / Team Lead, leading the implementation of modern data warehousing and Lakehouse solutions for analytics, predictive modeling, and reporting. I developed data quality, MDM, metadata management, and governance frameworks to create a “single source of truth,” while mentoring engineers and driving Agile delivery and code review standards. I implemented DataOps and DevOps practices (CI/CD automation, IaC, monitoring, performance optimization) to reduce deployment cycles and strengthen platform reliability.

Before that, at Kollabio, I built scalable ETL pipelines and cloud data architectures for federal and commercial clients, delivering both real-time and batch workflows. I’ve applied these capabilities to healthcare interoperability (HL7/FHIR with HIPAA-aligned controls), IoT predictive maintenance (time-series analytics and anomaly detection), and real-time supply chain intelligence (Kafka/Spark/Flink with Lakehouse storage), always with a focus on secure, scalable, high-performance data outcomes.

Experience

Work history, roles, and key accomplishments

Current

Principal Data Engineer

Neudesic

Aug 2022 - Present (3 years 11 months)

Designed and implemented enterprise-scale cloud-native data platforms across AWS, Azure, and GCP, enabling scalable analytics, reporting, and AI/ML workloads. Led batch and real-time pipeline modernization to Lakehouse architectures and established DataOps practices with CI/CD and infrastructure as code to improve reliability and platform stability.

Apache Spark, Apache Kafka, Apache Flink, Databricks, Snowflake, Lakehouse Architecture, ETL, ELT, DataOps, Terraform, Data Governance

Senior Data Engineer

Aunalytics

May 2019 - Jul 2022 (3 years 2 months)

Architected scalable cloud-native data platforms for enterprise analytics, business intelligence, and AI-driven decision-making across financial services and healthcare. Built ETL/ELT and real-time/batch integration pipelines, advanced Lakehouse data warehousing, and implemented DataOps/DevOps automation to improve reliability and deployment efficiency.

Apache Spark, Apache Kafka, Python, SQL, Data Warehousing, Data Governance, CI/CD Automation, Terraform, Master Data Management

Data Engineer

Kollabio

Feb 2017 - Apr 2019 (2 years 2 months)

Designed and developed scalable data integration and ETL pipelines for digital transformation initiatives serving federal and commercial clients. Implemented both real-time and batch data processing workflows and improved analytics performance through query/model optimization and data quality validation automation.