Open to opportunities

Merry Shah

@merryshah

Lead Data Engineer crafting real-time, cloud-native data pipelines and predictive analytics for data-driven decisions.

United States

What I'm looking for

I’m looking to lead cloud-native data engineering work: real-time pipelines, strong governance with automated validation, and predictive, ML-enabled analytics. I want to work with teams that value scalable architecture, reliability, and measurable data quality outcomes.

I’m a Lead Data Engineer with 9+ years of experience designing, developing, and optimizing data pipelines, cloud architectures, and analytics solutions. I focus on scalable ETL workflows, cloud data warehouses, and real-time processing that turn events into actionable insight.

In my current role, I architected high-performance real-time pipelines using Apache Kafka, Apache Flink, and Apache Spark Streaming that handle 100M+ daily events with sub-second latency. I’ve also migrated on-prem warehouses to AWS Redshift and Snowflake, improving query performance by 40% and reducing infrastructure costs by 25%, and I’ve cut processing latency and pipeline run time through incremental loads, partitioning, and tuning.

I’m especially strong in building reliable data platforms with orchestration, validation, and governance. I use Apache Airflow, AWS Glue, and automated data validation and monitoring (including EvidentlyAI and Prometheus) to improve data quality, and I’ve achieved 99.9% pipeline uptime. My work also includes HIPAA-compliant healthcare pipelines.

I also lead with a product mindset, integrating machine learning and predictive analytics (Python, scikit-learn, XGBoost, Spark MLlib) into data pipelines and delivering interactive BI dashboards with Tableau and Power BI. I enjoy mentoring teams and creating maintainable systems that support data maturity, governance, lineage, and strategic growth.

Experience

Work history, roles, and key accomplishments

Current

Lead Data Engineer

Wavicle Solutions

Jun 2023 - Present (3 years)

Designed and implemented real-time Kafka/Flink/Spark Streaming pipelines handling 100M+ daily events with sub-second latency. Migrated warehouses to AWS Redshift and Snowflake, improving query performance by 40% and reducing infrastructure costs by 25%. Built HIPAA-compliant data pipelines and achieved 99.9% pipeline uptime through automated validation.

Senior Data Engineer

Datavail

Sep 2019 - May 2023 (3 years 8 months)

Designed and developed Spark/Python ETL pipelines that reduced processing time by 30%. Migrated legacy systems to AWS, improving processing speeds by 35% while cutting operational costs by 20%. Implemented secure RBAC and governance for data privacy, and automated Trino cluster monitoring with Prometheus/Grafana, reducing downtime by 15%.

Accenture

Data Engineer

Aug 2017 - Aug 2019 (2 years)

Assisted in migrating multi-terabyte relational workloads to Hadoop and AWS Redshift using Sqoop and Flume, and optimized PostgreSQL/MySQL queries to improve retrieval times by 20%. Built batch processing and automated ETL orchestration/monitoring with Spark on AWS EMR plus Airflow and AWS Glue, reducing manual intervention by 40%.

Education

Degrees, certifications, and relevant coursework

Merry hasn't added their education