Open to opportunities

Shiva Muppa

@shivamuppa

Data Engineer with expertise in scalable cloud-native data solutions.

United States

What I'm looking for

I seek a collaborative environment where I can leverage my data engineering skills to drive impactful projects and foster innovation.

I am a Data Engineer with extensive experience in building scalable, cloud-native data solutions on AWS and Azure. I specialize in designing and optimizing ETL pipelines and real-time streaming systems using Apache Spark, Kafka, and Airflow. I have a track record of delivering high-performance solutions that comply with standards such as HIPAA, and I aim to apply my technical expertise and cross-functional collaboration skills to build secure and efficient data platforms.

Throughout my career, I have achieved significant milestones, such as reducing latency by 65% through the development of real-time ingestion pipelines and cutting reporting times by 50% by optimizing Snowflake queries. I have automated multi-source data ingestion, increasing engineering team efficiency by 40%, and developed Infrastructure as Code (IaC) solutions that reduced provisioning time by 70%. My commitment to security and compliance has enabled secure, scalable data access, improving governance across various projects.

Experience

Work history, roles, and key accomplishments


Data Engineer

Paychex

Aug 2023 - Present (1 year 11 months)

Designed and built scalable ETL/ELT pipelines using AWS Glue, Athena, and Python, handling over 10 billion records monthly across payroll and benefits domains. Developed real-time ingestion pipelines using Apache Kafka, integrating with AWS Lambda and Step Functions, reducing batch latency by 70%. Built serverless microservices for preprocessing and cleansing data using Lambda, with robust error h


Data Engineer

Motivity Labs

Apr 2021 - Jul 2022 (1 year 3 months)

Developed modular, scalable pipelines using Azure Data Factory, Databricks, and PySpark, standardizing logic across ingestion flows. Built metadata-driven architecture leveraging parameterized datasets, triggers, and Data Lake Gen2, enabling dynamic, reusable pipeline configurations. Optimized Spark job performance by adjusting executor memory, partition counts, and cache strategies, improving loa


Data Engineer

Magellanic Cloud

Sep 2019 - Mar 2021 (1 year 6 months)

Re-architected legacy ETL pipelines as distributed Apache Spark jobs using Scala and PySpark, reducing end-to-end job time by 60%. Built custom RDD transformations to convert complex XML/CSV inputs into Parquet, supporting schema evolution and a standardized format. Implemented SCD Type 2 logic in Amazon Redshift, preserving historical snapshots of evolving customer data.


Data Engineer

AQM Technologies

Sep 2018 - Aug 2019 (11 months)

Developed batch ETL pipelines using Spark and Scala, processing large-scale insurance claims and customer demographics data. Used Kafka with Spark Streaming to stream underwriting events in real time, enabling timely alerting and reporting. Built Hive-based preprocessing and cleansing layers, applying filters, validations, and deduplication prior to ingestion.

Education

Degrees, certifications, and relevant coursework


University of North Texas

Master of Science, Data Science

2022 - 2024

Completed coursework in Applied Machine Learning, Data Analysis, and Data Modeling. Studied Python Programming, Data Harvesting and Storage, and Natural Language Processing.
