Open to opportunities

Sai Swetha Paspunuri

@saiswethapaspunuri

Experienced Data Engineer specializing in scalable data platforms.

United States

What I'm looking for

I am looking for a role that fosters innovation, collaboration, and growth opportunities in data engineering and analytics.

I am a Data Engineer with over 4 years of experience in building secure and scalable data platforms across healthcare, banking, and financial services. My expertise lies in developing real-time and batch data pipelines using technologies such as Spark, Kafka, Databricks, and Airflow, handling over 30 million records daily. I have successfully delivered cloud-native solutions on AWS, Azure, and GCP, utilizing tools like Glue, Synapse, BigQuery, and Redshift to enable real-time analytics and ensure compliance.

At HCA Healthcare, I have developed real-time data ingestion pipelines that reduced alert latency by 50% and designed an AWS-based Lakehouse architecture that integrates seamlessly with multiple downstream systems. My focus on optimizing ETL jobs has improved performance by 60%, and I have automated over 50 Airflow workflows, achieving a 99.9% pipeline success rate. My commitment to data security and compliance is evident through my implementation of IAM, KMS, and RBAC, ensuring full HIPAA audit compliance.

Previously, at Capgemini, I built scalable ETL pipelines that processed millions of daily transactions and migrated to real-time pipelines, significantly reducing SLA violations. My ability to apply GDPR controls and deliver regulatory datasets with full audit traceability showcases my dedication to data governance and compliance. I am passionate about leveraging data to drive insights and foster collaboration within analytics teams.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

HCA Healthcare

Jan 2024 - Present (2 years 5 months)

Developed real-time data ingestion pipelines using Kafka and Spark Structured Streaming, reducing alert latency by 50% through optimized data processing. Designed an AWS-based Lakehouse architecture (S3, Glue, Redshift) integrated with 15+ downstream systems to support analytics teams.

Education

Degrees, certifications, and relevant coursework

East Texas A&M University

Master's in Computer Science

Coursework focused on Data Warehousing, Distributed Systems, and Cloud Computing, covering advanced computer science concepts relevant to modern data architectures.
