Sai Swetha Paspunuri
@saiswethapaspunuri
Experienced Data Engineer specializing in scalable data platforms.
What I'm looking for
I am a Data Engineer with over 4 years of experience in building secure and scalable data platforms across healthcare, banking, and financial services. My expertise lies in developing real-time and batch data pipelines using technologies such as Spark, Kafka, Databricks, and Airflow, handling over 30 million records daily. I have successfully delivered cloud-native solutions on AWS, Azure, and GCP, utilizing tools like Glue, Synapse, BigQuery, and Redshift to enable real-time analytics and ensure compliance.
At HCA Healthcare, I have developed real-time data ingestion pipelines that reduced alert latency by 50% and designed an AWS-based Lakehouse architecture that integrates with multiple downstream systems. Optimizing ETL jobs improved their performance by 60%, and I have automated over 50 Airflow workflows with a 99.9% pipeline success rate. I implemented IAM, KMS, and RBAC controls, ensuring full HIPAA audit compliance.
Previously, at Capgemini, I built scalable ETL pipelines that processed millions of daily transactions and migrated them to real-time pipelines, significantly reducing SLA violations. I applied GDPR controls and delivered regulatory datasets with full audit traceability. I am passionate about using data to drive insights and fostering collaboration within analytics teams.
Experience
Work history, roles, and key accomplishments
Senior Data Engineer
HCA Healthcare
Jan 2024 - Present (1 year 5 months)
Developed real-time data ingestion pipelines using Kafka and Spark Structured Streaming, reducing alert latency by 50% through optimized data processing. Designed an AWS-based Lakehouse architecture (S3, Glue, Redshift) integrated with 15+ downstream systems used by analytics teams.
Data Engineer - Banking & Financial Services
Capgemini
Feb 2020 - Present (5 years 4 months)
Built scalable ETL pipelines on Databricks and Azure Data Factory, processing over 30 million daily transactions. Migrated to real-time pipelines with Kafka, Delta Lake, and Spark Streaming, which reduced SLA violations by 70%.
Education
Degrees, certifications, and relevant coursework
East Texas A&M University
Master's in Computer Science
Coursework focused on Data Warehousing, Distributed Systems, and Cloud Computing, with emphasis on concepts relevant to modern data architectures.
