Supriya Pudasainy
@supriyapudasainy
Experienced Data Engineer specializing in cloud-based data architectures.
What I'm looking for
With over 5 years of experience in designing, implementing, and managing cloud-based data architectures, I have honed my skills in building scalable ETL pipelines and real-time big data solutions. My expertise lies in refactoring legacy ETL workflows using Python and optimizing data ingestion processes with tools like Apache Spark and Databricks. I have successfully engineered data pipelines that process over 5TB of healthcare data daily, applying Medallion architecture to enhance data lineage and transformation consistency.
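The Medallion (bronze/silver/gold) layering mentioned above can be sketched in plain Python — a minimal illustration, not the actual pipeline, with hypothetical field names standing in for the real healthcare schema:

```python
# Minimal sketch of a Medallion-style flow (bronze -> silver -> gold)
# using plain Python dicts; field names are illustrative, not the
# actual pipeline schema.

def to_bronze(raw_rows):
    """Bronze: land raw records as-is, tagging each with its layer."""
    return [dict(row, _layer="bronze") for row in raw_rows]

def to_silver(bronze_rows):
    """Silver: drop malformed records and normalize types."""
    silver = []
    for row in bronze_rows:
        if row.get("patient_id") and row.get("charge") is not None:
            silver.append({
                "patient_id": str(row["patient_id"]),
                "charge": float(row["charge"]),
            })
    return silver

def to_gold(silver_rows):
    """Gold: aggregate to a reporting-ready metric per patient."""
    totals = {}
    for row in silver_rows:
        pid = row["patient_id"]
        totals[pid] = totals.get(pid, 0.0) + row["charge"]
    return totals

raw = [
    {"patient_id": 1, "charge": "10.5"},
    {"patient_id": 1, "charge": 4.5},
    {"patient_id": None, "charge": 99},  # rejected at the silver layer
]
gold = to_gold(to_silver(to_bronze(raw)))
```

Each layer only ever reads from the one before it, which is what gives the pattern its lineage and transformation-consistency guarantees.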
Throughout my career, I have worked extensively with cloud platforms such as AWS, Azure, and GCP, leading cloud migration projects and integrating cross-platform data engineering workflows. My strong background in SQL development and data governance ensures that I deliver high-quality, compliant data solutions. I am passionate about leveraging machine learning and AI technologies to drive insights and improve operational efficiency, collaborating closely with data scientists to deploy predictive models and enhance clinical outcomes.
Experience
Work history, roles, and key accomplishments
Data Engineer
Johnson & Johnson
Sep 2023 - Present (1 year 11 months)
Designed scalable ETL pipelines using PySpark and Databricks, processing over 5TB of healthcare data daily from Medicare, Medicaid, and commercial sources. Engineered real-time data ingestion pipelines using Apache Flink, Kafka, AWS Kinesis, and GCP Pub/Sub, enabling sub-second latency analytics.
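One reason key-partitioned logs like Kafka and Kinesis suit this kind of ingestion is that routing every record with the same key to the same partition preserves per-key ordering. A toy sketch of that routing rule (Kafka's real default partitioner uses murmur2; CRC32 here is just an illustrative stand-in):

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Route a record key to a partition. Same key -> same partition,
    which is what preserves per-key ordering in a Kafka-style log.
    (Kafka's default partitioner hashes with murmur2; CRC32 is an
    illustrative stand-in here.)"""
    return zlib.crc32(key.encode()) % num_partitions

keys = ["member-42", "member-7", "member-42"]
assignments = [partition_for(k, 8) for k in keys]
```

Because the two `member-42` events land on the same partition, a downstream consumer sees them in the order they were produced.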
Data Engineer
Prudential Financial, Inc
Jul 2021 - Present (4 years 1 month)
Designed scalable ETL pipelines using Azure Data Factory, Matillion, and Apache Airflow, automating ingestion from RDBMS and APIs into Azure Data Lake and GCP BigQuery. Built modular data processing workflows using Databricks Notebooks, applying PySpark to transform customer and policyholder data across 10+ business domains.
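The modular, notebook-per-domain workflow described above boils down to composing small transform steps into one runnable pipeline. A minimal sketch with hypothetical steps (the real workflows run as PySpark transforms, not list operations):

```python
# Sketch of composing modular transform steps into a single workflow,
# in the spirit of the notebook-per-domain layout; the steps and data
# are hypothetical.

def clean(rows):
    """Trim whitespace, lowercase, and drop empty records."""
    return [r.strip().lower() for r in rows if r.strip()]

def dedupe(rows):
    """Remove duplicates while preserving first-seen order."""
    seen, out = set(), []
    for r in rows:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out

def run_pipeline(rows, steps):
    """Thread the data through each step in order."""
    for step in steps:
        rows = step(rows)
    return rows

result = run_pipeline(["  A", "a", "B ", ""], [clean, dedupe])
```

Keeping each step a pure function is what makes the same modules reusable across the 10+ business domains.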
Data Engineer
Citadel
Feb 2020 - Present (5 years 6 months)
Engineered ultra-low-latency data pipelines using Apache Spark, Kafka, and Flink, enabling ingestion and processing of billions of daily trade messages, order books, and market events from global exchanges. Built real-time streaming architectures on AWS and GCP, integrating Kinesis, Cloud Pub/Sub, and Lambda to support rapid analytics for alpha signal generation and intraday risk monitoring.
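An intraday risk monitor of the kind described keeps incremental metrics hot as trades stream in. A minimal sketch of one such metric, a rolling volume-weighted average price over the last n trades (window size and fields are illustrative):

```python
from collections import deque

class RollingVWAP:
    """Streaming volume-weighted average price over the last n trades --
    the kind of incremental metric an intraday monitor keeps hot.
    Window size and trade fields are illustrative."""

    def __init__(self, window: int):
        self.trades = deque(maxlen=window)  # oldest trade auto-evicted

    def update(self, price: float, qty: float) -> float:
        self.trades.append((price, qty))
        notional = sum(p * q for p, q in self.trades)
        volume = sum(q for _, q in self.trades)
        return notional / volume

v = RollingVWAP(window=3)
v.update(100.0, 10)
v.update(102.0, 30)
last = v.update(101.0, 10)  # (1000 + 3060 + 1010) / 50
```

A production version would maintain the sums incrementally instead of rescanning the window, but the bounded-`deque` shape is the core idea.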
Education
Degrees, certifications, and relevant coursework
University of Louisiana at Monroe
Bachelor's in Computer Science
Studied the fundamentals of computer science, including programming languages, data structures, and algorithms. Gained a strong foundation in software development and problem-solving.
Tech stack
Software and tools used professionally
Matillion
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
Talend
AWS IAM
AWS Step Functions
GitHub
GitLab
Kubernetes
Jenkins
GitLab CI
NumPy
Pandas
PySpark
dbt
Sqoop
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
HBase
Gmail
Rollout
Databricks
Terraform
Azure DevOps
Jira
Java
JSON
XML
TensorFlow
PyTorch
MLflow
scikit-learn
Kafka
Grafana
Kibana
OpenTelemetry
Azure Monitor
Elasticsearch
Avro
AWS Lambda
Airflow
Time Analytics
SQL
Hugging Face
AWS KMS
Temporal
LangChain