Supriya Pudasainy
@supriyapudasainy
Experienced Data Engineer specializing in cloud-based data architectures.
What I'm looking for
With over 5 years of experience in designing, implementing, and managing cloud-based data architectures, I have honed my skills in building scalable ETL pipelines and real-time big data solutions. My expertise lies in refactoring legacy ETL workflows using Python and optimizing data ingestion processes with tools like Apache Spark and Databricks. I have successfully engineered data pipelines that process over 5TB of healthcare data daily, applying Medallion architecture to enhance data lineage and transformation consistency.
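The Medallion (bronze/silver/gold) layering mentioned above can be sketched in plain Python — a minimal illustration, not the actual pipeline, with hypothetical field names standing in for the real healthcare schema:

```python
# Minimal sketch of a Medallion-style flow (bronze -> silver -> gold)
# using plain Python dicts; field names are illustrative, not the
# actual pipeline schema.

def to_bronze(raw_rows):
    """Bronze: land raw records as-is, tagging each with its layer."""
    return [dict(row, _layer="bronze") for row in raw_rows]

def to_silver(bronze_rows):
    """Silver: drop malformed records and normalize types."""
    silver = []
    for row in bronze_rows:
        if row.get("patient_id") and row.get("charge") is not None:
            silver.append({
                "patient_id": str(row["patient_id"]),
                "charge": float(row["charge"]),
            })
    return silver

def to_gold(silver_rows):
    """Gold: aggregate to a reporting-ready metric per patient."""
    totals = {}
    for row in silver_rows:
        pid = row["patient_id"]
        totals[pid] = totals.get(pid, 0.0) + row["charge"]
    return totals

raw = [
    {"patient_id": 1, "charge": "10.5"},
    {"patient_id": 1, "charge": 4.5},
    {"patient_id": None, "charge": 99},  # rejected at the silver layer
]
gold = to_gold(to_silver(to_bronze(raw)))
```

Each layer only ever reads from the one before it, which is what gives the pattern its lineage and transformation-consistency guarantees.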
Throughout my career, I have worked extensively with cloud platforms such as AWS, Azure, and GCP, leading cloud migration projects and integrating cross-platform data engineering workflows. My strong background in SQL development and data governance ensures that I deliver high-quality, compliant data solutions. I am passionate about leveraging machine learning and AI technologies to drive insights and improve operational efficiency, collaborating closely with data scientists to deploy predictive models and enhance clinical outcomes.
Experience
Work history, roles, and key accomplishments
Data Engineer
Johnson & Johnson
Sep 2023 - Present (1 year 11 months)
Designed scalable ETL pipelines using PySpark and Databricks, processing over 5TB of healthcare data daily from Medicare, Medicaid, and commercial sources. Engineered real-time data ingestion pipelines using Apache Flink, Kafka, AWS Kinesis, and GCP Pub/Sub, enabling sub-second latency analytics.
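One reason key-partitioned logs like Kafka and Kinesis suit this kind of ingestion is that routing every record with the same key to the same partition preserves per-key ordering. A toy sketch of that routing rule (Kafka's real default partitioner uses murmur2; CRC32 here is just an illustrative stand-in):

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Route a record key to a partition. Same key -> same partition,
    which is what preserves per-key ordering in a Kafka-style log.
    (Kafka's default partitioner hashes with murmur2; CRC32 is an
    illustrative stand-in here.)"""
    return zlib.crc32(key.encode()) % num_partitions

keys = ["member-42", "member-7", "member-42"]
assignments = [partition_for(k, 8) for k in keys]
```

Because the two `member-42` events land on the same partition, a downstream consumer sees them in the order they were produced.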
Data Engineer
Prudential Financial, Inc
Jul 2021 - Present (4 years 1 month)
Designed scalable ETL pipelines using Azure Data Factory, Matillion, and Apache Airflow, automating ingestion from RDBMS and APIs into Azure Data Lake and GCP BigQuery. Built modular data processing workflows using Databricks Notebooks, applying PySpark to transform customer and policyholder data across 10+ business domains.
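The modular, notebook-per-domain workflow described above boils down to composing small transform steps into one runnable pipeline. A minimal sketch with hypothetical steps (the real workflows run as PySpark transforms, not list operations):

```python
# Sketch of composing modular transform steps into a single workflow,
# in the spirit of the notebook-per-domain layout; the steps and data
# are hypothetical.

def clean(rows):
    """Trim whitespace, lowercase, and drop empty records."""
    return [r.strip().lower() for r in rows if r.strip()]

def dedupe(rows):
    """Remove duplicates while preserving first-seen order."""
    seen, out = set(), []
    for r in rows:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out

def run_pipeline(rows, steps):
    """Thread the data through each step in order."""
    for step in steps:
        rows = step(rows)
    return rows

result = run_pipeline(["  A", "a", "B ", ""], [clean, dedupe])
```

Keeping each step a pure function is what makes the same modules reusable across the 10+ business domains.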
Data Engineer
Citadel
Feb 2020 - Present (5 years 6 months)
Engineered ultra-low-latency data pipelines using Apache Spark, Kafka, and Flink, enabling ingestion and processing of billions of daily trade messages, order books, and market events from global exchanges. Built real-time streaming architectures on AWS and GCP, integrating Kinesis, Cloud Pub/Sub, and Lambda to support rapid analytics for alpha signal generation and intraday risk monitoring.
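An intraday risk monitor of the kind described keeps incremental metrics hot as trades stream in. A minimal sketch of one such metric, a rolling volume-weighted average price over the last n trades (window size and fields are illustrative):

```python
from collections import deque

class RollingVWAP:
    """Streaming volume-weighted average price over the last n trades --
    the kind of incremental metric an intraday monitor keeps hot.
    Window size and trade fields are illustrative."""

    def __init__(self, window: int):
        self.trades = deque(maxlen=window)  # oldest trade auto-evicted

    def update(self, price: float, qty: float) -> float:
        self.trades.append((price, qty))
        notional = sum(p * q for p, q in self.trades)
        volume = sum(q for _, q in self.trades)
        return notional / volume

v = RollingVWAP(window=3)
v.update(100.0, 10)
v.update(102.0, 30)
last = v.update(101.0, 10)  # (1000 + 3060 + 1010) / 50
```

A production version would maintain the sums incrementally instead of rescanning the window, but the bounded-`deque` shape is the core idea.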
Education
Degrees, certifications, and relevant coursework
University of Louisiana at Monroe
Bachelor's in Computer Science
Studied the fundamentals of computer science, including programming languages, data structures, and algorithms. Gained a strong foundation in software development and problem-solving.
Tech stack
Software and tools used professionally
Matillion
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
Talend
AWS IAM
AWS Step Functions
GitHub
GitLab
Kubernetes
Jenkins
GitLab CI
NumPy
Pandas
PySpark
dbt
Sqoop
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
HBase
Gmail
Rollout
Databricks
Terraform
Azure DevOps
Jira
Java
JSON
XML
TensorFlow
PyTorch
MLflow
scikit-learn
Kafka
Grafana
Kibana
OpenTelemetry
Azure Monitor
Elasticsearch
Avro
AWS Lambda
Airflow
Time Analytics
SQL
Hugging Face
AWS KMS
Temporal
LangChain