Shiva Sah
@shivasah
Experienced Data Engineer with a passion for scalable data solutions.
What I'm looking for
I am a Data Engineer with about 7 years of experience building scalable data platforms and ETL pipelines across AWS, Azure, and GCP. I use Spark, Snowflake, and Databricks to deliver high-performance data solutions that support analytics and machine learning readiness. I thrive in Agile environments and collaborate cross-functionally to meet business objectives.
At Pfizer, I led a global initiative to modernize clinical and manufacturing analytics by implementing a production-grade lakehouse platform. The project streamlined data access across teams and supported regulatory submissions and real-time analytics. My focus on data quality and governance has improved operational efficiency and compliance at organizations including Wells Fargo and Johnson & Johnson.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
Pfizer
Sep 2023 - Present (2 years)
Led Pfizer's global initiative to modernize clinical and manufacturing analytics by delivering a production-grade lakehouse platform integrating genomics, assay, batch, and safety data. This unified foundation supports ML workloads, regulatory submissions, and GxP-compliant real-time analytics across vaccine and oncology programs. Spearheaded the design and implementation of a Medallion-layer lake
Senior Data Engineer
Wells Fargo
Jan 2021 - Present (4 years 8 months)
Led cloud transformation initiatives across Wells Fargo's retail and commercial banking lines by designing and scaling a unified financial data platform for real-time fraud detection, regulatory compliance (SOX, CCAR), customer insights, and credit risk analytics across Azure and GCP. Architected scalable ETL pipelines using Azure Data Factory and Databricks (PySpark) to process high-volume credit
Data Engineer
Johnson & Johnson
Sep 2018 - Present (7 years)
Built a cloud-native data platform to integrate Medicare, Medicaid, and clinical datasets, enabling real-time analytics, ML workflows, and regulatory compliance across J&J's pharma and med-tech units using AWS, Azure, Spark, Kafka, and Snowflake. Designed and optimized ETL pipelines using PySpark, processing large-scale healthcare datasets from Medicare, Medicaid, and commercial lines to support a
Education
Degrees, certifications, and relevant coursework
Kalamazoo College
Bachelor's, Computer Science & Business and Economics
Studied Computer Science and Business and Economics, gaining foundational knowledge in both technical and economic principles. Developed skills in problem-solving, data analysis, and business strategy.
Tech stack
Software and tools used professionally
Postman
OpenAPI
OpenAPI Specification
Airbyte
Matillion
Azure HDInsight
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
SAS
Data Studio
Amazon Quicksight
AWS Step Functions
GitHub
Kubernetes
Jenkins
GitHub Actions
Veeva
NumPy
Pandas
PySpark
dbt
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
HBase
Gmail
.NET
Databricks
Terraform
AWS CloudFormation
Azure DevOps
Jira
JavaScript
Java
PowerShell
TensorFlow
PyTorch
MLflow
scikit-learn
Keras
Kafka
RabbitMQ
FastAPI
Grafana
Prometheus
OpenTelemetry
Ubuntu
CentOS
Linux
Windows
Datadog
GraphQL
gRPC
AWS Lambda
Serverless
Airflow
Apache Beam
Time Analytics
SQL
ServiceNow
Dagster
