Sai Samir
@saisamir
Senior Data Engineer specializing in scalable data platforms, lakehouses, and ML-enabled pipelines.
What I'm looking for
I am a Senior Data Engineer with ~6 years of experience building large-scale data pipelines, lakehouse architectures, and production ML/RAG systems using Spark, Databricks, AWS, and modern MLOps tools. I have delivered high-throughput Spark/Scala pipelines processing billions of records, implemented CDC and incremental ingestion, and built semantic/data-mart layers to enable self-service analytics and reliable reporting.
I combine strong engineering rigor (CI/CD, observability, IAM governance, and performance tuning) with applied ML and GenAI experience, designing RAG pipelines, vector search, and LLM microservices to accelerate data discovery and automation. My work includes migrating legacy warehouses, implementing Delta Lake best practices, and securing regulated healthcare and finance workloads to meet compliance and operational SLAs.
Experience
Work history, roles, and key accomplishments
Senior Data Engineer
PwC
Dec 2024 - Present (1 year 3 months)
Designed and maintained large-scale Spark/Databricks pipelines processing 1B+ records/day with 99.9% reliability; implemented Delta Lake optimizations and metadata-driven ingestion frameworks that reduced manual onboarding work by 40% and improved query performance by ~35%.
Data Engineer
BMW
Jul 2022 - Nov 2024 (2 years 4 months)
Built ETL and CDC-style pipelines in Databricks and Snowflake, migrated Netezza workloads, and implemented Snowpipe/Streams to improve ingestion latency and query performance for healthcare and claims reporting.
Designed dimensional models and developed ETL/SSIS packages and BI reports (SSRS, Tableau, Power BI) to support reporting and analytics, improving reporting efficiency and delivering actionable dashboards.
Education
Degrees, certifications, and relevant coursework
Sai hasn't added their education
Tech stack
Software and tools used professionally
Postman
Amazon Redshift
Matillion
Azure Synapse
Apache Spark
AWS Glue
Apache Hive
ggplot2
AWS IAM
Amazon EC2
Microsoft Azure
Amazon S3
GitHub
Kubernetes
Jenkins
GitHub Actions
NumPy
Pandas
PySpark
DB
Sqoop
MySQL
MongoDB
Cassandra
Hadoop
HBase
Vertica
Gmail
.NET
Databricks
Terraform
Visual Studio
PyCharm
Azure DevOps
Java
ASP.NET
JSON
Perl
Azure Machine Learning
scikit-learn
Kafka
Amazon SQS
FastAPI
MongoDB Atlas
SQLAlchemy
Linux
Datadog
AWS Lambda
Serverless
Amazon RDS
Azure SQL Database
Kafka Streams
JUnit
Airflow
Time Analytics
s3-lambda
Amazon Web Services (AWS)
SQL
Amazon SageMaker
Amazon EBS
SciPy
AWS KMS
Apache Iceberg
ScalaTest
LangChain
Pinecone
Celonis
Delta Lake
OpenAI API
Azure Logic Apps
Cosmos
Bash
Transform
Faiss
Make
Phase
Dynamic
Task
Factory
Matrix
Safe
Guidewire
