Rajaram Yadav
@rajaramyadav
Senior Data Engineer building secure, near real-time cloud data platforms and GenAI search pipelines.
What I'm looking for
I’m a Senior Data Engineer with 6 years of experience designing and scaling secure cloud-native data platforms across AWS, Azure, Snowflake, Databricks, Microsoft Fabric, and dbt. I build batch and real-time data pipelines with AWS Glue, S3, Lambda, Kinesis, Redshift, Azure Data Factory, Databricks, Synapse, Event Hubs, Apache Spark, Kafka, and Airflow, enabling reliable near real-time analytics for healthcare and financial teams.
Across multiple migrations, I’ve transitioned legacy systems to lakehouse architectures using Databricks/Delta Lake and Microsoft Fabric OneLake, reducing data processing time from hours to near real time. I also bring hands-on generative AI experience: RAG pipelines, LLM integrations (Azure OpenAI, AWS Bedrock, LangChain), vector databases, and AI-assisted validation. I pair this with strong governance, data quality, lineage, and observability practices to meet HIPAA, FHIR, SOC 2, and enterprise security requirements.
Experience
Work history, roles, and key accomplishments
Designed a hybrid multi-cloud healthcare data lakehouse architecture on AWS, Azure, and Microsoft Fabric, reducing batch processing time by ~40% and compute costs by ~25%. Built real-time streaming pipelines and dbt-based transformations, and delivered an Azure OpenAI RAG-based enterprise search that cut document search time by ~50%, with HIPAA/GDPR governance via Purview and Lake Formation.
Built Databricks Spark SQL pipelines for regulated financial data and migrated legacy Redshift/Hadoop workloads to ADLS with Delta Lake and Snowflake to improve auditability and schema evolution. Optimized Azure Synapse/Snowflake to cut query spend by 35%, reduced onboarding time by 40%, and delivered low-latency reporting that cut delays by 60%, alongside end-to-end data observability using Monte Carlo.
Built an AWS customer data platform integrating 50+ internal and external sources, and developed PySpark and Airflow pipelines running hundreds of daily jobs with retries and failure handling to improve reliability. Enabled near-real-time fraud detection using Kinesis, Lambda, and SageMaker, built REST APIs for curated payment/customer datasets, and implemented governance with Lake Formation/Glue.
Education
Degrees, certifications, and relevant coursework
University of Missouri–Kansas City
Master of Science in Computer Science
Completed a Master’s degree in Computer Science at the University of Missouri–Kansas City.
Tech stack
Software and tools used professionally
Amazon Redshift
Fivetran
Splunk
Azure Synapse
Apache Spark
AWS Glue
Talend
Metabase
Amazon Quicksight
Amazon S3
AWS Step Functions
GitHub
GitLab
Kubernetes
Azure Kubernetes Service
AWS CodePipeline
Jenkins
GitHub Actions
Salesforce
Pandas
PySpark
AWS Glue DataBrew
dbt
PostgreSQL
MongoDB
MariaDB
Cassandra
Hadoop
Gmail
Django
Databricks
Redis
Terraform
Azure DevOps
Jira
Java
MLflow
scikit-learn
Kafka
Grafana
Prometheus
OpenTelemetry
Azure Monitor
Datadog
Elasticsearch
AWS Lambda
Airflow
Apache Beam
NetSuite
SQL
ServiceNow
Hugging Face
Dagster
LangChain
Polars
Monte Carlo
Delta Lake
OpenAI API
Great Expectations
Bash
Microsoft Fabric
Dynamic
Azure Data Factory
Jan
Microsoft Purview
