Yasir Ch
@yasirch1
Staff data engineer and solutions architect delivering cloud lakehouse and real-time streaming platforms with governance and cost optimization.
What I'm looking for
I’m a Staff Data Engineer and Solutions Architect with 10+ years of hands-on experience designing and delivering enterprise-grade data platforms across healthcare, finance, and technology. I lead complex migrations from on-premises Hadoop ecosystems to modern cloud lakehouses (AWS, Azure, GCP), architecting high-throughput ETL/ELT pipelines and real-time streaming systems.
I focus on data quality, governance (HIPAA/GDPR), and FinOps optimization to maximize ROI. I’ve designed scalable Data Mesh approaches, real-time CDC pipelines, and lakehouse architectures that support enterprise analytics and AI/ML workloads, backed by strong lineage and metadata frameworks.
In healthcare, I work confidently with standards like HL7, FHIR, and C-CDA, including clinical data mapping and EHR integrations. I’ve built FHIR-compliant data connectors to normalize HL7/C-CDA messages into standardized clinical datasets mapped to ICD-10, CPT, LOINC, and RxNorm.
I’m also a technical leader who bridges strategy with execution, translating ambiguous business requirements into scalable, future-ready data architectures. I mentor cross-functional engineering teams, drive CI/CD adoption, and deliver analytics-ready, AI/ML-enabled data products that reduce operational cost and accelerate decision-making.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
Axuall
Oct 2023 - Present (2 years 8 months)
Led migration of 50+ TB healthcare data from on-prem Hadoop to AWS EMR and Snowflake, improving performance and cutting infrastructure costs. Built batch/CDC/real-time pipelines with PySpark, Kafka, and AWS Kinesis processing 10M+ events/day, and implemented data quality checks using Great Expectations and Monte Carlo.
Senior Data Engineer
Census
May 2019 - Sep 2023 (4 years 4 months)
Designed and rolled out a unified AWS lakehouse platform using S3, Redshift, and Delta Lake to support enterprise analytics and AI/ML workloads. Built Kafka/Spark streaming pipelines for near-real-time reporting, improved reliability with automated dbt testing and documentation for 300+ models, and implemented governance (RBAC, data contracts, lineage, PII masking) across AWS, Azure, and GCP.
Data Engineer
BuyerQuest
Nov 2015 - Apr 2019 (3 years 5 months)
Automated procurement and reporting workflows with Python and AWS Glue, reducing manual processing time by ~90%. Built star/snowflake models and optimized Snowflake queries, Spark jobs, and ETL to cut report generation from 4 hours to under 30 minutes, and created Tableau/Power BI dashboards that reduced ad hoc reporting requests by ~60%.
Education
Degrees, certifications, and relevant coursework
Yasir hasn't added their education
Tech stack
Software and tools used professionally
Splunk
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
Druid
Talend
Superset
Metabase
Microsoft Azure
Amazon S3
GitHub
Kubernetes
Kubecost
Jenkins
GitHub Actions
Salesforce
PySpark
Debezium
dbt
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
Gmail
Rollout
Databricks
Terraform
TensorFlow
PyTorch
MLflow
scikit-learn
Kubeflow
HubSpot
Kafka
Grafana
Prometheus
New Relic
Datadog
GraphQL
Milvus
Ansible
Airflow
Apache Beam
CloudZero
Apache Ranger
Mapped
Luigi
SQL
AWS KMS
Mode Analytics
Dagster
Apache Iceberg
LangChain
Weaviate
Datafold
Pinecone
Atlan
Monte Carlo
Soda
DataHub
Mage
Delta Lake
Great Expectations
Trino
Amundsen
Marquez
Deequ
OpenLineage
Factory
Beam
ThoughtSpot
