Tariq User
@tariqfq
Lead Data Engineer building scalable, secure lakehouse and real-time data platforms.
What I'm looking for
I am a Lead Data Engineer with 10+ years of experience designing and managing scalable data platforms across healthtech, retail, fintech, and SaaS, focused on turning complex business needs into high-performing, well-documented solutions.
I build streaming and batch systems using Databricks, Delta, Iceberg, Kafka, Flink, and Spark, and standardize ELT with dbt and Airflow to accelerate analytics delivery while optimizing compute and storage.
I prioritize security, compliance (HIPAA, GDPR, SOC 2), observability, and CI/CD automation with Terraform and GitHub Actions. I also mentor engineers and deliver measurable business outcomes: reduced MTTR, cost savings, and faster onboarding.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
Truveta, Inc
May 2022 - Present (3 years 8 months)
Led design and operation of large-scale, HIPAA-compliant lakehouse platforms on Databricks and Delta Lake across AWS/Azure/GCP; built batch and streaming CDC pipelines that enabled near-real-time analytics and supported ML feature workflows.
Lead Data Engineer
Vodworks
Feb 2022 - Present (3 years 11 months)
Built a multi-cloud lakehouse and real-time pipelines for clinical and claims data, reducing time-to-insight by 40% and achieving 99%+ pipeline uptime while ensuring HIPAA compliance.
Senior Data Engineer
Starschema
Jan 2017 - Apr 2022 (5 years 3 months)
Contributed to scaling cloud analytics platforms using Spark, Snowflake, Redshift, and BigQuery; optimized ETL pipelines and warehouses to improve query performance, observability, and cost efficiency for BI consumers.
Scaled ingestion and streaming pipelines across Databricks, Glue, and EMR, reducing failure rates by over 50% and cutting costs by about 18%; built RAG-ready vector datasets for semantic search.
Data Engineer
Anblicks
Jan 2014 - Jun 2017 (3 years 5 months)
Consolidated APIs and databases into governed data lakes and warehouses, improving dashboard performance by over 50%; implemented GDPR/CCPA controls and built reusable Spark ETL frameworks.
Data Engineer
United Techno
Jan 2014 - Dec 2016 (2 years 11 months)
Built ingestion workflows and a centralized Data Vault 2.0 warehouse using the Hadoop ecosystem and relational databases; automated infrastructure with Terraform and delivered Power BI reporting to stakeholders.
Education
Degrees, certifications, and relevant coursework
Unknown
Bachelor of Science, Computer Science
Completed a Bachelor's degree in Computer Science.
Tech stack
Software and tools used professionally
Azure Synapse
Apache Spark
AWS Glue
Superset
Metabase
Amazon QuickSight
Amazon S3
AWS Step Functions
GitHub
Kubernetes
Jenkins
GitHub Actions
Pandas
PySpark
Debezium
dbt
DB
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
InfluxDB
Gmail
Databricks
Neo4j
Redis
Terraform
Pulumi
Java
TensorFlow
PyTorch
MLflow
scikit-learn
Kubeflow
Kafka
Apache NiFi
Apache Pulsar
FastAPI
Grafana
Prometheus
OpenTelemetry
Datadog
Elasticsearch
OpenSearch
Kafka Streams
Redpanda
Airflow
Apache Beam
Docker
Root Cause
s3-lambda
SQL
Hugging Face
Apache Iceberg
LangChain
Pinecone
Monte Carlo
DataHub
Delta Lake
Great Expectations
Argo CD
Trino
Apache Hudi
Collibra
Cosmos
Bash
Faiss
Unity Catalog
Factory
