Janice Jafri
@janicejafri
Lead Data Engineer specializing in cloud-native lakehouses, streaming, and production AI/ML pipelines.
What I'm looking for
I am a Lead Data Engineer with over 10 years of experience building cloud-native data platforms across AWS, Azure, and GCP, delivering scalable lakehouse and streaming solutions for AI/ML and analytics. I drive cost and performance improvements, having reduced compute costs by 30% and improved pipeline throughput and performance by ~45%, while enforcing governance and observability for secure production systems.
I have hands-on expertise in Databricks, Snowflake, Spark, Kafka, Flink, Delta Lake, dbt, Airflow, Python, and SQL, and I mentor and lead engineering teams to establish CI/CD and MLOps practices. My projects include multi-cloud lakehouses that saved $1.5M/year, high-throughput real-time pipelines, and enterprise governance implementations ensuring HIPAA/GDPR compliance and 99.9% uptime.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer
Analytics8
Jan 2024 - Present (2 years 2 months)
Architected a multi-cloud Databricks/Delta Lake/Snowflake lakehouse supporting AI/ML and analytics, reducing compute costs by 30% and mentoring 8 engineers to improve release velocity by 50%.
Sr. Cloud Data Engineer
Vectorsoft
Jan 2020 - Dec 2023 (3 years 11 months)
Designed Snowflake and Synapse pipelines processing 6TB+ of healthcare data daily, improved SLA compliance by 30%, and optimized SQL to reduce query runtimes by 40%.
Developed Spark and Kafka streaming pipelines handling 5TB+ daily with 45% faster performance, and scaled EMR Hadoop/Hive clusters to reduce infrastructure costs by 25%.
Built ETL workflows in Informatica and SQL Server loading 5TB+ of supply chain data, automated 50+ ingestion pipelines to reduce manual effort by 40%, and delivered 20+ Power BI dashboards.
Education
Degrees, certifications, and relevant coursework
COMSATS University
Bachelor of Science, Computer Science
Completed a Bachelor of Science in Computer Science.
Tech stack
Software and tools used professionally
Fivetran
Apache Spark
Apache Flink
Talend
GitHub
Kubernetes
Jenkins
GitHub Actions
NumPy
Pandas
PySpark
dbt
Hadoop
Gmail
Databricks
Terraform
Azure DevOps
Java
MLflow
Kubeflow
Kafka
Grafana
Prometheus
Milvus
Avro
Ansible
Vercel
Redpanda
Airflow
SQL
Dagster
LangChain
LlamaIndex
Weaviate
Pinecone
Monte Carlo
Tecton
Feast
Cube.js
DataHub
Delta Lake
Great Expectations
Apache Hudi
Collibra
Bash
Faiss
Factory
Availability
Location
Authorized to work in
Website
project02-lemon.vercel.app
Job categories
Skills
