Mohiba Jalil
@mohibajalil
Senior Data Engineer building scalable lakehouse data pipelines and cloud platforms.
What I'm looking for
I’m a Senior Data Engineer with over 6 years of experience building and maintaining large-scale data pipelines and cloud-based data platforms. After starting as a Data Engineer, I grew through multiple Databricks roles to lead enterprise-scale Lakehouse solutions supporting analytics, reporting, and machine learning workloads across business domains.
I design and implement high-performance ETL and ELT frameworks using Databricks, PySpark, Delta Lake, and SQL, often processing 10TB+ per day. I optimize Spark workloads through partitioning, caching, query tuning, and cluster configuration, reducing processing times by over 40%, and I build CDC pipelines for near real-time synchronization. I also establish Bronze/Silver/Gold layers following the Medallion Architecture and add automated data quality validation, monitoring, and observability to keep datasets accurate, consistent, and reliable.
I partner closely with data scientists and analytics stakeholders to deliver feature engineering pipelines and production-grade data infrastructure, and I mentor junior engineers on data engineering standards and Spark optimization. Before moving fully into engineering, I worked as a Data Analyst supporting finance, healthcare, and retail clients with SQL extraction and dashboards in Tableau and Power BI, which keeps my work grounded in measurable business outcomes.
Experience
Work history, roles, and key accomplishments
Architected enterprise Lakehouse solutions for analytics, reporting, and machine learning, building high-performance ETL/ELT frameworks with Databricks, PySpark, Delta Lake, and SQL. Led scalable batch and real-time pipelines processing 15TB+ daily and reduced Spark processing times by 40%+ through partitioning, caching, query tuning, and cluster optimization.
Developed and maintained enterprise ETL pipelines processing 10TB+ daily from transactional systems, APIs, cloud storage, and third-party providers. Implemented orchestration with Apache Airflow and Databricks Workflows, optimized SQL/Spark performance to reduce costs, and built reconciliation and auditing checks to improve data reliability.
Built and supported data pipelines ingesting from relational databases, APIs, flat files, and streaming sources using PySpark and SQL. Implemented Delta Lake tables with schema evolution and partitioning, created automated data quality checks, and performed root-cause analysis for production pipeline failures.
Data Analyst
ProCogia
May 2020 - Mar 2022 (1 year 10 months)
Collected, cleaned, and analyzed data from multiple sources (databases, APIs, and flat files) to support finance, healthcare, and retail clients. Built SQL extracts and dashboards in Tableau and Power BI, automated recurring reporting tasks, and implemented data validation/quality checks to improve reliability of reporting datasets.
Education
Degrees, certifications, and relevant coursework
The University of Texas at Dallas
Bachelor of Science in Computer Science
2016 - 2020
