Open to opportunities

Mohiba Jalil

@mohibajalil

Senior Data Engineer building scalable lakehouse data pipelines and cloud platforms.

United States

What I'm looking for

I’m looking for a senior data engineering role where I can architect Lakehouse pipelines, optimize Spark workloads, and partner with data science and analytics to deliver reliable, observable data infrastructure.

I’m a Senior Data Engineer with over 6 years of experience building and maintaining large-scale data pipelines and cloud-based data platforms. After joining Databricks as a Data Engineer I, I progressed through successive roles there and now lead enterprise-scale Lakehouse solutions supporting analytics, reporting, and machine learning workloads across business domains.

I design and implement high-performance ETL and ELT frameworks using Databricks, PySpark, Delta Lake, and SQL, often processing 10TB+ per day. I optimize Spark workloads through partitioning, caching, query tuning, and cluster configuration, which has reduced processing times by over 40%. I also build CDC pipelines for near real-time synchronization, organize data into Bronze/Silver/Gold layers following the Medallion Architecture, and add automated data quality validation with monitoring and observability to keep datasets accurate, consistent, and reliable.

I partner closely with data scientists and analytics stakeholders to deliver feature engineering pipelines and production-grade data infrastructure, and I mentor junior engineers on data engineering standards and Spark optimization. Before moving fully into engineering, I worked as a Data Analyst supporting finance, healthcare, and retail clients with SQL extraction and dashboards in Tableau and Power BI, which keeps my work grounded in measurable business outcomes.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Databricks

Oct 2024 - Present (1 year 9 months)

Architected enterprise Lakehouse solutions for analytics, reporting, and machine learning, building high-performance ETL/ELT frameworks with Databricks, PySpark, Delta Lake, and SQL. Led scalable batch and real-time pipelines processing 15TB+ daily and reduced Spark processing times by 40%+ through partitioning, caching, query tuning, and cluster optimization.

Kafka, PySpark, Delta Lake, ETL, ELT, Airflow, Data Quality, CDC, CloudWatch

Data Engineer II

Databricks

Feb 2023 - Sep 2024 (1 year 7 months)

Developed and maintained enterprise ETL pipelines ingesting data from transactional systems, APIs, cloud storage, and third-party providers, including 10TB+ daily processing. Implemented orchestration with Apache Airflow and Databricks Workflows, optimized SQL/Spark performance to reduce costs, and built reconciliation/auditing for improved data reliability.

Kafka, PySpark, Apache Spark, Databricks Workflows, Delta Lake, Data Integration, AWS, SQL Optimization, Airflow

Data Engineer I

Databricks

Mar 2022 - Feb 2023 (11 months)

Built and supported data pipelines ingesting from relational databases, APIs, flat files, and streaming sources using PySpark and SQL. Implemented Delta Lake tables with schema evolution and partitioning, created automated data quality checks, and performed root-cause analysis for production pipeline failures.

Kafka, PySpark, SQL, Delta Lake, Data Pipelines, Data Validation and Schema Evolution, Partitioning, Airflow, S3

Data Analyst

ProCogia

May 2020 - Mar 2022 (1 year 10 months)

Collected, cleaned, and analyzed data from multiple sources (databases, APIs, and flat files) to support finance, healthcare, and retail clients. Built SQL extracts and dashboards in Tableau and Power BI, automated recurring reporting tasks, and implemented data validation/quality checks to improve reliability of reporting datasets.

Python, SQL, Tableau, Power BI, Data Cleaning, Data Extraction, Data Validation, Data Modeling