Open to opportunities

Prav Dhakal

@pravdhakal

Data Solutions Engineer building scalable Azure/Databricks pipelines for reliable analytics and AI.

United States

What I'm looking for

I’m looking for a role where I can own scalable Azure/Databricks data pipelines end-to-end, prioritizing production stability, data quality, and real-time, AI-ready analytics, while collaborating with data teams in an Agile environment to deliver measurable impact.

I’m a Data Solutions Engineer with 6+ years of experience designing and supporting scalable data pipelines in Azure and Databricks. I build reliable ETL workflows focused on production stability, data quality, and system performance.

I work across real-time data ingestion, API integrations, and event-driven architectures, using Azure Data Factory, Azure Functions, webhooks, and incremental checkpointing to improve data freshness and reduce latency. I’ve also optimized PySpark and SQL transformations to shorten job execution time.

I also strengthen analytics readiness through curated datasets, analytical views, and dbt models, and I’ve delivered AI-ready proofs of concept such as Retrieval-Augmented Generation (RAG) with vector search. From troubleshooting pipeline failures to managing data security with Unity Catalog and RBAC, I prioritize dependable delivery for downstream reporting and AI use cases.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer

Paychex

Oct 2025 - Present (9 months)

Built and maintained Azure Data Factory/Databricks ETL pipelines using PySpark, ingesting Microsoft Graph and internal API data and improving data freshness from hourly to near real-time. Implemented event-driven ingestion with webhooks and Azure Functions, and added incremental checkpointing and Unity Catalog governance to reduce duplicates and improve pipeline reliability.

PySpark, Azure Databricks, Azure Data Factory, Azure Functions, Azure Key Vault, SQL, REST APIs, Unity Catalog

Data Engineer

Verizon

Oct 2024 - Sep 2025 (11 months)

Developed and managed ETL pipelines in Azure Databricks using PySpark and Scala with Delta Lake for storage, versioning, and schema evolution. Built configurable delivery pipelines for customer-facing data stores, automated SQL/ETL workflows with Apache Airflow, and standardized transformations with dbt in Snowflake while enforcing RBAC and monitoring.

PySpark, Scala, Azure Databricks, Delta Lake, dbt, Azure Key Vault, Snowflake, Apache Airflow, RBAC

Data Engineer

7-Eleven

Sep 2022 - Sep 2024 (2 years)

Owned end-to-end SDLC for data workflows, developing and optimizing Scala Spark jobs for cleansing, transformation, and batch processing to reduce processing time by 30%. Built real-time event processing with Apache Storm and Kafka, delivered star/snowflake dimensional models and Tableau/Power BI dashboards, and automated data quality monitoring with Python and Apache Airflow.

Scala, Apache Spark, Kafka, Apache Storm, HDFS, Dimensional Models, Tableau, Power BI, Apache Airflow, SQL

Data Engineer

Southwest Airlines

May 2020 - Aug 2022 (2 years 3 months)

Developed and optimized Python ETL pipelines using PySpark for extraction, transformation, and aggregation across multiple sources on AWS. Implemented Snowflake data models with stored procedures, built real-time and batch pipelines, created Tableau/Power BI dashboards, and used dimensional modeling to prepare data for machine learning.

Python, PySpark, Scala, AWS RDS, Snowflake, Tableau, Power BI, Dimensional Models, SQL, Predictive Analytics

Data Engineer

Merck Group

Nov 2019 - Apr 2020 (5 months)

Built HIPAA-compliant ETL pipelines for patient and claims data, handling extraction, mapping/normalization, validation, and reconciliation using SQL and Python. Implemented automated report-generation workflows with data masking, audit logging, and RBAC, and processed large-scale datasets with PySpark for reliable analytical reporting.