Prav Dhakal
@pravdhakal
Data Solutions Engineer building scalable Azure/Databricks pipelines for reliable analytics and AI.
What I'm looking for
I’m a Data Solutions Engineer with 6+ years of experience designing and supporting scalable data pipelines in Azure and Databricks. I build reliable ETL workflows focused on production stability, data quality, and system performance.
I work across real-time data ingestion, API integrations, and event-driven architectures, using Azure Data Factory, Azure Functions, webhooks, and incremental checkpointing to improve data freshness and reduce latency. I’ve also optimized PySpark and SQL transformations to cut job execution time.
I also strengthen analytics readiness through curated datasets, analytical views, and dbt models, and I’ve delivered AI-ready proofs of concept such as Retrieval-Augmented Generation (RAG) with vector search. From troubleshooting pipeline failures to managing data security with Unity Catalog and RBAC, I prioritize dependable delivery for downstream reporting and AI use cases.
Experience
Work history, roles, and key accomplishments
Built and maintained Azure Data Factory/Databricks ETL pipelines using PySpark, ingesting Microsoft Graph and internal API data to move freshness from hourly to near real-time. Implemented event-driven ingestion with webhooks and Azure Functions and added incremental checkpointing and Unity Catalog governance to reduce duplicates and improve pipeline reliability.
Developed and managed ETL pipelines in Azure Databricks using PySpark and Scala with Delta Lake for storage, versioning, and schema evolution. Built configurable delivery pipelines for customer-facing data stores, automated SQL/ETL workflows with Apache Airflow, and standardized transformations with dbt in Snowflake while enforcing RBAC and monitoring.
Data Engineer
7-Eleven
Sep 2022 - Sep 2024 (2 years)
Owned end-to-end SDLC for data workflows, developing and optimizing Scala Spark jobs for cleansing, transformation, and batch processing to reduce processing time by 30%. Built real-time event processing with Apache Storm and Kafka, delivered star/snowflake dimensional models and Tableau/Power BI dashboards, and automated data quality monitoring with Python and Apache Airflow.
Developed and optimized Python ETL pipelines using PySpark for extraction, transformation, and aggregation across multiple sources on AWS. Implemented Snowflake data models with stored procedures, built real-time and batch pipelines, created Tableau/Power BI dashboards, and prepared dimensionally modeled data for machine learning.
Data Engineer
Merck Group
Nov 2019 - Apr 2020 (5 months)
Built HIPAA-compliant ETL pipelines for patient and claims data, handling extraction, mapping/normalization, validation, and reconciliation using SQL and Python. Implemented automated report-generation workflows with data masking, audit logging, and RBAC, and processed large-scale datasets with PySpark for reliable analytical reporting.
