Amit Singh Negi
@amitsinghnegi
Data engineer building Azure-based ETL/ELT pipelines and AI-ready RAG systems with PySpark, delivering faster, cleaner analytics.
What I'm looking for
I’m a Data Engineer with 3.5+ years’ experience building scalable data pipelines using Azure (Databricks, Data Factory, Fabric) and PySpark. I focus on ETL/ELT design, data modelling, and reliable big data processing.
In my recent work on Microsoft Fabric, I designed and optimized end-to-end workflows (ingestion, transformation, view creation, orchestration) processing 100 GB/day and reducing latency by 30% for hospital and clinical datasets. I’ve also built config-driven dynamic pipelines, cutting development effort by 40%, and implemented automated data quality checks to reduce inconsistencies by 50%.
Previously, I migrated an on-premises SQL project to Azure Data Factory, Databricks, and ADLS, using PySpark optimization and incremental processing to reduce refresh time by 30%. I also built automated data validation in Databricks that cut monthly-report error rates by 40%, and used Fabric Dataflows Gen2 and OneLake to centralize access and eliminate duplication.
I pair engineering rigor with modern AI work, including a Retrieval-Augmented Generation (RAG) system and a multi-agent AI research assistant built with LangChain Expression Language (LCEL). I keep these pipelines reliable with structured outputs, schema validation via Pydantic, and unit tests in pytest, while staying grounded in strong fundamentals such as Delta Lake, governance, and performance tuning.
Experience
Work history, roles, and key accomplishments
Data Engineering Consultant
Delphi Consultant
Designed and optimized end-to-end data workflows on Microsoft Fabric, processing 100 GB/day and reducing latency by 30% for hospital and clinical datasets. Built config-driven dynamic pipelines and automated data quality checks, reducing development effort by 40% and inconsistencies by 50%.
Data Engineer 1
MAQ Software
Led a team of 4 delivering Sales and Finance data engineering solutions, improving data-driven decision-making and business intelligence. Built reliable pipelines using Delta Lake (ACID, schema evolution, partitioning) and created Power BI dashboards that increased user engagement by 25%.
Associate Software Engineer
MAQ Software
Contributed to data quality initiatives by resolving 75% of detected inaccuracies and supported distributed pipeline development in Databricks using PySpark/Spark SQL. Integrated Power BI with Azure SQL, SharePoint, and Excel to automate cross-platform reporting, reducing manual effort by 40%, and achieved a 95% client satisfaction rate.
Data Engineer 2
MAQ Software
Migrated an on-prem SQL project to Azure Data Factory, Databricks, and ADLS, using PySpark optimization and incremental processing to reduce refresh time by 30%. Built a Databricks automated data validation framework in PySpark/Python, lowering monthly report error rates by 40%.
Education
Degrees, certifications, and relevant coursework
Lovely Professional University
Master of Computer Applications, Computer Science
2021 - 2023
Grade: 9
