Looking for a job

Amit Singh Negi

@amitsinghnegi

Data engineer building Azure-based ETL/ELT pipelines and AI-ready RAG systems with PySpark, delivering faster, cleaner analytics.

India

What I'm looking for

I’m looking for a role where I can own Azure-based data engineering end-to-end—ETL/ELT, modelling, quality, and performance—while building production-ready AI features like RAG with reliable validation, tests, and strong governance.

I’m a Data Engineer with 3.5+ years’ experience building scalable data pipelines using Azure (Databricks, Data Factory, Fabric) and PySpark. I focus on ETL/ELT design, data modelling, and reliable big data processing.

In my recent work on Microsoft Fabric, I designed and optimized end-to-end workflows (ingestion, transformation, view creation, orchestration) processing 100 GB/day and reducing latency by 30% for hospital and clinical datasets. I’ve also built config-driven dynamic pipelines, cutting development effort by 40%, and implemented automated data quality checks to reduce inconsistencies by 50%.

Previously, I migrated an on-premises SQL project to Azure Data Factory, Databricks, and ADLS, using PySpark optimization and incremental processing to reduce refresh time by 30%. I also built automated data validation in Databricks, cutting monthly-report error rates by 40%, and used Fabric Dataflows Gen2 and OneLake to centralize access and eliminate duplication.

I pair engineering rigor with modern AI work: I have built a Retrieval-Augmented Generation (RAG) system and a multi-agent AI research assistant using LangChain Expression Language (LCEL). I ensure pipeline reliability with structured outputs, schema validation via Pydantic, and unit tests with pytest, while staying grounded in strong fundamentals like Delta Lake, governance, and performance tuning.

Experience

Work history, roles, and key accomplishments

Associate Software Engineer

MAQ Software

Contributed to data quality initiatives by resolving 75% of detected inaccuracies and supported distributed pipeline development in Databricks using PySpark/Spark SQL. Integrated Power BI with Azure SQL, SharePoint, and Excel to automate cross-platform reporting, reducing manual effort by 40%, and achieved a 95% client satisfaction rate.

Education

Degrees, certifications, and relevant coursework

Lovely Professional University

Master of Computer Applications, Computer Science

2021 - 2023

Grade: 9
