Open to opportunities

shaheer beig

@shaheerbeig

I build production ETL/ELT and LLM-integrated analytics systems on Azure/AWS/Databricks.

Pakistan

What I'm looking for

I’m looking for a role building reliable data infrastructure (ETL/ELT, streaming ingestion, and governed lakehouses) where I can apply LLM/RAG systems to ship measurable, production-ready analytics.

I’m a Data Engineer focused on the intersection of data infrastructure and applied AI. At Disrupt.com, I build production ETL/ELT pipelines, event-driven ingestion, and LLM-integrated analytics on Azure, AWS, and Databricks.

I engineered a Python ETL pipeline that parses Recurly webhook XML into sanitized, normalized JSON, and I migrated production data from MySQL to PostgreSQL with schema reconciliation for 100% data parity and zero downtime. I designed Airflow DAGs with idempotent upserts and Slack-integrated anomaly alerting, and I rewrote 4+ critical SQL queries to reduce latency to milliseconds while improving resource efficiency by 45%.

I also deliver AI-ready systems, including a Retrieval-Augmented Generation system using Agno with OpenAI embeddings for semantic search, plus multi-step LLM workflows with structured evaluation using LangChain and LangGraph. Alongside that, I’ve built real-time ingestion (RabbitMQ/Docker into Power BI) and lakehouse projects with Medallion Architecture, governance (Unity Catalog), lineage, and data contracts to prevent silent analytics failures.

Experience

Work history, roles, and key accomplishments

Current

Data Engineer Trainee

Disrupt.com

Oct 2025 - Present (8 months)

Built a Python ETL pipeline converting Recurly webhook XML into a normalized JSON single source of truth. Migrated MySQL to PostgreSQL with 100% data parity and zero downtime, and improved pipeline performance by rewriting 4+ queries to millisecond latency with 45% better resource efficiency.


AI Engineer Intern

SiRiiL

May 2025 - Sep 2025 (4 months)

Integrated production LLM inference into a Django + MySQL backend using structured request/response schemas. Built multi-step agentic workflows with LangChain/LangGraph, and used Celery asynchronous processing to improve throughput while maintaining reliable end-to-end state handoff across the stack.

Education

Degrees, certifications, and relevant coursework


FAST-NUCES (National University of Computer and Emerging Sciences)

Bachelor of Science in Computer Science

2022 - 2026

Pursuing a Bachelor of Science in Computer Science, developing foundations in data engineering and applied AI.
