Open to opportunities

Neeraj Vardhan

@neerajvardhan

AI Engineer building agentic AI systems and multimodal RAG pipelines with low-latency LLM infrastructure.

Zimbabwe
What I'm looking for

I’m looking for a role where I can build production-grade agentic and multimodal RAG systems, with a focus on scalable LLM infrastructure, low-latency inference, and rigorous evaluation and benchmarking that show measurable gains in AI workflow performance.

I’m an AI Engineer specializing in agentic AI systems, multimodal RAG pipelines, and scalable LLM infrastructure. At NTT DATA, I architected and optimized production-grade multimodal RAG and agentic AI workflows integrating NVIDIA Nemotron and NeMo Retriever for enterprise use cases.

I designed low-latency LLM serving infrastructure using SGLang, vLLM, and NVIDIA Triton Inference Server to improve GPU utilization. I also built custom evaluation and benchmarking frameworks that measure token throughput, time to first token (TTFT), and inter-token latency, along with graph-based explainability tools that visualize retrieval relationships across text, image, audio, and video.

Earlier, at CHRP Technologies, I built ML and computer vision systems for object detection, image classification, and multimodal extraction. I trained and optimized LLMs and LVMs for information extraction and synthetic data generation, and delivered Dockerized deployments on Azure Kubernetes.

Experience

Work history, roles, and key accomplishments

NTT DATA
Current

Digital Engineering Senior Engineer

Jun 2025 - Present (1 year)

Architected and optimized production-grade multimodal RAG and agentic AI workflows integrating NVIDIA Nemotron and NeMo Retriever for scalable enterprise AI applications. Designed low-latency LLM serving with SGLang, vLLM, and NVIDIA Triton, and built custom evaluation and benchmarking frameworks based on TTFT and inter-token latency. Developed LuminarOCR with explainability graphs, deployed via Docker and Kubernetes.

Machine Learning Engineer

CHRP Technologies

Jan 2023 - May 2025 (2 years 4 months)

Designed and deployed ML and computer vision models for object detection, image classification, and multimodal data extraction workflows. Trained and optimized LLMs and LVMs for information extraction and synthetic data generation. Built end-to-end data ingestion and AI workflow pipelines with REST APIs and streaming systems, deployed as Docker containers on Azure Kubernetes.

Machine Learning Engineer Intern

CHRP Technologies

Oct 2022 - Dec 2022 (2 months)

Supported ML model development and Azure-based deployment workflows using TensorFlow and PyTorch.

Education

Degrees, certifications, and relevant coursework

Sri Sathya Sai Institute of Higher Learning

Master of Science, Data Science and Computing

2020 - 2022

Master’s program in Data Science and Computing at Sri Sathya Sai Institute of Higher Learning from September 2020 to May 2022.

Sri Sathya Sai Institute of Higher Learning

Bachelor of Computer Applications, Computer Applications

2017 - 2020

Bachelor of Computer Applications at Sri Sathya Sai Institute of Higher Learning from June 2017 to April 2020.
