Pushpak Hinglaspure
@pushpakhinglaspure
I build GPU-optimized, production generative AI pipelines with multi-agent capabilities.
What I'm looking for
I’m an AI/ML Engineer specializing in GPU inference optimization and production-scale generative AI. I achieved an 85% inference speedup on H100 clusters and architected a zero-LangChain multi-agent video pipeline driven by a locally served 72B-parameter VLM.
In my current role, I cut compute by about 50% by adapting denoising-step inference and optimized inference serving with Triton, TensorRT, FlashAttention-3, and torch.compile, scaling GPU concurrency from 1 to 4 simultaneous generations. I’ve also led an end-to-end cloud migration (AWS → Azure) and built reliability-focused tooling such as TSX self-repair, plus a production observability dashboard tracking GPU, VRAM, and CPU load and PCIe bandwidth across H100 clusters.
Experience
Work history, roles, and key accomplishments
Senior AI/ML Engineer
Phenomenal AI Pvt. Ltd.
Aug 2025 - Present (9 months)
Optimized GPU inference for a proprietary 22B diffusion model, cutting latency from 2 min to 45 s on a single H100 (a 62.5% reduction) with no quality loss. Architected a zero-LangChain agentic video pipeline and delivered cross-cloud migrations from AWS to Azure, scaling concurrent GPU generations from 1 to 4.
Applied AI Founder
Persist Ventures
Aug 2024 - Sep 2025 (1 year 1 month)
Founded VidGenCraft, growing it to 1,000+ users with 75% engagement and a 95% cost reduction, and led social automation for 50+ accounts running 24/7. Built high-throughput content automation processing 21,000+ posts/month (500M+ views) on an async distributed AWS architecture, and delivered AI analytics and monitoring dashboards.
Education
Degrees, certifications, and relevant coursework
Jayawantrao Sawant College of Engineering, Pune
Bachelor of Technology, Computer Science & Engineering
2020 - 2024
Grade: 8.77/10
Completed a B.Tech in Computer Science & Engineering at Jayawantrao Sawant College of Engineering from 2020 to 2024. CGPA achieved: 8.77/10.
Tech stack
Software and tools used professionally
AWS IAM
Microsoft Azure
GitHub
Kubernetes
kubernetes-deploy
Azure Kubernetes Service
GitHub Actions
Azure Pipelines
AWS CodeDeploy
Node.js
Tailwind CSS
Google Analytics
Azure Stack
OpenCV
FFmpeg
Redis
Azure DevOps
React
JavaScript
Python
AWS Elastic Load Balancing ...
Azure Machine Learning
Flask
OpenTelemetry
Ubuntu
AWS Lambda
Azure Functions
GitHub Pages
Docker
NGINX
Uvicorn
NVIDIA Deep Learning AMI
Django REST framework JWT
Redis Cloud
Amazon Web Services (AWS)
Google Kubernetes Engine
Azure Blob Storage
LiteLLM
vLLM
NVIDIA Triton Inference Server
Stable Diffusion
OpenLLM
AnythingLLM
NVIDIA TensorRT-LLM
Google Cloud Deployment Manager
Lima (Linux machines)
Google Cloud Vertex AI Workbench
OpenRouter
NVIDIA NIM
LlamaCloud
WebGPU
