Open to opportunities

Deepak Mangla

@deepakmangla

Director of Multimodal AI, shipping real-time multimodal avatar systems from research to production.

India

What I'm looking for

I’m looking to lead multimodal AI and computer vision teams, architecting low-latency, autoscalable production systems that turn research into shipped products, and partnering across engineering and research to deliver measurable user impact.

I’m an AI leader with 6+ years of experience building and shipping real-time multimodal AI systems from research to production. Currently, I direct the computer vision team at Alethia AI, where I architected a real-time AI avatar platform that combines custom lipsync models, emotive facial expressions, hand gesture synthesis, and low-latency streaming into a unified production system.

I’ve built autoscalable GPU infrastructure that reduced cloud costs by 20x while supporting 40+ concurrent agents and 500+ users, and I’ve taken cutting-edge research (GANs, diffusion models, motion transfer) from paper to deployed product. Previously, I led computer vision efforts for major NFT collections and founded multiple AI startups, including virtual try-on and real-time face-swap products, always with a focus on performance, reliability, and measurable impact.

Experience

Work history, roles, and key accomplishments

Current

Director of Multimodal AI

Alethia AI

Nov 2025 - Present (8 months)

Architected Alethia AI’s real-time AI avatar platform, combining custom lipsync, emotive facial expressions, hand gesture synthesis, and low-latency streaming into a unified production system. Built autoscalable GPU infrastructure that benchmarked 200+ GPUs, reduced cloud costs ~20x, and enabled low-cost OME + SRT/WebRTC livestreaming supporting 40+ agents and 500+ concurrent users.

Avatar Systems · Lipsync Modeling · Hand Gesture Synthesis · Low Latency Streaming (WebRTC, SRT)

Computer Vision Advisor

Alethia AI

May 2023 - Nov 2025 (2 years 6 months)

Architected the real-time AI avatar pipeline, including custom lipsync, emotive facial expressions, hand gesture synthesis, and seamless emotion transitions. Trained a custom lipsync model from scratch and reduced avatar latency from ~6s to <1s, while optimizing Flux image generation 3x across pipelines.

Real Time Avatar Pipeline · Custom Lipsync Training · VLM Based Data Filtering · Fault Tolerant Checkpointing · Optimization

Chief Executive Officer

Dressme

Jun 2023 - Aug 2024 (1 year 2 months)

Founded a virtual try-on startup using computer vision for real-time garment visualization and secured partnerships with BookMyShow and FanCode/Shibumi.AI.

Computer Vision · Partnership Development · Product Development

Lead Computer Vision Engineer

Alethia AI

Jan 2022 - Apr 2023 (1 year 3 months)

Led a computer vision team to automate NFT animation for major collections, processing ~200K animations and pioneering NFT dancing via keypoint-based human motion transfer (~12,000 unique dances). Deployed a generative art pipeline with multi-GPU support and fine-tuned diffusion models on LAION-5B while hiring and coordinating 10+ animators and engineers.

NFT Animation Automation · Keypoint Based Motion Transfer · Stable Diffusion · JAX · Multi GPU Training · Team Leadership

Computer Vision Engineer

Alethia AI

May 2021 - Dec 2021 (7 months)

Optimized a First Order Motion (FOM) model for real-time face animation using super-resolution upsampling, PixelShuffle, and segmentation masks. Built a full lipsync pipeline and added a VQGAN-CLIP wrapper API with segmentation-based color shift and transparency support.

First Order Motion (FOM) Models · Lipsync Pipeline · VQGAN CLIP Integration