Aryan Sinha
@aryansinha
Systems-focused MLOps engineer specializing in real-time video analytics and distributed inference.
What I'm looking for
I am a systems-focused MLOps engineer with deep experience designing and operating real-time video analytics and distributed inference platforms. I build low-latency gRPC microservices, orchestrate Kubernetes clusters for high-throughput GPU workloads, and optimize LLM stacks with vLLM and Triton.
At Neophyte Ambient Intelligence I deployed large-scale multimodal pipelines, scaling OCR and face-recognition systems across multi-GPU clusters and reducing onboarding time and GPU idle overhead while maintaining strict SLOs. I've engineered orchestration engines that handle millions of daily inferences and sustain high TPS for complex real-time aggregations.
I bring hands-on expertise in performance optimization (TensorRT, FP16), vector search (Milvus), stream ingestion (Kafka, RTSP), and observability (Prometheus, Grafana). I thrive on solving latency, scaling, and reliability challenges for mission-critical computer vision and LLM applications.
Experience
Work history, roles, and key accomplishments
Founder / Engineer
Arison–X
Jan 2026 - Present (5 months)
Built an agentic financial research platform using MCP orchestration and LangGraph to enable LLMs to query live market data and validate research via Vector-RAG, reducing hallucinations through cross-referencing across 50+ 10-K filings.
Platform Architect
Argus
Nov 2025 - Present (7 months)
Architected a distributed computer vision Manager–Worker platform to autoscale CV pipelines across 50+ edge nodes, reducing idle compute costs by 25% and enabling programmatic pod provisioning via a FastAPI control plane interacting with Kubernetes.
MLOps Engineer
Neophyte Ambient Intelligence
Jul 2024 - Present (1 year 11 months)
Deployed and scaled multimodal inference pipelines across GPU clusters, reducing onboarding time by 40% and achieving mobile OCR latency of 1.2s using vLLM KV caching and a custom Kotlin SDK. Engineered a gRPC microservices mesh and fault-tolerant Kubernetes platform to sustain 500+ TPS and maintain 99%+ SLO compliance.
Distributed Video Search Engineer
Lumina
Aug 2025 - Oct 2025 (2 months)
Designed a high-throughput video indexing pipeline ingesting RTSP streams via Kafka and serving Qwen2-VL workers with vLLM, enabling sub-200ms semantic retrieval over 2M+ embeddings in Milvus.
MLOps Intern
Neophyte Ambient Intelligence
Feb 2024 - Jun 2024 (4 months)
Optimized real-time detection pipelines with TensorRT and FP16 quantization, increasing throughput by 10 FPS, and improved spatial query efficiency by 70% via MongoDB-based zone mapping for ReID systems.
Education
Degrees, certifications, and relevant coursework
IIIT Bhubaneswar
Bachelor of Technology, Electronics & Telecommunication
2020 - 2024
Completed a Bachelor of Technology in Electronics & Telecommunication with coursework and projects focused on systems, real-time video analytics, and distributed inference pipelines.
