Aryan Sinha
@aryansinha
Systems-focused MLOps engineer specializing in real-time video analytics and distributed inference.
What I'm looking for
I am a systems-focused MLOps engineer with deep experience designing and operating real-time video analytics and distributed inference platforms. I build low-latency gRPC microservices, orchestrate Kubernetes clusters for high-throughput GPU workloads, and optimize LLM stacks with vLLM and Triton.
At Neophyte Ambient Intelligence I deployed large-scale multimodal pipelines, scaling OCR and face-recognition systems across multi-GPU clusters and reducing onboarding time and GPU idle overhead while maintaining strict SLOs. I’ve engineered orchestration engines that handle millions of daily inferences and sustain high TPS for complex real-time aggregations.
I bring hands-on expertise in performance optimization (TensorRT, FP16), vector search (Milvus), stream ingestion (Kafka, RTSP), and observability (Prometheus, Grafana). I thrive solving latency, scaling, and reliability challenges for mission-critical computer vision and LLM applications.
Experience
Work history, roles, and key accomplishments
Founder / Engineer
Arison–X
Jan 2026 - Present (1 month)
Built an agentic financial research platform using MCP orchestration and LangGraph to enable LLMs to query live market data and validate research via Vector-RAG, reducing hallucinations through cross-referencing across 50+ 10-K filings.
Platform Architect
Argus
Nov 2025 - Present (3 months)
Architected a distributed computer vision Manager–Worker platform to autoscale CV pipelines across 50+ edge nodes, reducing idle compute costs by 25% and enabling programmatic pod provisioning via a FastAPI control plane interacting with Kubernetes.
MLOps Engineer
Neophyte Ambient Intelligence
Jul 2024 - Present (1 year 7 months)
Deployed and scaled multimodal inference pipelines across GPU clusters, reducing onboarding time by 40% and achieving mobile OCR latency of 1.2s using vLLM KV caching and a custom Kotlin SDK. Engineered a gRPC microservices mesh and fault-tolerant Kubernetes platform to sustain 500+ TPS and maintain 99%+ SLO compliance.
Distributed Video Search Engineer
Lumina
Aug 2025 - Oct 2025 (2 months)
Designed a high-throughput video indexing pipeline ingesting RTSP streams via Kafka and serving Qwen2-VL workers with vLLM, enabling sub-200ms semantic retrieval over 2M+ embeddings in Milvus.
MLOps Intern
Neophyte Ambient Intelligence
Feb 2024 - Jun 2024 (4 months)
Optimized real-time detection pipelines with TensorRT and FP16 quantization, increasing throughput by 10 FPS, and improved spatial query efficiency by 70% via MongoDB-based zone mapping for ReID systems.
Education
Degrees, certifications, and relevant coursework
IIIT Bhubaneswar
Bachelor of Technology, Electronics & Telecommunication
2020 - 2024
Completed a Bachelor of Technology in Electronics & Telecommunication with coursework and projects focused on systems, real-time video analytics, and distributed inference pipelines.
