Open to opportunities

Ermias Samson

@ermiassamson

AI inference platform engineer building scalable LLM systems with Python, Kubernetes, and AWS.

United States

What I'm looking for

I’m looking to build and operate production-grade LLM inference platforms—prioritizing low latency, high availability, observability, and reliable Kubernetes/AWS deployments—while expanding capabilities in RAG, agents, and real-time AI services.

I am a software engineer with 8+ years of experience specializing in AI systems, LLM-based applications, and scalable cloud infrastructure. I focus on building and operating production-grade inference systems that deliver high availability, low latency, and reliability.

I have designed and maintained LLM-powered platforms, including RAG systems, multi-agent workflows, and real-time AI services. In my Senior AI Engineer role, I built LLM-powered SaaS features with Python (FastAPI), deployed RAG pipelines integrating Pinecone with PostgreSQL/MongoDB, and delivered multi-turn conversational systems using LangChain and LangGraph.

I also prioritize operational excellence: I developed backend inference APIs with streaming responses and retry mechanisms, monitored system health using logs, metrics, and alerting, and improved system uptime to 99.8% while reducing latency and failure rates. I’ve led Kubernetes-based deployments on AWS (EKS), implemented observability practices, and handled production incidents through performance optimization and reliable fallback strategies.

Experience

Work history, roles, and key accomplishments

Senior AI Engineer

R-Works Ltd

Mar 2023 - Nov 2025 (2 years 8 months)

Designed and deployed LLM-powered SaaS features (FastAPI) including document understanding, summarization, and conversational workflows using RAG. Improved real-time service uptime to 99.8% by optimizing Pinecone-based retrieval, streaming inference APIs, and AWS EKS deployments with observability and incident handling.

Senior Full Stack Engineer

Gamba

Apr 2018 - Mar 2023 (4 years 11 months)

Built scalable SaaS backend services supporting real-time AI workflows and high-throughput ingestion using Python microservices (FastAPI/Django) and distributed task processing (Celery). Implemented Kubernetes-based deployments and observability, and delivered real-time WebSocket systems; achieved 95% test coverage using TDD with Pytest/Jest and React Testing Library.

Education

Degrees, certifications, and relevant coursework

University of Richmond

Bachelor of Computer Engineering

2012 - 2017

Earned a Bachelor of Computer Engineering at the University of Richmond from 2012 to 2017.
