Open to opportunities

Ermias Samson

@ermiassamson

AI inference platform engineer building scalable LLM systems with Python, Kubernetes, and AWS.

United States

What I'm looking for

I’m looking to build and operate production-grade LLM inference platforms, with a focus on low latency, high availability, observability, and reliable Kubernetes/AWS deployments, while expanding my work in RAG, agents, and real-time AI services.

I am a software engineer with 8+ years of experience specializing in AI systems, LLM-based applications, and scalable cloud infrastructure. I focus on building and operating production-grade inference systems that deliver high availability, low latency, and reliability.

I have designed and maintained LLM-powered platforms, including RAG systems, multi-agent workflows, and real-time AI services. In my Senior AI Engineer role, I built LLM-powered SaaS features with Python (FastAPI), deployed RAG pipelines integrating Pinecone with PostgreSQL/MongoDB, and delivered multi-turn conversational systems using LangChain and LangGraph.

I also prioritize operational excellence: I developed backend inference APIs with streaming responses and retry mechanisms, monitored system health using logs, metrics, and alerting, and improved system uptime to 99.8% while reducing latency and failure rates. I’ve led Kubernetes-based deployments on AWS (EKS), implemented observability practices, and handled production incidents through performance optimization and reliable fallback strategies.

Experience

Work history, roles, and key accomplishments

Senior AI Engineer

R-Works Ltd

Mar 2023 - Nov 2025 (2 years 8 months)

Designed and deployed LLM-powered SaaS features (FastAPI) including document understanding, summarization, and conversational workflows using RAG. Improved real-time service uptime to 99.8% by optimizing Pinecone-based retrieval, streaming inference APIs, and AWS EKS deployments with observability and incident handling.

Senior Full Stack Engineer

Gamba

Apr 2018 - Mar 2023 (4 years 11 months)

Built scalable SaaS backend services supporting real-time AI workflows and high-throughput ingestion using Python microservices (FastAPI/Django) and distributed task processing (Celery). Implemented Kubernetes-based deployments and observability, and delivered real-time WebSocket systems; achieved 95% test coverage using TDD with Pytest/Jest and React Testing Library.

Education

Degrees, certifications, and relevant coursework

University of Richmond

Bachelor of Computer Engineering

2012 - 2017

Earned a Bachelor of Computer Engineering at the University of Richmond from 2012 to 2017.
