Stephen Bahr
@stephenbahr2
Senior software engineer building large-scale AI and backend infrastructure.
What I'm looking for
I’m a Senior Software Engineer with 11+ years architecting large-scale AI platforms, distributed systems, and backend infrastructure. I specialize in building production services that deliver low-latency, reliable outcomes—often for mission-critical healthcare workflows.
In my most recent role, I architected an end-to-end LLM inference platform using Python FastAPI and Go microservices, delivering patient engagement systems with 97ms p95 latency at 2.1M daily API requests. I led the design of event-driven AI response streaming, reducing WebSocket connection failures by 73% for 50K concurrent healthcare users.
I also drove RAG pipeline infrastructure using Pinecone and LangChain, improving semantic retrieval accuracy by 34% through fine-tuned embeddings and multi-agent orchestration. My approach emphasizes observability and fast detection—implementing Prometheus, Grafana, and Datadog to reduce mean time to detection by 58 minutes across 10 microservices.
Across product, research, and security teams, I’ve focused on technical leadership and compliant delivery—especially in HIPAA-aligned systems. I build “golden path” CI/CD using Jenkins, Terraform, and Docker on AWS EKS for zero-downtime deployments, and I mentor teams to elevate system design for interoperability at scale.
Experience
Work history, roles, and key accomplishments
Senior Full Stack Engineer
SoftServe
Mar 2022 - Apr 2026 (4 years 1 month)
Architected an end-to-end LLM inference platform (FastAPI + Go microservices) processing 2.1M daily API requests at 97ms p95 latency for patient engagement. Built Kafka/Redis event streaming and RAG infrastructure (Pinecone, LangChain) and delivered HIPAA-focused CI/CD on AWS EKS, cutting WebSocket failures 73% and improving semantic retrieval accuracy 34%.
Built cloud-native ML and telehealth backends on AWS (SageMaker, Lambda, Kubernetes), scaling from 8K to 20K virtual visits/month and reducing patient readmission rates 17% via real-time FHIR ingestion. Re-engineered APIs with Node.js, MongoDB, and Redis caching to cut authentication latency from 680ms to 140ms while supporting SOC 2 and SMART on FHIR integrations (Epic/Cerner).
Delivered real-time healthcare analytics stream processing with Apache Kafka and Python (520K events/hour) using exactly-once semantics. Built observability (Prometheus, Grafana, Datadog) and Kubernetes/Helm + Terraform deployment workflows, reducing incident response time 41% and maintaining 99.7% uptime for clinical data exchange pipelines.
Education
Degrees, certifications, and relevant coursework
Boston University
Bachelor of Science, Computer Science
Completed in 2015.
