Wycliffe Rotich
@wyclifferotich
Software Engineer and AI Training Specialist optimizing LLM code evaluations.
What I'm looking for
I’m a performance-driven Software Engineer and AI Training Specialist with 2+ years of experience authoring complex codebase evaluation tasks, debugging private repositories, and running test-driven development (TDD) pipelines on global platforms including Mercor, Turing, and Micro1. I specialise in reading large, unfamiliar codebases and constructing SWE-bench-style evaluations to stress-test frontier LLMs, with a strong focus on eliminating technical hallucinations in model output through RLHF workflows.
At Mercor, I’ve authored and evaluated 100+ advanced algorithmic coding tasks and backend scripts to stress-test LLM reasoning, code generation, and debugging capabilities, creating robust ground truth datasets used in SWE-bench-style model evaluations. I’ve also constructed programmatic SFT data pairs mapping expected multi-file code modifications and runtime execution outputs to improve patch accuracy, and I benchmark model outputs against structural programming paradigms, API standards, and execution safety compliance requirements. At Turing, I delivered deep technical code reviews and fact-checking validations, designed edge-case backend prompts that were adopted as benchmarks, and built automated test scripts that reduced manual review time by an estimated 40%. At Micro1, I generated high-quality prompt-response pairs, ran 200+ evaluation sessions, and identified and corrected logical code hallucinations to improve RLHF loop accuracy.
Beyond evaluation and training workflows, I contribute to open-source projects such as NeuroMesh AI Control Plane, demonstrating hands-on ability to design and implement production-grade AI systems spanning distributed systems, model orchestration, and infrastructure engineering.
Experience
Work history, roles, and key accomplishments
LLM Technical Trainer
Mercor
Jun 2024 - Present (2 years)
Authored and evaluated 100+ advanced coding tasks and backend scripts to stress-test frontier LLM reasoning, code generation, and debugging. Built SWE-bench-style ground truth datasets and programmatic SFT data pairs to improve multi-file patch accuracy and provide structured alignment feedback.
AI Content Evaluator
Turing
Nov 2023 - Jun 2024 (7 months)
Performed deep technical code reviews and fact-checking for AI-generated software patches across Python and SQL. Developed automated isolated execution tests, reducing manual review time by an estimated 40%, and created benchmark prompts and rubric-based rankings to improve output quality and safety compliance.
AI Training Specialist
Micro1
Apr 2023 - Nov 2023 (7 months)
Created high-quality prompt-response pairs for complex mathematics, structural engineering algorithms, and database query execution to improve training data quality. Ran 200+ evaluation sessions to rank model outputs and debug logical code hallucinations, improving RLHF loop accuracy and reducing hallucinations in targeted domains.
Education
Degrees, certifications, and relevant coursework
The Technical University of Kenya
Bachelor of Software Engineering, Software Engineering
Grade: Second Class Honours (Upper Division)
Earned a Bachelor of Software Engineering from The Technical University of Kenya, graduating on 30 November 2024 with Second Class Honours (Upper Division).
