Open to opportunities

Wycliffe Rotich

@wyclifferotich

Software Engineer and AI Training Specialist optimizing LLM code evaluations.

Kenya

What I'm looking for

I’m looking for a role where I engineer reliable evaluation and training pipelines for frontier LLMs: building SWE-bench-style tasks, automated tests, and RLHF/SFT workflows that improve code accuracy and reduce technical hallucinations.

I’m a performance-driven Software Engineer and AI Training Specialist with 2+ years of experience authoring complex codebase evaluation tasks, debugging private repositories, and executing test-driven development (TDD) pipelines across global platforms including Mercor, Turing, and Micro1. I specialise in reading large, unfamiliar codebases and constructing SWE-bench-style evaluations to stress-test frontier LLMs, with a strong focus on reducing technical hallucinations in model output through RLHF workflows.

At Mercor, I’ve authored and evaluated 100+ advanced algorithmic coding tasks and backend scripts to stress-test LLM reasoning, code generation, and debugging capabilities, creating robust ground truth datasets used in SWE-bench-style model evaluations. I’ve also constructed programmatic SFT data pairs mapping expected multi-file code modifications and runtime execution outputs to improve patch accuracy, and I benchmark model outputs against structural programming paradigms, API standards, and execution-safety requirements.

Previously, at Turing, I delivered deep technical code reviews and fact-checking validations, designed benchmark-adopted edge-case backend prompts, and built automated test scripts that reduced manual review time by an estimated 40%. At Micro1, I generated high-quality prompt-response pairs, ran 200+ evaluation sessions, and identified and rectified logical code hallucinations to improve RLHF loop accuracy.

Beyond evaluation and training workflows, I contribute to open-source projects such as NeuroMesh AI Control Plane, demonstrating hands-on ability to design and implement production-grade AI systems spanning distributed systems, model orchestration, and infrastructure engineering.

Experience

Work history, roles, and key accomplishments

Current

LLM Technical Trainer

Mercor

Jun 2024 - Present (2 years)

Authored and evaluated 100+ advanced coding tasks and backend scripts to stress-test frontier LLM reasoning, code generation, and debugging. Built SWE-bench-style ground truth datasets and programmatic SFT data pairs to improve multi-file patch accuracy and provide structured alignment feedback.

AI Content Evaluator

Turing

Nov 2023 - Jun 2024 (7 months)

Performed deep technical code reviews and fact-checking for AI-generated software patches across Python and SQL. Developed automated, isolated execution tests that reduced manual review time by an estimated 40%, and created benchmark prompts and rubric-based rankings to improve output quality and safety compliance.

AI Training Specialist

Micro1

Apr 2023 - Nov 2023 (7 months)

Created high-quality prompt-response pairs for complex mathematics, structural engineering algorithms, and database query execution to improve training data quality. Ran 200+ evaluation sessions to rank model outputs and debug logical code hallucinations, improving RLHF loop accuracy and reducing hallucinations in targeted domains.

Education

Degrees, certifications, and relevant coursework

The Technical University of Kenya

Bachelor of Software Engineering, Software Engineering

Grade: Second Class Honours (Upper Division)

Earned a Bachelor of Software Engineering from The Technical University of Kenya, graduating on 30 November 2024 with Second Class Honours (Upper Division).
