Open to opportunities

Allan Kipkemboi

@allankipkemboi

Data scientist and AI specialist improving model performance through data pipelines and RLHF systems.

Kenya

What I'm looking for

I want to deliver measurable AI improvements across enterprise NLP and CV use cases through strong data pipelines, human-in-the-loop systems, and reliable LLM evaluation. I value teams where quality, experimentation, and model performance translate into real outcomes.

I’m a data scientist and AI specialist with over five years of hands-on experience across the full AI data lifecycle. I focus on boosting model performance by building high-quality data pipelines, designing human-in-the-loop systems, and tuning RLHF setups.

At Mercor, I led the production deployment of an RLHF data pipeline that integrates 10,000+ human-in-the-loop signals into the training loop each week, reducing user-reported factual errors by 15%. I also standardized LLM evaluation and ranking, which improved human-alignment scores. To strengthen dataset quality, I annotated and validated multimodal data (text, image, audio), cutting annotation errors by 20%.

As a Data Developer at RWS Group, I architected and implemented a feature engineering layer in Python/Pandas that increased dialect classification accuracy by 9% across 10+ dialects. I’ve improved computer vision precision and recall with pixel-level semantic segmentation and bounding box annotation, and strengthened evaluation using pairwise comparison testing with metrics like accuracy and F1-score.

Earlier, as a Data Analyst at CloudFactory, I delivered data-driven insights using Python and SQL, improving reporting accuracy and turnaround time through validation workflows and data cleaning. In my annotation roles with Remotasks and Appen, I scaled training datasets for speech, sentiment/intent, object detection, and 3D point cloud labeling while improving label accuracy and consistency.

Experience

Work history, roles, and key accomplishments

RWS Group
Current

Data Developer

Sep 2025 - Present (7 months)

Implemented a Python/Pandas feature engineering layer that increased multilingual dialect classification accuracy by 9% across 10+ dialects. Improved computer vision precision/recall using semantic segmentation and bounding box annotation, and validated outputs with pairwise comparisons and accuracy/F1 metrics.

Remotasks

AI Data Annotator

Jun 2021 - Aug 2022 (1 year 2 months)

Scaled supervised learning training datasets by annotating thousands of image, video, and LiDAR data points. Improved speech recognition quality with multilingual audio transcription/segmentation and enhanced object detection performance using bounding boxes and 3D point cloud labeling.

Appen

AI Data Annotator Contributor

Jan 2020 - May 2021 (1 year 4 months)

Improved NLP training for sentiment analysis and intent classification through high-volume dataset annotation. Increased inter-annotator agreement using standardized labeling methodologies and supported large-scale AI data pipeline workflows across multiple platforms.

Education

Degrees, certifications, and relevant coursework

The Technical University of Kenya

Bachelor of Technology, Chemical Engineering

2022 -

Pursuing a Bachelor of Technology in Chemical Engineering at The Technical University of Kenya, Nairobi, with anticipated completion in June 2026.
