Hellen Waruru
@hellenwaruru
AI Data Trainer with 6 years evaluating LLMs and shaping RLHF responses with high-quality scoring.
What I'm looking for
I’m an experienced AI Data Trainer with 6 years in AI model evaluation and reinforcement learning, focused on improving performance, accuracy, and efficiency across real-world AI systems. I craft prompt-response pairs for reinforcement learning from human feedback (RLHF) pipelines and support fine-tuning and alignment of large language models.
In my roles, I generate, review, and evaluate AI model responses as a subject matter expert across STEM, creative writing, logic, and coding. I assess outputs for factual accuracy, coherence, safety, and instruction adherence using comprehensive scoring rubrics, driving consistent quality at scale.
At Outlier AI (via Scale AI), I maintained a 97%+ average quality score across more than 2,000 tasks with no critical violations, collaborating with project leads to identify and address systematic model errors. I also developed structured datasets for mathematics and reasoning to enhance AI tutoring systems, including detailed solution explanations and error pattern annotations.
With a PhD in Computer Science, plus training in NLP, machine learning, and prompt engineering for large language models, I bring a research-minded approach to evaluation and quality assurance. I’m motivated by building safer, more reliable AI through rigorous benchmarking, fact-checking, and continuous improvement.
Experience
Work history, roles, and key accomplishments
Generated, reviewed, and evaluated AI model responses across STEM, creative writing, logic, and coding, creating prompt-response pairs to support reinforcement learning from human feedback workflows. Maintained a 97%+ average quality score across 2,000+ tasks with no critical violations by using rubric-based scoring for factual accuracy, coherence, safety, and instruction adherence.
Robotic and LLM Eval
Open Train Ai
Jun 2022 - Sep 2025 (3 years 3 months)
Conducted comprehensive evaluations of robotics systems and large language models (LLMs) to assess performance, accuracy, and efficiency, supporting development of cutting-edge AI technologies.
AI Content Trainer (Math)
Mindrift / Toloka
Mar 2021 - Dec 2021 (9 months)
Developed structured mathematics and logical reasoning datasets with detailed solution explanations and error-pattern annotations to improve AI tutoring output accuracy. Rated 500+ peer-submitted samples monthly while maintaining a rejection rate below 2%.
Freelance Data Annotator
Appen / DataAnnotation.tech
Jun 2020 - Feb 2021 (8 months)
Annotated text, image, and audio data for NLP and computer vision model training, including named entity recognition, sentiment analysis, and intent classification. Completed 300+ tasks weekly while resolving annotation ambiguities and maintaining strict confidentiality and data-handling standards.
Education
Degrees, certifications, and relevant coursework
University of Washington
PhD, Computer Science and Mathematics
PhD in Computer Science and Mathematics from the University of Washington, completed in 2022.
University of Washington
Bachelor of Science, Computer Science and Cognitive Systems
Bachelor of Science in Computer Science and Cognitive Systems from the University of Washington, completed in May 2020.
