Shadrack Mwangi
@shadrackmwangi
HIPAA-certified AI training data specialist for Medical NLP, Translation QA, and RLHF data.
What I'm looking for
I’m a HIPAA-certified AI training data specialist with demonstrated expertise in Medical NLP annotation, Gikuyu–Swahili–English machine translation quality assurance, and RLHF/RLAIF preference data collection. I apply the MQM error taxonomy and Cohen’s Kappa inter-annotator agreement checks to deliver production-quality labelled datasets.
I’ve annotated 50 clinical sentences using a 13-label NER schema in Label Studio, developing guidelines for span boundaries, negation handling, clinical abbreviations, and edge cases. In translation QA, I’ve post-edited 30 Swahili–English MT sentence pairs across 8 domains and used MQM taxonomy to compute penalty scores and quality verdicts (0.08–0.78).
For AI alignment training, I completed a 20-item RLHF preference ranking test set and achieved a mean kappa of 0.82 across simulated annotation batches, above the 0.70 production threshold. I also built an IAA tracking template for scaling annotation teams, and I’ve mastered ICD-10-CM coding and MIMIC-III workflows through self-directed study alongside my HIPAA certification.
Experience
Work history, roles, and key accomplishments
Medical AI & HIPAA Study
Jan 2025 - Jan 2026 (1 year)
Self-directed study for medical NLP and clinical data workflows, mastering ICD-10-CM coding structure, sequencing rules, and specificity/combo-code guidelines while reviewing MIMIC-III dataset tables and ICD coding benchmarks. Completed HIPAA certification covering PHI identifiers, Safe Harbor/Expert Determination de-identification, and ePHI safeguards.
Swahili–English Translation QA
Swahili–English Translation QA Project
Jan 2026 - Present (5 months)
Post-edited 30 Swahili–English MT sentence pairs across 8 domains and performed MQM-based translation quality assessment. Applied the MQM error taxonomy to score segments, producing quality verdicts ranging from 0.08 (Excellent) to 0.78 (Needs Revision), and ran a correlation analysis of BLEU/METEOR/TER scores against the human MQM annotations.
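As an illustration of the scoring step, here is a minimal sketch of a per-word MQM penalty in Python. The severity weights (minor 1, major 5, critical 10), the per-word normalization, and the names `SEVERITY_WEIGHTS` and `mqm_penalty` are assumptions made for this example; they are not necessarily the weights or formula used in the project, and the verdict cut-offs are left out because they are not stated here.

```python
# Assumed severity weights; real MQM projects configure their own.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}


def mqm_penalty(errors, word_count):
    """Return a per-word penalty score for one segment (lower is better).

    errors: list of (category, severity) tuples from the annotator.
    word_count: number of words in the evaluated segment.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    total = sum(SEVERITY_WEIGHTS[severity] for _category, severity in errors)
    return total / word_count


# Hypothetical segment: one major accuracy error and one minor fluency error
# in a 20-word translation.
segment_errors = [
    ("accuracy/mistranslation", "major"),
    ("fluency/grammar", "minor"),
]
score = mqm_penalty(segment_errors, word_count=20)
print(f"penalty per word: {score:.2f}")
```

Normalizing by word count keeps scores comparable across segments of different lengths, which matters when the sentence pairs come from several domains.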
Medical NER Annotation
AI Training Data Project
Jan 2026 - Present (5 months)
Annotated 50 clinical sentences spanning cardiology, oncology, nephrology, neurology, and infectious disease using a 13-label NER schema in Label Studio. Built annotation guidelines for span boundaries, negation, abbreviations, and edge cases, and improved consistency via self-review and answer-key cross-validation.
RLHF Response Ranking
RLHF Response Ranking Project
Jan 2026 - Present (5 months)
Completed a 20-item RLHF preference ranking test set covering Helpfulness, Harmlessness, Honesty, Instruction Following, and Formatting. Applied HHH-style decision categories with tie-breaking rules and computed Cohen’s Kappa across a 5-annotator simulation (mean kappa 0.82).
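Cohen's Kappa is defined for two annotators, so one common way to report a single figure for a larger group is to average the kappa of every annotator pair. The sketch below shows that approach in plain Python; whether the 0.82 figure above was computed exactly this way is an assumption, and the function names and sample labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' labels over the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(a)
    # Observed agreement: share of items where both labels match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label]
                   for label in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over every pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical preference labels ("A" or "B" preferred) for four items.
batch = {
    "annotator_1": ["A", "A", "B", "B"],
    "annotator_2": ["A", "A", "B", "B"],
    "annotator_3": ["A", "A", "B", "A"],
}
print(round(mean_pairwise_kappa(batch), 3))
```

For true multi-rater agreement, Fleiss' kappa or Krippendorff's alpha are alternatives; mean pairwise kappa is shown here only because it stays closest to the Cohen's Kappa figure quoted above.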
Education
Degrees, certifications, and relevant coursework
Self-directed research
Self-directed research, Medical NLP & Health Data (HIPAA, ICD-10, MIMIC-III)
2025 - 2026
Activities and societies: Studied the MIMIC-III NOTEEVENTS table, ICD chapters, classification, and sequencing, de-identification (Safe Harbor/Expert Determination), and the medical NLP landscape.
Completed a self-directed study of ICD-10-CM structure and MIMIC-III clinical workflows, including ICD coding benchmarks and Python/SQL access. Covered HIPAA de-identification safeguards and medical NLP task areas such as clinical NER and de-identification.
AI Annotation Specialist training
Project-based practical training, AI Training Data Annotation (NER, RLHF, Translation QA)
2025 - 2026
Activities and societies: Applied MQM error taxonomy and annotation QA methods; practiced RLHF/RLAIF preference ranking and response comparison.
Completed project-based practical training in AI annotation covering NER, RLHF preference data collection, MQM translation quality assessment, and related inter-annotator agreement practices.
