Stephen Lyabandi
@stephenlyabandi
Multilingual AI data specialist with expertise in NLP and annotation.
What I'm looking for
I am a multilingual AI data specialist with several years of experience in Swahili-English language annotation, LLM prompt engineering, and AI content evaluation. My expertise lies in developing high-quality datasets for generative AI and conducting human-in-the-loop evaluations to enhance multilingual NLP models. I have a proven track record of collaborating with NLP engineers to improve annotation guidelines and ensure linguistic consistency through rigorous QA processes.
Throughout my career, I have worked on diverse projects, including building intent recognition datasets for chatbots, evaluating AI-generated content for fluency and accuracy, and curating large-scale Swahili corpora for machine learning applications. My technical skills include proficiency in annotation tools such as Labelbox and Prodigy, as well as basic knowledge of Python for data analysis. I am passionate about leveraging my linguistic skills and technical expertise to contribute to innovative AI solutions.
Experience
Work history, roles, and key accomplishments
Linguistic Data Associate
AIWorks
Jan 2023 - Mar 2024 (1 year 2 months)
Annotated Swahili-English language data to support LLM and NLU development. Generated and refined prompt-response pairs for generative AI training. Ensured linguistic consistency through QA checks using Labelbox and Prodigy.
AI Evaluation Specialist
DataCentrix AI
May 2023 - Nov 2023 (6 months)
Evaluated AI-generated Swahili outputs for fluency, accuracy, and cultural relevance. Conducted quality assessments using dynamic evaluation matrices. Delivered insights on linguistic bias and improved annotation processes using Airtable and Python scripts.
Knowledge Graph Annotator
AI Semantic Partners
Apr 2019 - Aug 2019 (4 months)
Tagged Swahili text entities and mapped them to structured knowledge bases (e.g., Wikidata). Conducted ontology alignment for entities like events, people, and locations. Used RDF/OWL tools for semantic labeling and categorization.
Swahili Data Annotator
DataCurio AI
Jun 2018 - Oct 2018 (4 months)
Created 5,000+ prompt-response pairs across history, sociology, and religion. Applied strict annotation protocols to ensure linguistic accuracy and cultural nuance. Participated in QA loops to optimize dataset quality for LLM fine-tuning.
NLP Data Analyst
LinguaTech Labs
Jul 2016 - May 2018 (1 year 10 months)
Curated and cleaned large-scale Swahili corpora for machine learning applications. Standardized entity recognition tags and dialectal variations. Contributed to a Swahili sentiment analysis model via custom dataset development.
Digitization & Archives Assistant
National Archives of Finland
Jan 2013 - Mar 2016 (3 years 2 months)
Applied metadata and annotation standards to digitized archival materials. Performed structured data QA and supported end users of historical databases.
Content Writer Trainee
Haaga-Helia University of Applied Sciences
Jan 2011 - Jun 2011 (5 months)
Authored question-answer content for e-learning platforms. Participated in QA reviews and content optimization with design and dev teams.
Education
Degrees, certifications, and relevant coursework
Haaga-Helia University of Applied Sciences
Bachelor of Business & Information Technology, Data & Analytics
2007 - 2012
Focused on Information Systems and Communication Technologies. Gained skills in data and analytics.
