Open to opportunities

Pranava Mittal

@pranavamittal

I’m an AI trainer and evaluator using RLHF/RLAIF to improve instruction-following, safety, and response quality.

India

What I'm looking for

I’m looking for AI evaluation/training work where I can stress-test models, write structured rationales, and improve instruction-following, safety, and multilingual quality with clear rubrics and measurable impact.

I’m an AI trainer and evaluator with concurrent experience across three RLHF platforms, contributing across 20+ task types including data annotation, audio evaluation, image analysis, reasoning, and creative writing. I focus on model alignment by detecting subtle issues like hallucinations, unsafe outputs, and instruction drift that automated systems can miss.

At Outlier.ai, I contributed to the Aether project with consistently top-rated task acceptance, writing detailed comparative rationales for model response ranking and helping feed preference data into RLHF and RLAIF training pipelines. I also identified recurring failure patterns and refined annotation guidelines to improve consistency and rubric alignment.

In parallel at AfterQuery Experts and Welocalize, I evaluate frontier model outputs for reasoning quality, instruction-following, safety compliance, and hallucination rate, while also assessing multimodal and bilingual (English/Hindi) audio and text for naturalness, fluency, and cultural fit.

Experience

Work history, roles, and key accomplishments

Welocalize
Current

Audio & Localization Evaluator

May 2026 - Present (1 month)

Evaluates AI-generated text, audio, and personalization outputs in English and Hindi for naturalness, fluency, and cultural fit. Rates map personalization and search relevance for intent alignment and flags localization errors that impact product quality.

Outlier.ai
Current

AI Trainer & Evaluator

Mar 2026 - Present (3 months)

Contributed across 20+ RLHF task types (data annotation, audio evaluation, image analysis, reasoning, and creative writing), consistently ranking among top-rated contributors by task acceptance. Wrote comparative response rationales and fed preference data into RLHF/RLAIF training pipelines while identifying recurring failure patterns (hallucinations, unsafe outputs, instruction drift).

Education

Degrees, certifications, and relevant coursework

Chitkara University

Bachelor of Technology (B.Tech), Computer Science

Grade: GPA 9.5 / 10 (Sem 1)

Activities and societies: AI trainer/evaluator contributions spanning data annotation, audio evaluation, image analysis, reasoning, and creative writing; focuses on detecting hallucinations, unsafe outputs, and instruction drift.

Pursuing a B.Tech in Computer Science at Chitkara University while contributing to AI model alignment through RLHF evaluation and data annotation across 20+ task types.
