Aarya Desai
@aaryadesai
AI Evaluator and Business Analyst. I score LLM outputs, run adversarial tests, and turn ambiguous briefs into decisions teams can act on.
What I'm looking for
I want to work on AI systems where quality and judgment directly shape the product. Evaluation, data curation, business analysis, or any role sitting at the intersection of AI and structured decision-making. Remote-first, with real ownership of the work.
I find the failures that look like correct answers.
Most of my work starts the same way: something unclear that needs to become a decision someone can defend. On the AI side, that means evaluating LLM and multimodal outputs against structured rubrics, running adversarial and red-teaming tests, and mapping failure modes (hallucination, instruction drift, constraint leakage, contradiction) into severity-graded reports engineering can act on. At Turing I was doing 150+ of these a week for a global technology client. At Mercor I rebuilt a large malformed dataset in Python so the evaluation pipeline could actually run.
On the business side, I've turned vague briefs into structured Scopes of Work across 5+ industries, owned QA on 20+ monthly deliverables, and built competitive positioning reports that fed directly into quarterly strategy. Requirements analysis, stakeholder alignment, and making sure the output is not just correct but usable.
B.Tech (Computer Science) with AI specialisation, NMIMS. Python, Power BI, statistical analysis, technical writing. Native Gujarati and Hindi, fluent English. Open to AI evaluation, LLM quality, and BA roles in AI-first teams.
Experience
Work history, roles, and key accomplishments
Evaluated AI model outputs across code, image, and text modalities within RLHF-aligned workflows. Rebuilt a large-scale malformed dataset in Python, resolving data integrity issues blocking downstream evaluation. Assessed data science task solutions for statistical reasoning, methodology, and correctness.
Business Analyst
Mountain Monk Consulting
Jul 2025 - Dec 2025 (5 months)
Led client engagements across 5+ industries, translating ambiguous problems into structured Scopes of Work and analytical frameworks.
Owned QA on 20+ monthly deliverables, auditing for logical coherence, semantic accuracy, and alignment to client objectives.
Market Researcher
Aum Electric Engineering Pvt. Ltd.
Sep 2024 - Jul 2025 (10 months)
Documented requirements from 12+ clients across industrial and infrastructure verticals, translating technical briefs into operational frameworks. Built a competitive positioning report that directly informed quarterly strategy.
Scriptwriter
MadLads Studios
Aug 2023 - Apr 2025 (1 year 8 months)
Researched audience data and scripted 25+ digital campaigns, increasing listener retention by 18% and improving content engagement by 20% through data-informed creative briefs.
Web Developer Intern
Easocare
Jul 2023 - Dec 2023 (5 months)
Collaborated with cross-functional teams to enhance user journeys and optimize UX, improving site load time by 25% and integrating user behavior insights into design decisions.
Business Analyst (Intern)
Avinashi Group of Companies
Dec 2021 - Jul 2022 (7 months)
Gathered business requirements and translated them into technical specifications and PRDs, improving project planning accuracy by 12% through structured reporting.
Education
Degrees, certifications, and relevant coursework
NMIMS University (MPSTME)
Bachelor of Technology, Computer Science (Artificial Intelligence)
2020 - 2024
CGPA: 3.0/4.0
Activities and societies: Projects: Driver Drowsiness Detection (OpenCV, Keras); ML-based Sign Language Model (TensorFlow, Flask).
Completed a Bachelor of Technology in Computer Science (Artificial Intelligence) with a CGPA of 3.0/4.0, focusing on AI, machine learning, and practical projects in computer vision and sign-language translation.
