AI Evaluation Engineer (Data Analysis & Multi-Agent Systems)

We are looking for an AI Evaluation Engineer specializing in data analysis to design benchmark tasks that simulate real-world analytical workflows. Responsibilities include designing and developing multi-agent benchmark tasks, creating realistic datasets, and implementing evaluation pipelines in Python and SQL.

Requirements

  • 5+ years of experience in data analysis or analytics-heavy roles
  • Strong proficiency in Python (pandas, NumPy) and SQL
  • Experience working with real-world, messy datasets (CSV, JSON, logs, reports)
  • Ability to design analytical problems with clear, verifiable answers
  • Solid understanding of statistics (distributions, correlations, outliers)
  • Familiarity with AI benchmarks or evaluation environments (e.g., SWE-bench or similar)
  • Hands-on experience with Docker (Dockerfiles, image builds, debugging)

About the job

Job type: Full Time

Hiring timezones: Ghana +/- 0 hours

About Gramian Consulting Group
