Overview
Mindrift is hiring for the role of AI Agent Evaluation Analyst - AI Trainer. We are looking for curious, intellectually proactive contributors who double-check assumptions and think through scenarios and edge cases. This is a flexible, project-based opportunity for those who enjoy examining how modern AI systems are tested and evaluated.
Mindrift connects domain experts with AI projects, powered by Toloka, to unlock the potential of GenAI through real-world expertise from across the globe.
What you'll do
- Review evaluation tasks and scenarios for logic, completeness, and realism
- Identify inconsistencies, missing assumptions, or unclear decision points
- Help define clear expected behaviors (gold standards) for AI agents
- Annotate cause–effect relationships, reasoning paths, and plausible alternatives
- Think through complex systems and policies to ensure agents are tested properly
- Work closely with QA, writers, or developers to suggest refinements or edge case coverage
Requirements
- Excellent analytical thinking: ability to reason about complex systems, scenarios, and implications
- Strong attention to detail: can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: can read JSON / YAML (not necessarily write)
- Capability to assess scenarios holistically: identify what's missing or unrealistic and what might break
- Good communication and clear writing in English to document findings
We also value
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, Olympiads (logic / math / informatics), or research
- Exposure to LLMs, prompt engineering, or AI-generated content
- Familiarity with QA or test-case thinking (edge cases, failure modes)
- Understanding of scoring / evaluation in agent testing (precision, coverage)
Benefits
- Get paid for your expertise, with rates that can go up to $20 / hour depending on skills and project needs
- Flexible, remote, freelance project that fits around your commitments
- Gain valuable experience on an advanced AI project to enhance your portfolio
- Influence how future AI models understand and communicate in your field
How to get started
Apply to this post, qualify, and contribute to a project aligned with your skills on your own schedule. Shape the future of AI while building tools that benefit everyone.
Seniority level
Internship
Employment type
Part-time
Job function
Other
Industries
IT Services and IT Consulting