Freelance Agent Evaluation Engineer at Mindrift

Location: Anywhere in the World
Headquarters: India
URL: https://mindrift.ai/

Please submit your CV in English and indicate your level of English proficiency.

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems.

Participation is project-based as an independent freelance contributor and does not create an employment relationship with Toloka or our clients.

What this opportunity involves

As an independent freelance contributor, you choose which projects to join and how you organize your work, as long as deliverables meet the project requirements and deadlines.

While each project involves unique tasks, contributors may:

- Create structured test cases that simulate complex human workflows
- Define gold-standard behavior and scoring logic to evaluate agent actions
- Analyze agent logs, failure modes, and decision paths
- Work with code repositories and test frameworks to validate their scenarios
- Iterate on prompts, instructions, and test cases to improve clarity and difficulty
- Ensure that scenarios are production-ready, easy to run, and reusable

What we look for

This opportunity is a good fit for software engineers who are open to part-time, non-permanent projects. Ideally, contributors will have:

- 3+ years of software development experience with a strong Python focus
- Experience with Git and code repositories
- Comfort with structured formats like JSON/YAML for scenario description
- An understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
- Familiarity with Docker
- English proficiency: B2

How it works

Apply → Pass qualification(s) → Select a project → Complete tasks on your own schedule within project timelines → Get paid

Project time expectations

For this project, tasks are estimated to require around 6-10 hours of work per week during active phases, depending on the tasks you choose to complete. This is an estimate only and does not create any minimum or guaranteed hours.

Payment

- Paid freelance contributions, with rates that may be equivalent to up to $80/hour* (project- and task-based)
- Fixed project rate or individual rates, depending on the project
- Some projects include incentive payments

*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.

To apply: https://weworkremotely.com/remote-jobs/mindrift-freelance-agent-evaluation-engineer