
AI Evaluation – Safety Specialist at Mercor

Location: USA, UK, Canada

Role Description

At Mercor, we believe the foundation of AI safety is high-quality human data. Models can't evaluate themselves; they need humans who can apply structured judgment to complex, nuanced outputs.

We're building a flexible pod of Safety Specialists: contributors from both technical and non-technical backgrounds who will serve as expert data annotators. This pod will annotate and evaluate AI behaviors to ensure the systems are safe.

No prior annotation experience is required. Instead, we're looking for people who can make careful, consistent decisions in ambiguous situations.

This role may include reviewing AI outputs that touch on sensitive topics such as bias, misinformation, or harmful behaviors. All work is text-based, and participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.

Qualifications

- You bring experience in model evaluation, structured annotation, or applied research.
- You are skilled at spotting biases, inconsistencies, or subtle unsafe behaviors that automated systems may miss.
- You can explain and defend your reasoning with clarity.
- You thrive in a fast-moving, experimental environment where evaluation methods evolve quickly.
- Examples of past titles: Machine Learning Research Assistant, AI Evaluator, Data Scientist, Applied Scientist, Research Engineer, AI Safety Fellow, Annotation Specialist, Data Labeling Analyst, AI Ethics Researcher.

Requirements

- Produce high-quality human data by annotating AI outputs against safety criteria (e.g., bias, misinformation, disallowed content, unsafe reasoning).
- Apply harm taxonomies and guidelines consistently, even when tasks are ambiguous.
- Document your reasoning to help improve guidelines.
- Collaborate to provide the human data that powers AI safety research, model improvements, and risk audits.

Benefits

- Work at the frontier of AI safety, providing the human data that shapes how advanced systems behave.
- Gain experience in a rapidly growing field with direct impact on how labs deploy frontier AI responsibly.
- Be part of a team committed to making AI systems safer, more trustworthy, and aligned with human values.

Company Description

Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations. Our investors include Benchmark, General Catalyst, Adam D'Angelo, Larry Summers, and Jack Dorsey. Thousands of professionals across law, engineering, research, and creative fields collaborate with Mercor on frontier AI projects shaping the future.

The pay rate for this role may vary by project, customer, and content category. Compensation will be aligned with the level of expertise required, the sensitivity of the material, and the scope of work for each engagement.