Engineering Expert (PhD) - AI Systems Evaluation at Weekday AI

Source: https://jobs.workable.com/view/g5qXJQZBuCE5totHDyV9zz/remote-engineering-expert-(phd)---ai-systems-evaluation-in-united-states-at-weekday-ai

This role is for one of our clients.

Compensation: $73.29 per hour

PhD-level engineers are sought to support high-impact collaborations with advanced AI research teams. This role focuses on improving the accuracy, rigor, and reliability of general-purpose conversational AI systems, particularly in engineering-related contexts.

AI systems used in professional engineering scenarios must demonstrate strong applied reasoning, quantitative accuracy, and alignment with real-world systems. This project centers on evaluating and enhancing how models interpret, reason about, and explain engineering concepts across multiple disciplines.

Key Responsibilities

- Develop and refine prompts to guide AI behavior in engineering-specific scenarios
- Evaluate model-generated responses for technical correctness, applied reasoning, completeness, and practical relevance
- Fact-check technical claims using authoritative public sources and domain expertise
- Annotate outputs by identifying conceptual gaps, flawed assumptions, and factual inaccuracies
- Assess clarity, structure, and appropriateness of explanations for various audiences
- Ensure responses align with expected conversational standards and system-level guidelines
- Apply structured evaluation frameworks, taxonomies, and benchmarking standards consistently

Required Qualifications

- PhD in Engineering or a closely related field
- Deep expertise in one or more of the following domains:
  - Mechanical & Physical Systems Engineering
  - Electrical, Electronic & Computer Engineering
  - Chemical, Materials & Process Engineering
  - Civil, Environmental & Infrastructure Engineering
- Strong familiarity with large language models (LLMs) and their practical applications
- Excellent written communication skills with the ability to clearly explain complex technical concepts
- High attention to detail and ability to detect subtle technical inaccuracies
- Experience reviewing, editing, or critiquing technical or academic writing

Preferred Experience

- Applied research, industry engineering workflows, or systems design
- Experience with reinforcement learning from human feedback (RLHF), model evaluation, or structured data annotation
- Teaching, mentoring, or explaining engineering concepts to non-expert audiences
- Familiarity with structured evaluation rubrics, benchmarks, or quality assurance frameworks

What Success Looks Like

- You consistently identify technical inaccuracies, incomplete reasoning, or flawed assumptions in engineering-related AI outputs
- Your structured feedback measurably improves the rigor, clarity, and correctness of model responses
- You produce consistent, reproducible evaluation artifacts that strengthen model performance over time
- Engineering-focused AI systems demonstrate greater reliability and trustworthiness as a result of your evaluations

Contract & Payment Terms

- Engagement will be structured as an independent contractor agreement
- Fully remote with flexible scheduling
- Projects may be extended, shortened, or concluded early based on performance and evolving needs
- Assignments will not require access to confidential or proprietary information from any employer, client, or institution
- Payments are processed weekly via Stripe or Wise based on services rendered
- Visa sponsorship is not available; H-1B and STEM OPT candidates cannot be supported at this time

Company Location: United States