AI Engineer (Prompt Engineering & Python) at Synthflow AI

Source: https://jobs.ashbyhq.com/synthflow/d5bba02f-1708-4368-aba1-1afdf695af40

Location: Global Remote

Synthflow AI is a no-code platform for deploying voice AI agents that automate phone calls across contact center operations and business process outsourcing (BPO) at scale. We help mid-market and enterprise companies manage routine calls to save teams time and resources.

Our agents have already delivered measurable impact:

- Over 5 million hours of contact center operations saved
- 35% more calls answered compared to non-AI operators
- 45 million calls handled with 99.9% uptime

Backed by Accel, Atlantic Labs, and Singular and trusted by over 1,000 customers, our growth leads an industry shift toward sophisticated and accessible conversational AI.

The Role

We’re hiring an AI Engineer who lives and breathes prompt engineering and writes excellent production-grade Python. You’ll run a tight feedback loop with customers, turn real conversations into better prompts and eval datasets, and ship changes that measurably improve agent outcomes. This is a highly applied role working directly with customer feedback.

What You’ll Do

- Design & iterate prompts (system, tool/function-calling, task prompts) to boost voice AI agent success, reliability, and tone.
- Build co-pilots for customers to author their own prompts: meta-prompted assistants that suggest structures, lint for risks, autocomplete tool schemas, critique drafts, and generate eval cases.
- Work directly with customer feedback and conversation logs to identify failure modes; translate them into prompt changes, guardrails, and data improvements.
- Build eval datasets (success labels, rubrics, edge cases, regressions) and run offline/online evaluations (A/B tests, canaries) to quantify impact.
- Create Python utilities/services for prompt versioning, config-as-code, rollout/rollback, and guardrails (policies, refusals, redaction).
- Partner with PM/Success to define success metrics (task completion, first-pass accuracy, cost, latency) and instrument dashboards/alerts.
- Own LLM integration details: function/tool schemas, output parsing/validation (pydantic), retrieval-aware prompting, and fallback strategies.
- Ensure privacy & compliance (PII handling, anonymization, regional data boundaries) in datasets and logs.
- Share learnings via concise docs, playbooks, and internal demos.

Must-Have Skills

- Python: 3+ years writing clean, tested, production code (typing, pytest, profiling); experience building small services/APIs (FastAPI preferred).
- Prompt Engineering: Hands-on experience designing system/tool prompts, meta-prompting, rubric graders, and iterative prompt tuning based on real user data.
- LLM Integration: Comfortable with major APIs (OpenAI/Anthropic/Google/Mistral), function/tool calling, streaming, and robust output handling.
- Evaluation Mindset: Ability to define measurable success, create labeled datasets, and run methodical experiments/A/B tests.
- Product Sense: Comfortable talking with customers, turning qualitative feedback into shipped improvements.
- Data Hygiene: Practical experience cleaning, labeling, and balancing datasets; awareness of privacy/PII constraints.

Nice-to-Haves

- Experience building prompt-authoring UIs/SDKs or internal tooling for prompt versioning and governance.
- Agentic frameworks & tooling: DSPy, MCP, LangGraph, LlamaIndex, Rasa; experience with agent/tool schemas and orchestration.
- Observability & eval tooling: Langfuse, LangSmith, Braintrust; building eval harnesses and experiment dashboards.
- RAG & vector stores: Qdrant/Weaviate/Pinecone and retrieval-aware prompting.
- Experimentation workflows: A/B testing, prompt diffing/versioning.
- Infra & analytics: light SQL/log analysis, metrics & tracing, simple Grafana/OTel dashboards.
- Writing public blog posts or talks about applied LLM techniques.

Interview Process

- 30-min intro screen: background, role fit, questions both ways.
- Practical exercise (Prompt + Python): design a prompt strategy, a customer-facing co-pilot flow, and a small eval harness.
- Team interviews: deep dive on product mindset, experimentation rigor, and collaboration.
- Founder/Leadership chat: scope, impact, and ways of working.

Why Join

- Own the reasoning layer and the customer co-pilot experience used at scale.
- Ship fast in a tight customer feedback loop and see your impact measured in days, not quarters.

Founded in Berlin in 2023 by serial entrepreneurs Albert Astabatsyan, Hakob Astabatsyan, and Sassun Mirzakhan-Saky, Synthflow AI democratizes access to advanced voice AI with a no-code platform that lets enterprises easily create, deploy and scale natural-sounding, cost-effective voice agents tailored to their business needs.