
AI Evaluations Lead

Cresta
📍 US · 🌐 Remote-only · 🛠 AI tools welcome at work · Lead · 5+ yrs
LLM, RAG, prompt engineering, conversational AI, STT, TTS
TL;DR

AI Evaluations Lead at Cresta, responsible for designing and scaling end-to-end quality frameworks for AI agent systems. Role involves architecting LLM evaluation methodologies, leading QA teams, and ensuring reliable AI deployment for enterprise customers.


Job description

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Our platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices, automate conversations and inefficient processes, and empower every team member to work smarter and faster. Cresta was born from the prestigious Stanford AI Lab, and our co-founder and chairman is Sebastian Thrun, the genius behind Google X, Waymo, Udacity, and more. Our leadership also includes CEO Ping Wu, co-founder of Google Contact Center AI and the Vertex AI platform, and co-founder Tim Shi, an early member of OpenAI.

We’ve assembled a world-class team of AI and ML experts, go-to-market leaders, and top-tier investors including Andreessen Horowitz, Greylock Partners, Sequoia, and former AT&T CEO John Donovan. Our valued customers include brands like Intuit, Cox Communications, Hilton, and CarMax, and we’ve been recognized by Forbes and Bain Consulting as one of the top private AI companies in the world.

Join us on this thrilling journey to revolutionize the workforce with AI. The future of work is here, and it's at Cresta.

About the Role:

At Cresta, shipping AI is only half the story. Ensuring that AI interacts with humans reliably, accurately, and empathetically at scale is where the real challenge lies.

As the AI Evaluations Lead, you will be the ultimate guardian of the customer experience for our AI Agent product line. This role is perfect for a strategic quality expert who loves the intersection of human psychology and machine logic. You will own the end-to-end quality strategy, from designing complex test plans for non-deterministic LLMs to building automated and scalable testing environments using Cresta's proprietary no-code test and evaluation tools.

You aren't just looking for bugs; you are building the framework that allows Cresta to deploy world-class AI agents for the world's largest enterprises with total confidence.

What You’ll Do:

What We’re Looking For:

Bonus Points:

Perks & Benefits:

We offer a comprehensive and people-first benefits package to support you at work and in life:

Compensation at Cresta:

Cresta’s approach to compensation is simple: recognize impact, reward excellence, and invest in our people. We offer competitive, location-based pay that reflects the market and what each individual brings to the table.

Compensation for this position includes base salary, bonus, and equity.

Actual base salaries will be based on candidate-specific factors, including experience, skillset, and location, and local minimum pay requirements as applicable. Your recruiter can provide further details. In addition, total compensation includes a comprehensive benefits package for you and your family.

We have noticed a rise in recruiting impersonations across the industry, where scammers attempt to access candidates' personal and financial information through fake interviews and offers. All Cresta recruiting emails come from the @cresta.ai domain. Any outreach claiming to be from Cresta via other sources should be ignored. If you are uncertain whether you have been contacted by an official Cresta employee, reach out to recruiting@cresta.ai.

