Founding Research Scientist

Full-time · San Francisco

You will define how our agents think, reason, and improve. This is not a role for researchers who want to stay in the lab — you will own the full loop from hypothesis to eval to production. You'll work on hard problems: grounded generation, compliance-aware reasoning, multi-turn sales conversations, and agent-to-agent negotiation.

Role focus
  • Design and run evaluations that measure what actually matters: conversion, compliance, satisfaction.
  • Research and implement improvements to our core agent capabilities.
  • Own the eval infrastructure and ensure every model update is measurable and safe to ship.

Strong signals
  • PhD or equivalent research experience in ML, NLP, or a related field.
  • Track record of shipping research into real systems — papers are great, deployed models are better.
  • Deep familiarity with LLMs, fine-tuning, RLHF/RLAIF, and evaluation methodology.

Tell us how you use agentic coding tools in your workflow. What works, what doesn't, and how far do you push them?

Walk us through the architecture, the tradeoffs you made, and what you'd do differently.

If you don't yet have deep production experience, share something you built on your own. Up to 3 links.

Any experience with LLM apps, evals, prompt pipelines, RAG, or fine-tuning? What have you shipped?
