
Founding ML Engineer

Full-time · San Francisco

You will help close the gap between promising model behavior and dependable product behavior. This is a production-oriented ML role for someone who can take models, evals, and agent improvements out of notebooks and into systems that are fast, reliable, measurable, and safe to operate.

Role focus
  • Improve the performance of our agents in production through better prompts, model selection, evals, retrieval, fine-tuning, and system design.
  • Build and own the ML workflows that let us test, measure, and ship model changes safely.
  • Develop offline and online evaluation systems tied to real business metrics like conversion, compliance, latency, and customer experience.

Strong signals
  • You have shipped ML or LLM systems into production and have seen what breaks outside the demo environment.
  • You are strong in Python and comfortable owning the surrounding engineering work, not just the modeling layer.
  • You have good judgment about tradeoffs across model quality, latency, cost, observability, and operational complexity.

Tell us how you use agentic coding tools in your workflow. What works, what doesn't, and how far do you push them?

Walk us through the architecture of a system you have built, the tradeoffs you made, and what you'd do differently.

If you don't yet have deep production experience, share something you built on your own. Up to 3 links.

Any experience with LLM apps, evals, prompt pipelines, RAG, or fine-tuning? What have you shipped?
