Case study 03 · Autonomous SDR
Brimley
Live in production · Closed beta · Founder & full-stack AI engineer · brimley.ai ↗
A butler for outbound sales. You give it a brief and an ICP; it sources prospects, writes a personalised multi-step sequence per lead, sends from your own mailbox, classifies replies, and handles bounces, unsubscribes and pacing - autonomously, on your own infrastructure.
22 app modules
~30k LOC in apps/
147 HTML templates
49 test files
1 engineer
4mo to production
Why it's different
Most "AI agents" are demos. This is built like a system.
Outbound sales is the most labour-intensive function in B2B and the part most likely to be replaced by AI agents in the next 24 months. Existing tools each solve one piece of it. Brimley does the whole loop - sourcing, research, generation, sending, polling, classifying, suppressing - autonomously, on infrastructure the customer owns.
This isn't a wrapper around an API. It's a production-shaped multi-module Python system with a real state machine, idempotency on every external side effect, queryset-level tenant isolation with dedicated regression tests, cost-tracked AI calls stamped with token counts and USD at call time, and a hot-editable prompt layer with version-controlled defaults. Every interesting decision has a docstring explaining why.
Campaign state machine
Explicit transitions. Explicit failure paths.
Pipeline
Source → Schedule → Dispatch → Generate → Send → Poll → Classify.
Source
A background task calls the lead provider with the campaign's ICP filters, reveals emails (cached cross-org by provider ID), bills via the credit ledger, and persists Prospect + CampaignProspect rows.
Schedule
Scheduler stamps next_action_at on each row from the campaign's daily cap, sending hours, sending days, timezone, and deterministic ±25% jitter.
Dispatch
A dispatch_due_messages beat task wakes every minute, picks rows whose next_action_at has passed, prioritises by (current_step DESC, next_action_at ASC), and fires send_email.delay() per row, capped at the org-wide daily total.
Generate
Message generator builds a per-prospect email using campaign voice, sequence step, and research. The call is logged with token counts + USD cost.
Send
Verifier runs first. OutreachMessage is persisted, then delivered via the user's authenticated mailbox with an RFC 8058 List-Unsubscribe header injected. Suppressions short-circuit before the API call.
Poll & classify
A mailbox history poll every 5 minutes walks each mailbox since the last cursor, classifies inbound replies/bounces/OOO via an LLM-graded reply classifier, halts on a reply, and suppresses on a bounce.
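The cursor-based poll step can be sketched in a few lines. This is a minimal in-memory stand-in, not the real mailbox client: `Mailbox`, `poll_mailbox`, and the event names are illustrative; only the cursor-and-classify shape (walk events newer than `last_history_id`, act, advance the cursor) reflects the description above.

```python
from dataclasses import dataclass

@dataclass
class Mailbox:
    """Hypothetical stand-in: history is an append-only list of
    (history_id, event) tuples; last_history_id is the stored cursor."""
    history: list
    last_history_id: int = 0

def poll_mailbox(mailbox: Mailbox) -> list:
    """Walk events newer than the cursor, classify each, and advance
    the cursor, so a crashed poll simply resumes where it stopped."""
    actions = []
    for history_id, event in mailbox.history:
        if history_id <= mailbox.last_history_id:
            continue  # already handled on a previous poll
        if event == "reply":
            actions.append("halt_sequence")   # a human answered: stop sending
        elif event == "bounce":
            actions.append("suppress")        # never contact this address again
        mailbox.last_history_id = history_id
    return actions
```

A second poll over the same history returns nothing, which is what makes the 5-minute cadence safe to overlap with downtime or restarts.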
Things most demos hand-wave
Each tile is a decision a senior engineer would notice and ask about - and each has a defensible answer.
ETA-based dispatch
Pacing is a property of when sends are scheduled, not a task-queue rate-limit. Deterministic ordering, faithful distribution, no over/under-run. ±25% deterministic jitter prevents round-minute spam-flagging.
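Deterministic jitter is easy to get wrong with `random()`, because re-running the scheduler reshuffles the plan. A minimal sketch of the hash-based approach (function name and the choice of SHA-256 are assumptions; the ±25% bound is from the description above):

```python
import hashlib

def jittered_offset(base_seconds: float, key: str) -> float:
    """Deterministic ±25% jitter: the same key always yields the same
    offset, so rescheduling never moves an already-planned send."""
    digest = hashlib.sha256(key.encode()).digest()
    # Map the first 4 bytes of the digest to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    # Scale into [0.75, 1.25) of the base interval.
    return base_seconds * (0.75 + fraction * 0.5)
```

Keying on something stable per prospect (e.g. its primary key) spreads sends off round-minute boundaries without any stored random state.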
Idempotency everywhere
Every lead reveal, mailbox send, and LLM call goes through IdempotencyRecord. A retry on a network blip can't double-send, double-bill, or double-spend.
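The IdempotencyRecord pattern reduces to "insert the key first, replay the stored result on a retry". A toy sketch with an in-memory dict (in production this would be a unique-constrained DB table; `run_once` and `_records` are illustrative names, not Brimley's real API):

```python
_records: dict = {}  # stand-in for a unique-keyed IdempotencyRecord table

def run_once(key: str, side_effect):
    """Execute side_effect at most once per key. A retry after a
    network blip finds the record and replays the stored result
    instead of double-sending or double-billing."""
    if key in _records:
        return _records[key]
    result = side_effect()   # first attempt: do the real work
    _records[key] = result   # persist before the caller can retry
    return result
```

The key is derived from the operation's natural identity (e.g. campaign + prospect + step), so the same logical send maps to the same record no matter how many workers race on it.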
Queryset-level tenant isolation
Explicit .for_organization(org) on every query - not a middleware thread-local that breaks the moment a background task runs. Dedicated CI test fails if a new view forgets it.
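The shape of the pattern, reduced to plain Python (a toy list-backed queryset, not Django; the point is that app code has no "all rows" path, only the org-scoped one):

```python
class ProspectQuerySet:
    """Toy queryset: the only read path app code is given is
    for_organization(), so forgetting the scope is a loud omission
    a CI regression test can catch, not a silent data leak."""
    def __init__(self, rows):
        self._rows = rows

    def for_organization(self, org_id):
        return [r for r in self._rows if r["org_id"] == org_id]

qs = ProspectQuerySet([
    {"org_id": 1, "email": "a@x.com"},
    {"org_id": 2, "email": "b@y.com"},
])
```

Because the scope is an explicit call rather than request-middleware state, it works identically inside background tasks, where there is no request and no thread-local to lean on.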
Cost-tracked AI calls
Every LLM call wrapped to stamp per-token pricing at call time, persist input/output/cache-read/cache-creation tokens, and persist USD cost. Historical accounting stays correct when pricing changes.
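Stamping cost at call time can be sketched like this. The pricing table and names are made up for illustration; the invariant is the real point: the USD figure is computed and persisted when the call happens, so later price changes never rewrite history.

```python
from dataclasses import dataclass

# Illustrative per-million-token USD rates; not real vendor pricing.
PRICING = {"model-x": {"input": 3.00, "output": 15.00}}

@dataclass
class AICallLog:
    model: str
    input_tokens: int
    output_tokens: int
    usd_cost: float  # frozen at call time

def log_call(model: str, input_tokens: int, output_tokens: int) -> AICallLog:
    """Compute and stamp the cost now, using today's price sheet."""
    p = PRICING[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    return AICallLog(model, input_tokens, output_tokens, round(cost, 6))
```

The same record would also carry cache-read/cache-creation token counts in the full system; they are omitted here for brevity.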
Hot-editable prompts
Every system prompt lives in apps/ai/prompts/*.md AND in a PromptOverride DB table. Resolution: DB row wins if non-empty, otherwise the on-disk default. Iterate without deploys; version-control the canonical default.
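The resolution rule is one function. This sketch takes the DB overrides and the on-disk defaults as plain dicts (in the real system the defaults are read from apps/ai/prompts/*.md and the overrides from the PromptOverride table):

```python
def resolve_prompt(name: str, db_overrides: dict, disk_defaults: dict) -> str:
    """DB row wins if present and non-empty; otherwise fall back to
    the version-controlled on-disk default."""
    override = db_overrides.get(name, "")
    if override.strip():
        return override
    return disk_defaults[name]
```

The `.strip()` guard matters: an override row that was blanked out (or saved as whitespace) falls back to the default instead of silently sending an empty system prompt.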
RFC 8058 + suppression
Every send carries a List-Unsubscribe header. Bounces auto-suppress. Unsubscribes auto-suppress. Domain-level suppression is a first-class table, not a flag on a row.
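For reference, RFC 8058 one-click unsubscribe actually requires two headers, and mailbox providers only surface the native unsubscribe button when both are present. A minimal sketch (the URL and function name are illustrative):

```python
def unsubscribe_headers(unsub_url: str) -> dict:
    """RFC 8058 one-click unsubscribe: List-Unsubscribe carries the
    target, and List-Unsubscribe-Post signals that a bare POST to it
    (no login, no confirmation page) completes the opt-out."""
    return {
        "List-Unsubscribe": f"<{unsub_url}>",
        "List-Unsubscribe-Post": "List-Unsubscribe=One-Click",
    }
```

The POST endpoint behind the URL then writes to the same suppression tables the bounce handler uses, so both paths converge on one source of truth.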
Selected decisions
The patterns I'd point at in a deep-dive interview.
Structured AI output via typed schemas
Reply classification, ICP parsing, and several other AI calls validate the LLM's output against a typed schema before it ever reaches the rest of the system - no JSON-string brittleness.
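The validation boundary looks roughly like this. A stdlib-only sketch (the schema fields, category names, and function are assumptions; the real system may use a schema library, but the gate is the same: bad LLM output raises here, never downstream):

```python
import json
from dataclasses import dataclass

ALLOWED_CATEGORIES = {"reply", "bounce", "ooo", "unsubscribe"}

@dataclass(frozen=True)
class ReplyClassification:
    category: str
    confidence: float

def parse_classification(raw: str) -> ReplyClassification:
    """Validate the LLM's JSON against the schema before it reaches
    the rest of the system; reject rather than guess."""
    data = json.loads(raw)
    category = data.get("category")
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence out of range")
    return ReplyClassification(category, confidence)
```

Everything past this function can assume a well-typed value; there is no second place where a raw JSON string gets poked at.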
Mailbox history polling rather than push
Push notifications via Pub/Sub require a verified domain and a more invasive OAuth posture. History polling on a 5-minute cadence with the per-mailbox last_history_id cursor gives the same correctness with a simpler trust posture - and survives downtime gracefully.
LinkedIn companion as an MV3 extension, not headless automation
Headless LinkedIn automation gets accounts banned and violates LinkedIn's ToS. Running the automation inside the user's own authenticated Chrome session keeps it inside the terms-of-service envelope and avoids cookie-store/fingerprint problems.
Honest about boundaries
Mailbox OAuth scopes are send + modify only - Brimley does not read the inbox beyond the threads it sent on. OAuth tokens encrypted at rest with envelope encryption. HMAC-peppered indices on sent-message lookups.
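The HMAC-peppered index technique is small enough to show whole. A sketch with stdlib `hmac` (function name and pepper handling are illustrative; in practice the pepper lives in a secret store, not the database):

```python
import hashlib
import hmac

def peppered_index(message_id: str, pepper: bytes) -> str:
    """Deterministic lookup key for sent-message rows: equality
    queries still work, but the raw RFC 5322 Message-ID never
    appears in the indexed column, and without the pepper the
    digest can't be reversed or brute-forced from known IDs."""
    return hmac.new(pepper, message_id.encode(), hashlib.sha256).hexdigest()
```

Because HMAC is deterministic per pepper, the sender can still answer "have we seen this Message-ID?" with an indexed equality lookup, which is all the reply-threading path needs.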
Single-tenant by default, multi-tenant capable
Brimley is designed to be self-hosted on the customer's own Postgres + Redis. But the data model is fully org-scoped, so the same image runs in a managed multi-tenant mode. Same code; the difference is operational.
Real observability
Worker heartbeats. Incident ladder. Health dashboard. Error tracking + structured logging. A dry-run mode (OUTREACH_DRY_RUN). A maintenance mode that short-circuits sending tasks rather than dropping them.
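The dry-run and maintenance guards share one shape: check before the mailbox API call, not after. A sketch with the flags passed as parameters (in production they come from config - OUTREACH_DRY_RUN is the real flag named above; the maintenance flag and return values here are illustrative):

```python
def guarded_send(send_fn, message, *, dry_run: bool = False,
                 maintenance: bool = False) -> str:
    """Both guards short-circuit before any external side effect.
    Maintenance returns 'deferred' - the task's schedule is left
    untouched so the send happens later, rather than being dropped."""
    if dry_run:
        return "dry_run"    # log everything, send and bill nothing
    if maintenance:
        return "deferred"   # retried on the next dispatch tick
    return send_fn(message)
```

The distinction between the two outcomes is the whole point: a dry run is a deliberate no-op, while maintenance is a postponement that the ETA-based dispatcher will naturally pick up again.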
Stack
Web / API
Async / queue
Datastores
AI
Integrations
Security & ops