v0.4 · open source · MIT

DarwinLoop

Your AI agents get smarter with every user interaction

DarwinLoop is an open-source Python and Node.js library that detects implicit user feedback signals, computes prompt quality metrics, and autonomously evolves your agent's prompt — without touching your infrastructure, storing conversations, or requiring labeled data.

$ pip install darwin-feedbackloop
User message → Signal detection → NFR / AINPS → Threshold breach → Darwin agent → Prompt mutation → Canary rollout → Promotion
The problem

Prompt quality silently decays. Nothing in your stack catches it.

Prompts don't age well

Production agents accumulate failure modes silently. There's no alert when your prompt starts degrading.

The signal is there — and ignored

Every re-prompt, correction, and abandoned session is diagnostic data. Today it is lost after every turn.

Manual review doesn't scale

Reading conversation logs to find failure patterns is economically irrational at production volume.

How Darwin works

A closed loop from signal to safe deployment.

01

Intercept

Zero-latency I/O hook. Darwin wraps your agent without changing the agent's own code. One line of instrumentation.
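
To make the idea of an I/O hook concrete, here is a minimal sketch of one way to wrap a callable agent so that analysis stays off the request path. It is not Darwin's implementation: `wrap_agent`, `on_turn`, and `flush` are hypothetical names chosen for this example, and the agent is assumed to be a plain function from a user message to a reply.

```python
import queue
import threading
from typing import Callable


def wrap_agent(agent: Callable[[str], str],
               on_turn: Callable[[str, str], None]) -> Callable[[str], str]:
    """Return a drop-in replacement for `agent` that reports every turn.

    Hypothetical helper: `on_turn` runs on a worker thread, so the caller
    only pays for one queue put per turn.
    """
    turns: "queue.Queue[tuple[str, str]]" = queue.Queue()

    def worker() -> None:
        while True:
            user_msg, reply = turns.get()
            try:
                on_turn(user_msg, reply)
            except Exception:
                pass  # analysis failures must never affect the agent
            finally:
                turns.task_done()

    threading.Thread(target=worker, daemon=True).start()

    def wrapped(user_msg: str) -> str:
        reply = agent(user_msg)       # the original agent runs unchanged
        turns.put((user_msg, reply))  # hand off; analysis happens elsewhere
        return reply

    wrapped.flush = turns.join  # shutdown/test helper: wait for pending turns
    return wrapped
```

The wrapped function keeps the original call signature, which is what lets a hook like this be added in one line at the call site.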

02

Detect

Implicit signals extracted per turn: re-prompts, corrections, session abandonment. NFR, PFR, and AINPS computed in rolling windows.
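
For illustration only, the sketch below shows one simple way to flag a re-prompt and track a signal rate over a rolling window. The similarity heuristic, the `is_reprompt` and `RollingRate` names, and the 0.8 cutoff are assumptions made for this example; they are not the definitions Darwin uses for NFR, PFR, or AINPS.

```python
from collections import deque
from difflib import SequenceMatcher


def is_reprompt(previous: str, current: str, threshold: float = 0.8) -> bool:
    """Heuristic: the user message closely repeats the previous one."""
    ratio = SequenceMatcher(None, previous.lower(), current.lower()).ratio()
    return ratio >= threshold


class RollingRate:
    """Share of flagged turns among the most recent `size` turns."""

    def __init__(self, size: int = 100) -> None:
        self.flags = deque(maxlen=size)  # old turns fall out automatically

    def add(self, flagged: bool) -> None:
        self.flags.append(flagged)

    @property
    def rate(self) -> float:
        return sum(self.flags) / len(self.flags) if self.flags else 0.0


window = RollingRate(size=50)
messages = ["reset my password", "reset my password please", "thanks"]
for prev, cur in zip(messages, messages[1:]):
    window.add(is_reprompt(prev, cur))
```

A bounded `deque` keeps memory constant and makes the rate reflect recent behaviour rather than the agent's whole history.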

03

Mutate

When metrics breach configurable thresholds, the Darwin agent generates a targeted prompt edit using GEPA — a reflective evolutionary algorithm. No labeled data required.
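
A threshold trigger of this kind can be sketched in a few lines. The `Threshold` fields and `should_mutate` below are hypothetical, as is the minimum-sample guard, which is added here so that a handful of bad turns in a tiny window does not fire a mutation.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    max_rate: float        # breach when the windowed rate exceeds this value
    min_samples: int = 50  # ignore windows too small to be meaningful


def should_mutate(negative_turns: int, total_turns: int, t: Threshold) -> bool:
    """True when the window is large enough and its negative rate is too high."""
    if total_turns < t.min_samples:
        return False
    return negative_turns / total_turns > t.max_rate
```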

04

Validate

Canary rollout with statistical significance testing. Auto-promote if better. Auto-rollback if worse.
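
One common way to run a canary is to assign sessions to the candidate prompt by hashing the session ID, so a session always sees the same variant. The sketch below assumes that approach; `in_canary` and its percentage parameter are illustrative names, not part of Darwin's API.

```python
import hashlib


def in_canary(session_id: str, canary_percent: float) -> bool:
    """Deterministically route a fixed share of sessions to the candidate prompt."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < canary_percent * 100                 # 10% -> buckets 0..999
```

Because the assignment depends only on the session ID, no routing state has to be stored, and raising the percentage keeps every session that was already in the canary.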

Zero-config setup

One wrap call. Your agent now evolves itself.

python
import os

import darwin

# my_agent is your existing agent object, built elsewhere in your code.
agent = darwin.wrap(
    agent=my_agent,          # any LangChain, OpenAI, Anthropic agent
    agent_id="support-bot",
    llm_api_key=os.environ["OPENAI_API_KEY"],   # BYOK: used by the mutation engine only
)

# That's it. Darwin runs in the background.
# Your agent now evolves autonomously.
Works with
Anthropic SDK · OpenAI SDK · LangChain · LangGraph · CrewAI · OpenAI Agents SDK · n8n
What you get

Production-grade primitives, not a science project.

Privacy by design

Raw conversations never leave your process. Darwin stores hashes and anonymized summaries only.
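
As a rough picture of what storing hashes instead of text can look like, the sketch below turns a message into a keyed fingerprint plus coarse metadata. The record fields, the `fingerprint` and `to_record` helpers, and the use of HMAC-SHA-256 are assumptions for this example, not Darwin's storage schema.

```python
import hashlib
import hmac


def fingerprint(message: str, secret: bytes) -> str:
    """Keyed SHA-256 of the message; the text cannot be read back from it."""
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def to_record(message: str, signal: str, secret: bytes) -> dict:
    """Build the row that gets persisted. The raw message is not included."""
    return {
        "fingerprint": fingerprint(message, secret),
        "length_bucket": min(len(message) // 100, 10),  # coarse size only
        "signal": signal,
    }
```

A keyed hash still lets identical messages be grouped and counted, while the key itself never needs to leave the process.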

Three mutation levels

Injection (session-scoped), Soft Edit (auto-apply to new sessions), Hard Edit (GitHub PR for human approval).

Local-first

SQLite by default. PostgreSQL/Supabase via pluggable interface. No SaaS dependency. pip install and go.
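
A pluggable storage layer is typically a small interface with one default backend. The sketch below shows that shape with a hypothetical `SignalStore` protocol and an SQLite implementation; the method names and table layout are invented for illustration and are not Darwin's actual interface.

```python
import sqlite3
from typing import Protocol


class SignalStore(Protocol):
    """What any backend (SQLite, PostgreSQL, ...) would need to provide."""

    def record(self, agent_id: str, signal: str) -> None: ...
    def count(self, agent_id: str, signal: str) -> int: ...


class SQLiteStore:
    """Default local backend; satisfies SignalStore structurally."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS signals (agent_id TEXT, signal TEXT)"
        )

    def record(self, agent_id: str, signal: str) -> None:
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO signals VALUES (?, ?)", (agent_id, signal)
            )

    def count(self, agent_id: str, signal: str) -> int:
        row = self.db.execute(
            "SELECT COUNT(*) FROM signals WHERE agent_id = ? AND signal = ?",
            (agent_id, signal),
        ).fetchone()
        return row[0]
```

Code written against `SignalStore` does not care which backend it receives, so swapping SQLite for a server database only means providing another class with the same two methods.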

Statistical validation

Fisher's exact test, Student's t-test, or a fixed threshold. Configurable. Mutations promote only when the chosen test shows a statistically significant improvement.
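
To show what such a check involves, here is a standard-library sketch of a one-sided Fisher's exact test on canary versus control failure counts. The function name, the argument layout, and the 0.05 cutoff in the usage lines are assumptions for this example; how Darwin wires the test into promotion is not shown here.

```python
from math import comb


def fisher_one_sided(canary_fail: int, canary_ok: int,
                     control_fail: int, control_ok: int) -> float:
    """P-value for 'canary fails less often than control' (one-sided)."""
    n = canary_fail + canary_ok         # canary sample size
    fails = canary_fail + control_fail  # failures across both arms
    total = n + control_fail + control_ok
    lowest = max(0, n - (total - fails))  # smallest feasible canary_fail
    # Hypergeometric tail: tables with this many canary failures or fewer.
    tail = sum(
        comb(fails, k) * comb(total - fails, n - k)
        for k in range(lowest, canary_fail + 1)
    )
    return tail / comb(total, n)


p_value = fisher_one_sided(canary_fail=1, canary_ok=9,
                           control_fail=11, control_ok=3)
promote = p_value < 0.05  # assumed significance level
```

Fisher's exact test suits small canary samples because it makes no large-sample approximation; with thousands of sessions per arm a t-test or z-test is usually cheaper.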

GEPA mutation engine

Prompt evolution via reflective reasoning + Pareto-optimal candidate selection. GEPA-optimized prompts were reported to beat Claude Opus 4.1 on enterprise benchmarks at 90× lower cost.
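
The Pareto idea can be illustrated with a plain non-dominated filter: keep every prompt candidate that no other candidate matches or beats on all evaluation examples. This is a simplified sketch with made-up candidate names and scores; GEPA's actual selection step is more involved than this filter.

```python
def dominates(a: list[float], b: list[float]) -> bool:
    """a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: dict[str, list[float]]) -> list[str]:
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(other, s)
                   for other_name, other in scores.items()
                   if other_name != name)
    ]


# Hypothetical per-example scores for three prompt variants.
candidates = {
    "v1": [0.9, 0.2, 0.5],
    "v2": [0.4, 0.8, 0.5],
    "v3": [0.3, 0.1, 0.5],
}
front = pareto_front(candidates)  # ["v1", "v2"]; v3 is dominated by v1
```

Keeping the whole front, rather than the single best average, preserves candidates that are strong on different kinds of input.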

1,284 GitHub stars · 800+ developers shipping
"Caught a regression in our support agent we wouldn't have seen for weeks. The PR landed before standup."
— Maya R., Staff Eng, Pencil
Free to start

Self-host today. Hosted Cloud when you need it.

Self-hosted

Free forever
  • Managed Darwin agent
  • SQLite / Postgres
  • Full mutation engine
  • Community support
Get started free

OSS Cloud

Coming soon · paid hosted tier
  • Team dashboard
  • RBAC + multi-agent view
  • Priority support
  • Managed canary infra
Join the waitlist