Forensic memory for AI coding agents. Every failure becomes a guardrail.
Autopsy records every action your coding agent takes, analyzes the failures, builds a graph of why they happened, and injects warnings into the next agent's system prompt so it doesn't make the same mistake twice.
The problem
Every run is a clean slate. Yesterday's lesson is gone. The agent re-discovers the same migration bug, the same type drift, the same missing test, across every project, on every team, forever.
The agent edits a model and ships. The migration is missing. The deploy fails. Next week another agent does the exact same thing in a different repo.
Backend types change. Generated frontend types don't. Compilation breaks somewhere downstream nobody noticed.
Code shipped, no test added. The bug surfaces in CI hours later. The agent's already moved on to the next task.
How it works
A four-step closed loop that turns rejected runs into prompt-time warnings for every future run.
A lightweight opencode plugin streams every tool call, file edit, chat message, and rejection into Autopsy in real time. Fire-and-forget, never blocking the agent.
A deterministic classifier and an LLM enhancer extract FailureModes and FixPatterns from each rejected run.
Postgres + pgvector store a project-scoped graph of tasks → runs → failures → fixes, with temporal decay and counter-evidence dampening.
When the agent starts a new task, Autopsy retrieves similar past failures and injects a warning into its system prompt, or blocks the tool call entirely if the risk is high enough.
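The scoring behind steps three and four can be sketched roughly as follows. This is a minimal illustration, not Autopsy's actual implementation: the exponential half-life decay, the 1 / (1 + n) dampening formula, both thresholds, and every name (PastFailure, risk_score, decide) are assumptions chosen to show how temporal decay and counter-evidence dampening might combine into a warn-or-block decision.

```python
import math
from dataclasses import dataclass

# Every constant and name below is a hypothetical stand-in, not Autopsy's API.
HALF_LIFE_DAYS = 30.0    # assumed: a failure's weight halves every 30 days
WARN_THRESHOLD = 0.3     # assumed: at or above this, inject a warning
BLOCK_THRESHOLD = 0.7    # assumed: at or above this, block the tool call


@dataclass
class PastFailure:
    summary: str
    similarity: float      # 0..1 similarity between the past failure and the new task
    age_days: float        # time since the failure was recorded
    counter_evidence: int  # later runs that did the same thing and were accepted


def risk_score(f: PastFailure) -> float:
    """Similarity, discounted by age and by accepted runs that contradict it."""
    decay = math.pow(0.5, f.age_days / HALF_LIFE_DAYS)  # temporal decay
    dampening = 1.0 / (1.0 + f.counter_evidence)        # counter-evidence dampening
    return f.similarity * decay * dampening


def decide(failures: list[PastFailure]) -> tuple[str, list[str]]:
    """Return ("block" | "warn" | "allow", warnings to add to the system prompt)."""
    scored = sorted(((risk_score(f), f.summary) for f in failures), reverse=True)
    warnings = [
        f"Past failure (risk {score:.2f}): {summary}"
        for score, summary in scored
        if score >= WARN_THRESHOLD
    ]
    top = scored[0][0] if scored else 0.0
    if top >= BLOCK_THRESHOLD:
        return "block", warnings
    if warnings:
        return "warn", warnings
    return "allow", warnings
```

Under these assumed numbers, a fresh and highly similar failure blocks the call, a month-old one only warns, and a failure contradicted by several accepted runs drops out entirely.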
What's in the box
Not a logging tool. Not a wrapper. A real, end-to-end agent infrastructure layer with retrieval, prevention, and a live dashboard.
Captures the live opencode event bus: tool calls, edits, chat, permissions, rejections. Events are batched and sent fire-and-forget, so capture never blocks the LLM stream.
A rule-based pipeline detects schema changes, frontend drift, missing tests, regressions, and more, augmented by a Gemma LLM pass for semantic depth.
Postgres + pgvector. Project-scoped, temporally decayed, counter-evidence dampened. Retrieval combines approximate nearest-neighbor (ANN) search with a 3-hop graph walk in about 10 ms when warm.
On every chat turn, Autopsy retrieves similar past failures and injects a warning into the system prompt. On every tool call, it can block the call if the risk crosses a threshold.
After every code-modifying tool call, Autopsy automatically runs lint, typecheck, and tests. Failures are folded back into the graph as a real rejection so the next run gets warned.
Next.js + SSE. Timeline of every event, autopsy report on every run, force-graph view of the whole failure graph, and a green badge when Autopsy caught something before it failed.
Every project gets its own isolated graph. Lessons from monolith X never leak into greenfield Y, and team-wide patterns surface where they belong.
Reject mid-run as many times as you need. The agent reads the autopsy and tries again on the same thread, with no session restart and no lost context.
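To make the rule-based detection concrete, here is a minimal sketch of how a deterministic classifier might map the files changed in a run to failure modes such as a missing migration, frontend type drift, or a missing test. The path conventions (models.py, /migrations/, schemas.py, frontend/src/generated/), the failure-mode names, and the classify function are all hypothetical; Autopsy's real rules are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    detail: str


def _is_test(path: str) -> bool:
    # Assumed convention: tests live under tests/ or follow common naming patterns.
    name = path.rsplit("/", 1)[-1]
    return (
        "/tests/" in f"/{path}"
        or name.startswith("test_")
        or name.endswith((".test.ts", ".spec.ts"))
    )


def _is_code(path: str) -> bool:
    return path.endswith((".py", ".ts", ".tsx"))


def classify(changed_files: list[str]) -> list[FailureMode]:
    """Apply three illustrative rules to the set of files a run touched."""
    files = list(changed_files)
    found: list[FailureMode] = []

    # Rule 1: a model changed but no migration file was touched.
    touched_models = any(f.endswith("models.py") for f in files)
    touched_migrations = any("/migrations/" in f"/{f}" for f in files)
    if touched_models and not touched_migrations:
        found.append(FailureMode("missing_migration", "model edited without a migration"))

    # Rule 2: backend schemas changed but generated frontend types did not.
    touched_schemas = any(f.endswith("schemas.py") for f in files)
    touched_generated = any(f.startswith("frontend/src/generated/") for f in files)
    if touched_schemas and not touched_generated:
        found.append(FailureMode("frontend_type_drift", "backend types changed, frontend types stale"))

    # Rule 3: non-test code changed but no test file was touched.
    touched_code = any(_is_code(f) and not _is_test(f) for f in files)
    touched_tests = any(_is_test(f) for f in files)
    if touched_code and not touched_tests:
        found.append(FailureMode("missing_test", "code changed with no test added"))

    return found
```

Rules like these are cheap and repeatable, which is why a deterministic pass would typically run first, with an LLM pass adding context the file list alone cannot provide.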
Get started
Run this from the root of your opencode project. Re-run it anytime to update.
curl -fsSL https://install.autopsy.surf/install.sh | bash