All of these post-mortems have the right foundation: clear psychological safety and blameless discussion. That’s great! But some fail to achieve their goal despite having a clear timeline and some causal factors. Usually it is the actions that are weak, such as promises to “try harder” or “be more careful”.
These are not actions; they are ambitions.
Why are these ambitions? The engineers who wrote the buggy code weren’t trying to write buggy code. They responded rationally to the system as it existed (its time constraints, the available information, the existing review processes, the tools at their disposal, etc.).
If you don’t change the system, you’re just asking people to resist it better. And that simply doesn’t work.
Post-mortems that end with “we will try harder” or “we will be more careful” have fundamentally misunderstood the problem. A post-mortem is not about identifying the failings of people; it’s about fixing a system that made failure likely.
Every post-mortem action must pass this test:
If someone ignores the action, what mechanism prevents the bug from returning?
If the answer is ‘nothing’, then it is not an action, just a tick box in a post-mortem that achieves nothing.
To give some examples:
“Make sure all code is tested before being merged” – This is an ambition. Nothing in the system has changed; there is only a vague promise. The systemic alternative could be to introduce a coverage ratchet.
“All code must be reviewed by X” – Use a CODEOWNERS file in GitHub instead to make this explicit.
“Identify problems earlier in the planning” – What was the real problem here? No one has a perfect crystal ball! But if more people had been involved, perhaps certain types of problems would have surfaced sooner? In that case, introducing a lightweight ADR process could help!
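To make the coverage ratchet idea concrete, here is a minimal sketch of what such a check might look like in CI. The function names (`check_ratchet`, `run`) and the baseline file are hypothetical, and the coverage percentage is assumed to come from whatever coverage tool the project already uses.

```python
"""Coverage ratchet sketch: fail the build if coverage drops below a stored baseline."""
from pathlib import Path


def check_ratchet(current: float, baseline: float, tolerance: float = 0.0) -> tuple[bool, float]:
    """Return (passed, new_baseline).

    The baseline only ever moves up: a run that beats it raises the bar,
    and a run that falls below it (beyond the tolerance) fails.
    """
    if current + tolerance < baseline:
        return False, baseline
    return True, max(baseline, current)


def run(current: float, baseline_file: Path) -> int:
    """Compare against the baseline file (hypothetical) and return an exit code."""
    # A missing file means no baseline yet, so the first run always passes.
    baseline = float(baseline_file.read_text()) if baseline_file.exists() else 0.0
    passed, new_baseline = check_ratchet(current, baseline)
    if not passed:
        print(f"Coverage {current:.2f}% is below the baseline {baseline:.2f}%")
        return 1
    # Persist the (possibly raised) baseline for the next run.
    baseline_file.write_text(f"{new_baseline:.2f}")
    return 0
```

The point of the design is that nobody has to remember anything: if someone ignores the "test your code" promise, the build fails on its own.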
The most effective post-mortem actions usually fall into a few categories.
Automated guards – CI gates, linters, coverage ratchets, CODEOWNERS, pre-commit hooks. Building these into the process changes the system in a way that can’t be skipped!
Process gates – Checklists that introduce friction for certain change types. These require some discipline to enforce, but they create visible control points.
Environmental changes – For example, test environments that mirror production, tooling that makes the right thing easier than the wrong thing, or even a Slackbot that shares information at the right time! These shape behavior by changing the path of least resistance.
Structural changes – Changes to team topology, ownership boundaries, API contracts. These require more effort, but they address deeper systemic problems.
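As one illustration of the process-gate category above, a checklist can be enforced by a small script instead of by memory. This is only a sketch: the path patterns, the checklist wording, and the `missing_checklist_items` helper are all hypothetical, and it assumes the list of changed files and the PR description are supplied by the CI system.

```python
from fnmatch import fnmatch

# Hypothetical mapping from risky path patterns to the checklist line a PR must tick.
GATED_PATHS = {
    "migrations/*": "- [x] Rollback plan written",
    "infra/*": "- [x] Change tested in staging",
}


def missing_checklist_items(changed_files: list[str], pr_description: str) -> list[str]:
    """Return the required checklist lines that are not ticked in the PR description."""
    missing = []
    for pattern, required_line in GATED_PATHS.items():
        # The gate applies only when the change touches a matching path.
        touches_gate = any(fnmatch(path, pattern) for path in changed_files)
        if touches_gate and required_line not in pr_description:
            missing.append(required_line)
    return missing
```

A CI job could fail whenever the returned list is non-empty, which turns the checklist into a visible control point for exactly the change types that need it.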
More generally, you can use the ladder to find an appropriate level at which to intervene in the system.
The most important thing is what’s not on this list: there is nothing that depends on people remembering to do something, try harder or be more careful!
What would you do if a post-mortem process doesn’t work? Well, you change the system! The simplest change I can think of is to force concreteness, so I propose that post-mortem actions be captured in a template like this:
Action: [Specific change to the system]
Owner: [Named person responsible for implementation]
Due date: [When this will be done]
Enforcement mechanism: [What prevents regression]
How we’ll know it worked: [Observable outcome]
“Be more careful” doesn’t get far past this template, because you need to specify what’s actually changing, who’s going to change it, and how you’ll know it works.
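The template itself can be enforced by a mechanism too. The sketch below is one possible way to do that, not a prescribed tool: the `PostMortemAction` class, the `validate` function, and the phrase list are hypothetical names chosen for illustration. It rejects actions with empty fields and flags wording that reads as an ambition.

```python
from dataclasses import dataclass, fields

# Phrases that signal an ambition rather than a system change (illustrative, not exhaustive).
AMBITION_PHRASES = ("try harder", "be more careful", "make sure", "remember to")


@dataclass
class PostMortemAction:
    action: str                 # Specific change to the system
    owner: str                  # Named person responsible for implementation
    due_date: str               # When this will be done
    enforcement_mechanism: str  # What prevents regression
    how_we_will_know: str       # Observable outcome


def validate(item: PostMortemAction) -> list[str]:
    """Return a list of problems; an empty list means the action is concrete."""
    problems = []
    # Every template field must be filled in.
    for f in fields(item):
        if not getattr(item, f.name).strip():
            problems.append(f"missing {f.name}")
    # Flag wording that asks people to resist the system instead of changing it.
    lowered = item.action.lower()
    for phrase in AMBITION_PHRASES:
        if phrase in lowered:
            problems.append(f"action reads as an ambition: '{phrase}'")
    return problems
```

An action such as "Be more careful when merging" with no owner or enforcement mechanism would come back with several problems, while a fully specified change to CI would come back with none.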
Post-mortems are supposed to be learning, but learning that doesn’t change behavior isn’t really learning, it’s just awareness!
The systems in organizations (the tools, processes, incentives, constraints, etc.) cause the problems, not the people. If you finish your post-mortem and the result is “try harder”, then the system that caused the bug still exists and will produce a new one.
#Postmortem #Theatre


