Counterfactuals are not causality

an hour ago by Jtsummers

Those aren't counterfactuals, they're negative statements.

"Why are the lights off?"

"Because I didn't turn them on."

That's not a counterfactual, it's a fact. Just like the admins not configuring file purging is a (presumed) fact in the scenario under discussion in the article. Negative statements are not counterfactuals. Now, they may not be helpful for a 5 Whys analysis, but that's a separate thing.

Counterfactual: If the admins had turned on file purging, then the volume would not have been full.

Why? Because they didn't turn it on, so this is a counterfactual: it deals with facts that do not exist in the reality under examination and reaches a conclusion about how things would have been different given those facts. But nearly all the causes in a 5 Whys can be seen as counterfactuals if phrased correctly:

"Builds did not complete." Why? "Kubernetes could not start the pod, and the operation timed out after 1 hour."

As a counterfactual: If Kubernetes could have started the pod, then the operation would not have timed out after 1 hour and the builds would have completed.
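
To make the distinction concrete, here is a toy sketch that assumes the scenario reduces to a chain of three yes/no steps. The function name `incident` and the dictionary keys are made up for illustration; nothing here comes from the article. Running the same chain with the facts as they were gives the factual account, and running it with one fact changed gives the counterfactual.

```python
def incident(purging_configured: bool) -> dict:
    """Toy model of the outage: each step depends only on the previous one."""
    # Assumption: without purging, old files pile up until the volume fills.
    volume_full = not purging_configured
    # Assumption: Kubernetes can start the pod only if the volume has space.
    pod_started = not volume_full
    # Assumption: the build completes exactly when the pod starts.
    build_completed = pod_started
    return {
        "volume_full": volume_full,
        "pod_started": pod_started,
        "build_completed": build_completed,
    }


# What actually happened: purging was not configured.
actual = incident(purging_configured=False)

# The counterfactual: same chain, one fact changed.
counterfactual = incident(purging_configured=True)

print("actual:", actual)
print("counterfactual:", counterfactual)
```

The negative statement ("purging was not configured") is just the input to the first call; the counterfactual is the second call, which describes a world that never existed.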

an hour ago by nerdponx

Ironically, "counterfactual" happens to be a technical term in the subfield of causal analysis within data analysis and statistics. See, e.g. https://david-salazar.github.io/2020/08/10/causality-counter...

an hour ago by carbocation

Based on the (apparently accidental) use of terms of art in the field, I clicked on this article genuinely expecting to see a refutation of Pearl or something.

2 hours ago by Jiro

By this reasoning, "I'm tired because I didn't sleep" is fundamentally different from "I'm tired because I exercised a lot".

Also, "the car didn't run because there wasn't enough gas" is fundamentally different from "the car didn't run because the engine broke down".

an hour ago by mannykannot

> By this reasoning, "I'm tired because I didn't sleep" is fundamentally different from "I'm tired because I exercised a lot".

Well, they are different because they propose different causes. Either of these statements (and the so-called counterfactuals in the article) might be true, depending on the actual cause.

The distinction that matters here is that statements of causes are not solutions, and solutions are elicited by different questions than causes are ("what can we do about it?" rather than "why did it happen?"). It does not really have anything to do with counterfactuals vs. causes, and the author's actual point seems to be about how to present solutions.

an hour ago by kc0bfv

Sure.

"I'm tired because I stayed up all night playing games" or "I'm tired because my child screamed all night long" are more along the lines of what the post suggests are useful, and I can see how they're fundamentally more useful for getting at the root cause than the counterfactual, "I'm tired because I didn't sleep."

But this example is very simple, which makes the argument seem less important, I'd say.

an hour ago by tshaddox

You're tired because you stayed awake. The volume was full because the admin failed to configure file purging.

an hour ago by benlivengood

I mostly agree with this author but I think counterfactual isn't the most useful word in the article.

For example "The admin did not configure file purging." isn't a counterfactual unless admins were in fact purging files.

A better phrase might be subverted expectations, along with the admonition to make expectations more explicit with the author's if-then examples. Inaccurate expectations are just as important to analyze and fix as hardware and software bugs.

Counterfactuals are actually very powerful and useful because they let us analyze the past instead of just the future. I have found that chasing counterfactuals usually bottoms out in known failure modes in practice. For example, tracing back the lack of disk space on build nodes would probably arrive at either "insufficient design" or "insufficient review" or even "insufficient understanding", all of which are well-known and common engineering failure modes for which we already know the solutions going forward: spend more time on design, review with more people, ensure education is happening. It's actually quite useful to do this because, as a trend over many postmortems, "insufficient design" has very different organizational solutions than "insufficient understanding", and it amounts to a meta-level counterfactual: "if we were better at (design|understanding), what problems would we avoid?"

2 hours ago by skmurphy

The shorthand is "if only" => "next time."

Key points from a good article.

Definition: "A counterfactual is a statement about how the world might be different now if something had happened differently in the past. It’s a kind of “alternate history” idea."

"Here’s the rub: a counterfactual cannot be a cause. By definition the counterfactual did not happen, therefore it cannot have caused anything. Only events that actually occur can be causes of other events. Causality should be stated in a form “Because X then Y”. The statement “If not X then not Y” is not an explanation, it is a kind of wishful thinking about how the past might have unfolded differently."

"When performing Five Whys it is important to avoid this counterfactual leap. Stick to the events that actually occurred. [..]"

"Try to reformulate the counterfactual as a statement about future prevention:"

1. If we configure file purging, then this won’t happen again

2. If we monitor for “volume full” conditions, then this won’t happen again

3. If we clean up files from old builds, then this won’t happen again.

"These are useful statements. When formulated this way, they’re clearly talking about the future and not hypothesizing an alternate history."

an hour ago by ihumanable

This post would have been stronger if the author had gone back to their original 5 whys and continued it without counterfactuals.

- Build didn't happen, why?

- Pod didn't start, why?

- Disk was full, why?

At this point if counterfactuals are off the table, the answers become tautologies. Although if everyone is in agreement that counterfactuals are not helpful, this could become a powerful heuristic in determining that you've hit root cause and can shift from searching for root cause to designing mitigations for the root cause.

I guess that's the point of the post, but it seemed weird to start a scenario saying "it goes off the rails here, because of XYZ" and not come back and give the better version of that scenario.

2 hours ago by pixl97

I get asked for root causes all the time.

Many times I don't have visibility into the entire system, so the best I can do is give the client a good set of counterfactuals on the bad policy they used to get to that place.