During my undergraduate years, I took a class titled "Software Studio" (6.170 in the MIT course numbering). One of the professors for the class, Daniel Jackson, had a much-anticipated ritual of opening many of his lectures by playing a sketch from the British comedy duo Mitchell & Webb (for Americans: think Key & Peele, but for the UK) on the projector. The purpose of this, beyond possibly coaxing students to attend class, was to introduce the day's concepts in a non-technical context. Many of the sketches shown in class happened to be succinct applications of the system design concept we were about to learn. For example, to introduce the idea of minimum feature sets and MVPs, the professor once played a radio sketch in which the duo complain to each other about how everyday conveniences (cellphones, microwaves, cars, even wall calendars) are apparently over-engineered. This pedagogy-by-drollery made for a memorable semester.

I've binged much of the duo's comedic material since. There is another radio play (completely unrelated to the class) that I find so morbidly original and hilarious that I keep coming back to it. In it, a radio host interviews a local resident about his opinions on the recent news that their town had no drownings in the past year. The guest's take, roughly: if nobody drowned, then every penny spent preventing drownings was a penny wasted.

Over the years, I've noticed many real-world situations where the reasoning in the sketch applies, including scenarios where no laughs are to be had (alas). I dub this heuristic the drowning principle, in homage to its source. The drowning principle can be defined in the following general terms:

The Drowning Principle

If one invests a resource with the goal of preventing (or ensuring) a particular event or outcome, the amount invested will far exceed the minimum needed in the event the goal is achieved.

Rephrased in crude mathematical terms: Let \(Y\) be the event or outcome, \(X\) be the amount of resource invested, and \(X^*\) be the minimum amount to invest in order to prevent (or ensure) \(Y\). Assuming a causal relationship between \(X\) and \(Y\), the drowning principle is expressed by the following limit: \[ \lim_{\Pr(Y | X) \: \rightarrow \: p} X \gg X^* \quad \text{where} \quad p = \begin{cases} 0 & \text{if we want } Y \text{ to never occur} \\ 1 & \text{if we want } Y \text{ to always occur} \end{cases} \]

Why might we believe this is the case? Consider a simple example: say one wants to throw a ball as far as possible, as in shot put. At least two things are under the thrower's control: the force and the angle of the throw. Assuming no air resistance, calculus tells us that the optimal angle is 45 degrees, no matter the force put in. So we make our first throw at that angle, and it lands somewhere. Now say we introduce the constraint that the next throw must be at a meager 15-degree angle from the ground. To land in the same spot, the thrower must put more force into the throw to compensate for the suboptimal angle.
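To put rough numbers on it, here's a quick back-of-the-envelope sketch (the 10 m/s starting speed is made up for illustration) using the standard ideal-projectile range formula \(R = v^2 \sin(2\theta) / g\):

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def launch_range(speed: float, angle_deg: float) -> float:
    """Range of an ideal projectile (no air resistance, level ground)."""
    return speed ** 2 * math.sin(math.radians(2 * angle_deg)) / G

def speed_for_range(target_range: float, angle_deg: float) -> float:
    """Invert the range formula: the speed needed to reach target_range
    when the launch angle is fixed at angle_deg."""
    return math.sqrt(target_range * G / math.sin(math.radians(2 * angle_deg)))

first_throw = launch_range(speed=10.0, angle_deg=45)  # the unconstrained throw
needed = speed_for_range(first_throw, angle_deg=15)   # the constrained rematch

print(f"range at 45 degrees: {first_throw:.1f} m")
print(f"speed needed at 15 degrees: {needed:.1f} m/s (vs. 10.0 m/s originally)")
```

The constrained throw needs about \(\sqrt{2}\) times the launch speed, which (since kinetic energy scales with \(v^2\)) means roughly double the energy, just to land in the same spot.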

The aim (pun intended) of this illustration is to note that there are likely many (even infinitely many) causal factors, \(X_1\), \(X_2\), \(X_3\), and so on, that influence any chosen \(Y\). And it's similarly likely that we don't have access to all of those levers, or even know that every single one exists. Thus we likely can't reach the "fully optimized" universe characterized by \(\Pr(Y | X^*_1, X^*_2, X^*_3, \ldots)\). So for any \(X_i\) (or strict subset of the levers) we can manipulate to induce \(\Pr(Y | X_i) = 0\) (or 1), we can expect it to far surpass its corresponding optimum \(X^*_i\).
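As a toy illustration of that last claim (every coefficient below is invented), suppose the probability of an unwanted outcome \(Y\) follows a logistic curve in two levers, but only \(X_1\) is under our control:

```python
import math

def p_outcome(x1: float, x2: float) -> float:
    """Toy model: probability of the unwanted outcome Y given two
    causal levers. The coefficients are invented for illustration."""
    return 1 / (1 + math.exp(-(4.0 - 0.5 * x1 - 2.0 * x2)))

TARGET = 1e-4  # our operational definition of "Y never happens"

def x1_needed(x2: float) -> float:
    """Smallest x1 that drives p_outcome below TARGET, found by scanning."""
    x1 = 0.0
    while p_outcome(x1, x2) > TARGET:
        x1 += 0.01
    return x1

print(f"x1 needed when x2 is out of reach: {x1_needed(0.0):.1f}")  # ~26.4
print(f"x1 needed when x2 is also pulled: {x1_needed(5.0):.1f}")   # ~6.4
```

In this made-up model, having the second lever available cuts the required investment in \(X_1\) roughly fourfold; locked out of it, we must pour in the difference ourselves.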

Now, some events never (or always) occurring is actually completely OK, and no one bats an eye at whether we over-extended ourselves. In 2020, the World Health Organization declared Africa free of wild polio (as determined by no recorded cases in three years; this isn't synonymous with saying there is literally no polio on the continent). I'm not aware of any serious person who has come out and said that the massively coordinated vaccination campaign behind such a success was over-resourced, much less predicated that claim on the observation that no one has polio anymore! No doubt part of the reason why the Mitchell & Webb sketch is so funny is precisely that the radio guest would just be an ordinary civilian were it not for his absurdly hawkish view on municipal public safety spending (and his amphibious burp).

The drowning principle may not apply in situations where the conversion rate from each additional unit of resource \(X\) invested to a change in the frequency of outcome \(Y\) is well known. Cooking or baking may be one such domain: each additional teaspoon of water modifies the final texture of the loaf by some predictable degree, for example. In such controlled environments we may hold a narrower confidence bound around our belief that \(|X - X^*|\) is small. The principle applies best in situations where the chain of causation from \(X\) to \(Y\) is complicated: confounding causes, weird non-linearities, long causal chains, butterfly effects, etc.

So in what situations does it apply? Consider the following example queries:

- A town reports zero drownings this year. How much did it spend on lifeguards and water safety to get there?
- You acknowledge every notification on your phone the moment it arrives. What did it take to make that possible?

In all these examples, we interrogate a state of the world where we have achieved something nominally good, probing for hidden harms in how that world came to be. There may be no issue once we've done our due diligence; for example, maybe only the most important (and hence relatively infrequent) notifications are configured to grab your attention, so acknowledging them all immediately isn't indicative of any problem. But performing this due diligence may be either a) non-obvious or b) onerous to conduct, for similar reasons why \(\Pr(Y|X)\) may be difficult to model.

We might observe that the drowning principle (dp) is very closely related to Goodhart's law (gl), which roughly states:

When a measure becomes a target, it ceases to be a good measure.

While both highlight what issues may arise if one were to succeed at a nominal goal, I would put the difference this way: gl warns of how a well-defined metric may induce unexpected behavior (i.e. a change in \(X_i\)); dp warns of how a constrained goal may induce excessive behavior (i.e. a surplus of \(X_i\)).

The drowning principle is a somewhat Bayesian outlook on the world: if you submit to the idea of having non-zero priors, how can you ever really commit to absolute certainty in your posteriors? This leads us into a deeper read on the drowning principle: the world is not best described by statements like "such-and-such always (or never) happens". The probabilities of events are seldom 0 or 1, and while we can affect the frequency of things, getting them to either stochastic extreme is, well, extremely hard. And the resource-intensive nature of those interventions means we are likely to overshoot.
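To make that concrete, here's a minimal sketch using Laplace's rule of succession, i.e. a Beta-Bernoulli model with a uniform prior on the event's yearly probability (the drowning framing is just to echo the sketch): after \(n\) consecutive event-free years, the posterior probability of the event occurring next year is \(1/(n+2)\), shrinking steadily but never reaching zero.

```python
from fractions import Fraction

def prob_next_year(event_free_years: int) -> Fraction:
    """Laplace's rule of succession under a uniform Beta(1, 1) prior:
    after n consecutive years without the event, the posterior
    probability that it occurs next year is 1 / (n + 2)."""
    return Fraction(1, event_free_years + 2)

for n in (1, 10, 100, 1000):
    print(f"{n:>4} drowning-free years -> Pr(drowning next year) = {prob_next_year(n)}")
```

No finite run of evidence pushes that posterior to exactly zero, which is the Bayesian heart of the principle.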

Which, again, may be completely fine! So remember: always have a swim buddy.