Every lie we tell incurs a debt to the truth. Sooner or later, that debt is paid.
— Valery Legasov, Chernobyl (HBO)
I must not fear.
Fear is the mind-killer.
Fear is the little-death that brings total obliteration.
I will face my fear.
I will permit it to pass over me and through me.
And when it has gone past I will turn the inner eye to see its path.
Where the fear has gone there will be nothing. Only I will remain.
— Bene Gesserit (Dune)
A quarrel had arisen between the Horse and the Stag, so the Horse came to a Hunter to ask his help to take revenge on the Stag. The Hunter agreed, but said: “If you desire to conquer the Stag, you must permit me to place this piece of iron between your jaws, so that I may guide you with these reins, and allow this saddle to be placed upon your back so that I may keep steady upon you as we follow after the enemy.” The Horse agreed to the conditions, and the Hunter soon saddled and bridled him. Then with the aid of the Hunter the Horse soon overcame the Stag, and said to the Hunter: “Now, get off, and remove those things from my mouth and back.”
“Not so fast, friend,” said the Hunter. “I have now got you under bit and spur, and prefer to keep you as you are at present.”
The structure of the program should exactly follow the structure of the problem. Each real world concurrent activity should be mapped onto exactly one concurrent process in our programming language. If there is a 1:1 mapping of the problem onto the program we say that the program is isomorphic to the problem.
It is extremely important that the mapping is exactly 1:1. The reason for this is that it minimizes the conceptual gap between the problem and the solution. If this mapping is not 1:1 the program will quickly degenerate, and become difficult to understand. This degeneration is often observed when non-CO [concurrency-oriented] languages are used to solve concurrent problems. Often the only way to get the program to work is to force several independent activities to be controlled by the same language thread or process. This leads to an inevitable loss of clarity, and makes the programs subject to complex and irreproducible interference errors.
Joe Armstrong, Making reliable distributed systems in the presence of software errors
A culture successfully terrorized on a dark day in September of 2001: initially in anger but, over the long run (to this day), in fear, throwing off the high ground of nominal “truth, justice, and the American way” for Jack Bauer ends-justify-the-means thinking.
A culture particularly prone to disengaging into its own individual, typically violent, often vengeful, entertainment bubbles.
A culture particularly prone to embracing sound bites and now internet memes to simplify complex issues into narratives that are palatable.
A culture particularly prone to being goaded into team dynamics at the expense of actually engaging people on the “other side” to work through nuanced problems.
A culture particularly prone to sensational voyeurism, to the point that it lets what has become a “socio-political-entertainment complex” get away with dosing us with garbage constantly.
18 years straight of war — leaning disproportionately on people far removed from the political decision making.
Inarguably relaxed service entry requirements — multiple times — in this ongoing span to hit the desired recruiting numbers.
Poor support for those coming back from such horrors facing their own demons as well as an incredibly competitive global marketplace and global workforces.
Governments–especially ours–preserving themselves and their revolving door cronies from unrest over horribly insolvent dynamics THEY fostered (80s – now) via transnational corporate socialism that comes at the expense of “the little guy” who gets robbed in ways “he’ll never [explicitly] understand”. Coordinated currency devaluation, statist direct intervention in the markets, and coordinated messaging. We feel the “real world” fallout.
Decades of institutional rot at the highest levels of power. Normalization of dynastic political power. And now blatant amoral criminality at the top pushing us headlong into cultic authoritarianism.
Active attacks from foreign adversaries. Little to no push back. Or even improved defense.
Overt hatred, open racism, demonization of “the other” crawling out from under the rocks.
And an ever increasing rate of folk trying to hit a real life “kill count high score” and get plastered all over the media for doing it. We look around when we go to airports. We look around when we go to concerts. We look around when we go to school. We look around when we go to movies. We look around when we go to bars. We look around when we go to night clubs. We look around when we go to church buildings. We look around when we go to the office.
It is unacceptable. We must do better.
Fail-fast is a systems design approach that intentionally fails early and visibly. The gist is that it is better for users as well as better for developers to halt a system if it finds itself in a “critical enough” situation.
Wait, What, Why?
There are worse things than a crash.
Mike Stall (via Jeff Atwood) sketched an “app health hierarchy” that runs from best to worst:
1) Application works as expected and never crashes.
2) Application crashes due to rare bugs that nobody notices or cares about.
3) Application crashes due to a commonly encountered bug.
4) Application deadlocks and stops responding due to a common bug.
5) Application crashes long after the original bug.
6) Application causes data loss and/or corruption.
Complex systems can reduce their exposure to the darkest parts of that hierarchy by failing fast. Users are immediately presented with the truth rather than discovering sometime later that their actions didn’t take. Developers get immediate, direct feedback in critical scenarios that points toward the root cause.
And, perhaps most importantly, halting in the right places prevents ongoing corruption, which is what keeps me up at night as a software developer.
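To make the corruption point concrete, here is a minimal sketch in Python. The names (`apply_withdrawal`, `LedgerCorruption`, the `balances` dict) are hypothetical and invented for illustration; they do not come from any real system. The idea is only that the check runs before the write, so a bad value never lands in state where it could spread.

```python
class LedgerCorruption(Exception):
    """Hypothetical error: the ledger would violate an invariant we cannot repair."""


def apply_withdrawal(balances: dict, account: str, amount: int) -> None:
    # Fail fast: reject bad input before touching any state.
    if amount <= 0:
        raise ValueError(f"withdrawal amount must be positive, got {amount}")
    if account not in balances:
        raise KeyError(f"unknown account: {account}")

    new_balance = balances[account] - amount
    # Fail fast: a negative balance means something upstream is already wrong.
    # Raising here, before the assignment, keeps the bad value out of the ledger.
    if new_balance < 0:
        raise LedgerCorruption(
            f"{account} would go to {new_balance}; refusing to write"
        )
    balances[account] = new_balance
```

The alternative, clamping to zero or logging and carrying on, would land this scenario near the bottom of the hierarchy above: the program keeps running, but on data nobody can trust.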
We can’t just BSOD our users!
Yes and no.
Yes, we can, if the situation is that critical and we know of no way to recover.
No, we can’t, on anything less than that.
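One way to picture that yes/no split in code is to separate failures into two classes and halt only on the one with no known recovery. This is a sketch under assumptions: `RecoverableError`, `CriticalError`, and `run_step` are made-up names, and deciding which class a given failure belongs to is exactly the judgment call being described here.

```python
import sys


class RecoverableError(Exception):
    """Hypothetical: a failure we know how to handle, e.g. a retryable timeout."""


class CriticalError(Exception):
    """Hypothetical: no known recovery; continuing risks corruption."""


def run_step(step) -> str:
    """Run one unit of work, halting only when the failure is critical."""
    try:
        step()
        return "ok"
    except RecoverableError as exc:
        # Known, recoverable situation: report it and keep the system up.
        print(f"recovered: {exc}", file=sys.stderr)
        return "recovered"
    except CriticalError as exc:
        # No known way to recover: stop loudly instead of limping along.
        print(f"fatal: {exc}", file=sys.stderr)
        raise SystemExit(1)
```

Any other exception type is deliberately left uncaught, so an unanticipated failure also surfaces immediately instead of being swallowed.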
Fail-fast requires (team) wisdom in implementation. Each situation must be considered in its own context. Tricky, at times, but in my experience the effort is very much worth it.