The Negotiability of “Severity” Levels

Posted in Incident Analysis

What does the term severity mean, in the context of incidents involving software systems? Merriam-Webster gives us this: “the quality or state of being severe: the condition of being very bad, serious, unpleasant, or harsh.” Here are a few colloquial definitions: “Severity measures the effort and expense required by the service provider to manage and resolve an […]

Hindsight and Sacrifice Decisions

Posted in Uncategorized

A few weeks ago I tweeted this thread, which references sacrifice decisions and contrasts some facets of the Knight Capital (2012) case and the NYSE trading halt (2015) case: On Aug 1, 2012, a company named Knight Capital experienced a business-destroying incident. Much has been written about it, but that’s not the topic of this thread. […]

Chapter in “Seeking SRE”: SRE Cognitive Work

Posted in Uncategorized

Dr. Richard Cook and I were honored to contribute to a chapter in the new book Seeking SRE edited by the lovely David Blank-Edelman. Consider it a signpost along the way as we learn more about how engineers actually do their work in real-world scenarios (rather than abstract or generalized descriptions of how we like to […]