Organizational Second Hit Syndrome

NOTE: Richard wrote this post but hadn’t managed to publish it before he died in 2022.

Organizational Second Hit Syndrome is an incident-related phenomenon analogous to neurological second impact syndrome (SIS). It occurs when a major incident creates a vulnerable period during which a second incident generates strong, widespread, and sometimes destructive organizational reactions. The phenomenon is one aspect of organizational reactions to failure.

Introduction

Organizations are buffeted by incidents. Especially in technology-intensive workplaces, incidents are frequent and sporadic, vary widely in impact, and evoke different reactions.

The reaction to failure is not a linear function of the weight or impact of the failure. Instead, we have seen lower impact failures produce stronger reactions than high impact ones. This is especially puzzling because many organizations have elaborate methods for measuring impact and directing resources based on its size. We observe that organizational reactions to failure are often linked and that even a minor event can produce strong reactions if it falls on the heels of a major event. 

There may be a parallel here between the ways in which organizations respond to incidents and the way that the human brain sometimes responds to an injury that happens after a prior ‘hit’. This is the “second impact syndrome”:

“Second impact syndrome (SIS) occurs when an athlete experiences a mild head injury or concussion, then suffers a second head injury before the symptoms associated with the first injury have resolved… Typically, in SIS the second head injury is only minor, and usually does not produce immediate loss of consciousness. However, within minutes of the injury, severe cerebral edema, vascular engorgement, and brain herniation develop with resultant clinical deterioration. SIS typically affects young athletes, particularly males (90%) ranging in age from 10 to 24 years… Most athletes reported to have SIS were American football players, usually at the high school level. SIS has also been reported in association with boxing, karate, skiing, and ice hockey.”

McKee AC, Daneshvar DH (2015). The neuropathology of traumatic brain injury. Handbook of Clinical Neurology. 127: 45–66.

Organizational Second Hit pattern

What we observe

  1. First Hit: The organization experiences a painful, costly, disruptive incident that attracts senior management attention and concern. This is a major and memorable event.
  2. First Hit’s aftermath engages the organizational hierarchy. Its significance drives serious examination of the issues involved. There is communication up and down the hierarchy about these issues. Plans are made to repair, restore, rebuild, recover, etc. The plans are put in motion.
  3. Some time passes. 
  4. Second Hit: The organization experiences another event. 
  5. Second Hit evokes senior management attention and action in a qualitatively and quantitatively different fashion. Turmoil ensues.

The first hit creates a vulnerable state that diminishes over time. If a second hit occurs in the vulnerable period, the results are amplified by that vulnerable state. This is analogous to SIS.

What happens after the first hit?

The first hit is a crisis

The first hit evokes reactions focused mainly on the incident and its direct consequences. Addressing the technical issues and customer and public consequences consumes the available attention and effort. First hits prompt announcements that organizational leaders will do things (just what is seldom specified) to “assure that this can never happen again”. There is a good deal of external-facing defensiveness from management, e.g. downplaying the event’s consequences or pointing out the generally good performance over some time period. Internally the focus is on an individual, usually a front-line worker. “Human error” by this worker is widely believed to be (and is often promoted internally as) the ‘cause’ or even the “root cause”.

Focus on the response

Review and critique of the first hit often centers on the incident response. Why did it take so long to discover the event’s true nature? Why were people distracted by side issues? Why did some people become fixated on unproductive avenues and ignore critical signals? What blocked remediation? Why weren’t the ‘right’ people engaged early in the incident evolution? Why did internal communications break down? Often there appear to have been premonitory signals that the incident was likely (notably prior events that mimicked the incident’s underlying themes). Why were these not recognized as trial-runs of the major incident?

Superficial investigation 

Attempts to reconstruct the incident response are often themselves blocked. The scattered records and distributed technical and organizational structure of the system hinder these activities. The incident’s big impact adds urgency to this: stakeholders demand explanations, media attention drives public relations efforts to deflect and redirect attention, regulatory and contractual consequences drive investigation and interpretation. Ultimately these demands for immediacy truncate the process. For those involved, there is a palpable irony to all of this: the response to an important, large, difficult-to-understand event is a hurried, truncated investigation. Eventually an acceptable account is cobbled together. This account is simple, easy to understand, and conveys exercise of authority. In many cases this account includes three ideas: (1) human error by one or more individuals was present, (2) the event was unanticipated and un-anticipatable, and (3) the event was unique and not reflective of the organization or technology from which it seemed to come. The management of such accounts is now a well-developed art.

Interest wanes 

Management’s interest attracted by the first hit diminishes as “business as usual” resumes. That interest was abnormal to begin with: senior leaders rarely take a close interest in incidents. Most incidents are taken in stride, addressed by lower or middle management or even by front-line workers without gaining attention from higher management. Any consideration of these ‘ordinary’ incidents by high management is in the aggregate, as a collection (read: metrics).

Vulnerability remains 

But the return of “business as usual” does not mean that the status quo is restored. The first hit changes the environment in subtle ways. The incident takes up residence in the individual and the organizational memories. These memories are disturbing for everyone but especially for senior leadership. Front line workers and first level managers who are close to the work have long been aware of the possibility of major incidents and the first hit can be treated as a fulfillment of their expectations. Senior leaders who, in contrast, do not have daily contact with the hazards of work or the many moments of their close approach, experience the first hit as foreign, unique, and astonishing. The disturbing quality of these memories gradually diminishes.

What happens after the second hit? 

Still vulnerable

The second hit occurs before the first is fully processed. Its arrival produces different reactions than the first hit and makes the organization and its technology seem more implicated in incidents. Rather than appearing sporadic and uncorrelated, incidents now give the impression of an emerging pattern. The reactions now include significant organizational changes: structural change, redirection of resources, shifts of personnel, and direct control of organizational function by senior management (“reaching down”). Although these activities are diverse, they have a common thread: they are assertions of authority that seek to re-establish (the illusion of) organizational control.

Structural change

Second hits emphasize that the problems facing the organization are manageable but unmanaged. Making structural change is a way of addressing this management failure. New mechanisms for “managing” incidents are introduced; often these seek to make explicit the authority chain or the process for decisions. These may be memorialized in documents specifying responsibility for various contingencies. These documents reflect management’s imagination about what can possibly happen. Changes to the reporting hierarchy and alterations to regular reports (e.g. shortened interval, wider distribution) are common. Sometimes a new entity is created to provide senior managers more immediate or direct access to incident operations. This is a common response after events where leadership has been embarrassed by its inability to present a convincing image of competence.

Redirection of resources

Second hits generate concern that organizational priorities are misaligned. It is common for this to play out along the “new product” vs. “reliability” dimension, with reliability being judged more important than new product development. This leads to shifting resources toward work intended to reduce the frequency and severity of incidents. Whether this work is effective is another matter. This redirection of resources is necessarily temporary; the pressure for new products is not relieved or reduced, just deferred.

How resources are redirected depends on beliefs about the sources of incidents. After-incident reviews that identify recent failures with a particular product, team, or service lead to focus on that product, team, or service. For example, we see companies seeking to constrain the resource allocation for a particular team by forcing that team to concentrate on resolving outstanding bugs, increasing test coverage, or slowing deployment rates. These efforts reflect a belief that there is a smooth tradeoff dimension between “safe” practices and “fast” ones and that the team or group’s performance is a manifestation of some temporary imbalance. Please note, we are talking here about what people believe about how things work, rather than about how things actually work.

Personnel changes

The second hit sometimes prompts replacement of one or more people in the hierarchy. This is frequently rationalized as a structural change even though the goal is to replace an individual who is thought to have failed to forestall the second hit. Although a front-line worker may alone be blamed for the first hit, the second hit blame is often directed to a mid-level manager, someone believed to have had the necessary authority and resources to correct what is now seen to be a systemic problem.

Damage to the organization’s image

Although the second hit is often — in objective terms — less damaging than the first hit, its consequences for the organization can be larger. 

The second hit occurs close enough in time to the first hit that the two can be seen as related. A first hit is usually portrayed by the affected organization as an aberration, a chance occurrence, an anomaly. Explanations rely on some form of error by a front-line worker. This characterization leads to the whole unfortunate cascade of defenses against human error (see Woods et al. 2010). More important to the organization is the effect of localizing the source of failure in an individual worker. This either absolves the organization or greatly limits its liability for the outcome. Tasca (1990) describes this function in maritime accidents. But, as Lady Bracknell notes, recurrence undercuts the argument.

The second hit calls into question the organization and, particularly, its processes and management. It raises questions and criticisms that are not so easily dismissed as isolated worker error. Boeing’s blaming pilot error for the Lion Air 737 MAX crash in 2018 was far more successful in diverting attention from the aircraft design than the same claim made in the wake of the Ethiopian Airlines crash in 2019 (Kitroeff and Gelles 2020).

The second hit makes the explanation for the first hit ring hollow both externally and internally. Senior and executive management are frequently susceptible to “good news” formulations of the sources of incidents and eager to resume “business as usual” for both its economic and political returns. A second hit can provoke quite high-level reactions. High-level management may lose confidence that middle managers are up to the task of running the show. Board members may lose confidence that executives are effective.

Conclusion

Organizations that employ information technology are commonly nagged by technical incidents; many companies recognize a hundred or more per week. Most incidents have a small impact — they affect only a few customers, are limited in duration, etc. These companies invariably have process machinery to cope with the ‘usual’ flow of incidents. That machinery seeks to address incidents and to maintain proportion between the organization’s reactions and the individual incidents. The running of this machinery is mostly out of sight and, for many senior leaders, out of mind. 

Major incidents are qualitatively different from minor ones. Their prominence engages management and evokes different kinds of reaction — some immediate and visible, others delayed or hidden. The appearance of this pattern suggests that organizational capacity to withstand incidents may recover more slowly than nominal organizational function. 

It is likely that patterns of incidents have a strong influence on an organization’s reactions to failure. Closely coupled incidents or salvos of multiple incidents may evoke different consequences than widely spaced incidents. At least in some domains the interval between ‘hits’ can be months, e.g. in the 737 MAX case. In other settings we have observed that incident salvos over a few days or weeks can produce reactions much stronger than normal.

References

Kitroeff, N., Gelles, D. (2020). As Boeing Scrutinizes 737 Max, New Safety Risks Come to Light. New York Times (New York edition, January 6, 2020, Section A, Page 1).

Tasca, L. (1990). The social construction of human error. Unpublished PhD dissertation, SUNY at Stony Brook.

Woods, D., Dekker, S., Cook, R., Johannesen, L., Sarter, N. (2010). Behind Human Error. London: CRC Press. https://doi.org/10.1201/9781315568935
