Confusion in Terminology: DOUBLE-JEOPARDY versus SIMULTANEOUS FAILURES versus COMMON CAUSE

In the world of HAZOP/PHA, double-jeopardy (D-J) is defined as a scenario
where two independent causes exist at the same time. One Cause (in LOPA
terms, one Initiating Event, or IE) can occur much earlier than the other
and not be corrected, or maybe not even discovered. Then a second cause
occurs and the scenario progresses. In LOPA terminology, one or more IPLs
(nearly always) still must fail for the consequence to occur. Similarly, in
the HAZOP or PHA world, we do not call those failures Causes; instead we
say a Safeguard must fail. There is a special case of D-J, simultaneous
failure, where the two so-called “independent” Causes (in HAZOP) or IEs (in
LOPA) occur at the same or nearly the same instant. Those are rare indeed
(on the order of one chance in 10,000 per year), unless they are due to a
common cause (CC). If the CC is obvious and if we can guess at its
likelihood, then we do not call this D-J in HAZOP; we instead re-define the
“Cause” to be, say, “Local Power Loss” or “Instrument Air failure” (if
these are the reason for the simultaneous failure), such as one leading to
two pumps failing off at the same time:

EXAMPLE: Take the case of two pumps running together (in parallel) and
feeding the same hydrocracker unit. If one pump fails, we have loss of
production rate and some operational issues; if both pumps fail, then a
scenario of backflow from the cracker (1500 psig and very hot) to the feed
tank (150 psig and warm) is possible. Of course, there are likely safeguards
(called IPLs or candidate IPLs in LOPA) that would have to fail (such as one
or more check valves and/or an SIS designed for the backflow case). Some
HAZOP teams would show the second pump in the safeguard column of the
analysis tables for the random failure case; but for the common power case
(say, total refinery power loss) nearly all HAZOP teams and all LOPA
analysts would list the joint cause (CC) as Power Loss leading to failure of
both electric-driven pumps. Knowing the CC failure and its likelihood will
lead us to take some action (such as a different driver or a separate power
supply for one of the two pumps). With all of that said, the CC failure is
also a special case of D-J.
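
To put rough numbers on the two-pump example, here is a minimal sketch that
compares a random double trip against a common power loss. Every input (the
pump trip rate, the restore time, the power-loss frequency) is an
illustrative assumption, not plant data, and the function name is
hypothetical. The coincidence formula is the usual rare-event approximation
for two independent, repairable items.

```python
HOURS_PER_YEAR = 8760.0


def coincident_failure_frequency(rate_a, rate_b, restore_hours_a, restore_hours_b):
    """Approximate frequency (per year) that items A and B are down together.

    Uses the rare-event approximation for two independent, repairable items:
    f ~= rate_a * rate_b * (tau_a + tau_b), with outage durations in years.
    """
    tau_a = restore_hours_a / HOURS_PER_YEAR
    tau_b = restore_hours_b / HOURS_PER_YEAR
    return rate_a * rate_b * (tau_a + tau_b)


# Assumed, illustrative inputs (not plant data).
PUMP_TRIP_RATE = 0.5          # random trips per year, each pump
PUMP_RESTORE_HOURS = 1.0      # time to restart or swap a tripped pump
POWER_LOSS_FREQUENCY = 0.1    # losses of the shared power supply per year

random_double_trip = coincident_failure_frequency(
    PUMP_TRIP_RATE, PUMP_TRIP_RATE, PUMP_RESTORE_HOURS, PUMP_RESTORE_HOURS
)

print(f"Random double trip:   {random_double_trip:.1e} per year")
print(f"Common power loss:    {POWER_LOSS_FREQUENCY:.1e} per year")
print(f"Ratio (CC / random):  {POWER_LOSS_FREQUENCY / random_double_trip:.0f}")
```

With these assumed inputs the random double trip lands below one chance in
10,000 per year, while the common power loss is roughly three orders of
magnitude more frequent, which is why the team re-defines the Cause as the
CC instead of treating the pair as D-J.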

BUT, in the Incident Investigation world, each of these is a Causal Factor
(just different terminology in that world); so IE1 is a causal factor, IE2
is a causal factor, and IPL1 failing is a causal factor, etc.

EXAMPLE: Take the case of an engine failure on a plane. Normally, only
one engine will fail at a time (unless you run out of fuel, etc.); of
course, that is a serious scenario and one that occurs each week somewhere
(given the large number of aircraft flying at one time). Two engines can
also fail internally at the same time. This used to occur much more
frequently than it does today, because one mechanic (or a pair working
together) was in the past allowed to work on two or more engines on the
same plane on the same day. When that occurs, one mistake is very likely
to be repeated in the maintenance or repair of multiple engines (or multiple
hydraulic systems). After numerous double, triple, and even quadruple
failures on the same flight, within seconds or minutes of each other, the
FAA passed regulations that required staggering of maintenance activities to
break this CC failure. This improved the reliability of primary systems and
safeguard systems. We do the same in nuclear power and, to a growing extent,
in the chemical process industry.

The Bayer CropScience accident (Institute, West Virginia, 2008) is a good
example of a D-J event that would normally be low frequency because there
is no CC. However, such a D-J event becomes more likely if there are common
ROOT causes as well (higher worker fatigue, poor procedures in general,
etc.). These events are also more likely if the first failure (even if
announced) is not promptly repaired/remedied. They are VERY likely if there
is no PHA or HAZOP of the procedures for this mode of operation to uncover
the D-J event ahead of time.

Bottom line: D-J is a term used to describe two Causes (supposed IEs)
being true at the same instant in time, even though the individual Causes
(errors or failures) may have occurred at different times. Simultaneous
equipment failure is only one case of D-J, and it is so rare that it is
normally not discussed in a PHA or HAZOP, which is good, unless your first
failure can linger for a long time before repairs occur (in FTA terms, the
MTTR is much longer than it should be). Further, there are many other D-J
scenarios in batch operations, startup modes, online maintenance modes, and
even (more rarely) continuous operation that are neither CC nor
simultaneous. To say these are not credible is definitely wrong; just look
at accident histories. This is one reason that startup, online maintenance,
and shutdown modes of operation (of a continuous process/unit) account for
80% of the major accident scenarios. If the HAZOP and PHA teams are not
applying the methods to all modes of operation, they will miss these
scenarios; many of those are D-J.
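
The point about a lingering first failure can be sketched numerically. This
is only a rough illustration: the failure rates and repair times are
assumed values, the function name is hypothetical, and the fraction of time
the first failure is present is approximated as rate times MTTR, which
holds only while that product is well below one.

```python
HOURS_PER_YEAR = 8760.0


def lingering_dj_frequency(first_rate, second_rate, first_mttr_hours):
    """Frequency (per year) of the second cause arriving while the first persists.

    The fraction of time the first failure is present is approximated as
    first_rate * MTTR (capped at 1.0); the second cause is assumed independent.
    """
    fraction_failed = min(1.0, first_rate * first_mttr_hours / HOURS_PER_YEAR)
    return second_rate * fraction_failed


# Assumed, illustrative inputs (not from any incident or database).
FIRST_RATE = 0.1    # first failure, per year
SECOND_RATE = 0.1   # second cause, per year

for label, mttr_hours in [("8 hours", 8.0), ("1 week", 168.0), ("3 months", 2190.0)]:
    frequency = lingering_dj_frequency(FIRST_RATE, SECOND_RATE, mttr_hours)
    print(f"First failure lingers {label:>8}: {frequency:.1e} per year")
```

Under these assumptions, letting the first failure sit for three months
instead of eight hours raises the D-J frequency by a factor of a few
hundred, which is enough to move a scenario from "not credible" into the
range a PHA or HAZOP team would normally review.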

Posted August 14th, 2015 | Double-Jeopardy

One Comment

  1. Mike Robertson, October 5, 2015 at 8:59 am

    I find incidents can occur more from a culmination of events vs. a simultaneous occurrence of events. It is not unlikely to have two manual block valves left open or a reaction runaway with the key thermocouple not working. I think we need to deal with this in our reviews especially if history supports the scenario.
