Beware of the Black Swans


By Jane Lo, Singapore Correspondent

Nassim Nicholas Taleb’s “The Black Swan: The Impact of the Highly Improbable” appeared in 2007, the year the Dow Jones Industrial Average peaked at 14,164. Although the book topped the New York Times bestseller list for weeks, Taleb’s argument that banks and trading firms were vulnerable to improbable events, and exposed to losses beyond those predicted by models built on standard scenarios, was taken to be an academic one.

That was until the Global Financial Crisis erupted a year later.

The collapse of Lehman Brothers, one of the oldest and largest investment banks on Wall Street, the near-bankruptcy of Merrill Lynch, and an incessant string of bank bail-out announcements by governments on both sides of the Atlantic sent global markets plummeting into a period of extreme volatility.

“Black Swan”, a term that describes impossibility, derives from the presumption that ‘all swans must be white’, which held until black swans were discovered in Australia.

The Global Financial Crisis drove home the lesson that “Black Swans” – rare, unexpected but highly significant events – are much more common than we think.

The 2015 cyber-physical attack on the Prykarpattya Oblenergo power utility in Western Ukraine, the first cyber-physical attack since Stuxnet degraded Iran’s uranium processing capability in 2010, was an unexpected but highly significant event.

Mike Bates (Principal Consultant, Risktec, TÜV Rheinland) at the Safety Case Symposium 2018 held in the Singapore Institute of Technology (14th March 2018). Photo credit: TÜV Rheinland

At the Safety Case Symposium 2018, held at the Singapore Institute of Technology on 14th March 2018 in partnership with TÜV Rheinland, SITLEARN and the Singapore Standards Council, and attended by 200+ delegates from more than 10 countries, we sat down with Mike Bates (Principal Consultant, Risktec, TÜV Rheinland) to chat about risks and Black Swans in the critical infrastructure sectors.

What were some major accidents in the past?

There was the 2005 explosion at the third largest refinery in the US, BP’s Texas City Refinery, triggered by the ignition of a hydrocarbon vapour cloud, which killed 15 workers and injured more than 180.

In Singapore, there was the fire at Shell’s refinery off the mainland at Pulau Bukom in 2011, which began near pipelines carrying petroleum products, and took more than 100 firefighters 32 hours to extinguish.

In the UK, the one that led to the introduction of the offshore safety case was the 1988 explosion and fire on the Piper Alpha platform in the UK North Sea, which killed 167 of the 226 men on board. The recommendations included best practices for operational safety: clear shift handovers, adequate safety and evacuation training, an operational firewater system, and timely management decisions.

Recommendations from this incident also formed useful guidelines for other countries when drafting their own regulations.

In today’s world, what does it mean to ensure safety of a modern industrial control system?

Digital transformation across the industrial and OT sector means that no single process or piece of hardware can be considered completely “safe” in an always-on, connected environment.

Functional safety and cybersecurity are now inextricably linked in modern plant and process control systems.

A plant that meets the necessary functional safety design requirements could still be compromised by a cyber-attack that impacts its safety integrity level.

Embracing Industry 4.0 means embracing the challenge of both safety and cybersecurity risks.

So, in Singapore, you have the Singapore Cybersecurity Bill that was recently passed, requiring critical information infrastructure owners and operators to take responsibility for securing their systems and networks, while the regulations for the Safety Case Regime kicked in last September.

What are the obligations under the Safety Case Regime?

All Major Hazards Installation (MHI) companies are required to submit a Safety Case.

** MHIs in Singapore comprise petroleum refining and petrochemical manufacturing facilities, chemical processing plants, and installations where large quantities of toxic and flammable substances are stored or used. There are around 110 in Singapore.

Fundamental obligations under the regulations to prevent major accidents include identifying the hazards and risks that may lead to a major accident, setting out control measures, showing how organisational, technical and human factors contribute to safety, and making arrangements to rectify identified shortcomings.

Safety Case Symposium 2018 held in the Singapore Institute of Technology. Photo credit: Safety Case Symposium 2018.

What are the key concepts for a good Safety Case?

Avoid performing a ‘paper exercise’ and generating reams of documentation that is neither read nor practicable.

Follow a SHAPE approach:  S-“Succinct”, H-“Homegrown”, A-“Accessible”, P-“Proportionate”, E-“Easy to Understand”.

For example, “Homegrown” means involving staff from different levels of the organisation: leadership, middle management, supervisory and ground staff, personnel who understand plant design and operation, staff with expertise in quantitative risk assessment and process hazard analysis, engineers, and emergency response team members.

By “Proportionate”, we mean the time and effort spent producing a safety case should be proportionate to the risks from the facility. A small plant with high fatality potential may need more effort than a very large facility with low fatality potential.

“Beware of Black Swans” – does this mean predicting the unpredictable?

It is not possible to identify and predict all plausible hazardous scenarios for an infrastructure control system that has multiple interdependencies and millions of possible interlinked chains of events and outcomes.

It is more critical to have a crisis management approach to manage the situation effectively: in other words, emergency response and business continuity plans to recover from events.

These set out detailed systems and flexible resources, appropriate and relevant teams, communication channels to escalate and inform stakeholders, and pre-established partnerships, including with third parties who can work with you to help.

Keep the plan up to date. Conduct drills, whether an integrated response drill within the facility or a role-play or desktop exercise, and test your mutual aid agreements.

Simply put, if I were an investor, I would want to know that the company would still be running after an event happens.

What are some practical steps?

Establish the context and scope of your assessment. Use a recognised framework such as the relevant ISO standard. Conduct workshops to take an inventory of hazards and risk factors. Involve the right participants, and start with what they consider to be high-risk areas based on their experience.

In a refinery, for example, high-energy materials such as oil and gas present a significant hazard, with pressures and temperatures adding to the risk. So, a hazardous scenario could be damage to live pipework causing loss of containment of these materials which, under specific pressure and temperature conditions, may cause fire or explosion. But how you rate the risk is unique to the environment: it depends, for example, on your asset’s distance from the source of an explosion – the nearer you are, the higher the impact.

Quantifying the likelihood and impact of each risk helps you rate your risks and design the appropriate safeguards and mitigants.
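As a rough illustration of rating risks by likelihood and impact, the sketch below scores a few hazards on a simple matrix. The 1 to 5 scales, the thresholds for "high" and "medium", the hazard entries and all names in the code are assumptions made for this example, not figures from the interview or from any particular standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (catastrophic)


def risk_score(hazard: Hazard) -> int:
    """Score a hazard as likelihood multiplied by impact."""
    return hazard.likelihood * hazard.impact


def risk_rating(hazard: Hazard) -> str:
    """Map a score to a band; the thresholds are illustrative assumptions."""
    score = risk_score(hazard)
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


# Hypothetical workshop output for a refinery-style facility.
hazards = [
    Hazard("Loss of containment from live pipework", likelihood=2, impact=5),
    Hazard("Minor spill in storage area", likelihood=3, impact=1),
    Hazard("Pump seal failure near ignition source", likelihood=4, impact=4),
]

# Rank so that the highest-scoring hazards are reviewed first.
ranked = sorted(hazards, key=risk_score, reverse=True)

for hazard in ranked:
    print(f"{risk_rating(hazard):<6} {risk_score(hazard):>2}  {hazard.name}")
```

In practice the scales and thresholds would come out of the workshops and the framework the facility has adopted; the point of the sketch is only that an explicit score makes the ranking, and the choice of safeguards, easier to discuss.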

And if you use industry software pre-loaded with scenarios, parameter settings and algorithms, remember that these may be derived from certain assumptions about the laws of physics (e.g. Boyle’s law). So, calibrate the results to your environment.

For example, gas and pressure behave differently in a desert versus, say, on Jurong Island in Singapore. And the societal impact of an explosion in a desert is arguably lower given the lower population density. On the other hand, the resources to manage the situation are also arguably limited. So, safeguards for the same hazard in two different locations call for different protocols and designs.

And of course, the settings need to be tweaked for the season (e.g. winter or summer).
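To see why ambient conditions matter, here is a minimal sketch that uses the ideal gas law for a fixed amount of gas in a sealed, fixed-volume vessel, where absolute pressure scales with absolute temperature. The ideal-gas assumption, the function names and the example temperatures are illustrative only; real process fluids and commercial modelling software will be more complex.

```python
def celsius_to_kelvin(temp_c: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin."""
    return temp_c + 273.15


def pressure_at_temperature(p_ref_bar: float, t_ref_c: float, t_new_c: float) -> float:
    """Absolute pressure of a sealed, fixed-volume ideal gas at a new temperature.

    Assumes ideal-gas behaviour with constant volume and amount of gas,
    so that p_new / p_ref = T_new / T_ref (temperatures in kelvin).
    """
    if p_ref_bar <= 0:
        raise ValueError("reference pressure must be a positive absolute pressure")
    t_ref_k = celsius_to_kelvin(t_ref_c)
    t_new_k = celsius_to_kelvin(t_new_c)
    if t_ref_k <= 0 or t_new_k <= 0:
        raise ValueError("temperatures must be above absolute zero")
    return p_ref_bar * t_new_k / t_ref_k


# Hypothetical case: a vessel characterised at 10 bar (absolute) and 15 °C,
# then exposed to a 45 °C ambient temperature.
p_hot = pressure_at_temperature(p_ref_bar=10.0, t_ref_c=15.0, t_new_c=45.0)
print(f"Pressure at 45 °C: {p_hot:.2f} bar (absolute)")
```

Even this simplified relation shows the same vessel sitting at a noticeably different pressure once the ambient temperature changes, which is one reason default software settings should be checked against the site and the season.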

Key things to keep in mind?

Your stakeholders extend beyond your company and employees to suppliers, the neighbours, and ultimately the end-users. What is the contingency plan if the power supply is cut off and consumers have no access?

Also ensure sensitive information and data are protected and secured when communicating with your client. Manage your physical security risks, for example by restricting access to facilities to authorised personnel.

There is also a difference between high risks from a business continuity perspective and those from an operational risk perspective. A high dependency on adequate fire-fighting resources in case of an emergency is an example of a business continuity risk, whereas a high dependency on the competency of operations staff in following the safety procedures is an operational risk.

Final Tips?

Many industrial major accidents are colloquially described as black swans, when in fact they were entirely foreseeable and preventable – a good place to start is to foster a culture that has a ‘collective mindfulness’ of such risks.

So, a safety case could help to foster and formalise such a culture. It should include all of the above, and should:

  • Focus on managing risk
  • Clearly define the scope, and keep within it
  • Focus on what the key users and stakeholders need to know
  • Include ‘workers’ in the development to ensure ownership
  • Present information clearly and concisely – be easy to understand and easy to navigate, minimise repetition, and use up to date, relevant references/supporting information
  • Contain clear and implementable recommendations, and either contain or reference an implementation plan

But most importantly, it should be signed by senior company personnel to demonstrate commitment from senior management.