Tuesday, November 13, 2012

Risk Assessment

What is risk assessment?

Risk assessment is the process of determining the relative probability and consequence of taking an action in response to an event. In risk assessment we are looking at the probability of an event taking place, the actions we take in anticipation of the event, and the consequences of each action if the event does or does not take place.

It is important to include the action "do nothing" as one of the potential actions.

At the end of the risk assessment we will have the ability to look at all of the potential outcomes, the likelihood of each outcome, as well as how acceptable each outcome is.

For any event there is a probability of that event taking place. What is the probability of the power failing tonight for more than an hour? What is the probability of the power failing in the next 6 weeks for more than an hour? What is the probability of power failing sometime in the next 6 months for more than a day?

Each of the above named events has a probability. Where we live, the probability of losing power for more than 5 minutes sometime in the next 12 months is near 100%. The probability of losing power for more than 6 hours sometime in the next 12 months is closer to 40%. These are very rough numbers and are the result of looking at the history of power failures in our area. They are also the result of looking at preventive maintenance by the power company in our area.

Identifying an event and assigning the probability of the event happening is the first step in risk assessment.

For each of the above events we can make a list of actions we can take in anticipation of these events. The following are some of the actions we can take in anticipation of a power failure event.

  • Do nothing
  • Buy and install a full building power backup system (UPS)
  • Buy and install a full building backup power system.
  • Buy a portable generator and install a generator feed with switching capability for the building.
  • Buy a portable generator and some extra extension cords
  • Set aside $500 to buy a portable generator and some extra extension cords if the event takes place.
  • Buy extra candles, some oil or Coleman lanterns and a camp stove along with some sleeping bags.

The next step is the hard step. Each of these actions needs to be evaluated in terms of what the consequences are of the event happening or the event not happening for each action.

To put it differently, the event will or will not take place. For each action you have identified, you have to identify the consequences of taking that action if the event does take place as well as if the event does not take place.

Here's a simple example: A standing dead tree will fall on the house. The probability is low. One possible action: I will cut the tree down now so it will fall safely away from the house. The consequence of taking no action and the tree falling is that our house is damaged and maybe somebody is hurt. Also, the contents of our house are likely to be damaged. The consequence of taking no action and the tree NOT falling is that there is no change. The house is still at risk but nothing has happened. The consequence of cutting down a tree that would have fallen is that I have to buck it up and turn it into firewood. The consequence of cutting down a tree that would NOT have fallen is the same: I have to buck it up and turn it into firewood.

Of the four possible outcomes, 2 are good, 1 is very bad and 1 is no change in condition.

An approaching hurricane changed the probability of the event taking place, which changed the risk assessment, which led me to go out and cut down the tree.

In the following table we go back to our power outage event and list our actions and the consequences of taking that action if the event takes place and if the event does not take place.

Power Goes Out For 24 Hours

| Action | Happens | Does Not Happen |
|---|---|---|
| Do nothing | Frozen goods start to melt, no internet, no water pump, no electric heat, no electric fans, no hot water (and maybe more) | Nothing |
| Install Full Building UPS | Nothing happens. Use up some fuel. | $15,000-$25,000 gone |
| Install Full Building Backup Power | Power goes out to everything for 3 to 5 minutes | $5,000-$6,000 gone |
| Buy Portable Generator | Power failure to house. Extension cords have to be run. Some items will not have power. All 220V (well pumps, furnace blowers, electric dryer) will need special handling or might not work at all. Fuel use will have to be monitored | $200-$1,000 gone |
| Put aside $500 | Power goes out. You have to get to a store that still has a generator. Buy generator. Buy gas container. Buy gas. Get generator home and hooked up like above. | Nothing |
| Get electric free equipment | Computers and internet go down. People gather around the wood fired stove for heat. People go to bed earlier because candles or oil light not bright enough. People talk to each other. Food is cooked over wood or propane stove. | A few hundred dollars are spent on food, candles, an extra tank of propane for the grill. |

Setting up a table like this helps us understand the different possible outcomes. We get to see the good side and the bad side of each and how we can evaluate which choices are best for us.
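For readers who like to keep this kind of table in a script, here is one possible way to hold it as data. This is only a sketch: the names `Outcomes`, `decision_table` and `list_outcomes` are made up for illustration, and only three abbreviated rows of the table are included.

```python
from typing import NamedTuple


class Outcomes(NamedTuple):
    """The two cells of one table row (hypothetical structure, for illustration)."""
    if_event_happens: str
    if_event_does_not_happen: str


# Three abbreviated rows from the power outage table.
decision_table = {
    "Do nothing": Outcomes("Frozen goods start to melt, no internet, no water pump", "Nothing"),
    "Install Full Building UPS": Outcomes("Nothing happens. Use up some fuel.", "$15,000-$25,000 gone"),
    "Put aside $500": Outcomes("Shop for a generator during the outage", "Nothing"),
}


def list_outcomes(table):
    """Return one (action, event state, consequence) tuple per cell of the table."""
    rows = []
    for action, outcomes in table.items():
        rows.append((action, "event happens", outcomes.if_event_happens))
        rows.append((action, "event does not happen", outcomes.if_event_does_not_happen))
    return rows


for action, state, consequence in list_outcomes(decision_table):
    print(f"{action} / {state}: {consequence}")
```

Every action contributes exactly two outcomes, one for each state of the event, so "do nothing" is always evaluated alongside the other choices.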

The Complexities

Unfortunately, risk assessment is not a simple process. In fact it is very complex due to interactions between different actions and events.

Let's consider an example of such an action. It is 9/11/2001 and then-President Bush is informed that four aircraft have been hijacked. He is told that there is actionable intelligence stating that the hijackers are going to fly the aircraft into the World Trade Center, the Capitol Building in DC and the White House.

He has two choices. He can let the events play out and the hijackers MIGHT actually do what is predicted, or he can push the destruct button on the aircraft, causing them to be destroyed instantly with loss of the lives of everybody onboard.

The risk assessment says that there is a 95% chance that 5,000 to 30,000 people will lose their lives if he lets the events unfold.

If he pushes the button there is a 100% chance that 200-400 people will die on those aircraft. Does he push the button? Would you push the button? Could you kill 400 people when nobody might die if you don't?

The consequences then cascade from there. We know the consequences of those aircraft destroying the World Trade Center buildings. We know the horrible cost in lives lost, in families ripped apart, and of health issues for those that were there. We know how even today we pay the price in freedoms lost when we go to board an aircraft.

What of the consequences if he had pushed the button? He would have gone down in history as a mass murderer. "He killed 400 defenseless people!", "He didn't KNOW they would have crashed the aircraft into those buildings; everybody knows that hijackers always fly someplace and then demand something.", "It was all a government plot, there were never any hijackers, he lied and people died!" Those are some of the things that would have been said.

What would never have been said was, "He saved the lives of tens of thousands of people." Because we would never have known the results of him NOT pushing the button.

One of the real problems with risk assessment is that successful risk assessment means nothing happened. The only visible consequence of the assessment and choosing an action is that there was some cost that a bean counter can see.

Avoiding the "Dead Baby Story"

People react much more strongly to a good emotionally driven story than they do to numbers, logic or reason. A dead baby story is exactly that: a strong, emotionally driven story used to get people to make a decision. The decision could be a good one or a poor one but it is impossible to judge due to the nature of it. A poor decision is one made without regard to fact or logic.

Dead baby stories are used all the time to explain why something should be outlawed or forbidden. "I once heard that somebody put razor blades in apples so don't ever accept fresh fruit when trick or treating", "A friend of a friend drowned because she was wearing her seatbelt and couldn't get free when her car went in the river." These are examples of "dead baby" stories. In almost all of these stories, something terrible happened to somebody or something, which pulls on the heart strings. However, the probability of the event in question was very, very low.

The reason they are called "Dead Baby Stories" is that the story often involves some child being hurt or killed. "You should never have a gun in your house. I read a story where a 2 year old got a hold of his father's pistol and accidentally killed himself with it."

One of the real problems with not getting caught up with a dead baby story is that sometimes they are real events with real consequences. There are babies that died because the bathwater was too hot. There are people that died because they couldn't get out of their seat belts when their car went into the water. These are real events and consequences.

Regardless of whether or not the stories are real, the risk assessment must take into account both the probability of the event taking place and the consequences of that event.

The stories of people dying because of seat belts trapping them in a situation where they had to escape are many. But the analysis has to take into account two different events. The first is, "My car goes into the water and my seat belt does not release." The probability of this happening is very low but the consequence is high. Total risk, LOW.

On the other hand, the probability of getting into an accident is relatively high. The probability of being injured in an accident if you are NOT wearing your seat belt is even higher. Therefore the risk from wearing your seat belt is much lower than the risk from NOT wearing your seat belt.

Part of the problem with dead baby stories is that sometimes we don't recognize them as dead baby stories. Sometimes the stories are presented in the news as a "did you know?" with lots of facts about how bad the consequences are. Or how awful the company is for letting this consequence happen.

Sometimes the information is presented in such a way that we want to believe, and our risk assessment goes out the window. BPA might be an example of this. There is some research that shows this chemical might leach from a bottle into its contents and that this leaching might cause issues. But the people that are pushing this research just happen to have been hired by a company which was selling BPA free bottles . . . before BPA was an issue.

So is this a dead baby story being used to sell a product or is it good research that was brought to view by a company who just happens to represent a client that sells BPA free items? Makes you wonder.

Here's an example from a few years ago. IUDs are one of the safest, most effective forms of birth control available. They have one of the lowest failure rates and one of the lowest side effect rates of any birth control method. For a long time, they were considered dangerous to use and were often ignored in the U.S. However, that is now changing.

Because the IUD has to be inserted and removed by a medical professional, a woman will not "forget" it. It is not going to have "pinprick" failures. A woman won't accidentally stop taking her birth control pills. An IUD just works.

I've had multiple doctors tell me these facts and my research supports their opinions. (Personal contact with multiple obgyns - while the sample size is not large, having all report the same thing leads me to believe that these facts are true).

But then the dead baby stories started. They are based on fact. They really did happen. The Dalkon Shield had a design flaw. The flaw led to Pelvic Inflammatory Disease (PID). PID could cause scarring and adhesions, which in turn led to significant reproductive health issues.

Under the risk assessment, we have a consequence, "PID," and a probability. There were millions of women using IUDs (160,000,000+ in 2002) but only 4,000,000 Dalkon Shields were ever used. Math gives us a 2.5% chance of a woman having a Dalkon Shield if she chose to get an IUD. Of those, only 8% had issues, meaning that the probability of a negative outcome was 0.2%.
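The arithmetic behind those two percentages can be checked with a few lines of Python. The figures are the ones quoted above; the variable names are just for illustration, and like the text this mixes a 2002 user count with the total number of Dalkon Shields, so treat it as a rough estimate.

```python
# Figures quoted in the text; variable names are illustrative.
iud_users = 160_000_000      # women using IUDs (2002 figure)
dalkon_shields = 4_000_000   # Dalkon Shields used
problem_rate = 0.08          # share of Dalkon Shield users who had issues

# Chance that a given IUD was a Dalkon Shield.
p_dalkon = dalkon_shields / iud_users

# Chance that an IUD user had a Dalkon Shield AND had issues with it.
p_bad_outcome = p_dalkon * problem_rate

print(f"P(Dalkon Shield | IUD) = {p_dalkon:.1%}")       # 2.5%
print(f"P(negative outcome)    = {p_bad_outcome:.1%}")  # 0.2%
```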

Looking at the statistics for other birth control methods it is easy to see higher failure rates. But the stories were so horrible of some women being unable to have children after using "an IUD," or of birth defects "caused" by "an IUD," that many women stopped choosing IUDs as a form of birth control.

This example of a dead baby story shows how a consequence with a horrible outcome (sterilization, birth defects) but a very low probability of occurrence led women to choose a path with bad outcomes (getting pregnant, headaches, dizziness, spotting, decreased libido, mood swings, interactions with other medications and health conditions) and a much higher probability of one or more of these negative consequences.

I apologize for not having all the citations for this digression. Please feel free to use Google to double-check my information.

Another example is the outlawing of DDT. There was a very small probability of health issues from DDT. But DDT controlled the mosquitoes which carried a wide range of diseases that killed many more people than DDT ever harmed. In fact there is some research which says outlawing DDT is one of the reasons for such serious health issues in third world countries. Just look at the advertisements asking for netting to protect people from mosquitoes.

Still another example is our airport security procedure. The facts say that the TSA is not preventing terrorists or others from getting weapons on aircraft. Note that the "shoe bomber", "underwear bomber", and "toner cartridges" were all discovered or stopped by people other than the TSA. The hope that they might just stop one bad guy is worth it to the majority of people, and so they give up vast amounts of personal liberty in exchange for the appearance of personal protection.

To put a little perspective on dead baby stories affecting our choices, there are 43,600 injuries per year to children alone in the bathroom or bath tub, and around 140,000 per year overall, vs. 165,000 ladder related falls. Yet ladders have all sorts of warnings on them and everybody talks about how dangerous they are. There are even rules that say that people have to use "fall arresting" gear when on ladders, yet almost as many people are hurt in bathroom related accidents every year.

Risk assessment requires separating three things: the probability of an event, the actions taken in response to that event, and the consequences of those actions in the face of the event happening or not happening. When these parts are not separated it becomes almost impossible to create a good risk assessment.

Just because there is a consequence that is horrific does not mean that avoiding that action is the correct path to choose.

How Event Probability and Consequences Affect the Risk

Risk is the combination of the probability of an event and the consequences of that event taking place in the face of preventative actions. The higher the probability of an event taking place, the higher the risk from the event, and the higher the consequence from the event, the higher the risk.

When we are doing a risk assessment we are looking at actions we can take prior to the event in order to modify the total risk involved.

To better understand the interrelationship between probability and consequences in determining total risk, we are going to use a simple example of placing a bet. There are only two actions to be considered.

For the purposes of this example we are going to use an event of a coin toss. The coin and toss are "fair" which is to say the coin will land heads up 50% of the time.

Event: The coin will land heads up.

The action will be "placing a bet" on the outcome of the event. In other words we are betting on the coin landing heads up. The second action is "do nothing."

The consequence is the loss of the amount bet.

Flipping A Fair Coin

| Action | Event Happens | Event Does Not Happen |
|---|---|---|
| Do Nothing | Nothing | Nothing |
| Place a Bet | Win Amount of Bet | Lose Amount of Bet |

From the table we can see that there are four potential consequences: two neutral, one good, one bad. If we choose not to bet, nothing bad will happen but nothing good will happen either.

If we set the amount of the bet low, say $1, then we can assign a "value" to each consequence: "acceptable" or "unacceptable". If the bet value is low enough then all four outcomes are "acceptable".

If we raise the amount of the bet to $20, then things start to change. Now we have three results which are acceptable and one that is "unacceptable". With only two choices we can't tell how "unacceptable" the bad result is. What we do know is as the size of the bet increases the level of "unacceptable" becomes higher.

Using a scale from 0 to 10 with 0 being totally unacceptable and 10 being acceptable, anything less than 10 is unacceptable to some extent. So to use this in our example, if we are betting $1 then the consequence of losing the bet has a value of 9. On the other hand if we are betting our mortgage payment, the consequence of losing might be a 4. If we were betting our life it is likely that it will have a value of 0 or 1. (For most people the thought of losing the life of a loved one is a 0 while the loss of their own life is about a 1).
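One way to see how the acceptability score and the probability interact is to weight each score by the chance of landing on it. To be clear, this weighted average is an assumption made for this sketch, not a formula from the discussion above. The losing scores of 9 and 4 are the ones given in the text; scoring a win or "nothing happens" as 10 is also an assumption.

```python
P_HEADS = 0.5  # fair coin, from the example

# Acceptability on the 0-10 scale (10 = acceptable).
# Losing scores (9 and 4) come from the text; the 10s are assumed.
actions = {
    "do nothing":   {"heads": 10, "tails": 10},
    "bet $1":       {"heads": 10, "tails": 9},
    "bet mortgage": {"heads": 10, "tails": 4},
}


def weighted_acceptability(scores, p_event):
    """Probability-weighted acceptability of one action (illustrative combination rule)."""
    return p_event * scores["heads"] + (1 - p_event) * scores["tails"]


for name, scores in actions.items():
    print(f"{name}: {weighted_acceptability(scores, P_HEADS):.1f}")
# do nothing: 10.0
# bet $1: 9.5
# bet mortgage: 7.0
```

The probability is the same for both bets, so the gap between 9.5 and 7.0 comes entirely from the size of the consequence.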

We can actually see how people are affected by risk assessments when we watch the same group of people play poker for cash and when they play for tokens with no physical value. People that have nothing to lose (the tokens don't have value) will bet heavier and on poorer hands than when they are using real cash.

One thing that has to be taken into consideration when these types of risk are calculated is that the value assigned to each consequence is very personal. A person with a "fun" budget of a few thousand dollars will assign a consequence value to a $100 bet much differently than a person with a fun budget of only $500.

At this point you should be able to look at our simple example and see how the risk changes based on the consequences. But we can also modify the risk by modifying the probability of an event.

We modify the probability of an event by learning more about an event or by changing the circumstances in which the event might take place.

Taking an example of rock climbing, the probability of a "fall" happening is reduced by increasing the experience of the climber. We can also reduce the probability by changing equipment or conditions in which the climb is taking place.

Here is an example of a set of risk analyses that was performed in the mid 1980s. They bet wrong.

The event will be: A catastrophic failure with loss of lives because of a gasket failure. The probability of gasket failure is set at 0.1% when the temperature is above 32°F. We know the failure rate is higher if the temperature is lower but we don't know what the actual failure rate will be.

The actions we can take are: Refuse to perform the mission if the temperature is below 32°F. Test to determine the failure rate below 32°F. Perform the mission if the temperature is above 0°F. (The entity in charge had already decided that launch at sub-zero temperatures was unacceptable.)

Catastrophic Failure Of Seal

| Action | Event Happens | Event Does Not Happen |
|---|---|---|
| Do Nothing | People die, equipment lost, huge PR issue | Procedure continues as is |
| Refuse Mission | Event can not happen | PR issues, loss of revenue, loss of face, loss of management bonuses |
| Perform Extra Tests | Same as Do Nothing | Same as Do Nothing |

In this case we can see that one of these actions does not actually affect the risk analysis. So why include it? The answer is that performing the test gives us a better understanding of the probabilities of the event. Given a better probability, we can make better decisions.

This analysis was actually done for NASA. The actual event was hidden. The event that should have been analyzed was, "Is the probability of seal failure significantly higher at 32°F such that our risk analysis for a mission go decision should be modified?"

Do Tests show different probabilities?

| Action | Event Happens | Event Does Not Happen |
|---|---|---|
| Do Nothing | Same probability fed to primary analysis. | Same probability fed to primary analysis. |
| Test Performed | High cost of test, loss of face, potential delay in missions, new better probabilities leading to more mission delays | High cost of test, loss of face, potential delay in mission |

In the analysis done it was decided that the cost of doing the testing to determine the probability of failure at low temperatures was prohibitive. Therefore the testing was not done. Therefore the risk assessment stayed the same, which was that it was safe to launch the space shuttle when the temperature was at freezing the night before. Therefore the launch did take place. During the launch a gasket (O-Ring) failed, leading to the catastrophic failure of the mission including loss of all lives aboard and the loss of the vehicle as well.

Please note an important aspect of risk analysis is having good information on consequences and probabilities. If you are working from bad data then your assessment is likely to be bad as well.

So far we've been modifying the consequences to show how that changes the risk assessment. We can also change the probability of an event taking place in order to increase or decrease the risk involved.

If the probability is modified, the risk is modified. Consider the standard movie leap (see The Day After Tomorrow) where our hero starts to run and then leaps over a chasm, narrowly avoiding falling to his certain death. Yeah, most of the time the leaps are long enough to make an Olympic gold medalist envious, but let's ignore that for the moment.

The probability of the leap succeeding changes the risk. If we KNOW the probability of making the leap is 100% then there is nearly no risk. If on the other hand we know there is almost no chance of the hero making the leap then the risk is astronomical.

The movie Executive Decision is an example of the movie maker playing with the audience by presenting the standard, "It would take a super human effort to survive this," with the audience knowing the hero will survive because he does have top billing and it is very early in the film. Then the hero doesn't make it. The hero dies. Oh my, what is this movie about now?

The movie maker has taken our innate sense of risk assessment and told us, "yeah, this is impossible. You and I both know it is impossible but the hero can always do the impossible." In other words the real world probability is near 0% but in the movie world the probability of succeeding has always been 100% so the perceived risk is low. Then the movie maker breaks the rules: "Fooled you! I used real world probabilities. Now you can't assume that movie probabilities are in effect for this movie."

So rather than the three probabilities we've looked at so far, nearly 100%, 50% and nearly 0%, we can have probabilities anywhere in between. As the probabilities change the risk also changes.

You might be willing to "bet on the event" if the probability is "fair" and the consequences are "acceptable". As the probability of winning drops, you might decide the odds are against you too much and refuse to bet.

If you know that you are going to lose 100% of the time then it is highly unlikely you will place the bet. On the other hand people will bet on losing odds every single day of the week, knowing the odds are against them.

Would you bet $2 when there is a 1.8% chance of winning just $4? Those are pretty bad odds. They say that for every $110 you bet you will win $4. But you could do better: you could win $7, but the probability is 0.14%, so you'd have to spend $1412 to make $7. Note that, math-wise, there is a point before you get to the $1412 where you are likely to win something, but it is still fairly high and much higher than $7.

But we can really entice you! If you are willing to bet $2 with a probability of 0.013%, I'll pay you $200. Does that sound like good odds? Most people can look at that and see there is a very low chance of winning the $200 before you've spent well over $200.

And here is the kicker, I'll give you a 0.00002% chance of winning a million dollars or more! And all you have to do is give me your $2. Just remember, you can't win if you don't play.

The preceding odds are for "Power Ball Lotto".
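A rough way to check figures like these is to divide the stake by the probability of winning, which gives the average amount staked per win. The sketch below uses the rounded percentages quoted above. The function name is made up for illustration, and the results land close to, but not exactly on, the $110 and $1412 mentioned earlier, presumably because those were worked out from unrounded odds.

```python
STAKE = 2.00  # dollars per play, from the text

# (prize in dollars, probability of winning it), as quoted above.
prizes = [(4, 0.018), (7, 0.0014), (200, 0.00013)]


def expected_spend_per_win(stake, probability):
    """Average amount staked per win: stake times the expected number of plays (1/p)."""
    return stake / probability


for prize, p in prizes:
    spend = expected_spend_per_win(STAKE, p)
    print(f"${prize} prize: about ${spend:,.0f} staked per win")
# $4 prize: about $111 staked per win
# $7 prize: about $1,429 staked per win
# $200 prize: about $15,385 staked per win
```

In every case the average amount staked per win is far larger than the prize, which is the point of the example.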

In general, in a "game of luck" the "house" attempts to hide the fact that the game is rigged so over time the house will take in more money than they pay out. The game is not "fair". The fact that it is not fair is hidden in the payouts. As an example US roulette double zero wheels have a house advantage of 5.26%. Put another way for every $100 bet at roulette the house takes $5.26.

The player doesn't see that 5.26% house edge; instead they see a payout of $35 for a dollar bet. If they only bet a single number a hundred times they'll win big! Unfortunately the house will still take $5.26 of that hundred. The probability of winning is 2.63% and the payout is 35 to 1, so the expected winnings from 100 one-dollar bets are 100*35*0.0263=$92.05. The reason this is not exactly $94.74 is that when you win you also get your original bet back, which adds about $2.63 and, with the unrounded probability of 1/38, brings the total to $94.74.
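The 5.26% figure can be reproduced from the layout of the wheel itself. An American double zero wheel has 38 pockets (1 through 36, plus 0 and 00), and a single number bet pays 35 to 1. The sketch below uses exact fractions to avoid rounding; the variable names are just for illustration.

```python
from fractions import Fraction

POCKETS = 38  # American double zero wheel: 1-36 plus 0 and 00
PAYOUT = 35   # a single number bet pays 35 to 1

p_win = Fraction(1, POCKETS)

# Expected net result of a $1 single number bet:
# win $35 with probability 1/38, lose $1 with probability 37/38.
expected_net = p_win * PAYOUT - (1 - p_win) * 1
house_edge = -expected_net

# Expected amount handed back (winnings plus returned stakes) from 100 one-dollar bets.
expected_return_per_100 = 100 * p_win * (PAYOUT + 1)

print(f"house edge: {float(house_edge):.2%}")                             # 5.26%
print(f"returned per $100 bet: ${float(expected_return_per_100):.2f}")   # $94.74
```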

I once read: Lotteries are taxes on people that don't understand math. What they are actually saying is that people do the risk assessment and are willing to lose $2 or $5 or whatever it is they bet per week.

As the probability changes so does the risk. Often times the actual risk is hidden by the feeling of potential profit. Risk assessment requires looking at both the probability and the consequences. Evaluating either in isolation can lead to serious errors in assessment.

Cascading Consequences

The problem with the simple risk assessment examples given is they do not take into account how a consequence could cascade.

Consider a person placing a $100 bet in order to win $1000. The odds are not good but if he wins then he can pay off his credit card. He has the $100 in his pocket. The consequence is he will lose the $100.

The cascading consequence could be he no longer has enough money to make his mortgage payment. Now instead of being behind on his credit card he is also going to be behind on his mortgage. This happens all the time in casinos. As a matter of fact casinos often are designed to make it easy for a gambler to bet more than they can really acceptably lose.

Cascading consequences occur when the first consequence causes some other consequence, which in turn causes more consequences.

Consider the power outage scenario. Losing power in our household is no big deal. Wood heat and a propane grill give us heat, food and water. But there are bad things that go along with extended loss of power. The largest is that our freezer will start to warm and we might lose some food.

On the other hand, if you are on an O2 generator and the power fails, there is no more oxygen being generated. You need to get the power on or have a secondary source of oxygen.

We do have long power outages where I live. I had a telecommute job. The job required me to be at my desk at 0800 and be available through 1700 to take customer calls and to provide additional technical support to my team. For most people a power outage is not a huge thing. For me it could mean my job.

My risk assessment said that being without internet access, which included my phone service, was an unacceptable risk. So we ended up rewiring parts of the house and installing a T1 (a type of high speed internet connection). This meant that if there was a power failure in the area we could bring up the computers and the internet with a whole house generator.

But there was another thing that happened. Installing a T1 connection means that the phone company treats a "down connection" exactly the same as if an entire town lost phone service. This means that while my Comcast neighbors were still waiting for service to come back up we'd been up and running for two days. Mean time to repair was about 2 hours.

Of course there was a cost for all of this. We had to invest the time to rewire the house and we had to pay a premium price for our T1 connection.

In this case the cascading consequence was that I might lose my job if we lost power. And that would be because we lost computers and internet.

Why Are People Concerned About Major Catastrophes?

Or: Why are people worried about what to do after the end of the world as we know it? Simple, because their risk assessment says that it is OK and reasonable to plan for it.

When doing a risk assessment we often find actions which are not bad, which reduce the likelihood of the event or which would mitigate the event if it takes place. By choosing to do the action we have an investment in resources (time and money) but it causes no harm and might prevent a bad consequence.

My daughter shows signs of dyslexia. My wife is a reading specialist. She thinks there is a 90% probability that my daughter has dyslexia. But we do not have a diagnosis of dyslexia.

Because of her training my wife does not want to flatly say my daughter is dyslexic. So we did the risk assessment.

Daughter might have dyslexia (90%)

| Action | Event is True | Event is False |
|---|---|---|
| Do nothing | Daughter is slow to read, has frustration reading, will have spelling issues. | Nothing |
| Teach her as if she has dyslexia | Daughter is given the tools to read, is not significantly slowed, does not become frustrated, will have coping methods in place for spelling. | Daughter will learn new reading tools, will likely read better and faster, will spell better |

When we look at the risk assessment we have one negative consequence which is doing nothing and she is dyslexic. We have one neutral consequence and two positive consequences. Both positive consequences are the result of taking action. Therefore since there is no downside to taking the action we proceed by taking the action.

It does not matter if she does or does not have dyslexia because in either case taking action will cause no harm and will help regardless.
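The reasoning here (one action is never worse and sometimes better, whichever way the event turns out) can be written as a small check. This is a sketch only: encoding negative, neutral and positive as -1, 0 and +1, and the `dominates` function itself, are assumptions made for illustration.

```python
# Ratings follow the text: -1 negative, 0 neutral, +1 positive.
# The numeric encoding is an assumption made for illustration.
ratings = {
    "do nothing":           {"dyslexic": -1, "not dyslexic": 0},
    "teach as if dyslexic": {"dyslexic": +1, "not dyslexic": +1},
}


def dominates(a, b, table):
    """True if action a is at least as good as b in every state and better in at least one."""
    states = table[a].keys()
    at_least_as_good = all(table[a][s] >= table[b][s] for s in states)
    strictly_better = any(table[a][s] > table[b][s] for s in states)
    return at_least_as_good and strictly_better


print(dominates("teach as if dyslexic", "do nothing", ratings))  # True
```

When one action wins in every column like this, the 90% figure never enters the decision.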

At the end of October 2012 there was a large hurricane which hit the northeastern states of the U.S. For some people it was a big deal. For others it was "no big deal." Why the difference?

My adult daughter in Maryland was asking me as the storm was coming ashore if she should stay where she was or evacuate. She was not prepared for either option.

For us, we stopped by the local store to fill up the spare propane tank and get an extra couple of gallons of fuel for the generator. My lady stopped at the store on her way home to pick up eggs.

It was no big deal. We have been preparing for a bad situation. As such this situation was just a minor test of the plans.

A friend of mine has lost her job a couple of times. One time it was for over six months. She was getting some unemployment but not enough to pay all her bills. She paid her bills with that money but they lived on the food she had put away for an emergency.

My parents gave me a hard time a couple of months ago about thinking and planning for major events. My father's statement was something along the lines of "When we go shopping we always do a check of our pantry first. If we are down to a couple of cans of something then we'll pick it up but it doesn't pay to buy more than you are going to need."

My parents have been preparing all their lives! They are so good at it that they don't even think about it. They have a few months of food at hand. They might not be happy about the choices but they won't go hungry if they get snowed in. Their toilet paper supply won't be exhausted if the snow keeps them home for a week. This is just the way they grew up.

Summary

Risk assessment allows us to make good decisions based on facts and logic, not emotion. These sorts of decisions might hurt people's feelings but are unlikely to get people harmed or killed.

Risk assessment is a process. It begins with identifying an event. Once an event is identified, a set of actions is analyzed by determining the consequences of each action if the event happens and if it does not.

Each consequence or set of consequences for an event-action intersection is given an acceptability level. The event is given a probability of happening. The risk of each intersection is the combination of the probability of the event happening or not happening and the acceptability of the consequence.

Consequences have to be evaluated at both the first level and as a sequence of cascading events. (For lack of a nail the shoe was lost, for lack of a shoe the horse was lost, for lack of...)

At times the correct response to a risk assessment is, "We have to have a better determination of the actual probabilities and conditions in which those probabilities hold true."

It is possible and actually probable that a single action can be used for multiple events. While the worst of those events might never take place, the action you choose to perform in case of that event might very well be the correct action for a multitude of other events whose combined probability is very high.

1 comment:

  1. Thank you Chris, that was very well written. I love the myriad of examples that you included.
