Write 1-2 pages, single spaced, addressing the following topic: Risk = the probability of a threat x the vulnerability to that threat x the consequences (loss of value of the asset). However, risk is rarely isolated. Comment on the approach in Grenier (2000) and on top-down versus bottom-up risk analysis.

Here is the lecture:

Commentary

What's a Disaster? What's a Crisis?

Terminology is not completely standardized, so this will complicate your reading for this course. Lindell, Prater, and Perry define the terms hazard, emergency, and disaster at the beginning of Chapter 1. FEMA uses similar definitions (e.g., in the Comprehensive Preparedness Guide [CPG] 101), but the definitions are a bit different in ASIS International's 2009 national standard and other publications. In the field of information technology, incidents are actions that violate or threaten to violate security policies, while disasters overwhelm an organization's ability to handle the event and typically require moving essential functions to an alternate location.
A crisis generally poses a strategic threat that requires attention from and involvement of the highest levels of an organization. The impact can be to the organization's people, mission, its financial stability, or its reputation and credibility. Like disasters, crises typically imply single, sudden, large events, but they have secondary meanings that denote a state of affairs -- something that has duration. The onset may also be preceded by smaller events or incidents, by the growth of a condition, or the compounding of inter-related failures. Using an analogy from medicine, think of small strokes preceding a large stroke or the buildup of plaque in arteries leading to a heart attack. These would be termed "slow-onset" crises.

Risks to Assets

The assets of an organization include people, intellectual property (both information/data and processes), facilities, equipment, and other physical components. ASIS identifies these assets at risk: "Human Resources and Intellectual Assets; Information/Data; Facilities and Premises; Ethics and Reputation; Transportation, Distribution, and Supply Chain; Environmental, Health, and Safety; Financial Assets; ... Vendor/Outsourcing" (2008, p. 5). ASIS also identifies legal and regulatory risks to organizations. For a community, assets encompass households, businesses (for-profit and not-for-profit), government agencies, infrastructure, and even special features, such as a waterfront. The assets need to be assigned a value, even if it is a ranking such as High, Moderate, and Low.

The diagram in the attached file presents the concepts that make up risk to assets. Here, too, the terminology can vary. For example:
• Threat, hazard
• Vulnerability, exposure, weakness
• Countermeasure, safeguard, control
• Consequence, impact, harm

One term missing from the diagram is the likelihood of the threat exploiting the vulnerability. And likelihood may be characterized in terms of probability or frequency. The following sections explore threats, vulnerabilities, controls, consequences, and likelihood. For other terms, see the Department of Homeland Security's DHS Risk Lexicon (2010), which defines nearly 125 terms associated with risk management and analysis from a homeland security perspective, from "absolute risk" to "vulnerability assessment."

Threats

Threats are a potential cause of harm to assets; they can be events, physical objects, or humans. Lindell et al. present various types of natural threats; their list could be expanded to include disease (such as pandemic flu) and infestations (such as insects, rodents, etc.). They also list technological threats, emphasizing hazardous materials. Other kinds of technological threats include structural and infrastructure failures; examples include bridges or dams, machinery, utilities, and hardware and software. The more complex a system, the more potential there is for problems, and Chiles' book Inviting disaster: Lessons from the edge of technology (2002) provides numerous examples. Many technological threats are exacerbated by human factors such as management and operational problems; the BP oil spill in the Gulf of Mexico in 2010 is a significant example. Perrow provides a more nuanced analysis, covering not only the human factors (as in the 2009 Metrorail disaster), but also distinguishing between "integrated" and "modular" designs of complex systems (2009).

Human threats are another major category. The actions of human threat agents may be intentional or deliberate, accidental, or negligent.
Intentional acts may be criminal (sabotage, espionage, terrorism, product tampering or counterfeiting, fraud) or not (e.g., boycotting the products or blockading the location of a business; hostile takeovers by another company). Accidental threats may come from errors or omissions. In a business context, accidental threats can occur because of poor procedures (they are incorrect, hard to understand, or non-existent), poor communication or bad information, poor training, or other reasons. Negligence implies that a certain standard of care or due diligence is expected. Negligence is, for example, a failure to comply with existing regulations, policies, standards, or procedures. Acts of management or of employees may be negligent, such as overriding a policy, mismanaging resources, or deception. Note that sometimes it is difficult to determine if an action is accidental or negligent. Indeed, it may not be clear at the start of some events -- for example, fires -- whether humans or other factors cause them.

The study of human threats also looks at whether the agent is internal to an organization (such as employees or citizens) or external. In business, the distinction between internal and external is not always exact, as there can be a continuum from inside to outside. All sorts of people may be allowed access to an organization, such as customers, contractors and vendors, third-party partners, or visitors. Ex-employees may also have access. Insider threats may be harder to detect, and insiders may cause more significant damage. Think, for example, of FBI agent Robert Hanssen, who was selling information to the Russians over the course of two decades. Insider threats and incidents for the banking and financial industry were the subject of a 2004 study conducted by the U.S. Secret Service's National Threat Assessment Center and CERT/CC. The insiders typically had "minimal technical skills" but had authorization to use the systems. However, they misused or exceeded their authorization. The National Infrastructure Advisory Council's 2008 report describes behavioral characteristics of insiders who may pose a threat, explains why organizations fail to deal with such persons, and makes recommendations.

Speed of onset is one characteristic of a threat or hazard. Other characteristics are the scope and the duration of the impact; how predictable the event is; what the primary impacts to assets are, and whether there are secondary impacts; and the state of preparation to respond and recover. Lindell et al. present a similar list in Chapter 6, while CPG 101 lists other factors (see pp. 4-8 through 4-9). These lists were designed for physical threats, but could be adapted for cyber threats or organizational crises such as fraud, sexual harassment, or other forms of malfeasance.

Vulnerabilities

Vulnerabilities allow potential threats to cause actual events. In risk management terms, threats exploit vulnerabilities, which are weaknesses or susceptibilities to damage or disruption. Vulnerabilities may be understood as controls that are absent, weak, or flawed. (Controls are discussed in the next section.) For example:
• No fence = absent control
• Short fence = weak control
• Hole in fence or fence poorly constructed = flawed control

Some kinds of vulnerabilities are considered conditions, such as high population density for a community, the location of an organization, or the type of industry a business is in (e.g., airlines, shipping, etc.).
Sometimes the distinction between threat and vulnerability becomes fuzzy around the concept of condition or circumstances. My preference is to treat circumstances and conditions as vulnerabilities that a threat agent can exploit. Lindell et al. provide an in-depth discussion of vulnerabilities in Chapter 6, including what they call "vulnerability dynamics."

Vulnerabilities are categorized in different ways. Social vulnerabilities revolve around how well individuals, households, or other social units can deal with disasters. Physical vulnerabilities arise from design, the materials used, and implementation or construction, as well as factors such as neglected or improper maintenance, all of which are influenced by costs. The same concerns apply to software in computers and other electronic devices. Geographic vulnerabilities play a role in natural disasters such as earthquakes, tornadoes, and hurricanes and, for business operations, in exposure to political conditions or currency markets. Complexity and interdependencies can obscure vulnerabilities, and they can complicate efforts to respond effectively when an incident occurs, as during the Three Mile Island nuclear reactor incident in 1979. Malfunctioning valves led to overheating of reactor coolant, causing a relief valve to open; the relief valve then did not close when it should have, and coolant drained out of the reactor. Operators, however, mistakenly believed that there was too much coolant and drained out more, leading to a meltdown of half of the reactor core (Chiles, 2002, p. 47). Perrow (2009) cites examples of airplane computers involved in accidents and near-accidents.

Controls to Mitigate (Reduce) Risk and Other Strategies

As mentioned above, one definition of vulnerability is a control that is absent, weak, or flawed. The term control includes safeguards (proactive controls) and countermeasures (reactive controls). Controls mitigate risks by reducing a threat's impacts or its likelihood, or both. Only rarely do they completely eliminate threats; that is a message that must be stated clearly (and often) to management.

One way to classify controls is whether they help to prevent (or protect or deter), detect, or respond to and recover from adverse events. Lindell et al. use the term hazard mitigation to refer to prevention controls. The levee system in New Orleans is a protection control, as are, for example, regulations about insider trading. Detection may take the form of monitoring. Examples include sensors for natural phenomena (earthquakes) or for technology (pressure, temperature, traces of harmful materials, etc.). On the enterprise side, examples of monitoring might be collecting and analyzing reports of product failures or customer complaints, tracking adherence to policies, and so on. Other forms of detection include testing and quality control as well as auditing. The third area is controls for response or recovery; examples include stockpiles of vaccines or medicines, enough trained emergency personnel, and sufficient insurance or reserve funds to carry out recovery. The list is long and varied.

Consequences (Impacts)

Consequences of an adverse event are also referred to as its impacts. Some consequences are primary while others are secondary. For example, storms or flooding may have secondary consequences such as disruption of utilities or transportation. Human impacts vary, depending on the type of crisis or disaster.
Most people think first of death, injury, and illness when they discuss human impacts, but there may also be psychological and social impacts. Many kinds of disasters create physical damage and destruction, with or without environmental consequences. The consequences can disrupt critical infrastructures such as energy, water, communications, or transportation. A disruption can impact an organization's processes, the outputs of the processes, or the resources used to create the outputs, such as people, financial condition, facilities, equipment, materials, and supplies.

A disaster or crisis may slow or even disrupt cash flow to an organization (accounts receivable), or tourists may stay away, hurting the local economy. Responding to a crisis can also drain an organization's or local government's resources; there may be unplanned expenses to restore and rebuild, for example. A disaster or crisis may also result in increased insurance rates for an organization, or even an inability to obtain insurance. Primary threats to organizations' financial assets include fraud, theft, or extortion, as well as negligence such as making extremely risky investments.

Other non-physical consequences may be political or legal. A crisis may draw the attention of politicians, as happened to the Department of Veterans Affairs in summer 2006 when a stolen laptop contained personally identifiable information of 26 million veterans. Regulations may be changed or added in response to a crisis. There can be legal reactions to crises, such as government investigation and prosecution; for example, Hewlett-Packard was investigated by the California attorney general for spying on members of its board of directors and on reporters to uncover leaks. Another legal reaction is the filing of lawsuits. Defending against lawsuits incurs legal costs, and the loss of a lawsuit may require a substantial payout to the plaintiffs, as in tobacco liability cases.

With or without political or legal consequences, an organization (or even an industry segment) can suffer harm to its reputation. Loss of reputation sometimes occurs because the organization is perceived as unprepared, but more often the image problems arise from the organization's poor response or mishandling of the response. When reputation is damaged, there is a loss of confidence in the organization, both externally (customers, suppliers, shareholders, the general public, or -- in the case of charities -- donors) and internally (employees). For example, customers abandoned ValuJet after the 1996 fatal crash caused by negligence. The Department of Veterans Affairs has been closely scrutinized by congressional committees since the 2006 data loss and is the subject of a specific law (P.L. 109-461).

Likelihood, Probability, and Frequency

Likelihood is the extent to which an event is apt to occur. Probability is the statistical expression of likelihood. It may be expressed on a scale from 0 (impossible) to 1 (certain) or in terms of percentages (a 75% chance). Events that are low in probability or have not been experienced are difficult to conceptualize, even though the impact may be disastrous. For example, the Iowa floods of 2008 may have had a probability of just 0.2% (two-tenths of one percent). Therefore, it is common to recast probability in terms of frequency -- a 500-year flood, in other words.
"The frequency format implies that one is being asked about things that have happened -- which may justify an inference that they will happen about as often in the future -- rather than about things that haven't happened yet though they may in the future" (Posner, 2004, p. 10). Still, low probability or frequency can lead an organization to under-invest in prevention and preparation. Thus, a trick is to change the reference timeframe. For example, 1% or once in 100 years can be rephrased as 20% likelihood in the next 20 years. Low-probability events with high impacts should be the focus of crisis management planning (Sheffi, 2005, p. 251). You need historical data to calculate probability and frequency. Studies of accidents in large organizations or across an industry can provide rich data not only on the number of accidents but also the “near misses.” A pioneer in this field was H.W. Heinrich, who began publishing in the 1930s. Heinrich's model is known as "Heinrich's Triangle." While I have seen slight variations in the numbers used, here's an approximation of the complete Heinrich's Triangle: 1 major accident 30 (or 29) serious accidents 300 recordable incidents with no injury 3000 near misses 30,000 unsafe actions The triangle disregards the causes of the accidents, beyond the general point made by Heinrich that behavior contributes to the accidents (1959). However, historical data is not always available, particularly for certain types of events. Sheffi notes that it is difficult to calculate the statistical probability of deliberate human threats: "Intentional disruptions constitute adaptable threats in which the perpetrators seek both to ensure the success of the attack and to maximize the damage" (2005, p. 50). To model intentional threats, one must factor in the capabilities, intents, and underlying motives of the perpetrators (Ezell, Bennett, von Winterfeldt, Sokolowski, & Collins, 2010, p. 577). Likelihood needs to be periodically re-examined, because factors can either deteriorate or ameliorate. For example, building more homes along beachfronts or switching to Voice over Internet Protocol (VoIP) lead to increased risks, while implementing additional controls will decrease risks (Jrad, Morawski & Spergel, 2004). In addition, new threats may arise, such as "spear phishing" attacks. Risk Assessment / Risk Analysis Risk assessment or risk analysis encompasses identifying and evaluating risks and their degree of impact. The terms “assessment” and “analysis” are often used interchangeably. ASIS (2009, p. 49) implies that “assessment” is a broad process that starts with identifying risks, then analyzes them, and results in an evaluation, while “analysis” is more detailed, with the objective of determining the degree of risk. Risk analysis is a core competency according to the National Infrastructure Protection Plan (NIPP), comprising "knowledge and skills" to conduct quality risk analyses; that is, a risk analysis needs to be "accurate, documented, objective, defensible, transparent, and complete" (DHS, 2009, p. 84). Appendix 3A of NIPP expands on these criteria and offers guidance on assessing threats, vulnerabilities, and impacts (p. 148). Risk Assessment Techniques Risk assessment is based on inputs such as these: Assets • Identify • Determine value • Prioritize Threats • Identify • Estimate likelihood Vulnerabilities • Identify Impacts • Estimate consequences (life and health; economic; mission; reputation; etc.) 
Risk Assessment / Risk Analysis

Risk assessment or risk analysis encompasses identifying and evaluating risks and their degree of impact. The terms "assessment" and "analysis" are often used interchangeably. ASIS (2009, p. 49) implies that "assessment" is a broad process that starts with identifying risks, then analyzes them, and results in an evaluation, while "analysis" is more detailed, with the objective of determining the degree of risk. Risk analysis is a core competency according to the National Infrastructure Protection Plan (NIPP), comprising the "knowledge and skills" to conduct quality risk analyses; that is, a risk analysis needs to be "accurate, documented, objective, defensible, transparent, and complete" (DHS, 2009, p. 84). Appendix 3A of the NIPP expands on these criteria and offers guidance on assessing threats, vulnerabilities, and impacts (p. 148).

Risk Assessment Techniques

Risk assessment is based on inputs such as these:

Assets
• Identify
• Determine value
• Prioritize

Threats
• Identify
• Estimate likelihood

Vulnerabilities
• Identify

Impacts
• Estimate consequences (life and health; economic; mission; reputation; etc.)

Some techniques also require building scenarios for each threat or adverse event. The methods for scenario development vary from one discipline to another, but generally involve asking "What if...?"

NFPA 1600 presents several techniques for assessing risks, including these three (2010, p. 17):
• FMEA (failure modes and effects analysis) begins with a vulnerability and the conditions that could make it actual; by implication, this uncovers what controls are missing. Through inductive reasoning, the analysts explore the possible impacts or effects.
• FTA (fault tree analysis) is a deductive technique that starts from the problem and seeks out the causes. It is a search for vulnerabilities that might be mitigated through controls.
• HAZOP (hazard and operability studies) has its origins in process control industries such as chemicals. Its purpose is to review processes or systems to determine whether sufficient and adequate controls are in place to prevent failures or accidents.

Quantitative versus Qualitative Analysis

Analysis methods may be qualitative or quantitative. The methodology used in your organization may dictate how mathematical an analysis is required to be. A Chief Financial Officer is apt to want an analysis supported by numbers. A qualitative analysis, by contrast, may use categories such as High, Medium, and Low. Some models use a 5-factor scale, adding Very High and Very Low. Bottom line: If your organization requires a certain methodology or model, use that. If it doesn't employ a standard methodology or model, then the following examples may help you in recommending one.

Quantitative Analysis

Different organizations use different formulas to compute risk quantitatively. A common formula is:

Risk = Threat x Vulnerability x Consequences (or Asset Value)

This formula only implies likelihood; it could be used to provide a worst-case scenario, where it is assumed that the threat will exploit the vulnerability, completely destroying the asset. Grenier (2000) presented a more nuanced version of this formula for critical infrastructure protection:

R = f{ P(T x V) x I x s x t x g }

where P(T x V) is the probability that threat T exploits vulnerability V, I is the impact, and s, t, and g adjust for seasonal, temporal, and geographic variability. Grenier's definition of risk (the R in the formula) is: "The probability that a particular threat will exploit a particular infrastructure's vulnerability(ies) weighted by the impact(s) of that exploitation. The Risk function can be further modified to account for seasonal, temporal and geographic variabilities. The time variables could allow for: time from incident until impacts are felt, duration of impacts, etc." (Slide #3).

Frequency is an alternative to using probabilities. Frequency for risk analysis is typically expressed on an annual basis; the formal term is annualized rate of occurrence, or ARO. Some quantitative analyses create a scale of frequency: for example, once every 300 years, every 30 years, every 3 years, annually, quarterly, monthly, weekly, daily, hourly, and more than hourly. For intentional threats, the frequency may represent only an estimate, which can shift based on the countermeasures employed; for example, unarmed guards versus armed guards versus armed guards who draw their weapons when approached.
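As an illustration only, here is a minimal sketch of the worst-case formula above. The 0-to-1 ratings for threat and vulnerability and the dollar asset value are hypothetical figures of my own, and the helper simply converts a "once every N years" frequency statement into an annualized rate of occurrence.

```python
# Hypothetical ratings for illustration; a real analysis would derive these
# from historical data, expert judgment, or the organization's methodology.
threat = 0.7           # rating that the threat occurs (0 to 1)
vulnerability = 0.5    # degree of exposure to that threat (0 to 1)
asset_value = 250_000  # consequence expressed as the asset's value, in dollars

# Worst-case style formula from the lecture: Risk = Threat x Vulnerability x Consequences
risk = threat * vulnerability * asset_value
print(f"Risk score: ${risk:,.0f}")  # $87,500

def aro(once_every_years: float) -> float:
    """Annualized rate of occurrence for an event expected once every N years."""
    return 1.0 / once_every_years

print(aro(30))      # ~0.033 (once every 30 years)
print(aro(1 / 12))  # 12.0 (monthly)
```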
For consequences, formal quantitative risk analysis often uses the term single loss exposure (SLE) instead of consequence or impact. The SLE is calculated by multiplying the value of the asset by the exposure factor, which is the percentage of the asset's value that is at risk (that is, how bad the loss could be, based on potential threats and vulnerabilities). Asset value, in a restricted sense, is the replacement value, not the depreciated value. A broader view of asset value is the total cost of the incident, including replacement or recovery of the asset as well as losses of sales or of customers; productivity of employees; penalties and legal costs; loss of intellectual property or secrets; and decline in goodwill or reputation of the organization.

Then, annualized loss exposure (ALE) is calculated by multiplying a single loss exposure by the annualized rate of occurrence, or:

ALE = SLE x ARO

Schneier presents a few examples (2004, pp. 301-302). One is for corporate network intrusions that infect end devices, with 3 events per day translating to about 1,000 events on an annual basis. The impact is the cost of cleaning up the infected systems.
• Network intrusions per year: 1,000
• Cost to clean up: $10,000
• ALE: $10,000,000

Schneier's second example is an intrusion by a competitor intending to steal a major corporate secret, such as new design plans. Here, he switches to probability:
• Probability of intrusion: 0.001
• Impact of information theft: $10,000,000
• ALE: $10,000

The bottom line: Can you justify the cost of the controls? By reducing the exposure factor, is the new ALE acceptable?
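The SLE/ALE arithmetic is easy to sketch in a few lines. The following reproduces Schneier's two examples above; the function names and the exposure-factor figures in the last example are my own illustrative choices, not anything from the lecture or Schneier.

```python
def single_loss_exposure(asset_value: float, exposure_factor: float) -> float:
    """SLE = asset value x exposure factor (the fraction of value at risk)."""
    return asset_value * exposure_factor

def annualized_loss_exposure(sle: float, aro: float) -> float:
    """ALE = SLE x ARO (annualized rate of occurrence)."""
    return sle * aro

# Schneier's first example: ~1,000 network intrusions per year,
# each costing $10,000 to clean up.
print(annualized_loss_exposure(sle=10_000, aro=1_000))      # -> 10,000,000

# Schneier's second example: theft of a major corporate secret,
# probability 0.001 per year, impact $10,000,000.
print(annualized_loss_exposure(sle=10_000_000, aro=0.001))  # -> 10,000

# With an exposure factor (hypothetical figures): a $500,000 facility where a
# given threat could destroy 40% of its value, expected about once in 30 years.
sle = single_loss_exposure(asset_value=500_000, exposure_factor=0.40)
print(annualized_loss_exposure(sle, aro=1 / 30))            # -> ~6,667 per year
```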
Qualitative Analysis

Qualitative models typically compare a threat's likelihood to its impact, assigning High, Medium, or Low values to both the likelihood and the impact. These models often associate statistical values with the likelihood, with Low in the range of 10% and High at 100% -- a certainty, in other words. The following matrix is adapted from Nyanchama (2005, p. 40), with impact as the rows and threat level as the columns (High / Medium / Low). He also has a similar chart depicting impact against the ease of exploiting vulnerabilities (Nyanchama, 2005, p. 52).
• High impact: H / H / M
• Medium impact: H / M / L
• Low impact: M / L / L

Wrobel's qualitative approach uses a large number of parameters, including three different kinds of impact, to classify risks to information systems. The following chart is an adaptation from Wrobel (1997); each evaluation factor is rated High, Medium, Low, or Low/No:
• Impact, systems affected: >75% / 50-75% / 25-50% / <25%
• Impact, functional: entire mission / ~50% of mission / ~25% of mission / not mission critical
• Impact, users: no access, >200 users / sporadic access, >50 users / noticeable, >15 users / brief, <15 users
• Probability: >75% / 50-75% / 25-50% / <25%
• Cost: >$100K / $50-100K / $25-50K / $0-25K
• Controls: none / difficult to implement / at hand / automatic, redundant
• Time to recover: >2 hours / 1-2 hours / 10-60 min / <10 min
• Staff to respond: not available / available soon / on site / on site

Risk Is Not Isolated

A single threat or vulnerability may produce impacts that overwhelm all the controls in place to contain it. This is a common mode failure, and there are mathematical models to analyze and calculate these. However, calculating the impact of a single threat in isolation is not necessarily reasonable. For example, an earthquake may have secondary impacts such as power outages and fires. Common cause failure analysis is a technique to evaluate the "effect of inter-system and inter-component dependencies, which tend to cause simultaneous failures and thus significant increases in overall risk" ("Probabilistic risk assessment," n.d.).

More concretely, consider a "system of systems" where individual components are highly interdependent. Assume that there are 4 key components, and each has only a 5% risk factor; that is, each is 95% risk-free. If you then compound the risk for the entire system, the safety factor drops precipitously to about 81%:

0.95 x 0.95 x 0.95 x 0.95 = 0.8145

Another way of looking at this is that the probability of at least one component failing -- and thus of a complex event -- is nearly 20%.

The National Infrastructure Protection Plan (NIPP) recommends looking across multiple sectors when analyzing risks (DHS, 2009, pp. 27-28). The analysis should encompass "dependencies, interdependencies, and cascading effects; identification of common vulnerabilities; ... common threat scenarios" (p. 27). Furthermore, across sectors, risks should be compared and prioritized, and controls should be shared (p. 27). DHS, however, recommends different approaches for different sectors:

For those sectors primarily dependent on fixed assets and physical facilities, a bottom-up, asset-by-asset approach may be most appropriate. For sectors such as Communications, Information Technology, and Agriculture and Food, with accessible and distributed systems, a top-down, business or mission continuity approach, or risk assessments that focus on network and system interdependencies may be more effective. (DHS, 2009, p. 28)
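Finally, returning to the compounding arithmetic in the "Risk Is Not Isolated" discussion above, here is a minimal sketch of my own: when every component must work, the system's safety factor is the product of the components' reliabilities, so even modest per-component risks erode it quickly.

```python
from math import prod

def system_reliability(component_reliabilities: list[float]) -> float:
    """Overall reliability when every component must work (serial dependency)."""
    return prod(component_reliabilities)

# Four interdependent components, each 95% risk-free (5% risk), as in the lecture:
overall = system_reliability([0.95, 0.95, 0.95, 0.95])
print(f"System safety factor: {overall:.4f}")                     # 0.8145
print(f"Probability of at least one failure: {1 - overall:.1%}")  # ~18.5%
```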