Contingency Planning and Disaster Recovery
Of the more than $40 billion that insurance companies paid out because of the Sept. 11 attacks, more than 25 percent ($11 billion) was for claims related to business interruption. Some industry experts say that among organizations that suffer significant, sustained disasters, 20 percent are completely out of business within 24 months. Yet most companies today do not have contingency plans. Many existing plans are out of date or ignore key human factors. Worse yet, many plans have not been tested.
Contingency plans address the “availability” security principle, which covers threats related to business disruption so that authorized individuals have access to vital systems and information when required.
Contingency planning, also referred to as business continuity planning (BCP), is a coordinated strategy that involves plans, procedures and technical measures to enable the recovery of systems, operations and data after a disruption.
The contingency plan must be developed with the input and support of line managers and all key constituencies, since the plan will need to work across the organization. The plan must be based on the risks faced by the organization, as well as risks associated with partners, suppliers and customers. All technology issues must be addressed in the context of business operations. The plan itself must be tested regularly and refined as required.
The core objectives of contingency planning include the capability to:
- Restore operations at an alternate site.
- Recover operations using alternate equipment.
- Perform some or all of the affected business processes using other means.
Business Impact Analysis (BIA)
One of the critical steps in contingency planning is business impact analysis (BIA). BIA helps to identify and prioritize critical IT systems and components. IT systems may have numerous components, interfaces and processes. BIA enables a complete characterization of system requirements, processes and interdependencies.
As part of the BIA process, information is collected, analyzed and interpreted. The information provides the basis for defining contingency requirements and priorities. The objective is to understand the impact of a threat on the business. The impact of the threat may be economic, operational or both. Questionnaires or survey tools may be used to collect the information.
Organizations may need to sort their sensitive business information into priority categories. One example of such a scheme can be found in the Massachusetts Institute of Technology’s Disaster Recovery and Business Resumption plans.
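The prioritization step of a BIA can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SystemImpact` fields, the 1-to-5 severity scale, the `loss_scale` weighting and the sample figures are hypothetical and do not come from NIST, MIT or any other cited plan. An organization would substitute the measures gathered through its own questionnaires or surveys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemImpact:
    """One surveyed IT system (all fields are hypothetical survey outputs)."""
    name: str
    economic_loss_per_hour: float  # estimated dollars lost per hour of outage
    operational_severity: int      # assumed scale: 1 (minor) to 5 (severe)


def impact_score(system, loss_scale=10_000.0):
    """Combine economic and operational impact into a single number.

    loss_scale is an arbitrary assumption: it treats each $10,000 per hour
    of loss as equal to one point of operational severity.
    """
    return system.economic_loss_per_hour / loss_scale + system.operational_severity


def prioritize(systems):
    """Return the systems ordered from highest to lowest combined impact."""
    return sorted(systems, key=impact_score, reverse=True)


# Made-up survey results, for illustration only.
survey_results = [
    SystemImpact("payroll", economic_loss_per_hour=2_000, operational_severity=2),
    SystemImpact("order-entry", economic_loss_per_hour=50_000, operational_severity=5),
    SystemImpact("email", economic_loss_per_hour=5_000, operational_severity=3),
]

recovery_order = [system.name for system in prioritize(survey_results)]
```

A single additive score is only one possible design. Some organizations may prefer to keep economic and operational impact separate and assign categories by hand, and the sketch does not model the interdependencies between systems that a full BIA would capture.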
Classification of Threats
The National Institute of Standards and Technology (NIST) has identified three classifications of threats: natural (hurricane, tornado), human (operator error, terrorist attacks) and environmental (equipment failure, electric power failure).
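As a rough illustration, the three classes can be recorded in a small threat register so that each identified threat is tagged with its class. The sketch below uses only the example threats named above; the `ThreatClass`, `THREAT_EXAMPLES` and `threats_by_class` names are hypothetical and are not part of any NIST publication.

```python
from enum import Enum


class ThreatClass(Enum):
    """The three threat classes attributed to NIST in the text."""
    NATURAL = "natural"
    HUMAN = "human"
    ENVIRONMENTAL = "environmental"


# Example threats taken from the text, each tagged with its class.
THREAT_EXAMPLES = {
    "hurricane": ThreatClass.NATURAL,
    "tornado": ThreatClass.NATURAL,
    "operator error": ThreatClass.HUMAN,
    "terrorist attack": ThreatClass.HUMAN,
    "equipment failure": ThreatClass.ENVIRONMENTAL,
    "electric power failure": ThreatClass.ENVIRONMENTAL,
}


def threats_by_class(register):
    """Group a threat register by class, keeping the register's order."""
    grouped = {threat_class: [] for threat_class in ThreatClass}
    for threat, threat_class in register.items():
        grouped[threat_class].append(threat)
    return grouped


grouped_threats = threats_by_class(THREAT_EXAMPLES)
```

Grouping the register this way makes it easier to check that the risk assessment has considered at least one threat in every class.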
Systems are vulnerable to a variety of disruptions, ranging from mild, such as a short-term power outage or a disk-drive failure, to severe, such as equipment destruction or fire. Vulnerabilities may be minimized or eliminated through technical, management or operational solutions as part of the organization’s risk management effort. However, it is impossible to eliminate all risks. Contingency planning is designed to mitigate the risk of system and service unavailability by focusing on effective and efficient recovery solutions.
Components of a Contingency Plan
Every business must develop a contingency plan. Responsibility for the contingency planning process typically rests with the contingency planning coordinator. This individual may be the security officer, the CIO or another individual with management responsibilities and experience in this area. The organization should formally identify this person and the team that will develop the contingency plan.
The contingency plan document must specifically address the following critical components:
- Data Backup Plan (Administrative Safeguard): A documented and routinely updated plan to create and maintain retrievable exact copies of information for a specific period of time. Successful data backups and restores sometimes depend on business processes and “batch” activities. The organization needs to carefully test all critical backups and restores on a schedule that reflects how critical the data is to the organization.
- Disaster Recovery Plan (Administrative Safeguard): Provides a blueprint to continue business operations in the event that a catastrophe occurs. The disaster recovery plan must include contingencies for the period during the disaster and until the recovery plan can be completely implemented.
- Emergency Mode Operation Plan (Administrative Safeguard): The part of an overall contingency plan that contains a process enabling an enterprise to continue to operate in the event of fire, vandalism, natural disaster or system failure. Organizations must consider identifying the levels of emergencies and associated responses.
- Testing and Revision Procedure (Administrative Safeguard): Procedures for periodically testing written contingency plans to discover weaknesses and then revising the documentation as necessary. Written testing and feedback mechanisms are the key to successful tests. Tests may take the form of walkthroughs or document reviews, simulation tests, checklist tests or a full interruption test that checks all aspects of the contingency plan.
- Applications and Data Criticality Analysis (Administrative Safeguard): The purpose of applications and data criticality analysis is to assess the relative criticality of specific applications and data in support of other contingency plan components. It is an entity’s formal assessment of the sensitivity, vulnerabilities and security of its programs and the information it receives, manipulates, stores or transmits. This procedure begins with an application and data inventory.
- Contingency Operations (Physical Safeguard): Contingency operations establish (and implement as needed) procedures that allow facility access in support of restoration of lost data under the disaster recovery plan and emergency mode operation plan in the event of an emergency. Physical security is a critical aspect of disaster and business continuity planning. Administrative controls for physical access to enable contingency operations must be in place so recovery can proceed as defined in plans.
- Data Backup and Storage (Physical Safeguard): Continual and consistent backup of data is required, because an organization cannot predict when a disaster will require access to backed-up data. Data may also be lost or corrupted, so a good data backup plan is important. Data backup methods include full, incremental and differential backups. Data backup and storage addresses questions such as: Where will the media be stored? What is the media-labeling scheme? How quickly will data need to be recovered in the event of an emergency? How long will data be retained? What media type is appropriate for backups?
- Emergency Access Procedures (Technical Safeguard): Establish and implement procedures for obtaining necessary sensitive business information during an emergency. Emergency access is a requisite part of access control and will be necessary under emergency conditions, although emergency procedures may be very different from those used in normal operational circumstances. For example, in a situation where normal environmental systems, including electrical power, have been severely damaged or rendered inoperative due to a disaster, procedures should have been established beforehand to provide guidance on possible w