Tips for Disaster Recovery Planning
Auditors need to assess whether their organization has an effective plan in place for recovering from unexpected events that could impede operations.
February 01, 2011
The floods in Queensland, Australia, are a stark reminder of why disaster preparedness is important for businesses. Many businesses affected by this disaster may struggle to survive.
A disaster recovery plan (DRP) can help organizations recover from such incidents. This comprehensive document identifies an organization's critical assets, specifies how much downtime is acceptable depending on the severity of the situation, and establishes which functions or systems should be restored first.
Because organizations change every day, a DRP should be updated regularly. The internal audit activity needs to assess whether the organization has developed a continuous monitoring mechanism to ensure that the plan is up to date and will be effective when implemented.
Free sample plans available on the Web, including examples from the Massachusetts Institute of Technology and Michigan State University, provide templates that organizations can use as a starting point for developing their own plan. The DRP should be customized to reflect the organization's existing environment and its requirements in the event of a disaster.
The following are 10 tips for auditors who are evaluating disaster recovery planning in their organizations.
- Senior management must support the efforts to implement a DRP. Without such support, the DRP may not have the full cooperation of the organization's various departments and personnel.
- A DRP committee comprising the heads of different departments should be in place to ensure appropriate buy-in and effective recovery in case of a disaster. The DRP committee is responsible for the design and implementation of the plan.
- A DRP committee member should be designated as the coordinator to ensure the implementation of the plan. This position is similar to that of a project manager who is responsible for all aspects of a project.
- The organization's critical assets should be identified, along with the acceptable downtime (loss of use) for each asset. This also allows the organization to calculate the cost of recovery. Backup sites are part of the recovery plan and can be classified as hot, warm, or cold. A hot site is almost a duplicate of the organization's existing site, with full computer systems and duplication of data from the original site. However, hot sites are expensive to maintain. A warm site has some, but not all, features of a hot site and is less expensive. It may have some cabling and minimal hardware as well as recent backup tapes. A cold site is the least expensive option but takes the longest time to bring online in a recovery situation. It does not have the equipment or backups in place, but it may have some limited cabling and the right to use space in the designated facility. In addition, facilities, network and communication lines, supplies, and hardware vary in cost and factor into the ease of recovery.
- Backups (e.g., tapes, disks, operator manuals, and installation software) should be factored into the plan. This includes determining whether the organization plans on having off-site or remote storage of the backups. The storage choice also should be considered when determining whether the backup methodology is full, incremental (includes only changes since the last backup), or differential (includes all changes since the last full backup and is a cumulative backup). An incremental backup is the quickest to run and uses the least storage, but takes the longest time to recover after a disaster because the last full backup and every incremental backup taken since then must be restored in sequence. A differential backup grows larger as changes accumulate, but it is easier to recover because only the last full backup and the most recent differential backup are needed. In addition to the backup methodology, organizations that use backup tapes must decide on a tape rotation method such as the GFS rotation scheme, which in its most common form is grandfather (monthly), father (weekly), and son (daily).
- The configuration of the network, Web applications, cloud computing, and software as a service (SaaS) should be considered in the overall planning. Organizations that use cloud computing and SaaS should document these services in the overall DRP.
- The DRP implementation team should identify a location where it will meet after a disaster occurs. List the individuals who are authorized to declare a disaster as well as those personnel who are allowed access to the recovery site. Maintain a list of emergency phone numbers to be used when a disaster is declared. Create a listing of personnel and their duties in implementing the plan. These personnel make up the DRP team.
- The DRP committee should write and update the plan. The process should include constant feedback from the DRP team as well as employees of the organization.
- The plan should be tested. There are different ways to test a DRP, and depending on the organization's industry, it might be difficult to stop everything and perform a full-fledged test. A full test performs a complete recovery, but it brings day-to-day operations to a stop. A partial test recovers only the critical systems or functions covered by the plan. A paper test is a walk-through of the plan document, performed to get participant feedback and to make changes. The plan should be corrected or modified based on the test results.
- DRP team members should have adequate training to implement the plan effectively. The training should include mock drills and unannounced tests of the DRP.
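To make the difference between the backup methodologies in the tips above more concrete, the following Python sketch works out which backups would have to be restored, and in what order, under each approach. It is an illustration only: the `Backup` class, the `restore_chain` function, and the day-numbered schedule are hypothetical names and data invented for this example, and the sketch assumes that a differential backup is always taken relative to the most recent full backup.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int   # day number on a hypothetical schedule
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups: List[Backup]) -> List[Backup]:
    """Return the backups needed, in restore order, to reach the latest state.

    Assumption: a differential holds every change since the last full backup,
    and an incremental holds only the changes since the previous backup.
    """
    ordered = sorted(backups, key=lambda b: b.day)

    # Recovery always starts from the most recent full backup.
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available to restore from")
    last_full = full_positions[-1]
    chain = [ordered[last_full]]
    later = ordered[last_full + 1:]

    # A differential is cumulative, so only the newest one is needed.
    differentials = [b for b in later if b.kind == "differential"]
    resume_after = chain[0].day
    if differentials:
        chain.append(differentials[-1])
        resume_after = differentials[-1].day

    # Every incremental taken after that point must be applied in order.
    chain.extend(
        b for b in later if b.kind == "incremental" and b.day > resume_after
    )
    return chain


if __name__ == "__main__":
    # Hypothetical week: full backup on day 0, then one backup per day.
    incremental_week = [Backup(0, "full")] + [
        Backup(d, "incremental") for d in range(1, 5)
    ]
    differential_week = [Backup(0, "full")] + [
        Backup(d, "differential") for d in range(1, 5)
    ]
    for name, week in [("incremental", incremental_week),
                       ("differential", differential_week)]:
        days = [b.day for b in restore_chain(week)]
        print(f"{name}: restore backups from days {days}")
```

With this invented schedule, the incremental approach needs all five backups to recover, while the differential approach needs only two. An auditor could use the same reasoning when asking whether the chosen methodology fits the acceptable downtime stated in the plan.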
An effective DRP requires good planning, testing, and regular updating. Auditors need to be aware of these concepts to provide assurance to the organization that the plan is well suited to meet its needs.