I’ve spoken to many business continuity practitioners, and most of them (both beginners and experts) tell me that the most difficult part of ISO 22301 implementation is the business impact analysis.
So, here are my tips on how to implement the BIA (this article is an excerpt from my upcoming book Becoming Resilient: The Definitive Guide to ISO 22301 Implementation):
Business impact analysis is primarily here to give you an idea about the timing of your recovery (Maximum Acceptable Outage/Recovery Time Objective) and the timing of your backup (Recovery Point Objective/Maximum Data Loss), since the timing is crucial – the difference of only a couple of hours could mean life or death for certain companies. For example, if you are a financial institution, recovery time of 4 hours could mean you will probably survive a disruption, whereas recovery time of 12 hours is unacceptable for certain systems/activities in a bank, and disruption of a full day would probably mean the bank would never be able to open its doors again. And there is no magic standard that would give you the timing for your organization – not only because the timing for every industry is different, but also because the timing for each of your activities could be different. Therefore, you need to perform the business impact analysis to draw the correct conclusions.
Since this step in the ISO 22301 project is time-consuming and complex, you can decide whether it will be performed by your own Business continuity coordinator, or by a hired expert (e.g. a consultant) – for the sake of simplicity, I will mention only the Business continuity coordinator in this article. In any case, this person has to develop the BIA Questionnaires for collecting the information (or configure the tool, if one is used), organize interviews or workshops, compile all the data and produce the report (or include the results in the Strategy if no separate report is produced).
If you only send the methodology and BIA Questionnaires to the responsible persons in each activity and tell them to fill them in, the results you get will probably be unusable. The reason is that people find it very difficult to understand what business impact analysis is all about, even if you have written your methodology well.
Therefore, if you want your BIA to succeed, you basically have two options:
a) Perform business impact analysis through interviews – this means that the Business continuity coordinator will interview the responsible person(s) from each activity, where he will explain the purpose of BIA first, and make sure that every assessment made by the responsible person makes sense and is not biased.
b) Perform workshops with responsible persons first – in such workshops, the Business continuity coordinator explains to all responsible persons the purpose of BIA, and through several real-life examples, shows how to perform the analysis.
Of course, conducting interviews will probably yield better results; however, this option is much more time-consuming for the Business continuity coordinator.
The main input for the business impact analysis process is the BIA Methodology, and you also need a list of your business continuity activities (i.e. processes or departments).
All the information must be given by, and assessments made by, the responsible persons from each activity. While doing that, they must use worst-case scenario criteria: what would happen in a huge storm, not some average storm; a breakdown of your whole IT infrastructure, not just some insignificant server; loss of data from your main server, not from one laptop only; your CEO and main system administrator going missing, not only some lower-level employees; and all of this happening when you have a short deadline to deliver an important product to your most important customer.
If your respondents tell you “This is never going to happen to us!” – just tell them to read a couple of news stories from the crime section. Besides, business continuity is here to prepare you for bad times, not for good times.
Here are a few tips for collecting the required information from the responsible persons from each activity:
- Impact assessment – they have to consider the business damage that will happen if their operations are halted, in light of the particular questions that are asked. For example, take the question “How will your clients react to a disruption?”, assessed on a scale of 1 to 4. If a disruption of 2 to 4 hours would cause no client reaction whatsoever, you should receive an assessment of (1); if clients would start calling you, but nothing significant would happen in that time frame, an assessment of (2); if after an 8-hour disruption some clients would start leaving your company, an assessment of (3); and if after 48 hours the majority of clients would leave your company, an assessment of (4). See also figure 1 for an example.
- Assessment of RPO/Maximum Data Loss – you have to ask your respondents to list all their databases, applications and files, but also all services (e.g. email), etc., and to state for each of them separately how much data they can afford to lose. Usually, this limit is expressed in hours, but sometimes it can also be a number of transactions or records. The main criterion while doing the analysis must be the damage any potential data loss would cause to the company – in terms of money or other impacts like legal, reputation, etc. Also, while doing such an analysis it is important not to be distracted by the fact that you already have a backup; the question is – if your existing backup fails, how much data can you really afford to lose? See also figure 2.
- Minimum Business Continuity Objectives (MBCO) – you should specify the minimum acceptable level of capacity required immediately after the recovery for a particular activity, taking your peak hours or days into account. For example, December is typically the busiest month in banks for most activities, so you should specify the minimum number of transactions or customers you would have to process if a disruption occurred on the busiest day of December.
- Required resources – taking into account the MBCO (number of transactions, customers, products, etc.), you should identify how many people and other resources you need for the recovery. Resources like laptops, furniture, mobile phones, offices, etc. usually depend on the number of people; the capacity of resources like software and telecom links depends on the number of users or the number of transactions that need to be processed; data as a resource needs to be described in terms of how many and which records you need – for example, all the records created in the past six months (for, e.g., a database), or only the current documents (for, e.g., contracts that are signed with partners and clients); external services are described in terms of transactions, products or whatever it is they provide to you; financial resources are expressed, well, in money (in your local currency or the currency your company normally uses).
- Dependency on others – basically, these are all other activities without which you wouldn’t be able to perform a certain activity. These are usually divided like this:
1) Dependency on other activities within your organization – for example, all of your activities will probably depend on the IT department/IT activity, whereas only some of your activities will depend on your legal department/legal activity.
2) Dependency on suppliers and outsourcing partners – typically, all of your activities depend on electricity and telecommunication links (Internet, fixed lines and mobile phones), but many companies also depend on software development companies, hosting providers, cloud providers, accounting services, etc. Here you need to evaluate the business continuity capabilities of those third parties by studying the clauses in agreements you signed with them, inquire as to how they handled disruptions in the past, or perhaps audit them to get a deeper insight into their capabilities.
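The dependency information collected above can be written down as a simple map and then walked to see everything an activity ultimately relies on, including indirect dependencies. The following is only a minimal sketch: the activity and supplier names, the `DEPENDS_ON` map and the `all_dependencies` helper are hypothetical and chosen for illustration, not taken from any BIA methodology.

```python
from collections import deque

# Hypothetical dependency map: each activity lists the internal activities
# and external suppliers it needs directly. Replace with your own BIA data.
DEPENDS_ON = {
    "Customer support": ["IT", "Telecom provider"],
    "Accounting": ["IT", "Legal"],
    "IT": ["Electricity supplier", "Hosting provider"],
    "Legal": [],
}


def all_dependencies(activity, depends_on):
    """Return every direct and indirect dependency of an activity."""
    seen = set()
    queue = deque(depends_on.get(activity, []))
    while queue:
        current = queue.popleft()
        # Skip items already visited so circular dependencies cannot loop.
        if current in seen or current == activity:
            continue
        seen.add(current)
        # Suppliers without their own entry simply have no further links.
        queue.extend(depends_on.get(current, []))
    return seen


if __name__ == "__main__":
    for name in DEPENDS_ON:
        print(name, "->", sorted(all_dependencies(name, DEPENDS_ON)))
```

A list like this makes it easier to see which third parties deserve a closer look: in the hypothetical data, Customer support never names the hosting provider directly, yet depends on it through IT.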
As already mentioned, all the assessments must be done by the responsible persons from each activity – this is because they know their activities the best, so doing the assessment is not the job of the Business continuity coordinator. However, the Business continuity coordinator is crucial for coordinating the whole effort, and for making sure that the criteria for assessing the impact are the same. For example, responsible persons from activities tend to overestimate the importance and the impact of their activities – so you might get an assessment, say from your accounting department, that if their activity is disrupted for two hours it would have a catastrophic impact (4). To counteract such an unreasonable assessment you should ask them the following question: “Do you really think that the company will go bankrupt if your department doesn’t work for two hours?” – after such a question, the assessment usually becomes reasonable.
Where the Business continuity coordinator must be actively involved is in making the decision about MAO and RPO – usually, he makes these decisions together with the responsible persons from activities, based on the results from BIA Questionnaires.
Here is an example of how the responses related to Maximum Acceptable Outage in the BIA Questionnaire for a particular activity might look:
Figure 1: Example of BIA Questionnaire – determining the Maximum Acceptable Outage
The decision about MAO is basically made visually – by looking at this example (and assuming this is a small company with annual revenue of 1 million U.S. dollars and a profit of 150,000 U.S. dollars), higher impacts begin at 8 hours (question #1), whereas it is obvious that multiple high impacts begin at 24 hours. Therefore, as the first step, some consideration should be given to whether clients’ reactions might be tolerated for a disruption longer than 8 hours (question #1) – if so, in the second step, MAO for this activity will be set somewhere between 8 hours and 24 hours. To determine the Recovery Time Objective (RTO) for this activity, the dependencies on other activities will have to be examined, and then the final decision on the RTOs of each activity can be made.
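For readers who keep the questionnaire responses in a spreadsheet or tool, the same visual reading can be approximated in a few lines of code. This is a rough sketch under stated assumptions: the responses below are hypothetical (they are not the values from Figure 1), the threshold of 3 for a "higher" impact is an assumption, and `mao_window` is an illustrative helper name. The result is only a starting window for discussion, not the MAO decision itself.

```python
# Hypothetical responses: impact score (1 to 4) per question for each
# disruption duration in hours. These are NOT the values from Figure 1.
DURATIONS_H = [2, 4, 8, 24, 48]
RESPONSES = {
    "Q1 client reaction": [1, 2, 3, 3, 4],
    "Q2 financial loss": [1, 1, 2, 3, 4],
    "Q3 legal obligations": [1, 1, 1, 3, 3],
}
HIGH = 3  # assumed threshold at which an impact counts as "higher"


def mao_window(durations, responses, high=HIGH):
    """Return (first duration with any high impact,
    first duration with two or more high impacts)."""
    first_any = None
    first_multiple = None
    for i, hours in enumerate(durations):
        high_count = sum(1 for scores in responses.values() if scores[i] >= high)
        if first_any is None and high_count >= 1:
            first_any = hours
        if first_multiple is None and high_count >= 2:
            first_multiple = hours
    return first_any, first_multiple


if __name__ == "__main__":
    lower, upper = mao_window(DURATIONS_H, RESPONSES)
    print(f"Discuss an MAO between {lower} and {upper} hours")
```

With these hypothetical responses the window is 8 to 24 hours, which mirrors the reasoning in the paragraph above; the coordinator and the responsible persons still decide where inside that window the MAO should sit.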
And here’s an example of how the responses to the BIA Questionnaire might look for Maximum Data Loss/RPO:
Figure 2: Example of BIA Questionnaire – determining the Maximum Data Loss/RPO
The decision about Maximum Data Loss/RPO is also made visually – in this example, RPO for Software #1 should be 24 hours, for Software #2 it is 8 hours, for Database XYZ it’s less than 1 hour (probably zero), and for Paper-based document ZXY, about 1 week.
What does this mean in practice? This means that the backup for Software #1 should be done at least every 24 hours, because you can afford to lose a maximum of 24 hours of data. For Software #2, the backup should be made at least every 8 hours, Database XYZ should probably be backed up in real time (e.g. synchronous or asynchronous replication – this is typical for transactional databases in banks), and Paper-based document ZXY should be copied or scanned within a week of its creation at the latest. All these conclusions should be documented in the Business continuity strategy or related Backup policy.
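Once the RPOs are agreed, it can be useful to compare them with how often backups actually run today. The sketch below uses the RPO values from the example above, treating "less than 1 hour (probably zero)" as 0 hours, which is an assumption; the current backup intervals and the `backup_gaps` helper are hypothetical and only there to show the comparison.

```python
# RPO in hours, following the example in the text. Database XYZ is set to 0
# on the assumption that "probably zero" means continuous replication.
RPO_HOURS = {
    "Software #1": 24,
    "Software #2": 8,
    "Database XYZ": 0,
    "Paper-based document ZXY": 7 * 24,
}

# Hypothetical current backup intervals in hours (0 would mean real time).
CURRENT_BACKUP_INTERVAL_HOURS = {
    "Software #1": 24,
    "Software #2": 24,
    "Database XYZ": 1,
    "Paper-based document ZXY": 24,
}


def backup_gaps(rpo, interval):
    """Return the items whose backup interval is longer than their RPO."""
    return sorted(name for name, limit in rpo.items() if interval[name] > limit)


if __name__ == "__main__":
    for name in backup_gaps(RPO_HOURS, CURRENT_BACKUP_INTERVAL_HOURS):
        print(f"{name}: backup every {CURRENT_BACKUP_INTERVAL_HOURS[name]} h, "
              f"but RPO is {RPO_HOURS[name]} h")
```

In this hypothetical setup, Software #2 and Database XYZ would need more frequent backups, while the other two items already meet their RPO.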
Similar to risk assessment, if the organization doesn’t use a tool, then the results are usually collected through Excel questionnaires – in this case, the Business continuity coordinator collects all these questionnaires; if a tool is used, then they are collected automatically.
Whether or not a tool is used, the information that is collected during the BIA process must include all the elements mentioned in the BIA Methodology.
If yours is a larger company, you should probably compile all these results in a Business impact analysis report; however, smaller companies will be just fine with summarizing all the results in the Business continuity strategy.
This article is an excerpt from the book Becoming Resilient: The Definitive Guide to ISO 22301 Implementation.