Clifton Risk Management

Background

History

The saying goes that Noah was one of the first continuity managers: he assessed the risk, put a plan and strategy into action and prioritised the recovery process. Unfortunately, the unicorns didn’t do too well on the business impact analysis. What this illustrates is that planning for catastrophe is nothing new, and very few new ideas have come into the arena for some time. People continually attempt to intellectualise the areas of risk and produce more and more complicated multidimensional models, but what it boils down to is that:

“Something could happen, sometime, and if this concerns you then you need to do something about it.”

Now that isn’t too complicated a theory.

Risk assessment led to insurance, and the many London-based companies in the 17th century that made and lost fortunes by underwriting shipping are legendary. This created the profession of the actuary, and logically you would think actuaries would make excellent risk managers, yet my experience has shown that there are very few actuaries in the business of risk management. Actuaries “bet” that if you are under 25, male and living in the inner city you are more likely to claim on your car insurance, and as such penalise anyone who falls into that category. This is a simple example, and many more complex “bets” are made too. In risk management we cannot afford that bet, because if we lose we could lose the company we are working for.

Insurance is an essential component of your business continuity strategy, and any assessment of your risk strategy should include a review of the breadth and value of your insurance cover. However, insurance only pays out after you have failed. Our job is to stop you failing or, if that is impossible, to reduce the impact of that failure to acceptable levels.

Disaster recovery in its modern sense came about after the birth of computers.

Computers by their nature increased the speed of processing to such levels that people could not keep up by working manually. This was realised in the 1960s in the USA and led to the start of a new business, that of the disaster recovery planner. That said, many businesses across the world still propose manual workarounds for extended periods of time as the correct strategy for computer failure!

So what did the disaster recovery planner do, and how did the role move forward, or has it?

Present

Once the computer started to move into the business arena, the value of the data on it soon became obvious to those managing operational areas. Previously, information had been held primarily on paper and was the responsibility of the author or filing clerks; now technicians were responsible for controlling, storing and retrieving essential information. As such, the value of backups was quickly recognised. However, time and time again we find that now, almost 40 years later, this is still one of the fundamental failures in organisations: data is not backed up, or is not stored in secure sites. I will expand further on this in the section on disaster recovery.

Once the data was backed up and available, something to restore it onto became necessary. Whilst many of us now take this for granted, we must realise that until the early 1990s PCs were not the norm, and large-scale computers, primarily mainframes, were required to restore this information.

So, back to the early days: a member of the IT operations team had to convince management that there was a need to invest in a duplicate, possibly multimillion-dollar, computer onto which this backed-up information could be restored. Whilst all organisations were willing to spend money on insurance to get something back after a failure, few were prepared to invest to prevent an incident. As with most things in human life, investment is only considered essential after a failure has been seen to happen. In the USA natural disasters abound, and examples such as floods, earthquakes and hurricanes motivated companies to invest in recovery capability.

However, cost justification was required, and thus the initial types of business impact analysis review were created. In almost all cases these first efforts were made to cost justify investment. The niceties, such as the prioritisation of recovery, timescales and dependencies, were consequences of this review, not its first aims. This has now changed in many ways, and the value of and approach to these reviews are covered in a later section called business impact analysis.

In analysing the impact on a business it was also essential to investigate what risks the business was exposed to; there is a section on risk analysis later. Risk analysis looked at many levels of a company and attempted to find the risks it was exposed to, so that these risks could then be eliminated, reduced or compensated for. The approach covered many aspects, from the environment in which an organisation was based, through the physical risks within the facility, to a review of the logical risks its systems brought to operations. It must be remembered, though, that the fundamental levels of security available in most mainframe systems meant that, in the early days, data was much better protected than it is now with distributed systems, the internet and the like. That’s progress for you! Many reviews also included investigation into specialist areas of risk, such as financial or production risks, and often personnel risks were included too.

At this stage we still had IT personnel, primarily computer operations, responsible for planning solutions to possible failures. In fact, IT was probably not an expression used then; data processing was more likely. In a world of ever-changing abbreviations and acronyms, mostly meaning the same thing, it is easy to miss some. In very few cases before the 1990s was business management involved in planning for the recovery of its operations. With hindsight this may seem a little silly, but in many cases globally it is still so. One problem that arose from this is that disaster recovery was perceived to be an IT issue, and that IT would provide an almost “magic” solution to any failure that might impact the business. In very few cases did business management challenge IT as to how this was going to be done.

The impact of this split between IT and the business will be investigated later, in the section on business continuity planning.

IT managers, primarily in large organisations, started to implement strategies to cater for failures. This saw the birth of the disaster recovery companies, which provided equipment on a syndicated basis. Their aim was to reduce the cost to companies of investing in equipment which, hopefully, they would never need. These companies and their approach to managing risk are expanded on in the section on recovery strategies. So in the 1980s we saw a situation where there was an understanding of why we want to put strategies in place and a methodology for implementing them. What we needed then were plans to control the situation and co-ordinate the recovery.

Writing a plan seemed a simple matter, and many IT managers wrote simple high-level plans as to what they would do in the event of an incident. In writing these plans a requirement for another whole new industry was found, that of disaster recovery software and, later, business continuity software; there is more on these in the section named software. As mentioned, very few distributed systems or PCs were in use before the 1990s, and as such most data was resident on mainframes. As people started to write plans they found that they needed a mass of information that companies stored in other places. As this was not easily accessible, a tool was needed in which to store this information. The first of these tools were simple DOS-based systems that allowed data entry into partially written plans and then produced finished documents. Later models allowed users to import data from other sources, collate it and use it to create plans. In the 1980s these tools were very useful in the production of large-scale plans. However, software is not, and has never been, a substitute for proper planning, and many companies still force their plans to fit the software as opposed to their own internal culture.

As IT managers planned for incidents it became obvious to some of them that the requirement went beyond IT and into other areas of the business. This was particularly true of the initial reaction to an incident, and it led to the birth of the crisis management plan; there is more on this in a later section. Other areas, including facilities, public relations and personnel, were needed in the planning. This started the move from the IT world into the wider organisation and the need for plans to be more encompassing. Some sectors had always had emergency plans, for example public authorities (for natural disasters), airlines and oil companies, and experts from these fields were called upon to help shape the plans for business.

During the 1980s the industry moved out from the USA and around the world. There had been small expansions before this, but during this decade the pace increased and global interest grew. Also around this time other drivers came into play, which started to bring a fresh approach to planning. Without a doubt the USA was the global leader at this time and set the standards for everyone else. However, the UK faced a threat very different from that facing almost all of the rest of the world in its drive to planning: terrorism.

This became particularly visible to the world in the large-scale bombings of the City of London in 1992 and 1993. These bombs were deliberately targeted to minimise loss of life and maximise the damage to financial institutions. The scale of the damage and the impact on many international companies completely refocused the drive from disaster recovery to business continuity in the UK.

What many companies found was that their computer sites were not impacted, as they were not based where the explosions were. However, the loss of the working environment, of work in progress and of key documentation was in many cases new to planners, and introduced the need to involve the business units in planning.

This meant that in the early 1990s there were two significant changes in the requirements for planning, namely the introduction of distributed computer systems, with PCs, and the change of focus from IT to the business. This also created a challenge for the industry, as most of its members came from one of three backgrounds: IT, facilities or the emergency and armed services. None of these groups were experts in business, and many new skills had to be learnt.

Around the industry new bodies were appearing too: groups to co-ordinate conferences and to offer qualifications, training and education. The industry as a whole was growing, and so was the infrastructure that supported it.

The industry moved on through the 1990s and plans adapted to new business drivers; a section on planning is available later in this document. Experience has shown that most large companies in the UK now have business-driven plans, which have impacted IT plans and led to some fundamental changes in planning. In the USA, where many plan owners still reside in the IT arena, the taking of ownership and drive by business management has not been so prevalent. It is surmised that this is primarily due to the different nature of the threats that have driven planning in each country, i.e. terrorism in the UK, and events such as the Manhattan power failure in New York for the USA.

Planning and strategies have also had to adapt to new technologies and methods of working. Downsizing, just-in-time operations, internationalisation, mergers and acquisitions have all meant that plans must be altered and assumptions challenged. The Internet delivers 24-hour service, and for many companies a call centre is the only way they can be contacted. Contracts guarantee levels of service, and technology is often the only way that these can be honoured.

The diagram below shows what is seen as the traditional approach to recovering from an IT failure.

The approach displayed shows an incident occurring which leads to a failure in IT systems. At this stage a review of options and strategies would be undertaken. If agreed, recovery of systems and data to the point of the last backup would then take place. As some data would be lost, and processing time would also be lost during the recovery phase, there would then be a period when business areas are “catching up” on the data and time lost. The business areas can then operate in their reduced state before eventually recovering to normal operations.

This old method of incident, failure, review, recovery, catch up, operational is not acceptable to certain sectors of business. New technologies bringing in options such as redundancy in systems, data farming or mirroring amongst others, mean that the need to recover is being reduced, as we move to continual processing.

Traditional IT Recovery Diagram

Normal Operation → Incident → Failure → Review → Recovery → Catch Up → Reduced Operation → Normal Operation
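The arithmetic behind the “catching up” phase can be illustrated with a minimal Python sketch. It is not part of the original model: the class and function names, the assumption that all times are measured in working hours, and the idea of a single constant catch-up rate are hypothetical simplifications chosen only to show why the backlog is larger than the outage itself.

```python
from dataclasses import dataclass


@dataclass
class TraditionalRecovery:
    """Hypothetical inputs for the traditional recovery timeline (all in working hours)."""
    hours_since_last_backup: float  # work done since the last backup, lost at failure
    review_hours: float             # time spent reviewing options and strategies
    recovery_hours: float           # time spent restoring systems and data
    catch_up_rate: float            # backlog hours cleared per elapsed hour (assumed constant)


def backlog_hours(r: TraditionalRecovery) -> float:
    """Work to catch up on: lost data to re-enter plus processing time missed."""
    return r.hours_since_last_backup + r.review_hours + r.recovery_hours


def hours_until_normal(r: TraditionalRecovery) -> float:
    """Elapsed hours from failure until the backlog is cleared."""
    if r.catch_up_rate <= 0:
        raise ValueError("catch_up_rate must be positive")
    downtime = r.review_hours + r.recovery_hours
    return downtime + backlog_hours(r) / r.catch_up_rate


# Illustrative figures only: last backup 12 hours before the failure,
# 4 hours of review, 24 hours of recovery, half an hour of backlog
# cleared for every hour worked once systems are back.
example = TraditionalRecovery(12, 4, 24, 0.5)
print(backlog_hours(example))       # backlog to clear, in hours
print(hours_until_normal(example))  # elapsed hours until normal operation
```

Under these assumed figures the systems are unavailable for 28 hours, but the business carries a 40-hour backlog, because the work done since the last backup must be redone as well.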

All of these factors moved business continuity more into the mainstream of the business. It became part of change management procedures and project planning, and risk became an item that management discussed. Regulators started to review organisations’ capability to operate and demanded proof that plans were in place and being tested. And then there was the millennium. Y2K brought business continuity to the forefront. Television programmes were made on the subject, books were written, boards of directors were talking about it and suddenly we were mainstream. As such, business continuity planning became business continuity management: it was now part of business as usual and not a series of projects. So is there a future in business continuity management?

Future

There will always be a need to plan. Even if all technology is secure and needs no recovery, there is the potential to lose people, access to buildings or your reputation. As such, the need for testing and proving the usefulness and value of plans increases; there is a section on testing later in this document. People must still be co-ordinated, controlled and informed of the situation, and this must all come from planning. The need for disaster recovery and disaster recovery suppliers may decrease as technology leaves them behind, and they may move into markets that are new for them, such as the SME environment, but the need to plan, and to test plans, will remain.

The next two pages show two models. The first is a traditional diagram used to plan projects; the second is a little more radical and is justified within the remainder of this document.