While managing the implementation of changes to solid state
interlocking (SSI) data, we used risk analysis, and the process
of preparing it, as a tool to identify the issues of greatest
effect and to keep the project team focused.
Project Context
SSI, the current standard Railtrack microprocessor-based, safety
interlocking system, was developed some 25 years ago under a tripartite
agreement between two of Britain’s leading signalling equipment
manufacturers and the then nationally-owned British Rail. This agreement
established a set of common standards for interworking of the different
SSI component modules.
A typical SSI installation consists of:
- A central interlocking (CI) computer suite.
Each CI contains three identical multi-processor modules (MPM).
CIs can be combined to control larger areas.
- Multiple trackside functional module field-interface
units.
- A panel processor module for panel or video-display-unit-based
control centre interface.
- A diagnostic processor module.
Within the CI, a fixed safety programme operates on variable geographic
data. The geographic data describe the identities and types of the
external plant, the safety equations for the sequence and combination
of allowable operations, and the exact addresses of the connections
to the trackside functional modules for each external input and
output. All geographic data must be written and verified for
both safety and operational correctness by licensed designers and
checkers using a safety-approved design workstation. By linking
the design workstation to a simulator of the target CI, licensed
testers can then test the geographic data before they are delivered
to site and commissioned.
The Key Issues
The key issues related to SSI data change were as follows.
Complex and Multiple Elements. We were dealing
with 8 independent zones within Railtrack, 4 data change contractors,
and approximately 225 SSI interlockings within the scope of Train
Protection and Warning System (TPWS). The interlockings comprised
approximately 2,500 signals and 20 percent of the TPWS Programme.
Identification of Interlockings Needing Change.
We estimated that approximately 70 of the 225 interlockings would
require a data change, but this estimate was untested. The specific
interlockings that would require a data change were unknown at the
outset, so it was not possible to programme the resources to carry
out the data changes. Even if the estimate of 70 was reasonable,
we could not tell until well into the programme whether those
interlockings were grouped at the front of the programme, at the
end, or evenly distributed throughout.
Restrictions on Use of Interlockings. We faced
restrictions on which manufacturers’ CIs could be worked on
by some of the data change contractors. Until the data change CIs
had been identified, we did not know how many of the restricted
types might need to be worked on by any one contractor.
Need for Design Workstations. The design
workstations (DWSs) on which the data change software runs were approximately
20 years old and used a processor that was no longer manufactured.
Because the number of operational DWSs was limited, the location
of each one tended to be known to interested parties, and
it was going to be very difficult to obtain workstations.
Shortage of Licensed Personnel. The design and
testing personnel who would carry out the work, or at least their
supervisors, had to be appropriately licensed by the Institution of
Railway Signal Engineers (IRSE) and were therefore a scarce
resource, especially in the climate of extensive railway renewal
in the UK. Licensees are listed in the IRSE register and can
therefore be looked up. In the small world of railway signalling,
competent personnel tend to be known to the wider community.
Failure to complete the task would prevent the completion of the
TPWS Programme, which was subject to a public commitment by the
Chairman of Railtrack and to legal regulations.
Data Change Issues
Interchangeability. A number of improvements to
both the hardware and software for both SSI CIs and trackside functional
modules have taken place and have been given safety approvals under
various safety authorities, but with general oversight from the SSI
Signalling Principles Group (SSISPG), a voluntary industry steering
group recognised and supported by the infrastructure owner (Railtrack).
The standard interchangeability of the two different manufacturers’
components has been generally maintained. In the period when British
Rail was privatised, however, the manufacturers developed different
upgraded versions of the CI. Unfortunately, these two “turbo”
processors, both working at twice the clock frequency of the original,
are not interchangeable in that they require different compiled
versions of the geographical data. The two compilers are proprietary
to the separate manufacturers, although each can be run
on the standard design workstation.
The interface arrangements for TPWS to the SSI equipment may involve
only wiring changes in simple installations, or both hardware changes
(wiring connections, additional trackside functional modules, etc.)
and geographical data changes to allow both the CI and the diagnostic
processor module to operate as required. General electrical contractors
were engaged by Railtrack to undertake the physical trackside installation
and normal signalling circuit design for TPWS installations, which
included those SSI installations that required only wiring changes.

Figure 1: Interactions for a Single Data Change Work Package

Figure 2: Process Flow Chart

Figure 3: Process Flow Chart for Multiple Data Changes
Availability of Resources. Where data changes
are required, the available resources (in terms of not only the
licensed workers but also the safety-certified design workstations)
are concentrated in the hands of a very small number of companies,
none of which is actually engaged in the physical trackside installation
work. Because these resources are not only in short supply but are
also required for a number of Railtrack’s other priority projects,
it was considered unwise to allow the trackside installation contractors
to engage in a bidding contest for use of these resources. As a
result, the TPWS National Team began a process of identifying and
contracting for the available data change resource as a central
pool to be used as needed by the different zones and trackside installation
contractors.
Part of this process involved quantifying the available resources,
the required work and the rules under which data change or non-data
change solutions should be applied to SSI sites. The planning was
further complicated by one of the key issues mentioned earlier:
restrictions on which manufacturers’ CIs could be worked on
by some of the data change contractors.
Contractors and Controls. The work on SSI data
change worksites was consequently divided between two contractors,
each of whom reported to a different part of Railtrack. In addition,
each data change contractor would possibly work concurrently with
two or three trackside installation contractors and vice versa.
Adding to the complications were the controls needed to ensure that
all parties were using the same version of records and that a clearly
identified party was responsible for all safety-related records
at any time. The interactions foreseen for a single data change
work package are shown in Figure 1.
Moving Towards Quantified Risk Analysis
When we first started to develop a quantified risk analysis (QRA),
we found that some risks had been identified on a general level
and some had been identified in more detail. Actions to mitigate
these had been determined and implemented where practical. A basic
programme of work was in place based on key assumptions. There had
been little quantification of the risks, however, and no analysis
of the impact to the overall sub-project or to the programme. It
was considered very important to perform such quantification and
analysis as early as possible.
Creating a Model to Analyse. The first part of
any risk analysis is to evaluate the work process so that the impact
of any risk can be understood in the context of the process. From an
initial review, a basic data change process was apparent, as shown
in Figure 2.
The team discussed the factors that might cause this process to
fail. We felt that the major risks were not to any individual data
change but to the overall programme of data changes within which
each data change lay. An expanded model was therefore developed,
as shown in Figure 3.
Identifying the Major Risks. The greatest concern
was that the data changes would not be completed within the legally
regulated timescales. Factors affecting the outcome could be summarised
as:
- Number of data changes
- Number of available data teams
- Average time taken to perform a data change
- Available data requisitions (to ensure consistent
resource usage)
- Potential delays to the process (resulting
in wasted resource time).
After some discussion, and focusing on the
Railtrack requirement for signals to be completed by a particular
deadline, a brainstorming exercise identified a number of key risks
affecting the above factors.
Table 1: Modeling Assumptions and Exclusions

Figure 4: Sample Process Flow Diagram

Figure 5: Example of Sensitivity Charts
Creating a Risk Model
In order to focus on time effects, a risk model based on data change
resource weeks was created using Predict! Risk Analyser. It compared
resource weeks needed to complete the work against those currently
available. The spreadsheet-based analysis tool allowed the variability
in the major risk factors listed above to be modelled quantitatively.
A few key assumptions and exclusions concerning the model are shown
in Table 1.
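
The comparison of resource weeks needed against resource weeks
available can be written down in a simple form. The relation below
is an illustrative sketch built from the factors listed earlier,
not the formula used in the Predict! model; every symbol is an
assumption introduced here.

```latex
% Illustrative only: symbols are assumptions, not taken from the Predict! model.
% N       interlockings in scope (about 225 in the text)
% f       fraction of interlockings requiring a data change
% \bar{t} average resource weeks per data change
% D       resource weeks lost to delays
% T       number of available data change teams
% H       weeks remaining before the deadline
\[
  W_{\mathrm{needed}} = N \, f \, \bar{t} + D ,
  \qquad
  W_{\mathrm{available}} = T \, H ,
  \qquad
  \text{on time if } W_{\mathrm{needed}} \le W_{\mathrm{available}} .
\]
% Worked figure using numbers quoted in the text: an estimated 70 data
% changes (N f = 70) at a likely six weeks each gives
\[
  70 \times 6 = 420 \ \text{resource weeks before any delay allowance.}
\]
```

Treating f, the average duration and D as uncertain ranges rather
than fixed values is what turns this balance into a risk model.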
A key benefit of an integrated time-based model is that the output
shows the holistic effect of risks even when they affect separate
parts of the process; the whole may be greater than the sum
of the parts. The output variable
(the completion date) allowed the confidence of completing by our
deadline to be calculated. The model also generated a probabilistic
range of completion dates, and those relating to the 50 percent
and 80 percent confidence levels were quoted in all reports to give
an estimate of the likely range of the completion date.
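
The kind of output described above can be sketched as a small
Monte Carlo simulation. This is an illustrative sketch only, not
the Predict! model: the triangular distributions, the team count,
the 52-week deadline and all function names are assumptions chosen
for illustration. Only the 225 interlockings in scope comes from
the text.

```python
import random


def simulate_completion_weeks(rng, interlockings=225, teams=8):
    """One trial: weeks needed to finish all data changes (all ranges hypothetical)."""
    # Fraction of interlockings needing a data change: low, high, most likely.
    fraction = rng.triangular(0.20, 0.45, 0.31)
    # Resource weeks per data change: low, high, most likely.
    weeks_per_change = rng.triangular(4.0, 10.0, 6.0)
    # Share of team time lost to delays such as late data requisitions.
    lost_fraction = rng.triangular(0.0, 0.30, 0.10)

    resource_weeks_needed = interlockings * fraction * weeks_per_change
    productive_teams = teams * (1.0 - lost_fraction)
    return resource_weeks_needed / productive_teams


def percentile(sorted_values, q):
    """Value at quantile q (0..1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def run_model(deadline_weeks=52, trials=20000, seed=1):
    """Return the confidence of finishing on time and the P50/P80 completion weeks."""
    rng = random.Random(seed)  # fixed seed so reruns are repeatable
    results = sorted(simulate_completion_weeks(rng) for _ in range(trials))
    on_time = sum(1 for weeks in results if weeks <= deadline_weeks)
    return {
        "confidence_on_time": on_time / trials,
        "p50_weeks": percentile(results, 0.50),
        "p80_weeks": percentile(results, 0.80),
    }


if __name__ == "__main__":
    summary = run_model()
    print(f"Confidence of completing by deadline: {summary['confidence_on_time']:.0%}")
    print(f"50% confidence completion: week {summary['p50_weeks']:.1f}")
    print(f"80% confidence completion: week {summary['p80_weeks']:.1f}")
```

Because every trial combines all the uncertain inputs, the spread
of results reflects their joint effect rather than each risk in
isolation.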
The model was based on a mathematical process flow diagram created
by the risk team and agreed to by the technical team (Figure 4).
The mathematics of the model were checked by
both teams to ensure accuracy.
Following the first run of the model, both risk and technical
teams were concerned to find that the chance of completing on time
was less than 5 percent!
Input Sensitivity Analysis. A series of input
sensitivity tests was carried out, varying key inputs
manually and rerunning the model to find out how the end date changed.
Examples of sensitivity graphs are shown in Figure 5. In both
examples shown in Figure 5, the project was
on the cusp of the sensitivity curve, with expected data changes
at 30 percent and a likely activity duration of six weeks.
Adverse changes (increases) in either number would therefore
very quickly result in failure. This analysis
showed the tight time frame available within which to carry out
the work and the high degree of interdependency between the various
factors.
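
A one-at-a-time sensitivity sweep of this kind can be sketched as
follows. The baseline of 30 percent and six weeks comes from the
text; the deterministic formula, the team count of eight, the
52-week deadline and the sweep values are assumptions made purely
for illustration.

```python
def completion_weeks(fraction_needing_change, weeks_per_change,
                     interlockings=225, teams=8):
    """Deterministic end date in weeks (hypothetical simplified model)."""
    changes = interlockings * fraction_needing_change
    return changes * weeks_per_change / teams


def sweep(name, values, baseline):
    """Vary one input, hold the others at baseline, return (value, weeks) pairs."""
    rows = []
    for value in values:
        inputs = dict(baseline, **{name: value})
        rows.append((value, completion_weeks(**inputs)))
    return rows


# Baseline from the text: 30 percent of interlockings, six weeks per change.
baseline = {"fraction_needing_change": 0.30, "weeks_per_change": 6.0}
DEADLINE_WEEKS = 52  # hypothetical deadline

fraction_rows = sweep("fraction_needing_change",
                      [0.20, 0.25, 0.30, 0.35, 0.40], baseline)
duration_rows = sweep("weeks_per_change",
                      [4.0, 5.0, 6.0, 7.0, 8.0], baseline)

for label, rows in (("fraction needing change", fraction_rows),
                    ("weeks per change", duration_rows)):
    print(label)
    for value, weeks in rows:
        status = "on time" if weeks <= DEADLINE_WEEKS else "LATE"
        print(f"  {value:>5}: completes in week {weeks:6.1f} ({status})")
```

With these assumed figures the baseline sits just inside the
deadline, so a modest increase in either input pushes the end date
past it, which is the behaviour the sensitivity charts illustrate.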
Mitigation and Action Generation. Discussions
about a series of possible mitigation measures arose from the sensitivity
tests. These mitigation measures were input into the model one by
one, starting with those that would have the most impact and that
we assumed would be successful. Changes to the completion date resulting
from potential mitigations were recorded, generating a set of possible
mitigations that, if successful, would ensure that the whole data
change plan would complete on time.
An action plan was created based on achieving the mitigations required
to gain an acceptable modelled completion date. For example, the
assumed number of data change teams was identified as being deficient,
and an action to obtain further teams was created.
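
The step of feeding mitigations into the model one by one, largest
impact first, can be sketched as a simple greedy loop. The model
formula, the input values, the deadline and the three mitigations
below are hypothetical; they only loosely echo the actions the
text mentions, such as obtaining further data change teams.

```python
def completion_weeks(inputs):
    """Deterministic end date in weeks: resource weeks needed / productive teams."""
    needed = inputs["changes"] * inputs["weeks_per_change"]
    productive_teams = inputs["teams"] * (1.0 - inputs["lost_fraction"])
    return needed / productive_teams


# Hypothetical mitigations: each returns a modified copy of the model inputs
# and is assumed to be fully successful.
MITIGATIONS = {
    "obtain two further data change teams":
        lambda m: {**m, "teams": m["teams"] + 2},
    "issue data requisitions earlier to halve lost time":
        lambda m: {**m, "lost_fraction": m["lost_fraction"] / 2},
    "use non-data-change solutions at ten sites":
        lambda m: {**m, "changes": m["changes"] - 10},
}


def apply_in_order_of_impact(inputs, mitigations, deadline_weeks):
    """Apply the most effective remaining mitigation until the deadline is met."""
    remaining = dict(mitigations)
    log = [("baseline", completion_weeks(inputs))]
    while remaining and log[-1][1] > deadline_weeks:
        # Pick the mitigation that gives the earliest completion from here.
        best = min(remaining,
                   key=lambda name: completion_weeks(remaining[name](inputs)))
        inputs = remaining.pop(best)(inputs)
        log.append((best, completion_weeks(inputs)))
    return log


if __name__ == "__main__":
    # Hypothetical starting point: 70 changes, six weeks each, six teams.
    start = {"changes": 70, "weeks_per_change": 6.0,
             "teams": 6, "lost_fraction": 0.2}
    for step, weeks in apply_in_order_of_impact(start, MITIGATIONS, 52):
        print(f"{step}: completes in week {weeks:.1f}")
```

The resulting log is in effect a draft action plan: it lists which
mitigations must succeed, and in what order of importance, for the
modelled completion date to become acceptable.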
Adaptability. The risk model has since been used
to generate the likely cost variability of the data change work.
The risks to scope, resource availability and time remained consistent,
but additional cost-based resource risks were considered. The simple
spreadsheet nature of the model within Predict! enabled the additional
variables and their associated uncertainty to be calculated quickly.
Some additional output cells then generated the commercial information
required.
Maintaining the Model: Periodic Reviews. It is
vital for any risk analysis to be reviewed and kept up-to-date.
The TPWS/SSI team has accomplished this using a regular (monthly)
review of the current state of the risk model and analysis. The
risk model is adjusted to reflect the success or otherwise of
mitigation actions taken to date, and the entire risk process
is then re-run. The result is an up-to-date mitigation action plan each
month that reflects what needs to be done if the team is to ensure
success.
Consequences
As a result of the initial QRA, a number of memos and discussion
documents were raised for use by the Railtrack Board to highlight
to the zone directors and implementation teams the importance of
SSI to the overall project and the restrictions imposed by the resource
shortage. Highlighting the sensitivity of the model to policy changes
was useful: each zone held back requests for special
treatment that it considered to be in its own interest because it
could now see those requests in the context of the overall project.
At a later stage, the QRA process highlighted the need for certain
measurement tools to keep track of the actual achievement against
the programme. This tracking, in turn, allowed corrections to deviations
to be made as quickly as possible.
The SSI process required corrective action to be implemented rapidly
to avoid delay. A simple weekly “scorecard” type of
report was instituted to enable senior managers to see the adherence
to the plans of each zone without having to await a new run of the
QRA.
Conclusions
Prior to the creation of the QRA for SSI, the TPWS National Programme
Team was generally aware that SSI data change was a difficult area
but did not realise just how sensitive it was. For this and other
reasons, a robust and up-to-date QRA has proved to be a very useful
tool. A monthly review of the QRA keeps the entire project team
focused on the most effective actions within the SSI Project and
the National Programme. In addition, the QRA can also be used to
demonstrate the effect on schedule and budget of interventions by
others, including the client.
Nevertheless, a QRA is only a monitor, a dial on the dashboard
of project management controls. The project team must still expend
the effort to implement the identified actions effectively.
Lessons Learned
- Constant attention to every detail is required to run
down all potential risks on a project such as this one,
which was identified as having so many interacting factors.
- While individual risks can be identified and quantified
to the best degree possible, the cascade/cumulative effect
of these risks acting on each other can be much greater
than expected.
Despite efforts to reduce the “headline”
page of the regular report to show one significant number that could
be the monitor of status, we found that this was not practicable.
The regular report ran to several pages of analysis. We found ourselves
reporting two headline numbers:
- The percentage level of confidence of completion
without any mitigating actions. Because this was the first figure
to emerge, it was the one that stuck in everyone’s mind
and, therefore, was the crude baseline. This percentage is unrealistic,
however. Of course there are going to be mitigations: that
is management, and that is the role of the project team.
- The percentage level of confidence of completion
if all mitigations are successful. This figure shows what the
project team expects to achieve; however, the reader needs to
understand what the project team has factored in and whether this is
realistic.
The QRA of a sub-project is an effective tool, the output of which
can be factored into the risk analysis of the whole programme. Numbers
can be made to say anything you like, however. As with all mathematical
models, the user must understand how the model has been constructed
in order to know how to interpret the results produced, and action
must be taken on what the correctly interpreted results highlight.