You cannot work in close proximity to technical people, particularly those who build systems, for long without hearing the term “technical debt” bandied around.

Technical debt is what you add to every time you choose an easy or quick solution now rather than a longer-term strategy. It is the technical expression of ‘failing to plan is planning to fail.’ And it has consequences.

If a technical system is held together with metaphorical prayers and duct tape, it will often be too fragile to maintain effectively (i.e. trying to update or patch it is likely to cause an outage, or simply to break it irreparably). Given that the earliest systems in an organisation are often also the most important to its operation, the most critical systems are usually the ones carrying the most technical debt.

Technical debt is often thought of as merely adding friction to systems, with perhaps the occasional outage, but it is far more insidious and damaging when we consider that most technical debt also involves what I like to call security debt. If you can’t update a critical system to maintain it effectively, it will be vulnerable to cybersecurity threats. Known vulnerabilities in unpatched systems are a key factor in most cybersecurity incidents.

This is a significant problem on its own, but when we consider the world of Operational Technology (OT) and Industrial Control Systems (ICS), the situation worsens quickly. OT refers to technologies used to manage physical systems, often but not always industrial (e.g., building HVAC, access control systems, lifts, etc.). ICS is a subset, specifically the systems that monitor, manage, and control industrial processes.

OT, ICS, and CNI

The most important OT processes are those in Critical National Infrastructure (CNI): everything from power plants to water treatment facilities. These systems were often automated before security was the major concern it is now, were later connected to the internet to enable remote or centralised monitoring, and have accumulated so much technical debt that they are often impossible to maintain.

The manufacturers of some of these systems no longer exist, and their expense means they are certainly not refreshed every few years as other IT systems should be.

The first known, and most famous, attack against ICS involved the Stuxnet malware, uncovered in 2010. Stuxnet remains one of the most sophisticated cyberweapons developed to date, and it has been repurposed to carry out other attacks since the one that (rightly) made it famous.

To keep the story short, the Stuxnet malware was developed to compromise Microsoft Windows machines to gain an initial foothold on a network, after which it would seek out the controllers that automated gas centrifuges for separating nuclear material. Estimates are that Stuxnet ruined roughly one-fifth of Iran’s nuclear centrifuges and set back the national nuclear programme for several years.

Stuxnet was developed to be subtle; it did not simply cause centrifuges to fail but introduced random variances in their operations which caused them to fail faster. It’s estimated that it was a year after release before it was discovered, and the discovery was more luck than planning.

It isn’t only malware that we need to worry about affecting CNI systems. The Colonial Pipeline attack did not target any ICS systems (it came down to a compromised password, and the shutdown was precautionary), but if the attackers had genuinely aimed to cause chaos rather than deploying off-the-shelf ransomware, the attack might not have been detected as quickly and could have reached the pipeline’s ICS systems. Given the control those systems had, the damage could have been far greater. As it was, the attackers showed no signs of attempting to breach those critical systems.

In another recent case, an attack against a Florida water treatment plant was attributed to outdated software (technical debt again). The attackers increased the amount of sodium hydroxide (used in small quantities to lower the acidity of water, but capable of causing chemical burns in large concentrations) approximately 100-fold. Fortunately, the change was detected, and additional safety measures were in place. Here, however, we see attackers deliberately and maliciously attempting to cause damage. Whether they wanted to send a message, knowing that other safeguards would prevent harm, or were genuinely attempting to poison the water supply is unknown, and only limited information has been shared.

So far, only two attacks have been confirmed to have destroyed equipment (though a recent hospital ransomware attack has been directly linked to a loss of life). The Stuxnet attack damaged nuclear processing centrifuges in a very careful way. A second attack, some years later, compromised a steel mill in Germany. The attackers disrupted the control systems of a blast furnace enough to cause ‘massive’ damage, believed to have resulted from overheating after the furnace’s shutdown capability was disabled.

More seriously, or at least more impactfully, shortly afterwards the Ukrainian power grid was deliberately targeted with a strain of malware named ‘BlackEnergy’, leaving over 200,000 customers without power. A year later the same thing happened in a different attack using more sophisticated malware known as Crash Override. Both attacks could have been far more serious: the attackers chose only to cut the power, rather than reconnecting it out of phase, which would have been catastrophic.

Another attack, in 2017, was the first to deliberately target safety systems designed to enforce emergency shutdowns when human life is at risk. It was initially believed to be an equipment malfunction, until the security team sent in to investigate determined that it was caused by malware, and was part of an effort to develop the capability to cause physical harm.

Realistically, there are no easy answers to this technical debt problem. The effort and expense needed to update these systems are beyond what the organisations that own them are willing or able to pay. The alternative of fully isolating the facilities would require re-engineering processes that are highly dependent on interconnection and effective communication.

It is vitally important, when designing a new system, to carefully consider the technical decisions being made. The risks of not doing so are becoming more visible every day.

Cyber Security Fundamentals: Security and Technical Debt Collection
By James Bore
