Patching has become a ubiquitous part of modern life, with developers releasing software updates in response to the latest bugs and vulnerabilities in their platforms. Patch Tuesday, the second Tuesday of every month, is when Microsoft and many other companies typically release their latest patches.
However, blindly installing a patch onto a system without first testing it in a non-production environment can have grave implications, especially where bespoke software is used, due to potential compatibility issues. Patching is particularly tricky for systems that are in constant use, and for these reasons it is as much a business consideration as a technical one.
Patching, when done properly, is not a quick process. The IT team should take time to verify the source, read the patch notes, determine the criticality of the patch and conduct a risk assessment, before thoroughly testing the patch to ensure the system will remain secure and stable. Only then should an organisation take a snapshot of its systems and deploy the patch.
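The gated process described above can be sketched in a few lines of code. This is a minimal, hypothetical illustration – the stage names and `PatchWorkflow` class are assumptions for the sake of the example, not a real tool – showing the key idea that deployment is refused until every earlier gate has been passed.

```python
# Illustrative sketch of a gated patch workflow; all names are assumptions.
from dataclasses import dataclass, field

STAGES = [
    "verify_source",       # confirm the patch really comes from the vendor
    "read_patch_notes",    # understand what the patch changes
    "assess_criticality",  # how urgent is the fix?
    "risk_assessment",     # what could the patch break?
    "sandbox_test",        # dry run on a replica system
    "snapshot",            # capture a known-good state to roll back to
]

@dataclass
class PatchWorkflow:
    completed: set = field(default_factory=set)

    def complete(self, stage: str) -> None:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.completed.add(stage)

    def ready_to_deploy(self) -> bool:
        # Deployment is only allowed once every gate has been passed.
        return all(s in self.completed for s in STAGES)

wf = PatchWorkflow()
for stage in STAGES[:-1]:
    wf.complete(stage)
print(wf.ready_to_deploy())  # False: no snapshot has been taken yet
wf.complete("snapshot")
print(wf.ready_to_deploy())  # True: safe to deploy
```

In practice each stage would carry evidence (test results, sign-offs) rather than a simple flag, but the ordering constraint is the point.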
“You need to have a cast-iron guarantee that it’s going to work before you put the update in,” says Dave Lear, lead security architect at an end-user organisation. “This is where rigorous testing and the absolute stress testing of that particular deployment is needed, before you can write a case to say, ‘I’m happy now that this can be deployed’.”
Alongside the regular updates, there are also urgent patches for critical vulnerabilities. These need to be reviewed, tested and deployed within a short space of time and, when they are not properly accounted for, can put a strain on resources.
Patching costs time, money and resources, and failing to take this into consideration can lead to unanticipated losses. Where the time and cost of patching and installing upgrades have not been budgeted for, some organisations end up running legacy systems, with the resources for patching drawn from the budget that had initially been set aside for the five-year replacement.
“Organisations, particularly in the public sector, have been unable to upgrade from Windows XP, and it’s all because there’s never any long-term budgetary considerations given to the lifecycle management,” says Mike Gillespie, director of Advent IM.
Patching has been further complicated by the software-as-a-service (SaaS) model. Rather than purchasing a licence to use software outright, it is now common to pay an annual subscription for its continued use. However, if a patch is released for software whose subscription an organisation has failed to maintain, the supplier – depending on how mercenary it is feeling – is well within its rights to withhold the patch until the subscription is renewed.
Patching a system, particularly a customer-facing one, has always been a tug of war between the necessities of IT and the business wanting to be always available. Scheduling system updates becomes a case for careful internal negotiation. This is where risk assessments come in, as these communicate the potential dangers of delaying certain software updates.
“For as long as I’ve been involved in IT, there has been a tension between the security people who want to patch and maintain systems, and the business side that want to have systems up and running 24/7,” says Gillespie. “The truth is, in most cases, it is never going to be possible to have full 24/7 capability in every system. Businesses have to get better at recognising, and then factoring in, windows of downtime.”
In cases where a system needs to be always on, such as when it forms part of critical infrastructure, multiple systems can be run in parallel. These can also act as a redundancy measure.
Testing a patch, especially for core systems, is crucial to maintaining business continuity. Tests are typically conducted on a replica of an organisation’s systems in a sandbox environment. This allows the development team to perform a dry run of the patching process and to thoroughly test the system to ensure it remains stable.
“You’re never going to have your test system exactly like your live system. There’s always going to be those key differences, as one is closed off and not connected outside,” explains Lear. “What you can do is put dummy systems on the end and say, ‘This is supposed to be an internet connection; it’s supposed to be able to talk to the server.’ You put a dummy server on the end of an Ethernet string and configure it to respond the way it is expected to in the real-world scenario.”
Naturally, this can come with a significant expenditure of time and resources. Not only do resources need to be set aside to cover the expense of running such a test, but staff with the appropriate experience must also be available. Failure to do this can expose an organisation to significant risks, such as patches potentially causing system failures.
Some of the cost associated with patching can be mitigated with third-party tools and automation. However, care needs to be taken to ensure that the process is the optimal one and that bad habits are not inadvertently carried over.
“If you buy third party patch management software, to just automate what you’re already doing, then you’re automating rubbish,” says Gillespie. “If your strategy isn’t sound, isn’t properly thought through and isn’t well documented, then all you’re doing is spending money to make it easier to make the same mistake that you were making before.”
Simply automating an existing patching process without reviewing and optimising it is unlikely to deliver the full efficiency that automation allows.
A careful review of the existing process, from the initial announcement of a patch all the way through to its deployment, will ensure that the automated process makes efficient use of time and resources and that no redundant stages are carried over.
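That kind of review can itself be partly mechanised. The sketch below is purely illustrative – the documented stage list and the `audit` helper are hypothetical – but it shows the idea of flagging redundant steps in a documented process before that process is handed over to automation.

```python
# Hypothetical audit of a documented patch process before automating it.
# The stage names are illustrative, not taken from any real organisation.
documented_process = [
    "verify_source",
    "read_patch_notes",
    "risk_assessment",
    "read_patch_notes",   # duplicated step that crept in over time
    "sandbox_test",
    "manual_signoff",
    "manual_signoff",     # two approvals where one is required
    "snapshot",
    "deploy",
]

def audit(stages):
    """Return the de-duplicated process and any redundant stages found."""
    seen, cleaned, redundant = set(), [], []
    for stage in stages:
        if stage in seen:
            redundant.append(stage)
        else:
            seen.add(stage)
            cleaned.append(stage)
    return cleaned, redundant

cleaned, redundant = audit(documented_process)
print(redundant)  # ['read_patch_notes', 'manual_signoff']
```

A real review would also question whether each surviving stage still earns its place, which is a judgement call no script can make.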
Furthermore, whilst third-party tools are useful, they should not be totally relied upon. There is a risk of blind spots, should such tools fail to detect specific flaws or vulnerabilities.
“A lot of third-party tools, particularly the monitoring tools and patch management level testing, are better because they’re written specifically to do that job,” says Lear. “They highlight not only where your vulnerabilities are, but most often they will give you a link to other patches so that you can instantly grab that and get moving towards patching the system again.”
Updating a system is always a risk, but one that can be significantly mitigated through a rigorous testing process. However, errors can still happen. These can vary from a system returning to its default settings or losing network connectivity, through to a system completely failing.
“From personal experience, it’s happened twice where we’ve really had to do some damage control. It’s always a risk, and always in the back of your mind, that this might not work,” recalls Lear. “Just because it worked on ‘System A’ doesn’t mean it will necessarily work on ‘System B’. It’s the same thing when you’ve got your dual redundancy failure systems, you’ve patched one side of it, and that’s working, then you’ve patched the second side of it and it falls over for some inexplicable reason.”
This is when the time spent documenting and updating business continuity and disaster recovery plans provides a return on investment. As part of the disaster recovery plan’s annual review, the system details section should be updated, including not only what the systems are comprised of, but the versions as well. “When you’re writing ‘This is the current state of the system,’ that will always need to be updated because it’s never the same as it was twelve months ago,” says Lear.
With an appropriate catalogue of system snapshots, it can be a simple case of rolling back the network to the last known “good” configuration of the system if and when a problem occurs.
Due to the potential risk of threats that are embedded, but not yet active, within a system, it may need to be rolled back several iterations before there is confidence that it is stable. “Rollback is not as simple as uninstalling it to the last known configuration,” says Gillespie. “If your last known good configuration is from three years ago, then you have lost three years of work.”
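The trade-off Gillespie describes can be made concrete with a small sketch. The snapshot catalogue and `last_known_good` helper below are assumptions for illustration only: rolling back means selecting the most recent snapshot still believed clean, and the older that snapshot is, the more work is lost.

```python
# Illustrative snapshot catalogue; dates and flags are invented for the example.
from datetime import date

snapshots = [
    {"taken": date(2021, 6, 1), "verified_clean": True},
    {"taken": date(2023, 9, 1), "verified_clean": True},
    {"taken": date(2024, 3, 1), "verified_clean": False},  # suspected embedded threat
    {"taken": date(2024, 9, 1), "verified_clean": False},
]

def last_known_good(snaps):
    """Most recent snapshot still believed clean, or None if there is none."""
    clean = [s for s in snaps if s["verified_clean"]]
    return max(clean, key=lambda s: s["taken"], default=None)

target = last_known_good(snapshots)
print(target["taken"])  # 2023-09-01: rolling back here loses a year of changes
```

The gap between the newest snapshot and the newest *clean* snapshot is exactly the "lost work" Gillespie warns about.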
When lifecycle management is embedded within an organisation’s core business practices, IT teams are able to ensure that the appropriate time and resources are available to maintain systems. This reduces unexpected downtime and preserves the upgrade budget for the five-yearly replacement.
“Patch management isn’t just an IT issue; it’s a business issue that needs to be managed as part of a business communication and business change programme,” concludes Gillespie.