One of the biggest issues I tend to see these days when dealing with incident response cases for clients isn’t so much the breach itself, but the deployment of incident response tools to aid the investigation. Not the deployment as such, but the agents deployed to a host clashing with the other tools and agents already running there.
We tend to use a single cloud IR toolkit for the most part, but we have historically run into the same issues with multiple different toolsets.
There is only one thing that will make a client more upset during an IR investigation than the breach itself, and that is having critical assets spin up to 100% utilization and, as a result, grind to a halt! This is still a rare occurrence (but it does happen), as any deployment should involve a set of test devices/systems to ensure there won’t be an impact.
In the past the sole agent found on most assets would be the anti-virus, but these days there are multiple agents running on any organization’s production systems (AV, EDR, encryption, vulnerability scanners, etc.). Unfortunately, the more components we add to a system, the more likely it is that the system will fail: not intentionally, but because there are so many more ways for these components to fail than to work together in an orderly fashion.
Just in the last week our team identified a rare condition caused by a known (and still unresolved) issue between two tools. Another cause of problems is our requirement for these tools to operate at the kernel level (which helps thwart attackers from using code injection and similar techniques to bypass them). The last time I checked, the kernel is a pretty important part of a system! Should there be an issue with an agent running there, the chance of a fatal error on the system is far greater.
It is pretty obvious that the more technology we add to an endpoint, the more complex the machine becomes. The more complex the machine becomes, the greater the likelihood of failure and the larger the set of vulnerabilities created on the system.
These applications and agents ALL consume system processing power (no matter how frugal they are), and when an IR investigation is ongoing there is a requirement to search for IOCs and other indicators, tools, and malware.
Conducting these searches in a cautious manner is essential. A “Hail Mary” approach of launching a scan/sweep across EVERY system in an organization is going to hurt.
Recently my team started an investigation only to find that, the day before we arrived, the client had gone with the Hail Mary approach to search for a specific hash value. Every single one of their production VM servers spun up to 100% for over a day while the scan was in progress and, as a result, was unusable to the organization (until all of the production servers were rebooted!).
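To make the contrast with a more cautious sweep concrete, here is a minimal sketch of a phased IOC search in Python. It is illustrative only: the scan_host function and the host inventory are hypothetical placeholders for whatever your IR or EDR toolkit actually exposes, and the batch size, concurrency limit, and pause would all need tuning to what your environment can tolerate.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholder: in practice this would call your IR/EDR toolkit's
# API or agent to search one host for the IOC (e.g. a file hash).
def scan_host(hostname: str, ioc_hash: str) -> bool:
    ...

def phased_ioc_sweep(hosts, ioc_hash, batch_size=25, max_workers=5, pause_seconds=300):
    """Sweep hosts for an IOC in small batches instead of all at once.

    - batch_size limits how many hosts are queued per phase.
    - max_workers caps concurrent scans so endpoints aren't saturated.
    - pause_seconds gives production systems breathing room between phases.
    """
    hits = []
    for i in range(0, len(hosts), batch_size):
        batch = hosts[i:i + batch_size]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = pool.map(lambda h: (h, scan_host(h, ioc_hash)), batch)
        hits.extend(host for host, found in results if found)
        # Pause between batches and review CPU/IO impact before continuing.
        time.sleep(pause_seconds)
    return hits
```

The point isn’t the specific numbers; it is that the sweep is throttled, observable, and can be stopped between phases if production systems start to suffer.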
I believe there is a shift taking place in the marketplace at the moment as more and more of the larger vendors acquire start-ups (over 1,500 vendors were seen at the 2019 RSA Conference) and consolidate them into suites of tools, providing a harmonious technical environment that has already been tested.
The other option is to enable API integration so that these tools can talk to each other and become more effective as a whole than as the sum of their parts. This model is growing in popularity amongst the EDR tool vendors; open “marketplaces” allow more functionality with less risk to the organization.
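As a rough illustration of what that kind of integration looks like in practice, the sketch below passes an indicator from a threat-intel platform into an EDR blocklist over REST. The endpoints, payload shapes, and confidence threshold are all hypothetical; real products document their own APIs, and this is only meant to show the pattern of tools exchanging data rather than each operating in isolation.

```python
import requests

# Hypothetical endpoints: real EDR/threat-intel products expose their own
# (documented) REST APIs; the URLs and payload shapes here are illustrative only.
EDR_API = "https://edr.example.com/api/v1"
TI_API = "https://intel.example.com/api/v1"

def share_indicator(session: requests.Session, file_hash: str) -> None:
    """Push a hash observed in one tool into another tool's blocklist."""
    # Pull context for the indicator from the (hypothetical) threat-intel platform.
    context = session.get(f"{TI_API}/indicators/{file_hash}", timeout=30).json()

    # Only propagate high-confidence indicators to avoid noisy, risky blocking.
    if context.get("confidence", 0) >= 80:
        resp = session.post(
            f"{EDR_API}/blocklist",
            json={"type": "sha256", "value": file_hash, "source": "ti-platform"},
            timeout=30,
        )
        resp.raise_for_status()
```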
As security teams and managers, it is our job not only to ensure the security of our organizations but also to conduct due diligence when looking to introduce new technologies into our production networks.
Therefore, a phased rollout, with vendor proofs of concept run in a development lab alongside all of your other security tools and agents, is definitely the right approach to reduce the risk to our production networks.