Cyber threat intelligence (CTI) practitioners have to deal with an increasing volume of cyber events and incidents, making it hard to keep track of threats.
For example, in its latest Data Breach Investigations Report (2023 DBIR), Verizon analyzed more than 16,000 security incidents and roughly 5,200 breaches over the past year. The US Cybersecurity and Infrastructure Security Agency (CISA) lists around 1000 known exploited vulnerabilities, to say nothing of the zero-days that are discovered – or maliciously exploited – on a weekly basis.
During Centripetal’s Cyber Threat Intelligence Summit on July 18, 2023, Daniel Grant, principal data scientist at GreyNoise Intelligence, argued that CTI practitioners can no longer do their jobs without any additional help: “There’s too much volume of information. You’re constantly looking for a needle in a haystack – and most of the time, that needle is camouflaged to look like a piece of hay.”
Many Threat Reports Lack Context
Andy Piazza, IBM X-Force’s global head of threat intelligence, highlighted during the summit another issue faced by CTI practitioners: the threat reports they write are not always helpful to defenders like detection engineers, incident responders and security operations center (SOC) analysts.
“While our reports are usually very well written, with amazing context at the report level, we’re often missing context around the indicators of compromise (IoCs),” Piazza lamented.
Read more: Threat Intelligence: Why Attributing Cyber-Attacks Matters
He pointed out that most cyber threat reports end with a static IoC table, with no further explanation and no links between different IoCs to allow the defenders to get a clear picture of how to find them and prevent – or remediate – an attack.
“Therefore, it’s not really actionable by the defenders, who spend long hours deciphering these reports. They, too, have to deal with an increasing amount of data,” he added.
Templates, STIX and MITRE ATT&CK
Piazza’s recommendations to improve threat reports at a lower cost for CTI practitioners include using repeatable templates and automation tools.
“Start with tags if you must and leverage automation tools to convert them to human-readable context so that you can provide valuable data and metadata with your IoCs.”
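The tag-to-context conversion Piazza describes can be sketched in a few lines. The tag vocabulary and description strings below are hypothetical examples for illustration, not a real tagging standard:

```python
# Minimal sketch: expanding machine tags attached to an IoC into
# human-readable context. The tag names and descriptions here are
# made-up examples, not an established taxonomy.
TAG_DESCRIPTIONS = {
    "c2": "Command-and-control server contacted by the malware",
    "phishing": "Used in a credential-phishing campaign",
    "cobalt-strike": "Associated with Cobalt Strike tooling",
}

def enrich_ioc(indicator: str, tags: list[str]) -> str:
    """Return the indicator followed by one context line per tag."""
    context = [TAG_DESCRIPTIONS.get(t, f"Unrecognized tag: {t}") for t in tags]
    return f"{indicator}\n  " + "\n  ".join(context)

print(enrich_ioc("198.51.100.7", ["c2", "cobalt-strike"]))
```

In practice, a lookup table like this could be maintained once per team and applied automatically to every IoC table before a report ships, so defenders see context rather than a bare list of values.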
He also suggested CTI practitioners should aim for an ideal state where they can leverage more advanced automation tools and standards frameworks like Structured Threat Information Expression (STIX), a standardized language for describing cyber threat information (XML-based in STIX 1.x, JSON-based in STIX 2.x) in a common format that humans and security technologies can easily understand.
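For a sense of what machine-readable IoC context looks like, here is an illustrative STIX 2.1 indicator object, built as a plain Python dictionary and serialized to JSON. All values (the UUID, IP address, and timestamps) are invented for the example:

```python
import json

# Illustrative STIX 2.1 indicator (all values are made up). STIX 2.x
# objects are JSON, unlike the XML serialization used by STIX 1.x.
indicator = {
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--d81f86b9-975b-4c0b-875e-810c5ad45a4f",
    "created": "2023-07-18T12:00:00.000Z",
    "modified": "2023-07-18T12:00:00.000Z",
    "name": "Suspected C2 IP address",
    "description": "IP observed receiving beacon traffic from the implant.",
    "pattern": "[ipv4-addr:value = '198.51.100.7']",
    "pattern_type": "stix",
    "valid_from": "2023-07-18T12:00:00.000Z",
}
print(json.dumps(indicator, indent=2))
```

Unlike a static IoC table at the end of a PDF, objects like this carry their own context (name, description, validity window) and can be linked to other STIX objects, which is the kind of defender-ready output Piazza argues for.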
“We’ve seen great progress at the tactics, techniques and procedures (TTPs) level with more and more threat analysts using the MITRE ATT&CK framework. Now we need to apply the same to IoCs. Some people tell me we should stop using IoCs altogether and move towards behavior-based mapping. I’d agree, but the reality is that 99% of organizations are not ready for that.”
LLMs to Enrich Threat Reports and Automate Outputs
Grant said generative AI could be a new, helpful addition to the CTI toolkit.
“AI algorithms have been used in cybersecurity since at least the 1990s: first in spam filters built on simple Bayesian models, then for malware detection in antivirus software, and later for anomaly detection in networks. Now, the main way the cybersecurity industry can use generative AI is to help enrich threat reports, providing more context and metadata while saving the practitioners time,” he said.
Speaking to Infosecurity, Jack Chapman, VP of threat intelligence at Egress, agreed. He added that LLMs can also be used as data pre-processing tools to simplify the detection scanning data “so the practitioners don’t have to work with binary code.”
Read more: Are GPT-Based Models the Right Fit for AI-Powered Cybersecurity?
They can also help automate tasks and provide the human- and machine-readable context that Piazza mentioned, as they can generate outputs in formats such as JSON, Grant indicated.
For all these reasons, Grant recommended that cybersecurity practitioners start playing with generative AI chatbots, trying out prompts and building a process in order to integrate them into their workflows later.
However, Grant warned that CTI analysts “should never trust the output, and always verify it. Also, always remember the knowledge cut-off of these models – it dates back to September 2021 for GPT-4 and won’t take anything newer into account when giving an output, unless you provide that information in the prompt.”
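Grant's "never trust, always verify" advice can be applied mechanically when an LLM is asked to emit JSON: reject any response that fails to parse or is missing expected fields before it enters the workflow. The field names below are an assumed in-house schema, purely for illustration:

```python
import json

# Fields we expect in an enriched-IoC record. This schema is a
# hypothetical example, not a standard.
REQUIRED_FIELDS = {"indicator", "type", "first_seen", "context"}

def parse_llm_output(raw: str) -> dict:
    """Parse a model's JSON response, raising instead of silently
    accepting malformed or incomplete output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"Model output missing fields: {sorted(missing)}")
    return data
```

This only checks structure, not truth; a record that parses cleanly can still contain hallucinated values, so analyst review of the content remains necessary.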