Adam Bangle, Vice-President, EMEA, BlackBerry
Today’s cybersecurity threats are incredibly smart and sophisticated. Security experts have to battle every day to discover new samples, identify which ones are malicious, label them as such, and then feed those labelled examples into cybersecurity algorithms.
To meet these constantly changing threats, the security industry has followed the example of wartime codebreakers by offloading these tasks to intelligence much greater than any human brain. At Bletchley Park, what turned the tide in the battle against the German Lorenz cipher was Colossus, the world’s first programmable, electronic, digital computer, a machine able to perform many more calculations, far faster and more accurately, than any team of skilled humans could. Today’s equivalent of Colossus is artificial intelligence (AI) and machine learning (ML), which are rapidly becoming the foundation of modern cybersecurity infrastructure.
The next generation of cybersecurity threats requires agile and intelligent programs that can rapidly adapt to new and unforeseen attacks. AI and ML’s ability to meet this challenge certainly hasn’t gone unnoticed by cybersecurity decision-makers, the vast majority of whom believe that AI is fundamental to the future of cybersecurity.
Well, little wonder. While ML and AI have only recently entered the public’s consciousness (and have only been widely considered a cornerstone of cybersecurity for a couple of years), these technologies boast a long pedigree. The term ‘artificial intelligence’ was coined way back in 1956, at a conference at Dartmouth College, New Hampshire, and AI was being applied to IT threat detection as early as 1995.
Where once security was manual and reactive, technologies that harness AI and ML are automated and predictive. This means the technology is not only capable of preventing known and unknown threats, but of predicting new threats before they are encountered in the real world.
As AI and ML technologies continue to mature, they are giving rise to new possibilities for cybersecurity threat protection. For instance, they allow us to automatically flag unusual patterns and enable detection of network problems and cyber-attacks in real-time.
These technologies recognize patterns in our environment and apply complex analytics to monitor, and therefore protect, networks and infrastructure at a scale far exceeding what is possible for a human. This visibility supplies deeper insights into the threat landscape, which in turn informs the ML. This means that AI-based security systems are constantly learning, adapting, and improving – just like human brains, only many orders of magnitude faster and smarter.
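To make that concrete, here is a minimal, hypothetical sketch of the kind of real-time flagging described above. It assumes nothing more than a stream of per-minute connection counts, and the rolling z-score rule is a deliberately simple stand-in for the far richer models production systems use.

```python
from collections import deque
import statistics

def flag_anomalies(stream, window=10, threshold=3.0):
    """Yield (value, is_anomalous) for each observation in a metric stream.

    A value is flagged when it lies more than `threshold` standard
    deviations from the rolling mean of the previous `window` points.
    """
    history = deque(maxlen=window)
    for value in stream:
        if len(history) == window:
            mean = statistics.mean(history)
            stdev = statistics.pstdev(history) or 1.0
            yield value, abs(value - mean) / stdev > threshold
        else:
            yield value, False        # not enough history to judge yet
        history.append(value)

# Example: per-minute connection counts with one obvious spike.
counts = [120, 118, 125, 130, 122, 119, 121, 124, 127, 123, 126, 900, 121]
for value, anomalous in flag_anomalies(counts):
    if anomalous:
        print(f"unusual traffic volume: {value} connections per minute")
```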
Powered by AI, the cybersecurity industry, which has long lagged behind the malevolent geniuses who keep developing new malware ever faster, finally has the tools to take the lead and stay there.
Dr David Day, Special Officer, NCA
You have now read about all the remarkable pioneering implementations of AI in cybersecurity, for ‘good’ and ‘bad.’ So let me now introduce you to the downright ugly face of AI in our beloved field. We have all heard the cliché before: “cybersecurity is an arms race.” Well, when it comes to AI, it really is. Our nemesis is moving quickly to weaponize AI against us, and here are just a few examples of how they are doing it.
Detecting vulnerabilities in source code: Open-source code has always been perceived as a double-edged sword from a cybersecurity perspective. On the one hand, its transparent nature allows robust security checking by an extensive community of open-source advocates, all keen to help ensure the application is secure. On the other hand, the bad guys can see it too, and if they spot a vulnerability in the code, they will keep quiet and exploit it: the so-called ‘zero-day’ attack. With AI, they now have the means to do this more quickly and easily. A recent academic paper from Beijing University proposes using machine learning to teach systems safe programming patterns by exposing them to many instances of known, mature, safe code. The learning process then derives rules for determining whether code is secure; if new code is checked against these rules and fails, we can be almost sure it is vulnerable. Imagine the bad guys feeding masses of code snippets from GitHub through these algorithms – not a pleasant thought, is it?
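To see the shape of the idea, here is a minimal, hypothetical sketch of that pattern-learning approach, assuming scikit-learn is available. The snippets and labels are invented purely for illustration; real research systems learn from huge corpora and much richer code representations (abstract syntax trees, data-flow graphs) rather than a bag of tokens.

```python
# Toy illustration: learn 'safe vs. risky' coding patterns from labelled
# snippets, then score unseen code. All snippets and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

snippets = [
    ("strncpy(dst, src, sizeof(dst) - 1);", 0),        # bounded copy
    ("strcpy(dst, src);", 1),                          # unbounded copy
    ('snprintf(buf, sizeof(buf), "%s", name);', 0),    # bounded format
    ('sprintf(buf, "%s", name);', 1),                  # unbounded format
    ("fgets(line, sizeof(line), fp);", 0),             # bounded read
    ("gets(line);", 1),                                # unbounded read
]
code, labels = zip(*snippets)

vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_]+")   # crude tokenizer
features = vectorizer.fit_transform(code)
model = LogisticRegression().fit(features, labels)

# Score code the model has never seen; higher means 'looks like risky code'.
unseen = [
    "strcpy(buffer, user_input);",
    'snprintf(buffer, sizeof(buffer), "%s", user_input);',
]
for line, score in zip(unseen, model.predict_proba(vectorizer.transform(unseen))[:, 1]):
    print(f"{score:.2f}  {line}")
```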
Kamikaze malware: One of our principal weapons against malware is the ability to reverse engineer it and figure out precisely what it is doing. The process relies on specialized tools, including disassemblers, network analyzers, debuggers and memory analysis tools. Obviously, though, nobody wants to execute malware in a production environment, so these analysis tools are usually bundled together into malware analysis sandboxes that isolate the analysis procedure from the operating system. In retaliation, malware developers include checks to see whether the malware is running in a sandboxed environment; if it is, it modifies its intended operation or deletes itself to keep us all guessing how it works, sneaky, eh? However, researchers know these tricks and hook into the malware, fooling it into thinking it is on a real system, touché, bad guys! But now the bad guys have AI: they can train the malware to recognize the patterns of virtualized environments, and when it detects that it is running in one, it will shut up shop – checkmate, the hackers win.
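As a rough, hypothetical illustration of how little it takes, here is a sketch of that kind of environment classifier seen from the analyst’s side, assuming scikit-learn is available. The features, values and labels are all invented; real evasive samples inspect far more signals than this.

```python
# Toy illustration: a classifier that tells analysis sandboxes apart from
# real hosts using a few coarse environment features. Values are invented.
from sklearn.tree import DecisionTreeClassifier

# Features per host: [cpu_cores, ram_gb, uptime_hours, installed_apps]
environments = [
    [1,  2,   0.2,  15],   # analysis sandbox
    [2,  2,   0.5,  20],   # analysis sandbox
    [1,  4,   1.0,  12],   # analysis sandbox
    [4,  8, 120.0,  95],   # real workstation
    [8, 16, 300.0, 180],   # real workstation
    [16, 32, 700.0, 240],  # real server
]
labels = ["sandbox", "sandbox", "sandbox", "real", "real", "real"]

clf = DecisionTreeClassifier(max_depth=2).fit(environments, labels)

# Two unseen hosts: a freshly booted, sparse VM and a long-running desktop.
print(clf.predict([[2, 2, 0.3, 18],
                   [8, 16, 200.0, 150]]))   # -> ['sandbox' 'real']
```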
IBM’s DeepLocker: This proof-of-concept AI malware was designed by IBM and first showcased at Black Hat USA in 2018. The malware is hidden inside benign software, such as an audio application, to avoid detection by security analysis and anti-virus tools. It is also fused with target attributes: only when those attributes are recognized is the malware unlocked and the payload activated. Target recognition uses an AI neural network that has been trained to detect traits of the target. Once the target is identified, the malicious payload, e.g. ransomware, is released. It brings to mind images of precision-guided smart missiles hitting their targets.
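The concealment trick is easier to see with a small, purely hypothetical sketch. This is not IBM’s code: DeepLocker reportedly derives the payload’s decryption key from the recognition model’s output, but even the simpler hash check below conveys the property that matters, namely that an analyst reading the sample can see the digest yet cannot recover who or what triggers it. The ‘payload’ here is a harmless placeholder string.

```python
import hashlib

# What ships inside the sample is not the target's identity, only a digest
# of the traits the recognition model must observe. Reading the sample
# reveals the digest but not the target. (Everything here is hypothetical.)
EXPECTED_DIGEST = hashlib.sha256(b"acme-corp|finance-dept|jdoe-laptop").hexdigest()

def recognized_traits() -> bytes:
    # Stand-in for the neural network's output: a canonical encoding of the
    # traits it recognized on the current machine.
    return b"acme-corp|finance-dept|jdoe-laptop"

def maybe_activate(payload_placeholder: str) -> None:
    if hashlib.sha256(recognized_traits()).hexdigest() == EXPECTED_DIGEST:
        print("target recognized:", payload_placeholder)   # payload would unlock here
    else:
        print("conditions not met; stay dormant")

maybe_activate("<harmless placeholder - no real payload>")
```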
The million-dollar question is: are such techniques in the wild now? The truth is, we don’t know for certain, but one thing is for sure: they are most definitely coming for us.
Shawn Riley, Chief Visionary Officer and Technical Advisor, DarkLight
If I had to point to just one bad thing about AI in cybersecurity, it would have to be our understanding of the different types of AI being used. The biggest problem is that people think we only use machine learning-based AI: 85% of the people I engage with don’t understand that other fields of AI are used in cybersecurity, and they often struggle to understand how those fields differ.
Most people are familiar with the AI that is good at learning and describing patterns in data, but we also have AI that is good at understanding and explaining information and knowledge. Hybrid AI combines approaches from both learning and understanding.
As Forrester said in its 2019 AI and automation predictions: “Machine learning’s strength is data. Knowledge engineering’s strength is human wisdom.” Let’s take a closer look at these two different fields of AI used in cybersecurity today.
Non-symbolic AI is the field of AI from data science that gives us machine learning and deep learning. Machine learning algorithms use statistics to find patterns in massive amounts of data. Deep learning is machine learning in overdrive: it uses a technique that gives machines an enhanced ability to find – and amplify – even the smallest patterns.
Non-symbolic AI focuses on learning the patterns in the data and describing those patterns with generalizations. Non-symbolic AI is also the field that suffers from ‘black box’ approaches, but work is being done to address explainability.
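A toy example makes both points, the learning and the opacity. The sketch below assumes scikit-learn and uses a handful of invented phishing-URL features; a small neural network picks up the pattern and generalizes to an unseen input, but its learned weights are just numbers, not reasons.

```python
# Toy illustration of non-symbolic AI: a small neural network learns a
# pattern (crude, invented phishing-URL features) from labelled examples.
# It generalizes to unseen inputs, but its weights are numbers, not reasons:
# the 'black box' problem mentioned above.
from sklearn.neural_network import MLPClassifier

# Features per URL: [length, digit_count, hyphen_count, uses_https]
urls = [
    [20,  0, 0, 1],   # short, clean, HTTPS        -> benign (0)
    [25,  1, 0, 1],
    [18,  0, 1, 1],
    [70, 12, 5, 0],   # long, digit-heavy, no TLS  -> phishing (1)
    [85, 20, 7, 0],
    [66,  9, 4, 0],
]
labels = [0, 0, 0, 1, 1, 1]

model = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                      max_iter=2000, random_state=0).fit(urls, labels)

print(model.predict([[75, 15, 6, 0]]))   # likely [1]: the pattern was learned
print(model.coefs_[0].shape)             # (4, 8) learned weights, no explanation
```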
Symbolic AI is the field of AI from knowledge engineering that gives us knowledge graphs, knowledge-based systems and expert systems.
Symbolic communication in humans can be defined as the rule-governed use of a system of arbitrary symbols whose definition and usage are agreed upon by the community of users. Being able to communicate in symbols is one of the main things that make us (humans) intelligent. Therefore, symbols have also played a crucial role in the creation of symbolic (human understandable) AI since the 1950s.
Symbolic AI focuses on having a shared understanding of the symbols used by humans and being able to logically reason over the symbols based on that understanding.
In symbolic AI, semantic technology encodes meanings separately from data and content files, and separately from application code. Symbolic AI leverages semantic technologies to provide an abstraction layer above existing IT, enabling data, content and processes to be bridged and interconnected, which allows symbolic AI to work across information silos.
Given a question, symbolic AI can directly search topics, concepts and associations that span a vast number of sources thanks to the integrated semantic technologies. Symbolic AI is ‘white box’, which means results are 100% transparent and explainable.
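A minimal, self-contained sketch shows what that transparency looks like in practice. The facts, the rule and the threat-intelligence feed below are all invented for illustration; real deployments use ontologies, knowledge graphs and standards such as OWL and SPARQL, but the principle is the same: every conclusion can be traced back to the exact facts and rule that produced it.

```python
# Toy illustration of symbolic AI: facts as subject-predicate-object triples,
# one hand-written rule, and a conclusion that carries its own explanation.
facts = {
    ("host-42", "communicates_with", "203.0.113.7"),
    ("203.0.113.7", "listed_in", "threat-intel-feed"),
    ("host-42", "located_in", "finance-vlan"),
}

def infer_suspicious_hosts(facts):
    """Rule: a host that communicates with an address listed in a threat
    intelligence feed is suspicious - and we can say exactly why."""
    conclusions = []
    for (subject, predicate, obj) in facts:
        if predicate == "communicates_with" and \
           (obj, "listed_in", "threat-intel-feed") in facts:
            conclusions.append({
                "conclusion": (subject, "is", "suspicious"),
                "because": [
                    (subject, "communicates_with", obj),
                    (obj, "listed_in", "threat-intel-feed"),
                ],
            })
    return conclusions

for result in infer_suspicious_hosts(facts):
    print(result["conclusion"], "because", result["because"])
```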
As a community, we need to better understand the strengths of the different types of AI used in cybersecurity. When companies use definitions like ‘AI is computer programming that learns and adapts’, a definition that applies only to machine learning, the people investing in cybersecurity may not realize that other forms of AI can also be used in cybersecurity.