From Siri to Alexa to Watson, we are living in an AI world; it understands when we ask it to play our favorite song; it knows what book we will want to read next. AI can recognize the face of an individual, and the distinctive look of a cancer cell. It understands concepts and context. It can write a song that sounds like the Beatles.
From the weather to the shop floor to robotic surgery, AI’s massive processing power is far faster than the human brain in computational ability, and is progressing in conventionally “human” areas like strategic thinking, judgement and inference.
What if the right AI technology falls into the wrong hands? Are we right to be frightened? Leading thinkers like Stephen Hawking, Elon Musk and Bill Gates have declared their deep fear that AI – created by humans – could end up controlling us.
I can tell you that all the world-changing gifts of AI – provided in large part by open collaboration on its development – can be exploited by the criminal hacking community. These black hats can do it whilst enjoying the same access as the white hatters – those committed to the values of open source.
So yes: AI can easily be twisted to become a cybercriminal’s most valuable ally – leveraging its very “intelligence” and self-sufficiency as a partner-in-crime unlike any ever seen before. So how can we ensure that AI applications are used in ways that benefit individuals and society? How can we tell whether AI software underlying many of the platforms we utilize aims to emulate the good guy’s brain… or the criminal mind – or whether it started off well-intentioned and was perverted along the way?
The Basic Characteristics of AI
Let’s start with a short primer on the fundamental nature of AI as it relates to the world of cybersecurity. On the simplest level, there are two predominant types of applications: supervised AI and unsupervised AI.
Supervised AI applications are those that are pre-programmed offline, through a fully managed process, and then released to do their job. These applications are typically used for the automatic identification of certain images and data such as faces in photos, certain kinds of structured and unstructured text, and context, via training sets.
They are trained by exposure to large, relevant data sets that allow them to generalize their classification capabilities. Siri and Alexa, for example, use Natural Language Processing (NLP) and speech recognition (speech-to-text translation applications), while Watson uses mainly NLP to answer questions. These applications get smarter and smarter all the time – which is precisely what makes Hawking et. al. so nervous.
Unsupervised AI applications are generally also pre-trained, but can learn from the environment and adapt accordingly, creating new patterns on the fly. These applications study and process their environment to create new classes of behaviors. They then adapt, independently, to better execute various decision-making functions, mirroring human thinking patterns and neural structures. Some examples include applications able to learn an individual’s text message or email style, browsing behavior and interests. Facebook and Google employ this approach to study user behaviors and adjust their results (and adverts) accordingly.
When Things Get Ugly: AI’s Malicious Potential
Both AI applications – supervised and unsupervised – can be used maliciously in different ways.
Actors looking to do harm may use supervised AI applications to target confidential or embarrassing data, or any data that can be used for ransomware. Imagine – a phone is infected with malware that is “trained” to identify and retrieve potentially compromising texts, photos or voice messages.
Unsupervised AI applications can do this too, and can also mimic human behaviors. For example, an unsupervised AI application could imitate a manager’s style of writing, and use it to direct one of his/her employees to download malware files, make a shady transaction, or relay confidential information.
The risks and dangers are enormous. Nevertheless, the security industry is not adequately discussing how to prevent AI from being abused by hackers. When Bill Gates and Stephen Hawking warned about AI turning on humans, they were not talking about the risk of cybercriminals helping to “set AI free” – but perhaps they should be.
One such attack was carried out in Israel, in which Waze was hacked to report fake traffic jams. If an unsupervised AI application were to perpetuate these kinds of bogus reports unabated, we’d see real-life chaos on the roads.
Six Principles for Preventing the Abuse of AI
Always map AI objects in your environment – Know where all AI software objects and applications exist in your environment (which servers, end-points, databases, equipment/accessories, etc.). This is not a trivial task; it forms the basis for all other methods of preventing AI from being abused by cybercriminals. To do this, we need new methods of analyzing code and pinpointing mathematical evidence of AI that we would not find in regular code (regression formulas, usage of specific libraries of optimized linear algebra, etc.).
Do an AI vulnerability assessment – Currently, there are various types of tools and procedures that help evaluate possible vulnerabilities or weaknesses in a product or code. But these tools and procedures have not yet been adapted to the era of AI code and solutions, and they should be. These tools would begin, for example, by looking for basic API flaws, with the goal of finding holes within the AI application that can potentially be exploited.
Understand intent – Utilizing any software with AI should be accompanied by a thorough understanding of the capabilities of that AI. The AI application vendor should provide an AI spec that describes the potential of the AI code, listing, for example, whether it can adapt, and the types of data it can classify (i.e. voice or text or traffic classification). This will help indicate the risks associated with the AI application, and give a sense of what can potentially happen if the AI is used in the wrong way. If the vendor does not provide this spec, the AI software should not be deployed.
Identify data content type – After understanding what AI objects are in our environment and what their capabilities are, we should then determine what type of data exists virtually “nearby” these AI objects – meaning what data is in the AI’s reach. This knowledge should enable us to begin assessing which data it can reach to, and the necessity of the AI object in that environment.
Monitor APIs – Establish control over the API activity in your environment by identifying the type of APIs with which the AI object can potentially integrate. In principle, API activities should be logged and can even be integrated to send out alerts. If, for example, Siri was used to convert speech into an email or text messages, or if an app used the device graphics processing unit (GPU) to retrieve data, you should be alerted to it, or at least have a record of it.
Train employees – Companies need to teach their employees about AI chicanery. Just when people are learning not to open spear phishing emails, along comes AI which can operate at new levels of deceit. Imagine you’re having an email exchange with someone, and they write that they will send you a resume. A half-hour later it shows up. Who wouldn’t open that? Who would suspect it was a malicious chatbot?
The AI revolution holds both great promise, and potentially, great peril. Like all new technologies, it is vulnerable to exploitation by the unscrupulous. That is why AI cybersecurity – developed and deployed by those deeply cognizant of our unnatural neural foe – is a key imperative for the 21st century.