Organizations will never be able to trust artificial intelligence (AI) for cybersecurity decision-making until they can address data bias and inaccuracy, along with challenges around transparency and validation, according to a leading expert.
Speaking at Infosecurity Europe today, Titania chief strategy officer Nicola Whiting argued that bad data will always lead to poor decision-making in machine learning systems.
This has been seen in the past with Microsoft’s ill-fated Tay AI system, whose Twitter account began spouting racist, sexist and neo-Nazi epithets after “learning” from other social media users.
It can also be witnessed in Amazon’s aborted attempt to use AI in recruitment. A four-year project was shelved last year after managers discovered that it had been learning from biased data favoring male candidates.
Such failures become far more serious when these systems are used in sensitive areas like criminal justice. Questions have been raised in both the US and UK about systems trained on historical data, which bakes the same conscious and unconscious human biases over matters of race into their decisions.
Thousands of leading AI and robotics experts have also signaled their opposition to attempts to develop autonomous weapons systems.
“If the experts are saying AI is not good enough yet when lives are on the line, how can it be good enough to make decisions on our networks?” argued Whiting.
Part of the problem also lies with the type of data AI systems are being fed, she added.
When used in SIEM (security information and event management) systems, AI is typically working with probabilistic data, which extrapolates information from how devices respond to attacks or queries and “makes an educated guess” about risk. Using AI in this context effectively layers a “guess on top of a guess,” Whiting warned.
To support effective SOAR (Security Orchestration, Automation and Response) systems, AI instead needs to be fed deterministic data, where risks are determined from well-defined parameters such as device configurations, she explained.
It can then be harnessed to drive systems that are self-defending, adaptive and self-healing, meaning they’ll reconfigure themselves according to best practices and standards.
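To illustrate the distinction, a deterministic check evaluates a known configuration against well-defined rules rather than inferring risk from observed behavior. The sketch below is purely illustrative and assumes hypothetical configuration fields and baseline values; it is not drawn from Titania’s products or Whiting’s talk.

```python
# Illustrative sketch only: a deterministic configuration check.
# The device fields and baseline values below are hypothetical assumptions,
# not taken from any vendor's product or from Whiting's presentation.

BASELINE = {
    "ssh_version": 2,           # require SSH protocol version 2
    "telnet_enabled": False,    # insecure management protocol must be disabled
    "min_password_length": 12,  # minimum credential length
}

def assess(device_config: dict) -> list[str]:
    """Return a list of deviations from the baseline (empty list = compliant)."""
    findings = []
    if device_config.get("ssh_version", 0) < BASELINE["ssh_version"]:
        findings.append("SSH protocol below required version")
    if device_config.get("telnet_enabled", True) != BASELINE["telnet_enabled"]:
        findings.append("Telnet is enabled; disable it")
    if device_config.get("min_password_length", 0) < BASELINE["min_password_length"]:
        findings.append("Password policy weaker than baseline")
    return findings

if __name__ == "__main__":
    # A hypothetical router configuration pulled from an inventory system.
    router = {"ssh_version": 2, "telnet_enabled": True, "min_password_length": 8}
    for finding in assess(router):
        print(finding)  # each finding maps to a well-defined remediation step
```

Because every finding here follows from a stated rule rather than a statistical inference, the output can be audited and, in principle, acted on automatically, which is the self-healing behavior described above.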
This will take a lot of pressure off stretched IT security teams, enabling them to focus more fully on reviewing probabilistic data, and drive better decision-making overall, Whiting claimed.
The path towards trustworthy, effective AI lies not only in better understanding bias and focusing on deterministic data and decision-making, but also in being able to validate decision processes, data types and data integrity.
Current proprietary systems make that nearly impossible. Trusting such systems is akin to sending your child to university without knowing what course they’re studying or whether the professor is even qualified, Whiting argued.
“The problem with our trust in AI is that we can’t always trust the data is accurate and unbiased; we can’t always access how it thinks; and we can’t validate it either,” she concluded. “Unless we can fix this, I don’t think we’ll ever be able to trust AI.”