The future is uncharted and unknown, and films and TV shows have long taken shots in the dark at what it may entail. A great example of this is the 2002 film Minority Report: from self-driving cars, to targeted advertising and virtual reality, it made some fairly spot-on guesses about what our current day holds.
The important one for us, however, is predictive policing. In 2019 this doesn't involve three people floating in a pool, but instead a group of analysts sitting around a computer screen.
With more than 90% of the world's data created in just the past two years, it's now easier than ever to build profiles of people online, with or without their consent.
Predictive policing can be broken down into two main areas: predicting an individual and predicting large groups of people. When it comes to predicting an individual, predictive policing can be seen as a game of 20 questions, where each answer takes us down a different route towards whether the person being profiled is likely to commit a crime.
When it comes to predicting people, there are three main criminology theories that can help us understand the causes of, and blockers to, a possible crime.
- Strain Theory - “Strain theory states that society puts pressure on individuals to achieve socially accepted goals (such as the American dream), though they lack the means.”
- Social Control Theory - “Social control theory proposes that people's relationships, commitments, values, norms, and beliefs encourage them not to break the law.”
- Social Disorganization Theory - "The theory directly links crime rates to neighbourhood characteristics: a person's residential location is a substantial factor shaping the likelihood that that person will become involved in illegal activities."
The question still stands: how can we use natural language processing to derive motive from text and, in turn, profile possible malicious actors? It comes down to understanding what natural language processing is all about. As human beings we understand different nuances in text; this comes from experience with a plethora of different sources of information.
These sources range from books, to TV shows, to the internet, and it works in much the same way for machines. Looking at the below excerpt about London on Wikipedia, we can see how an NLP program might answer the question "Paris - France + England = _____".
“London is the capital and largest city of the United Kingdom. Standing on the River Thames in south eastern England, at the head of its 50-mile estuary leading to the North Sea, London has been a major settlement for two millennia. London is often considered as the world's leading global city ”
It comes down to understanding the surrounding context between these key words. In the above example the answer would be 'London', as Paris is to France what London is to England.
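The analogy above can be sketched with a few lines of Python. Real NLP models learn word vectors from huge text corpora; the hand-picked three-dimensional vectors below are purely illustrative assumptions, chosen only to show how the "Paris - France + England" arithmetic works.

```python
import math

# Toy word vectors. The dimensions roughly encode:
# (is-a-capital, France-ness, England-ness). Real embeddings
# have hundreds of dimensions learned from text, not hand-picked values.
vectors = {
    "Paris":   [0.9, 0.8, 0.1],
    "France":  [0.1, 0.9, 0.1],
    "England": [0.1, 0.1, 0.9],
    "London":  [0.9, 0.1, 0.8],
    "Thames":  [0.2, 0.1, 0.7],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, minus, plus):
    """Return the word whose vector is closest to vector(a) - vector(minus) + vector(plus)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[minus], vectors[plus])]
    candidates = [w for w in vectors if w not in (a, minus, plus)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("Paris", "France", "England"))  # London
```

Subtracting "France" strips out the France-ness of "Paris", adding "England" puts England-ness in its place, and the nearest remaining vector is "London".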
Sentiment analysis is a subset of this: it comes down to understanding the sentiment, or emotions, behind text. Normally this is done by looking at whether text has a positive or negative sentiment. What does that mean? At a high level we can break human emotions down into eight key areas, each of them either positive or negative. These break down to:
- Positive: Joy, Trust, Surprise, and Anticipation
- Negative: Sadness, Fear, Disgust, and Anger
Sentiment analysis derives the emotion behind text in a similar way to most machine learning approaches: by using both a training and a testing data set. This data comprises a string, for example a restaurant review, and a predetermined sentiment for that review. For example:
- “I love my local pizza restaurant!”, Positive
- “The food arrived quickly and I am very pleased with the quality.”, Positive
- “This place has gone downhill since it started, very disappointed.”, Negative
The data is then broken down into the training and test sets, and the analyzer is trained on which words are more common for each sentiment. Finally, the testing data is used to evaluate the analyzer: it predicts the sentiment of each string, and the prediction is compared with its known sentiment.
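The train-then-test loop described above can be sketched as a minimal bag-of-words classifier. A real analyzer would be trained on thousands of labelled examples; here the training set is just the three reviews from the article, and the test string is an invented example, so this is only an illustration of the mechanics.

```python
import re
from collections import Counter

# Training data: (review, known sentiment) pairs from the article.
training_data = [
    ("I love my local pizza restaurant!", "Positive"),
    ("The food arrived quickly and I am very pleased with the quality.", "Positive"),
    ("This place has gone downhill since it started, very disappointed.", "Negative"),
]

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

# Training: count how often each word appears under each sentiment.
counts = {"Positive": Counter(), "Negative": Counter()}
for text, label in training_data:
    counts[label].update(tokenize(text))

def predict(text):
    """Score a string by how many of its words were seen under each sentiment."""
    scores = {
        label: sum(counter[word] for word in tokenize(text))
        for label, counter in counts.items()
    }
    return max(scores, key=scores.get)

# Testing: compare the analyzer's prediction against the known sentiment.
test_data = [("The quality of the pizza is great, I am very pleased.", "Positive")]
for text, expected in test_data:
    print(predict(text) == expected)  # True
```

The test string shares words like "quality" and "pleased" with the positive training reviews, so it scores higher for Positive than Negative.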
Looking at technology trends, we can see a myriad of ways that natural language processing has been implemented: from AWS using it to analyze health information, to Microsoft's Tay.ai bot, to predictive text in mobile keyboards. So how can we marry the concepts of predictive policing with sentiment analysis and natural language processing?
We can use a framework, broken down into six main areas, to look at how sentiment analysis can form part of a predictive policing approach. The framework template below assumes three entities are in play: the individual (the person being profiled), the analyst (the person or group assessing the impact of a crime or attack that may occur), and the victim (the person or group that would be the target of a potential crime or attack).
A six-module framework template:
- Impact Scope
- Understanding what the analyst is protecting, as the impact of an attack will depend on what is being targeted.
- Societal Pressures
- In the sampled text has the individual mentioned a goal or aspiration? Does this text have a positive or negative sentiment?
- Relationship Anchors
- In the sampled text has the individual mentioned any close relationships? Does this text have a positive or negative sentiment?
- Location Identifiers
- In the sampled text has the individual mentioned their location? Is the location known for the behavior being profiled? Does this text have a positive or negative sentiment?
- Overall Individual Risk
- Risk = Likelihood * Impact.
- The overall risk for the individual can be calculated by assigning a weight to each of the above modules answered with a 'yes' and multiplying the sum by the impact of an attack.
- Naming Conventions
- Unlike with typical malicious actors, it is important to have a clear naming convention for individuals that conveys only specific pertinent information - for example, including the risk score but not the individual's name or location.
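The Overall Individual Risk module above can be sketched in a few lines. The module names and weights below are illustrative assumptions, not values prescribed by the framework; the only part taken from the text is the formula Risk = Likelihood * Impact, with likelihood built from the yes/no module answers.

```python
# Assumed weights for the three yes/no profiling modules. How much each
# module should contribute is an analyst's judgment call, not a fixed value.
MODULE_WEIGHTS = {
    "societal_pressures":   0.4,  # negative sentiment around goals/aspirations
    "relationship_anchors": 0.3,  # negative sentiment around close relationships
    "location_identifiers": 0.3,  # location linked to the profiled behaviour
}

def overall_risk(module_answers, impact):
    """Risk = Likelihood * Impact.

    Likelihood is the sum of the weights for every module answered 'yes'
    (0.0 to 1.0); impact is the analyst's rating from the Impact Scope
    module (here assumed to be on a 1-10 scale).
    """
    likelihood = sum(
        weight for module, weight in MODULE_WEIGHTS.items()
        if module_answers.get(module)
    )
    return round(likelihood * impact, 2)

# Example: two of three modules answered 'yes' against a high-impact target.
risk = overall_risk(
    {"societal_pressures": True,
     "relationship_anchors": False,
     "location_identifiers": True},
    impact=8,
)
print(risk)  # (0.4 + 0.3) * 8 = 5.6
```

Keeping the weights in one table makes the analyst's assumptions explicit and easy to revise as the framework is tuned.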
Finally, there is a quote from Graham Greene: "Human nature is not black and white, but instead is multiple shades of grey." This summarizes why the above framework both works and doesn't work: quite frankly, humans are complex, and while many of these behaviors can be profiled, many of them can't be.
James Stevenson is a computer security graduate and a software engineer for BT Security. He is also a speaker at security conferences on topics ranging from Offender Profiling to Getting Into The Industry.