Under the GDPR, which came into force in May 2018, individuals have enhanced rights to request and access the personal data that organizations hold about them. The California state legislature recently passed the California Consumer Privacy Act of 2018 (Bill AB375), which gives Californians data rights similar to those under the GDPR.
These data access requests could result in millions of dollars in operational costs for organizations that already face unprecedented fines in case of a security breach.
Data Subject Access Request (DSAR)
Organizations that control or process EU citizens' personal data are now obliged to provide ALL information about a client or employee upon request. In 2020, the same will apply to US companies subject to AB375.
According to the UK's Information Commissioner's Office, the response should also specify the purpose of the data processing, disclose whether a third party has access to the data, and specify the safeguards applied to it. Furthermore, the GDPR sets a new bar for handling these requests: the response must be given free of charge and within a time window of 30 days.
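To make the time window concrete, here is a minimal sketch of deadline tracking. It assumes the 30-day figure quoted above, counted in calendar days from the day the request is received; the function names are hypothetical and the counting rule should be checked against the regulation before relying on it.

```python
from datetime import date, timedelta

# Assumption: the 30-day window quoted above, counted in calendar days
# starting from the day the request is received.
RESPONSE_WINDOW_DAYS = 30


def dsar_due_date(received: date) -> date:
    """Return the last day on which the response can still be sent."""
    return received + timedelta(days=RESPONSE_WINDOW_DAYS)


def days_remaining(received: date, today: date) -> int:
    """Days left before the deadline; a negative value means overdue."""
    return (dsar_due_date(received) - today).days


if __name__ == "__main__":
    received = date(2018, 6, 1)
    print(dsar_due_date(received))
    print(days_remaining(received, date(2018, 6, 21)))
```

A request that takes three to four weeks to assemble leaves very little slack inside this window.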
Collaboration is inaccurate and time-consuming
Some organizations treat the DSAR as a legal or compliance issue that can be handled by request processing documentation and internal collaboration. The idea is rather simple: upon a new request, the compliance team emails HR, finance, sales, marketing and other relevant departments, asking them to search for John Doe's information.
Sounds good, right? Yet there are two problems. First, organizations fail to use collaboration tools even for productivity, and using collaboration for compliance is an even bigger challenge.
Second, internal departments are adopting new technologies independently. Gartner studies have found that shadow IT (technology outside the control of the IT organization) accounts for 30-40% of IT spending.
That means that every department uses several, often disparate, data sources. At Cognigo, we found that in organizations of fewer than 10,000 employees, it takes around three to four weeks on average to complete a DSAR through collaboration. That results in around $10,000 worth of man-hours spent. This number soars in larger organizations that work with complex data and need to respond within the 30-day window.
Our studies show that in the UK, for instance, the number of DSARs has tripled since May 2018, driven especially by former employees who request the deletion of their personal data. This rapid growth in requests could result in costs reaching millions of dollars for medium and large enterprises.
Traditional e-Discovery tools are ineffective
Many organizations today have implemented either data discovery or e-Discovery tools: e-Discovery tools are used when data is required as legal evidence; data discovery is often used for compliance (for instance, in complying with ISO 27001 guidelines) and for data security. While these tools provide a more comprehensive and automated approach, they are quite ineffective with regard to DSARs.
First, most tools specialize in very specific data repositories: some work only with file repositories, while others work only with databases and applications. Second, the GDPR's definition of personal data is very broad, covering any information that could directly or indirectly identify a natural person. For instance, a former employee’s name is considered to be personal information under the GDPR.
Traditional discovery tools focus on pattern matching within the content, so the organization has to rescan all of its data repositories for each new DSAR.
Data Mapping and Artificial Intelligence may be the answer
The solution for responding effectively to data subject access requests may lie in two emerging technologies: AI and Big Data. Behind those buzzwords, deep learning can now understand language in a way similar to how a human being does, and at scale. That brings us closer to identifying GDPR data across the organization and indexing it for potential requests. In other words, organizations need to take a proactive approach rather than a reactive approach.
The reactive approach is based on pattern matching throughout the data once a DSAR is received. For instance, upon each new data access request, the individual's name is searched for across all of the organization's data.
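The reactive approach can be sketched roughly as follows. This is an illustration only, not any vendor's implementation: the in-memory `repositories` dictionary, the sample documents and the `reactive_search` function are all hypothetical stand-ins for real file shares, databases and applications.

```python
from typing import Dict, List


def reactive_search(name: str, repositories: Dict[str, Dict[str, str]]) -> List[str]:
    """Scan every document in every repository for a literal name match."""
    needle = name.lower()
    hits = []
    for repo_name, documents in repositories.items():
        for doc_id, text in documents.items():
            # Plain content matching: no notion of context or name variants.
            if needle in text.lower():
                hits.append(f"{repo_name}/{doc_id}")
    return hits


# Hypothetical sample data standing in for departmental data sources.
repositories = {
    "hr": {
        "contract.txt": "Employment contract for John Doe, start date 2015.",
        "policy.txt": "General leave policy.",
    },
    "sales": {
        "crm_note.txt": "Call with J. Doe about renewal.",
    },
}

if __name__ == "__main__":
    print(reactive_search("John Doe", repositories))
```

Every new request repeats the full pass over all repositories, and in this toy data the literal match finds the HR contract but misses the sales note that refers to "J. Doe".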
In the proactive approach, AI-driven algorithms first extract personal identifiers from documents based on context. These identifiers are later stored in an optimized database for future responses. The ability to focus on context rather than content can also provide insights regarding the purpose of data processing as required by the GDPR.
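A minimal sketch of the proactive idea, under loud assumptions: the context-aware extraction model is replaced here by a trivial email regular expression, and the "optimized database" by a plain in-memory dictionary. The names `extract_identifiers`, `build_index` and `lookup` are hypothetical; only the shape of the workflow (extract once, look up per request) reflects the text above.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Stand-in for the AI-driven extractor: a simple email pattern.
# A real system would infer identifiers from context, not a fixed pattern.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_identifiers(text: str) -> Set[str]:
    """Hypothetical extractor returning normalized personal identifiers."""
    return {match.lower() for match in EMAIL_PATTERN.findall(text)}


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Scan the documents once and map each identifier to its documents."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for identifier in extract_identifiers(text):
            index[identifier].add(doc_id)
    return dict(index)


def lookup(index: Dict[str, Set[str]], identifier: str) -> List[str]:
    """Answer a request from the index, without rescanning any document."""
    return sorted(index.get(identifier.lower(), set()))


# Hypothetical sample data.
documents = {
    "hr/contract.txt": "Contract for John Doe (john.doe@example.com).",
    "sales/crm_note.txt": "Renewal call, follow up with John.Doe@example.com next week.",
    "hr/policy.txt": "General leave policy.",
}

if __name__ == "__main__":
    index = build_index(documents)
    print(lookup(index, "john.doe@example.com"))
```

The expensive scan happens once, when the index is built; each later request is a dictionary lookup.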
Cognigo's research shows that organizations that use the proactive approach can handle a DSAR for less than $10 per request. In other words, extracting the result before the query is performed is probably the most cost-efficient way to respond to data access requests.
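As a back-of-the-envelope comparison, the two per-request figures quoted in this article can be multiplied by a yearly request volume. The volume of 200 requests is a hypothetical number chosen for illustration, $10 is treated as an upper bound, and the sketch ignores the up-front cost of building the index.

```python
# Figures quoted in this article (USD per request).
COLLABORATION_COST_PER_REQUEST = 10_000  # manual, collaboration-based handling
PROACTIVE_COST_PER_REQUEST = 10          # quoted upper bound for the proactive approach


def annual_cost(requests_per_year: int, cost_per_request: int) -> int:
    """Total yearly handling cost, assuming a flat cost per request."""
    return requests_per_year * cost_per_request


if __name__ == "__main__":
    requests_per_year = 200  # hypothetical volume, not a figure from the article
    print(annual_cost(requests_per_year, COLLABORATION_COST_PER_REQUEST))
    print(annual_cost(requests_per_year, PROACTIVE_COST_PER_REQUEST))
```

At that hypothetical volume, the collaboration-based figure reaches $2,000,000 a year, against at most $2,000 for the proactive approach.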