Data insecurity is becoming costly for UK businesses. ICO fines, which reached record highs in 2021, have added to existing pressure from consumers who expect their sensitive information to be protected. As a result, many organizations have invested in new processes to safeguard their ‘crown jewels.’ However, a dark part of their databases lingers beneath the surface and might come back to haunt them.
‘Dark data’ is information that is forgotten, ignored or unused – often as a result of a user’s daily digital interactions. This could include employee records, financial transaction logs or confidential emails. This information can be anywhere – spread across all areas of an organization and a myriad of data repositories, from data lakes to applications. The unknown nature of dark data makes it difficult to protect and legislate for, creating security risks if bad actors get their hands on it.
With data being produced in such large volumes and at such a rapid pace, organizations find it extremely challenging to quantify their dark data, and more than half of an organization’s data may be unavailable for analysis. Additionally, the volume of unstructured data is rising at a staggering rate of 55-65% per annum. Put another way, 1.7 MB of data is created for each of the 7.3 billion people on our planet every minute of every day.
This explosion means that by 2025 there will be roughly 163 trillion gigabytes (163 zettabytes) of data globally, 80% of which will be unstructured, and 90% of that unstructured data will never be analyzed or used in regular business activities. This is despite compulsory regional data standards, the potential business value of the data and the cost of storing it. These figures show the true scale of the task at hand, and organizations are at huge risk if they don’t take measures to identify, store and secure their data.
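To make the proportions above concrete, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: the three input figures are the ones quoted in this article, while the function name and the use of decimal units (1 zettabyte = 10^12 gigabytes) are assumptions made for illustration.

```python
# Figures quoted in the article; decimal unit conversion is an assumption.
TOTAL_GB = 163e12             # "163 trillion gigabytes" expected by 2025
UNSTRUCTURED_SHARE = 0.80     # share of all data that is unstructured
NEVER_ANALYZED_SHARE = 0.90   # share of unstructured data never analyzed

GB_PER_ZB = 1e12  # 1 zettabyte = 10^21 bytes = 10^12 gigabytes


def dark_data_estimate(total_gb, unstructured, never_analyzed):
    """Return (unstructured ZB, never-analyzed ZB, never-analyzed share of all data)."""
    unstructured_zb = total_gb * unstructured / GB_PER_ZB
    never_analyzed_zb = unstructured_zb * never_analyzed
    share_of_total = unstructured * never_analyzed
    return unstructured_zb, never_analyzed_zb, share_of_total


unstructured_zb, dark_zb, share = dark_data_estimate(
    TOTAL_GB, UNSTRUCTURED_SHARE, NEVER_ANALYZED_SHARE
)
print(f"Unstructured: {unstructured_zb:.1f} ZB")
print(f"Never analyzed: {dark_zb:.1f} ZB ({share:.0%} of all data)")
```

On these figures, roughly 117 zettabytes, or about 72% of all data, would sit unanalyzed.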
Shining a Light on Dark Data
Because organizations need to continually store more data, they inevitably create more dark data. Therefore, they must protect all of their data from bad actors while also making it available for auditors. The first and most important step in this process is discovering data and establishing what is sensitive and exposed. By having the ability to discover and classify dark data, an organization is far better placed to leverage this previously unknown information for decision-making. To accomplish this, security teams need to know where sensitive dark data resides, who accesses it, when there is suspicious activity and when abuse occurs so they can take immediate action.
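As a rough illustration of what discovery and classification can look like in practice, the sketch below scans a handful of text records for patterns that suggest sensitive content. Everything in it is an assumption made for illustration: the pattern set, the function names and the sample records are hypothetical, and the regular expressions are deliberately simplified. Commercial discovery tools go much further, with validation, context analysis and support for many more data types.

```python
import re
from collections import Counter

# Illustrative, simplified patterns only; not a complete or validated classifier.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "uk_ni_number": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}


def classify_text(text):
    """Count matches per sensitive-data category in one piece of text."""
    hits = Counter()
    for label, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = len(found)
    return hits


def scan_records(records):
    """Map each record name to the sensitive categories found in it."""
    report = {}
    for name, text in records.items():
        hits = classify_text(text)
        if hits:
            report[name] = dict(hits)
    return report


# Hypothetical sample data standing in for files found in a repository.
sample = {
    "old_export.csv": "jane.doe@example.com,4111 1111 1111 1111",
    "notes.txt": "Meeting moved to Tuesday.",
    "hr_archive.txt": "NI number: AB123456C, contact bob@example.org",
}

for name, categories in scan_records(sample).items():
    print(name, categories)
```

Records with no matches are left out of the report, so the output doubles as a shortlist of locations that deserve a closer look.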
There are two main approaches when it comes to assessing dark data. First, organizations can seek the expertise of independent consultants who can assess a data environment and conduct in-depth reviews of unused and uncatalogued data. Second, with the correct tools, organizations can automatically review all their data repositories themselves, wherever their data resides. This is often the preferred course of action because it allows organizations to identify regulatory violations, problems with internal permissions and potentially malicious or negligent behavior. An organization that chooses a data analytics solution over an external contractor can expect a more precise understanding of its data, with clear instructions on how to respond to any risks efficiently.
Managing Dark Data Requires a Framework
Once an organization has gained clear sight of its dark data, it can identify whether that data has any business value and protect it accordingly. Building a basic framework to ‘tag’ or catalog this hidden data is the first step to gaining that insight. Without such a framework, an organization can’t comply with data governance standards, meet regional regulatory requirements or offer truly effective security. Nor can it guarantee data privacy for its customers and employees. Given that 35% of all consumers don’t trust any industry to protect their data adequately, organizations should do all they can to win back trust and demonstrate to customers that their data is secure.
As part of a wider strategy, organizations need to know if their data is already visible and being used – is it managed, business-critical, obsolete or dark data? It is critical to understand where data is, what it is and what standards and policies must apply. Knowing who is accessing it and how organizational data is (and should be) governed is also part of the basic framework for classification and discovery. After proper investigation, truly obsolete dark data can be scheduled for deletion.
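A tagging framework of this kind can start out very simply. The sketch below sorts data assets into the four categories named above: managed, business-critical, obsolete and dark. The field names, the one-year staleness threshold and the rules themselves are hypothetical choices made for illustration; a real framework would take its rules from the organization’s own governance and retention policies, and would treat the ‘obsolete’ tag as a prompt for investigation rather than automatic deletion.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Tag(Enum):
    MANAGED = "managed"
    BUSINESS_CRITICAL = "business-critical"
    OBSOLETE = "obsolete"
    DARK = "dark"


@dataclass
class DataAsset:
    name: str
    owner: Optional[str]            # None if nobody is accountable for it
    catalogued: bool                # listed in the data catalog?
    business_critical: bool
    last_accessed: date
    retention_expires: Optional[date] = None


STALE_AFTER = timedelta(days=365)   # hypothetical threshold


def tag_asset(asset, today):
    """Assign one tag per asset using simple, illustrative rules."""
    if asset.retention_expires is not None and asset.retention_expires < today:
        return Tag.OBSOLETE         # past retention: candidate for review and deletion
    if asset.owner is None or not asset.catalogued:
        return Tag.DARK             # unowned or uncatalogued data
    if today - asset.last_accessed > STALE_AFTER:
        return Tag.DARK             # known but unused for over a year
    if asset.business_critical:
        return Tag.BUSINESS_CRITICAL
    return Tag.MANAGED


today = date(2024, 1, 1)
assets = [
    DataAsset("billing_db", "finance", True, True, date(2023, 12, 20)),
    DataAsset("wiki_pages", "it", True, False, date(2023, 11, 1)),
    DataAsset("legacy_share", None, False, False, date(2019, 5, 4)),
    DataAsset("old_campaign", "marketing", True, False, date(2021, 3, 1), date(2022, 3, 1)),
]
for asset in assets:
    print(asset.name, tag_asset(asset, today).value)
```

Checking retention first means data that has outlived its retention period is flagged for review regardless of who owns it.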
Ultimately, against a backdrop of increasing data compliance requirements such as GDPR and PCI DSS, organizations cannot turn a blind eye to lingering dark data that is growing rapidly in volume. The key is for organizations to ensure they have a clear line of sight over all their data so they can categorize and store it accordingly. Failure to do so will leave them lost in a maze of ever-expanding data that could cause them serious damage.