#HowTo: Manage ‘Dark Data’ in Organizations

Data insecurity is becoming costly for UK businesses. ICO fines, which reached record highs in 2021, have added to existing consumer pressure on businesses to protect sensitive information. As a result, many organizations have invested in new processes to safeguard their ‘crown jewels.’ However, there is a dark part of their databases lingering beneath the surface that might come back to haunt them.

‘Dark data’ is information that is forgotten, ignored or unused – often as a result of a user’s daily digital interactions. This could include employee records, financial transaction logs or confidential emails. This information can be anywhere – spread across all areas of an organization and a myriad of data repositories, from data lakes to applications. The unknown nature of dark data makes it difficult to protect and legislate for, creating security risks if bad actors get their hands on it.

With data being produced in such large volumes and at such a rapid pace, it is extremely challenging for organizations to quantify their dark data, and more than half of an organization’s data is potentially unavailable for analysis. Additionally, the volume of unstructured data is rising at a staggering rate of 55-65% per annum. To put this in perspective, 1.7 MB of data is created for each of the 7.3 billion people on our planet every minute of every day.

This explosion means that by 2025 there will be roughly 163 trillion gigabytes of data globally, 80% of which will be unstructured, and 90% of that unstructured data will never be analyzed or used in regular business activities. This is despite compulsory regional data standards, the data’s potential business value and the cost of storing it. These figures show the true scale of the task at hand, and organizations are at huge risk if they don’t take measures to identify, store and secure their data.
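To make those proportions concrete, the short calculation below multiplies out the figures quoted above. It is only a worked illustration: the inputs are the article's numbers taken as given, and the variable names are chosen here for readability.

```python
# Figures quoted in the article, taken as given rather than independently verified.
TOTAL_ZB = 163               # 163 trillion gigabytes is 163 zettabytes (1 ZB = 10**12 GB)
UNSTRUCTURED_SHARE = 0.80    # share of all data that is unstructured
NEVER_ANALYZED_SHARE = 0.90  # share of unstructured data that is never analyzed

unstructured_zb = TOTAL_ZB * UNSTRUCTURED_SHARE
never_analyzed_zb = unstructured_zb * NEVER_ANALYZED_SHARE
share_of_total = never_analyzed_zb / TOTAL_ZB

print(f"Unstructured data: {unstructured_zb:.1f} ZB")
print(f"Never analyzed:    {never_analyzed_zb:.1f} ZB ({share_of_total:.0%} of all data)")
```

On these figures, roughly 117 zettabytes, or about 72% of all data, would never be analyzed or used.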

Shining a Light on Dark Data

Because organizations need to continually store more data, they inevitably create more dark data. Therefore, they must protect all of their data from bad actors while also making it available for auditors. The first and most important step in this process is discovering data and establishing what is sensitive and exposed. By having the ability to discover and classify dark data, an organization is far better placed to leverage this previously unknown information for decision-making. To accomplish this, security teams need to know where sensitive dark data resides, who accesses it, when there is suspicious activity and when abuse occurs so they can take immediate action.

There are two main approaches when it comes to assessing dark data. First, organizations can seek the expertise of independent consultants who can review a data environment and conduct in-depth reviews of unused and uncatalogued data. Second, with the correct tools, organizations can automatically review all their data repositories themselves, wherever their data resides. This is often the preferred course of action because it allows organizations to identify regulatory violations, internal permissions and potentially malicious or negligent behavior. If an organization chooses to use a data analytics solution instead of an external contractor, it can expect a more precise understanding of its data, with clear instructions on how to respond to any risks efficiently.
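As a rough illustration of the second, automated approach, the sketch below walks a file share and flags files that contain sensitive-looking content, noting which ones have not been modified for a long time. It is a minimal sketch under stated assumptions, not a description of any particular product: the two patterns, the one-year staleness threshold and the function names `classify_text` and `scan_repository` are all hypothetical, and real discovery tools use far more robust classifiers.

```python
import re
import time
from pathlib import Path

# Hypothetical patterns for illustration only; they will miss cases and raise false positives.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_like_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def classify_text(text: str) -> set[str]:
    """Return the names of the sensitive patterns found in the text."""
    return {name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)}


def scan_repository(root: Path, stale_after_days: int = 365) -> list[dict]:
    """Report files with sensitive-looking content, flagging those not modified recently."""
    cutoff = time.time() - stale_after_days * 86400
    findings = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
            modified = path.stat().st_mtime
        except OSError:
            continue  # unreadable file: skip it rather than abort the whole scan
        labels = classify_text(text)
        if labels:
            findings.append(
                {"path": str(path), "labels": sorted(labels), "stale": modified < cutoff}
            )
    return findings
```

Calling `scan_repository(Path("/path/to/share"))` returns one record per flagged file. Entries that are both sensitive and stale are the natural candidates for a closer review.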

Managing Dark Data Requires a Framework

Once an organization has gained clear sight of its dark data, it can then identify whether that data has any business value and protect it accordingly. Building a basic framework to ‘tag’ or catalog this hidden data is the first step to gaining that insight. Without this, an organization can’t comply with data governance standards, meet regional regulatory requirements or offer truly effective security. Additionally, the lack of a framework means organizations cannot guarantee data privacy for their customers and employees. Considering that 35% of all consumers don’t trust any industry to protect their data adequately, organizations should do all they can to win back trust and demonstrate to customers that their data is secure.

As part of a wider strategy, organizations need to know if their data is already visible and being used – is it managed, business-critical, obsolete or dark data? It is critical to understand where data is, what it is and what standards and policies must apply. Knowing who is accessing it and how organizational data is (and should be) governed are both part of the basic framework for classification and discovery. After proper investigation, truly obsolete dark data can be scheduled for deletion.
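One way to picture such a framework is as a small catalog in which every data set carries a few tags and is sorted into one of the four categories mentioned above. The sketch below is an assumption-laden illustration rather than a recommended policy: the field names, the one-year threshold and the rule order in `categorize` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Category(Enum):
    MANAGED = "managed"
    BUSINESS_CRITICAL = "business-critical"
    OBSOLETE = "obsolete"
    DARK = "dark"


@dataclass
class CatalogEntry:
    location: str                    # where the data is
    description: str                 # what it is
    owner: Optional[str]             # who governs it; None if nobody is accountable
    last_accessed: date
    business_critical: bool = False
    reviewed_obsolete: bool = False  # set only after a proper investigation


def categorize(entry: CatalogEntry, today: date,
               unused_after: timedelta = timedelta(days=365)) -> Category:
    """Apply illustrative rules; a real framework would follow the organization's policies."""
    if entry.reviewed_obsolete:
        return Category.OBSOLETE
    if entry.business_critical:
        return Category.BUSINESS_CRITICAL
    unused = today - entry.last_accessed > unused_after
    if entry.owner is None or unused:
        return Category.DARK  # unowned or long unused: needs investigation
    return Category.MANAGED


def deletion_candidates(entries: list[CatalogEntry], today: date) -> list[str]:
    """Only entries confirmed obsolete are proposed for deletion."""
    return [e.location for e in entries if categorize(e, today) is Category.OBSOLETE]
```

In this sketch, dark data is never deleted automatically. An entry moves to the obsolete category, and so onto the deletion list, only once someone has investigated it and set `reviewed_obsolete`.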

Ultimately, against a backdrop of increasing data compliance regulations such as GDPR and PCI DSS, organizations cannot turn a blind eye to lingering dark data that is growing rapidly in volume. The key is for organizations to ensure they have a clear line of sight over all their data so they can categorize and store it accordingly. Failure to do so will leave them lost in a maze of ever-expanding data, which could cause them serious damage.

Dan Neault
