#HowTo: Avoid Common Data Discovery Pitfalls

A data classification process is an important component of any data security, risk management and compliance strategy, as it makes it easier to locate and retrieve data. Yet, many companies struggle with the process.

Here at Digital Guardian, we invited a panel of data scientists and security experts to identify the most common pitfalls around discovering and properly classifying data, and how they can be avoided. Here’s what they told us.

Lack of goal setting
Among the most common issues with data discovery and classification is the lack of goal setting from the outset. Too often, the objective is simply to capture more data, on the assumption that more data will improve decision making. However, the specific decisions the data is meant to inform are frequently not considered early enough. The result is output with little business value relative to the time spent producing it.

Before setting out on data discovery and classification, set clear goals for what the data is meant to help achieve.

‘Paralysis by analysis’
Organizations often run into the problem of 'paralysis by analysis,' in which analysts get too caught up in the data itself. People put time and effort into collecting, cleaning, and centralizing it, but then what? Data on its own is just raw information. Obsessing over it is a mistake; organizations need to shift their focus toward transforming that information into knowledge. Only then can it help them generate wisdom.

Failing to realize the value of data discovery
On their own, data discovery and classification hold no intrinsic value. Organizations can't expect to adequately improve data security and compliance solely by locating and labelling data. They will only start to see real value when discovery and classification are used in conjunction with other data security practices.

For example, once an organization has found where its most at-risk data resides within its infrastructure, what does it do next? Can it determine who has access to that data, who is making changes to it, what those changes are, and whether the surrounding environment is secure?

Data discovery and classification are powerful and necessary, but they shouldn’t live in a silo. Combining them with permissions analysis, user and entity behavior analytics, and change auditing will enable the true value to emerge.
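
As a rough illustration of combining classification results with permissions analysis, the Python sketch below flags sensitive assets that too many people can read. Everything in it is an assumption made for illustration: the `DataAsset` record, the `MAX_READERS` policy, and the sample paths are hypothetical, and in practice the inputs would come from a discovery tool and an identity or permissions system.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    """Hypothetical record joining a classification label to an access list."""
    path: str
    classification: str   # e.g. "public", "internal", "restricted"
    readers: frozenset    # principals who can read the asset


# Assumed policy: maximum number of readers allowed per classification label.
# Labels that are absent from this mapping are not checked.
MAX_READERS = {"restricted": 3}


def overexposed(assets):
    """Return the assets whose reader count exceeds the limit for their label."""
    flagged = []
    for asset in assets:
        limit = MAX_READERS.get(asset.classification)
        if limit is not None and len(asset.readers) > limit:
            flagged.append(asset)
    return flagged


# Illustrative sample data only.
assets = [
    DataAsset("/finance/payroll.xlsx", "restricted",
              frozenset({"alice", "bob", "carol", "dave", "erin"})),
    DataAsset("/finance/budget.xlsx", "restricted",
              frozenset({"alice", "bob"})),
    DataAsset("/marketing/brochure.pdf", "public",
              frozenset({"everyone"})),
]

for asset in overexposed(assets):
    print(asset.path, len(asset.readers))
```

The same join could be extended with change-audit or behavior data; the point is only that the classification label becomes useful once it is compared against something else.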

Poor data quality hinders ability to deliver customer-centric value
To avoid this pitfall, plan search and segmentation fields ahead of time to ensure the search criteria deliver the expected value. Develop a governance plan with a clear delineation of who is responsible for entering, validating, and maintaining the data, and establish user protocols regarding where data gets entered, and when.

Also, select a CRM tool that is easy to access and use wherever users communicate with customers and prospects (e.g., in email, on social media, on the web, and on mobile), and that consolidates all business contacts and automates data entry and data enrichment. Data enrichment can be provided either by a third-party tool such as ZoomInfo or DiscoverOrg or out of the box with a small business CRM such as Nimble.
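
The validation step of such a governance plan could be sketched as follows in Python. This is a minimal example under stated assumptions, not a feature of any CRM named above: the required fields, the allowed segment values, and the deliberately loose email pattern are all hypothetical and would be defined by the organization's own plan.

```python
import re

# Hypothetical rules; a real list would come from the governance plan.
REQUIRED_FIELDS = ("name", "email", "segment", "owner")
ALLOWED_SEGMENTS = {"prospect", "customer", "partner"}
# Loose shape check only (something@something.something), not full validation.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_contact(record):
    """Return a list of human-readable problems found in one contact record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing {field}")
    email = record.get("email") or ""
    if email and not EMAIL_PATTERN.match(email):
        problems.append("malformed email")
    segment = record.get("segment") or ""
    if segment and segment not in ALLOWED_SEGMENTS:
        problems.append(f"unknown segment: {segment}")
    return problems


record = {"name": "", "email": "not-an-email", "segment": "vip", "owner": "sales-1"}
for problem in validate_contact(record):
    print(problem)
```

Running a check like this at the point of entry, rather than during a later cleanup, keeps the segmentation fields searchable from the start.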

Trying to solve problems beyond human scale
In the enterprise, there are problems simply beyond human scale. Some organizations are sitting on petabytes of data they don’t even know about, to say nothing of the volumes of data being created every day. Manual discovery and classification become impossible due to the sheer amount of data organizations have in their possession.

AI-powered auto-classification, trained on a small subset of properly labeled data, is possible today. Machine learning tools are the only way organizations can hope to make significant headway. It's not perfect, but it's a process that improves over time as the machine learns what defines a document specific to an organization. This means that companies starting the process now will be in a better position to leverage more of their data in the future and to feed cleaner data into future predictive AI-powered analytics and decision making.
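
To make the idea of training on a small labeled subset concrete, here is a minimal Python sketch. The panel does not name an algorithm, so the choice of multinomial naive Bayes with add-one smoothing is an assumption, as are the two labels and the four training sentences. A production system would use far more data and a maintained machine learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, labeled_docs):
        """Update the counts from an iterable of (text, label) pairs."""
        for text, label in labeled_docs:
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability, or None if untrained."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled sample standing in for the "small subset".
training = [
    ("employee salary and bank account details", "restricted"),
    ("customer passport number and date of birth", "restricted"),
    ("press release announcing the new product", "public"),
    ("marketing brochure for the spring event", "public"),
]

model = NaiveBayesClassifier()
model.train(training)
print(model.classify("bank account and passport number"))
print(model.classify("new product brochure"))
```

Because `train` only adds to the counts, newly reviewed documents can be fed in later, which mirrors the point that the process improves over time.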

Giving valuable data away
Data discovery eventually leads to data storage requirements and the creation of irresistible honeypots for malicious third parties. Combine that with chronic levels of employee-related data breaches, and it can create a recipe for disaster.

While data discovery looks like work and may seem like a productive use of time, it is easy for casual users to spend time analyzing data without purpose. Data is the new oil: don't give it away.

Across such a broad and detailed discipline, the pitfalls and remedies will shift as new approaches and products emerge, but avoiding common and foundational issues will allow organizations to maximize their investment in data discovery.

Jan van Vliet
