Using Metadata to Quantify the Enterprise Value of Data to Assess the Severity of Data Breaches

In spite of the increased global regulatory focus on data privacy, such as South Africa’s imminent and long-awaited implementation of PoPIA, data breaches continue to be a plague on businesses around the world, with many of the world’s largest data breaches happening in the last year.

Almost everyone on the planet has been exposed to some level of unauthorised data access or will be soon.

While organizations are always looking to evolve their data protection strategies and strengthen their internal security practices, it’s hard to keep up with all of the changes. As organizations implement more tools and security protocols to protect themselves and their customers, hackers become savvier.

And it is not just external threats that we need to manage. In November 2020, Absa disclosed that a rogue employee had stolen and sold the personal data of several hundred thousand customers.

Clearly, savvy companies are taking steps to protect personal data. At the same time, forward-looking companies are setting themselves apart by leveraging their metadata to better understand and protect the value of their data.

Leveraging Metadata

Metadata provides context to our data. It can tell us various details including where data resides, how to find it, what it is used for, and where it came from. Identifying, quantifying, and understanding this information helps organizations that have struggled to understand how much their data is worth. Metadata can help quantify the enterprise value of data (EvD) to help determine whether that data is essential to the organization.

By using metadata to better understand the valuable and sensitive information that could be targeted by hackers, organizations can have a more effective plan to help mitigate security risks. In addition, should a breach occur, organizations can have a faster public response because they know and understand the value of the compromised data.

Using Metadata to Derive the Enterprise Value of Data

An organization that wants to use metadata to classify and inventory its data should first assign each data set to one of three categories. This lets the organization put a value on specific data sets (the enterprise value of data), so that if a breach occurs it can gauge how severe the breach is. The three categories might be:

Sensitive: This category should include customers’ most critical information assets, such as identity numbers, credit card numbers, medical information, and any other special personal information (for example, data pertaining to a child). This data must be carefully guarded and tracked to ensure it is thoroughly protected. Only a few individuals within an organization may need access to this information.

Confidential: This category should contain information that must be protected and retained inside the security firewall, such as customer addresses, phone numbers, and other elements classified as personal information. Access to this information is also limited, but it is typically available to a broader set of users.

Open: This category contains information that does not need to be protected with any security protocols. Generally, this information is publicly available or could easily be obtained from other sources by competitors or hackers. Information such as yesterday’s weather or the GDP of the U.S. economy would fall into this category.
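
The three categories can be recorded as metadata against each data element. The following is a minimal sketch in Python, assuming a hypothetical in-memory catalogue. The Classification and DataElement types, the element names, and the system names are illustrative assumptions, not part of any particular metadata tool.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    SENSITIVE = "sensitive"
    CONFIDENTIAL = "confidential"
    OPEN = "open"


@dataclass(frozen=True)
class DataElement:
    name: str
    location: str  # where the data resides (hypothetical system name)
    classification: Classification


# Hypothetical catalogue entries, based on the examples given in the text.
CATALOGUE = [
    DataElement("identity_number", "crm_db", Classification.SENSITIVE),
    DataElement("credit_card_number", "billing_db", Classification.SENSITIVE),
    DataElement("customer_address", "crm_db", Classification.CONFIDENTIAL),
    DataElement("phone_number", "crm_db", Classification.CONFIDENTIAL),
    DataElement("yesterdays_weather", "public_feed", Classification.OPEN),
]


def elements_in(classification):
    """Return the names of all catalogued elements in one category."""
    return [e.name for e in CATALOGUE if e.classification is classification]
```

With a catalogue like this in place, a question such as "which sensitive elements do we hold, and where?" becomes a simple lookup rather than an investigation.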

An organization that has already classified its data has a head start on identifying, quantifying, and understanding the impact of a breach when one happens. This is invaluable because time is of the essence in a breach scenario. Organizations must act quickly to assess the damage and take the necessary steps to protect additional information as well as their own reputation.

What is the Data Worth?

Data is an asset for any organization, and securing its value is critical. After categorizing data by security type, it is a good idea to apply a normalized risk value to each category. Since data in the sensitive category requires the highest security classification, it should also be assigned the highest value: we might give it a five. Data in the confidential category is still essential to secure, but it is not as crucial as sensitive data, so we might give it a three. And since open data carries minimal risk, it can be given a one.
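
The normalized scale described above amounts to a small lookup table. A minimal sketch in Python, where the names CATEGORY_VALUE and value_of are assumptions chosen for illustration:

```python
# Normalized values per category, as suggested in the text.
CATEGORY_VALUE = {"sensitive": 5, "confidential": 3, "open": 1}


def value_of(category):
    """Look up the normalized value for a category name (case-insensitive)."""
    return CATEGORY_VALUE[category.strip().lower()]
```

Keeping the scale in one place makes it easy to adjust the values later without touching the rest of the model.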

While it makes sense to move toward a normalized quantification of data element value and risk, one of the challenges of data security is that it is the combination of data elements, rather than the individual data element, that poses the greatest risk.

Let’s use an example to illustrate this concept further. If an organization has saved a number of customers’ identity numbers and classified them as sensitive (and thus a level five), and those numbers were breached, then it can classify the data breach as a level five. While customers are rightfully wary of sharing their identity number, identity thieves may need a combination of other customer information, such as name and/or date of birth, to put those identity numbers to (nefarious) use.

However, if identity numbers (5), addresses (1), full names (3) and dates of birth (5) were compromised, then the breach would be classified as a 14 (5+1+3+5). This means the breach is serious, and the organization will need to notify customers and prepare for possible financial and reputational repercussions.
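
The arithmetic in this example is a plain sum of the scores of the leaked elements. A minimal sketch in Python, using the scores exactly as quoted in the example; the function name breach_score and the element keys are illustrative assumptions:

```python
def breach_score(leaked_scores):
    """Sum the per-element scores of every leaked data element."""
    return sum(leaked_scores.values())


# Scores as quoted in the worked example above.
leaked = {
    "identity_number": 5,
    "address": 1,
    "full_name": 3,
    "date_of_birth": 5,
}

print(breach_score(leaked))  # prints 14
```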

Similarly, a leak of names and dates of birth may be more serious if data linked to children (special personal information) is leaked.

Of course, these rudimentary quantitative models are only a start and might not adequately reflect the seriousness of a breach, but they are more advanced than what is in place at many organizations. The truly data-centric organization would create a more complex data security model that accounts for particularly dangerous data combinations. This model would guide data security decisions about where and how to store data.
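
One possible way to account for dangerous combinations is to add a surcharge on top of the simple sum whenever a risky set of elements leaks together. The sketch below is only an illustration in Python: the surcharge values, the choice of combinations, and the name combined_breach_score are all hypothetical assumptions, and a real model would need to be calibrated by the organization.

```python
# Per-element scores as used in the worked example in the text.
ELEMENT_SCORES = {
    "identity_number": 5,
    "address": 1,
    "full_name": 3,
    "date_of_birth": 5,
}

# Hypothetical surcharges for combinations assumed to be especially dangerous.
COMBINATION_SURCHARGES = {
    frozenset({"identity_number", "full_name", "date_of_birth"}): 10,
    frozenset({"full_name", "date_of_birth"}): 3,
}


def combined_breach_score(leaked):
    """Sum element scores, then add a surcharge for each risky combination
    that is fully contained in the leaked set."""
    leaked = set(leaked)
    base = sum(ELEMENT_SCORES[name] for name in leaked)
    surcharge = sum(
        extra
        for combo, extra in COMBINATION_SURCHARGES.items()
        if combo <= leaked  # subset test: every element of combo was leaked
    )
    return base + surcharge
```

Under these assumed values, a leak of identity numbers alone still scores 5, while a leak that also exposes full names and dates of birth scores well above the plain sum, reflecting the point that the combination poses the greater risk.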

Metadata can quantify risk if a data breach does occur. When organizations know the value of the data that has been leaked, they are much more prepared to handle it.
