How do you choose Critical Data Elements?

Way back in 2018 we published a couple of posts on Critical Data Elements.

In What are critical data elements and why are they critical? we defined critical data elements as “the data that is critical to success” in a particular data area.

From a governance perspective, these are the data elements that we should prioritise. Yet in practice, identifying critical data elements (CDEs) can be hard.


Four fundamental questions that are hard to answer

  1. What data do we govern?
  2. How should we govern it?
  3. Where should we govern it?
  4. Why should we govern it?

For large organisations, the sheer volume of data elements, combined with the competing priorities of multiple stakeholders across business and IT, can make it really difficult to answer these seemingly simple questions.

In Approaches for selecting critical data elements, we proposed that rather than looking at a data domain, and trying to define CDEs within that context, we should look at specific business problems and how data supports them.

From this perspective we suggested three approaches:

  1. Report driven – which data elements underpin Key Performance Indicators
  2. Policy driven – which data elements support specific business policies or regulatory requirements
  3. Process driven – which data elements are inputs to or outputs of critical business processes

During the presentation Data Strategy SOS: Signs Your Data Initiatives Need Help at last month’s Trust ’22 conference, Precisely’s David Woods and Colleen Henderson, Perrigo’s Senior Director for Enterprise Data Governance, built on this approach with the following insights.

Rather than asking about which elements are critical, the fundamental question is:

What are the criteria that make an attribute a candidate for governance?

David introduced the idea of a Data Governance Decision Tree that ensures that Critical Data Elements are tied to business drivers and dictate business impact and governance strategies.

These criteria may vary from one business to another, but the suggested starting point aligns closely with our three approaches above:

  1. Is the data element critical for business process execution?
  2. Is the data element critical for analytics?
  3. Is the data element critical for financial reporting?
  4. Is the data element critical for compliance, risk, or regulatory reporting?

Out of hundreds of candidate attributes, this decision tree will typically whittle the list down to 30 or so critical data elements that have the biggest impact on the business and should be prioritised for governance.
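As a rough illustration, the four questions can be expressed as a simple filter over candidate attributes. This is a minimal sketch, not the decision tree presented at the conference: the `Attribute` class, its field names, and the sample candidates are all hypothetical, and in practice the answers would come from business stakeholders rather than hard-coded flags.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    """A candidate attribute with one flag per decision-tree question (hypothetical model)."""
    name: str
    process_execution: bool = False
    analytics: bool = False
    financial_reporting: bool = False
    compliance_risk_regulatory: bool = False


def governance_drivers(attr: Attribute) -> list[str]:
    """Return the business drivers for which this attribute was flagged as critical."""
    answers = {
        "business process execution": attr.process_execution,
        "analytics": attr.analytics,
        "financial reporting": attr.financial_reporting,
        "compliance, risk, or regulatory reporting": attr.compliance_risk_regulatory,
    }
    return [driver for driver, critical in answers.items() if critical]


def select_cdes(candidates: list[Attribute]) -> dict[str, list[str]]:
    """Keep only the attributes tied to at least one business driver."""
    selected = {}
    for attr in candidates:
        drivers = governance_drivers(attr)
        if drivers:
            selected[attr.name] = drivers
    return selected


# Hypothetical candidates, for illustration only.
candidates = [
    Attribute("customer_tax_id", process_execution=True, compliance_risk_regulatory=True),
    Attribute("invoice_amount", financial_reporting=True),
    Attribute("fax_number"),
]
cdes = select_cdes(candidates)
```

Recording the drivers alongside each selected element keeps the link between the element and the business reason for governing it, which is useful later when deciding on ownership.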

Three strategies for governing critical data elements

Once these candidates have been identified it becomes easier to decide how and where governance standards should be applied.

  1. Active governance at point of entry – by applying validation rules we ensure that the element is captured correctly the first time.
  2. Passive governance – once the data element has been captured we make decisions and apply automation to correct or enhance it.
  3. Process control – governance depends on policies (backed up by dashboards) rather than on automation.
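To make the difference between the three strategies concrete, here is a minimal sketch applied to a single element. The rule (a four-digit postal code) and all function names are assumptions chosen for illustration; real standards would be defined by the stakeholders for each element.

```python
import re

# Hypothetical standard for one element: a postal code of exactly four digits.
POSTAL_CODE = re.compile(r"\d{4}")


def validate_on_entry(value: str) -> str:
    """Active governance: reject a non-conforming value at the point of capture."""
    if not POSTAL_CODE.fullmatch(value):
        raise ValueError(f"invalid postal code: {value!r}")
    return value


def correct_after_capture(value: str) -> str:
    """Passive governance: standardise a value that has already been captured."""
    return re.sub(r"\s+", "", value)  # here, simply remove stray whitespace


def compliance_rate(values: list[str]) -> float:
    """Process control: share of conforming values, e.g. to feed a dashboard."""
    if not values:
        return 1.0
    conforming = sum(1 for v in values if POSTAL_CODE.fullmatch(v))
    return conforming / len(values)
```

The first function stops bad data from entering, the second repairs it afterwards, and the third changes nothing but measures how well the policy is being followed.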

Who owns each critical data element?

The question of data ownership is one that plagues most governance implementations.

Applying the data governance decision tree above can help to identify the key stakeholders.

Who is responsible for the business process this element enables? Who must act on the key report?

With our understanding of who in the business is dependent on, and who captures, the data element in question, we can build a RACI:

Who is responsible for capturing data correctly (and who do they report to)? (R)

Who is accountable for data being captured correctly? (A)

Who needs the data in a particular format and is able to provide input into standards and rules? (C)

Who else uses or is impacted by the data and needs visibility? (I)
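The answers to these four questions can be recorded per data element so that unassigned roles are easy to spot. The following is a minimal sketch under stated assumptions: the `Raci` class, its field names, and the job titles in the usage example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Raci:
    """RACI assignments for one critical data element (hypothetical structure)."""
    element: str
    responsible: list[str] = field(default_factory=list)  # capture the data correctly
    accountable: list[str] = field(default_factory=list)  # answer for correct capture
    consulted: list[str] = field(default_factory=list)    # give input on standards and rules
    informed: list[str] = field(default_factory=list)     # use the data and need visibility

    def gaps(self) -> list[str]:
        """List the roles that nobody has been assigned to yet."""
        roles = {
            "responsible": self.responsible,
            "accountable": self.accountable,
            "consulted": self.consulted,
            "informed": self.informed,
        }
        return [role for role, people in roles.items() if not people]
```

For example, `Raci("customer_tax_id", responsible=["Onboarding clerk"], accountable=["Head of Onboarding"]).gaps()` reports that the consulted and informed roles still need to be filled.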

How have you approached solving this problem?