For someone who has been preaching data governance and data quality for more than fifteen years, it’s been fascinating to see how these two topics have gained traction in the last few years.
A few weeks back I touched on the difference between data governance and data quality: governance is about “what” and “who”, data quality is about “how”.
This week, prompted by last week’s Syncsort webinar, “Unlocking Greater Insights with Integrated Data Quality for Collibra”, I want to look at the overlap, using a simple example of how this may work in practice.
It’s a common scenario. Two line-of-business managers generate similar reports, in theory measuring the same metric, and yet are presented with wildly different outcomes.
Certifying a report allows each decision maker to assess its trustworthiness by helping them understand:
- Who was involved in the design and sign off of the report?
- How are the terms, calculations and aggregations derived for each report?
- Where does the data come from and does it reflect the source?
- Can we trust the data?
Let’s look at each of these in turn.
Who was involved in the design and sign off of the report?
Governance is first and foremost about accountability. The delivery of a single report may involve multiple stakeholders across the organisation including:
- The business sponsor (or owner) who requested the report
- The business analyst and subject matter experts who designed the report
- The data engineer who sourced the data
- The BI developer who built the report
Governance ensures that these stakeholders (and others who may be involved) collaborate effectively, that it is easy to identify who was involved and what their role and input were, and who approved the final deliverable. Governance makes it easy to engage the right people to clarify any points of contention, and to assess the rigour of the design process.
How are terms, calculations and aggregations derived for the report?
Conflicting interpretations of business metadata are one of the most common causes of mismatched results. Two reports measuring churn can show different results if they calculate churn using different approaches.
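To make this concrete, here is a minimal sketch of two churn calculations applied to the same customers. The customer records, the `Customer` fields and the 90-day inactivity rule are all invented for illustration; they are assumptions, not definitions taken from any real report.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    cancelled: bool                # customer formally closed their account
    days_since_last_purchase: int


# Hypothetical data: ten customers, two of whom have cancelled.
customers = [
    Customer(1, False, 10),
    Customer(2, True, 45),
    Customer(3, False, 120),
    Customer(4, False, 5),
    Customer(5, False, 95),
    Customer(6, True, 200),
    Customer(7, False, 30),
    Customer(8, False, 60),
    Customer(9, False, 15),
    Customer(10, False, 100),
]


def churn_by_cancellation(customers):
    """Report A: a customer has churned only if they cancelled."""
    churned = sum(1 for c in customers if c.cancelled)
    return churned / len(customers)


def churn_by_inactivity(customers, inactive_days=90):
    """Report B: cancelled OR no purchase for `inactive_days` days (assumed rule)."""
    churned = sum(
        1
        for c in customers
        if c.cancelled or c.days_since_last_purchase >= inactive_days
    )
    return churned / len(customers)


print(f"Report A churn: {churn_by_cancellation(customers):.0%}")
print(f"Report B churn: {churn_by_inactivity(customers):.0%}")
```

Both functions are defensible ways to measure churn, yet on this sample one reports 20% and the other 50%. Neither report is wrong; they simply answer different questions, which is why the definition needs to be agreed and visible.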
Governance ensures that the definition of each attribute used is clear and accessible, that all the necessary stakeholders have been engaged in agreeing each definition, and that definitions are shared across the enterprise where possible.
The governed business metadata this generates helps us to understand our report.
Where does the data come from?
Another common reality is that two reports may measure similar data from different sources. For example, sales recorded in the CRM system may only appear in the billing engine a month later.
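The timing effect can be sketched in a few lines. The sales below, their dates and the `sales_per_month` helper are hypothetical, chosen only to show how the same transactions land in different months depending on which system supplies the date.

```python
from collections import Counter
from datetime import date

# Hypothetical sales: (sale id, date closed in CRM, date invoiced in billing).
sales = [
    ("S1", date(2024, 1, 15), date(2024, 2, 1)),
    ("S2", date(2024, 1, 28), date(2024, 2, 1)),
    ("S3", date(2024, 2, 10), date(2024, 3, 1)),
]


def sales_per_month(sales, date_index):
    """Count sales per (year, month) using the date at the given tuple position."""
    return Counter((s[date_index].year, s[date_index].month) for s in sales)


crm_report = sales_per_month(sales, 1)      # dated by CRM close date
billing_report = sales_per_month(sales, 2)  # dated by billing invoice date

print("January sales per CRM:    ", crm_report[(2024, 1)])
print("January sales per billing:", billing_report[(2024, 1)])
```

The same three sales produce a January count of two from the CRM dates and zero from the billing dates. Knowing which source a report draws on explains the gap.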
By engaging the right people, data governance helps to ensure that the source and lineage of our hypothetical reports can be properly understood and assessed.
Can we trust the data?
Understanding the source of data is one aspect of trust.
The other is data quality.
Data quality means measuring compliance of the data to an agreed (governance) set of standards and rules. For example, if we are not capturing gender indicators in our data, then a report segmented by gender is probably going to be inaccurate.
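As a rough illustration of measuring compliance against an agreed rule, the sketch below checks how complete the gender field is. The records, the `completeness` helper and the 95% threshold are assumptions made up for this example, not a standard from any particular tool.

```python
# Hypothetical customer records; None or "" means gender was never captured.
records = [
    {"customer_id": 1, "gender": "F"},
    {"customer_id": 2, "gender": None},
    {"customer_id": 3, "gender": "M"},
    {"customer_id": 4, "gender": ""},
    {"customer_id": 5, "gender": None},
]


def completeness(records, field):
    """Share of records with a non-empty value for the given field."""
    if not records:
        return 0.0
    populated = sum(1 for r in records if r.get(field) not in (None, ""))
    return populated / len(records)


# Assumed governance rule for illustration: at least 95% of records
# must carry a gender value before a gender-segmented report is certified.
AGREED_THRESHOLD = 0.95

score = completeness(records, "gender")
certified = score >= AGREED_THRESHOLD

print(f"gender completeness: {score:.0%}, certified: {certified}")
```

Here only two of the five records carry a gender value, so the rule fails and a report segmented by gender would not be certified. The rule itself comes from governance; the measurement is the data quality part.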
Certifying each report for data quality lets executives compare the reports with insight into the level of confidence they can place in the outcomes.
Integrating best of breed technologies
In last week’s webinar, Syncsort discussed how two best-of-breed platforms have been integrated to provide an end-to-end capability for delivering trusted, governed data. Watch the recording here, or reach out to us to set up a live demonstration.