Consider this figure: $136 billion per year.
That’s the research firm IDC’s estimate of the size of the big data market, worldwide, in 2016. This figure should surprise no one with an interest in big data.
But here’s another number: $3.1 trillion.
That’s IBM’s estimate of the yearly cost of poor-quality data, in the US alone, in 2016. While most people who deal in data every day know that bad data is costly, this figure stuns.
So here’s the question:
Why is there so much focus on data analytics and so little (by comparison) focus on data quality?
For many people, analytics is what they think of when they are asked about data management.
Some, normally those required to deliver meaningful insights, will appreciate how much work it takes to understand, assess, integrate and prepare data before meaningful analysis can occur. Data quality issues, data integration challenges, and a poor understanding of where the correct data can be found all add risk and cost to the analytics process.
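To make the "assess" step concrete, here is a minimal sketch of the kind of check that sits behind that hidden preparation work. It is an illustration only: the `profile_quality` function, the field names and the sample customer records are all hypothetical, and it assumes records arrive as simple Python dictionaries with text values. It measures just two things, how many required fields are blank and how many records repeat a key value.

```python
from collections import Counter


def profile_quality(rows, required_fields, key_field):
    """Return simple completeness and uniqueness counts for a list of dict records."""
    # Completeness: count records where each required field is missing or blank.
    missing = {
        field: sum(1 for row in rows if not str(row.get(field) or "").strip())
        for field in required_fields
    }
    # Uniqueness: count surplus records that share a key value,
    # ignoring case and surrounding whitespace. Blank keys are not counted here.
    keys = Counter(str(row.get(key_field) or "").strip().lower() for row in rows)
    duplicates = sum(count - 1 for key, count in keys.items() if key and count > 1)
    return {"total": len(rows), "missing": missing, "duplicate_keys": duplicates}


# Hypothetical sample data, invented for this illustration.
customers = [
    {"id": "C001", "email": "ann@example.com", "country": "ZA"},
    {"id": "C002", "email": "", "country": "ZA"},
    {"id": "C003", "email": "ANN@example.com ", "country": None},
    {"id": "C004", "email": "bob@example.com", "country": "US"},
]

report = profile_quality(customers, ["email", "country"], "email")
print(report)
```

Even on four records, a check like this surfaces a blank email, a missing country and a duplicate customer hiding behind different capitalisation. Real data sets need many more rules than this, which is part of why the preparation effort is so easy to underestimate.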
Here’s another figure: 60%!
This is the share of their time that data scientists spend trying to get data to the point where they can use it for analysis.
When this number came up in conversation with a data scientist at a South African corporate, she laughed. “More like 90% of my time,” she responded.
Yet, in most cases these costs and issues are hidden from business users (which may include senior technical management). In some cases, we may be afraid that our insights will not be taken seriously if users understand how much we have had to massage the data in order to draw the insight. In other cases, senior staff may ignore the concerns raised by data scientists until the data scientists stop raising them.
We all have limited budgets. Yet, spending on sexy graphical analytics results without considering data governance and data quality can be wasteful.
BI vendors sell the “sizzle”.
Modern BI tools offer a multitude of interesting graphics and charts to represent insights in compelling ways. These graphics can be thought of as a variety of condiments that allow us to spice up our data “steak”.
Yet, as any good chef will tell you, for good food we need to focus on the quality of the raw ingredients. A good-quality steak, simply seasoned with salt and pepper, will deliver a better eating experience than a poor-quality cut, however heavily it is dressed.
For analytics, data is the raw material. Good-quality data does not just ensure better insights – it has a direct operational impact as well. Businesses can save trillions annually by investing in improved data quality – tools, training, process improvements, etc.
Time and money spent on enhancing data quality should not be hidden from business, or absorbed into analytics or data integration budgets.
Data quality deserves its own focus, and its own budget.
Importantly, the benefits of improving data quality go far beyond reduced costs. It is hard to imagine any sort of future in data when so much of the data is so bad. Thus, improving data quality is a gift that keeps giving: it enables you to take out costs permanently and to pursue other data strategies more easily. For all but a few, there is no better opportunity in data.