The Role of Technology in Data Quality Management, takes issue with the proposition that data quality tools do not solve data quality problems.
His point, largely, was that the consulting community tends to promote the implementation of process and disregard the importance of tools, while tool vendors may veer equally far in the opposite direction, emphasising the importance of tools without focusing sufficiently on process and people.
In response, John Owens points out that tools cannot solve a problem if inappropriately applied. He suggests that the current dearth of data quality specialists is due to an over-reliance on tools.
While John’s point about tools is valid, I disagree with his assertion that the lack of skills is due to a dependence on tools.
Over the last ten years we have seen a dilution in data quality skills (which were never particularly strong to begin with) driven, in my opinion, by two key technology trends.
1.) The data quality market has been diluted by the acquisition of many of the major data quality tool vendors by stack vendors that do not have a data quality focus. Analysts such as Gartner have rightly pointed out that the trend is towards stacks that offer some basic level of data quality, but where the focus is on ETL or BI or MDM or something else. This lack of core focus means that the passionate Data Quality pros are leaving these organisations, and the vendors may be dropping technology without having a good understanding of how to use it appropriately.
2.) In many cases glorified SQL tools and analytics solutions are being positioned as DQ solutions. In particular, there is confusion over the difference between data profiling and data discovery. The big data community, for example, will suggest that Python is a great data quality tool…
As John pointed out, tools simply enable; they do not solve the problem. That is why the leading data quality vendors focus on communication and enabling business and data steward involvement in data quality issues. Good data quality tools produce a better, more consistent result, more quickly, than the more technical alternatives, and certainly better than manual approaches, which are prone to error.
So there is a place for a good data quality tool, backed up by a team with a data quality focus.
Process and approach by themselves most certainly do not solve the problem. Automation is necessary to enforce a consistent and relevant application of process, particularly when dealing with volume.
Companies should look for service providers that combine data quality “how to” knowledge and approach with the ability to implement using an appropriate tool. Not all tools are created equal, and the vendor’s knowledge of how to leverage the tool is important to defining the correct process.
What do you think?