South Africa is a beautiful country.
While international tourists may fly in to well-known destinations, such as Cape Town or the Kruger National Park, locals tend to use the roads.
A common sight at this time of the year is the swathes of wild cosmos lining the roadsides throughout much of the interior.
Yet cosmos is not an indigenous flower. Seeds were introduced in fodder imported from the Americas to feed British horses during the Second Anglo-Boer War, which ended just over 110 years ago this week.
Cosmos at the side of the road is a good indication that this route was used by British troops.
Two quick data quality lessons:
1.) Data quality issues are typically a sign of another problem.
Poor quality data may not be as obvious as cosmos at the side of the road. Regular data profiling will identify common data quality issues, which often indicate broken or dysfunctional business processes, or bad habits in capturing data. Sudden spikes in data quality issues can indicate the unforeseen consequences of system or process changes.
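To make the idea of regular profiling concrete, here is a minimal sketch in Python. It measures how often a field is missing in each batch of records and flags a sudden jump against earlier batches. The field name, the sample records, and the "twice the average" threshold are all illustrative assumptions, not a recommended standard.

```python
from statistics import mean

def is_missing(value):
    """Treat None and blank strings as missing; anything else counts as captured."""
    return value is None or (isinstance(value, str) and not value.strip())

def missing_rate(records, field):
    """Fraction of records in which `field` is absent, None, or blank."""
    if not records:
        return 0.0
    missing = sum(1 for rec in records if is_missing(rec.get(field)))
    return missing / len(records)

def is_spike(history, latest, factor=2.0):
    """True when `latest` exceeds `factor` times the average of earlier rates.

    The factor of 2.0 is an arbitrary assumption; tune it to your own data.
    """
    if not history:
        return False
    return latest > factor * mean(history)

# Hypothetical weekly batches of customer records.
last_week = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": ""},
    {"email": "c@example.com"},
]
this_week = [
    {"email": ""},
    {"email": None},
    {"email": "d@example.com"},
    {},
]

baseline = missing_rate(last_week, "email")
current = missing_rate(this_week, "email")
spike_detected = is_spike([baseline], current)
```

A flagged spike does not tell you what went wrong. Like cosmos at the roadside, it only tells you where to start looking, for example at a recent form change or a new data feed.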
2.) The most obvious assumptions about common data quality root causes may be incorrect.
Cosmos is so ubiquitous in South Africa that many people assume it is indigenous. Similar assumptions about the root causes of data quality issues may also be incorrect. Proper analysis is necessary to dig below the obvious and address the real root cause.
In some cases, the process works as designed but the data requirement was not understood – a governance issue. In other cases, data integration or other system issues may be the root cause. Poor data quality can even (gasp) be caused by attempts to solve data quality problems. For example, false positive matching can incorrectly link and merge unrelated records.
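The false positive matching problem can be illustrated with a small Python sketch. The records, field names, and both match keys below are hypothetical, chosen only to show how a loose matching rule can link unrelated people while a stricter rule keeps them apart.

```python
from collections import defaultdict

def group_by_key(records, key_fn):
    """Return lists of record ids that share a match key (candidate merges)."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

def loose_key(rec):
    """Surname plus first initial: simple, but prone to false positives."""
    first, last = rec["name"].lower().split()
    return (last, first[0])

def strict_key(rec):
    """Adds date of birth to the loose key to separate different people."""
    return loose_key(rec) + (rec["birth_date"],)

# Hypothetical customer records: 1 and 3 are the same person, 2 is not.
customers = [
    {"id": 1, "name": "John Smith", "birth_date": "1970-03-14"},
    {"id": 2, "name": "Jane Smith", "birth_date": "1985-11-02"},
    {"id": 3, "name": "John Smith", "birth_date": "1970-03-14"},
]

loose_matches = group_by_key(customers, loose_key)    # links all three records
strict_matches = group_by_key(customers, strict_key)  # links only records 1 and 3
```

With the loose key, Jane Smith would be merged into John Smith's record, so a cleanup exercise ends up creating a new data quality problem. A stricter key is not automatically better either, since it can miss genuine duplicates; profiling the match results is what tells you which way the rule is failing.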
Data quality must be based on valid assumptions. Data profiling gives you the facts.
Image sourced from http://www.pbase.com/dewas/image/43354741