Datameer Lineage for Hadoop

Data Lake vs Data Cesspool

Andrew C. Oliver’s (@acoliver) recent post “How to create a data lake for fun and profit” is an interesting take on the value of a data lake – an unstructured data warehouse where you pull all your different sources into one large “pool” of data. In contrast to data marts and warehouses, a data lake…

Data Quality Error in your Favour

Data Quality Errors cost real money

“Bank Error in Your Favour – please collect $200.” Supply chain errors are estimated to cost the average corporation around 5% of total spend. For large organisations these errors can add up to substantial amounts of money – yet if the underlying data quality root causes are not addressed, they will never be adequately resolved. …

The ugly truth about data governance meetings

A recent infographic by visual.ly discusses the extreme cost of unproductive meetings and highlights some alarming facts. Of 25 million meetings a day in the United States, nearly 17 million (67%) are deemed a waste of time. And these time wasters are more likely to affect more senior staff. Senior management spend significantly more…

Is quality address data still relevant?

Most organisations are actively working to shift their communications from paper to electronic formats. After all, we can save around $1 per communication if we send an email rather than a traditional letter. So, given this shift, is there still value in quality address data? I would suggest that customer-centric organisations should find value in…

Find the signal in the noise

Big Data: When data quality doesn’t matter

If BI is about reporting, analytics is about creating insight. The old adage of garbage in, garbage out applies to both BI and analytics. Yet the introduction of Hadoop into the analytics architecture changes how poor quality data impacts us, and may create the illusion that poor data quality is no longer of concern. Moving…