Managing the modern data pipeline

A few weeks back I wrote about the emerging role of the data engineer – the group of people responsible for delivering the quality data pipelines that enable the data scientist. I followed it up with this tweet – which I believe summarises very concisely the changing reality of big data and advanced analytics 2012…

Big data or big disaster?

When I first started posting about big data, very few users existed in South Africa. Today, most large organisations have a Hadoop data lake – in many cases replacing traditional ETL and/or acting as a data archive as well as a feeder to the enterprise data warehouse and various operational data marts. In a few,…

How are you governing unstructured data?

Most data governance initiatives begin with structured data – information held in systems or databases. Yet, unstructured data comprises the overwhelming majority of an enterprise’s information assets – and is of enormous potential business value to those who harness it correctly. It also presents huge challenges when it comes to management, governance, and effectively leveraging information…