Managing the modern data pipeline

A few weeks back I wrote about the emerging role of the data engineer – the people responsible for delivering the quality data pipelines that enable the data scientist.

I followed it up with this tweet – which I believe summarises very concisely the changing reality of big data and advanced analytics:

2012 – Data scientists don’t need #dataquality

2015 – Data scientists spend 60% of their time cleaning and preparing data

2019 – The data engineer takes over from the data scientist to deliver a quality data pipeline

Webinar: Managing the modern data pipeline

This week, the Bloor Group’s Synthesis Series discusses the modern data pipeline in what promises to be a useful webinar on Wednesday night.

Join Eric Kavanaugh, Dave Wells and Fernanda Tavares as they look at the changing realities of managing data pipelines to support modern analytics.

In particular: where analytics was once supported by a single pipeline feeding the data warehouse, modern analytics requires multiple pipelines supporting dozens of use cases – from reporting to self-service analytics, data science, machine learning, real-time delivery of data to customer-facing applications, and many more.
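To make that fan-out concrete, here is a minimal, purely illustrative sketch (not from the webinar, and all function names are hypothetical): a single shared data-quality step feeding several use-case-specific pipelines, one for reporting and one for machine-learning features.

```python
# Hypothetical sketch: one cleaned source fanned out to multiple
# use-case-specific pipelines (reporting, ML features, ...).

def clean(record):
    """Shared data-quality step: normalise keys, drop missing values."""
    return {k.strip().lower(): v for k, v in record.items() if v is not None}

def to_reporting(record):
    # e.g. load into the data warehouse for BI and reporting
    return {"target": "warehouse", **record}

def to_ml_features(record):
    # e.g. keep only numeric fields for a data-science feature store
    return {"target": "feature_store",
            **{k: v for k, v in record.items() if isinstance(v, (int, float))}}

def run_pipelines(source, sinks):
    """Run the shared quality step once, then fan out to every sink."""
    cleaned = [clean(r) for r in source]
    return {name: [sink(r) for r in cleaned] for name, sink in sinks.items()}

records = [{" Region ": "EMEA", "revenue": 120, "note": None}]
results = run_pipelines(records, {"reporting": to_reporting,
                                  "ml": to_ml_features})
```

The point of the sketch is the shape, not the code: quality checks happen once, upstream, while each downstream pipeline shapes the same data for its own consumer.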

This webinar promises to examine the characteristics of modern data pipelines and provide a framework for building and managing them. It will explore the role of DataOps and describe methods to automate the flow of data from source to target, increasing agility, efficiency, and accuracy.