What is the role of the data engineer?

Data engineering is the term that has emerged to describe the tasks related to delivering useful data for analytics – particularly in relation to data science.

With between 60% and 90% of the effort of most big data projects allocated to data engineering tasks, the role has matured as organisations found that traditional data scientists either did not have the skills or did not have the aptitude to focus on data preparation.

Data engineers focus on delivering a quality data pipeline, leaving the data scientist free to deliver analytics. For example, they may be responsible for building data pipelines to collect and store data; building extract, transform and load (ETL) processes to prepare data for analysis; designing data quality processes to cleanse or enrich data; and engineering services and frameworks to deliver the data needed for analytics.
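
To make the ETL and data quality responsibilities above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample data, the `customers` table, the function names and the cleansing rules (drop records without an identifier, trim whitespace, upper-case country codes) are all hypothetical and not taken from any particular project or tool.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this might come from a file, API or queue.
RAW_CSV = """customer_id,name,country
1, Alice ,za
2,Bob,
,Carol,ZA
3,Dave,uk
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply simple, assumed data quality rules to each row."""
    cleaned = []
    for row in rows:
        customer_id = (row["customer_id"] or "").strip()
        if not customer_id:
            continue  # assumed rule: a record must have an identifier
        name = (row["name"] or "").strip()
        # Assumed rule: normalise country codes, flag missing ones explicitly.
        country = (row["country"] or "").strip().upper() or "UNKNOWN"
        cleaned.append((int(customer_id), name, country))
    return cleaned


def load(records, conn):
    """Load: write the cleansed records into a (hypothetical) customers table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform and load, then return what was stored."""
    load(transform(extract(text)), conn)
    query = "SELECT customer_id, name, country FROM customers ORDER BY customer_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for record in run_pipeline(RAW_CSV, connection):
        print(record)
```

Keeping each stage as a separate function is one possible design choice: it lets the cleansing rules be tested on their own, and the data scientist only ever queries the loaded table.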

This means that data engineers focus on programming, systems and data management skills rather than quantitative skills.

The right tools can really help the data engineer to deliver more quickly and more cost-effectively. This is a topic we will cover at next week’s DAMA Johannesburg Chapter meeting: How does big data change data integration?

Hope to see you there!


