build your enterprise data hub

Cloudera and Hortonworks merge – good news for customers

Late last year, Cloudera and Hortonworks announced plans to merge – a move that took effect early this year. According to Tendü Yoğurtçu, CTO of our partner Syncsort, the two companies have emerged as clear winners in the data space and gained momentum. “But each has had its own unique strengths,…

ETL ELT architecture

What is ETL?

ETL defined: Extract, Transform and Load (ETL) is a standard information management term for the process of moving and transforming data. ETL is commonly used to populate data warehouses and data marts, and for data migration, data integration and business intelligence initiatives. ETL processes can be built by manually writing custom scripts or code, with…

Turning “No” into “Yes” through data governance

Data governance has become associated with the word “No.” This is the intriguing start to a recent IDC perspective discussing how Cox Automotive bucked the trend and used data governance to become the “Yes” team – delivering the right data to the right person at the right time. The report, which can be accessed here,…

Is data engineering overtaking data science?

When Harvard Business Review first touted data scientist as the sexiest job of the 21st century back in 2012, the role was still in its infancy. The promise of advanced analytics and the insights that businesses could gain – about their customers, their interactions, their products and everything else – were rightly identified as potential gold. Data…

What is the role of the data engineer?

Data engineering is the term that has emerged to describe the tasks related to delivering useful data for analytics – particularly in relation to data science. With between 60% and 90% of the effort of most big data projects allocated to data engineering tasks, the role has matured as organisations found that traditional data scientists…