How healthy is data driven healthcare

Over the past few years, the ethics of data use, particularly when related to big data analytics, has been a recurring theme. Most recently, I had a look at the cost of the Facebook data scandal. In this post, I moved away from a purely ethics driven argument to also look at the commercial impact…

Are you making the most of your mainframe data?

When most people think of legacy software, they think of software that is outdated and due for replacement. Yet an alternative definition of legacy, particularly when it comes to mainframe applications, is simply software that works. This is a definition that our partner, Syncsort, is proud of. The legacy DMX Sort product has been helping…

Good data or bad data – do you really care

A short post this week – I am still travelling. Bias in decision making can trump data – as discussed in Is Bias the 7th data quality metric – and Is High Cholesterol bad? On the global stage – the analytics that predicted a Hillary Clinton victory in the 2016 US election was, in…

big data visualisation

Q and A on Data & Analytics

Let’s define data and analytics and how specifically they are interpreted within the business context. Within a business context, many people consider data and information to be one and the same. While one can argue the definition of each, when talked of in conjunction with analytics, data becomes a resource to be interrogated for a…

The Impact of Poor Data Quality on Machine Learning

We are surrounded by a huge amount of data. Data is everywhere and is gaining huge importance and relevance in today’s world. Many firms gather, retrieve and manage data. This requires systems that can help us handle that much data. Machine Learning has helped us in gathering…

gender bias

Is “Bias” the 7th big data quality metric

A few weeks back I wrote about The 6 dimensions of big data quality. These are: Coverage – how well does the data source meet (or fail to meet) the business need? Continuity – how well does the data set cover all expected or needed intervals? Triangulation – how consistent is data when measured from…