Data quality is about fitness for purpose

While some question the definition of data quality as data that is fit for purpose, we argue that this definition is both valid and appropriate.


The most common definition of data quality is that it is data that is “fit for purpose”.


In this series of thought-provoking posts, Henrik Liliendahl Sørensen raises questions about this definition.

  1. The Principle of “Fit for Purpose”
    1. Building a House Analogy
    2. Balancing Cost-Effectiveness and Necessity
  2. A Value-Driven Approach to Data Management
  3. The Subjectivity of Data Quality
  4. Conclusion

While acknowledging the importance of data being suitable for its intended use, Henrik expresses concerns about the adaptability of such data for multiple purposes over time.

He suggests that, at a certain point, it becomes more cost-effective to model the real-world object itself rather than continuously adapting the data to meet new requirements.

The Principle of “Fit for Purpose”

“Fit for purpose” is a legal term that signifies something is sufficiently capable of fulfilling its intended function. This principle of being “good enough” plays a crucial role in our approach to data management by value. Over-engineering a solution is neither practical nor cost-effective.

Building a House Analogy

To illustrate the concept further, let’s consider building a house.

It is reasonable to ensure that the house is fit for its purpose, such as having a bathroom, a kitchen, and bedrooms, being water-tight, and having heating and running water. However, the specific design and features may vary depending on factors like family size, budget, and personal preferences. Planning for every conceivable use of the house, such as incorporating a manufacturing centre or a surgery room, would be neither pragmatic nor cost-effective.

Balancing Cost-Effectiveness and Necessity

Similarly, when working with data, we must strike a balance between cost-effectiveness and necessity. Henrik’s concern about poorly thought-out tactical solutions that lack scalability is valid. There may come a point where an enterprise-wide view of data necessitates a redesign of these tactical solutions. This does not contradict the “fit for purpose” definition.

A Value-Driven Approach to Data Management

Taking a value-driven approach to data management allows us to leverage multiple uses of data across the enterprise, minimizing costs and maximizing reuse. Data governance principles can guide the planning of projects to achieve tactical goals in a cost-effective manner without compromising the ability to address additional goals in the future.

The Subjectivity of Data Quality

Ultimately, the quality of data lies in the eye of the beholder. There is no universal benchmark to strive for. If the data meets the requirements and objectives of a specific use case, it can be considered of good quality. Data quality is a context-dependent evaluation.
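As an illustrative sketch (hypothetical records and rules, not taken from the post), the same data can be fit for one purpose and unfit for another:

```python
# Hypothetical customer records for illustration only.
customers = [
    {"id": 1, "email": "ann@example.com", "postcode": None},
    {"id": 2, "email": "bob@example.com", "postcode": "SW1A 1AA"},
]

def fit_for_email_campaign(records):
    # An email campaign only needs a plausible email address per record.
    return all("@" in (r.get("email") or "") for r in records)

def fit_for_postal_mailing(records):
    # A postal mailing additionally needs a postcode for every record.
    return all(bool(r.get("postcode")) for r in records)

print(fit_for_email_campaign(customers))  # True: fit for this purpose
print(fit_for_postal_mailing(customers))  # False: unfit for this one
```

The data has not changed between the two checks; only the purpose, and therefore the verdict on its quality, has.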


Conclusion

In conclusion, Henrik Liliendahl Sørensen’s posts challenge the conventional definition of quality data as “fit for purpose.” While acknowledging the importance of suitability for intended use, Henrik emphasizes the need to balance cost-effectiveness with adaptability. A value-driven approach to data management, guided by data governance principles, enables organizations to achieve their goals efficiently and effectively. Ultimately, data quality is subjective and dependent on specific requirements and objectives.

Response to “Data quality is about fitness for purpose”

  1. Alan Snow

    The term ‘Fit for purpose’ is a legal term, more relevant to Courts and Lawyers than to quantifying Data Quality. It is used as a test to settle a legal dispute, whereas data quality management is an ongoing process aimed at improving the quality (and thereby, the VALUE) of data.
    I agree that improving data quality has to bring real world benefits and not just be an end in itself.

    Nor is the term ‘good enough’ an appropriate way of viewing data quality. “Good enough” suggests a minimum acceptable quality level; a level aimed at minimising costs rather than maximising value.
    The more relevant question is: “Can we realise benefits through improving data quality that are worth more than the investment costs?”

    If the expected benefits outweigh the costs, we should improve the data quality.

    The ‘fitness’ of data (or ‘value of data’, as I prefer to view it) is not directly measurable; it is an interpretation of the metrics objectively measured across all data quality dimensions (Completeness, Validity, Conformity to Business Rules, etc.) using standard data quality software. Only when the results of these measurements have been evaluated, and a cost/benefit analysis of possible improvement actions has been made, can a view be taken on whether the data is ‘good enough’ or not (i.e. whether the anticipated benefits are worth the investment costs).
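The comment's distinction between objectively measured dimension metrics and the subjective ‘good enough’ judgement can be sketched like this (hypothetical records and rules, for illustration only):

```python
# Hypothetical records for illustration only.
records = [
    {"email": "ann@example.com", "phone": "+44 20 7946 0000"},
    {"email": "not-an-email", "phone": None},
    {"email": None, "phone": "+44 161 496 0000"},
]

def completeness(records, field):
    # Completeness: share of records where the field is populated.
    return sum(1 for r in records if r.get(field)) / len(records)

def validity(records, field, is_valid):
    # Validity: share of populated values that pass a validity rule.
    values = [r[field] for r in records if r.get(field)]
    return sum(1 for v in values if is_valid(v)) / len(values)

# Objective measurements across two dimensions.
email_completeness = completeness(records, "email")               # 2/3
email_validity = validity(records, "email", lambda v: "@" in v)   # 1/2
```

The functions produce objective numbers; deciding whether those numbers are acceptable remains a cost/benefit judgement, as the comment argues.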



