Peter Benson, quoted by Jim Harris, has very nicely expressed one of my pet bugbears – “Mention the word metadata and you have immediately lost all but the hard core techies and they have neither the authority nor the budget to solve the problem.”
As data management professionals, we all understand the importance of metadata. The problem we face is that the concept of metadata – information about information – is all-encompassing and means different things depending on the context and who you are talking to.
IT Architects argue eloquently for the need for a consolidated metadata repository. However, a single repository that holds everything from the business definition of a term (“How do we calculate GROSS PROFIT?”), to a dictionary of data definitions (“Where and how do we store Product codes?”), to active data statistics (“How many Null values do we have for Client ID in the Orders table?”) is neither technically practical nor, in my opinion, particularly useful. The volume of work required to capture this information in a single place means that it will be out of date practically as soon as it starts – and it will never be completed.
At the very least, the practical implementation of such a repository will require significant rework – a lot of metadata is stored already – as ER diagrams, in process models, as data flows or in your Data Profiling tool.
For me the problem boils down to the fact, as discussed in Jim’s post The Metadata Continuum, that metadata in itself has no standards.
I would like to suggest that we, the data management community, need to be a bit more specific and a bit more pragmatic if we want to get business support. If we are trying to create consensus as to the “calculation of Gross Profit” or “the definition of a Customer” maybe these (and similar concepts) can be categorised as our Business Glossary (say)? Maybe our Data Quality rules are classified as just that – Data Quality rules?
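As a minimal sketch of what this split could look like in practice, the example below keeps a Business Glossary entry and a Data Quality rule as two separate, small structures rather than one all-encompassing repository. Everything in it is an assumption made for illustration: the class names `GlossaryTerm` and `DataQualityRule`, the field names, the sample orders, and the textbook formula for Gross Profit (revenue minus cost of goods sold), which your business may well define differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GlossaryTerm:
    """A business definition (hypothetical structure, for illustration only)."""
    name: str
    definition: str
    formula: Callable[[float, float], float]


@dataclass(frozen=True)
class DataQualityRule:
    """A data quality rule, kept apart from the glossary (hypothetical structure)."""
    name: str
    check: Callable[[dict], bool]


# Business Glossary: one agreed formula that every report can reuse.
# Assumption: Gross Profit = revenue - cost of goods sold.
gross_profit = GlossaryTerm(
    name="Gross Profit",
    definition="Revenue minus cost of goods sold",
    formula=lambda revenue, cogs: revenue - cogs,
)

# Data Quality rules: every order must carry a Client ID.
client_id_present = DataQualityRule(
    name="Client ID is populated",
    check=lambda order: order.get("client_id") is not None,
)

# Made-up sample data standing in for an Orders table.
orders = [
    {"client_id": "C001", "revenue": 120.0, "cogs": 70.0},
    {"client_id": None, "revenue": 80.0, "cogs": 50.0},
]

# The rule and the glossary term are applied independently of each other.
failures = [o for o in orders if not client_id_present.check(o)]
total_gross_profit = sum(
    gross_profit.formula(o["revenue"], o["cogs"]) for o in orders
)

print(f"{len(failures)} order(s) fail '{client_id_present.name}'")
print(f"Total {gross_profit.name}: {total_gross_profit}")
```

Each piece can be owned, agreed and maintained separately, which is the point: the business can sign off on the glossary entry without ever needing to look at the data quality rules.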
In my opinion, business may be interested in ensuring that all BI reports calculate Gross Profit using the same formula, and as a result show consistent values. They probably don’t care about a metadata repository that will link every piece of information (at every level) in the business.
By breaking the metadata challenge down into smaller pieces we can make it more achievable and may find it easier to get buy-in from business stakeholders. At the very least – hopefully – we will all be talking about the same thing!