Anybody that has studied English grammar, whether as their native tongue or as a foreign language, has at some stage been dumbfounded by one of the many exceptions to the basic rules as defined in grammar and spelling.
An entertaining article by the Financial Times’ Michael Skapinker, discusses how the English language is constantly adapting to changing cultural circumstances. This means that grammar and spelkling that was at one time acceptable, becomes unacceptable. Eventually, these original uses of language may become common again – by definition changing the rules of language again.
Any data intensive project, and any data governance program, need to absorb two critical lessons from this experience.
1.) The grammar and spelling rules for data need to be defined to ensure that data is fit for purpose. Data rules (called metadata) may include a business glossary of terms, definitions of calculated fields, key business indicators (data quality rules) and attribute defintions, to name a few. Ensuring common use and reducing ambiguity amonst the data users helps to ensure that data is fit for reuse and is therefore a critical success factor for enterprise projects such as data governance and master data management
2.) These rules must be living! Modern business drivers require the ability to adapt quickly and continously to changing business circumstances – whether driven by new legislation, by mergers or acquisitions or simply by competing for a new client segment or adapting to a competitor’s new product launch. Metadata that is buried in passive documentation is likely to be ignored by business. Active metadata, on the other hand, is exposed and used by the business to measure, validate and track data compliance to business needs. New business requirements can be documented, for example in the enterprise data quality suite, which supports the implementation of these rules across various enterprise systems.
Like English spelling and grammar, last year’s rules may no longer be relevant or correct. Metadata that is being used by systems and business users is far more likely to be kept current, and, in staying current maintain its value.
This post was first published on the dataqualitymatters blog