Dropping down to data quality

Presenting the user with a predefined list of options may seem like a simple way to solve data quality problems like the one described last week.

Yet, as anyone who has worked with data can confirm, drop down lists present their own challenges that make them inappropriate for many situations.

So, when can we use a drop down list to improve data quality?

Only when we have a relatively small, finite list of options.

  • Small, because poor performance can make rendering large lists impractical. Very few users are prepared to wait for more than a few seconds for a list to render. Any more than that and impatience will kick in – increasing the chance that the user will capture an incorrect value.
  • Finite, because lists must be complete. A list that is missing a valid option will, once again, result in the user capturing an invalid value. This time, however, it will not be through choice.

Setting up a valid list of choices may be harder than it appears.

Take a simple drop down list to capture the gender indicator for a customer.

Should be simple, right? Male or Female!

Yet, it is not that easy. What if the customer is not a person (a company or a trust, for example)? Shall we add Not Applicable?

What if we don’t know the gender? Unknown?

Is Unknown different from Refused to Disclose? We wouldn’t want to be accused of gender discrimination!

And, on that note, should we cater for gay, lesbian and transgender designations? Who should make this kind of decision?
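To make the questions above concrete, here is a minimal sketch of how such a list might look once the options have been agreed. The codes, the name `GenderIndicator` and the helper functions are all hypothetical and for illustration only; which options actually belong in the list is exactly the business decision discussed here.

```python
from enum import Enum


class GenderIndicator(Enum):
    """Hypothetical reference list; codes and labels are illustrative only."""
    MALE = "M"
    FEMALE = "F"
    NOT_APPLICABLE = "NA"      # customer is not a person (company, trust)
    UNKNOWN = "U"              # value was never captured
    REFUSED_TO_DISCLOSE = "R"  # customer was asked and declined


def parse_gender(code: str) -> GenderIndicator:
    """Return the list entry for a captured code; raises ValueError if not in the list."""
    return GenderIndicator(code.strip().upper())


def dropdown_options() -> list[tuple[str, str]]:
    """(code, label) pairs that a form could render as a drop down list."""
    return [(g.value, g.name.replace("_", " ").title()) for g in GenderIndicator]
```

Keeping Unknown and Refused to Disclose as separate entries preserves the distinction raised above: one means nobody asked, the other means the customer declined to answer.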

Are you governing your drop down lists?

  1. A relatively simple list can quickly become very complex.
  2. How will this list be represented across different systems that use it – can we use the same values everywhere or do we need to map codes across systems and business units?
  3. Drop down lists are a simple form of reference data and should be governed to ensure fitness for purpose and consistency across systems.
  4. Creating a reference list must be led by the business; this is too important to be left to a technical resource or team.
  5. Consider how data will be used from both a reporting and operational perspective when defining your lists.
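The cross-system mapping question in point 2 can be sketched as a simple lookup from each system's local codes to one governed set of values. Everything here is an assumption made for illustration: the system names (`crm`, `billing`), their local codes and the canonical values are invented, not taken from any real system.

```python
# Hypothetical governed values for the gender indicator example.
CANONICAL = {"MALE", "FEMALE", "NOT_APPLICABLE", "UNKNOWN", "REFUSED_TO_DISCLOSE"}

# Hypothetical local codes per system, each mapped to a governed value.
SYSTEM_CODE_MAPS = {
    "crm": {"M": "MALE", "F": "FEMALE", "N/A": "NOT_APPLICABLE", "?": "UNKNOWN"},
    "billing": {"1": "MALE", "2": "FEMALE", "9": "NOT_APPLICABLE", "0": "UNKNOWN"},
}


def to_canonical(system: str, local_code: str) -> str:
    """Translate a system's local code to the governed value; KeyError if unmapped."""
    return SYSTEM_CODE_MAPS[system][local_code]


def unmapped_values(system: str) -> set[str]:
    """Governed values for which the given system has no local code."""
    return CANONICAL - set(SYSTEM_CODE_MAPS[system].values())


print(to_canonical("billing", "2"))
print(unmapped_values("crm"))
```

A check like `unmapped_values` is one way a governance process might spot that a system cannot represent an option the business has agreed on, such as Refused to Disclose in this invented example.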

Anyone who has set up a statistical survey will know that reducing the choices gives us more easily measurable results (grouping results into meaningful groups or dimensions). However, leaving valid options out will skew results by forcing users to respond with a less accurate answer.

It is critical that drop down lists are included as part of your reference data governance process.


Reference Data Governance in Collibra


The alternative will deliver a scary ride!


 Image sourced from http://commons.wikimedia.org/wiki/File:Fahrenheit_Drop).jpg

