A guide to South African address data quality


A recent decision by the South African high court compels the City of Tshwane to reinstate old street names that were controversially changed in 2012.

This is a good example of the complexity inherent in dealing with South African name and address data.

Twenty-odd years ago, the marketing department of my then employer ran a competition to change the names of the meeting rooms inside our building, from bird names to various other options.

To the marketing manager’s great disappointment, the option that won was to keep the names the same. Otherwise we would have spent three years saying, "The meeting is in Marula (the one that used to be Bluebird)."

Of course, the history of our country means that decisions have to be made to correct historical imbalances, and in some cases name changes are seen as a means to achieve that.

We went from 4 provinces and 10 nominally self-governing homelands to 9 provinces.

We have numerous city or town names that have changed, along with a multitude of new suburbs.

Of course, name changes aren’t a new phenomenon.

We take a trip down to Durban for our summer break, not to Port Natal.

The Pretoria military base, Thaba Tshwane, got its new name in 1998, after being known for many years as Voortrekkerhoogte. How many remember that Voortrekkerhoogte only came into being in 1939? The base was originally named Roberts Heights after the British Boer War commander, Lord Roberts.

Over time, older names will fade out of use.

But in the short term, like our meeting rooms, both old and new names will continue to be used for some time.

How do you handle this complexity in your corporate databases?

Managing name changes and variations

From a data quality perspective, this means that we must cater for all names and, where possible, translate each one to a single standard variation.

Of course, this challenge extends beyond name changes.

Cape Town is also known, legitimately, as Kaapstad.

Aliwal Noord, Somerset West, Bezuidenhoutsvallei and Magaliesig are all examples of place names that have legitimate representations in two languages, and there are thousands more.
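
The translation step can be sketched as a simple lookup. This is only an illustration, not our rules engine: the alias table, the function names and the choice of which variation counts as the standard are all assumptions made for the example, using place names mentioned in this post.

```python
from typing import Optional

# Hypothetical alias table: every known variation maps to one standard name.
# Which variation is treated as "standard" here is an assumption for the example.
ALIASES = {
    "cape town": "Cape Town",
    "kaapstad": "Cape Town",            # Afrikaans form
    "durban": "Durban",
    "port natal": "Durban",             # historical name
    "thaba tshwane": "Thaba Tshwane",
    "voortrekkerhoogte": "Thaba Tshwane",  # previous name
}


def normalise_key(name: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(name.lower().split())


def standardise_place(name: str) -> Optional[str]:
    """Return the standard name for a known variation, or None if it is unknown."""
    return ALIASES.get(normalise_key(name))


if __name__ == "__main__":
    for captured in ["Kaapstad", "  PORT  NATAL ", "Voortrekkerhoogte", "Atlantis"]:
        print(repr(captured), "->", standardise_place(captured))
```

Returning None for an unknown name, rather than guessing, leaves the decision about unmatched values to a later step.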

Last year I asked "Is quality address data still relevant?" It is!

I followed this up with "7 business drivers for Quality address data".

We have invested heavily in ensuring that we can manage the complexities of South African address data.

This post just scratches the surface.

When, for example, we ran over 100 million South African name and address records through our rules engine (for a project we delivered for the South African government), we discovered more than 600 spelling variations of East London.

We have chosen to enhance our rules engine to correct these (and tens of thousands of other common misspellings) and to standardise on the list provided by the South African Post Office (PAMMS).

This means that Aest London, for example, will be corrected to East London. Where the captured value is Oos Londen, which is valid, we retain the original and add the alternate, East London, which can be used for geocoding or matching.
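
As a rough illustration of that behaviour, the sketch below corrects a misspelling, keeps a valid alternate as captured, and adds the standard name alongside it. It is not how our rules engine works internally: the reference table is a tiny hand-made stand-in for the Post Office list, the names `VALID_NAMES`, `CleanedPlace` and `clean_place` are hypothetical, and fuzzy matching with Python's `difflib` is simply a convenient substitute for a proper set of correction rules.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Optional

# Hypothetical reference data: lookup key -> (valid name, standard name).
# A stand-in for a real reference list, which is not reproduced here.
VALID_NAMES = {
    "east london": ("East London", "East London"),
    "oos londen": ("Oos Londen", "East London"),  # valid Afrikaans form
    "cape town": ("Cape Town", "Cape Town"),
    "kaapstad": ("Kaapstad", "Cape Town"),
}


@dataclass
class CleanedPlace:
    captured: str            # value exactly as received
    retained: Optional[str]  # valid name kept on the record (corrected if misspelt)
    standard: Optional[str]  # standard name to use for geocoding or matching


def clean_place(captured: str, cutoff: float = 0.8) -> CleanedPlace:
    """Correct a misspelt place name and attach its standard alternate."""
    key = " ".join(captured.lower().split())
    if key not in VALID_NAMES:
        # Not an exact match: look for the closest valid name above the cutoff.
        matches = get_close_matches(key, list(VALID_NAMES), n=1, cutoff=cutoff)
        if not matches:
            return CleanedPlace(captured, None, None)
        key = matches[0]
    retained, standard = VALID_NAMES[key]
    return CleanedPlace(captured, retained, standard)


if __name__ == "__main__":
    for value in ["Aest London", "Oos Londen", "Xyzzy"]:
        print(clean_place(value))
```

Keeping the captured value, the retained valid name and the standard name as separate fields means nothing is lost: the customer still sees the name they supplied, while matching and geocoding work from a single variation.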

The work we have done gives you a significant head start when dealing with your South African address data quality problems – contact us for a Proof of Concept.

Image sourced from thenewage.com