#DataQuality In the Wild, Some Where…

Jul 7, 2011   //   by Karen Lopez   //   Data, Data Modeling, Database  //  6 Comments



This is why you should never believe users when they say they NEVER have international data in their databases.

I understand that this letter was probably mailed using some sort of application that has no room for a Country data field on the address.  I get mail from the US all the time with a handwritten, taped or otherwise appended "Canada" on the envelope.

Business users tell me all the time that they are 100% sure they have no international data in their systems.  When we dive in to see what they actually have, we find all kinds of "workarounds" that end users have used to wedge that data into their applications and databases.  In fact, I’ve been guilty of that myself.

The C/O trick, pictured in this post, is a common one.  Other tricks I’ve seen:

  • Handwriting the country on the see-through window pane of the envelope.  This often rubs off between the sender and my mailbox.
  • Using another field, such as Mailstop or Box #
  • Using "sounds like" choices, such as OH for ON
  • Using "fake" ZIP Codes like 90210, 99999 or 12345 when a postal code isn’t accepted by the application.
  • Adding the country to the end of my name.  I kind of like the sound of Karen Canada, but I’m not sure my postie is going to get that mail to me.
  • Just leaving the country off the address and hoping that the mail gets directed correctly.
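If you want to go looking for these workarounds in your own data, a rough profiling pass can surface some of them. The sketch below is only a starting point: the field names (name, care_of, mailstop, city, state, zip), the short list of country words and the Canadian postal code pattern are all assumptions made for illustration, not anything taken from a real system. It also can't catch every trick. An "OH for ON" swap looks like perfectly valid data, and 90210 is a real ZIP Code, so treat a flag as a hint to investigate rather than proof.

```python
import re

# ZIP Codes that often show up as stand-ins when a real postal code is rejected.
# 90210 is a genuine ZIP Code, so a hit here is only a hint.
PLACEHOLDER_ZIPS = {"90210", "99999", "12345"}

# Illustrative list only; a real check would use a full country list.
COUNTRY_WORDS = ("canada", "mexico", "united kingdom")

# Loose pattern for a Canadian postal code such as "M5V 2T6".
CA_POSTAL = re.compile(r"\b[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d\b")

# Hypothetical field names; adapt them to your own schema.
TEXT_FIELDS = ("name", "care_of", "mailstop", "city", "state")


def find_workarounds(record: dict) -> list[str]:
    """Return a list of flags describing suspicious values in one address record."""
    flags = []

    zip_code = (record.get("zip") or "").strip()
    if zip_code in PLACEHOLDER_ZIPS:
        flags.append("placeholder_zip")

    for field in TEXT_FIELDS:
        value = (record.get(field) or "").lower()
        # A country name sitting in a field that is not meant to hold one.
        if any(re.search(rf"\b{re.escape(word)}\b", value) for word in COUNTRY_WORDS):
            flags.append(f"country_in_{field}")

    for field in TEXT_FIELDS:
        value = record.get(field) or ""
        # A foreign postal code tucked into a free-text field.
        if CA_POSTAL.search(value):
            flags.append(f"postal_code_in_{field}")

    return flags


if __name__ == "__main__":
    sample = {"name": "Karen Canada", "care_of": "", "zip": "99999"}
    print(find_workarounds(sample))
```

Running something like this over an address table and counting the flags gives you evidence to bring back to the business users who are sure there is no international data in the system.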

I will concede that employees are bending or breaking the rules when they accept international data if the policy is that they should not.  But even when applications strictly enforce these rules, organizations still end up with that data.  It is just much harder to find, and it is most likely poor quality data at that.
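One alternative to strict US-only enforcement is to give the country its own field and apply the US rules only to US addresses. The sketch below assumes a hypothetical Address record with a two-letter country code that defaults to "US"; the names and the default are illustrative choices, not a recommendation for any particular system.

```python
import re
from dataclasses import dataclass

# Five digits, optionally followed by a hyphen and four more (ZIP+4).
US_ZIP = re.compile(r"\d{5}(-\d{4})?")


@dataclass
class Address:
    name: str
    street: str
    city: str
    region: str            # state, province, county, ...
    postal_code: str
    country: str = "US"    # two-letter country code; the default is an assumption


def validation_errors(addr: Address) -> list[str]:
    """Return human-readable problems; an empty list means the address passes."""
    errors = []
    if not re.fullmatch(r"[A-Z]{2}", addr.country):
        errors.append("country must be a two-letter code")
    # Apply the ZIP Code format only where it belongs.
    if addr.country == "US" and not US_ZIP.fullmatch(addr.postal_code):
        errors.append("US postal code must look like 12345 or 12345-6789")
    return errors


if __name__ == "__main__":
    toronto = Address("Karen Lopez", "1 Example St", "Toronto", "ON", "M5V 2T6", "CA")
    print(validation_errors(toronto))
```

With the country captured explicitly, a Canadian postal code no longer has to be disguised as a fake ZIP Code, and the international records can be found with a simple filter on one field.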


  • You bring up an interesting point.
    By trying to restrict international data, organizations may still end up with the poor quality data you mention.
    By not restricting the data, the rogue data will be easier to find and better quality, but there will probably be more of it.

    It’s really hard to balance those two choices, isn’t it?

    • That’s a double kick as the data gets recorded anyway and the data that should be collected gets contaminated too.

  • This reminds me of some of the many issues in our system, but our database can handle international addresses and formats them to that country’s standards. But we have had mailings done by the print house who could not handle international addresses, even though the CSV file I sent them had them formatted. So, they printed the US addresses normally, but hand wrote the international ones. We didn’t find out until one was returned. If they had just said something we could have mailed the international ones on our own with printed labels.

    I have seen some interesting methods for entering addresses, putting the country in the state or city spot is most common. I have never come across “c/o Canada” and am surprised that it even made it to you.

    • You bring up a great point – even when we build systems that support the real world, others are bound to still try to wedge their limited view of the world onto our data.

  • […] as Karen Lopez recently pondered in the post Data Quality in The Wild, Some Where …, actually everyone, even in the United States, has some international data somewhere looking very […]

  • I have been on both sides of this (user and developer). From the user’s point of view, a system that accepts data the way it is received, has a place for every piece of information, accepts that data will occasionally be ‘missing’ and doesn’t apply impossible rules is a wonderful experience. Having to cope with one that doesn’t is a chore.
    As a user of a website, entering my address details and finding that my UK postcode isn’t acceptable in the US ZIP Code field and that there is no “Non US” option in the State list simply moves me on to make a purchase (or subscribe for info, etc.) elsewhere.

    Having fought these issues on a regular basis (dozens of times a day when working with a CRM system at a previous job) has guided me to make systems that sympathise with the people who will be using them. I hope I succeed.
