Countries of birth lookup table with subcategories

  • Hello All,

    We have a table for countries of birth that has bad old legacy data such as a country of "Africa". We have to preserve this "country" for searches but not new data entry, so the entry is marked inactive, meaning that it will not be used on the dropdowns for new data. However, the new data entered for African countries will be "Kenya" or another real country. I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya". Does anybody have experience with this problem? I am thinking of a table containing country of birth categories by category set such as "continent". Then we could divide the countries other ways. Would someone give me some direction?

    Thanks,

    DD

  • It has been a long time but I implemented an International Trade Database.

    It had a Codes for the Continent, Sub Continent and Country with Lookup Tables for each.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I don't know how much data you already have in your database but you could add an Unspecified for the records that do not have a country but have a continent.

    Update the records to convert existing recors to the codes if it is feasible to do so.

    Create and load the Continent Tables and Create PK's.

    Add FK's and Not Null Contraints on the Country and Continent Codes Columns

    You can use any of these standards for Country Codes:

    http://www.unc.edu/~rowlett/units/codes/country.htm

    http://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_by_continent_(data_file)

    http://www.oscommerce.com/community/contributions,5177

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Diane Davis (5/19/2011)


    I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".

    Question is... is that what business rules are telling you it should happen?

    If the answer is Yes, go for it.

    If the answer is No, don't try to fix what is not broken.

    An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".

    An even better alternative would be to clean up the data and ensure data reflects reality.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (5/22/2011)


    Diane Davis (5/19/2011)


    I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".

    Question is... is that what business rules are telling you it should happen?

    If the answer is Yes, go for it.

    If the answer is No, don't try to fix what is not broken.

    An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".

    An even better alternative would be to clean up the data and ensure data reflects reality.

    Sounds familiar. 😀

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • PaulB-TheOneAndOnly (5/22/2011)


    Diane Davis (5/19/2011)


    I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".

    Question is... is that what business rules are telling you it should happen?

    If the answer is Yes, go for it.

    If the answer is No, don't try to fix what is not broken.

    An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".

    An even better alternative would be to clean up the data and ensure data reflects reality.

    Sounds familiar. 😀

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I am gathering requirements to see if I have the business rules right.

    I can't get the data cleaned; that wasn't my decision to make.

    Thanks for everyone's comments.

    DD

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply