May 19, 2011 at 5:47 pm
Hello All,
We have a table for countries of birth that has bad old legacy data such as a country of "Africa". We have to preserve this "country" for searches but not new data entry, so the entry is marked inactive, meaning that it will not be used on the dropdowns for new data. However, the new data entered for African countries will be "Kenya" or another real country. I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya". Does anybody have experience with this problem? I am thinking of a table containing country of birth categories by category set such as "continent". Then we could divide the countries other ways. Would someone give me some direction?
Thanks,
DD
May 21, 2011 at 11:00 am
It has been a long time but I implemented an International Trade Database.
It had a Codes for the Continent, Sub Continent and Country with Lookup Tables for each.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 21, 2011 at 5:24 pm
I don't know how much data you already have in your database but you could add an Unspecified for the records that do not have a country but have a continent.
Update the records to convert existing recors to the codes if it is feasible to do so.
Create and load the Continent Tables and Create PK's.
Add FK's and Not Null Contraints on the Country and Continent Codes Columns
You can use any of these standards for Country Codes:
http://www.unc.edu/~rowlett/units/codes/country.htm
http://www.oscommerce.com/community/contributions,5177
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 22, 2011 at 5:51 pm
Diane Davis (5/19/2011)
I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".
Question is... is that what business rules are telling you it should happen?
If the answer is Yes, go for it.
If the answer is No, don't try to fix what is not broken.
An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".
An even better alternative would be to clean up the data and ensure data reflects reality.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.May 23, 2011 at 12:19 pm
PaulB-TheOneAndOnly (5/22/2011)
Diane Davis (5/19/2011)
I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".Question is... is that what business rules are telling you it should happen?
If the answer is Yes, go for it.
If the answer is No, don't try to fix what is not broken.
An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".
An even better alternative would be to clean up the data and ensure data reflects reality.
Sounds familiar. 😀
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 23, 2011 at 12:21 pm
PaulB-TheOneAndOnly (5/22/2011)
Diane Davis (5/19/2011)
I am thinking about adding a category to each entry so that "Kenya" will have a "continent" category of "Africa". Then the searches can be written so that the search for "Africa" can return both people marked "Africa" and people marked "Kenya".Question is... is that what business rules are telling you it should happen?
If the answer is Yes, go for it.
If the answer is No, don't try to fix what is not broken.
An alternative solution would be to identify the "continent" of every single country then treat the one with bad data e.g. country=Africa as "unassigned".
An even better alternative would be to clean up the data and ensure data reflects reality.
Sounds familiar. 😀
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 23, 2011 at 3:45 pm
I am gathering requirements to see if I have the business rules right.
I can't get the data cleaned; that wasn't my decision to make.
Thanks for everyone's comments.
DD
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply