December 7, 2006 at 4:41 am
Hi guys, I am generating new scripts for novice users of SQL to use and was wondering if anybody had an opinion on the best way of producing the tables for my team to work on.
Steps:
- Create table with Raw Supplier and address info
- Clean Data
- Add clean data to Raw Supplier Table with additional fields, i.e. Clean_add_1, Clean_add_2
OR
- Create Clean_Supplier Table with only clean data & should they need the original Supplier table they can match back on Supplier_ID
Any suggestions on which method to use and why? Also, when running queries would it be better to have all the data in one master table or in various tables and do joins.
Thanks in advance.
December 7, 2006 at 11:04 am
It sounds like your source data is a bit "dirty" and you want to clean it up to make it easier for them to work with while they are learning. If that's the case, I wouldn't perform any cleanup whatsoever, as that's the data they'll be dealing with in the future, so they'll need to learn how to handle the dirty data issues, via filtering, CASE statements, etc.
As for "joins versus a denormalized data structure", I usually use views to denormalize the relational model for casual users, while helping the power users get a good handle on how joins work so that they can go where the views can't take them.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply