September 8, 2011 at 8:27 pm
I have landed a brand new table named LabData. It has a sequence number that is the pk, but i can see plain as day there are duplicates, so this is what i did;
i selected the distinct rows in the table (ex the seq num) so i knew what 'done' was, my assumption being that whatever key i found would be not null and uniquely identify the row, so the number of distinct rows was my magic number...e.g., what i had to wind up with at the end.
then, i started selecting distinct columns out of the table, a column at a time, based on what i *thought* made sense, so, for example, i started with MemberID, but that only yielded about half the rows, so i added DateOfService, that returned about 70% of the rows, then added Provider and that gave about 80%, then CPT Code, that returned 97% and so on, until i had a list of about eight columns, the combination of which was enough to uniquely indentify all the distinct rows of the table.
Now for my question.
Had i indexed the whole migilla to cover the entire table, iow, a non clustered index including every column, is there metadata listing the number of steps between values and the values themselves, that would have let me see the combination of rows needed to create the composite primary (business) key from the metadata table, instead of trial and erroring my way into it?
thanks very much for your time and attention
drew
September 9, 2011 at 12:38 am
The answer to your question is no.
Usually a Candidate Key is obvious.
I have queried individual columns to identify what columns or combination of column make a record unique. It is not a bad thing for you get familiar with you data.
You do not want to create indexes that are wide I would recommend that you exercise caution.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
September 10, 2011 at 7:48 am
thanks very much.
i didnt know enough about the data to decide what the candidate keys were and which should be primary, indeed, if it were obvious, it would not have taken seven iterations to determine the eight columns that guaranteed uniqueness.
A surrogate key does nothing to prevent syntactic duplication, it makes everything unique...that's why spelunking through lab data was necessary.
the idea of creating a table wide index was not to use that index, but to use its index statistics to determine uniquness.
i thought that there were index statistics that would facilitate picking keys instead of backing into them.
thanks again
drew
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply