June 29, 2010 at 7:40 am
I inherited what I consider a poorly designed set of tables that we import data into regularly, the current field that is being used for the key can have duplicates so I am changing it to have a surrogate key for future data. I am "retrofitting" the tables to use this surrogate key for the existing data also and have constructed a query to accomplish this.
The "Num_ICN" field is the current key field. I have added fields to all tables to aid uniqueness and the surrogate key. The Fla_Inst_Main table is the parent table in this set of 13 tables. Originally I created several update queries, one for each table, to update the fields I just added, while maintaining data integrity. After thinking about it a bit more, I decided to make the whole thing a bit more compact, since the same query needs to run on each table, and they need to be done in batches to keep the log from getting out of hand. The tables have as many as 108M records, and I don't get to run this in Prod, so I am trying to make it informative for whoever does run it, since it may take a while.
Example of original query (would repeat 12 times with a different table name in each):
Set Rowcount 10000
Declare @Count BigInt
Select 1 --To make @@rowcount have an initial value greater than 0.
While (@@RowCount) > 0
Begin
Update C
Set C.MainID = M.MainID
From FLA_INST_CONDITION C
Inner Join FLA_INST_MAIN M On C.Num_Icn = M.Num_Icn
Where C.MainID Is Null
Set @Count = @Count + @@RowCount
Print @Count
End
New query:
Set Rowcount 100
Set NoCount ON
Declare @Loop Int
Declare @SQL VarChar(1000)
Declare @Table VarChar(100)
Declare @Count BigInt
Declare @RowCount BigInt
Print 'Loading Temp table.'
Create Table #Tables (TableID Int Identity(1,1), TableName VarChar(500))
Insert Into #Tables (TableName)
Select 'FLA_INST_CONDITION'
Union All Select 'FLA_INST_CROSSOVER'
Union All Select 'FLA_INST_DETAILS'
Union All Select 'FLA_INST_DIAGX'
Union All Select 'FLA_INST_HEADERKEY'
Union All Select 'FLA_INST_ICD9'
Union All Select 'FLA_INST_INP'
Union All Select 'FLA_INST_NH'
Union All Select 'FLA_INST_OCCURENCES'
Union All Select 'FLA_INST_PAYER'
Union All Select 'FLA_INST_TREATMENT'
Union All Select 'FLA_INST_VALUE'
Select @Loop = Min(TableID) From #Tables --Get minimum table ID.
While @Loop < (Select Max(TableID) + 1 From #Tables) --Loop through each table by ID.
Begin
Select @Table = TableName From #Tables Where TableID = @Loop --Get table name.
Set @RowCount = 1
While @RowCount > 0 --While there are records still being updated...
Begin
Set @SQL = 'Update C
Set C.MainID = M.MainID
From ' + @Table + ' C
Inner Join FLA_INST_MAIN M On C.Num_Icn = M.Num_Icn
Where C.MainID Is Null' --Assemble the query.
Print 'Now processing table: ' + @Table --Informational
Exec(@SQL) --Execute query.
--Print @SQL --For Debugging.
Set @RowCount = @@RowCount --Grab the value here, since we can only read it once.
Set @Count = @Count + @RowCount --Count the number of records updated, cumulatively.
Print @Count
End
Set @Count = 0 --Reset for next table.
--Print @Loop --For Debugging.
Set @Loop = @Loop + 1
End
Drop Table #Tables
The new query is set to a low rowcount for testing purposes. When this runs I get no errors, but the first table never gives me any update number. No zero, no nothing. The following tables do give me a result.
Results sample for first table:
Loading Temp table.
Now processing table: FLA_INST_CONDITION
Now processing table: FLA_INST_CONDITION
Now processing table: FLA_INST_CONDITION
Now processing table: FLA_INST_CONDITION
Now processing table: FLA_INST_CONDITION
Results sample for following tables:
Now processing table: FLA_INST_CROSSOVER
0
Now processing table: FLA_INST_DETAILS
100
Now processing table: FLA_INST_DETAILS
200
Now processing table: FLA_INST_DETAILS
300
As you can see, the other tables in the list return record counts as expected, but the first one doesn't.
I have been over this and do not see what I messed up. Hopefully a fresh set of eyes can give my a clue.
Thanks,
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
June 29, 2010 at 1:45 pm
Well, I figured it out, although I don't know why it works this way.
I had to set the @Count value to an initial value before I could add another value to it, apparently.
By adding Set @Count = 0 before the While loops, right after the Declares section, I now get my record counts for the first table.
I suppose it could be due to the way SQL handles Nulls, in that since the value of it was null when I started to use it, at:
Set @Count = @Count + @RowCount --Count the number of records updated,
The value of Null + @Rowcount = Null. After the first table I independently set the value back to zero, and 0 + @RowCount can equal whatever value @Rowcount just happens to be at that time.
Chris
Learning something new on every visit to SSC. Hoping to pass it on to someone else.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply