June 27, 2007 at 2:39 am
I have a website traffic logging system that was developed for a SQL 2000 system with the following spec:
TRAFFIC_LOG
  RecordID uniqueidentifier DEFAULT NEWID()
  Stamp datetime
  (other site-related data)

TRAFFIC_LOG_HISTORY
  RecordID uniqueidentifier
  Stamp datetime
  (same column definition as TRAFFIC_LOG)
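For reference, the layout above would look roughly like this in T-SQL (the `SiteData` column is a placeholder I've invented for the unspecified site-related columns):

```sql
-- Sketch of the two tables; SiteData stands in for the real site-related columns
CREATE TABLE TRAFFIC_LOG (
    RecordID uniqueidentifier NOT NULL DEFAULT NEWID(),
    Stamp    datetime         NOT NULL,
    SiteData varchar(255)     NULL  -- placeholder for the other site-related data
);

CREATE TABLE TRAFFIC_LOG_HISTORY (
    RecordID uniqueidentifier NOT NULL,
    Stamp    datetime         NOT NULL,
    SiteData varchar(255)     NULL  -- same column definition as TRAFFIC_LOG
);
```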
The TRAFFIC_LOG table only holds a day's worth of logging. In the middle of the night a scheduled job runs that transfers all the data for the current day into a historical table, TRAFFIC_LOG_HISTORY, which has the same column definition.
All the indexes on TRAFFIC_LOG_HISTORY are dropped, the data is inserted, the indexes are rebuilt, and the data is cleared from TRAFFIC_LOG.
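A sketch of that nightly job (the index name and column list are hypothetical; the real job touches several indexes):

```sql
-- Nightly transfer: drop history indexes, bulk insert, rebuild, clear the log
DROP INDEX IX_History_Stamp ON TRAFFIC_LOG_HISTORY;  -- repeat for each index

INSERT INTO TRAFFIC_LOG_HISTORY (RecordID, Stamp /*, other columns */)
SELECT RecordID, Stamp /*, other columns */
FROM TRAFFIC_LOG;

-- Rebuild with a full fill factor, as the table is read-only between loads
CREATE INDEX IX_History_Stamp ON TRAFFIC_LOG_HISTORY (Stamp)
WITH (FILLFACTOR = 100);

TRUNCATE TABLE TRAFFIC_LOG;  -- clear the day's data
```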
Now the issue is that the RecordID column in TRAFFIC_LOG uses NEWID() as a default expression to create the uniqueidentifier, and in SQL 2000 that is just a random, non-sequential GUID.
Now the system has moved to a 2005 server, and I know about the NEWSEQUENTIALID() function, which I would like to use instead because:
- It's sequential, so the GUIDs would reflect the order they were inserted into the table, which is by date.
- The RecordNo is used in indexing, and from articles I have read a sequential GUID is a major improvement over a random GUID and performs almost as well as an identity integer.
The problem is that you can't use the NEWSEQUENTIALID() function in an UPDATE statement; it is only allowed in DEFAULT constraints.
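To illustrate the restriction (constraint name is illustrative): the first statement is legal, while an UPDATE that assigns NEWSEQUENTIALID() directly is rejected by SQL Server.

```sql
-- Legal: NEWSEQUENTIALID() as a DEFAULT on a uniqueidentifier column
ALTER TABLE TRAFFIC_LOG
ADD CONSTRAINT DF_TrafficLog_RecordID
    DEFAULT NEWSEQUENTIALID() FOR RecordID;

-- Not legal: SQL Server only permits NEWSEQUENTIALID() inside a DEFAULT
-- expression, so the following raises an error at parse time.
-- UPDATE TRAFFIC_LOG SET RecordID = NEWSEQUENTIALID();
```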
So would the best way to transfer the existing RecordNo GUIDs in the historical data and the current (today's) log data be the following:
- Create a new column on the TRAFFIC_LOG_HISTORY table with a NEWSEQUENTIALID() NOT NULL default, so that it populates the table (although how do I get it to order sequentially by the Stamp field? Or can't I?).
- Then create a new column on the TRAFFIC_LOG table in the same way (the same ordering question applies).
- Rewrite any code that references the old RecordNo GUID to use the new sequential GUID column.
- Drop the indexes that reference the old RecordNo column and rebuild them on the new column.
- Finally, drop the old RecordNo column from both tables.
Does that sound like the only way to do it, or is there a better/easier way? Also, does anyone know how I can insert the new sequential ID values in datetime order so that they are in fact sequential on the historical data? Once the new column is added to the TRAFFIC_LOG table the values will be generated in date order, so I would like the historical data ordered that way as well if I can.
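One hedged sketch of the ordering question: since NEWSEQUENTIALID() is assigned at insert time, the historical rows can be copied into a new table in Stamp order so the defaults are generated along the timeline. Caveat: SQL Server does not formally guarantee that an INSERT ... SELECT ... ORDER BY fires the default in that order, though a single-threaded insert usually behaves that way. The table name and OldRecordID column are assumptions for illustration.

```sql
-- Hypothetical staging table with the new sequential default
CREATE TABLE TRAFFIC_LOG_HISTORY_NEW (
    RecordID    uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(),
    OldRecordID uniqueidentifier NOT NULL,  -- keep the old GUID for remapping
    Stamp       datetime         NOT NULL
    -- plus the other site-related columns
);

-- Insert in Stamp order so the sequential defaults follow the timeline
INSERT INTO TRAFFIC_LOG_HISTORY_NEW (OldRecordID, Stamp)
SELECT RecordID, Stamp
FROM TRAFFIC_LOG_HISTORY
ORDER BY Stamp;
```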
It would be really nice if I could somehow just update the existing RecordNo column values with the new sequential IDs, so that I don't have to rewrite code, add columns, etc. Then I could just rebuild the indexes.
Any help would be much appreciated.
June 27, 2007 at 11:09 am
Not sure I agree with what you're doing, but maybe this will help anyway! How about creating a temp table that has a new GUID column defaulting to a sequential ID, and inserting all your existing GUIDs from the main table into it. Then join on origid = origid and update origid = sequentialid from the temp table.
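A sketch of that idea, with hypothetical temp-table and column names:

```sql
-- Temp mapping table: one new sequential GUID per existing GUID
CREATE TABLE #GuidMap (
    NewID  uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(),
    OrigID uniqueidentifier NOT NULL
);

-- Populate in Stamp order (insert-order defaults are not formally guaranteed)
INSERT INTO #GuidMap (OrigID)
SELECT RecordID
FROM TRAFFIC_LOG_HISTORY
ORDER BY Stamp;

-- Overwrite the old random GUIDs in place via the mapping
UPDATE h
SET h.RecordID = m.NewID
FROM TRAFFIC_LOG_HISTORY AS h
JOIN #GuidMap AS m ON m.OrigID = h.RecordID;
```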
June 28, 2007 at 3:40 am
Thanks for your advice.
Andy can I ask why you don't agree with what I am doing?
June 28, 2007 at 7:01 am
Should have added that to the original post, my apologies. I like the sequentialid change, just not sure I'd bother with changing old data (which usually affects other things, though I can see where in your case it probably doesn't, it's just log data). I'd be tempted to create a new history table and start pushing the new data that uses the sequential ID into it, leave the old historical data as is, and just UNION ALL them together if needed. If you're clustering on the GUID you probably won't need to drop the indexes anymore either.
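The UNION ALL could be wrapped in a view so reports don't have to change; both the view name and the new table name here are illustrative:

```sql
-- Hypothetical combined view over the old and new history tables
CREATE VIEW dbo.TRAFFIC_LOG_HISTORY_ALL
AS
SELECT RecordID, Stamp /*, other columns */ FROM TRAFFIC_LOG_HISTORY
UNION ALL
SELECT RecordID, Stamp /*, other columns */ FROM TRAFFIC_LOG_HISTORY_NEW;
```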
June 28, 2007 at 7:40 am
Hi. The reason that I drop the indexes is performance, as the HISTORY table is only used for reporting by administrators, so the only time I do any inserts is this one big job each night. I have a fill factor of 100 on the table as it's only used for SELECTs, and at night I drop all the indexes (5 different indexes on combinations of fields), insert all the data from the LOG table, and then add the indexes back with a fill factor of 100. As the table isn't being constantly updated, I thought this would be a better way of doing it than setting a lower fill factor to accommodate both selects and updates. Do you think this is OK?
June 28, 2007 at 7:47 am
Nothing wrong with that approach. If you're really curious, you could compare the time it takes to just DBCC INDEXDEFRAG vs the complete rebuild that happens when you add the indexes back. What column(s) are you clustering on in the history table?
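The two options being compared would look something like this (database and index names are hypothetical):

```sql
-- Option A: defragment the existing index in place
DBCC INDEXDEFRAG ('MyDatabase', 'TRAFFIC_LOG_HISTORY', 'IX_History_Stamp');

-- Option B: full rebuild, which is effectively what the drop/re-create job does
ALTER INDEX IX_History_Stamp ON TRAFFIC_LOG_HISTORY
REBUILD WITH (FILLFACTOR = 100);
```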
June 29, 2007 at 2:50 am
Currently on the history and log tables respectively (as they have the same structure) I have the following:
- Clustered index on the Stamp column, sorted ASC.
- Primary key (non-clustered) on RecordNo (the NEWID() value), stored as varbinary(50).
I then have a number of other non-clustered indexes on other relevant columns that are used for report filters.
I have played around with the indexes a lot on this table and found that having the cluster on the RecordNo column performed worse than on Stamp, but I am thinking that once I have replaced this non-sequential GUID with a sequential GUID, the cluster would perform a lot better if I put it back on RecordNo.
What do you think?
June 29, 2007 at 6:58 am
Stamp isn't the greatest choice because it isn't unique, but I bet it's how you access the data most commonly? RecordNo would be unique, but putting the clustered index there only makes sense if you really focus on that key for lookups. If RecordNo isn't clustered (but is indexed), then I think changing to the sequential ID isn't as big a deal; perhaps a bit of improvement from having new RecordNos added to the end of the index instead of all over, which would reduce fragmentation (which doesn't matter if you rebuild after load!).