April 27, 2010 at 4:46 am
I have been asked to refactor a piece of T-SQL that applies a schedule made up of time segments over a date range. This is currently used in a number of SSRS reports that are unacceptably slow.
A segment record contains the following data items:
1. A start time
2. An end time
3. A value indicating which days of the week the segment applies to
4. A value indicating the 'type' of segment
5. A value indicating if the segment is absolute (a one off) or recurring (applies during the same time period of every day the segment applies to)
6. A value indicating the priority of the segment.
7. A flag that idicates if the segment crosses midnight into the next day
8. A note that is to be applied to all instances of the segment within the date range.
A start and end date for the date range are supplied as parameters.
The current code then builds a table variable containing a row for every minute of the date range (this is the first issue as the date range can be several months or even years leading to millions of rows). Each row contains a default segment type.
The code then runs a number of cursors (horrible I know) over the table variable, one for each type of segment in order of priority, overwriting previous entries if a set of conditions are met.
It then constructs a series of table variables which filter the data ending up with a table that contains only the dates and times within the date range when a change of segment applies.
This is clearly RBAR on a massive scale and the performance of the report when presented with a large date range is unacceptable in its current form.
I have attached a script which contains the current code. Please note that the code has been limited to only return 3 months worth of data forward from the execution date to forestall any performance issues from running it interfering with the operation of the host server.
I realise that this is a fairly large chunk of code and I'm not asking for someone to rewrite it for me. Rather, what I am after here is any advice on how this could be refactored to work in a more efficient, set-based way as at present I am at a loss as to where to start with it.
TIA,
Chris
April 27, 2010 at 1:16 pm
A few (standard) things that I spotted:
a) don't use a table variable with such an amount of data. Use a temp table with proper indexing instead so you can benefit from statistics (if storing intermediate data is required at all...).
b) Use UNION ALL instead of UNION if you can guarantee not to have dups (which is the case here due to the different value of the Restriction column). UNION will internally force a SORT DISTINCT, which is not required here.
c) Don't apply calculation to a column used in a WHERE clause (especially when compared to a variable) since this will prevent using indexes. Apply the changes to the variable so it will match the values in the column.
d) Replace the weekdays calculation and the repetitive "OR weekday=1" with a method where you can join two columns directly. One option would be to reference the segment.day_flags to a specific date, e.g. '1900-01-01' for Monday (which, fortunately, is true 😉 ). Therewith you could avoid that massive OR clause including the DATEPART()%7 code - which will give different results for different @@DATEFIRST values, ...
e)The following statement looks very strange... What is the FETCH PRIOR supposed to do?
FETCH PRIOR FROM AllSegmentsAtChangePoints INTO @CurrentDate, @CurrentRestriction, @LastNote
FETCH NEXT FROM AllSegmentsAtChangePoints INTO @CurrentDate, @CurrentRestriction, @LastNote
f) .. list goes on
You have three nested c.u.r.s.o.r.s. plus one nested l.o.o.p.... :pinch:
Start with the inner most and change the RBAR into a set based solution. Replace each and every nesting level until you end up with a single insert statement (if required, use two separate insert statements). Update the result set based on sets, not RBAR. Use a "quirky update", if required. Continue to use the intermediate table until you're done with the "un-looping". Verify the code if you can get rid of one or the other temp table.
You might end up with some code where you don't even need the temp tables anymore being able to use a (more or less) standard view instead (maybe using a CTE and/or iTVF).
Another option would be to hire a consultant for a day or two...
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply