January 3, 2009 at 9:26 am
Jeff Moden (1/3/2009)
Then, if no one has any objections, there are certain huge peformance benefits to NOT using a TVF in this case... I'm just gonna do it as a sproc if no one has any serious objections.
Not from me - i'm just doing the same 🙂
Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone
January 3, 2009 at 9:34 am
You know, I'm not sure how much point there is to this particular exercise. 90% of the time is in the intiial scanning of the WebLogEvents table and as there is not clustered index and indeed, no relevant index at all, there is no way to avoid scanning the entire table every time.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 3, 2009 at 9:51 am
Ok then - my entry, given the size of the dataset will be the code below.
I'm not sure if the UNION approach to getting the data out of the table will work better on jacroberts' database - because a table scan of that magnitude would be considerably expensive.
Anyway - this on my machine weighs in at 3.562.
DECLARE @Date datetime
SET @Date = '20081220'
DECLARE @ASPString nvarchar(100)
DECLARE @Len int
DECLARE @DateMinus1 datetime
DECLARE @ElevenThirty datetime
SET @ElevenThirty = '1899-12-30 23:30:00.000'
SET @ASPString = 'ASP.NET_SessionId='
SET @Len = Len(@ASPString)
SET @Date = Convert(varchar, @Date, 112)
SET @DateMinus1 = DateAdd(dd, -1, @Date)
CREATE TABLE #sessions(ID int IDENTITY(1,1) primary key clustered, EventMinuteIndex smallint, ASPSessionID char(40))
INSERT INTO #sessions (EventMinuteIndex, ASPSessionID)
SELECT EventMinuteIndex, ASPSessionID FROM
(
SELECT datediff(minute, '18991230', [Time]) as EventMinuteIndex,
CASE
WHEN CharIndex(';', [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)])) > 1 THEN
SubString([cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)])
+ @Len, CharIndex(';', [cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)]))
- CharIndex(@ASPString, [cs(Cookie)])- @Len)
ELSE
SubString( [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)]) + @Len, 100)
END ASPSessionId
FROM dbo.WebLogEvents WITH (NOLOCK)
WHERE Date = @date
AND CharIndex(@ASPString , [cs(Cookie)]) > 0
UNION ALL
SELECT datediff(minute, '18991230', [Time]) - 1440 as EventMinuteIndex,
CASE
WHEN CharIndex(';', [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)])) > 1 THEN
SubString([cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)])
+ @Len, CharIndex(';', [cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)]))
- CharIndex(@ASPString, [cs(Cookie)])- @Len)
ELSE
SubString( [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)]) + @Len, 100)
END ASPSessionId
FROM dbo.WebLogEvents WITH (NOLOCK)
WHERE Date = @DateMinus1 AND [Time] >= @ElevenThirty
AND CharIndex(@ASPString , [cs(Cookie)]) > 0
) itbl
ORDER BY ASPSessionID, EventMinuteIndex
SELECT @date as Date, hr, fm, count(*) FROM
(
select eventminuteindex / 60 as hr, (eventminuteindex / 5) % 12 as fm FROM
(
SELECT s1.eventminuteindex,
CASE WHEN s1.eventminuteindex - s2.eventminuteindex > 30
THEN 1
ELSE 0
END as enoughTimePassed,
CASE WHEN s1.aspsessionid = s2.aspsessionid
THEN 0
ELSE 1
END as SessionDifferent
FROM #sessions s1 INNER JOIN
#sessions s2
on s1.ID = s2.ID + 1
UNION ALL
-- union the first row in, as the above join will never pick it up
SELECT s1.eventminuteindex, 1 as enoughTimePassed, 1 as SessionDifferent
FROM #sessions s1 where id = 1
) itbl
WHERE EventMinuteIndex >= 0
and (EnoughTimePassed + SessionDifferent) > 0
) otbl
GROUP BY hr, fm
Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone
January 3, 2009 at 9:58 am
RBarryYoung (1/3/2009)
You know, I'm not sure how much point there is to this particular exercise. 90% of the time is in the intiial scanning of the WebLogEvents table and as there is not clustered index and indeed, no relevant index at all, there is no way to avoid scanning the entire table every time.
I know where you're coming from.
I did actually create a normalised data table whereby the aspsessionid was already extracted, and the clustered indexing was appropriate. The performance of the set based method there was quite good also - my best effort set based was 0.625 seconds, whereas the cursor method, refactored to use the new table, was 1.750.
To the others - FYI the code I used to create the normalised table was
DECLARE @ASPString nvarchar(100)
DECLARE @Len int
SET @ASPString = 'ASP.NET_SessionId='
SET @Len = Len(@ASPString)
CREATE TABLE [dbo].[WebLogEventsNormalised] (
[ASPSessionID] varchar(40),
[Date] datetime,
[Time] datetime
)
CREATE CLUSTERED INDEX [IX_WebLogEventsNormalised] ON [dbo].[WebLogEventsNormalised] (ASPSessionID, Date, [Time])
INSERT INTO [dbo].[WebLogEventsNormalised] ([Date], [Time], [ASPSessionID])
SELECT [Date],
[Time],
CASE
WHEN CharIndex(';', [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)])) > 1 THEN
SubString([cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)])
+ @Len, CharIndex(';', [cs(Cookie)], CharIndex(@ASPString, [cs(Cookie)]))
- CharIndex(@ASPString, [cs(Cookie)])- @Len)
ELSE
SubString( [cs(Cookie)], CharIndex(@ASPString , [cs(Cookie)]) + @Len, 100)
END ASPSessionId
FROM dbo.WebLogEvents WITH (NOLOCK)
WHERE CharIndex(@ASPString , [cs(Cookie)]) > 0
Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone
January 3, 2009 at 10:08 am
repicurus (12/24/2008)
I believe that Abed misses or eludes the entire point / need for the occasional use of cursors, bad though they may be:To loop through a record set and take some sort action(s) (possibly updating select columns) based on complex procedural logic, possibly involving other database access to adhere to business rules, that cannot be accomplished via set-at-a-time constructs.
I would be happy to contribute TWO examples that I do not think could be accomplished WITHOUT cursors.
Perhaps our resident 'expert' on bad cursors (as opposed to good cursors) would be so kind as to enlighten me as to the proper manner in which change my evil cursor ways.
:rolleyes:
DUDE! Bring It ON!
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 3, 2009 at 11:00 am
OK... here we go. First, I'm gonna recommend that you add the following index to the WebLogEvents table... it will NOT interfere with inserts or deletes, will not need any form of maintenance (no pages splits if inserts are always done in order by date and time) and it WILL make all date sensitive queries 30% faster (or more)...
CREATE CLUSTERED INDEX IXC_WebLogEvents_Date_Time
ON dbo.WebLogEvents (Date,Time) WITH FILLFACTOR = 100
Now, based on the fact that JacRobert's said that he doesn't care whether or not the TVF is preserved or not, I went ahead and just did it as a stored procedure that returns a single result set.
This proc returns the result set on my box in just over 6 seconds without the index above and in about 4 seconds with the recommended index above. Matt's fine code generally took between 18 and 28 seconds for the same day's testing. And, that's what I wanted to show folks... between using the Select Into (can't be done on table variables) and the awesome speed of the "pseudo cursor" (Phil Factor calls it the "quirky update"), [font="Arial Black"]you can do some trully awesome things that most people would use a cursor for.[/font]
Here's the code and thanks for playing... as always, the details are commented in the code...
CREATE PROCEDURE dbo.GetDailyVisitCount
/*********************************************************************************************************************
Purpose:
Returns the visit count for each 5 minute period of each hour for the given day.
Programmer notes and references:
1. A "visit" is defined as a SessionId that hasn't been seen for over 30 minutes on the web site. So the data is
processed sorted by SessionId and Datetime and the time between each session visit is measured to see if it is a
new session or the same one.
2. This sproc uses a "pseudo cursor update" which is discussed in detail in the following article:
Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5
By Jeff Moden, 2008/01/31
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
Usage:
EXEC dbo.GetDailyVisitCount @Date
... where @Date is anything that converts to a DATETIME data type.
Revision History:
Rev 00 - 03 Jan 2009 - Jeff Moden - Initial creation and test
Reference: http://www.sqlservercentral.com/Forums/Topic625172-1373-2.aspx
*********************************************************************************************************************/
--===== Declare procedure I/O
@pDate DATETIME
AS
--=====================================================================================================================
-- Test Harness... uncomment this section and run from here down for on screen testing.
-- See the final Select in this proc for addition field outputs
--=====================================================================================================================
--DECLARE @pDate DATETIME
-- SET @pDate = '12/20/2008' --Must be a date that's actually in the dbo.WebLogEvents table at the time
--=====================================================================================================================
-- Presets
--=====================================================================================================================
--===== Declare and preset local variables
-- Note... all of these variables are used with the "pseudo cursor update"
-- to keep track of values from previous rows
DECLARE @PrevSessionID VARCHAR(50),
@PrevKeepDate DATETIME,
@PrevKeep TINYINT
SELECT @PrevSessionID = '',
@PrevKeepDate = 0,
@PrevKeep = 0
--===== Ensure the date parameter is a whole date
SELECT @pDate = DATEADD(dd,DATEDIFF(dd,0,@pDate),0)
--===== Supress the autodisplay of rowcounts so only the result set will be returned
SET NOCOUNT ON
--=====================================================================================================================
-- Copy data from the event table to a working table where we can work on it using a special index
--=====================================================================================================================
--===== Create and populate the working table on the fly
SELECT CAST(SUBSTRING(Cookie,1,CHARINDEX(';',Cookie)-1) AS VARCHAR(50)) AS SessionID,
TheDate+2.0 AS TheDate,
CAST(0 AS TINYINT) AS KeepMe,
Date AS DateOnly
INTO #MyHead
FROM (--==== Subquery does some of the calcs and isolates the start of the SessionID
SELECT Date+Time AS TheDate,
SUBSTRING([cs(Cookie)],CHARINDEX('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie,
Date
FROM dbo.WebLogEvents
WHERE Date = @pDate
OR (Date = @pDate-1 AND Time >= '1899-12-30 23:30')
)p1
--===== Add the necessary index to support the "pseudo cursor" update
CREATE CLUSTERED INDEX IXC_#MyHead_SessionID_TheDate
ON #MyHead (SessionID,TheDate) WITH FILLFACTOR = 100
--=====================================================================================================================
-- Mark "visits" according to the rules in "Programmer's notes #1" above
--=====================================================================================================================
--===== Do the "pseudo cursor" update to determine which rows to keep
UPDATE #MyHead
SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID
AND DATEDIFF(mi,@PrevKeepDate,TheDate)>30
THEN 1
WHEN SessionID <> @PrevSessionID
THEN 1
ELSE 0
END,
@PrevKeepDate = TheDate,
@PrevSessionID = SessionID
FROM #MyHead WITH(INDEX(0)) --Forces a clustered index scan to maintain the update order
--=====================================================================================================================
-- Return the required output as a single result set
--=====================================================================================================================
--===== Return the results
SELECT @pDate AS Date,
DATEPART(hh,TheDate) AS [Hour],
DATEPART(mi,TheDate)/5 AS [FiveMinute],
COUNT(*) AS TheCount
FROM #MyHead
WHERE KeepMe = 1
AND DateOnly = @pDate
GROUP BY DATEPART(hh,TheDate),
DATEPART(mi,TheDate)/5
--=====================================================================================================================
-- Test Harness... uncomment this section for on screen testing.
--=====================================================================================================================
-- SELECT SessionID,
-- TheDate,
-- DateOnly,
-- KeepMe,
-- DATEPART(hh,TheDate) AS [Hour],
-- DATEPART(mi,TheDate)/5 AS [FiveMinute]
-- FROM #MyHead
-- WHERE KeepMe = 1
-- ORDER BY SessionID, TheDate
GO
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2009 at 11:03 am
Jeff Moden (1/3/2009)
jacroberts (1/2/2009)
The stored procedure that calls the function just takes the results table from the TVF and inserts this into a reporting table number of new site visits for every 5 minute period in a day, it does this by inserting a:
SELECT @Date, [Hour], [FiveMinute], Count(*)
FROM SiteVisits(@Date)
GROUP BY [Hour], [FiveMinute]
There is no reason why it shouldn't all be done inside one stored procedure without a TVF.
Then, if no one has any objections, there are certain huge peformance benefits to NOT using a TVF in this case... I'm just gonna do it as a sproc if no one has any serious objections.
Yes, no need to be in a TVF.
January 3, 2009 at 11:31 am
Jeff,
My solution was almost identical, except that I did an explicit create for the temporary table. (I'm showing my age here; It used to lock system tables for the whole implied transaction: i.e. the SELECT INTO, whereas the explicit creation of the temp table did it only for the CREATE)
The other thing I missed was the WITH FILLFACTOR = 100. Neat!
Goes like a rocket.
Best wishes,
Phil Factor
January 3, 2009 at 11:47 am
Phil Factor (1/3/2009)
Jeff,My solution was almost identical, except that I did an explicit create for the temporary table. (I'm showing my age here; It used to lock system tables for the whole implied transaction: i.e. the SELECT INTO, whereas the explicit creation of the temp table did it only for the CREATE)
The other thing I missed was the WITH FILLFACTOR = 100. Neat!
Goes like a rocket.
Heh... Guess I'll show my age, as well... I believe it was SP1 in SQL Server 6.5 that allowed for a fix using a startup parameter. When SQL Server 7.0 hit the streets, it had the fix built in from the git.
Also, I thing it's absolutely hilarious that someone who uses the handle of "Phil Factor" missed the "Fill Factor" speed enhancement. 😛
As always, Phil, I very much appreciate your feedback. It would be fun to work on a project with you, someday, even if it were remote.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2009 at 12:03 pm
Just a followup folks... I tried the original cursor/While loop based function on my box... now, keep in mind that it doesn't produce the required output... it just creates a return that would be used to return the final output. It took over 3 minutes. Even if you didn't know how to do the ultra fast methods I used, the standard method of using a self-joined table like that in Matt's good code still blew the doors off of the cursor method.
The other thing I want everyone to notice is how simple the code I wrote is... the comments are longer than the code and the code is very simple to understand. That kinda blows the doors off two of the biggest excuses for writing cursors... the cursor code turned out to be much longer and much more complex than the set based code did. The set based code took less time to write, produced shorter and MUCH faster code, and will be easier to troubleshoot or modify in the future because it's so simple.
There's two things to be learned here...
1. Like I said before, people give up too easily on finding a set based method. 99.9999% of the time in SQL Server 2000, there's a set based method for everything you can imagine. The number goes to 100% in SQL Server 2005. There is not now, nor will there ever be a trully good reason to use a cursor or a while loop in SQL Server. Learn your trade well enough to know the highspeed ways around a cursor so that you never ever have to make the trade off between using the proper set based technology or using a cursor because it's simple to do so you can meet a schedule.
2. If you ever think there's something that can't be done without a cursor in a high performance manner in SQL Server or that it will be quicker and easier to write a cursor so you can meet a &^%$#! schedule, see item 1 above. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2009 at 12:34 pm
Incredible result, Jeff. About once a week you post something that just blows me away.
And for the record, it wasn't the pseudo-cursor, the SELECT INTO or the FillFactor that got me on this one, those either I knew about and/or they didn't make much difference because the table load time was so long. What blew me away on this one was how you got the the table load time down to almost nothing.
I tried everything that I could think of (including the Select Into & the source table index) and the table load still took 8-20 seconds on my laptop. Yours takes just over a second! And I honestly haven't a clue why.
You're a freakin' Jedi Wizard, man.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 3, 2009 at 12:54 pm
Jeff Moden (1/3/2009)
OK... here we go. First, I'm gonna recommend that you add the following index to the WebLogEvents table... it will NOT interfere with inserts or deletes, will not need any form of maintenance (no pages splits if inserts are always done in order by date and time) and it WILL make all date sensitive queries 30% faster (or more)...
CREATE CLUSTERED INDEX IXC_WebLogEvents_Date_Time
ON dbo.WebLogEvents (Date,Time) WITH FILLFACTOR = 100
Now, based on the fact that JacRobert's said that he doesn't care whether or not the TVF is preserved or not, I went ahead and just did it as a stored procedure that returns a single result set.
This proc returns the result set on my box in just over 6 seconds without the index above and in about 4 seconds with the recommended index above. Matt's fine code generally took between 18 and 28 seconds for the same day's testing. And, that's what I wanted to show folks... between using the Select Into (can't be done on table variables) and the awesome speed of the "pseudo cursor" (Phil Factor calls it the "quirky update"), [font="Arial Black"]you can do some trully awesome things that most people would use a cursor for.[/font]
Here's the code and thanks for playing... as always, the details are commented in the code...
CREATE PROCEDURE dbo.GetDailyVisitCount
/*********************************************************************************************************************
Purpose:
Returns the visit count for each 5 minute period of each hour for the given day.
Programmer notes and references:
1. A "visit" is defined as a SessionId that hasn't been seen for over 30 minutes on the web site. So the data is
processed sorted by SessionId and Datetime and the time between each session visit is measured to see if it is a
new session or the same one.
2. This sproc uses a "pseudo cursor update" which is discussed in detail in the following article:
Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5
By Jeff Moden, 2008/01/31
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
Usage:
EXEC dbo.GetDailyVisitCount @Date
... where @Date is anything that converts to a DATETIME data type.
Revision History:
Rev 00 - 03 Jan 2009 - Jeff Moden - Initial creation and test
Reference: http://www.sqlservercentral.com/Forums/Topic625172-1373-2.aspx
*********************************************************************************************************************/
--===== Declare procedure I/O
@pDate DATETIME
AS
--=====================================================================================================================
-- Test Harness... uncomment this section and run from here down for on screen testing.
-- See the final Select in this proc for addition field outputs
--=====================================================================================================================
--DECLARE @pDate DATETIME
-- SET @pDate = '12/20/2008' --Must be a date that's actually in the dbo.WebLogEvents table at the time
--=====================================================================================================================
-- Presets
--=====================================================================================================================
--===== Declare and preset local variables
-- Note... all of these variables are used with the "pseudo cursor update"
-- to keep track of values from previous rows
DECLARE @PrevSessionID VARCHAR(50),
@PrevKeepDate DATETIME,
@PrevKeep TINYINT
SELECT @PrevSessionID = '',
@PrevKeepDate = 0,
@PrevKeep = 0
--===== Ensure the date parameter is a whole date
SELECT @pDate = DATEADD(dd,DATEDIFF(dd,0,@pDate),0)
--===== Supress the autodisplay of rowcounts so only the result set will be returned
SET NOCOUNT ON
--=====================================================================================================================
-- Copy data from the event table to a working table where we can work on it using a special index
--=====================================================================================================================
--===== Create and populate the working table on the fly
SELECT CAST(SUBSTRING(Cookie,1,CHARINDEX(';',Cookie)-1) AS VARCHAR(50)) AS SessionID,
TheDate+2.0 AS TheDate,
CAST(0 AS TINYINT) AS KeepMe,
Date AS DateOnly
INTO #MyHead
FROM (--==== Subquery does some of the calcs and isolates the start of the SessionID
SELECT Date+Time AS TheDate,
SUBSTRING([cs(Cookie)],CHARINDEX('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie,
Date
FROM dbo.WebLogEvents
WHERE Date = @pDate
OR (Date = @pDate-1 AND Time >= '1899-12-30 23:30')
)p1
--===== Add the necessary index to support the "pseudo cursor" update
CREATE CLUSTERED INDEX IXC_#MyHead_SessionID_TheDate
ON #MyHead (SessionID,TheDate) WITH FILLFACTOR = 100
--=====================================================================================================================
-- Mark "visits" according to the rules in "Programmer's notes #1" above
--=====================================================================================================================
--===== Do the "pseudo cursor" update to determine which rows to keep
UPDATE #MyHead
SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID
AND DATEDIFF(mi,@PrevKeepDate,TheDate)>30
THEN 1
WHEN SessionID <> @PrevSessionID
THEN 1
ELSE 0
END,
@PrevKeepDate = TheDate,
@PrevSessionID = SessionID
FROM #MyHead WITH(INDEX(0)) --Forces a clustered index scan to maintain the update order
--=====================================================================================================================
-- Return the required output as a single result set
--=====================================================================================================================
--===== Return the results
SELECT @pDate AS Date,
DATEPART(hh,TheDate) AS [Hour],
DATEPART(mi,TheDate)/5 AS [FiveMinute],
COUNT(*) AS TheCount
FROM #MyHead
WHERE KeepMe = 1
AND DateOnly = @pDate
GROUP BY DATEPART(hh,TheDate),
DATEPART(mi,TheDate)/5
--=====================================================================================================================
-- Test Harness... uncomment this section for on screen testing.
--=====================================================================================================================
-- SELECT SessionID,
-- TheDate,
-- DateOnly,
-- KeepMe,
-- DATEPART(hh,TheDate) AS [Hour],
-- DATEPART(mi,TheDate)/5 AS [FiveMinute]
-- FROM #MyHead
-- WHERE KeepMe = 1
-- ORDER BY SessionID, TheDate
GO
Jeff, That's a neat way of doing it. I hadn't read up on setting working variables through update statement process so that's new to me. In fact combined with a CASE statement in the sql it's almost like using a cursor, it could be argued that this isn't set based logic, though I've no doubt it runs fast which is the aim of this excercise!
I will test both your method and Matt's on Monday, time permitting, on some real data (10 million rows per day). I think at the moment the cursor driven procedure takes about 20 minutes to complete, a full table scan takes about 4 minutes.
I take is that @PrevKeep variable is redundant as it doesn't seem to be used?
Regarding adding the index on WeblogEvents table it would impact insert performance as there are 4 web servers each one generates a weblog file every hour which are imported into the database, the rows within each file are datetime ordered but there are 4 files all for the same datetime range so inserts are not always done in date and time order.
January 3, 2009 at 1:07 pm
Barry, coming from the likes of you, that's a totally awesome compliment. Thanks, ol' friend.
Just a hint about speed... I've found that SQL Server and I both have a couple of things in common... we're both lazy and neither of us can remember more than a couple of things at a time. 😛 I knew the split code to dig the SessionID out of the cookie was going to require multiple CharIndexes (ie, more than 2) and a whole bunch of other hooie if I tried to do it all in the same Select... WAY too much typing and thinking for me... it was shorter to do two Selects... one to find the beginning of the SessionID as a derived table and then a very simple outer select to get rid of the end. The outer select was also simple enough so I could do the mid-air conversion from VARCHAR(2048) to a nice little VARCHAR(50) in the process. It's "Divide and Conquer" at it's best, it usually works the best, requires the least amount of thinking, and usually requires the least amount of code. And, if you think about it... that's the same kind of stuff they use CTE's for. I just used the older technology so this code would work in 2k or 2k5.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2009 at 1:16 pm
jacroberts (1/3/2009)
Jeff, That's a neat way of doing it. I hadn't read up on setting working variables through update statement process so that's new to me.I will test both your method and Matt's on Monday, time permitting, on some real data (10 million rows per day). I think at the moment the cursor driven procedure takes about 20 minutes to complete, a full table scan takes about 4 minutes.
Read more about the method, here... many of us, including Phil Factor, have been using the method for many, many different things. But, if you don't follow the rules with the clustered index on the temp table or the index hint on the FROM temp table, you can and will get incorrect answers. Please take the time to study and try the examples in the following article before you try one of these bad boys on your own... there are some ver specific rules to follow...
[font="Arial Black"]Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5[/font][/url]
I take is that @PrevKeep variable is redundant as it doesn't seem to be used?
BWAA-HAAA!!! Well, at least I know someone read the code. Yes sir, I made a final change to the code and forgot to delete it from the declaration section of the code.
Regarding adding the index on WeblogEvents table it would impact insert performance as there are 4 web servers each one generates a weblog file every hour which are imported into the database, the rows within each file are datetime ordered but there are 4 files all for the same datetime range so inserts are not always done in date and time order.
Ah... understood... I thought it was a single feed. You're absolutely correct about the clustered index on overlapping feeds. Thanks for the feedback.
Just don't delete the clustered index from the temp table in the stored proc... that one is critical to the "pseudo cursor".
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2009 at 1:29 pm
Jeff Moden (1/3/2009)
Just a hint about speed... I've found that SQL Server and I both have a couple of things in common... we're both lazy and neither of us can remember more than a couple of things at a time. 😛 I knew the split code to dig the SessionID out of the cookie was going to require multiple CharIndexes (ie, more than 2) and a whole bunch of other hooie if I tried to do it all in the same Select...
Thanks, that does make sense. It just didn't occur to me that that was the source of the slowness.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
Viewing 15 posts - 121 through 135 (of 272 total)
You must be logged in to reply to this topic. Login to reply