Limit Resources to UGLY query

  • Heh... no... not real painful 😀 Let me explain...

    Let's say you have a table that looks like this...

    --===== Create and populate a 1,000,000 row test table.
         -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
         -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
         -- Column "SomeString" has a range of "AA" to "ZZ" non-unique 2 character strings
         -- Column "SomeNumber" has a range of 0.0000 to 99.9999 non-unique numbers
         -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
         -- Takes about 77 seconds to execute.
     SELECT TOP 1000000
            RowNum     = IDENTITY(INT,1,1),
            SomeInt    = CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT),
            SomeString = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))
                       + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)),
            SomeCSV    = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
            SomeNumber = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY),
            SomeDate   = CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME),
            Change     = CAST(NULL AS INT)
       INTO dbo.JBMTest
       FROM Master.dbo.SysColumns sc1,
            Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest
        ADD CONSTRAINT PK_jbmTest_RowNum PRIMARY KEY CLUSTERED (RowNum)
    GO

    ... and, let's say we want to do something similar to what you are doing... working in RowNum order (forgetting account numbers, etc., for now), we want to identify the change in the SomeNumber column... if the next row has a larger value than the previous, we'll mark it 1... if the same, then 0... if smaller, then -1. We'll do a little setup and then don't blink... it runs nasty fast... notice the handy way we do an update that remembers the previous value without a self-join, a join, or an explicit loop... that's what makes it so fast...

    --===== Do a little setup
    DECLARE @PrevVal MONEY
        SET @PrevVal = 0
    DECLARE @Dummy INT

    --===== Do the update to the Change column
     UPDATE jbmTest
        SET @Dummy   = Change = SIGN(SomeNumber-@PrevVal),
            @PrevVal = SomeNumber
       FROM jbmTest WITH (INDEX(PK_jbmTest_RowNum),TABLOCK)

    That update on a million rows takes 7 seconds on my box.

    But, add 1 "reporting" index...

     CREATE INDEX IDX_DODAH
         ON jbmTest (SomeDate,SomeNumber,Change)
    GO

    ... and, suddenly, the same "previous row comparison" update takes over 2 minutes to run instead of 7 seconds.

    You can do the same type of update in your temp tables... the clustered primary key (whatever it turns out to be) is what matters most for the blazing speed. Updates on a table with a bazillion reporting indexes are going to continue to kill you... with such a large update, you may want to drop all of the indexes, do your inserts/updates, and rebuild the indexes (sketched just below)... even that's going to take a while and might not be worth it (although it will optimize the indexes for reporting :hehe: )
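    A minimal sketch of that drop/update/rebuild pattern, reusing the table and index names from the example above (DROP INDEX syntax as in SQL Server 2005+; a sketch, not a tested script):

       DROP INDEX IDX_DODAH ON dbo.JBMTest  --Drop the reporting index first

    --===== The fast "previous row" update, exactly as above
    DECLARE @PrevVal MONEY
        SET @PrevVal = 0
    DECLARE @Dummy INT

     UPDATE dbo.JBMTest
        SET @Dummy   = Change = SIGN(SomeNumber-@PrevVal),
            @PrevVal = SomeNumber
       FROM dbo.JBMTest WITH (INDEX(PK_jbmTest_RowNum),TABLOCK)

    --===== Rebuild the reporting index afterwards (it comes back defragmented)
     CREATE INDEX IDX_DODAH
         ON dbo.JBMTest (SomeDate,SomeNumber,Change)
    GO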

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I love that trick you did. I tried to do that myself, but was unable to find a way to do it.

    @dummy = change = sign(type1 - @previous), @previous = type1

    Heh, yeah... the key was the @Dummy variable in this case. Needed it to get past the mix of columns and variables in the SET list, since the value itself wasn't needed. The other key is the index hint on an index that sorts in the order you need.

    Thanks for the feedback, Bob. 🙂

    --Jeff Moden



    Good proof that index maintenance really DOES have a price! I like the update mechanism you used too, Jeff.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

    It's going to take a little while to fix the code. I should have it done later today and will let you know the results. Now for the other nightmare issue I have.

    Since I often mix days and weeks, I need to do something like the following.

    Date                     Week  Month  Daily_Type1        Monthly_Type2
    2006-11-29 00:00:00.000  5577  1283   0.176767241809893  NULL
    2006-11-30 00:00:00.000  5578  1283   0.206843646769383  0.467677951418334

    Now you see that the value for Monthly_Type2 on 11/29 is NULL because I don't have a value for that date, but 11/30 holds the monthly value.

    Since they are both in the same month, I need to back-fill the NULLs with the value for that month. Any other really cool tricks like the previous one to do that? Here's what I'm doing now:

     UPDATE #ComplexFilter
        SET Monthly_FM        = MonthlyValue.Monthly_FM,
            Monthly_FM_Change = MonthlyValue.Monthly_FM_Change
       FROM #ComplexFilter
      INNER JOIN #ComplexFilter AS MonthlyValue
         ON #ComplexFilter.MonthNumber = MonthlyValue.MonthNumber
        AND MonthlyValue.StoredCalcDataID = #ComplexFilter.StoredCalcDataID
        AND MonthlyValue.Monthly = 1
        AND MonthlyValue.Monthly_FM IS NOT NULL
        AND MonthlyValue.Monthly_FM_Change IS NOT NULL
      WHERE #ComplexFilter.Monthly = 0

    Yes, I would agree, but not in this case. The process I'm referring to deals with a large volume of data.

    Thanks for your reply

    Kurt

    DBA

    RHWI, Inc

    Poughkeepsie, NY


    I was able to update 11M+ rows 3x, and update every 20th row (months) 1x, in 210 seconds. I am going to see if I can get this down to no more than one pass per day, week, and month, but at this point that is already an improvement. I am re-running now to see how significant it is, but I am thinking it is about a 600% improvement.

    DECLARE @Dummy INT, @PrevID INT
     SELECT @PrevID = 0
    DECLARE @PreDaily_D FLOAT
    DECLARE @PreDaily_M FLOAT
    DECLARE @PreDaily_RM FLOAT

    --===== First pass: daily rows. The CASEs reset the carried values
         -- whenever StoredCalcDataID changes.
     UPDATE #ComplexFilter
        SET @PreDaily_D  = CASE WHEN @PrevID = StoredCalcDataID THEN @PreDaily_D  ELSE NULL END,
            @PreDaily_M  = CASE WHEN @PrevID = StoredCalcDataID THEN @PreDaily_M  ELSE NULL END,
            @PreDaily_RM = CASE WHEN @PrevID = StoredCalcDataID THEN @PreDaily_RM ELSE NULL END,
            @Dummy = Daily_D_Change  = SIGN(Daily_D -@PreDaily_D),  @PreDaily_D  = Daily_D,
            @Dummy = Daily_M_Change  = SIGN(Daily_M -@PreDaily_M),  @PreDaily_M  = Daily_M,
            @Dummy = Daily_RM_Change = SIGN(Daily_RM-@PreDaily_RM), @PreDaily_RM = Daily_RM,
            @PrevID = StoredCalcDataID
       FROM #ComplexFilter WITH (INDEX(#ComplexFilter_PK))
      WHERE Daily = 1

     SELECT @PrevID = 0
    DECLARE @PreMonthly_FM FLOAT

    --===== Second pass: monthly rows.
     UPDATE #ComplexFilter
        SET @PreMonthly_FM = CASE WHEN @PrevID = StoredCalcDataID THEN @PreMonthly_FM ELSE NULL END,
            @Dummy = Monthly_FM_Change = SIGN(Monthly_FM-@PreMonthly_FM), @PreMonthly_FM = Monthly_FM,
            @PrevID = StoredCalcDataID
       FROM #ComplexFilter WITH (INDEX(#ComplexFilter_PK))
      WHERE Monthly = 1

    FYI... it is now down to only 105 seconds, versus about 800 seconds for a 4-field, 11M-row update with my past method.

    Thanks Jeff!!! That's an improvement.

  • Problem 2.

    Date                     Week  Month  Daily_Type1        Monthly_Type2
    2006-11-29 00:00:00.000  5577  1283   0.176767241809893  NULL
    2006-11-30 00:00:00.000  5578  1283   0.206843646769383  0.467677951418334

    Drop the PK index, then:

     CREATE UNIQUE CLUSTERED INDEX #TmpPK ON #ComplexFilter (AsOfDate DESC) -- Change order

    DECLARE @CurrMonthly_Type2 FLOAT -- NOTE: declaration added; the original post used this variable without declaring it

     UPDATE #ComplexFilter
        SET @CurrMonthly_Type2 = CASE WHEN Monthly_Type2 IS NULL THEN @CurrMonthly_Type2 ELSE Monthly_Type2 END,
            Monthly_Type2      = @CurrMonthly_Type2
       FROM #ComplexFilter WITH (INDEX(#TmpPK))

    -- Apparently you don't need the @Dummy variable (at least in 2k5), nor the HINT, since the default is to go in the order of the CLUSTERED INDEX.

    Now I am in the process of figuring out whether the dropping and creating of the index is FASTER than the old UPDATE, but I'll let you know (a quick way to time it is sketched below).
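    A simple elapsed-time harness for that comparison might look like this (a sketch; run it once around each variant):

    DECLARE @Start DATETIME
        SET @Start = GETDATE()

    -- ... variant under test goes here: either the plain UPDATE against the
    --     existing index, or drop index + UPDATE + recreate index ...

      PRINT DATEDIFF(ms, @Start, GETDATE())  -- elapsed milliseconds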

  • An Update

    The @Dummy= is required. Well, at least if you want the value to be right 🙂 The statement will run without it, but the value will be wrong. Not sure exactly what happens, but it isn't what it should be.

    Just a little warning... you are using an undocumented behaviour, and a service pack could belly-up your project. :hehe:

    Other than that, the use of variables in UPDATE statements has been stable for the last several releases, but it has never been documented...

    Cheers,


    * Noel
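    For readers on later versions: SQL Server 2012 added LAG(), the documented way to reach the previous row's value (it did not exist when this thread was written). A minimal sketch against the JBMTest table from earlier in the thread:

    --===== Documented alternative (SQL Server 2012+): LAG() instead of the
         -- variable-carrying update. LAG(SomeNumber, 1, 0) returns the prior
         -- row's SomeNumber in RowNum order, defaulting to 0 on the first row.
     UPDATE t
        SET Change = s.NewChange
       FROM dbo.JBMTest t
       JOIN (SELECT RowNum,
                    SIGN(SomeNumber - LAG(SomeNumber, 1, 0) OVER (ORDER BY RowNum)) AS NewChange
               FROM dbo.JBMTest) s
         ON s.RowNum = t.RowNum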

  • hmmm. Well, I guess the good news here is that this code isn't controlling a missile launch system or anything like that. However, I will keep that in mind. Thanks.

  • What is it for?

  • TheSQLGuru (10/2/2007)


    Good proof that index maintenance really DOES have a price! I like the update mechanism you used too, Jeff.

    Thank you much!

    --Jeff Moden



  • Bob Fazio (10/2/2007)


    I was able to update 11M+ rows 3x, and update every 20th row (months) 1x, in 210 seconds. I am going to see if I can get this down to no more than one pass per day, week, and month, but at this point that is already an improvement. I am re-running now to see how significant it is, but I am thinking it is about a 600% improvement.

    Now, we're cookin'... nice job, Bob...

    You can fold other variable/formula combinations into the same update if you want to try to get it down to a single pass... it won't cost but a bit, performance-wise.
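    A hypothetical sketch of that single-pass combination (column names from Bob's posts above; carrying just the Daily_D and Monthly_FM values for brevity; untested):

    DECLARE @Dummy INT, @PrevID INT, @PreDaily_D FLOAT, @PreMonthly_FM FLOAT
     SELECT @PrevID = 0

    --===== One pass carrying both the daily and the monthly variables.
         -- The CASE guards leave each change column untouched on rows of the
         -- other type, so both running comparisons survive a single scan.
     UPDATE #ComplexFilter
        SET @PreDaily_D    = CASE WHEN @PrevID = StoredCalcDataID THEN @PreDaily_D    ELSE NULL END,
            @PreMonthly_FM = CASE WHEN @PrevID = StoredCalcDataID THEN @PreMonthly_FM ELSE NULL END,
            @Dummy = Daily_D_Change    = CASE WHEN Daily = 1
                                              THEN SIGN(Daily_D - @PreDaily_D)
                                              ELSE Daily_D_Change END,
            @PreDaily_D    = CASE WHEN Daily = 1 THEN Daily_D ELSE @PreDaily_D END,
            @Dummy = Monthly_FM_Change = CASE WHEN Monthly = 1
                                              THEN SIGN(Monthly_FM - @PreMonthly_FM)
                                              ELSE Monthly_FM_Change END,
            @PreMonthly_FM = CASE WHEN Monthly = 1 THEN Monthly_FM ELSE @PreMonthly_FM END,
            @PrevID = StoredCalcDataID
       FROM #ComplexFilter WITH (INDEX(#ComplexFilter_PK))
      WHERE Daily = 1 OR Monthly = 1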

    --Jeff Moden


