Regarding SQL Query optimization

Question

Regarding SQL Query optimization

VSSGeorge

SSCrazy Eights

Points: 8151
More actions
October 13, 2011 at 4:39 am

#260924

I have a query regarding Query optimization.[font="Arial"][/font]
Which one of the following is preferred considering the performance of the SQL query executed..??
Method 1
--------
UPDATE Table1 SET Col1 = ( Select Count(Id) from Maintable MT where (MT.Age between 1 and 2) and (MT.PID = @id))
UPDATE Table1 SET Col2 = ( Select Count(Id) from Maintable MT where (MT.Age between 2 and 3) and (MT.PID = @id))
------------------------
------------------------
UPDATE Table1 SET Col10 = ( Select Count(Id) from Maintable MT where (MT.Age between 10 and 11) and (MT.PID = @id))
Method 2
--------
UPDATE Table1 SET Col1 = ( Select Count(Id) from Maintable MT where (MT.Age between 1 and 2) and (MT.PID = @id)),
SET Col2 = ( Select Count(Id) from Maintable MT where (MT.Age between 2 and 3) and (MT.PID = @id)),
------------------------
------------------------
SET Col10 = ( Select Count(Id) from Maintable MT where (MT.Age between 10 and 11) and (MT.PID = @id))

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

hunt Right there with Babe Points: 780 More actions · Answer 1

The second one would be faster becuase it is single SET based update on the table compared to 10 different update statements.

The syntax in the second query seems to be incorrect.

thanks
sarat 🙂
Curious about SQL

Andre Guerreiro SSCertifiable Points: 7319 More actions · Answer 2

You could try running a test example using the estimated execution plan button and the percentage taken by each method. I can tell you that the single UPDATE affecting several columns took me 36% of the cost of the total batch from a simple test I made while the first method took me the remaining 64%.

Best regards,

Andre Guerreiro Neto

Database Analyst
http://www.softplan.com.br
MCITPx1/MCTSx2/MCSE/MCSA

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

codebyo (10/13/2011)
You could try running a test example using the estimated execution plan button and the percentage taken by each method. I can tell you that the single UPDATE affecting several columns took me 36% of the cost of the total batch from a simple test I made while the first method took me the remaining 64%.
Best regards,

Even with the actual plan, this is completely unreliable. You need both CPU and io and the best way to do that is with actual timings or better still the profiler. If you are unsure of the veracity of this claim, I'll post up a couple of queries - one takes 4%, the other 96% of the total cost when they are run in the same batch. The 4% query however takes 3 times longer to run than the 96% cost query. They're both string-splitters. The 4% query uses a rCTE as a source of rows/numbers which is heavy on io.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Devendrakumar SSC-Forever Points: 42493 More actions · Answer 4

I'll post up a couple of queries

It would be beneficial for everyone here. Thanks!

VSSGeorge SSCrazy Eights Points: 8151 More actions · Answer 5

Chris, whats on your mmind..Which Method is better..first or second..??

Andre Guerreiro SSCertifiable Points: 7319 More actions · Answer 6

ChrisM@Work(10/13/2011)
Even with the actual plan, this is completely unreliable. You need both CPU and io and the best way to do that is with actual timings or better still the profiler. If you are unsure of the veracity of this claim, I'll post up a couple of queries - one takes 4%, the other 96% of the total cost when they are run in the same batch. The 4% query however takes 3 times longer to run than the 96% cost query. They're both string-splitters. The 4% query uses a rCTE as a source of rows/numbers which is heavy on io.

Oh, yes. I'm aware of that. Thanks for pointing that out. 😉

I was not saying that would be the best solution though.

It was just a quick test and in this particular UPDATE example it worked as expected because both of my tests were using the same columns and exactly the same code except for the single UPDATE vs several UPDATEs.

But I should have stated that more clearly. Thank you for the correction. 🙂

Best regards,

Andre Guerreiro Neto

Database Analyst
http://www.softplan.com.br
MCITPx1/MCTSx2/MCSE/MCSA

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

No problem. This relies heavily on the testing framework which Jeff Moden et al set up for his excellent string splitter article[/url].

Here's the rCTE code:

CREATE FUNCTION [dbo].[DS8K_rCTEa](@pString [varchar](8000), @pDelimiter [char](1))

RETURNS TABLE WITH SCHEMABINDING AS

RETURN

-- DS8K_rCTEa

WITH x AS (

SELECT

ItemNumber = 1,

pos = 0,

Nextpos = CHARINDEX(',',@pString,1)

UNION ALL

SELECT

ItemNumber = x.ItemNumber + 1,

pos = x.nextpos,

nextpos = ISNULL(NULLIF(CHARINDEX(',',@pString,x.nextpos+1),0),8000)

FROM x

WHERE x.nextpos < 8000 AND x.nextpos+x.pos>0

)

SELECT

ItemNumber,

Item = CASE WHEN pos+nextpos = 0 THEN @pString ELSE SUBSTRING(@pString,pos+1,Nextpos-(pos+1)) END

FROM x

Here's a (hard) tally table version:

ALTER FUNCTION [dbo].[DS8K_Tally1U](@pString [varchar](8000), @pDelimiter [char](1))

RETURNS TABLE WITH SCHEMABINDING AS

RETURN

-- DS8K_Tally1U (one-based permanent tally table)

SELECT

ItemNumber= CAST(0 AS BIGINT),

Item= LEFT(@pString,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,1),0)-1,8000))

UNION ALL

SELECT

ItemNumber= ROW_NUMBER() OVER(ORDER BY pos),

Item= SUBSTRING(@pString,pos,[len])

FROM (

SELECT

pos= n+1,

[len]= ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,(n+1)),0)-(n+1),8000)

FROM dbo.Tally1 WITH (NOLOCK)

WHERE n <= ISNULL(DATALENGTH(@pString),0)

AND SUBSTRING(@pString,n,1) = @pDelimiter

) x

Here's the code to test the two functions in sequence:

USE tempdb

GO

SELECT * into JBMTest

FROM (

SELECT 0 as SomeID, NULL as SomeValue UNION ALL --1 NULL

SELECT 1, SPACE(0) UNION ALL --1 b (Empty String)

SELECT 2, SPACE(1) UNION ALL --1 b (1 space)

SELECT 3, SPACE(5) UNION ALL --1 b (5 spaces)

SELECT 4, ',' UNION ALL --2 b b (both are empty strings)

SELECT 5, '55555' UNION ALL --1 E

SELECT 6, ',55555' UNION ALL --2 b E

SELECT 7, ',55555,' UNION ALL --3 b E b

SELECT 8, '55555,' UNION ALL --2 b B

SELECT 9, '55555,1' UNION ALL --2 E E

SELECT 10, '1,55555' UNION ALL --2 E E

SELECT 11, '55555,4444,333,22,1' UNION ALL --5 E E E E E

SELECT 12, '55555,4444,,333,22,1' UNION ALL --6 E E b E E E

SELECT 13, ',55555,4444,,333,22,1,' UNION ALL --8 b E E b E E E b

SELECT 14, ',55555,4444,,,333,22,1,' UNION ALL --9 b E E b b E E E b

SELECT 15, ' 4444,55555 ' UNION ALL --2 E (w/Leading Space) E (w/Trailing Space)

SELECT 16, 'This,is,a,test.' --E E E E

) d

SELECT test.SomeID, test.SomeValue, split.ItemNumber, Item = QUOTENAME(split.Item,'"')

FROM JBMTest test CROSS APPLY dbo.DS8K_rCTEa(test.SomeValue,',') split

SELECT test.SomeID, test.SomeValue, split.ItemNumber, Item = QUOTENAME(split.Item,'"')

FROM JBMTest test CROSS APPLY dbo.DS8K_Tally1U(test.SomeValue,',') split

Try this too, it's fun but not quite legit:

CREATE FUNCTION [dbo].[DS8K_CTE2U](@pString [varchar](8000),@pDelimiter [char](1))

RETURNS TABLE WITH SCHEMABINDING AS

RETURN

-- DS8K_CTE2U (zero-based row-constructor tally table)

-- When a CROSS JOIN is used without the ON clause, parentheses can be used to indicate the join order -

-- at least that's the "official" line ;)

SELECT

ItemNumber= CAST(0 AS BIGINT),

Item= CAST(LEFT(@pString, ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, 1),0)-1,8000)) AS VARCHAR(8000))

UNION ALL

SELECT

ItemNumber= ROW_NUMBER() OVER(ORDER BY (SELECT NULL)), -- row_number appears to be necessary

Item= SUBSTRING(@pString,n,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,n),0)-n,8000))

FROM (

SELECT n = n+1

FROM (

SELECT TOP (1+ISNULL(DATALENGTH(@pString),0))

n = (n1 + n2 + n3 + n4)

FROM (VALUES (0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) t3 (n3)

CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t1 (n1)

CROSS JOIN (VALUES (0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) t2 (n2)

CROSS JOIN (VALUES (0),(1000),(2000),(3000),(4000),(5000),(6000),(7000),(8000),(9000)) t4 (n4) --

) n

WHERE SUBSTRING(@pString,n,1) = @pDelimiter

) d

Now, to test the duration of these two queries, your best bet is Jeff's test setup which is an attachment at the end of the article. This query yields a good summary of the results:

SELECT

SplitterName,

Min_ = MIN(Duration),

Max_ = MAX(Duration),

Sum_ = SUM(Duration),

Avg_ = AVG(Duration)

FROM (

SELECT *, Placing = ROW_NUMBER() OVER(PARTITION BY NumberOfElements, MaxElementLength ORDER BY Duration)

FROM dbo.TestResults -- 392 rows

) d

GROUP BY SplitterName

ORDER BY SUM(Duration)

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Junglee_George (10/13/2011)
Chris, whats on your mmind..Which Method is better..first or second..??

Easy - test them 😉

It's an unrestricted update though, one pass is likely to be better than two or more.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

drew.allen SSC Guru Points: 77014 More actions · Answer 9

Junglee_George (10/13/2011)
Which one of the following is preferred considering the performance of the SQL query executed..??

Actually, neither one is preferred. The first one is worst--as has already been mentioned--because it performs 10 separate updates as opposed to the 1 update for the second, but both of them are bad, because they both perform 10 separate selects from MainTable. Your query can be rewritten to only perform one select as well.

;

WITH MT AS (

SELECT

Col1 = Count(CASE WHEN Age BETWEEN 1 and 2 THEN [ID] END)

,Col2 = Count(CASE WHEN Age BETWEEN 2 AND 3 THEN [ID] END)

,Col3 = Count(CASE WHEN Age BETWEEN 3 AND 4 THEN [ID] END)

FROM MainTable

WHERE PID = 1

)

UPDATE Table1

SET Col1 = MT.Col1

,Col2 = MT.Col2

,Col3 = MT.Col3

FROM MT

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

VSSGeorge SSCrazy Eights Points: 8151 More actions · Answer 10

Thank You all for the responses...

That provides lots of insight...!!!

Thanks a lot...

Performance Guard (Shehap) Ten Centuries Points: 1297 More actions · Answer 11

Actually , difference between 2 or 3 update statements depend on workload exist there and almost ,it could be such big deal for performance

But actually , I totally agree with the coding algorithm of Drew particularly more for big workload + huge data entity where it could be quite significant performance .

But as more simple + fast query , you could run the below:

DECLARE @COL1 INT , @COL2 INT,@COL3 INT

SELECT @COL1 = ISNULL( Count(CASE WHEN AGE BETWEEN 1 and 2 AND PID =1 THEN 1 ELSE NULL END) ,0)FROM TABLE1

SELECT @COL2= ISNULL( Count(CASE WHEN AGe BETWEEN 2 AND 3 AND PID =1 THEN 1 ELSE NULL END),0)FROM TABLE1

SELECT @COL3= ISNULL(Count(CASE WHEN AGe BETWEEN 3 AND 4 AND PID =1 THEN 1 ELSE NULL END),0)FROM TABLE1

update TABLE1 set col1 =@COL1

, COL2=@COL2

, COL3=@COL3

Where more specific parameters filtration of each count scope would be there thereby , it could benefit within huge data entity if you created 3 filtered nonclustered indexes according to the criteria above

[font="Arial Black"]Performance Guard[/font]

[font="Courier New"]Just ask me to get 0 sec for all queries[/font]

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 12

Junglee_George (10/13/2011)
I have a query regarding Query optimization.[font="Arial"][/font]
Which one of the following is preferred considering the performance of the SQL query executed..??
Method 1
--------
UPDATE Table1 SET Col1 = ( Select Count(Id) from Maintable MT where (MT.Age between 1 and 2) and (MT.PID = @id))
UPDATE Table1 SET Col2 = ( Select Count(Id) from Maintable MT where (MT.Age between 2 and 3) and (MT.PID = @id))
------------------------
------------------------
UPDATE Table1 SET Col10 = ( Select Count(Id) from Maintable MT where (MT.Age between 10 and 11) and (MT.PID = @id))
Method 2
--------
UPDATE Table1 SET Col1 = ( Select Count(Id) from Maintable MT where (MT.Age between 1 and 2) and (MT.PID = @id)),
SET Col2 = ( Select Count(Id) from Maintable MT where (MT.Age between 2 and 3) and (MT.PID = @id)),
------------------------
------------------------
SET Col10 = ( Select Count(Id) from Maintable MT where (MT.Age between 10 and 11) and (MT.PID = @id))

UPDATE Table1 SET Col1 = ( Select Count(Id) from Maintable MT where (MT.Age between 1 and 2) and (MT.PID = @id))

UPDATE Table1 SET Col2 = ( Select Count(Id) from Maintable MT where (MT.Age between 2 and 3) and (MT.PID = @id))

...

You do realize that MT.Age when it is 2 it will be counted in both Col1 and Col2. The same is true for the other columns where the upper bound of one is equal to the lower bound of the next.

Same is true with method 2 as well.