January 27, 2014 at 2:16 pm
Hi ,
I have a procedure that loads about 60 million rows using code similar to below. I just wrote up this sample code and loaded with about 7 million rows to test the execution plan. It seems like it uses a sort operation to return a few max() over(partition by) different columns. what would be the best way to rewrite the query or process to get better performance. The code used to create the sample tables and load them is commented.
Please let me know any suggestions . Thank you
--DROP TABLE TestExecPlans
--CREATE TABLE TestExecPlans
--(Usr INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
--val1 INT NOT NULL,
--val2 INT NOT NULL,
--val3 INT NOT NULL,
--pal1 INT NOT NULL,
--pal2 INT NOT NULL,
--pal3 INT NOT NULL,
--)
--CREATE CLUSTERED INDEX IX_TestEx_Usr on TestExecPlans(Usr)
--DROP INDEX IX_TestEx_Usr on TestExecPlans
--DROP TABLE DestTable
--CREATE TABLE pparpati.DestTable
--(Usr INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
--val1 INT NULL,
--val2 INT NULL,
--val3 INT NULL,
--pal1 INT NULL,
--pal2 INT NULL,
--pal3 INT NULL,
--)
--CREATE NONCLUSTERED INDEX IX_val1 ON Desttable(val1);
--INSERT INTO TestExecPlans
-- ( val1, val2, val3, pal1, pal2, pal3 )
--VALUES
--(1,2,5,1,2,5),
--(3,3,3,3,3,3),
--(0,1,0,0,1,0)
--INSERT INTO TestExecPlans
--SELECT val1, val2, val3, pal1, pal2, pal3 FROM TestExecPlans
--SELECT count(*) FROM TestExecPlans
;WITH CTE1 AS
(
SELECT Usr,val1,val2+1 AS val2, val3, pal1 FROM TestExecPlans WHERE val1=1
),
CTE2 AS
(
SELECT Usr,val1,val2+1 AS val2, val3, pal1 FROM TestExecPlans WHERE val1=3
),
CTE3 AS
(
SELECT Usr,val1,val2+1 AS val2, val3, pal1 FROM TestExecPlans WHERE val1=0
)
INSERT INTO DestTable
SELECT c1.val1, c2.val2, c3.val3,
(SELECT MIN(mv)FROM (VALUES (c1.val1), (c2.val2), (c3.val3)) AS t(mv)) AS MinSimilarity,
(SELECT MAX(mv)FROM (VALUES (c1.val1), (c2.val2), (c3.val3)) AS t(mv)) AS MaxSimilarity,
MAX(c1.val1) OVER (PARTITION BY ISNULL(c1.Usr, ISNULL(c2.Usr,c3.Usr))) AS maxval
--(SELECT MIN(mv)FROM (VALUES (pal1), (pal2), (pal3)) AS t(mv)) AS MinSimilarity,
--(SELECT MAX(mv)FROM (VALUES (pal1), (pal2), (pal3)) AS t(mv)) AS MaxSimilarity
FROM CTE1 c1 FULL OUTER JOIN CTE2 c2 ON c1.Usr=c2.Usr FULL OUTER JOIN CTE3 c3 ON ISNULL(c1.usr,c2.usr)=c3.Usr
January 31, 2014 at 9:14 am
anyone can think of anything for this one. I have n't been able to figure it. Thanks
January 31, 2014 at 3:16 pm
I am looking at this. Is it possible for you to post the actual execution plan that is created when you run this query?
-- Itzik Ben-Gan 2001
January 31, 2014 at 4:25 pm
This can still be optimized (I'm out of time here). But this should be an enormous improvement.
WITH CTE AS
(
SELECT Usr,
CASE WHEN val1=1 THEN 1 END AS val1,
CASE WHEN val1=3 THEN val2+1 END AS val2,
CASE WHEN val1=0 THEN val3 END AS val3
FROM TestExecPlans
)
INSERT INTO DestTable
SELECT *,
(SELECT MIN(mv)
FROM (VALUES (val1), (val2), (val3)) AS t(mv)) AS MinSimilarity,
(SELECT MAX(mv)
FROM (VALUES (val1), (val2), (val3)) AS t(mv)) AS MaxSimilarity,
MAX(val1) OVER (PARTITION BY Usr) AS maxval
FROM CTE;
-- Itzik Ben-Gan 2001
February 1, 2014 at 1:10 am
venus.pvr (1/31/2014)
anyone can think of anything for this one. I have n't been able to figure it. Thanks
Your query almost certainly isn't doing what you are expecting. The column Usr is unique. There's no point in looking for three rows for the same usr and different values of val1 and full-outer-joining them because there will always be only one, even in 60 million rows. Your query is equivalent to this:
SELECT
val1 = CASE WHEN val1=1 THEN val1 END,
val2 = CASE WHEN val1=3 THEN val2+1 END,
val3 = CASE WHEN val1=0 THEN val3 END,
MinSimilarity = CASE WHEN val1=1 THEN val1 WHEN val1=3 THEN val2+1 WHEN val1=0 THEN val3 END,
MaxSimilarity = CASE WHEN val1=1 THEN val1 WHEN val1=3 THEN val2+1 WHEN val1=0 THEN val3 END,
maxval = CASE WHEN val1=1 THEN val1 END
FROM TestExecPlans
- which is a heck of a lot cheaper, but still pointless.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
February 1, 2014 at 10:55 am
Hi Alan/Chris,
Sorry, I did not post my original query since I didn't want to post the orig code. But all those parts are needed for my case. I wrote something similar. But I think what I wrote doesn't make sense at all. So it will be hard to interpret all this..
I think I will try to write up some code which makes sense and also post the execution plan.
thank you very much for your time.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply