August 6, 2013 at 5:50 am
Hi,
Pl. suggested me, how to resolve this issuse? In select statment sub-tree DISTINCT cost is 79%. as per atttached actual exec.plan. Is it possible wirtting query alternative ways?
UPDATE CSV_Details_MainFile
SET Mark_Rev_No = (CASE
WHEN (SELECT COUNT(mark_rev_no) as mark_rev_no
FROM CSV_Details_MainFile H1
WHERE H1.GA_Drg_NO = CSV_Details_MainFile.GA_Drg_NO
AND H1.Mark_No = CSV_Details_MainFile.Mark_No) != 0
THEN
(SELECT ISNULL(mark_rev_no,0) AS mark_rev_no FROM CSV_Details_MainFile H1
WHERE H1.GA_Drg_NO = CSV_Details_MainFile.GA_Drg_NO
and H1.Rev_NO = CSV_Details_MainFile.Rev_NO
AND H1.Mark_No = CSV_Details_MainFile.Mark_No
and DeleteFlag='1'
GROUP BY H1.GA_Drg_NO,H1.Mark_No ,H1.mark_rev_no)
ELSE
'1'
END)
WHERE CSV_Details_MainFile.GA_Drg_NO='C63-GPE105-499-005' AND Rev_NO = '1' AND
CSV_Details_MainFile.Mark_No in (SELECT DISTINCT Mark_No FROM CSV_Details_MainFile
WHERE GA_Drg_NO = 'C63-GPE105-499-005' AND Rev_NO ='1' and DeleteFlag is null)
Thanks
ananda
August 6, 2013 at 6:24 am
Yes of course, but folks will need a sample data script and a decent description of what the query is supposed to do.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 6, 2013 at 6:33 am
ok.. I will update you once getting from Dev.Team for the all details..
thanks
ananda
August 6, 2013 at 6:39 am
Is that 79% cost of the DISTINCT a problem?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 6, 2013 at 7:17 am
You are using the Distinct in a sub-select in the where clause using IN, why use distinct at all? Whether you have 1 record that matches or 100 records it won't cause dupicates because it is just checking the "IN" and not a join.
August 6, 2013 at 7:28 am
Sarah Wagner (8/6/2013)
You are using the Distinct in a sub-select in the where clause using IN, why use distinct at all? Whether you have 1 record that matches or 100 records it won't cause dupicates because it is just checking the "IN" and not a join.
The optimiser's smart enough to ignore distincts in an IN subquery. If removing duplicate rows is required, the optimiser will put in some distinct operation, if not it won't. That's regardless of what is specified in the query.
http://sqlinthewild.co.za/index.php/2011/01/18/distincting-an-in-subquery/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 6, 2013 at 7:44 am
When you display execution plan in SSMS, does it suggest any new index?
From what I see, it could at least use an index on Mark_No, but query plan analyzer may suggest a better composite index.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 6, 2013 at 9:26 am
ananda.murugesan (8/6/2013)
ok.. I will update you once getting from Dev.Team for the all details..thanks
ananda
Here's a guess at the business logic:
Where a set, partitioned on GA_Drg_NO, Rev_NO and Mark_No,
contains at least one row where DeleteFlag IS NULL and at least one row where DeleteFlag = '1',
pick the value of mark_rev_no from a row where DeleteFlag = '1'
and apply it to all rows of the set.
Can you confirm if this is correct or not and correct it if it it isn't?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 6, 2013 at 11:51 am
Hi, ChrisM@Work.. wow what a understand that update query logic..Absloute correct...
Hi, Eric M Russell.. SSMS - actual execution plan suggested to create one Non clustred index, that was created... (index name is Idx_ActualPlan)
CREATE NONCLUSTERED INDEX [Idx_ByActual_Plan] ON [dbo].[CSV_Details_MainFile]
(
[GA_Drg_NO] ASC,
[Rev_NO] ASC,
[Mark_No] ASC,
[DeleteFlag] ASC
)
INCLUDE ( [Mark_Rev_No]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Before total execution time was 00:00:28 seconds, after that it was just took 00:00:02 seconds after created NC index..But second time actual exex.plan was showing 72% of total cost in SELECT DISTINCT sort.
Thanks
ananda
August 6, 2013 at 1:33 pm
-- sample data
DROP TABLE #CSV_Details_MainFile
CREATE TABLE #CSV_Details_MainFile (GA_Drg_NO VARCHAR(25), Rev_NO CHAR(1), Mark_No VARCHAR(3), mark_rev_no VARCHAR(3), DeleteFlag CHAR(1))
INSERT INTO #CSV_Details_MainFile (GA_Drg_NO, Rev_NO, Mark_No, mark_rev_no, DeleteFlag) VALUES
('C63-GPE105-499-003','1','10','10','1'),
('C63-GPE105-499-003','1','10','10','1'),
('C63-GPE105-499-004','1','10','10',NULL),
('C63-GPE105-499-004','1','10','40','1'),
('C63-GPE105-499-004','2','10','10',NULL),
('C63-GPE105-499-004','2','10','10',NULL),
('C63-GPE105-499-005','1','10','10',NULL),
('C63-GPE105-499-005','1','10','10',NULL),
('C63-GPE105-499-005','1','10','20','1'),
('C63-GPE105-499-005','1','10','10','1')
CREATE CLUSTERED INDEX cx_LoadsOfStuff ON #CSV_Details_MainFile (GA_Drg_NO, Rev_NO, Mark_No, DeleteFlag)
----------------------------------------------------------------------------------------------------------------------------
-- Stare & Compare
SELECT *
FROM #CSV_Details_MainFile m
INNER JOIN (
SELECT GA_Drg_NO, Rev_NO, Mark_No, mark_rev_no = MAX(mark_rev_no)
FROM (
SELECT GA_Drg_NO, Rev_NO, Mark_No, mark_rev_no, DeleteFlag,
rn = DENSE_RANK() OVER(PARTITION BY GA_Drg_NO, Rev_NO, Mark_No ORDER BY DeleteFlag)
FROM #CSV_Details_MainFile
) d
WHERE rn = 2
GROUP BY GA_Drg_NO, Rev_NO, Mark_No
) s
ON s.GA_Drg_NO = m.GA_Drg_NO
AND s.Rev_NO = m.Rev_NO
AND s.Mark_No = m.Mark_No
----------------------------------------------------------------------------------------------------------------------------
-- Execute proposed new method
UPDATE m SET Mark_Rev_No = s.mark_rev_no
FROM #CSV_Details_MainFile m
INNER JOIN (
SELECT GA_Drg_NO, Rev_NO, Mark_No, mark_rev_no = MAX(mark_rev_no)
FROM (
SELECT GA_Drg_NO, Rev_NO, Mark_No, mark_rev_no, DeleteFlag,
rn = DENSE_RANK() OVER(PARTITION BY GA_Drg_NO, Rev_NO, Mark_No ORDER BY DeleteFlag)
FROM #CSV_Details_MainFile
) d
WHERE rn = 2
GROUP BY GA_Drg_NO, Rev_NO, Mark_No
) s
ON s.GA_Drg_NO = m.GA_Drg_NO
AND s.Rev_NO = m.Rev_NO
AND s.Mark_No = m.Mark_No
WHERE m.GA_Drg_NO = 'C63-GPE105-499-005' AND m.Rev_NO = '1'
-- execute old method
UPDATE #CSV_Details_MainFile
SET Mark_Rev_No = (CASE
WHEN (SELECT COUNT(mark_rev_no) as mark_rev_no
FROM #CSV_Details_MainFile H1
WHERE H1.GA_Drg_NO = #CSV_Details_MainFile.GA_Drg_NO
AND H1.Mark_No = #CSV_Details_MainFile.Mark_No) != 0
THEN
(SELECT ISNULL(mark_rev_no,0) AS mark_rev_no FROM #CSV_Details_MainFile H1
WHERE H1.GA_Drg_NO = #CSV_Details_MainFile.GA_Drg_NO
and H1.Rev_NO = #CSV_Details_MainFile.Rev_NO
AND H1.Mark_No = #CSV_Details_MainFile.Mark_No
and DeleteFlag='1'
GROUP BY H1.GA_Drg_NO,H1.Mark_No ,H1.mark_rev_no)
ELSE
'1'
END)
WHERE #CSV_Details_MainFile.GA_Drg_NO='C63-GPE105-499-005' AND Rev_NO = '1'
AND #CSV_Details_MainFile.Mark_No in (SELECT DISTINCT Mark_No FROM #CSV_Details_MainFile
WHERE GA_Drg_NO = 'C63-GPE105-499-005' AND Rev_NO ='1' and DeleteFlag is null)
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
August 6, 2013 at 1:52 pm
ananda.murugesan (8/6/2013)
Before total execution time was 00:00:28 seconds, after that it was just took 00:00:02 seconds after created NC index..But second time actual exex.plan was showing 72% of total cost in SELECT DISTINCT sort.
Again, is that 72% a problem? Why are you worrying again about operator costs?
To emphasise the point, if I have a query that has 95% of its cost in a clustered index scan, is that always, automatically and without further consideration a problem that needs to be addressed?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 6, 2013 at 2:58 pm
No more problem addressed even query working fine with 00:00:02 seconds and DISTINCT cost 72%..
Thank you..
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply