January 12, 2012 at 4:48 am
Hi,
I have a query running in 2008 which takes 100 seconds to run. I checked the execution plan and indexes but did not see anything missing. Probably the joins and the logic written could be a bit ugly. Can someone help me please.
declare @trweek int;
set @trweek=201151
select distinct R.TRSITID, R.TRPRODID, R.OPIS,
(SELECT TOP 1 TrWeek FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate)
from transactions_recon R LEFT OUTER JOIN FleetcorOPIS O ON
O.Trweek = (SELECT TOP 1 TrWeek FROM FleetCorOPIS O2 WHERE O2.Start_date
<= R.TrDate AND O2.End_date >= R.TrDate) AND O.TrSitID = R.TrSitID AND
O.TrProdID = R.TrProdID where R.trweek = @trweek AND O.recid IS NULL
Regards
Chandan
January 12, 2012 at 5:00 am
First things first, we could do with some information from you 🙂
Read through this link[/url] to find out how best to post your performance related question.
Also, DDL may be useful.
Below I've formatted your query for clarity
SELECT DISTINCT R.TRSITID, R.TRPRODID, R.OPIS, (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
)
FROM transactions_recon R
LEFT OUTER JOIN FleetcorOPIS O ON O.Trweek = (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
) AND O.TrSitID = R.TrSitID AND O.TrProdID = R.TrProdID
WHERE R.trweek = @trweek AND O.recid IS NULL
And here is my total stab in the dark :-
SELECT DISTINCT R.TRSITID, R.TRPRODID, R.OPIS, outerQ.TrWeek
FROM transactions_recon R
OUTER APPLY (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
) outerQ
LEFT OUTER JOIN FleetcorOPIS O ON O.Trweek = outerQ.TrWeek AND O.TrSitID = R.TrSitID AND O.TrProdID = R.TrProdID
WHERE R.trweek = @trweek AND O.recid IS NULL
January 12, 2012 at 5:51 am
Cadavre (1/12/2012)
First things first, we could do with some information from you 🙂Read through this link[/url] to find out how best to post your performance related question.
Also, DDL may be useful.
Below I've formatted your query for clarity
SELECT DISTINCT R.TRSITID, R.TRPRODID, R.OPIS, (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
)
FROM transactions_recon R
LEFT OUTER JOIN FleetcorOPIS O ON O.Trweek = (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
) AND O.TrSitID = R.TrSitID AND O.TrProdID = R.TrProdID
WHERE R.trweek = @trweek AND O.recid IS NULL
And here is my total stab in the dark :-
SELECT DISTINCT R.TRSITID, R.TRPRODID, R.OPIS, outerQ.TrWeek
FROM transactions_recon R
OUTER APPLY (
SELECT TOP 1 TrWeek
FROM FleetCorOPIS O2
WHERE O2.Start_date <= R.TrDate AND O2.End_date >= R.TrDate
) outerQ
LEFT OUTER JOIN FleetcorOPIS O ON O.Trweek = outerQ.TrWeek AND O.TrSitID = R.TrSitID AND O.TrProdID = R.TrProdID
WHERE R.trweek = @trweek AND O.recid IS NULL
This still takes too long to execute. I am attaching the execution plan. I can see that there is an index scan operator for the table Fleetcoropis and it is executed 10k times when I look at its properties.
Thanks
chandan
January 12, 2012 at 6:31 am
Looking at the execution plan, your indexes may need some work. I would concentrate on the clustered index scan. You've got this as a predicate:
[CardData_Test].[dbo].[FleetcorOPIS].[start_date] as [O2].[start_date]<=[CardData_Test].[dbo].[transactions_recon].[TRDATE] as [R].[TRDATE] AND [CardData_Test].[dbo].[FleetcorOPIS].[end_date] as [O2].[end_date]>=[CardData_Test].[dbo].[transactions_recon].[TRDATE] as [R].[TRDATE]
Now, the scan could be because of the greater than OR equal to, the OR can sometimes lead to scans. But I think it's likely that You might just need better indexing on the table. Experiment with an index on the start and end date. Try reversing the order of the index too, end, then start. See what that does. I'd go ahead and look at the other scan too.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 12, 2012 at 8:02 am
Couple of small index tweaks (just to add an extra included column):
CREATE NONCLUSTERED INDEX
[IX_SuggestedBy_JLG]
ON [dbo].[transactions_recon]
(
[TRWEEK] ASC,
[TRDATE] ASC,
[TRTIME] ASC,
[RecID] ASC
)
INCLUDE ([TRSITID],[TRPRODID],[Qty],[recordkey],OPIS) -- Added OPIS
WITH (DROP_EXISTING = ON)
GO
CREATE NONCLUSTERED INDEX
[SI_FleetcorOPIS_Dates]
ON [dbo].[FleetcorOPIS]
(
[start_date] ASC,
[end_date] ASC
)
INCLUDE (TRWEEK) -- Added this
WITH (DROP_EXISTING = ON)
Query rewritten as:
DECLARE @TRWEEK int;
SELECT
tr.TRSITID,
tr.TRPRODID,
tr.OPIS
FROM dbo.transactions_recon AS tr
CROSS APPLY
(
SELECT TOP (1)
fo.TRWEEK
FROM dbo.FleetcorOPIS AS fo WITH (FORCESEEK)
WHERE
fo.[start_date] <= CONVERT(datetime, tr.TRDATE)
AND fo.end_date >= CONVERT(datetime, tr.TRDATE)
ORDER BY
fo.TRWEEK DESC
) AS Latest (TRWEEK)
WHERE
tr.TRWEEK = @TRWEEK
AND NOT EXISTS
(
SELECT 1
FROM dbo.FleetcorOPIS AS fo
WHERE
fo.TRWEEK = Latest.TRWEEK
AND fo.TRSITID = tr.TRSITID
AND fo.TRPRODID = tr.TRPRODID
)
OPTION (RECOMPILE)
There are multiple problems with the database design, indexing and this query. The data types between TRWEEK and (start_date and end_date) are different: TRWEEK is smalldatetime, the other two are datetime. Workaround is to add a CONVERT as above, though this should be changed in the database to use compatible types wherever possible. Second, the TOP subquery had no ORDER BY clause to tell the TOP which row to pick - you were getting lucky before, or receiving wrong results without knowing it. I assumed the TOP 1 date required was the latest, so the ORDER BY is TRWEEK DESC. If it should be the smallest TRWEEK, change that DESC to ASC.
January 12, 2012 at 8:21 am
Ouch. I didn't catch the date types. Good one Paul.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 12, 2012 at 8:26 am
Grant Fritchey (1/12/2012)
Ouch. I didn't catch the date types. Good one Paul.
It's not at all evident from the plan - I only came across it after running the DDL and looking at rewriting the query. Actually, the mismatched data types bother me much less than the missing ORDER BY on the TOP. The types might cause a performance problem, the missing ORDER BY may well give wrong results, and no-one likes that 🙂
January 12, 2012 at 8:42 am
SQL Kiwi (1/12/2012)
Grant Fritchey (1/12/2012)
Ouch. I didn't catch the date types. Good one Paul.It's not at all evident from the plan - I only came across it after running the DDL and looking at rewriting the query. Actually, the mismatched data types bother me much less than the missing ORDER BY on the TOP. The types might cause a performance problem, the missing ORDER BY may well give wrong results, and no-one likes that 🙂
You pointing that out made me remember a SPROC I tuned yesterday at work where I had added a subquery with a TOP and no ORDER BY. Saved my bacon since it's still in the development cycle until the end of January, so thanks 😉
January 12, 2012 at 8:52 am
Cadavre (1/12/2012)
You pointing that out made me remember a SPROC I tuned yesterday at work where I had added a subquery with a TOP and no ORDER BY. Saved my bacon since it's still in the development cycle until the end of January, so thanks 😉
You're welcome. It's such an easy thing to forget, I sometimes wonder if SQL Server ought not display a warning about it. One might perhaps even make an argument that TOP should *require* an ORDER BY clause.
January 13, 2012 at 12:09 am
SQL Kiwi (1/12/2012)
Grant Fritchey (1/12/2012)
Ouch. I didn't catch the date types. Good one Paul.It's not at all evident from the plan - I only came across it after running the DDL and looking at rewriting the query. Actually, the mismatched data types bother me much less than the missing ORDER BY on the TOP. The types might cause a performance problem, the missing ORDER BY may well give wrong results, and no-one likes that 🙂
Thanks for replies. Well, I agree that we are comparing different columns here i.e smalldate time to datetime but should not it be easy for SQL Server. I can understand that a lot goes inside SQL when we compare a char with varchar which leads to insargeability and thus indexes scans but does it happen the same in the case of comparing smalldatetime vs. datetime?
Regards
Chandan
January 13, 2012 at 1:37 am
chandan_jha18 (1/13/2012)
Thanks for replies. Well, I agree that we are comparing different columns here i.e smalldate time to datetime but should not it be easy for SQL Server. I can understand that a lot goes inside SQL when we compare a char with varchar which leads to insargeability and thus indexes scans but does it happen the same in the case of comparing smalldatetime vs. datetime?
When SQL Server compares values of different types, it has to convert one of them to the type of the other. It makes this decision based on the rules of Data Type Precedence: http://msdn.microsoft.com/en-us/library/ms190309.aspx DATETIME has a higher precedence than SMALLDATETIME, so in a comparison between those two types, SQL Server will convert the SMALLDATETIME to a DATETIME. The script below demonstrates the behaviour on SQL Server 2008 (other versions may show slight differences):
CREATE TABLE #SmallDateTime
(
col1 smalldatetime NOT NULL,
)
CREATE TABLE #DateTime
(
col1 datetime NOT NULL,
col2 datetime NOT NULL
)
INSERT #SmallDateTime
(col1)
VALUES
('2012-01-01'),
('2012-01-03'),
('2012-01-06'),
('2012-01-10'),
('2012-01-11'),
('2012-01-13');
INSERT #DateTime
VALUES ('2012-01-06', '2012-01-13');
CREATE UNIQUE CLUSTERED INDEX cuq
ON #SmallDateTime (col1)
CREATE UNIQUE CLUSTERED INDEX cuq
ON #DateTime (col1)
-- For an equality comparison, SQL Server can
-- take care of all the details for you, and still perform a seek.
-- No problem, simple seek with a hidden conversion
SELECT *
FROM #SmallDateTime AS sdt
JOIN #DateTime AS dt ON
dt.col1 = sdt.col1
-- Extra join and expression to find the range to seek
SELECT *
FROM #SmallDateTime AS sdt
JOIN #DateTime AS dt ON
dt.col1 <= sdt.col1
-- Two ranges need to be computed and merged
-- Extra join and operators
SELECT *
FROM #SmallDateTime AS sdt
JOIN #DateTime AS dt ON
sdt.col1 BETWEEN dt.col1 AND dt.col2
-- Simple seek again
SELECT *
FROM #SmallDateTime AS sdt
JOIN #DateTime AS dt ON
sdt.col1 BETWEEN CONVERT(smalldatetime, dt.col1) AND CONVERT(smalldatetime, dt.col2)
DROP TABLE #SmallDateTime, #DateTime
Where the comparison involves a range of values (as it did in your example), SQL Server has to introduce extra query plan operators to determine the correct start and end points of the range, given the conversion it has to perform as well. These are the query plans for anyone that doesn't have access to a SQL Server right now:
January 13, 2012 at 1:58 am
Thanks Paul for taking time out of your schedule and giving such a detailed explanation. Unfortunately, when a lot of DBAs like me enter the world of already established databases and organizations, we find a lot of things that are bad but we are not sure how to proceed with them.
As you mentioned, I have a lot of procedures where varchar columns are being compared with int, smalldatetime compared with datetime, a lot of functions in where clause, cursors and a lots of them. Now, the very thought of changing schema just scares hell out of everyone and we are asked to make things better without it and we hit a dead end and then DBAs are held resposible if reports do not run fast.
On a side note, as you mentioned that SQL Server internally converts data types as per the precedence, to avoid that we can use convert or cast functions to change the data types to match the data type, won't it mean the same thing? Extra work anyways will be done.
Regards
Chandan
January 13, 2012 at 2:33 am
chandan_jha18 (1/13/2012)
Thanks Paul for taking time out of your schedule and giving such a detailed explanation. Unfortunately, when a lot of DBAs like me enter the world of already established databases and organizations, we find a lot of things that are bad but we are not sure how to proceed with them.As you mentioned, I have a lot of procedures where varchar columns are being compared with int, smalldatetime compared with datetime, a lot of functions in where clause, cursors and a lots of them. Now, the very thought of changing schema just scares hell out of everyone and we are asked to make things better without it and we hit a dead end and then DBAs are held resposible if reports do not run fast.
I understand, though that does not sound like much fun to me, at least in the longer term. Anyway, to explain: I spent some time working on the code you posted, so it would have been remiss of me not to mention things I noticed that could be improved. It's often hard to tell how much the poster of a question is aware of, or what the level of expertise is, so I tend to err on the side of saying too much rather than too little. Feel free to disregard any advice I offered that you do not find useful.
On a side note, as you mentioned that SQL Server internally converts data types as per the precedence, to avoid that we can use convert or cast functions to change the data types to match the data type, won't it mean the same thing? Extra work anyways will be done.
It's very hard to generalize: sometimes a judiciously placed CONVERT will lead the optimizer down a more productive path, sometimes it will not - this is too a deep topic to get into here. One thing is for sure though: you will encounter more issues with incompatible types than without.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply