February 15, 2013 at 3:34 pm
I'm seeing a huge performance difference between two queries that are almost identical. The only SQL difference is in the WHERE clause:
Query1: QueryID=@QID
Query2: QueryID=ISNULL(@QID,A.QueryID)
Full Query 1 takes 100x + longer to execute. @QID is never null in this case b/c is set at the beginning. I would expect a declared variable (int) would be quicker than a function wrapping that declared variable. The actual execution plan says Query2 is more intensive, but actual time to process tells a much different story. Can anyone shed light on this?
Full Query1:
DECLARE @days INT; SET @days=7;
DECLARE @date1 DATETIME; SET @date1=getdate();
DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1);
DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1);
DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');
SELECT COUNT(*) as Visits, Query,
(SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos,
(SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2
FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID
WHERE date>=@date2 AND QueryID=@QID GROUP BY QueryID,Query,PageID;
Full Query2:
DECLARE @days INT; SET @days=7;
DECLARE @date1 DATETIME; SET @date1=getdate();
DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1);
DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1);
DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');
SELECT COUNT(*) as Visits, Query,
(SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos,
(SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2
FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID
WHERE date>=@date2 AND QueryID=ISNULL(@QID,A.QueryID) GROUP BY QueryID,Query,PageID;
February 15, 2013 at 3:37 pm
Could use the DDL for the tables, index definitions, sample data, expected results, and of course the actual execution plans for the two queries.
Other than that, all you may get are shots in the dark.
February 15, 2013 at 5:34 pm
duane+sql (2/15/2013)
I'm seeing a huge performance difference between two queries that are almost identical. The only SQL difference is in the WHERE clause:Query1: QueryID=@QID
Query2: QueryID=ISNULL(@QID,A.QueryID)
Full Query 1 takes 100x + longer to execute. @QID is never null in this case b/c is set at the beginning. I would expect a declared variable (int) would be quicker than a function wrapping that declared variable. The actual execution plan says Query2 is more intensive, but actual time to process tells a much different story. Can anyone shed light on this?
Full Query1:
DECLARE @days INT; SET @days=7; DECLARE @date1 DATETIME; SET @date1=getdate(); DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1); DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1); DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');
SELECT COUNT(*) as Visits, Query, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2 FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID WHERE date>=@date2 AND QueryID=@QID GROUP BY QueryID,Query,PageID;
Full Query2:
DECLARE @days INT; SET @days=7; DECLARE @date1 DATETIME; SET @date1=getdate(); DECLARE @date2 DATETIME; SET @date2=DATEADD(day,-@days,@date1); DECLARE @date3 DATETIME; SET @date3=DATEADD(day,-@days*2,@date1); DECLARE @QID INT; SET @QID = (SELECT id FROM Queries WHERE Query='abc');
SELECT COUNT(*) as Visits, Query, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date2) AS Pos, (SELECT AVG(CAST(Pos AS DECIMAL)) FROM GooglePos WHERE QueryID=A.QueryID AND PageID=A.PageID AND date>=@date3 AND date<@date2) AS Pos2 FROM Visits A INNER JOIN Pages B ON A.PageID=B.id INNER JOIN Queries E ON E.id=A.QueryID WHERE date>=@date2 AND QueryID=ISNULL(@QID,A.VisitID) GROUP BY QueryID,Query,PageID;
The second query isn't SARGable which means it's not capable of using an INDEX SEEK because you have a column inside of a function.
Test and see. This makes a million rows of data with a clustered index on the only column. It takes scant seconds to run so don't let the big number scare you.
--===== Create a test table and populate it on the fly.
SELECT TOP 1000000
QueryID = IDENTITY(INT,1,1)
INTO #TestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
--===== Add the expected index.
ALTER TABLE #TestTable
ADD PRIMARY KEY CLUSTERED (QueryID)
;
Now, turn on the Actual Execution plan and let the following code rip. I've added another bit of code to show what else you don't want to do.
--===== Before you run this section, turn on the Actual Execution Plan.
-- Then run it and see the difference on the message tab and the AEP.
DECLARE @QID INT;
SELECT @QID = 999999;
SET STATISTICS IO, TIME ON;
SELECT QueryID FROM #TestTable WHERE QueryID=@QID;
SELECT QueryID FROM #TestTable WHERE QueryID=ISNULL(@QID,QueryID);
SELECT QueryID FROM #TestTable WHERE (@QID IS NULL OR QueryID = @QID);
SET STATISTICS IO, TIME OFF;
It sounds like you might be leaning toward a "catch all" query. Please see Gail Shaw's wonderful article on how to do such a thing correctly. Here's the link.
http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/
As a bit of a sidebar, never trust execution plan comparisons to tell you which code will be the fastest. Even Actual Execution Plans have a lot of estimated information in them that throw things like % of Batch way out of whack.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 15, 2013 at 7:58 pm
Jeff, you are right. The second should take longer, but actually the 1st query takes MUCH longer in this case. I just don't understand why in this case.
Here are stats from the actual queries:
(1 row(s) affected)
(199 row(s) affected)
Table 'GooglePos'. Scan count 398, logical reads 58888876, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Pages'. Scan count 0, logical reads 2116, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Visits'. Scan count 1, logical reads 459642, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Queries'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 88858 ms, elapsed time = 88961 ms.
(199 row(s) affected)
Table 'Pages'. Scan count 17, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Visits'. Scan count 1, logical reads 93, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Queries'. Scan count 17, logical reads 933, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'GooglePos'. Scan count 2, logical reads 15, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 127 ms, elapsed time = 177 ms.
February 22, 2013 at 3:22 pm
Recap:
Query 1 has in the WHERE clause: "QueryID=@QID". As we can see from the statistics Query 1 does a ton more logical reads for the subqueries contain in the full SQL. Why would Query 1 do these logical reads while Query 2 does not? Query 2 has in the WHERE clause: "QueryID=ISNULL(@QID,A.QueryID)". This is the only difference between Query 1 and Query 2.
The GooglePos table is as follows:
id int
Date smalldatetime
Pos int
PageID int
QueryID int
February 22, 2013 at 3:29 pm
Please post the DDL for the table(s) including index definitions. Also, please post the actual execution plan for both of the queries. These can be saved as .sqlplan files and uploaded to ssc.
February 22, 2013 at 4:32 pm
DDL w/ index
USE [MaxF]
GO
/****** Object: Table [dbo].[Visits] Script Date: 02/22/2013 17:25:41 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Visits](
[id] [bigint] NOT NULL,
[Date] [smalldatetime] NULL,
[SiID] [int] NULL,
[PageID] [int] NULL,
[QueryID] [int] NULL,
[RefID] [int] NULL,
[Referrer] [varchar](256) NULL,
127.0.0.1 [varchar](15) NULL,
[AffLinkID] [int] NULL,
[Cchk] [bit] NULL,
[Rchk] [bit] NULL,
[CampID] [int] NULL,
CONSTRAINT [PK_Visits] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
CREATE NONCLUSTERED INDEX [Date_nc] ON [dbo].[Visits]
(
[Date] ASC
)
INCLUDE ( [PageID],
[QueryID],
[RefID],
[AffLinkID],
[Cchk],
[Rchk],
[CampID],
[Referrer],
127.0.0.1) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [PageID_nc] ON [dbo].[Visits]
(
[PageID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [QueryID_nc] ON [dbo].[Visits]
(
[QueryID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [RefID_nc] ON [dbo].[Visits]
(
[RefID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [SiID-QID-Date] ON [dbo].[Visits]
(
[SiID] ASC,
[QueryID] ASC,
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
/****** Object: Table [dbo].[Queries] Script Date: 02/22/2013 17:25:41 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Queries](
[id] [int] IDENTITY(1,1) NOT NULL,
[Query] [nvarchar](100) NULL,
CONSTRAINT [PK_Queries] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [Query_nc] ON [dbo].[Queries]
(
[Query] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
/****** Object: Table [dbo].[GooglePos] Script Date: 02/22/2013 17:25:41 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[GooglePos](
[id] [int] IDENTITY(1,1) NOT NULL,
[Date] [smalldatetime] NULL,
[Pos] [int] NULL,
[PageID] [int] NULL,
[QueryID] [int] NULL,
CONSTRAINT [PK_GooglePos] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [Date_nc] ON [dbo].[GooglePos]
(
[Date] ASC
)
INCLUDE ( [Pos],
[PageID],
[QueryID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [PageID_nc] ON [dbo].[GooglePos]
(
[PageID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [QueryID_nc] ON [dbo].[GooglePos]
(
[QueryID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
February 22, 2013 at 5:06 pm
In the first execution plan on the Operator Index Seek QueryId_nc you can see that the Estimated Number of Rows is about 4 and Actual Number of Rows is 34Mio! SQL Server Optimizer has choosen an Index Seek followed by Lookup because it expected only 4 rows to be returned and that's the reason why you have a lot of logical reads.
In the second case ISNULL operator disallowed the optimizer to choosing Index Seek and in this case this desicion was better.
The reason why the optimizer had a bad estimation in first case is usage of local variable. When you use local variables the optimizerhas to generate the plan for an unknown value and the real value has been evaluated at the run-time and this is too late for the exeution plan. The plan has been already created.
You can use OPTION (RECOMPILE) hint at the end of the first query to force generating the plan at the statement level which allows the optimizer a better estimation and finally to generate an optimal execution plan.
___________________________
Do Not Optimize for Exceptions!
February 26, 2013 at 5:39 pm
Thank you so much for the explanation!
February 27, 2013 at 12:04 am
Parameter sniffing, or lack thereof: http://sqlinthewild.co.za/index.php/2008/02/25/parameter-sniffing-pt-2/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply