July 7, 2016 at 6:21 am
Hi all,
I have an issue with a query in SQL Server 2012:
SELECT TOP 1000 firmas.TipoPers, firmas.CodPers, GRUPO1.Limite, GRUPO1.Disponible
FROM (
    SELECT TipoPers, CodPers, Num_Grupo
    FROM TABLA1
    WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
) firmas
LEFT JOIN (
    SELECT grupo, SUM(limite) AS Limite, SUM(disp) AS Disponible
    FROM TABLA2
    WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
    GROUP BY grupo
) GRUPO1
    ON firmas.Num_Grupo = GRUPO1.grupo;
It is a production environment and I'm only allowed to see these stats:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 40 ms.
(1000 rows affected)
Table 'TABLA2'. Scan count 9, logical reads 2151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TABLA1'. Scan count 9, logical reads 186, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 249 ms, elapsed time = 31 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
It works fine, but when I run the query with a date value of 20160630 the stats change for the worse:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.
(1000 rows affected)
Table 'TABLA2'. Scan count 1000, logical reads 1916000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TABLA1'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 321141 ms, elapsed time = 412712 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
I have to use TOP 1000 because in the second case the query never finishes. Without the TOP, the first case finishes in 10 seconds at most and returns 400,000 rows.
Regards
July 7, 2016 at 6:58 am
It seems that you could have a data issue. The amount of data is not similar on both dates.
The TOP 1000 is not helping because it needs to group the data before limiting the results.
Make sure you have adequate indexes. Post your table and index definitions to get suggestions. If possible, post the actual execution plan. For information on how to do that, read this: http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
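As a starting point only (the column types and data types are assumptions here, since the table definitions haven't been posted), covering indexes along these lines would let both branches seek on the date filter instead of scanning:

```sql
-- Sketch only: assumes FECDATOMES is datetime and that these column
-- names/types match the real tables. Adjust after reviewing the plan.
CREATE INDEX ix_TABLA2_Fec_Grupo
    ON TABLA2 (FECDATOMES, grupo)
    INCLUDE (limite, disp);

CREATE INDEX ix_TABLA1_Fec
    ON TABLA1 (FECDATOMES)
    INCLUDE (TipoPers, CodPers, Num_Grupo);
```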
By the way, this query will be interpreted the same way as yours (as long as the FECDATOMES column is a datetime):
SELECT TOP 1000
    firmas.TipoPers,
    firmas.CodPers,
    GRUPO1.Limite,
    GRUPO1.Disponible
FROM TABLA1 AS firmas
LEFT JOIN (
    SELECT grupo,
        SUM(limite) AS Limite,
        SUM(disp) AS Disponible
    FROM TABLA2
    WHERE FECDATOMES = '20160531'
    GROUP BY grupo
) GRUPO1
    ON firmas.Num_Grupo = GRUPO1.grupo
WHERE FECDATOMES = '20160531';
July 7, 2016 at 7:21 am
Thanks for your answer.
I checked the distribution of rows by the FECDATOMES field before opening the post, and it is very similar for all dates.
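(Nothing fancy, just a count per date along these lines:)

```sql
-- Row distribution per date; the same check was run against TABLA1.
SELECT FECDATOMES, COUNT(*) AS num_rows
FROM TABLA2
GROUP BY FECDATOMES
ORDER BY FECDATOMES;
```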
About the TOP 1000: I used it just to get some result, because with the second date the query returns only 5,000 rows in an hour.
The tables have a clustered PK and no other indexes. I'm thinking it could be a hardware issue. Is that possible?
I can't post the execution plan because I don't have the rights to run it. Sorry.
July 7, 2016 at 7:32 am
Luis Cazares (7/7/2016)
...The TOP 1000 is not helping because it needs to group the data before limiting the results....
That was my first thought too Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. Even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 7, 2016 at 7:57 am
ChrisM@Work (7/7/2016)
Luis Cazares (7/7/2016)
...The TOP 1000 is not helping because it needs to group the data before limiting the results....
That was my first thought too Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. Even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.
Can you share the test harness? I ran these queries on AdventureWorks and the TOP was the last operation applied to the query. Only when it's used without an aggregate function does it reduce the number of reads.
SET STATISTICS IO ON;
SELECT OrderDate,
SUM(SubTotal) SubTotal,
SUM(TotalDue) TotalDue
FROM Sales.SalesOrderHeader
GROUP BY OrderDate;
SELECT TOP 100 OrderDate,
SUM(SubTotal) SubTotal,
SUM(TotalDue) TotalDue
FROM Sales.SalesOrderHeader
GROUP BY OrderDate;
SELECT TOP 100 OrderDate,
SubTotal,
TotalDue
FROM Sales.SalesOrderHeader;
SELECT TOP 5
    T.Name,
    T.[Group],
    S.SubTotal,
    S.TotalDue
FROM Sales.SalesTerritory AS T
LEFT JOIN (
    SELECT TerritoryID,
        SUM(SubTotal) AS SubTotal,
        SUM(TotalDue) AS TotalDue
    FROM Sales.SalesOrderHeader
    WHERE OrderDate = '20050901'
    GROUP BY TerritoryID
) S
    ON T.TerritoryID = S.TerritoryID;
SET STATISTICS IO OFF;
July 7, 2016 at 8:01 am
This was using local data, but I could share the actual plans with you privately.
July 7, 2016 at 9:32 am
After a bit more fiddling with this, I can get stable and predictable behaviour using an index tailored for my specific query. Adjusting it for the OP's environment yields this:
CREATE INDEX ix_Test ON TABLA2 (grupo, FECDATOMES) INCLUDE (limite, disp);
- which, in every case I've tried now, using wildly different rowcounts in both tables, has worked: the aggregate query is filtered by the outer TOP-limited query.
July 7, 2016 at 9:47 am
Nice Chris, I hope that this helps Eliseo.
I think that I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan indicates otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.
July 8, 2016 at 3:12 am
Luis Cazares (7/7/2016)
Nice Chris, I hope that this helps Eliseo.
I think that I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan indicates otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.
Better still, here's a test harness.
-- Run up a table with a decent number of rows.
-- 31 distinct dates
-- 10 distinct types, 11 to 21
-- 911 rows per date/type combination
IF OBJECT_ID('tempdb..#Trans') IS NOT NULL DROP TABLE #Trans;
;WITH _Tally AS (
SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n)
),
Dates AS (SELECT TOP(31) [date] = DATEADD(DAY,n-1,'20160101') FROM _Tally),
_Types AS (SELECT TOP(10) TypeID = n+10 FROM _Tally),
Iterations AS (SELECT TOP(911) n FROM _Tally)
SELECT
ID = IDENTITY(int,1,1),
d.[Date],
t.TypeID
INTO #Trans
FROM Dates d CROSS JOIN _Types t CROSS JOIN Iterations i
ORDER BY d.[Date], t.TypeID;
CREATE UNIQUE CLUSTERED INDEX ucx_ID ON #Trans (ID);
CREATE INDEX ix_Helper ON #Trans (TypeID, [Date]);
-- 100 rows, TypeID = 1 to 100
IF OBJECT_ID('tempdb..#TypeID') IS NOT NULL DROP TABLE #TypeID;
SELECT TOP 100 TypeID = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
INTO #TypeID
FROM #Trans;
CREATE UNIQUE CLUSTERED INDEX ucx_TypeID ON #TypeID (TypeID);
-----------------------------------------------------------------------
-- Run the query
SELECT TOP(11) *
FROM #TypeID t
LEFT LOOP JOIN (
SELECT TypeID, cnt = COUNT(*)
FROM #trans
WHERE [date] = '20160116'
GROUP BY TypeID
) d ON d.TypeID = t.TypeID;