July 7, 2016 at 6:21 am
Hi all,
I have an issue with a query in SQL Server 2012:
SELECT TOP 1000 firmas.TipoPers, firmas.CodPers, GRUPO1.Limite, GRUPO1.Disponible
FROM (
    SELECT TipoPers, CodPers, Num_Grupo
    FROM TABLA1
    WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
) firmas
LEFT JOIN (
    SELECT grupo, SUM(limite) AS Limite, SUM(disp) AS Disponible
    FROM TABLA2
    WHERE FECDATOMES = CAST(CONVERT(varchar(8), '20160531') AS datetime)
    GROUP BY grupo
) GRUPO1
    ON firmas.Num_Grupo = GRUPO1.grupo;
It is a production environment and I'm only allowed to see these stats:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 40 ms.
(1000 rows affected)
Table 'TABLA2'. Scan count 9, logical reads 2151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TABLA1'. Scan count 9, logical reads 186, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 249 ms, elapsed time = 31 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
It works fine, but when I run the query with a date value of 20160630 the stats change for the worse:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.
(1000 rows affected)
Table 'TABLA2'. Scan count 1000, logical reads 1916000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TABLA1'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 321141 ms, elapsed time = 412712 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
I have to use TOP 1000 because in the second case the query never finishes. Without the TOP, the first case finishes in 10 seconds at most and returns 400,000 rows.
Regards
July 7, 2016 at 6:58 am
It seems that you could have a data issue. The amount of data is not similar on both dates.
The TOP 1000 is not helping because it needs to group the data before limiting the results.
Make sure you have adequate indexes. Post your table and index definitions to get suggestions. If possible, post the actual execution plan. For information on how to do that, read this: http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
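As a starting point only (the column types and data types are assumptions here, since the table definitions haven't been posted), covering indexes along these lines would let both branches seek on the date filter instead of scanning:

```sql
-- Sketch only: assumes FECDATOMES is datetime and that these column
-- names/types match the real tables. Adjust after reviewing the plan.
CREATE INDEX ix_TABLA2_Fec_Grupo
    ON TABLA2 (FECDATOMES, grupo)
    INCLUDE (limite, disp);

CREATE INDEX ix_TABLA1_Fec
    ON TABLA1 (FECDATOMES)
    INCLUDE (TipoPers, CodPers, Num_Grupo);
```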
By the way, this query will be interpreted the same way as yours (as long as the FECDATOMES column is a datetime):
SELECT TOP 1000
    firmas.TipoPers,
    firmas.CodPers,
    GRUPO1.Limite,
    GRUPO1.Disponible
FROM TABLA1 AS firmas
LEFT JOIN (
    SELECT grupo,
        SUM(limite) AS Limite,
        SUM(disp) AS Disponible
    FROM TABLA2
    WHERE FECDATOMES = '20160531'
    GROUP BY grupo
) GRUPO1
    ON firmas.Num_Grupo = GRUPO1.grupo
WHERE FECDATOMES = '20160531';
July 7, 2016 at 7:21 am
Thanks for your answer.
I checked the distribution of rows by the FECDATOMES field before opening the post, and it is very similar for all dates.
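(Nothing fancy, just a count per date along these lines:)

```sql
-- Row distribution per date; the same check was run against TABLA1.
SELECT FECDATOMES, COUNT(*) AS num_rows
FROM TABLA2
GROUP BY FECDATOMES
ORDER BY FECDATOMES;
```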
About the TOP 1000: I used it just to get some result, because with the second date the query returns only 5,000 rows in an hour.
The tables have a clustered PK and no other indexes. I'm thinking it could be a hardware issue. Is that possible?
I can't post the execution plan because I don't have the rights to run it. Sorry.
July 7, 2016 at 7:32 am
Luis Cazares (7/7/2016)
...The TOP 1000 is not helping because it needs to group the data before limiting the results....
That was my first thought too Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. Even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 7, 2016 at 7:57 am
ChrisM@Work (7/7/2016)
Luis Cazares (7/7/2016)
...The TOP 1000 is not helping because it needs to group the data before limiting the results....
That was my first thought too Luis, so I set up a test harness to check it out - and what I find in every case I've tried is that the TOP is applied before the subquery, hence limiting the number of rows aggregated. Even when useful indexes are dropped (I can't drop them all except the CI, as per the OP). I reckon it's vastly different rowcounts on those dates.
Can you share the test harness? I ran these queries on AdventureWorks and the TOP was the last operation applied to the query. Only when it's used without an aggregate function does it reduce the number of reads.
SET STATISTICS IO ON;
SELECT OrderDate,
SUM(SubTotal) SubTotal,
SUM(TotalDue) TotalDue
FROM Sales.SalesOrderHeader
GROUP BY OrderDate;
SELECT TOP 100 OrderDate,
SUM(SubTotal) SubTotal,
SUM(TotalDue) TotalDue
FROM Sales.SalesOrderHeader
GROUP BY OrderDate;
SELECT TOP 100 OrderDate,
SubTotal,
TotalDue
FROM Sales.SalesOrderHeader;
SELECT TOP 5
    T.Name,
    T.[Group],
    S.SubTotal,
    S.TotalDue
FROM Sales.SalesTerritory AS T
LEFT JOIN (
    SELECT TerritoryID,
        SUM(SubTotal) AS SubTotal,
        SUM(TotalDue) AS TotalDue
    FROM Sales.SalesOrderHeader
    WHERE OrderDate = '20050901'
    GROUP BY TerritoryID
) S
    ON T.TerritoryID = S.TerritoryID;
SET STATISTICS IO OFF;
July 7, 2016 at 8:01 am
This was using local data, but I could share the actual plans with you privately.
July 7, 2016 at 9:32 am
After a bit more fiddling with this, I can get stable and predictable behaviour using an index tailored for my specific query. Adjusting it for the OP's environment yields this:
CREATE INDEX ix_Test ON TABLA2 (grupo, FECDATOMES) INCLUDE (limite, disp);
- which, in every case I've tried now, using wildly different rowcounts in both tables, has worked: the aggregate query is filtered by the outer TOP-limited query.
July 7, 2016 at 9:47 am
Nice Chris, I hope that this helps Eliseo.
I think that I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan indicates otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.
July 8, 2016 at 3:12 am
Luis Cazares (7/7/2016)
Nice Chris, I hope that this helps Eliseo.
I think that I understand what you're saying about the TOP filtering before the aggregate, even if the operator order in the execution plan indicates otherwise. However, the reads indicate that the TOP isn't being applied. I would appreciate it if you could share the plans with me.
Better still, here's a test harness.
-- Run up a table with a decent number of rows.
-- 31 distinct dates
-- 10 distinct types, 11 to 21
-- 911 rows per date/type combination
IF OBJECT_ID('tempdb..#Trans') IS NOT NULL DROP TABLE #Trans;
;WITH _Tally AS (
SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n)
),
Dates AS (SELECT TOP(31) [date] = DATEADD(DAY,n-1,'20160101') FROM _Tally),
_Types AS (SELECT TOP(10) TypeID = n+10 FROM _Tally),
Iterations AS (SELECT TOP(911) n FROM _Tally)
SELECT
ID = IDENTITY(int,1,1),
d.[Date],
t.TypeID
INTO #Trans
FROM Dates d CROSS JOIN _Types t CROSS JOIN Iterations i
ORDER BY d.[Date], t.TypeID;
CREATE UNIQUE CLUSTERED INDEX ucx_ID ON #Trans (ID);
CREATE INDEX ix_Helper ON #Trans (TypeID, [Date]);
-- 100 rows, TypeID = 1 to 100
IF OBJECT_ID('tempdb..#TypeID') IS NOT NULL DROP TABLE #TypeID;
SELECT TOP 100 TypeID = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
INTO #TypeID
FROM #Trans;
CREATE UNIQUE CLUSTERED INDEX ucx_TypeID ON #TypeID (TypeID);
-----------------------------------------------------------------------
-- Run the query
SELECT TOP(11) *
FROM #TypeID t
LEFT LOOP JOIN (
SELECT TypeID, cnt = COUNT(*)
FROM #trans
WHERE [date] = '20160116'
GROUP BY TypeID
) d ON d.TypeID = t.TypeID;