August 12, 2011 at 5:13 pm
We have a large table in one of our databases on which we are testing out Page-Level Compression. This is a test database, so periodically we truncate out the table and reload it from our production environment. The table itself holds between 800 million and 3 billion rows, depending on how much data we are loading. Each row is 154 bytes, on average, including indexes. The table has a field named rootFlowdayNumber, which is a composite of rootElement (an internal number for our system) and the Flowday (date to which the data relates). The table is partitioned on this number, resulting in every day being on it's own partition.
The table looks like this:
[identifier] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[maxVersionTsUtc] [datetime2](3) NOT NULL,
[maxVersionTsUtc_value] [varbinary](max) NULL,
[dataPointIdentifier] [uniqueidentifier] NOT NULL,
[elementIdentifier] [uniqueidentifier] NOT NULL,
[intervalUtc] [smalldatetime] NOT NULL,
[sequenceNumber] [decimal](9, 3) NULL,
[rootFlowdayNumber] [int] NOT NULL
The query we are running is:
select
Data.dataPointIdentifier,
Data.elementIdentifier,
@flowday as flowday,
Data.intervalUtc,
Data.sequenceNumber,
null as sourceSequenceNumber,
Data.maxVersionTsUtc as versionTsUtc,
Data.maxVersionTsUtc_value as value
from
@possibleValues PV
inner join data_calc.[vw_Data] as Data on
Data.rootFlowdayNumber = @rootFlowdayNumber
and Data.sequenceNumber is null
and Data.elementIdentifier = PV.elementIdentifier
and Data.dataPointIdentifier = PV.dataPointIdentifier
and Data.intervalUtc between @startIntervalUtc and @endIntervalUtc
and Data.maxVersionTsUtc_value is not null
option(force order)
@possibleValues is a combincation of the dataPointIdentifiers and elementIdentifiers that are logically possible to related (not all dataPointIdentifiers can be referenced by each elementIdentifier).
A few weeks ago, we set all partitions on this table to utilize page compression and rebuilt the indexes.
A few days ago, we truncated the table and reloaded 800 million rows from production.
Starting yesterday, when we ran this query, the results took a LOT longer than normal to run. When we would run that same query for that same flowday again, it ran as quickly as normal.
We found that each run was generating the same execution plan with only one difference, the first (slower) run had a "Table Spool (Lazy Spool)" object, taking 54% to 63% percent of the cost; the second run did not.
I am attaching the estimated sqlPlan we are generating of the slower run.
Please let me know what you think.
August 12, 2011 at 5:38 pm
It would be a good idea to find out what the actual and estimated row differences look like in the execution plan. Since there is a table variable being used, there won't be any statistics being used for the table variable. But would be a good idea to check the XML plan compiled parameter values actually match the parameter values passed to it during the runtime. From the attached plan, I see the parameters used to compile the plan was:
<ColumnReference Column="@endIntervalUtc" ParameterCompiledValue="'2011-08-14 05:00:00.000'" />
<ColumnReference Column="@startIntervalUtc" ParameterCompiledValue="'2011-08-13 05:05:00.000'" />
<ColumnReference Column="@rootFlowdayNumber" ParameterCompiledValue="(17511577)" />
This posting is provided "AS IS" with no warranties, and confers no rights.
My Blog: TroubleshootingSQL
Twitter: @banerjeeamit
August 12, 2011 at 5:44 pm
Unfortunately, I would have to get my devs to give me an elementIdentifier list and datapointIdentifier list. I am sending them an e-mail asking for said information, but unfortunately it may be too late on a Friday to get an answer today 🙂
I'll update the list when I get a reply from them.
August 12, 2011 at 5:46 pm
I know I asked this on Twitter as well... Check if the plan changes after updating stats with a fullscan on the table.
This posting is provided "AS IS" with no warranties, and confers no rights.
My Blog: TroubleshootingSQL
Twitter: @banerjeeamit
August 12, 2011 at 5:48 pm
Will do, as soon as the devs get back to me. I'll run the query before and after and post plans from each.
August 12, 2011 at 5:52 pm
Can you post the code from the View. Looks like there is a concatenation operation that is triggering the spool. Is the view indexed? If so, what Editions is test and production?
August 12, 2011 at 5:56 pm
The view is not indexed, everything is SQL 2008 Enterprise.
The view is:
CREATE VIEW [data_calc].[vw_Data]
WITH SCHEMABINDING
AS
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERNDL_2011] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERZNL_2012] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERNDL_2012] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERZNL_2013] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERNDL_2013] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERZNL_2010] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERNDL_2010] (NOLOCK)
UNION ALL
SELECT [identifier], [maxVersionTsUtc],[maxVersionTsUtc_value],[dataPointIdentifier],[elementIdentifier],[intervalUtc],[sequenceNumber],[rootFlowdayNumber]
FROM [data_calc].[tbl_Data_ERZNL_2011] (NOLOCK)
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply