November 25, 2018 at 11:02 pm
The table is practically a heap because its' clustered index is on a column that allows NULLs, and there are hundreds of thousands of NULLs.
Whenever I run my query the execution plan shows that the clustered index is always used (all 32 million rows scanned everytime), nevermind my query doesn't even search against the clustered index column, and despite the presence of non clustered indexes for each of the filtered columns.
During daytime my report query takes 3-4 minutes to return 9 rows and at night 5 seconds. Business needs it to return in 1 second at all times.
Clustered Index: OfferID
Non Clustered Index1: Account#
Non Clustered Index3: BulkId
Non Clustered Index4: IMEI
Non Clustered Index2: Date
My query:SELECT <25 columns> WHERE Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR Date >= @startdate AND Date <= @enddate.
Execution plan used shows three operations: Clustered Index Scan, Parallelism, and SELECT.
Please suggest how to optimize. Can the table be optimized for my query if it has no PK, and clustered index permits NULLs, and all columns being filtered also allow nulls?
--Quote me
November 26, 2018 at 4:19 am
polkadot - Sunday, November 25, 2018 11:02 PMThe table is practically a heap because its' clustered index is on a column that allows NULLs, and there are hundreds of thousands of NULLs.Whenever I run my query the execution plan shows that the clustered index is always used (all 32 million rows scanned everytime), nevermind my query doesn't even search against the clustered index column, and despite the presence of non clustered indexes for each of the filtered columns.
During daytime my report query takes 3-4 minutes to return 9 rows and at night 5 seconds. Business needs it to return in 1 second at all times.
Clustered Index: OfferID
Non Clustered Index1: Account#
Non Clustered Index3: BulkId
Non Clustered Index4: IMEI
Non Clustered Index2: DateMy query:SELECT <25 columns> WHERE Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR Date >= @startdate AND Date <= @enddate.
Execution plan used shows three operations: Clustered Index Scan, Parallelism, and SELECT.
Please suggest how to optimize. Can the table be optimized for my query if it has no PK, and clustered index permits NULLs, and all columns being filtered also allow nulls?
How many columns are in the table?
Which of the filter columns is most selective?
How many rows are returned by the query? Obviously this will vary - a low and high figure will be useful.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 26, 2018 at 5:05 am
Can you share the execution plan. PK and CI is different. You can also check the fragmentation, since it's a heap.
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
November 26, 2018 at 6:44 am
muthukkumaran Kaliyamoorthy - Monday, November 26, 2018 5:05 AMCan you share the execution plan. PK and CI is different. You can also check the fragmentation, since it's a heap.
One, it's not a heap if it has a clustered index (however bad the clustered index may be). Two, fragmentation won't affect plan creation and index choices (there's an exception to this, but not worth getting into here). Be cautious on advice along these lines.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 26, 2018 at 6:50 am
polkadot - Sunday, November 25, 2018 11:02 PMThe table is practically a heap because its' clustered index is on a column that allows NULLs, and there are hundreds of thousands of NULLs.Whenever I run my query the execution plan shows that the clustered index is always used (all 32 million rows scanned everytime), nevermind my query doesn't even search against the clustered index column, and despite the presence of non clustered indexes for each of the filtered columns.
During daytime my report query takes 3-4 minutes to return 9 rows and at night 5 seconds. Business needs it to return in 1 second at all times.
Clustered Index: OfferID
Non Clustered Index1: Account#
Non Clustered Index3: BulkId
Non Clustered Index4: IMEI
Non Clustered Index2: DateMy query:SELECT <25 columns> WHERE Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR Date >= @startdate AND Date <= @enddate.
Execution plan used shows three operations: Clustered Index Scan, Parallelism, and SELECT.
Please suggest how to optimize. Can the table be optimized for my query if it has no PK, and clustered index permits NULLs, and all columns being filtered also allow nulls?
Without the plan and the structure, any advice is just guesses. So, all those OR clauses make this look like a catch-all query. You should read this article by Gail Shaw on how to better deal with this.
One question on the logic there. Is it this: (Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR Date >= @startdate) AND Date <= @enddate
Or, is it this: Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR (Date >= @startdate AND Date <= @enddate)
Or what you already have?
I'd suggest using parenthesis to quantify that for the optimizer. It will make a difference in the indexes you build.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 26, 2018 at 7:16 am
polkadot - Sunday, November 25, 2018 11:02 PMThe table is practically a heap because its' clustered index is on a column that allows NULLs, and there are hundreds of thousands of NULLs.Whenever I run my query the execution plan shows that the clustered index is always used (all 32 million rows scanned everytime), nevermind my query doesn't even search against the clustered index column, and despite the presence of non clustered indexes for each of the filtered columns.
During daytime my report query takes 3-4 minutes to return 9 rows and at night 5 seconds. Business needs it to return in 1 second at all times.
Clustered Index: OfferID
Non Clustered Index1: Account#
Non Clustered Index3: BulkId
Non Clustered Index4: IMEI
Non Clustered Index2: DateMy query:SELECT <25 columns> WHERE Account# = @account# OR BulkID = @BulkID OR IMEI = @IMEI OR Date >= @startdate AND Date <= @enddate.
Execution plan used shows three operations: Clustered Index Scan, Parallelism, and SELECT.
Please suggest how to optimize. Can the table be optimized for my query if it has no PK, and clustered index permits NULLs, and all columns being filtered also allow nulls?
Post the DDL for the table and all indices please, makes it easier to contribute to the problem's solution.
😎
Suggest you also post the execution plan😉
November 26, 2018 at 7:17 am
polkadot - Sunday, November 25, 2018 11:02 PMNon Clustered Index4: IMEI
Is this mobile CDR data?
😎
November 26, 2018 at 7:52 am
Your query is equivalent to this:SELECT *
FROM myTable
WHERE Account# = @account#
UNION
SELECT *
FROM myTable
WHERE BulkID = @BulkID
UNION
SELECT *
FROM myTable
WHERE IMEI = @IMEI
UNION
SELECT *
FROM myTable
WHERE Date BETWEEN @startdate AND Date @enddate
You might be able to get it to use the indexes if you rewrite the query like that.
The indexes should also include the 25 columns you are selecting.
November 26, 2018 at 7:55 am
Grant Fritchey - Monday, November 26, 2018 6:44 AMmuthukkumaran Kaliyamoorthy - Monday, November 26, 2018 5:05 AMCan you share the execution plan. PK and CI is different. You can also check the fragmentation, since it's a heap.One, it's not a heap if it has a clustered index (however bad the clustered index may be). Two, fragmentation won't affect plan creation and index choices (there's an exception to this, but not worth getting into here). Be cautious on advice along these lines.
The table is practically a heap because its' clustered index is on a column that allows NULLs, and there are hundreds of thousands of NULLs.
Oh bad, I thought it was a heap. ya, right fragmentation will not affect the plan creation and it slow the data fetch.
Sure. I will read carefully.
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
November 26, 2018 at 8:06 am
Jonathan AC Roberts - Monday, November 26, 2018 7:52 AMYour query is equivalent to this:SELECT *
FROM myTable
WHERE Account# = @account#
UNION
SELECT *
FROM myTable
WHERE BulkID = @BulkID
UNION
SELECT *
FROM myTable
WHERE IMEI = @IMEI
UNION
SELECT *
FROM myTable
WHERE Date >= @startdate A
AND Date <= @enddatYou might be able to get it to use the indexes if you rewrite the query like that.
The indexes should also include the 25 columns you are selecting.
25 columns is likely to be most or all of the columns (guessing of course).
Next guess - a small number of rows would be best returned with a seek of a narrow index and key lookup.
Some idea of the data distribution might help, however I suspect the OP actually has the usual "match or null" logic in the query i.e. [variable = NULL or variable = column]
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 26, 2018 at 8:11 am
ChrisM@Work - Monday, November 26, 2018 8:06 AM25 columns is likely to be most or all of the columns (guessing of course).
Next guess - a small number of rows would be best returned with a seek of a narrow index and key lookup.
Some idea of the data distribution might help, however I suspect the OP actually has the usual "match or null" logic in the query i.e. [variable = NULL or variable = column]
Which does bring up the question, are those variables we're seeing or parameters? That too could affect index choice based on the differences in how the row estimates are arrived at. There just are no easy answers without all the details.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 26, 2018 at 8:27 am
Grant Fritchey - Monday, November 26, 2018 8:11 AMChrisM@Work - Monday, November 26, 2018 8:06 AM25 columns is likely to be most or all of the columns (guessing of course).
Next guess - a small number of rows would be best returned with a seek of a narrow index and key lookup.
Some idea of the data distribution might help, however I suspect the OP actually has the usual "match or null" logic in the query i.e. [variable = NULL or variable = column]Which does bring up the question, are those variables we're seeing or parameters? That too could affect index choice based on the differences in how the row estimates are arrived at. There just are no easy answers without all the details.
Can I run this again in case it was missed?
"There just are no easy answers without all the details."
Guessing can be amusing, however a faster (and probably better) solution will be arrived at with all of the necessary details up front.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 26, 2018 at 8:47 am
Grant Fritchey - Monday, November 26, 2018 8:11 AMChrisM@Work - Monday, November 26, 2018 8:06 AM25 columns is likely to be most or all of the columns (guessing of course).
Next guess - a small number of rows would be best returned with a seek of a narrow index and key lookup.
Some idea of the data distribution might help, however I suspect the OP actually has the usual "match or null" logic in the query i.e. [variable = NULL or variable = column]Which does bring up the question, are those variables we're seeing or parameters? That too could affect index choice based on the differences in how the row estimates are arrived at. There just are no easy answers without all the details.
The OP did say the query returns just 9 rows. So they should be able to get it to use an index.
November 26, 2018 at 8:48 am
Jonathan AC Roberts - Monday, November 26, 2018 8:47 AMGrant Fritchey - Monday, November 26, 2018 8:11 AMChrisM@Work - Monday, November 26, 2018 8:06 AM25 columns is likely to be most or all of the columns (guessing of course).
Next guess - a small number of rows would be best returned with a seek of a narrow index and key lookup.
Some idea of the data distribution might help, however I suspect the OP actually has the usual "match or null" logic in the query i.e. [variable = NULL or variable = column]Which does bring up the question, are those variables we're seeing or parameters? That too could affect index choice based on the differences in how the row estimates are arrived at. There just are no easy answers without all the details.
OP did say the query returns just 9 rows. So they should be able to get it to use an index.
No arguments there. We just don't have enough data to do more than speculate... somewhat wildly.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 26, 2018 at 12:39 pm
Answers to Qs:
Grant Fritchey: (1) I added parenthesis around Date. (2) using parameters. Attached query.
ChrisM: (1) table has 98 columns (2) selectivity of columns will change over time (3) rows returned will be between 5 to 50
Erik Erikson: (1) IMEI is mobile. (2) you'll see from DDL that some columns are quite huge and all but one, Bill_Process_Failure_Reason @ varchar(4000) is justified.
Jonathan AC Roberts: I don't think I will take UNION route because report query actually has a number of JOINS that I have commented out for sake of addressing bottlenecks one step at a time
attached exec plan / inserted DDL, and report query.
DDL
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[QR_Device_Trade_In_Master](
[Offer_ID] [varchar](500) NULL,
[Offer_Generate_Date] [datetime] NULL,
[Offer_Update_Date] [datetime] NULL,
[Manufacturer] [varchar](300) NOT NULL,
[Model] [varchar](300) NOT NULL,
[Serial_Number] [varchar](100) NOT NULL,
[Serial_Number_Type] [varchar](50) NULL,
[Application_User_ID] [varchar](500) NULL,
[Application_ID] [varchar](500) NULL,
[device_Evaluation_Date] [datetime] NULL,
[Store_ID] [varchar](100) NULL,
[Purchase_Device_Sku] [varchar](100) NULL,
[BAN] [numeric](9, 0) NULL,
[Subscriber_No] [varchar](20) NULL,
[Offer_Amount] [numeric](8, 4) NULL,
[RMA_Number] [varchar](100) NULL,
[Offer_Expiration_Date] [datetime] NULL,
[Customer_Email_Address] [varchar](100) NULL,
[Customer_Phone_Number] [varchar](30) NULL,
[Offer_Status] [varchar](100) NULL,
[Rejection_Reason] [varchar](300) NULL,
[RMA_Status] [varchar](100) NULL,
[RMA_Expiration_Date] [datetime] NULL,
[Settlement_Amount] [numeric](8, 4) NULL,
[Network_Operator] [varchar](100) NOT NULL,
[Shipping_Label_URL] [varchar](2000) NULL,
[Eligibility_Status] [varchar](100) NULL,
[Eligibility_Reason] [varchar](1000) NULL,
[Is_Customer_Billed] [varchar](1) NULL,
[Adjustment_Reason] [varchar](2000) NULL,
[Bill_Status] [varchar](1) NULL,
[Bill_Process_Failure_Reason] [varchar](4000) NULL,
[Eligibility_Reason_Code] [varchar](100) NULL,
[Sender_ID] [varchar](500) NULL,
[Channel_ID] [varchar](500) NULL,
[Session_ID] [varchar](500) NULL,
[Work_Flow_ID] [varchar](500) NULL,
[Activity_ID] [varchar](500) NULL,
[Dealer_Code] [varchar](500) NULL,
[Credit_Issue_Date] [datetime] NULL,
[Create_Date] [datetime] NOT NULL,
[Created_By] [varchar](500) NULL,
[Update_Date] [datetime] NULL,
[Updated_By] [varchar](500) NULL,
[Image_Url] [varchar](4000) NULL,
[Sales_Document_Number] [varchar](100) NULL,
[Purchase_Order_Number] [varchar](100) NULL,
[Sap_Channel] [varchar](50) NULL,
[Sap_Return_Type] [varchar](10) NULL,
[Sap_Return_Message] [varchar](1000) NULL,
[Transaction_Type] [varchar](100) NULL,
[ProcessLogID] [int] NOT NULL,
[ImportFileDate] [date] NOT NULL,
[Event_Type] [varchar](100) NULL,
[Retail_Store_Type] [varchar](50) NULL,
[Assurant_Model] [varchar](300) NULL,
[Assurant_Manufacturer] [varchar](300) NULL,
[Cancel_Reason] [varchar](500) NULL,
[Program_ID] [varchar](10) NOT NULL,
[Fulfillment_Ord_Number_Forward] [varchar](30) NULL,
[EIP_Balance] [numeric](8, 4) NULL,
[EIP_Plan_ID] [varchar](50) NULL,
[EIP_Device_ID] [varchar](50) NULL,
[Claim_Auth_Code] [varchar](50) NULL,
[Claim_Req] [varchar](2) NULL,
[Claim_Auth_Date] [date] NULL,
[Claim_Auth_Available] [varchar](2) NULL,
[Deductible_Amount] [numeric](8, 4) NULL,
[Device_Collection_Required] [varchar](2) NULL,
[Claimed_Device_Collected] [varchar](1) NULL,
[Jump_TradeIn_Value] [numeric](8, 4) NULL,
[Claim_Submitted_By] [varchar](100) NULL,
[TOTAL_DAMAGE_AMOUNT] [numeric](10, 4) NULL,
[INSURANCE_DEDUCTABLE_AMOUNT] [numeric](10, 4) NULL,
[DEVICE_SHIPMENT_DATE] [date] NULL,
[DEVICE_RETURN_EXPIRATION_DATE] [date] NULL,
[DEVICE_RETURN_DATE] [date] NULL,
[DEVICE_RETURN_SESSION_ID] [varchar](100) NULL,
[ORIGINAL_RMA_NUMBER] [varchar](100) NULL,
[UPDATED_RMA_EXPIRATION_DATE] [date] NULL,
[Adjustment_Id] [varchar](50) NULL,
[Tracking_Carrier] [varchar](100) NULL,
[Tracking_Number] [varchar](50) NULL,
[Return_Tracking_Number] [varchar](100) NULL,
[Trade_In_Sub_Status] [varchar](50) NULL,
[Trade_In_Status] [varchar](50) NULL,
[Return_Tracking_Carrier] [varchar](100) NULL,
[Bill_Credit_Balance] [numeric](8, 4) NULL,
[Device_Received_Date] [date] NULL,
[Trade_In_Credit_Date] [date] NULL,
[Credited_Down_Payment] [numeric](8, 4) NULL,
[Returned_Date] [date] NULL,
[TRADEIN_INTENT] [varchar](50) NULL,
[ORDER_LINE_ID] [varchar](50) NULL,
[BulkID] [varchar](50) NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING
OFF
GO
Report Query
In report query you'll notice I am pulling from table with '_2016onward' which is a copy of the table, containing records from year 2016 onward, so as not to interfere with production table while testing new indexes in production.
Have commented out the JOINS to focus on primary table
DECLARE
@BULKID varchar(50) = 5006,
@BANvarchar(50) = null,-- '955932237',
@FROM_DATEdate = NULL,
@TO_DATEdate = NULL,
@IMEI varchar(100) = NULL
SELECT QR.[Offer_ID]
,QR.[Offer_Generate_Date]
,QR.[Offer_Update_Date]
,QR.[Manufacturer]
,QR.[Model]
,QR.[Serial_Number]
,QR.[Application_User_ID]
,QR.[Application_ID]
,QR.[device_Evaluation_Date]
,QR.[Store_ID]
,QR.[BAN]
,QR.[Subscriber_No]
,QR.[Offer_Amount]
,QR.[RMA_Number]
,QR.[Offer_Status]
,QR.[Rejection_Reason]
,QR.[RMA_Status]
,QR.[Settlement_Amount]
,QR.[Eligibility_Status]
,QR.[Eligibility_Reason]
,QR.[Adjustment_Reason]
,QR.[Bill_Status]
,QR.[Channel_ID]
,QR.[Credit_Issue_Date]
,QR.[Program_ID]
,QR.[BulkID]
--,QP.MARKET_ASSESSED_AMOUNT as [Fair Market Value]
--,QP.PRM_NAME AS Promo
,QR.Assurant_Manufacturer as [Assurant Manufacturer Received]
,QR.Assurant_Model as [Assurant Model Received]
--,COALESCE(Q1.Answer_Name,'N/A') AS [FMiP OFF]
--,COALESCE(Q6.Answer_Name,'N/A') AS [AntiTheft OFF]
--,COALESCE(Q2.Answer_Name,'N/A') AS [Liquid Damage]
--,COALESCE(Q3.Answer_Name,'N/A') AS [Promo Accepted]
--,COALESCE(Q4.Answer_Name,'N/A') AS [LCD Acceptable]
--,COALESCE(Q5.Answer_Name,'N/A') AS [Powers On]
FROM DistributionCenterFeed.[dbo].[QR_Device_Trade_In_Master_2016onward] QR
--LEFT JOIN ODS.[dbo].[QR_DR_TradeIn_Promo] QP ON QR.Offer_ID = QP.Offer_ID
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q1 ON QR.Offer_ID = Q1.Offer_ID AND Q1.Question_Name = 'Find My Iphone'--
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q2 ON QR.Offer_ID = Q2.Offer_ID AND Q2.Question_Name = 'Liquid Damage'
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q3 ON QR.Offer_ID = Q3.Offer_ID AND Q3.Question_Name = 'ATLANTIS_PROMO'
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q4 ON QR.Offer_ID = Q4.Offer_ID AND Q4.Question_Name = 'LCD Issue'
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q5 ON QR.Offer_ID = Q5.Offer_ID AND Q5.Question_Name = 'Powers On'
--LEFT JOIN ODS.[dbo].[QR_DR_Transaction_Ques_Answers] Q6 ON QR.Offer_ID = Q6.Offer_ID AND Q6.Question_Name = 'Anti Theft Disabled'
WHERE
BAN = @BAN
OR BulkID = @BULKID
OR Serial_Number= @IMEI
OR ([Offer_Generate_Date] >= @FROM_DATE AND [Offer_Generate_Date] <= @TO_DATE)
GO
--Quote me
Viewing 15 posts - 1 through 15 (of 54 total)
You must be logged in to reply to this topic. Login to reply