August 23, 2017 at 1:20 pm
Hi All,
Help required in rewriting the WHERE predicate below so that an index on the date column can be utilized. Currently a lot of functions are being used in the condition to get the starting day and ending day of the month.
How can I rewrite the WHERE condition below without using these functions on the indexed datetime column?
SELECT
....
....
WHERE t1.tran_dt BETWEEN DATEADD(DAY,-DAY(t2.updated_dt) + 1,t2.updated_dt) AND DATEADD(DAY,-1, DATEADD(MONTH,1,DATEADD(DAY,-DAY(t2.updated_dt)+1,t2.updated_dt)))
Sample data: this is not real data, just for the sake of understanding. However, the query has to process about 4 million records in production.
create table Testtbl
(
    date_last_updated datetime
);

insert into Testtbl
select getdate()
union all
select getdate() - 10
union all
select getdate() - 40;
select
date_last_updated
,RIGHT( '00' + LTRIM( RTRIM( STR( MONTH( DATEADD( DAY, -1, DATEADD( MONTH, 1, DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated ) ) ))))), 2) + '/'
+ RIGHT('00' + LTRIM( RTRIM( STR( DAY( DATEADD( DAY, -1, DATEADD( MONTH, 1, DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated ) ) ))))), 2) + '/'
+ LTRIM( RTRIM( STR( YEAR( DATEADD( DAY, -1, DATEADD( MONTH, 1, DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated ) ) ))))) AS "Date_Last_Updated in mm/dd/yyyy"
,DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated ) as w1_starting_dt_of_the_month
,DATEADD( DAY, -1, DATEADD( MONTH, 1, DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated ) ) ) as w2_ending_dt_of_the_month
from Testtbl
where date_last_updated between DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated)
and
DATEADD( DAY, -1, DATEADD( MONTH, 1, DATEADD( DAY, -DAY( date_last_updated ) + 1, date_last_updated) ) )
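As an aside, the month boundaries can be computed more simply with the standard DATEDIFF/DATEADD idiom, and a half-open range avoids the end-of-month arithmetic entirely. This is a sketch using that common T-SQL pattern, not code from the original query:

```sql
-- First day of the current month, and first day of the next month,
-- via the DATEDIFF-months-since-epoch idiom.
DECLARE @d datetime = GETDATE();

SELECT
    DATEADD(MONTH, DATEDIFF(MONTH, 0, @d), 0)     AS month_start,
    DATEADD(MONTH, DATEDIFF(MONTH, 0, @d) + 1, 0) AS next_month_start;

-- Applied to the posted predicate, a half-open range would look like:
-- WHERE t1.tran_dt >= DATEADD(MONTH, DATEDIFF(MONTH, 0, t2.updated_dt), 0)
--   AND t1.tran_dt <  DATEADD(MONTH, DATEDIFF(MONTH, 0, t2.updated_dt) + 1, 0)
```

Note that the functions here wrap t2.updated_dt, not t1.tran_dt, so an index on tran_dt can still be used for a range seek.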
Thanks,
Sam
August 23, 2017 at 1:47 pm
vsamantha35 - Wednesday, August 23, 2017 1:20 PM
Sam, can you post an actual execution plan from the query please? This will provide a wealth of information to folks who wish to participate. Thanks.
For better assistance in answering your questions, please read this.
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 23, 2017 at 1:55 pm
Attaching the actual execution plan, captured with SQL Sentry Plan Explorer.
August 23, 2017 at 2:31 pm
vsamantha35 - Wednesday, August 23, 2017 1:55 PM
For those of us without SQL Sentry Plan Explorer, can you post the actual execution plan from SSMS? E.g. execute the query in SSMS with "Include Actual Execution Plan" turned on, then right-click on the plan, save it, and post it.
-- Itzik Ben-Gan 2001
August 23, 2017 at 3:20 pm
What are the indexes on the table DDA_TRAN? It looks like it might make more sense to have an index that starts with the 2 join columns, with maybe the date being the third column, so (INSTITUTION_NUMBER, DDA_ACCOUNT, TRAN_DATE).
Similarly on DDA_DESC, (INSTITUTION_NUMBER, DDA_ACCOUNT, DESCRIPTION_LINK).
A couple other things, the expression:
,RIGHT ('00' + LTRIM(RTRIM(STR(MONTH("DDA_TRAN"."POSTED_DATE")))), 2) + '/'
+ RIGHT('00' + LTRIM(RTRIM(STR(DAY("DDA_TRAN"."POSTED_DATE")))), 2) + '/'
+ LTRIM(RTRIM(STR(YEAR("DDA_TRAN"."POSTED_DATE"))))
can probably be replaced by the much simpler:
,CONVERT(varchar(20), "DDA_TRAN"."POSTED_DATE", 101)
And much of the processing seems to be to support the very complex GROUP BY clause, can that be simplified?
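To illustrate the simplification above: style 101 of CONVERT formats a datetime as mm/dd/yyyy in a single call, so the RIGHT/LTRIM/RTRIM/STR concatenation becomes unnecessary. A minimal sketch:

```sql
-- CONVERT style 101 produces mm/dd/yyyy directly.
SELECT CONVERT(varchar(10), CAST('2017-08-23' AS datetime), 101);
-- Returns the string 08/23/2017.
```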
August 23, 2017 at 3:35 pm
Pretty much this, DATEADD, on the columns in your predicate is going to lead to table scans which will absolutely lead to poor performance overall. You'd be better off with a calculated column if you need to manipulate those dates. You can index that for use with this query and not need to do the function on the columns. As long as the function is there, performance is pretty likely to stink.
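A sketch of the computed-column approach described above; the column and index names here are made up for illustration, and the table would be whichever one supplies the date being manipulated:

```sql
-- Persisted computed columns holding the month boundaries, so the
-- WHERE clause can compare precomputed, indexable values instead of
-- applying DATEADD/DAY to the base column at query time.
ALTER TABLE DDA_ACCT ADD
    month_start AS DATEADD(DAY, -DAY(DATE_LAST_UPDATED) + 1, DATE_LAST_UPDATED) PERSISTED,
    month_end   AS DATEADD(DAY, -1, DATEADD(MONTH, 1,
                     DATEADD(DAY, -DAY(DATE_LAST_UPDATED) + 1, DATE_LAST_UPDATED))) PERSISTED;

-- An index on the computed columns makes them seekable.
CREATE INDEX IX_DDA_ACCT_month_range ON DDA_ACCT (month_start, month_end);
```

DATEADD and DAY are deterministic, which is what allows these computed columns to be persisted and indexed.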
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 23, 2017 at 5:36 pm
Your WHERE clause isn't doing anything at all. It's literally checking to see if the date_last_updated is in the month that it's in.
Any idea what the calculated range is supposed to be?
Never mind... I was looking at the last WHERE clause, not the 1st one...
August 23, 2017 at 9:23 pm
Grant Fritchey - Wednesday, August 23, 2017 3:35 PM
Hi Grant, can you please provide more suggestions from the plan? This is one of those queries generated by the 3rd-party tool Cognos, and we are looking for opportunities to optimize it.
August 23, 2017 at 9:24 pm
Chris Harshman - Wednesday, August 23, 2017 3:20 PM
Thanks Chris for the help.
August 24, 2017 at 5:00 am
You could do with clustered indexes on those heap tables, something like
DDA_DESC (INSTITUTION_NUMBER, DDA_ACCOUNT, DESCRIPTION_LINK)
DDA_TRAN (INSTITUTION_NUMBER, DDA_ACCOUNT)
Simplification of the query:
SELECT
"DDA_TRAN"."INSTITUTION_NUMBER" AS "Institution_Number"
,"DDA_TRAN"."DDA_ACCOUNT" AS "DDA_Account"
,"DDA_TRAN"."CHECK_NUMBER" AS "Check_Number"
,x.DebitOrCredit AS "Debit_or_Credit"
,"DDA_TRAN"."DESCRIPTION_CODE" AS "Description_Code"
,"DDA_TRAN"."DESCRIPTION_LINK" AS "Description_Link"
,"DDA_DESC"."DESCRIPTION1" AS "Description1"
,"DDA_DESC"."DESCRIPTION2" AS "Description2"
,"DDA_TRAN"."DOCUMENT_IMAGE_FLAG" AS "Document_Image_Flag"
,"DDA_TRAN"."EXTERNAL_TRAN_CODE" AS "External_Transaction_Code"
,"DDA_TRAN"."INTERNAL_TRAN_CODE" AS "Internal_Transaction_Code"
,CONVERT(VARCHAR(10),DDA_TRAN.POSTED_DATE,101) AS "Posted_Date"
,"DDA_TRAN"."REVERSAL_CODE" AS "Reversal_Code"
,SUM("DDA_TRAN"."TRAN_AMOUNT") AS "Transaction_Amount"
,SUM(CASE WHEN x.DebitOrCredit = 'Credit' THEN "DDA_TRAN"."TRAN_AMOUNT" ELSE 0 END ) AS "Credit_Tran_Amount"
,SUM(CASE WHEN x.DebitOrCredit = 'Debit' THEN "DDA_TRAN"."TRAN_AMOUNT" ELSE 0 END ) AS "Debit_Tran_Amount"
,CONVERT(VARCHAR(10),DDA_TRAN.TRAN_DATE,101) AS "Transaction_Date"
,"DDA_TRAN"."TRAN_USED_CODE" AS "Transaction_Used_Code"
,"DDA_DESC"."ACH_COMPANY_ID" AS "ACH_Company_Id"
,"DDA_DESC"."ACH_STANDARD_ENTRY_CLASS" AS "ACH_Standard_Entry_Class"
,"DDA_DESC"."ITEM_ADDRESS" AS "Item_Address"
,CONVERT(VARCHAR(10),EOMONTH(DATE_LAST_UPDATED),101) AS "Date_Last_Updated"
FROM ("DDA_ACCT" "DDA_ACCT"
LEFT OUTER JOIN "DDA_TRAN" "DDA_TRAN" -- left outer join is converted into an inner join by the WHERE clause
ON "DDA_ACCT"."INSTITUTION_NUMBER" = "DDA_TRAN"."INSTITUTION_NUMBER"
AND "DDA_ACCT"."DDA_ACCOUNT" = "DDA_TRAN"."DDA_ACCOUNT"
)
LEFT OUTER JOIN "DDA_DESC" "DDA_DESC"
ON "DDA_TRAN"."INSTITUTION_NUMBER" = "DDA_DESC"."INSTITUTION_NUMBER"
AND "DDA_TRAN"."DDA_ACCOUNT" = "DDA_DESC"."DDA_ACCOUNT"
AND "DDA_TRAN"."DESCRIPTION_LINK" = "DDA_DESC"."DESCRIPTION_LINK"
CROSS APPLY (
SELECT DebitOrCredit = CASE
WHEN "DDA_TRAN"."INTERNAL_TRAN_CODE" IN (6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 60, 62, 72, 76, 78, 80, 87, 25, 26, 27, 31, 32, 75)
THEN 'Credit' ELSE 'Debit' END
) x
WHERE "DDA_TRAN"."TRAN_DATE" BETWEEN
DATEADD( DAY, -DAY( "DDA_ACCT"."DATE_LAST_UPDATED" ) + 1, "DDA_ACCT"."DATE_LAST_UPDATED" )
AND
EOMONTH(DDA_ACCT.DATE_LAST_UPDATED)
GROUP BY "DDA_TRAN"."INSTITUTION_NUMBER"
, "DDA_TRAN"."DDA_ACCOUNT"
, "DDA_TRAN"."CHECK_NUMBER"
, x.DebitOrCredit
, "DDA_TRAN"."DESCRIPTION_CODE"
, "DDA_TRAN"."DESCRIPTION_LINK"
, "DDA_DESC"."DESCRIPTION1"
, "DDA_DESC"."DESCRIPTION2"
, "DDA_TRAN"."DOCUMENT_IMAGE_FLAG"
, "DDA_TRAN"."EXTERNAL_TRAN_CODE"
, "DDA_TRAN"."INTERNAL_TRAN_CODE"
, CONVERT(VARCHAR(10),DDA_TRAN.POSTED_DATE,101)
, "DDA_TRAN"."REVERSAL_CODE"
, CONVERT(VARCHAR(10),DDA_TRAN.TRAN_DATE,101)
, "DDA_TRAN"."TRAN_USED_CODE"
, "DDA_DESC"."ACH_COMPANY_ID"
, "DDA_DESC"."ACH_STANDARD_ENTRY_CLASS"
, "DDA_DESC"."ITEM_ADDRESS"
,CONVERT(VARCHAR(10),EOMONTH(DATE_LAST_UPDATED),101)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 24, 2017 at 6:07 am
vsamantha35 - Wednesday, August 23, 2017 9:23 PM
The issue is that calculations of any kind performed on columns mean that, in order to satisfy the query, the calculation must be performed on every single row in the table to determine whether that row matches the calculated value. So if you have 10 rows, you have to run the calculation 10 times. If you have 10 million rows, you have to run 10 million calculations. That's going to be very slow, and there is absolutely no changing it unless you change how things are being done.
First option, look at what's going on in Cognos. Why is it making that calculation? Could you do something else in Cognos so that the calculation doesn't occur. I'm not a Cognos expert, so you'll have to go elsewhere for definitive answers there.
Second option, instead of having Cognos run calculations against the date time column to get the two values of last_update_date and one month from last_update_date (is that right? whatever it is), create two additional columns on your table that perform these functions on the data, computed columns. Because you're storing the calculations within the table and you can create indexes on computed columns, your query can take advantage of an index which should make it run faster.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 24, 2017 at 7:39 am
Grant Fritchey - Thursday, August 24, 2017 6:07 AM
Another option could be to create some kind of calendar table that has 1 row for each day and some kind of YearMonth integer column. You could then join to this calendar table and compare the 2 YearMonths to see if they match.
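A minimal sketch of that calendar-table idea; the table and column names here are hypothetical:

```sql
-- One row per day; YearMonth lets two dates be matched to the same
-- calendar month with plain equality instead of DATEADD arithmetic.
CREATE TABLE dbo.Calendar
(
    CalDate   date NOT NULL PRIMARY KEY,
    YearMonth int  NOT NULL   -- e.g. 201708 for August 2017
);

-- Join each date column to the calendar and compare months:
-- FROM DDA_TRAN t
-- JOIN dbo.Calendar ct ON ct.CalDate = CAST(t.TRAN_DATE AS date)
-- ...
-- JOIN dbo.Calendar cu ON cu.CalDate = CAST(a.DATE_LAST_UPDATED AS date)
-- WHERE ct.YearMonth = cu.YearMonth
```

The equality predicate on YearMonth is straightforward for the optimizer, and the calendar table is small enough that the extra joins are cheap.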
August 24, 2017 at 9:05 am
Chris Harshman - Thursday, August 24, 2017 7:39 AM
Absolutely. Great idea. I'm aware of this approach, but I keep forgetting about it.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 24, 2017 at 11:58 am
Hi Chris,
I tried to create those covering indexes, but I didn't see much improvement in query performance; it was a 5-10 second difference. Sometimes the query without the indexes performed well, so I decided not to create those 2 indexes. Thanks for the query rewrite. I'll try to see if it works well and returns the same results in terms of row count and data. Thank you for taking your valuable time to help people like me who want to learn some tips on query tuning and the approach towards it. Thanks a lot.
August 24, 2017 at 12:04 pm
Grant Fritchey - Thursday, August 24, 2017 6:07 AM
We don't have any idea who owns the Cognos piece. As a vendor we do managed services, and they give us a bunch of queries to tune on a daily or weekly basis. We don't have proper documentation for these reports, nor do the full-time employees have full info on them. The project has been handed over from one client to another, and we work as the vendor team for the new client.