SQL Query execution slowness

Question

Post reply

SQL Query execution slowness

yuvipoy

SSChampion

Points: 10469

June 22, 2020 at 4:27 am

#3764619

Hi ,

Below is the query which has two tables, based on the business requirement , have written the query as below. but it is taking 30 mins to complete the query. There is no Insert or Update or Delete happen when the query is executed. This query is executed from SSIS package.Is there a better way to write the query ? or is the below query is fine.

When in the final query when C.columnid is removed then i see it run in 3 mins, but where as it is needed based on business requirement.

Below are the number if records in each table, TableA has 200000 and TableB has 200000000.

Select Count(*) from [TableA] --> 200000
Select Count(*) from [TableB]--> 200000000

;WITH CTETableA AS
(
SELECT DISTINCT
 [ColumnID]
FROM TableA
WHERE [ColumnID] IS NOT NULL
AND [Date] >= <Day before yesterday date>
AND [Date] < <Yesterday date>
)
,CTETableB AS
(
SELECT DISTINCT 
 [ColumnID] --- Int
,[Column_S] -- Varchar(20)
FROM TableB
WHERE  [Date] >= <Day before yesterday date>
AND [Date] < <Yesterday date>
)

SELECT
     b.[ColumnID]  --Int
,s.[Column_var]-- Varchar(25)
,s.[Column_int]-- Int
,s.[Date]    --- Datetime
,b.[Column_Var]--  Varchar(20)
,b.[ColumnTime] --- Time
,b.[Column_S]   -- Varchar(20)
,b.[Column_float] -- Float
FROM TableB b
LEFT JOIN CTETableB c ON c.[ColumnID] = b.[ColumnID]
LEFT JOIN TableA d ON d.[ColumnID] = b.[ColumnID] AND d.[Column_S] = b.[Column_S]
LEFT JOIN CTETableA s ON s.[ColumnID] = b.[ColumnID]
WHERE ((b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>) OR c.[ColumnID] IS NOT NULL OR d.[ColumnID] IS NOT NULL)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

frederico_fonseca SSCoach Points: 16157 More actions · Answer 1

Giving sample sql like this may lead to errors - and for this you may need to supply us with a full explain plan (both slow and fast).

we will also need to know which indexes these tables have

but I may be missing something but it seems that, at least the way you posted, CTETableB is not required at all and should be removed.

regardless - this seems like a typical case of "if master has changed or child has changed" extract data - that being the case doing a union of 2 queries would normally give better results - first query detects changes to master, second detects changes to child.

yuvipoy SSChampion Points: 10469 More actions · Answer 2

CTETableB is needed as we are having the not null condition [OR c.[ColumnID] IS NOT NULL OR d.[ColumnID] IS NOT NULL] in the query.

TableA [ColumnID] is clustered index and Date with non-cluster index, non- unique.

TableB [ColumnID] is clustered index and Date with non-cluster index , non- unique.

Execution plan- Don't have permission to execute.

How could Union would help us here , in which way we could represent it to achieve?. Again thanks for your time and help.

frederico_fonseca SSCoach Points: 16157 More actions · Answer 3

according to the code you posted it does not seem you need that cte - and without your full code we don't know if you missed something on your post or if it accurately represents the full query joins and criteria.

,CTETableB AS
(
SELECT DISTINCT 
 [ColumnID] --- Int
,[Column_S] -- Varchar(20)
FROM TableB
WHERE  [Date] >= <Day before yesterday date>
AND [Date] < <Yesterday date>
)

select *
from TableB b
LEFT JOIN CTETableB c ON c.[ColumnID] = b.[ColumnID]
where (b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>) OR c.[ColumnID] IS NOT NULL


in both CTETableB and the main select you are filtering on the same field and criteria so having the cte with a left outer really equates to

select *
from TableB b
where  (b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>)

how ever having the cte as is may even be giving you duplicate records if there are multilple records on TableB with same ColumnID so I would check that

Regarding the union

-- first select records where TableB date field meets the criteria
SELECT
     b.[ColumnID]  --Int
,s.[Column_var]-- Varchar(25)
,s.[Column_int]-- Int
,s.[Date]    --- Datetime
,b.[Column_Var]--  Varchar(20)
,b.[ColumnTime] --- Time
,b.[Column_S]   -- Varchar(20)
,b.[Column_float] -- Float
FROM TableB b
LEFT JOIN TableA s ON s.[ColumnID] = b.[ColumnID] and (s.[Date] >= <Day before yesterday date> AND s.[Date] < <Yesterday date>) -- I wonder if this was correct in the first place
WHERE (b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>)

union -- remove duplicate entries - depending on the data this may need to be done slightly differently using rownumber 

-- second retrieve records from TableB where there is a match on TableA where date meets the criteria and Column_S has a match
SELECT
     b.[ColumnID]  --Int
,s.[Column_var]-- Varchar(25)
,s.[Column_int]-- Int
,s.[Date]    --- Datetime
,b.[Column_Var]--  Varchar(20)
,b.[ColumnTime] --- Time
,b.[Column_S]   -- Varchar(20)
,b.[Column_float] -- Float
FROM TableB b
inner JOIN TableA s ON s.[ColumnID] = b.[ColumnID] AND d.[Column_S] = b.[Column_S]

and (s.[Date] >= <Day before yesterday date> AND s.[Date] < <Yesterday date>) -- TableA filtering


An index on table A with Date, column_S and ColumnID required - potentially with include of column_var and columnn_int 

An index on table B with Date, Column_S, ColumnID required - potentially with remaining columns required to cover the select

as we don't know your data neither your true table and query the above is just a possible approach - may be incorrect but only you will be able to determine that.

yuvipoy SSChampion Points: 10469 More actions · Answer 4

Once again thanks for your time.

Yes you are correct on the observation on

"how ever having the cte as is may even be giving you duplicate records if there are multiple records on TableB with same ColumnID so I would check that"

you could see in my code that i have added c.[ColumnID] IS NOT NULL or d.[ColumnID] IS NOT NULL ,which will fetch the record for the whole day when left join is there

Here the query is based on datewith timestamp as well, say if it is day before yesterday(22-06-2020 2:00:05:056) till yesterday(23-06-202 3:34:34:978) , so in the first cte we would be getting set of records say 100 , in second cte we are getting some 1000 records for the dates mentioned, while doing the left join we would be getting the additional rows

cte1

columnid, date

1, '22-06-2020 2:00:05:056'

2, '22-06-2020 5:00:08:232'

1, '22-06-2020 1:12:05:344'

cte2

columnid, date

1, '22-06-2020 7:13:12:234'

2, '22-06-2020 4:11:21:213'

3, '22-06-2020 3:12:08:784'

so while doing this we would be ending with 3 records with left join with [ColumnID] IS NOT NULL and not with 2 records, columnid 1 will be coming 2 time and columnid 2 will be coming 1 time, so total 3 records.

;With CTETableB AS
(
SELECT DISTINCT 
 [ColumnID] --- Int
,[Column_S] -- Varchar(20)
FROM TableB
WHERE  [Date] >= <Day before yesterday date>
AND [Date] < <Yesterday date>
)

select *
from TableB b
LEFT JOIN CTETableB c ON c.[ColumnID] = b.[ColumnID]
where (b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>) OR c.[ColumnID] IS NOT NULL


in both CTETableB and the main select you are filtering on the same field and criteria so having the cte with a left outer really equates to

select *
from TableB b
where  (b.[Date] >= <Day before yesterday date> AND b.[Date] < <Yesterday date>)

how ever having the cte as is may even be giving you duplicate records if there are multilple records on TableB with same ColumnID so I would check that

yuvipoy SSChampion Points: 10469 More actions · Answer 5

Any suggestions on the above, still the issues persists. the time consumed here is more. Below is the reads and writes and time taken for the query to complete.

Run time Reads Writes

20min 3,596,149 97

45min 73,412,456 378,279

48min 198,270,103 1,558,015

22min 202,130,607 1,303,290

34min 143,565,802 1,123,014

25min 15,137,118 2,233

Sergiy SSC Guru Points: 110208 More actions · Answer 6

The query is a mess.

its difficult to understand the requirements from the code because it lacks of logic, not to mention some constructions in the query eliminate others.

My guess would be you want actually full outer join of 2sets fot the specific date.

you definitely don't need scan each table twice for that.

Just include all required columns into CTE's and then join these CTE's and only them:

Hate this framework.
15 mins reworking the code from a mobile device - and it drops everything at once
So, try following my suggestions yourself, post your code - we'll adjust it if something wrong

_____________
Code for TallyGenerator

Sergiy SSC Guru Points: 110208 More actions · Answer 7

2 more things:

- forget the word "DISTINCT". Use Group by when needed (I bet you don't need it in your query);

- if you have an index rebuilding routine, use that window of opportunity to rebuild the current wrong indexing into correct one:

ColumnID -PK, NON-clustered

[Date] - clustered

this willl bring the response time of your query down to several seconds.

_____________
Code for TallyGenerator