Need help on query performance

Question

Need help on query performance

Viewing 15 posts - 31 through 45 (of 77 total)

You must be logged in to reply to this topic. Login to reply

jc85 Right there with Babe Points: 790 More actions · Answer 1

ChrisM@Work (5/25/2016)
Only one clustered index per table, but the index can contain more than one column.
"Different result" is too vague, please explain.

Thanks, added.

Incorrect result is returned, it is not the one I wanted to get.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
Only one clustered index per table, but the index can contain more than one column.
"Different result" is too vague, please explain.
Thanks, added.
Incorrect result is returned, it is not the one I wanted to get.

Can't you describe how it's different? Row count? Values?

How long does it take to run with the modified clustered index?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Sergiy SSC Guru Points: 110209 More actions · Answer 3

jc85 (5/25/2016)
Thanks ChrisM@Work.
The query took quite some time to complete.
Additional info provided in case you need it.
ann_events_Tech_Details
[start_time] - clustered index
[id] - Unique, non-clustered index (PK)
ann_ReportItem
[source] - non-clustered index
[call_flow_name] - non-clustered index
[report_id] - non-clustered index
[id] - Unique, non-clustered index (PK)

ann_ReportItem

[source], [call_flow_name], [report_id] - clustered index

[id] - Unique, non-clustered index (PK)

Optimal order of columns in the clustered index may be different, depending on selectivity.

The most selective column (smallest number of rows with identical values) must go first.

_____________
Code for TallyGenerator

jc85 Right there with Babe Points: 790 More actions · Answer 4

ChrisM@Work (5/25/2016)
jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
Only one clustered index per table, but the index can contain more than one column.
"Different result" is too vague, please explain.
Thanks, added.
Incorrect result is returned, it is not the one I wanted to get.
Can't you describe how it's different? Row count? Values?
How long does it take to run with the modified clustered index?

The row count reduced a lot and [report_item] values are different.

It is much faster though.

New query output

2016-01-01----PV-a.report_item--------1----1------0

2016-01-01----CYGAL-a.report_item----502--2529--440

2016-01-01----PFCC-a.report_item-----6----13-----1

Old query output

2016-01-01 00:00:00.000----CYGAL-CKC End Call--------0-----2------0

2016-01-01 00:00:00.000----CYGAL-Noisy--------------18-----83-----30

2016-01-01 00:00:00.000----CYGAL-CKC Transfer CSR---17----124----17

2016-01-01 00:00:00.000----CYGAL-Tel - Speak to CSR--348---1792---230

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

Sergiy (5/25/2016)
jc85 (5/25/2016)
Thanks ChrisM@Work.
The query took quite some time to complete.
Additional info provided in case you need it.
ann_events_Tech_Details
[start_time] - clustered index
[id] - Unique, non-clustered index (PK)
ann_ReportItem
[source] - non-clustered index
[call_flow_name] - non-clustered index
[report_id] - non-clustered index
[id] - Unique, non-clustered index (PK)
ann_ReportItem
[source], [call_flow_name], [report_id] - clustered index
[id] - Unique, non-clustered index (PK)
Optimal order of columns in the clustered index may be different, depending on selectivity.
The most selective column (smallest number of rows with identical values) must go first.

Not necessarily. Filtering on [Source] followed by a date range (and having the data conveniently sorted in this order too, by the clustered index) falls within scope of this article.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 6

jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
Only one clustered index per table, but the index can contain more than one column.
"Different result" is too vague, please explain.
Thanks, added.
Incorrect result is returned, it is not the one I wanted to get.
Can't you describe how it's different? Row count? Values?
How long does it take to run with the modified clustered index?
The row count reduced a lot and [report_item] values are different.
It is much faster though.
New query output
2016-01-01----PV-a.report_item--------1----1------0
2016-01-01----CYGAL-a.report_item----502--2529--440
2016-01-01----PFCC-a.report_item-----6----13-----1
Old query output
2016-01-01 00:00:00.000----CYGAL-CKC End Call--------0-----2------0
2016-01-01 00:00:00.000----CYGAL-Noisy--------------18-----83-----30
2016-01-01 00:00:00.000----CYGAL-CKC Transfer CSR---17----124----17
2016-01-01 00:00:00.000----CYGAL-Tel - Speak to CSR--348---1792---230

Sorry my bad, missed the correlation:

SELECT

[date] = CAST(b.start_time AS DATE),

[report_item] = (b.[source] + '-' + 'a.report_item'),

[EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),

[BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),

[MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)

FROM ann_events_Tech_Details b

WHERE CAST(b.start_time AS DATE) >= @StartDate

AND CAST(b.start_time AS DATE) <= @EndDate

AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,

b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)

AND EXISTS (

SELECT 1 FROM ann_ReportItem a

WHERE a.[Source] = b.[Source] -- correlation

AND a.report_id = 8

AND a.call_flow_name = @call_flow_name)

GROUP BY

CAST(b.start_time AS DATE),

b.[source]

ORDER BY CAST(b.start_time AS DATE)

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

jc85 Right there with Babe Points: 790 More actions · Answer 7

ChrisM@Work (5/25/2016)
Sorry my bad, missed the correlation:
SELECT
[date] = CAST(b.start_time AS DATE),
[report_item] = (b.[source] + '-' + 'a.report_item'),
[EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),
[BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),
[MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)
FROM ann_events_Tech_Details b
WHERE CAST(b.start_time AS DATE) >= @StartDate
AND CAST(b.start_time AS DATE) <= @EndDate
AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,
b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)
AND EXISTS (
SELECT 1 FROM ann_ReportItem a
WHERE a.[Source] = b.[Source] -- correlation
AND a.report_id = 8
AND a.call_flow_name = @call_flow_name)
GROUP BY
CAST(b.start_time AS DATE),
b.[source]
ORDER BY CAST(b.start_time AS DATE)

It returned the same output as previous query.

2016-01-01----PV-a.report_item--------1----1------0

2016-01-01 ----CYGAL-a.report_item----502--2529-- 440

2016-01-01 ----PFCC-a.report_item -----6----13-----1

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
Sorry my bad, missed the correlation:
SELECT
[date] = CAST(b.start_time AS DATE),
[report_item] = (b.[source] + '-' + 'a.report_item'),
[EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),
[BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),
[MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)
FROM ann_events_Tech_Details b
WHERE CAST(b.start_time AS DATE) >= @StartDate
AND CAST(b.start_time AS DATE) <= @EndDate
AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,
b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)
AND EXISTS (
SELECT 1 FROM ann_ReportItem a
WHERE a.[Source] = b.[Source] -- correlation
AND a.report_id = 8
AND a.call_flow_name = @call_flow_name)
GROUP BY
CAST(b.start_time AS DATE),
b.[source]
ORDER BY CAST(b.start_time AS DATE)
It returned the same output as previous query.
2016-01-01----PV-a.report_item--------1----1------0
2016-01-01 ----CYGAL-a.report_item----502--2529-- 440
2016-01-01 ----PFCC-a.report_item -----6----13-----1

Can you post a plan for this query please?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

jc85 Right there with Babe Points: 790 More actions · Answer 9

ChrisM@Work (5/25/2016)
jc85 (5/25/2016)
ChrisM@Work (5/25/2016)
Sorry my bad, missed the correlation:
SELECT
[date] = CAST(b.start_time AS DATE),
[report_item] = (b.[source] + '-' + 'a.report_item'),
[EN] = SUM(CASE WHEN b.lang = 'EN' THEN 1 ELSE 0 END),
[BM] = SUM(CASE WHEN b.lang = 'BM' THEN 1 ELSE 0 END),
[MD] = SUM(CASE WHEN b.lang = 'MD' THEN 1 ELSE 0 END)
FROM ann_events_Tech_Details b
WHERE CAST(b.start_time AS DATE) >= @StartDate
AND CAST(b.start_time AS DATE) <= @EndDate
AND 1 IN (b.sel_tel_nodialtone, b.sel_tel_noisy, b.sel_tech_ckc_disconnect,
b.sel_tech_ckc_transfer, b.sel_tel_csr, b.sel_inv_del, b.sel_inv_unifi)
AND EXISTS (
SELECT 1 FROM ann_ReportItem a
WHERE a.[Source] = b.[Source] -- correlation
AND a.report_id = 8
AND a.call_flow_name = @call_flow_name)
GROUP BY
CAST(b.start_time AS DATE),
b.[source]
ORDER BY CAST(b.start_time AS DATE)
It returned the same output as previous query.
2016-01-01----PV-a.report_item--------1----1------0
2016-01-01 ----CYGAL-a.report_item----502--2529-- 440
2016-01-01 ----PFCC-a.report_item -----6----13-----1
Can you post a plan for this query please?

Sure, thanks for your quick response.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

jc85 (5/25/2016)
How should i post the table with data on this forum?
How many rows do you need for ann_events_Tech_Details table?

ChrisM@Work (5/25/2016)
Ideally you would do this by posting a script to create a table then a series of inserts to populate it with data. You don't need to use the same table name, a temp table called say #ann_ReportItem will be fine. Jumping the gun a bit, you might want to do the same with ann_events_Tech_Details.

The simple answer to "how many rows" is "sufficient to determine correctness of the query", so I'll leave that up to you, but you should include sufficient to give you some results in each of the columns in a representative set of output rows, possibly a subset of three days' worth.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

jc85 Right there with Babe Points: 790 More actions · Answer 11

ChrisM@Work (5/25/2016)
jc85 (5/25/2016)
How should i post the table with data on this forum?
How many rows do you need for ann_events_Tech_Details table?
ChrisM@Work (5/25/2016)
Ideally you would do this by posting a script to create a table then a series of inserts to populate it with data. You don't need to use the same table name, a temp table called say #ann_ReportItem will be fine. Jumping the gun a bit, you might want to do the same with ann_events_Tech_Details.
The simple answer to "how many rows" is "sufficient to determine correctness of the query", so I'll leave that up to you, but you should include sufficient to give you some results in each of the columns in a representative set of output rows, possibly a subset of three days' worth.

Provided the data in attached excel sheet, thanks!

Sergiy SSC Guru Points: 110209 More actions · Answer 12

How does this query perform?

SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

FROM TechDetails b

WHERE b.start_time >= @StartDate

AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

_____________
Code for TallyGenerator

Sergiy SSC Guru Points: 110209 More actions · Answer 13

If that one goes not so bad then incorporate it into the bigger one:

SELECT DT.date, (a.source + '-' + a.report_item) as report_item,

SUM(CASE WHEN b.lang = 'EN' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) EN,

SUM(CASE WHEN b.lang = 'BM' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) BM

FROM ReportItem a

INNER JOIN (

SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2

FROM TechDetails b

WHERE b.start_time >= @StartDate

AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)

AND b.lang IN ('EN', 'BM')

GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang

) DT ON a.source = b.source

WHERE a.report_id =8 AND a.flow_name = @flow_name

AND a.report_item IN ('Test', 'Test2')

And from looking at the overall design I can conclude the clustered index on ReportItem must be

(report_id, flow_name, report_item) INCLUDE (source)

_____________
Code for TallyGenerator

jc85 Right there with Babe Points: 790 More actions · Answer 14

Sergiy (5/25/2016)
If that one goes not so bad then incorporate it into the bigger one:
SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2
FROM TechDetails b
WHERE b.start_time >= @StartDate
AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)
GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang
SELECT DT.date, (a.source + '-' + a.report_item) as report_item,
SUM(CASE WHEN b.lang = 'EN' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) EN,
SUM(CASE WHEN b.lang = 'BM' THEN CASE a.report_item WHEN 'Test' THEN Test WHEN 'Test2' THEN Test2 ELSE 0 END ELSE 0 END) BM
FROM ReportItem a
INNER JOIN (
SELECT dateadd(dd,0, datediff(dd,0,b.start_time)) as [date], b.source, b.lang, SUM(b.sel_test) Test, SUM(b.sel_test2) Test2
FROM TechDetails b
WHERE b.start_time >= @StartDate
AND b.start_time < dateadd(dd,0, datediff(dd,0,@EndDate)+1)
AND b.lang IN ('EN', 'BM')
GROUP BY dateadd (dd, 0, datediff(dd, 0, b.start_time)), b.source, b.lang
) DT ON a.source = b.source
WHERE a.report_id =8 AND a.flow_name = @flow_name
AND a.report_item IN ('Test', 'Test2')
And from looking at the overall design I can conclude the clustered index on ReportItem must be
(report_id, flow_name, report_item) INCLUDE (source)

The first one performed well but the second one is getting syntax error "The multi-part identifier b.lang and b.source could not be bound).

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 15

Sergiy (5/25/2016)
And from looking at the overall design I can conclude the clustered index on ReportItem must be
(report_id, flow_name, report_item) INCLUDE (source)

Do you mean (report_id, flow_name, report_item, source)? You can't have an included column in a clustered index.

John