January 6, 2017 at 4:18 pm
SET STATISTICS TIME ON
Ok. Can you try doing this ^ before running the query and check the Messages tab for details?
Or run the following query, add text from the query, may be an alias used, in '%%'
SELECT eqs.last_elapsed_time, eqs.total_elapsed_time / eqs.execution_count
FROM sys.dm_exec_query_stats eqs
OUTER APPLY sys.dm_exec_sql_text(eqs.sql_handle) est
WHERE est.text LIKE '%%'
https://sqlroadie.com/
January 6, 2017 at 4:36 pm
Most expensive operation in my query is for sorting, which we cannot avoid I think. Can you please do the following and check if there is any improvement?
1. Firstly, do this. Looks like statistics are stale.
UPDATE STATISTICS dbo.ub_charge
2. Can you please check what the current Max DOP setting is, is it 4?
EXEC sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXEC sp_configure 'max degree of parallelism'
3. Also, please run this and post the output.
SELECT COUNT(1) FROM sys.dm_os_schedulers WHERE status = 'visible online'
4. Add the following at end of the query and run it query again.
(OPTION MAXDOP 1)
If this takes longer, we can experiment with different MAXDOP values and find the optimum.
https://sqlroadie.com/
January 6, 2017 at 4:41 pm
This is what was returned with "SET STATISTICS TIME ON" for your query.
(2570 row(s) affected)
SQL Server Execution Times:
CPU time = 3185 ms, elapsed time = 2322 ms.
Thanks for all the help by the way, I have some experience with SQL, but not with SQL admin, but I have been thrown into it. I am learning!
January 6, 2017 at 4:52 pm
Okay, I updated the statistics and then I ran the next bit of SQL and this is the output:
Configuration option 'show advanced options' changed from 1 to 1. Run the RECONFIGURE statement to install.
name minimum maximum config_value run_value
----------------------------------- ----------- ----------- ------------ -----------
max degree of parallelism 0 32767 0 0
I then ran this:
SELECT COUNT(1) FROM sys.dm_os_schedulers WHERE status = 'visible online'
and it returned: 4
Then I added this to the end my query and ran it again and this is the execution time:
OPTION (MAXDOP 1)
SQL Server Execution Times:
CPU time = 2484 ms, elapsed time = 5025 ms.
I attached the SQLplan again in case it was needed.
January 6, 2017 at 5:04 pm
Thanks for doing all of that 🙂
Let me explain. The server has 4 processors. MAX DOP is set at zero, meaning optimizer will use as many number of processors to execute the query as it wishes. It was using all 4 of them.
Previously, CPU time = 3185 ms, elapsed time = 2322 ms.
This meant query was executed in 2322 ms. 4 processors spent a cumulative 3.185 s on the query.
When you specified MAXDOP 1, only 1 processor worked on the query and it took longer for query to complete. I hope this makes sense.
CPU time = 2484 ms, elapsed time = 5025 ms
In short, you are better off with the original query and not specifying a MAXDOP, but if you are curious try OPTION (MAXDOP 2) and check execution times.
I think without adding covering indexes, this is the fastest it will get. Now that statistics are refreshed, can you please run the query again, without the MAXDOP hint and post new plan?
https://sqlroadie.com/
January 6, 2017 at 5:16 pm
Thank you for the explanation. Here is the latest execution time:
(2570 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 3593 ms, elapsed time = 2785 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
And attached is the .sqlplan
You mentioned you might know why your query had more records, are you feeling like it is returning more and it is correct and mine was not returning some?
January 6, 2017 at 5:32 pm
Thanks for that mate. I think your query returns less rows because of this condition.
AND (
SELECT SUM(csub.charge_qty)
FROM ub_charge_test csub
WHERE csub.charge_code = c.Charge_code AND
csub.ub_master_id = m.ub_master_id
) > 0
Now, let us create an index and check how that affects the query.
Index 1: [ub_master_id] INCLUDE [charge_amount], [charge_code], [charge_qty]
Can you please run query after creating this, and post plan only if time is any better? My guess is that this will make the query a little faster.
Index 2: [a_site_id] INCLUDE [charge_amount], [charge_code], [charge_date], [charge_description], [charge_qty], [CPTCode], [CPTCode_Description]
I do not really recommend this index as it is expensive to maintain, but you could try adding it and see how it goes, and drop it later may be.
https://sqlroadie.com/
January 9, 2017 at 2:37 am
The execution plan indicates that the subquery picking up the row with the max date is the most expensive operation.
If charge_date is unique within each partition of a_site_id/charge_code, then this might be more efficient:
WITH charges AS (
SELECT
c.a_site_id,
c.charge_code,
SUM(c.charge_qty) AS SumQty,
SUM(c.charge_amount) AS TotalRevenue,
COUNT(DISTINCT c.ub_master_id) AS CountByClaim,
MAX(c.charge_date) AS Max_charge_date
FROM ub_charge c
WHERE EXISTS (
SELECT 1
FROM ub_master m
WHERE c.ub_master_id = m.ub_master_id
AND m.a_site_id = 1
AND m.importversion = 1
AND m.ub_batch_sub_number = 0
AND m.date_discharged BETWEEN '12/1/2015' AND '11/30/2016'
)
GROUP BY c.a_site_id, c.charge_code
)
SELECT c.charge_code,
l.LatestDescription,
l.LatestAmount,
c.SumQty,
c.TotalRevenue,
c.CountByClaim,
l.CPTCode,
l.CPTCode_Description
FROM charges c
CROSS APPLY (
SELECT
sub.charge_description AS LatestDescription,
sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,
sub.CPTCode,
sub.CPTCode_Description
FROM ub_charge sub
WHERE sub.a_site_id = c.a_site_id
AND sub.charge_code = c.charge_code
AND sub.charge_date = c.Max_charge_date
) l
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 9, 2017 at 11:02 am
@ChrisM@Work - Your query did run faster, however it is not unique and so it brings up 9,266 rows instead of the 2,570.
@arjun Sivadasan - I added the index and it is even faster.
SQL Server Execution Times:
CPU time = 2732 ms, elapsed time = 1354 ms.
Regarding the extra rows, I was doing that code and I guess I forgot to mention that part for my criteria of filtering. I need to eliminate any charge codes that the sum(charge_amount) of those that match ub_master_id is 0 or less. So I will need to put that back in.
I added the latter index and it didn't make it faster, actually slowed it down some and it was big to create, so I will leave that one off.
I attached the latest sqlplan as well.
January 9, 2017 at 3:15 pm
Well done! I think you only need to add following condition after the group by.
having sum(charge_amount) > 0
Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?
https://sqlroadie.com/
January 10, 2017 at 1:14 am
douglas.t (1/9/2017)
@ChrisM@Work - Your query did run faster, however it is not unique and so it brings up 9,266 rows instead of the 2,570.
Multiple rows per day I guess:
WITH charges AS (
SELECT
c.a_site_id,
c.charge_code,
SUM(c.charge_qty) AS SumQty,
SUM(c.charge_amount) AS TotalRevenue,
COUNT(DISTINCT c.ub_master_id) AS CountByClaim,
MAX(c.charge_date) AS Max_charge_date
FROM ub_charge c
WHERE EXISTS (
SELECT 1
FROM ub_master m
WHERE c.ub_master_id = m.ub_master_id
AND m.a_site_id = 1
AND m.importversion = 1
AND m.ub_batch_sub_number = 0
AND m.date_discharged BETWEEN '12/1/2015' AND '11/30/2016'
)
GROUP BY c.a_site_id, c.charge_code
)
SELECT c.charge_code,
l.LatestDescription,
l.LatestAmount,
c.SumQty,
c.TotalRevenue,
c.CountByClaim,
l.CPTCode,
l.CPTCode_Description
FROM charges c
CROSS APPLY (
SELECT TOP 1
sub.charge_description AS LatestDescription,
sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,
sub.CPTCode,
sub.CPTCode_Description
FROM ub_charge sub
WHERE sub.a_site_id = c.a_site_id
AND sub.charge_code = c.charge_code
AND sub.charge_date = c.Max_charge_date
) l
WHERE c.SumQty > 0
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 10, 2017 at 1:59 am
Arjun Sivadasan (1/9/2017)
Well done! I think you only need to add following condition after the group by.having sum(charge_amount) > 0
Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?
Arjun, your query generates different results to mine and the OP's using the test data:
if 0 = 1 begin;
DELETE FROM #ub_charge;
DELETE FROM #ub_master;
INSERT INTO #ub_master
(ub_master_id, a_site_id, importversion, ub_batch_sub_number, date_discharged)
VALUES
(1, 5, 1, 0, '12/20/2016'),
(2, 3, 1, 0, '12/21/2016'),
(3, 3, 1, 1, '01/02/2017'),
(4, 3, 2, 0, '01/03/2017'),
(5, 3, 1, 0, '01/01/2017'),
(6, 3, 1, 0, '01/03/2017');
INSERT INTO #ub_charge
(ub_charge_id, ub_master_id, a_site_id, charge_code, charge_amount, charge_qty, charge_date, charge_description, cptcode, cptcode_description)
VALUES
(1, 1, 5, '1000', 50, 2, '01/3/2017', 'PRODUCT 1000 V1 (m=1)', 'code1', 'cpt description 1'),
(2, 1, 5, '2000', 100, 1, '12/17/2016', 'PRODUCT 2000 V1 (m=1)', 'code2', 'cpt description 2'),
(11, 1, 5, '5000', 90, 2, '12/17/2016', 'PRODUCT 5000 V1 (m=1)', 'code11', 'cpt description 11'),
(3, 2, 3, '3000', 500, 1, '12/20/2016', 'PRODUCT 3000 V1 (m=2)', 'code3', 'cpt description 3'),
(4, 2, 3, '2000', 100, 1, '12/19/2016', 'PRODUCT 2000 V2 (m=2)', 'code4', 'cpt description 4'),
(5, 3, 3, '1000', 30, 1, '1/2/2017', 'PRODUCT 1000 V2 (m=3)', 'code5', 'cpt description 5'),
(6, 3, 3, '2000', 210, 2, '1/2/2017', 'PRODUCT 2000 V3 (m=3)', 'code6', 'cpt description 6'),
(7, 4, 3, '3000', 1000, 2, '12/21/2016', 'PRODUCT 3000 V2 (m=4)', 'code7', 'cpt description 7'),
(8, 4, 3, '4000', 10, 2, '12/20/2016', 'PRODUCT 4000 V1 (m=4)', 'code8', 'cpt description 8'),
(9, 5, 3, '1000', 100, 4, '11/19/2016', 'PRODUCT 1000 V3 (m=5)', 'code9', 'cpt description 9'),
(10, 5, 3, '2000', 220, 2, '11/20/2016', 'PRODUCT 2000 V4 (m=5)', 'code10', 'cpt description 10'),
(12, 5, 3, '2000', 220, 1, '11/20/2016', 'PRODUCT 2000 V5 (m=5)', 'code12', 'cpt description 12'),
(13, 6, 3, '1000', 100, 1, '01/01/2017', 'PRODUCT 1000 V4 (m=6)', 'code13', 'cpt description 13'),
(14, 6, 3, '1000', 100, 1, '01/01/2017', 'PRODUCT 1000 V5 (m=6)', 'code14', 'cpt description 14');
end;
------------------------------------------------------------------------------------------------------------
DECLARE @site_id INT = 3,
@importversion INT = 1,
@batchsubnumber INT = 0,
@datefrom DATETIME = '1jan2017',
@dateto DATETIME = '1may2017';
------------------------------------------------------------------------------------------------------------
-- Arjun
WITH cteChargeDesc
AS
(
SELECT charge_code, charge_description, charge_date, charge_amount, charge_qty, cptcode, cptcode_description,
ROW_NUMBER() OVER(PARTITION BY charge_code ORDER BY charge_date DESC) rn
FROM #ub_charge
WHERE a_site_id = @site_id
)
,cteMaster AS
(
SELECT ub_master_id
FROM #ub_master
WHERE a_site_id = @site_id AND importVersion = @importversion AND ub_batch_sub_number = @batchsubnumber AND date_discharged BETWEEN @datefrom AND @dateto
)
SELECT ct.charge_code, cd.charge_description, cd.latest_amount, SUM(ct.charge_qty) [sum_qty], SUM(ct.charge_amount) [total_revenue],
COUNT(DISTINCT ct.ub_master_id) [count_by_claim], cd.CPTCode, cd.CPTCode_Description
FROM #ub_charge ct
INNER JOIN cteMaster m
ON m.ub_master_id = ct.ub_master_id
OUTER APPLY
(
SELECT icd.charge_description, icd.charge_amount / nullif(icd.charge_qty, 0) [latest_amount], icd.cptcode, icd.cptcode_description
FROM cteChargeDesc icd
WHERE icd.charge_code = ct.charge_code AND icd.rn = 1
) cd
GROUP BY ct.charge_code, cd.charge_description, cd.latest_amount, cd.CPTCode, cd.CPTCode_Description
-------------------------------------------------------------------------------------------------------------------------------
-- ChrisM
;WITH charges AS (
SELECT
c.a_site_id,
c.charge_code,
SUM(c.charge_qty) AS SumQty,
SUM(c.charge_amount) AS TotalRevenue,
COUNT(DISTINCT c.ub_master_id) AS CountByClaim,
MAX(c.charge_date) AS Max_charge_date
FROM #ub_charge c
WHERE EXISTS (
SELECT 1
FROM #ub_master m
WHERE c.ub_master_id = m.ub_master_id
AND m.a_site_id = @site_id
AND m.importversion = @importversion
AND m.ub_batch_sub_number = @batchsubnumber
AND m.date_discharged BETWEEN @datefrom AND @dateto
)
GROUP BY c.a_site_id, c.charge_code
HAVING SUM(c.charge_qty) > 0
)
SELECT c.charge_code,
l.LatestDescription,
l.LatestAmount,
c.SumQty,
c.TotalRevenue,
c.CountByClaim,
l.CPTCode,
l.CPTCode_Description
FROM charges c
CROSS APPLY (
SELECT TOP 1
sub.charge_description AS LatestDescription,
sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,
sub.CPTCode,
sub.CPTCode_Description
FROM #ub_charge sub
WHERE sub.a_site_id = c.a_site_id
AND sub.charge_code = c.charge_code
AND sub.charge_date = c.Max_charge_date
) l
--------------------------------------------------------------------------------------------------
-- Original
-- Here I am getting the charges that do not cancel out each other. In the DB I could have a 1 qty and a -1 qty, so I want to eliminate those.
;WITH ubc as (
SELECT c.ub_charge_id from #ub_master m
JOIN #ub_charge c ON c.ub_master_id = m.ub_master_id
AND (
SELECT SUM(csub.charge_qty)
FROM #ub_charge csub
WHERE csub.charge_code = c.Charge_code AND
csub.ub_master_id = m.ub_master_id
) > 0
WHERE
m.a_site_id = @site_id
AND m.importversion = @importversion
AND m.ub_batch_sub_number = @batchsubnumber
AND m.date_discharged BETWEEN @datefrom AND @dateto
),
-- Here I am using row_number() to limit to the latest Charge Amount and Description
cte as (
SELECT * FROM (
SELECT
c.ub_master_id,
c.charge_code,
-- I am dividing by the charge_qty so that I get the original price per unit
case when c.charge_qty <> 0 then (c.charge_amount / c.charge_qty) else c.charge_amount END as chargeAmount,
c.charge_description,
c.CPTCode,
c.CPTCode_Description,
row_number() over(partition by c.charge_code order by c.charge_date desc) as rownum
from #ub_charge c
WHERE
-- Making sure I am getting the latest positive price and not credits
c.charge_amount >= 0
AND c.ub_charge_id IN (select ub_charge_id from ubc)
) t
where t.rownum = 1
)
--Now fetching the columns and querying the stats that I wanted.
select
cte.charge_code,
cte.charge_description as LatestDescription,
cte.chargeamount as LatestAmount,
(select sum(charge_qty) from #ub_charge c
WHERE c.charge_code = cte.charge_code
AND c.ub_charge_id IN (select ub_charge_id from ubc)
) SumQty,
(select sum(charge_amount) from #ub_charge c
WHERE c.charge_code = cte.charge_code
AND c.ub_charge_id IN (select ub_charge_id from ubc)
) TotalRevenue
,
(
select count( t.charge_code ) from (
select distinct ub_master_id, charge_code from #ub_charge
WHERE ub_charge_id IN (select ub_charge_id from ubc)
AND charge_code = cte.charge_code
) t
) CountByClaim
from cte
In table ub_charge I've put the ub_master_id into the charge_description column so you can see which ub_master_id was in the row, in the output.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 10, 2017 at 2:29 pm
Arjun Sivadasan (1/9/2017)
Well done! I think you only need to add following condition after the group by.having sum(charge_amount) > 0
Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?
I added that line and it gave me this result count: 2500 (40 less than I need)
ChrisM@Work (1/10/2017)
-- ChrisM
;WITH charges AS (
SELECT
c.a_site_id,
c.charge_code,
SUM(c.charge_qty) AS SumQty,
SUM(c.charge_amount) AS TotalRevenue,
COUNT(DISTINCT c.ub_master_id) AS CountByClaim,
MAX(c.charge_date) AS Max_charge_date
FROM #ub_charge c
WHERE EXISTS (
SELECT 1
FROM #ub_master m
WHERE c.ub_master_id = m.ub_master_id
AND m.a_site_id = @site_id
AND m.importversion = @importversion
AND m.ub_batch_sub_number = @batchsubnumber
AND m.date_discharged BETWEEN @datefrom AND @dateto
)
GROUP BY c.a_site_id, c.charge_code
HAVING SUM(c.charge_qty) > 0
)
SELECT c.charge_code,
l.LatestDescription,
l.LatestAmount,
c.SumQty,
c.TotalRevenue,
c.CountByClaim,
l.CPTCode,
l.CPTCode_Description
FROM charges c
CROSS APPLY (
SELECT TOP 1
sub.charge_description AS LatestDescription,
sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,
sub.CPTCode,
sub.CPTCode_Description
FROM #ub_charge sub
WHERE sub.a_site_id = c.a_site_id
AND sub.charge_code = c.charge_code
AND sub.charge_date = c.Max_charge_date
) l
This query gives me the correct return count (2540) and does it with these stats:
SQL Server Execution Times:
CPU time = 1796 ms, elapsed time = 795 ms.
Thank you Arjun and ChrisM. I have attached ChrisM sqlplan just in case anyone wants to look at it. You both have been very helpful in different ways. Maybe looking at the SQLPlan (which I cannot read yet very well) you can see if it needs a different index or anything else that will be even quicker, but if not, this is amazing and I am so grateful to all.
January 10, 2017 at 4:27 pm
ChrisM@Work (1/10/2017)
Arjun, your query generates different results to mine and the OP's using the test data:
Chris: True! Mine matches with OP's expected output (check the latest_amount and description columns), but I think yours does what the OP actually intended and is definitely simpler, so all good.
https://sqlroadie.com/
January 10, 2017 at 4:36 pm
douglas.t (1/10/2017)
Arjun Sivadasan (1/9/2017)
Well done! I think you only need to add following condition after the group by.having sum(charge_amount) > 0
Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?
I added that line and it gave me this result count: 2500 (40 less than I need)
Which condition did you add?
sum(charge_amount) > 0
or
sum(charge_qty) >0
douglas.t (1/10/2017)
Thank you Arjun and ChrisM. I have attached ChrisM sqlplan just in case anyone wants to look at it. You both have been very helpful in different ways. Maybe looking at the SQLPlan (which I cannot read yet very well) you can see if it needs a different index or anything else that will be even quicker, but if not, this is amazing and I am so grateful to all.
Can you please check the latest amount and description columns and confirm if they show the expected values? If that's fine, I think Chris's query is about as good as it gets. It'll be interesting to watch if someone can give us a faster query.
Execution plan can seem a little cryptic, but make him your best friend and you will have a wonderful relationship 🙂
https://sqlroadie.com/
Viewing 15 posts - 16 through 30 (of 31 total)
You must be logged in to reply to this topic. Login to reply