Spill Levels?

  • Hi, I am reviewing some code that runs in 26 seconds, but I think it's poorly written.  I have provided it below and am asking for advice on a better way to do what we're doing, which is a multi-threaded select.  I personally feel the data from the inner loops should be stored in temp tables.  I have attached the actual query plan. Am I on the right track?  Any advice is appreciated, and please no bashing, I've inherited this code. 🙂

    DECLARE @Today date = GETDATE()
    DECLARE @EndDueDate date = DATEADD(mm,1,@Today)
    DECLARE @StartDueDate date = DATEADD(yy,-1,@EndDueDate)

    SELECT
    InvoiceNumber
    ,BillCo
    ,OpCo
    ,GroupNum
    ,GroupName
    ,GroupRep
    ,GroupBillMode
    ,BillDueDate
    ,Age
    ,BillStatus
    ,OnHold
    ,DedFreq
    ,OrigPolCount
    ,ActualPolCount
    ,OrigBillAmt
    ,ActualBillAmt
    ,FirstFileDate
    ,LastFileDate
    ,LastFileUser
    ,LastRelDate
    ,LastRelUser
    ,InvoicePrintDate
    ,LockboxFundDate
    ,LockboxFunds
    ,PayCount
    ,PayMethods
    ,AppliedPaymentDate
    ,TotalAppliedAmt
    ,PaymentExtractDate
    ,Processor
    ,@Today AS ReportDate

    FROM
    --Bill dataset; aggregates contract details to bill-level
    (SELECT b.invoice_number AS InvoiceNumber
    ,b.bill_pk --for JOIN, not needed for output
    ,(SELECT branding_line_1 FROM BRANDING WHERE branding_pk = g.branding_fk) AS BillCo
    ,STUFF((SELECT DISTINCT ', ' + c2.operating_company
    FROM BILL_RECORD br2 JOIN [CONTRACT] c2 ON br2.bill_record_pk = c2.bill_record_fk
    WHERE br2.bill_fk = b.bill_pk AND c2.update_status_type_cd <> 'DELETED' AND c2.is_deleted = 0
    FOR XML PATH('') --STRING_AGG() may be another solution, but is unavailable in this version of SQL Server
    ), 1, 2, '') AS OpCo --aggregate of all operating companies on the bill
    ,g.group_number AS GroupNum
    ,g.group_name AS GroupName
    ,r.name AS GroupRep
    ,(SELECT billing_mode_type_description
    FROM BILLING_MODE_TYPE
    WHERE billing_mode_type_cd = g.billing_mode_type_cd) AS GroupBillMode
    ,CONVERT(date, b.bill_due_date, 101) AS BillDueDate
    ,DATEDIFF(day,b.bill_due_date,@Today) AS Age --number of days past DueDate; restrict to unpaid, released?
    ,b.bill_status_type_cd AS BillStatus
    ,b.hold_for_review AS OnHold
    ,(SELECT REPLACE(df.deduction_frequency_short_desc, ' deductions','')
    FROM DEDUCTION_FREQUENCY_TYPE df
    WHERE df.deduction_frequency_cd = g.deduction_frequency_cd) AS DedFreq
    ,SUM(CASE WHEN c.update_status_type_cd <> 'ADDED' THEN 1 ELSE 0 END) AS OrigPolCount
    ,SUM(CASE WHEN c.update_status_type_cd <> 'DELETED' THEN 1 ELSE 0 END) AS ActualPolCount
    ,SUM(c.original_bill_amount) AS OrigBillAmt
    ,SUM(c.actual_bill_amount) AS ActualBillAmt
    ,CONVERT(date, (SELECT MIN(add_date) FROM FILE_UPLOAD
    WHERE bill_fk = b.bill_pk AND status_at_time_of_upload <> 'UNRELEASED'), 101) AS FirstFileDate
    ,CONVERT(date, (SELECT MAX(add_date) FROM FILE_UPLOAD
    WHERE bill_fk = b.bill_pk AND status_at_time_of_upload <> 'UNRELEASED'), 101) AS LastFileDate
    ,REPLACE((SELECT add_user FROM FILE_UPLOAD
    WHERE bill_fk = b.bill_pk AND add_date = (SELECT MAX(add_date) FROM FILE_UPLOAD WHERE bill_fk = b.bill_pk)),
    '@selmanco.com','') AS LastFileUser
    ,CONVERT(date, (SELECT MAX(add_date) FROM BILL_RELEASE
    WHERE bill_fk = b.bill_pk), 101) AS LastRelDate
    ,CASE
    WHEN (SELECT release_user FROM BILL_RELEASE
    WHERE bill_fk = b.bill_pk AND add_date = (SELECT MAX(add_date) FROM BILL_RELEASE WHERE bill_fk = b.bill_pk))
    NOT LIKE '%selman%' THEN 'Autorelease'
    ELSE REPLACE((SELECT release_user FROM BILL_RELEASE
    WHERE bill_fk = b.bill_pk AND add_date = (SELECT MAX(add_date) FROM BILL_RELEASE WHERE bill_fk = b.bill_pk)),
    '@selmanco.com','')
    END AS LastRelUser
    ,CONVERT(date, (CASE WHEN b.bill_due_date < '10/01/2019'
    THEN NULL
    ELSE b.invoice_print_date
    END), 101) AS InvoicePrintDate --date bills are sent to mainframe for printing

    FROM BILL b
    LEFT JOIN BILL_RECORD br ON b.bill_pk = br.bill_fk
    LEFT JOIN [CONTRACT] c ON br.bill_record_pk = c.bill_record_fk
    LEFT JOIN [GROUP] g ON g.group_pk = b.group_fk
    LEFT JOIN REP r ON r.rep_pk = g.rep_fk

    WHERE b.is_deleted = 0
    AND br.is_deleted = 0
    AND c.is_deleted = 0
    AND g.is_deleted = 0
    AND g.admin_location_cd = 'KEENE'

    GROUP BY b.invoice_number
    ,b.hold_for_review
    ,b.bill_pk
    ,g.branding_fk
    ,g.group_number
    ,g.group_name
    ,g.billing_mode_type_cd
    ,r.name
    ,b.bill_due_date
    ,b.file_upload_date
    ,b.invoice_print_date
    ,b.bill_status_type_cd
    ,g.deduction_frequency_cd

    ) AS bds --end bill dataset

    LEFT JOIN

    /*Payment dataset; not 100% clear on the role of some of the fields in p and pd;
    some may be redundant and/or not what they appear to be? */
    (SELECT pd.bill_fk --for JOIN, not needed for output
    ,COUNT(p.payment_pk) AS PayCount --number of payment sources applied to bill
    ,STUFF((SELECT DISTINCT ', ' + p2.payment_method_cd
    FROM PAYMENT_DETAIL pd2 JOIN PAYMENT p2 ON p2.payment_pk = pd2.payment_fk
    WHERE pd2.bill_fk = pd.bill_fk
    FOR XML PATH(''))
    , 1, 2, '') AS PayMethods
    ,CONVERT(date, MAX(pd.applied_payment_date), 101) AS AppliedPaymentDate
    ,SUM(CASE WHEN p.payment_method_cd = 'Credit Suspense'
    THEN -1*pd.applied_amount
    ELSE pd.applied_amount
    END) AS TotalAppliedAmt
    ,CONVERT(date, MAX(p.payment_extract_datetime), 101) AS PaymentExtractDate
    ,STUFF((SELECT DISTINCT ', ' + REPLACE(p2.processor_user,'@selmanco.com','')
    FROM PAYMENT_DETAIL pd2 JOIN PAYMENT p2 ON p2.payment_pk = pd2.payment_fk
    WHERE pd2.bill_fk = pd.bill_fk
    FOR XML PATH(''))
    , 1, 2, '') AS Processor

    FROM PAYMENT p
    JOIN PAYMENT_DETAIL pd ON p.payment_pk = pd.payment_fk

    WHERE pd.is_deleted = 0
    AND p.is_deleted = 0
    AND p.payment_status_cd = 'PROCESSED'

    GROUP BY pd.bill_fk

    ) AS pds --end payment dataset

    ON bds.bill_pk = pds.bill_fk --should be 1-to-1

    LEFT JOIN

    /* Lockbox dataset */
    (SELECT x.invoice, MAX(x.deposit_date) AS LockboxFundDate, SUM(x.paid_amount) AS LockboxFunds
    FROM (SELECT CASE WHEN r.lockbox_reconcile_pk IS NULL
    THEN a.generated_invoice_no
    ELSE r.generated_invoice_no
    END AS invoice
    ,CONVERT(date, a.deposit_date, 101) AS deposit_date
    ,CASE WHEN r.is_deleted = 0
    THEN r.paid_amount
    ELSE NULL
    END AS paid_amount
    FROM LOCKBOX_IN_ARCHIVE a
    LEFT JOIN LOCKBOX_RECONCILE r ON a.lockbox_in_archive_pk = r.lockbox_in_archive_fk) AS x
    GROUP BY invoice
    ) AS lbds --end lockbox dataset

    ON bds.InvoiceNumber = lbds.invoice

    --WHERE
    --(OpCo LIKE '%VC%' OR OpCo LIKE '%CB%')
    --AND InvoiceNumber = '368_20190601'
    --AND GroupNum IN ('DYQ')
    --AND BillStatus NOT IN ('UNRELEASED', 'PAID')
    --AND BillDueDate BETWEEN @StartDueDate AND @EndDueDate

    ORDER BY InvoiceNumber
    ;
  • I think you forgot the execution plan.  Without that, it is incredibly hard to say with any certainty what would help.

    Indexes may help, or you may already have the ones you need.  If you don't have any PK/FK constraints in place, adding them may help.  Rewriting the nested selects may help.  It is hard to say from the query alone, especially with no data.
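
    For illustration, declaring one of the relationships the query implies might look like the sketch below.  The table and column names come from the posted query, but whether the relationship actually holds in your data is an assumption; verify it before creating anything.

        -- Hypothetical sketch: declare the BILL_RECORD -> BILL relationship
        -- implied by the query's join, so the optimizer can trust it.
        -- Confirm the data is clean before adding the constraint.
        ALTER TABLE dbo.BILL_RECORD
            ADD CONSTRAINT FK_BILL_RECORD_BILL
            FOREIGN KEY (bill_fk) REFERENCES dbo.BILL (bill_pk);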

    Converting the nested selects MAY make things faster, or it might even make things slower.  If the tables in the nested selects are already finely tuned for these queries (good indexes and constraints), you may only slow things down by pulling the data into temp tables or table variables.
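
    If you do want to test the temp table approach, a minimal sketch for the payment dataset is below, assuming the table and column names from the posted query.  Only testing against your real data will tell you whether it is actually faster.

        -- Sketch: materialize the payment aggregates once, then LEFT JOIN
        -- the main query to #PaymentDataset instead of the derived table.
        IF OBJECT_ID('tempdb..#PaymentDataset') IS NOT NULL
            DROP TABLE #PaymentDataset;

        SELECT pd.bill_fk
            ,COUNT(p.payment_pk) AS PayCount
            ,CONVERT(date, MAX(pd.applied_payment_date)) AS AppliedPaymentDate
            ,SUM(CASE WHEN p.payment_method_cd = 'Credit Suspense'
                      THEN -1 * pd.applied_amount
                      ELSE pd.applied_amount
                 END) AS TotalAppliedAmt
        INTO #PaymentDataset
        FROM PAYMENT p
        JOIN PAYMENT_DETAIL pd ON p.payment_pk = pd.payment_fk
        WHERE pd.is_deleted = 0
          AND p.is_deleted = 0
          AND p.payment_status_cd = 'PROCESSED'
        GROUP BY pd.bill_fk;

        -- A clustered index on the join key supports the later LEFT JOIN.
        CREATE CLUSTERED INDEX CX_PaymentDataset ON #PaymentDataset (bill_fk);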

    It would also be helpful if you qualified the columns you are using with the alias they correspond to.  This makes it easier to tell if you are pulling all of the data you require OR if you might be pulling data that is simply discarded.
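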

    Depending on how much data you are pulling, that performance MAY be the best that you can get.  For example, let's say you are pulling back 10 million rows of data at 1 KB per row.  That is 10,000,000 KB of data, OR 10,000 MB of data, OR 10 GB of data (approximately... I am rounding 1024 to 1000 to make the math easier to visualize).

    There are a lot of factors in play that could be causing it to be "slow", and maybe 26 seconds is as good as you can get.  Maybe there are some indexes and constraints you can add to get better performance.  Maybe your statistics are out of date and should be updated.  Maybe your indexes could use some include columns.  There are a lot of unknowns on our side.
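
    Hypothetical examples of those last two points follow.  The object names are placeholders based on the posted query, and the key/include choice is illustrative, not a recommendation; let the actual plan justify any index you add.

        -- Refresh potentially stale statistics on a heavily hit table:
        UPDATE STATISTICS dbo.FILE_UPLOAD WITH FULLSCAN;

        -- An index with INCLUDE columns can cover the FILE_UPLOAD lookups
        -- and avoid key lookups; adjust to what the plan shows you.
        CREATE NONCLUSTERED INDEX IX_FILE_UPLOAD_bill_fk
            ON dbo.FILE_UPLOAD (bill_fk, add_date)
            INCLUDE (add_user, status_at_time_of_upload);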

    One thing that MAY help is to change your data pulls from the FILE_UPLOAD table into a JOIN, rather than pulling data from that table multiple times per row.  A sketch of the pattern is below.
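
    This is a sketch only, built from the table and column names in the posted query: the grouped derived table replaces the MIN/MAX subqueries, and the OUTER APPLY replaces the latest-uploader subquery.  The filters mirror the original (the MIN/MAX exclude 'UNRELEASED' uploads; the latest-user lookup does not), but verify that against the real requirements.

        -- Aggregate FILE_UPLOAD once per bill instead of running four
        -- correlated subqueries for every output row.
        SELECT b.bill_pk
            ,fud.FirstFileDate
            ,fud.LastFileDate
            ,REPLACE(lastfu.add_user, '@selmanco.com', '') AS LastFileUser
        FROM BILL b
        LEFT JOIN
        (
            SELECT bill_fk
                ,MIN(add_date) AS FirstFileDate
                ,MAX(add_date) AS LastFileDate
            FROM FILE_UPLOAD
            WHERE status_at_time_of_upload <> 'UNRELEASED'
            GROUP BY bill_fk
        ) AS fud ON fud.bill_fk = b.bill_pk
        OUTER APPLY
        (
            -- Most recent upload per bill, regardless of status,
            -- matching the original LastFileUser subquery.
            SELECT TOP (1) fu.add_user
            FROM FILE_UPLOAD fu
            WHERE fu.bill_fk = b.bill_pk
            ORDER BY fu.add_date DESC
        ) AS lastfu;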

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
