March 25, 2015 at 10:46 am
Hi,
I need to re-write a part of the SQL query to make it run fast. I need help to replace the MAX function which I think is making the query very slow.
LEFT JOIN
(
SELECT distinct
BL_ID,
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN NAME ELSE NULL END) AS [SHIPPER],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN NAME ELSE NULL END) AS [CONSIGNEE],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN NAME ELSE NULL END) AS [NOTIFY],
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN DUNS ELSE NULL END) AS [SHIPPER_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN DUNS ELSE NULL END) AS [CONSIGNEE_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN DUNS ELSE NULL END) AS [NOTIFY_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN COMPANY_NAME ELSE NULL END) AS [SHIPPER_CUSTNAME],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN COMPANY_NAME ELSE NULL END) AS [CONSIGNEE_CUSTNAME],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN COMPANY_NAME ELSE NULL END) AS [NOTIFY_CUSTNAME]
FROM MG_BL_PARTY
join MG_COMPANY
on MG_BL_PARTY.COMPANY_ID = MG_COMPANY.COMPANY_ID
WHERE COMPANY_ROLE_CD IN ('SH','CN','NP')
AND ORDER_SEQ_NBR = 1
GROUP BY BL_ID
)MGP
ON MG_BILL_OF_LADING.BL_ID = MGP.BL_ID
Please find the SQL query below:
SELECT DISTINCT
MG_BILL_OF_LADING.BL_NUM
,MG_BL_CONTAINER.CONTAINER_NUM
,MG_BILL_OF_LADING.POR_LOCATION_CD
,MG_BILL_OF_LADING.POL_LOCATION_CD
,MG_BILL_OF_LADING.TRANSHIPMENT_PORT_CD
,MG_BILL_OF_LADING.POD_LOCATION_CD
,MG_BILL_OF_LADING.PDL_LOCATION_CD
,MG_BILL_OF_LADING.POI_LOCATION_CD
,MG_BILL_OF_LADING.BL_OK_FLG
,MG_BILL_OF_LADING.DELETED_FLG
,MG_BILL_OF_LADING.DELETED_DT
,MG_BILL_OF_LADING.CREATE_DT
,MG_BILL_OF_LADING.SC_NUM
,MG_BILL_OF_LADING.TRADE_CD
,SHIPPER
,CONSIGNEE
,NOTIFY
,SHIPPER_DUNS
,CONSIGNEE_DUNS
,NOTIFY_DUNS
,SHIPPER_CUSTNAME
,CONSIGNEE_CUSTNAME
,NOTIFY_CUSTNAME
,MG_BL_CONTAINER.CONTAINER_PREFIX
,MG_BL_CONTAINER.CONTAINER_CHECK
,MG_BL_CONTAINER.CONTAINER_TYPE_CD
,MG_BL_CONTAINER.CONTAINER_SIZE_CD
,MG_VESSEL_PARTICULAR.VESSEL_TYPE_CD
,MG_BL_ITINERARY.VESSEL_CD
,MG_BL_ITINERARY.VOYAGE_CD
,MG_BL_ITINERARY.LEG_CD
,MG_BL_ITINERARY.FROM_LOCATION_CD
,MG_BL_ITINERARY.ETD_DT
,MG_BL_ITINERARY.ETA_DT
,MG_BL_ITINERARY.TO_LOCATION_CD
,CASE WHEN MG_BL_CONTAINER.CONTAINER_SIZE_CD = '20' THEN 1 WHEN MG_BL_CONTAINER.CONTAINER_SIZE_CD IN ('40','45') THEN 2 END AS TEU,
MG_BL_ITINERARY.FROM_FACILITY_CD,
MG_BL_ITINERARY.TO_FACILITY_CD,
--MG_VSLVOY_SCHEDULE.ARRIVAL_ACTUAL_DT,
MG_BILL_OF_LADING.BL_TYPE_CD
FROM
MG_BILL_OF_LADING
INNER JOIN
MG_BL_CONTAINER
ON MG_BILL_OF_LADING.BL_ID = MG_BL_CONTAINER.BL_ID
LEFT JOIN
(
SELECT distinct
BL_ID,
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN NAME ELSE NULL END) AS [SHIPPER],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN NAME ELSE NULL END) AS [CONSIGNEE],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN NAME ELSE NULL END) AS [NOTIFY],
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN DUNS ELSE NULL END) AS [SHIPPER_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN DUNS ELSE NULL END) AS [CONSIGNEE_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN DUNS ELSE NULL END) AS [NOTIFY_DUNS],
MAX(CASE WHEN COMPANY_ROLE_CD = 'SH' THEN COMPANY_NAME ELSE NULL END) AS [SHIPPER_CUSTNAME],
MAX(CASE WHEN COMPANY_ROLE_CD = 'CN' THEN COMPANY_NAME ELSE NULL END) AS [CONSIGNEE_CUSTNAME],
MAX(CASE WHEN COMPANY_ROLE_CD = 'NP' THEN COMPANY_NAME ELSE NULL END) AS [NOTIFY_CUSTNAME]
FROM MG_BL_PARTY
join MG_COMPANY
on MG_BL_PARTY.COMPANY_ID = MG_COMPANY.COMPANY_ID
WHERE COMPANY_ROLE_CD IN ('SH','CN','NP')
AND ORDER_SEQ_NBR = 1
GROUP BY BL_ID
)MGP
ON MG_BILL_OF_LADING.BL_ID = MGP.BL_ID
Could someone please help on this ? Thanks in advance for your help.
March 25, 2015 at 10:53 am
You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.
Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 25, 2015 at 10:59 am
Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?
March 25, 2015 at 11:04 am
ZZartin (3/25/2015)
Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?
Yes, those are all varchar fields. Is there some way to replace the MAX function ? Thanks.
March 25, 2015 at 11:05 am
ScottPletcher (3/25/2015)
You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.
Thanks, I removed the distinct but the query is still slow. Any more ideas ? Thanks.
March 25, 2015 at 11:16 am
The first thing to looks for is execution plan , which tells alot about how optimizer is processing this . I would say please paste the Exec plan and start focusing on cost involved at different operator to optimize this .
Have you checked number of reads and stats .
March 25, 2015 at 11:24 am
pwalter83 (3/25/2015)
ZZartin (3/25/2015)
Just out curiousity are those varchar fields you're using MAX on? While that will work is that really what you want?Yes, those are all varchar fields. Is there some way to replace the MAX function ? Thanks.
The reason I ask is because unless those fields are specifically populated in such a way that they will always sort in such a way that the top value is what you always want MAX may not be giving you the data you want(or expect). Without knowing exactly how those fields are populated and what they mean it would be hard to offer an answer.
As for performance, can you verify that the subquery is what is actually causing the performance issue? Run just the subquery and compare the run time of that to running the entire query.
March 25, 2015 at 11:36 am
pwalter83 (3/25/2015)
ScottPletcher (3/25/2015)
You should get rid of the DISTINCT in the subquery; since you're GROUPing on BL_ID, you don't need it anyway.Otherwise it's difficult to re-write because there are no aliases on the other columns, so I don't know which tables COMPANY_ROLE_CD, ORDER_SEQ_NBR, etc., are in.
Thanks, I removed the distinct but the query is still slow. Any more ideas ? Thanks.
Again, please alias ALL columns in the query, otherwise I can't tell which table each column comes from. You have to remember we don't know your data -- it's easy for you to know which column is from which table, but we've never seen the tables before, and so can't possibly know.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 25, 2015 at 11:43 am
I'd really focus on the execution plan. I suspect the MAX itself isn't causing issues but rather the GROUP BY.
One way around it would be to try getting a TOP 1 with an ORDER BY instead of a MAX & GROUP BY. That doesn't require aggregation so it can run faster.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 25, 2015 at 1:28 pm
Grant Fritchey (3/25/2015)
I'd really focus on the execution plan. I suspect the MAX itself isn't causing issues but rather the GROUP BY.One way around it would be to try getting a TOP 1 with an ORDER BY instead of a MAX & GROUP BY. That doesn't require aggregation so it can run faster.
The MAX and the GROUP BY is a form of PIVOT known as a CROSS TAB and it's not the problem. The problem is that they're trying to join to a Pivot/CrossTab. The best thing to do here would be to do the Divide'n'Conquer thing and do the cross tab separately into a Temp table and then join on that temp table. Done correctly, I believe you'll see at least an order of magnitude in performance improvement.
As a side bar, I'll never understand why people think they need to do things in a single query. That's NOT what set-based code is at all.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 25, 2015 at 1:34 pm
Please provide a description of what this query is attempting to accomplish. For example, if all you're trying to do is return the most recent row for each BL_ID, then that can be accomplished by apply a filer to a DENSE_RANK() or ROW_NUMBER() function rather than use of aggregation.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 26, 2015 at 10:27 am
I'd also agree you're better off posting the execution plan rather than just removing elements of your query.
Admittedly, the more experienced you are, the more likely you will be able to spot code smells but still...
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply