Sql Select statement running slow

Question

Post reply

Sql Select statement running slow

jdbrown239

SSCarpal Tunnel

Points: 4393
More actions
December 5, 2013 at 9:32 am

#280768

Hi
I have a select statement that runs forever because of 'Select Distinct' and it makes tempdb grow very large due to sorting. Does anyone have suggestions on how to improve this select statement?
Select distinct d.dlr_reference_num,d.dlr_key,a.app_dlr_fkey, CONVERT(VARCHAR, a.app_key)AS RapAPP, c.user1, c.vendid from NJACCOUNTING01.marlin_test.dbo.apdoc c inner join dbo.application a
on CONVERT(VARCHAR, a.app_key) = SUBSTRING(c.user1, 1,6), dbo.dealer d
where c.DocType = 'VO' and c.Crtd_DateTime > '2012-01-01 00:00:00' and c.VendId like '01%'

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323755 More actions · Answer 1

On The server NJACCOUNTING01, look at the indexes on the table marlin_test.dbo.apdoc

an index like this would greatly help the query, i believe:

CREATE INDEX IX_apdoc ON dbo.apdoc(Crtd_DateTime,DocType,VendId) INCLUDE (user1,vendid)

see what indexes are already there, and see if you can replace or modify an existing, or add one like my example above;

add the index, test your query again, and compare the before and after execution plans to see if ti's worthy of permanent promotion.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

jdbrown239 SSCarpal Tunnel Points: 4393 More actions · Answer 2

Thank you so much for the reply. I need to mention that the tables application.app_key and apdoc.user1 exist on separate databases (db1.application.app_key and db2.apdoc.user1.). I am running the query from db1 using a linked server connection to db2. I created an index on db1.application with (app_key, app_dlr_fkey) and one on db2.apdoc with (vendid, user1). After updating the stats on both tables I displayed the proposed execution plan 99% of the cost was on the Inner Join as a nested loop. I got a result set back pretty quick but got 100's of duplicate rows on user1 and vendid even if I user cross join. Any ideas why?

Select distinct d.dlr_reference_num,d.dlr_key,a.app_dlr_fkey, CONVERT(VARCHAR(30), a.app_key)AS RapAPP, c.user1, c.vendid from NJACCOUNTING01.marlin_test.dbo.apdoc c inner join dbo.application a

on CONVERT(VARCHAR(30), a.app_key) = SUBSTRING(c.user1, 1,6) , dbo.dealer d

where c.DocType = 'VO' and c.Crtd_DateTime > '2012-01-01 00:00:00' and c.VendId like '01%'

Jason Wolfkill SSCrazy Eights Points: 9832 More actions · Answer 3

jdbrown239 (12/5/2013)
Thank you so much for the reply. I need to mention that the tables application.app_key and apdoc.user1 exist on separate databases (db1.application.app_key and db2.apdoc.user1.). I am running the query from db1 using a linked server connection to db2. I created an index on db1.application with (app_key, app_dlr_fkey) and one on db2.apdoc with (vendid, user1). After updating the stats on both tables I displayed the proposed execution plan 99% of the cost was on the Inner Join as a nested loop. I got a result set back pretty quick but got 100's of duplicate rows on user1 and vendid even if I user cross join. Any ideas why?
Select distinct d.dlr_reference_num,d.dlr_key,a.app_dlr_fkey, CONVERT(VARCHAR(30), a.app_key)AS RapAPP, c.user1, c.vendid from NJACCOUNTING01.marlin_test.dbo.apdoc c inner join dbo.application a
on CONVERT(VARCHAR(30), a.app_key) = SUBSTRING(c.user1, 1,6) , dbo.dealer d
where c.DocType = 'VO' and c.Crtd_DateTime > '2012-01-01 00:00:00' and c.VendId like '01%'

It would help us answer this question if you explained the structure of your tables (especially the relations between them - one-to-one, one-to-many, etc.) and attached the actual execution plan. Providing the DDL is most helpful, and if you want some code suggestions, consumable sample data would be appreciated.

Jason Wolfkill

Lowell SSC Guru Points: 323755 More actions · Answer 4

one of the things that is going to make this slow is the linked server;

the execution plan will end up copying the entire table over to tempdb, and THEN filter the data witht eh WHERE / do the joins;

that's a lot of data over the wire that's not needed, most likely.

switching to a divide-and-conquor plan, you can get just the data that matches form the linke dserver into a temp table, adn then join on that.

something like this is my first guess, without any DDL to back it up.

--get the minimal data over the wire possible by using EXECUTE AT

INSERT INTO #temp

EXECUTE ( 'SELECT

c.DocType,c.Crtd_DateTime,c.VendId,c.user1,c.vendid

FROM marlin_test.dbo.apdoc c

WHERE c.DocType = ''VO''

AND c.Crtd_DateTime > ''2012-01-01 00:00:00''

AND c.VendId LIKE ''01%'';' ) AT NJACCOUNTING01;

--data was already filtered at linked server, just join and distinct it.

SELECT DISTINCT

d.dlr_reference_num,

d.dlr_key,

a.app_dlr_fkey,

CONVERT(VARCHAR(30), a.app_key)AS RapAPP,

c.user1,

c.vendid

FROM #temp c

INNER JOIN dbo.application a

ON CONVERT(VARCHAR(30), a.app_key) = SUBSTRING(c.user1, 1, 6),

dbo.dealer d

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Jason Wolfkill SSCrazy Eights Points: 9832 More actions · Answer 5

Lowell (12/6/2013)
one of the things that is going to make this slow is the linked server;
the execution plan will end up copying the entire table over to tempdb, and THEN filter the data witht eh WHERE / do the joins;
that's a lot of data over the wire that's not needed, most likely.
switching to a divide-and-conquor plan, you can get just the data that matches form the linke dserver into a temp table, adn then join on that.
something like this is my first guess, without any DDL to back it up.
--get the minimal data over the wire possible by using EXECUTE AT
INSERT INTO #temp
EXECUTE ( 'SELECT
c.DocType,c.Crtd_DateTime,c.VendId,c.user1,c.vendid
FROM marlin_test.dbo.apdoc c
WHERE c.DocType = ''VO''
AND c.Crtd_DateTime > ''2012-01-01 00:00:00''
AND c.VendId LIKE ''01%'';' ) AT NJACCOUNTING01;
--data was already filtered at linked server, just join and distinct it.
SELECT DISTINCT
d.dlr_reference_num,
d.dlr_key,
a.app_dlr_fkey,
CONVERT(VARCHAR(30), a.app_key)AS RapAPP,
c.user1,
c.vendid
FROM #temp c
INNER JOIN dbo.application a
ON CONVERT(VARCHAR(30), a.app_key) = SUBSTRING(c.user1, 1, 6),
dbo.dealer d

I'll vouch for this approach. Generally, it's a good idea to push as much of the filtering as possible to the linked server for two reasons - one, it cuts down the amount of data piped back to the calling server, as Lowell mentioned, and two, the linked server can use statistics, indexes, and metadata that are unavailable to the calling server to construct a more efficient plan for returning its data. I've used the temp table approach Lowell describes for these very reasons.

Jason Wolfkill