August 28, 2006 at 7:54 am
We have a query that runs in < 1 second in MSSQl 2000 but runs for several minutes after converting to 2005. The difference is that the plan generated by the optimizer for 2000 uses a hash join while in 2005 it does not. If I put an "inner hash join" hint in the sql it returns in < 1 second. However we don't want to use hints in our sql if possible. Does anyone know of a way to influence the optimizer(possibly a configuration setting) to favor a hash join? The sql is pretty poor but it's been running for 7 years and the developers don't want to change it because to re-test is very time consuming. This is the sql:
select
clientid, divid, contactid, rolecd, firstnm, lastnm
from
csms.csms_mts.tmp_client_contact
where
convert(char(4),clientid)+convert(char(3),divid)+convert(char(12),contactid)+convert(char(4),rolecd)
not
in
(
select
convert(char(4),n.clientid)+convert(char(3),n.divid)+convert(char(12),n.contactid)+convert(char(4),n.rolecd)
from
csms..client_contact o, csms.csms_mts.tmp_client_contact n where
convert
(char(4),n.clientid)+convert(char(3),n.divid)+convert(char(12),n.contactid)+convert(char(4),n.rolecd)=convert(char(4),o.clientid)+convert(char(3),o.divid)+convert(char(12),o.contactid)+convert(char(4),o.rolecd)
)
August 28, 2006 at 8:21 am
Hate to burst your developers bubble, but they are going to have to rewrite this query eventually. In its current format, it contains an old style inner join. This has been depreciated in SQL 2005 and may not be supported in future versions. When that happens, they will have to rewrite it. I took a few minutes to try a rewrite on my own, but don't know how good it will work but you have to tell me if it works. Here is what I came up with using SQL 2005 syntax:
WITH ClientContact (clientid, divid, contactid, rolecd) AS
(
SELECT
n.clientid,
n.divid,
n.contactid,
n.rolecd
FROM
csms.dbo.client_contact 0
INNER JOIN csms.csms_mts.tmp_client_contact n
ON (n.clientid = o.clientid AND
n.divid = o.divid AND
n.contactid = o.contactid AND
n.rolecd = o.rolecd)
)
SELECT
tcc.clientid,
tcc.divid,
tcc.contactid,
tcc.rolecd,
tcc.firstnm,
tcc.lastnm
FROM
csms.csms_mts.tmp_client_contact tcc
WHERE
NOT EXISTS (SELECT
*
FROM
ClientContact cc
WHERE
cc.clientid <> tcc.clientid AND
cc.divid <> tcc.divid AND
cc.contactid <> tcc.clientid AND
cc.rolecd <> tcc.rolecd)
hth,
Lynn
August 28, 2006 at 8:23 am
You may have to check all the statistics/Indexes (It does not get carried when you convert to 2005..also Implicit conversion of datatypes does not use Index Scan/Seek in 2005)
August 28, 2006 at 9:33 am
My bad. I think my previous post had a logic error (if someone wants to verify, I'd apreciate it). I think this is what I was trying to come up with for a rewrite:
WITH ClientContact (clientid, divid, contactid, rolecd) AS
(
SELECT
n.clientid,
n.divid,
n.contactid,
n.rolecd
FROM
csms.dbo.client_contact o
INNER JOIN csms.csms_mts.tmp_client_contact n
ON (n.clientid = o.clientid AND
n.divid = o.divid AND
n.contactid = o.contactid AND
n.rolecd = o.rolecd)
)
SELECT
tcc.clientid,
tcc.divid,
tcc.contactid,
tcc.rolecd,
tcc.firstnm,
tcc.lastnm
FROM
csms.csms_mts.tmp_client_contact tcc
WHERE
NOT EXISTS (SELECT
*
FROM
ClientContact cc
WHERE
cc.clientid = tcc.clientid AND
cc.divid = tcc.divid AND
cc.contactid = tcc.clientid AND
cc.rolecd = tcc.rolecd)
again, hth!
August 28, 2006 at 9:39 am
Always good to look at what one writes, as I have another solution that may also work:
SELECT
n.clientid,
n.divid,
n.contactid,
n.rolecd,
n.firstnm,
n.lastnm
FROM
csms.dbo.client_contact o
RIGHT OUTER JOIN csms.csms_mts.tmp_client_contact n
ON (n.clientid = o.clientid AND
n.divid = o.divid AND
n.contactid = o.contactid AND
n.rolecd = o.rolecd)
WHERE
o.clientid IS NULL
August 28, 2006 at 12:20 pm
Lynn,
Thanks for your replies. The last query you sent me executes in < 1 sec. and gives the correct result! The one right before that (the second query that has a "with" ) seems to give the wrong result. I didn't realize that the join syntax they use is deprecated in 2005 -- even though it seems like a long way off it will be valuable to get a head start on changing the code before the next release of MSSQL.
August 28, 2006 at 1:34 pm
Glad to help. I could probably get the query using the CTE to work, but not having any data to test, I did my best. It was looking at it a third time I realized that an outer join was a simpler solution.
Lynn
August 29, 2006 at 6:19 am
you also might have been better of starting with a [rebuild of your indexes] / maintenceplan , a dbcc updateusage(0) with count_rows and a sp_updatestats !
The depricated part is the left/right outer join syntax *= and =* !, but that is not used in your query.
I think it's still worth a try
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
August 29, 2006 at 8:33 am
About the depreciated part of older style joins may be true, however, that doesn't mean you shouldn't start look at rewriting queries that use the old style format. The ANSI format seems a lot more readable and understandable to me. Also, as I have read in several posts and articles on this site, it makes sense to revisit older code and see if it needs to be rewritten. This particular query was, as the author of the post indicated, poorly written.
August 29, 2006 at 9:41 am
Alzdba,
Thanks for the advice -- definitely good practices. Unfortunately it didn't change the query plan or the execution time. There may be nothing but I'm still looking for something at the server(or database level) that would influence the optimizer without changing code or adding hints. Obviously rebuilding indexes and updating statistics can help.
Lynn,
I agree that it makes sense to start converting or writing new queries with the new syntax. Microsoft has been hinting for some time for queries to be converted to the new syntax. They could very well pull the rug out on the old syntax on a not too distant future release.
August 29, 2006 at 12:55 pm
I'm curious, where in the original query did you have to add the hint to make it work as fast in 2005 as it was in 2000?
August 29, 2006 at 1:04 pm
between the join of the two tables. Where it had a comma I substituted "inner hash join" and for "where" I substituted "on".
select
clientid, divid, contactid, rolecd, firstnm, lastnm
from
csms.csms_mts.tmp_client_contact
where
convert(char(4),clientid)+convert(char(3),divid)+convert(char(12),contactid)+convert(char(4),rolecd)
not
in
(
select
convert(char(4),n.clientid)+convert(char(3),n.divid)+convert(char(12),n.contactid)+convert(char(4),n.rolecd)
from
csms..client_contact o inner hash join csms.csms_mts.tmp_client_contact n on
convert
(char(4),n.clientid)+convert(char(3),n.divid)+convert(char(12),n.contactid)+convert(char(4),n.rolecd)=convert(char(4),o.clientid)+convert(char(3),o.divid)+convert(char(12),o.contactid)+convert(char(4),o.rolecd)
August 29, 2006 at 1:25 pm
Is your database also set to compatibility mode 90 (2005)? Can you set a test database to compatibility mode 80 (2000) on a SQL 2005 server and see what happens with the query?
I think, and I may be wrong, that the issue with the query and SQL 2005 is the conversion and concatenation of the 4 "key" fields for the join/comparision. I think my last rewrite is the best alternative rather than searching for a database or server setting to change.
Let me know how your search progresses!
August 29, 2006 at 2:11 pm
We tested the query in both 80 and 90 with the same result. I think you're correct about rewriting the query. The concern I have is how many more differences like this we're going to find.
August 30, 2006 at 12:08 am
This kind of poor queries will have you suffer long time
here are some more solutions.
I guess the CTE is a good example, but not needed with this one.
You've already seen the right join solution. IMO left join read easyer
select
TCC.clientid, TCC.divid, TCC.contactid, TCC.rolecd, TCC.firstnm, TCC.lastnm
from csms.csms_mts.tmp_client_contact TCC
left join csms..client_contact o
on TCC.clientid = o.clientid
and TCC.divid = o.divid
and TCC.contactid = o.contactid
and TCC.rolecd = o.rolecd
where o.clientid is null
Using the exists ... may be the best solution
select
TCC.clientid, TCC.divid, TCC.contactid, TCC.rolecd, TCC.firstnm, TCC.lastnm
from csms.csms_mts.tmp_client_contact TCC
where not exists (select *
from csms..client_contact o
where TCC.clientid = o.clientid
and TCC.divid = o.divid
and TCC.contactid = o.contactid
and TCC.rolecd = o.rolecd )
As Lynn Pettis said, getting rid of depricated features certainly pays off in the long run. Some collegues of mine will suffer before they can migrate to 2005 because they wanted to save on "depricated features maintenance" !
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply