February 23, 2007 at 8:24 am
Hi Everyone,
I need some help with this query.
Delete from FT
where [Text] is null
and SKey not in (Select distinct FTSKey from
FTR)
I was trying to do some performance tuning on it myself today and discovered it's the [Text] is null portion of the query that is causing all the grief. The [TEXT] row is of text type. I did some searching around on the internet for suggestions on searching text but they all have to do with FullText search in SQL Server or using a third party tool to index the text. That's not really possible in this case due to budget restraints and the fact that this query is used for maintenance...and not in the application. Any suggestions on how I can get this to run faster?
Thanks,
Jessica
February 23, 2007 at 8:55 am
I do not think the DISTINCT or the IN will help the speed if there are a lot of rows in FTR. The following may work better especially if there is an index on FTSKey:
DELETE FT
FROM FT
LEFT JOIN FTR
ON FT.SKey = FTR.FTSKey
WHERE FTR.FTSKey IS NULL
AND FT.[Text] IS NULL
or
DELETE FT
WHERE [Text] IS NULL
AND NOT EXISTS (
SELECT *
FROM FTR
WHERE FTR.FTSKey = FT.SKey
)
March 15, 2007 at 7:01 am
Just a thought, it looks like you only want to delete the data from FT if it does not have a key in FTR.
You could therefore scan for those entries that are in FT that are not in FTR first and store these in a temp table.
Then use the temp table to join on FT for the delete and check the text column. This would reduce that amount text columns you would have to check to see if they where null.
It really depends on how much data you are scanning. If there is an index on the skey it should be quick.
Hope that helps
March 15, 2007 at 10:21 am
Remove the distinct from the in. x in (1,2,2,2,2,5) and x in (1,2,5) are equivalent.
Make sure you have an index on SKey and FTSKey.
Also, try the query as a left join, rather than a not in, see if it makes a difference.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 27, 2007 at 12:54 am
As an aside, it's worth nothing that IN/NOT IN (with a subquery), EXISTS/NOT EXISTS, INTERSECT/EXCEPT almost always perform better than an equilivent operation implemented with an INNER or LEFTER OUTER join.
The reason for this is that the query optimizer will almost always implement these operations as a semi or anti-semi join. Furthermore--although this isn't always the case--I have never seen a join outperform a semi join. Basically put, a semi-join is an optimized join that can be used when you don't necessarily want to join every record that could be joined (e.g. if you are joining to a table to see if any records exists, and there are 20 records that meet your criteria, you only really care that there is one.) The optimizer can then make better index decisions and, overall, your query will perform better.
It's worth noting that, in 2k5 at least, the optimizer will somtimes rewrite your left outer join or inner join as a semi join when it can tell that you are really only testing row existence. Of course this isn't always the case. I guess my point is, why tell the optimizer you want to perform one operation, when you really want to perform another.
One final point, IN/NOT IN, EXISTS/NOT EXISTS (and INTERSECT/EXCEPT) don't handle nulls the same way. If NULLS are going to be involved and you want the logic to be ANSI compliant, go with EXISTS/NOT EXISTS.
SQL guy and Houston Magician
March 27, 2007 at 12:59 am
There have been some good suggestions above and, like others have said, you definitly don't want the distinct in your subquery there. If this is still an issue and you really want some definitive answers, be sure to post DDL (including indexes and constraints) and run the query again with SET SHOWPLAN_TEXT ON and post the results.
SQL guy and Houston Magician
March 27, 2007 at 7:35 am
Thanks everyone for the great suggestions and explanations!!! We willl likely be going with:
Delete from FT
WHERE [Text] IS NULL
AND NOT EXISTS (
SELECT *
FROM FTR
WHERE FTR.FTSKey = FT.SKey
)
In the testing I did in our staging environment, it seemed to run in about half the time. This is sufficient given that this query is run as a batch update at night. Thanks again!!!
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply