Help rewriting a not in query

  • I read an article the other day that advised against the use of the NOT IN and NOT EXISTS statements. It said that even if the system finds the value in the subselect it will continue checking every single record rather than quitting there and then. It said that there should only be a few occasions where you need to use a negative clause and you should try to rewrite your queries to use IN or EXISTS instead.

    So I have found a couple of instances of the use of not in, in my application.

    Could someone give me some advice of how I could re-write the following insert statement so that it doesn't use NOT IN.

    The insert statement is within a proc that is used to add selected candidates to selected folders from a website. The user could select 1 or more candidates and then select 1 or more folders to add them to. The candidate IDs and FolderIDs are passed into the proc as two comma seperated strings. I then use a udf SPLIT function that splits a string on a delimiter and a cross join to populate the link table all in one hit. I thought this would be a better idea than either looping through either the cands or folders one at a time and doing multiple inserts into the table. I only want to insert the candidate into the appropriate folder if they don't already exist and thats where the NOT IN comes in.

    A simplified version is:

    declare @CandIDs varchar(500), @FolderIDs varchar(500)

    insert into tbl_FOLDER_CANDIDATES

    (FolderID,CandidateID)

    select  a.[Value], b.[Value]

    from dbo.udf_SPLIT(@CandIDs,',') as a, dbo.udf_SPLIT(@FolderIDs,',') as b

    where a.[Value] not in(select CandidateID from tbl_FOLDER_CANDIDATES where FolderID=b.[Value])

    I haven't had any problems with this method so far but taking on board this articles advice I'd like to know if I could improve it by replacing the NOT IN.

    Any help would be much appreciated and I thank you in advance.

     

     

     

     

  • i don't see how/if A is related to b, but it is something like this i think: maybe left or right joins, you'd have to examine the actual data:

    select  a.[Value], b.[Value]

    from tbl_FOLDER_CANDIDATES as c

    FULL OUTER JOIN  dbo.udf_SPLIT(@CandIDs,',') as a ON c.CandidateID =a.[Value]

    FULL OUTER JOIN  dbo.udf_SPLIT(@FolderIDs,',') as b  ON c.FolderID =  b.[Value] 

     --how is a related to b??? only thru C i think?

    where c.candidateId is NULL

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Something like the following gets rid of the NOT IN but introduces the overhead of an OUTER JOIN.

    After checking the same results are produced, you will need to look at the execution plan to see if it is better.

    INSERT INTO tbl_FOLDER_CANDIDATES (CandidateID, FolderID)

    SELECT C.Value, F.Value

    FROM dbo.udf_SPLIT(@CandIDs,',') C

        CROSS JOIN dbo.udf_SPLIT(@FolderIDs,',') F

        LEFT JOIN tbl_FOLDER_CANDIDATES T

            ON C.Value = T.CandidateID

                AND F.Value = T.FolderID

    WHERE T.CandidateID IS NULL

     

  • Ok thanks for both of those I will try them out soon and compare the Plans.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply