Query runs too slow!

  • I have the following query which takes around 15 minutes to run:

    dbo.All_keys_ = 88 million rows

    dbo.Load1_keys = 900 thousand rows

    CREATE TABLE dbo.Load1_PotentialOverlaps_(Master_Id VARCHAR(100), Master_GUID uniqueidentifier, Master_SubmissionID TINYINT,

    Duplicate_Id VARCHAR(100), duplicate_GUID uniqueidentifier, Duplicate_SubmissionID TINYINT, MatchKeyType TINYINT)

    CREATE INDEX idxduplicateID ON dbo.Load1_PotentialOverlaps_(Duplicate_Id)

    INSERT INTO dbo.Load1_PotentialOverlaps_(Master_Id, Master_GUID, Master_SubmissionID, Duplicate_Id, duplicate_GUID, Duplicate_SubmissionID, MatchKeyType)

    SELECT a.id, a.GUID, a.SubmissionID, b.id, b.GUID, b.SubmissionID, 1

    FROM dbo.All_keys_ AS a

    INNER JOIN dbo.Load1_keys_ AS b

    on a.mkMatchKeyType1 = b.mkMatchKeyType1

    WHERE NOT EXISTS (SELECT 1 from dbo.Load1_PotentialOverlaps_ as c WHERE a.id = c.duplicate_ID)

    AND NOT EXISTS (SELECT 1 from dbo.Load1_PotentialOverlaps_ as c WHERE b.id = c.duplicate_ID) OPTION (MAXDOP 2);

    Indexes:

    ALL_KEYS

    UNIQUE CLUSTERED INDEX ON ID

    NONCLUSTERED FILTERED INDEX ON mkMatchKeyType1 with INCLUDE (ID, GUID, SubmissionID).

    Load1_keys_

    CLUSTERED INDEX ON ID

    NONCLUSTERED FILTERED INDEX ON mkMatchKeyType1 with INCLUDE (ID, GUID, SubmissionID).

    Execution plan attached.

    Please advise.

    Thanks in advance.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Arrghhh... sorted it. Now it runs in 23 seconds ๐Ÿ˜€

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Good improvement!

    What was your solution? Looks almost like a double triangular join with those not exists!

  • Changed MAXDOp to 1, rebuilt the indexes to INCLUDE the columns in the SELECT (I didn't actually have this in my index when I posted this question).

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • This might give you a little boost too:

    SELECT a.id, a.GUID, a.SubmissionID, b.id, b.GUID, b.SubmissionID, 1

    FROM dbo.All_keys_ AS a

    INNER JOIN dbo.Load1_keys_ AS b

    on a.mkMatchKeyType1 = b.mkMatchKeyType1

    WHERE NOT EXISTS (SELECT 1 from dbo.Load1_PotentialOverlaps_ as c WHERE c.duplicate_ID IN (a.id,b.id)

    --AND NOT EXISTS (SELECT 1 from dbo.Load1_PotentialOverlaps_ as c WHERE b.id = c.duplicate_ID) OPTION (MAXDOP 2);

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply