better than not in

  • The following is a sql statement which has been brutal (indexes are correct)

    SELECT patid + facid

    FROM pharm.dbo.cms_PatientStaging

    WHERE

    patid + facid not in (

    SELECT patid + facid

    FROM pat..patients)

    Is there a better approach for this (there are no null values in either column)

  • Have you compared with left join query and not exists to see which one works better?

    I personnally preffer to use not exists because in my mind it's clearer what the query does.

  • I plead some ignorance in trying to get not exists to work the syntax below returns and error

    SELECT patid + facid

    FROM pharm.dbo.cms_PatientStaging

    WHERE

    --patid + mardbname in

    -- (

    -- SELECT patid + mardbname

    -- FROM pat..cmspatviewL

    -- )

    --and

    patid + facid not exists

    (

    SELECT patid + facid

    FROM pat..cmspatviewl)

  • This assumes that both column are not nullable and that patid and facid and indexed in the same index in pat..patients

    SELECT

    patid + facid

    FROM

    pharm.dbo.cms_PatientStaging Stag

    WHERE

    NOT EXISTS ( SELECT

    *

    FROM

    pat..patients P

    WHERE

    P.patid = Stag.patid

    AND P.facid = Stag.facid )

  • Might be worth a read: http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/13/2010)


    Might be worth a read: http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/

    Nice read. Always something more to [re]learn I guess :w00t:.

  • I went from not in to left outer join, went from 87 seconds to 2 seconds

    WEEEEEEEEHHHHHHHHHHHHHHHHHHHHHEEEEEEEEEEEEEEE

  • timscronin (10/13/2010)


    I went from not in to left outer join, went from 87 seconds to 2 seconds

    WEEEEEEEEHHHHHHHHHHHHHHHHHHHHHEEEEEEEEEEEEEEE

    Great, now switch it to the Not Exists that Ninja showed you as it should be very slightly faster than the left join.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Interesting, tried that the left outer join was faster and more efficient

  • Generally it's the other way around.

    http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

    Can you post exec plans of the join and the not exists? Am curious.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Here it is, I get the plan look but this is more telling

    for the left outer join

    Table 'cms_PatientStaging'. Scan count 34, logical reads 842, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Facilities'. Scan count 34, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Patients'. Scan count 34, logical reads 4662, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    for the not exists

    Table 'cms_PatientStaging'. Scan count 17, logical reads 628, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Facilities'. Scan count 17, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Patients'. Scan count 9653, logical reads 33505, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Big hit on patients

    Plans

    not exists

    |--Parallelism(Gather Streams)

    |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([Stag].[FacID], [Stag].[PatID], [Expr1014]) OPTIMIZED WITH UNORDERED PREFETCH)

    |--Hash Match(Left Semi Join, HASH:([Expr1011])=([Expr1012]), RESIDUAL:([Expr1011]=[Expr1012]))

    | |--Bitmap(HASH:([Expr1011]), DEFINE:([Bitmap1013]))

    | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([Expr1011]))

    | | |--Compute Scalar(DEFINE:([Expr1010]=[PHARM].[dbo].[cms_PatientStaging].[PatID] as [Stag].[PatID]+[PHARM].[dbo].[cms_PatientStaging].[FacID] as [Stag].[FacID], [Expr1011]=[PHARM].[dbo].[cms_PatientStaging].[PatID] as [Stag].[PatID]+[PHARM].[dbo].[cms_PatientStaging].[mardbname] as [Stag].[mardbname]))

    | | |--Clustered Index Scan(OBJECT:([PHARM].[dbo].[cms_PatientStaging].[PK_cms_PatientStaging] AS [Stag]), WHERE:([PHARM].[dbo].[cms_PatientStaging].[import_rec_status] as [Stag].[import_rec_status]='i'))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([Expr1012]), WHERE:(PROBE([Bitmap1013])=TRUE))

    | |--Compute Scalar(DEFINE:([Expr1012]=[PAT].[dbo].[Patients].[PatID] as [p].[PatID]+[FAC].[dbo].[Facilities].[mardbname] as [f].[mardbname]))

    | |--Hash Match(Inner Join, HASH:([f].[FacID])=([p].[FacID]), RESIDUAL:([FAC].[dbo].[Facilities].[FacID] as [f].[FacID]=[PAT].[dbo].[Patients].[FacID] as [p].[FacID]))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([f].[FacID]))

    | | |--Index Scan(OBJECT:([FAC].[dbo].[Facilities].[_dta_index_Facilities_9_652633468__K1_78] AS [f]))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([p].[FacID]))

    | |--Index Scan(OBJECT:([PAT].[dbo].[Patients].[_dta_index_Patients_14_1922105888__K1_K2] AS [p]))

    |--Index Seek(OBJECT:([PAT].[dbo].[Patients].[Patients_CMSidx0e] AS [P]), SEEK:([P].[FacID]=[PHARM].[dbo].[cms_PatientStaging].[FacID] as [Stag].[FacID] AND [P].[PatID]=[PHARM].[dbo].[cms_PatientStaging].[PatID] as [Stag].[PatID]) ORDERED FORWARD)

    left join

    |--Parallelism(Gather Streams)

    |--Hash Match(Right Semi Join, HASH:([Expr1016])=([Expr1015]), RESIDUAL:([Expr1015]=[Expr1016]))

    |--Parallelism(Distribute Streams, Broadcast Partitioning)

    | |--Filter(WHERE:([PAT].[dbo].[Patients].[PatID] as [p].[PatID] IS NULL))

    | |--Parallelism(Gather Streams)

    | |--Hash Match(Left Outer Join, HASH:([p].[PatID], [p].[FacID])=([p].[PatID], [p].[FacID]), RESIDUAL:([PHARM].[dbo].[cms_PatientStaging].[PatID] as [p].[PatID]=[PAT].[dbo].[Patients].[PatID] as [p].[PatID] AND [PHARM].[dbo].[cms_PatientStaging].[FacID] as [p].[FacID]=[PAT].[dbo].[Patients].[FacID] as [p].[FacID]))

    | |--Bitmap(HASH:([p].[PatID], [p].[FacID]), DEFINE:([Bitmap1020]))

    | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([p].[PatID], [p].[FacID]))

    | | |--Hash Match(Left Semi Join, HASH:([Expr1017])=([Expr1018]), RESIDUAL:([Expr1017]=[Expr1018]))

    | | |--Bitmap(HASH:([Expr1017]), DEFINE:([Bitmap1019]))

    | | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([Expr1017]))

    | | | |--Compute Scalar(DEFINE:([Expr1016]=[PHARM].[dbo].[cms_PatientStaging].[PatID] as [p].[PatID]+[PHARM].[dbo].[cms_PatientStaging].[FacID] as [p].[FacID], [Expr1017]=[PHARM].[dbo].[cms_PatientStaging].[PatID] as [p].[PatID]+[PHARM].[dbo].[cms_PatientStaging].[mardbname] as [p].[mardbname]))

    | | | |--Index Scan(OBJECT:([PHARM].[dbo].[cms_PatientStaging].[IX_cms_PatientStaging] AS [p]))

    | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([Expr1018]), WHERE:(PROBE([Bitmap1019])=TRUE))

    | | |--Compute Scalar(DEFINE:([Expr1018]=[PAT].[dbo].[Patients].[PatID] as [p].[PatID]+[FAC].[dbo].[Facilities].[mardbname] as [f].[mardbname]))

    | | |--Hash Match(Inner Join, HASH:([f].[FacID])=([p].[FacID]), RESIDUAL:([FAC].[dbo].[Facilities].[FacID] as [f].[FacID]=[PAT].[dbo].[Patients].[FacID] as [p].[FacID]))

    | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([f].[FacID]))

    | | | |--Index Scan(OBJECT:([FAC].[dbo].[Facilities].[_dta_index_Facilities_9_652633468__K1_78] AS [f]))

    | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([p].[FacID]))

    | | |--Index Scan(OBJECT:([PAT].[dbo].[Patients].[_dta_index_Patients_14_1922105888__K1_K2] AS [p]))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([p].[PatID], [p].[FacID]), WHERE:(PROBE([Bitmap1020])=TRUE))

    | |--Hash Match(Inner Join, HASH:([f].[FacID])=([p].[FacID]), RESIDUAL:([FAC].[dbo].[Facilities].[FacID] as [f].[FacID]=[PAT].[dbo].[Patients].[FacID] as [p].[FacID]))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([f].[FacID]))

    | | |--Index Scan(OBJECT:([FAC].[dbo].[Facilities].[IX_Facilities_1] AS [f]))

    | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([p].[FacID]))

    | |--Index Scan(OBJECT:([PAT].[dbo].[Patients].[_dta_index_Patients_14_1922105888__K1_K2] AS [p]))

    |--Compute Scalar(DEFINE:([Expr1015]=[PHARM].[dbo].[cms_PatientStaging].[PatID]+[PHARM].[dbo].[cms_PatientStaging].[FacID]))

    |--Clustered Index Scan(OBJECT:([PHARM].[dbo].[cms_PatientStaging].[PK_cms_PatientStaging]), WHERE:([PHARM].[dbo].[cms_PatientStaging].[import_rec_status]='i'))

  • As .sqlplan files please. Most of the information I'm interests in is not in that text.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Edit: From a glance at what you did post, your indexes may need work. Please can you post table and index definitions for both tables?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • here they are

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply