INSERT Takes too Long

  • The Plan:

    To compare 2 tables that have the same structure and when the table which is updated daily has different information than the other table, to insert the revised record and move the old record to the AuditLog:

    The INSERT Syntax:

    insert into users

              (

              InsertDate,ADT.Username,ADT.initials,ADT.facsimileTelephoneNumber,ADT.streetAddress

              ,ADT.manager,ADT.whenCreated,ADT.uSNChanged,ADT.DN,ADT.objectClass,ADT.[name],ADT.objectGUID,ADT.[description],ADT.displayName,ADT.userAccountControl,ADT.sAMAccountName

    ,ADT.sn,ADT.givenName,ADT.memberOf,ADT.homeMTA,ADT.proxyAddresses,ADT.homeMDB,ADT.mDBUseDefaults

    ,ADT.mailNickname,ADT.mail,ADT.msExchHomeServerName,ADT.physicalDeliveryOfficeName,ADT.telephoneNumber

              ,ADT.department,ADT.homeDirectory,ADT.sIDHistory,ADT.directReports,ADT.homeDrive

              ,ADT.targetAddress,ADT.c,ADT.l,ADT.st,ADT.postalCode,ADT.mAPIRecipient)

         select

              U.InsertDate,U.Username,U.initials,U.facsimileTelephoneNumber,U.streetAddress

              ,U.manager,U.whenCreated,U.uSNChanged,U.DN,U.objectClass,U.[name],U.objectGUID

              ,U.[description],U.displayName,U.userAccountControl,U.sAMAccountName,U.sn,U.givenName

              ,U.memberOf,U.homeMTA,U.proxyAddresses,U.homeMDB,U.mDBUseDefaults,U.mailNickname

              ,U.mail,U.msExchHomeServerName,U.physicalDeliveryOfficeName,U.telephoneNumber,U.department

              ,U.homeDirectory,U.sIDHistory,U.directReports,U.homeDrive,U.targetAddress,U.c,U.l,U.st

              ,U.postalCode,U.mAPIRecipient

         from AD_temp ADT

         join users U

         on u.objectGUID = ADT.objectGUID

         where      u.initials<>ADT.initials

              or u.facsimileTelephoneNumber<>ADT.facsimileTelephoneNumber

              or u.streetAddress<>ADT.streetAddress

              or u.manager<>ADT.manager

              or u.whenCreated<>ADT.whenCreated

              or u.DN<>ADT.DN

              or u.objectClass<>ADT.objectClass

              or u.[name]<>ADT.[name]

              or u.[description]<>ADT.[description]

              or u.displayName<>ADT.displayName

              or u.userAccountControl<>ADT.userAccountControl

              or u.sAMAccountName<>ADT.sAMAccountName

              or u.sn<>ADT.sn

              or u.givenName<>ADT.givenName

              or u.memberOf<>ADT.memberOf

              or u.homeMTA<>ADT.homeMTA

              or u.proxyAddresses<>ADT.proxyAddresses

              or u.homeMDB<>ADT.homeMDB

              or u.mDBUseDefaults<>ADT.mDBUseDefaults

              or u.mailNickname<>ADT.mailNickname

              or u.mail<>ADT.mail

              or u.msExchHomeServerName<>ADT.msExchHomeServerName

              or u.physicalDeliveryOfficeName<>ADT.physicalDeliveryOfficeName

              or u.telephoneNumber<>ADT.telephoneNumber

              or u.department<>ADT.department

              or u.homeDirectory<>ADT.homeDirectory

              or u.directReports<>ADT.directReports

              or u.homeDrive<>ADT.homeDrive

              or u.targetAddress<>ADT.targetAddress

              or u.c<>ADT.c

              or u.l<>ADT.l

              or u.st<>ADT.st

              or u.postalCode<>ADT.postalCode

              or u.mAPIRecipient<>ADT.mAPIRecipient

              and ADT.objectclass in ('user', 'contact')

    The Delete Syntax:

    delete from users where initials not in (select initials from users where initials in (select initials from AD_Temp))

    delete from users where facsimileTelephoneNumber not in (select facsimileTelephoneNumber from users where facsimileTelephoneNumber in (select facsimileTelephoneNumber from AD_Temp))

    delete from users where streetAddress not in (select streetAddress from users where streetAddress in (select streetAddress from AD_Temp))

    delete from users where manager not in (select manager from users where manager in (select manager from AD_Temp))

    delete from users where whenCreated not in (select whenCreated from users where whenCreated in (select whenCreated from AD_Temp))

    delete from users where DN not in (select DN from users where DN in (select DN from AD_Temp))

    delete from users where objectClass not in (select objectClass from users where objectClass in (select objectClass from AD_Temp))

    delete from users where [name] not in (select [name] from users where [name] in (select [name] from AD_Temp))

    delete from users where objectGUID not in (select objectGUID from users where objectGUID in (select objectGUID from AD_Temp))

    delete from users where [description] not in (select [description] from users where [description] in (select [description] from AD_Temp))

    delete from users where displayName not in (select displayName from users where displayName in (select displayName from AD_Temp))

    delete from users where userAccountControl not in (select userAccountControl from users where userAccountControl in (select userAccountControl from AD_Temp))

    ........AND A WHOLE LOT MORE DELETES.

    Please don't laugh too hard at how inefficient my code is. That's why I turn to you for help.

    The AuditLog Trigger works fine

    I have 4 tables and the DTS works but takes 20 minutes to run.  I would like to cut the time by half.  There are approximately 5000  - 6000 records in AD_Temp and approx 2000 in USERS

    Thank you in advance

  • This could be cut to subsecond if you want my opinion.

    Is this what's going on??

    You have a live table where the info is kept.

    You have a staging table where new/updated data is entered and then at the end of the day will be used to replace the live table data.

    You have a log table that keeps any modifications to that data.

  • Here is a wild try:

    insert into users

              (

              InsertDate

              ,ADT.Username

              ,ADT.initials

              ,ADT.facsimileTelephoneNumber

              ,ADT.streetAddress

              ,ADT.manager

              ,ADT.whenCreated

              ,ADT.uSNChanged

              ,ADT.DN

              ,ADT.objectClass

              ,ADT.[name]

              ,ADT.objectGUID

              ,ADT.[description]

              ,ADT.displayName

              ,ADT.userAccountControl

              ,ADT.sAMAccountName

              ,ADT.sn

              ,ADT.givenName

              ,ADT.memberOf

              ,ADT.homeMTA

              ,ADT.proxyAddresses

              ,ADT.homeMDB

              ,ADT.mDBUseDefaults

              ,ADT.mailNickname

              ,ADT.mail

              ,ADT.msExchHomeServerName

              ,ADT.physicalDeliveryOfficeName

              ,ADT.telephoneNumber

              ,ADT.department

              ,ADT.homeDirectory

              ,ADT.sIDHistory

              ,ADT.directReports

              ,ADT.homeDrive

              ,ADT.targetAddress

              ,ADT.c

              ,ADT.l

              ,ADT.st

              ,ADT.postalCode

              ,ADT.mAPIRecipient)

         select

              U.InsertDate

              ,U.Username

              ,U.initials

              ,U.facsimileTelephoneNumber

              ,U.streetAddress

              ,U.manager

              ,U.whenCreated

              ,U.uSNChanged

              ,U.DN

              ,U.objectClass

              ,U.[name]

              ,U.objectGUID

              ,U.[description]

              ,U.displayName

              ,U.userAccountControl

              ,U.sAMAccountName

              ,U.sn

              ,U.givenName

              ,U.memberOf

              ,U.homeMTA

              ,U.proxyAddresses

              ,U.homeMDB

              ,U.mDBUseDefaults

              ,U.mailNickname

              ,U.mail

              ,U.msExchHomeServerName

              ,U.physicalDeliveryOfficeName

              ,U.telephoneNumber

              ,U.department

              ,U.homeDirectory

              ,U.sIDHistory

              ,U.directReports

              ,U.homeDrive

              ,U.targetAddress

              ,U.c

              ,U.l

              ,U.st

              ,U.postalCode

              ,U.mAPIRecipient

         from users U

         join (select objectGUID, BINARY_CHECKSUM(*) UCS

      from  users

       ) UBCS on UBCS.objectGUID = U.objectGUID,

         join (select objectGUID, BINARY_CHECKSUM(*) ACS

      from  AD_temp where ADT.objectclass in ('user', 'contact')

       ) ABCS on ABCS.objectGUID = U.objectGUID

         where  UBCS.UCS <> ABCS.ACS

    Cheers!


    * Noel

  • That's a nice touch Noeld... but I think that the problems comes from the 50 delete statements that don't use any clustered indexes to do the deletes.. resulting in way to many table scans.

    That's why I asked him what exactly he was trying to do. I was thinking that he could simply log the modification to history using a single select/insert, then truncate and reinsert the data.

    If this is not possible then I'd suggest doing a join on the checksum as you did to find the offending rows and doing a single delete statement. Even if it requires a table scan.

  • ..... I know, I know, It will always depends on how the process is but I just posted for the fun of it

    I am positive that workflow changes are at the core of the solution. Somebody told me once:

    When you find yourself writing very convoluted/uncommon queries don't fight with the query look instead at the design

     


    * Noel

  • noeld: wise words.

    remi: you are correct: I have a live table where the info is kept.

    I have a staging table where new/updated data is entered and then at the end of the day will be used to replace the live table data.

    I have a log table that keeps any modifications to that data.

    You were thinking that he could simply log the modification to history using a single select/insert, then truncate and reinsert the data.

     

     

    I'll work with your ideas and let you know

     

    Thanks for pointing me in the right direction

  • On a side note.  You might want to double check this statement:

             ....

              or u.c<>ADT.c

              or u.l<>ADT.l

              or u.st<>ADT.st

              or u.postalCode<>ADT.postalCode

              or u.mAPIRecipient<>ADT.mAPIRecipient

              and ADT.objectclass in ('user', 'contact')

    Should that be:

             ....(

              or u.c<>ADT.c

              or u.l<>ADT.l

              or u.st<>ADT.st

              or u.postalCode<>ADT.postalCode

              or u.mAPIRecipient<>ADT.mAPIRecipient)

              and ADT.objectclass in ('user', 'contact')

    What you have (implied) is:

             ....

              or u.c<>ADT.c

              or u.l<>ADT.l

              or u.st<>ADT.st

              or u.postalCode<>ADT.postalCode

              or (u.mAPIRecipient<>ADT.mAPIRecipient

              and ADT.objectclass in ('user', 'contact') )

     

  • Nice catch Ron... I had missed that one.

  • I took everyone's advices and reworked it to baby steps

    Background:

    Table1: Temporary table that's truncated and repopulated daily by a flat file that contains new users and modified users

    Table2: Current user information

    End Result: First I delete the users that are not in T2 but are in T1. Then I populate a ##DELUSERS table with users whose information has changed by comparing each field. The I repopulate T2 with users not in the T2 but are in T1.

    I hope this is clearer.  Here's the problem:

    When I run this:

    select objectguid,objectclass,[name],telephoneNumber from users

     where [name] = 'test'

    select objectguid, objectclass,[name],telephoneNumber from AD_Temp

     where [name] = 'test' and objectclass = 'user'

    I get this:

    X'1dd58d3b74ad8b4f95dfbdf724fbd5f3' user test NULL

    X'1dd58d3b74ad8b4f95dfbdf724fbd5f3' user test 555 555 5556

    NOTE the telephone numbers are different

    When I run the following which is supposed to catch any changes to user information and delete the user so I can reinsert the user, I get "0 row(s) affected"

    --Inserts into ##DELUSERS users whose data has changed

    SELECT  u.InsertDate

     ,u.Username,u.initials,u.facsimileTelephoneNumber,u.streetAddress,u.manager,u.whenCreated

     ,u.uSNChanged,u.DN,u.objectClass,u.[name],u.objectGUID,u.[description],u.displayName

     ,u.userAccountControl,u.sAMAccountName,u.sn,u.givenName,u.memberOf,u.homeMTA

     ,u.proxyAddresses,u.homeMDB,u.mDBUseDefaults,u.mailNickname,u.mail,u.msExchHomeServerName

     ,u.physicalDeliveryOfficeName,u.telephoneNumber,u.department,u.homeDirectory

     ,u.sIDHistory,u.directReports,u.homeDrive,u.targetAddress,u.c,u.l,u.st,u.postalCode

     ,u.mAPIRecipient

    INTO ##DELUSERS

     FROM USERS u

      join AD_temp ADT

      on u.objectguid = adt.objectguid

      where U.username <> ADT.username

       or u.[initials] <> adt.[initials] or u.[facsimileTelephoneNumber] <> adt.[facsimileTelephoneNumber]

       or u.[streetAddress] <> adt.[streetAddress] or u.[manager] <> adt.[manager]

       or u.[whenCreated] <> adt.[whenCreated] or u.[DN] <> adt.[DN] or u.[objectClass] <> adt.[objectClass]

       or u.[name] <> adt.[name] or u.[description] <> adt.[description] or u.[displayName] <> adt.[displayName]

       or u.[userAccountControl] <> adt.[userAccountControl] or u.[sAMAccountName] <> adt.[sAMAccountName]

       or u.[sn] <> adt.[sn] or u.[givenName] <> adt.[givenName] or u.[memberOf] <> adt.[memberOf]

       or u.[homeMTA] <> adt.[homeMTA] or u.[proxyAddresses] <> adt.[proxyAddresses]

       or u.[homeMDB] <> adt.[homeMDB] or u.[mDBUseDefaults] <> adt.[mDBUseDefaults]

       or u.[mailNickname] <> adt.[mailNickname] or u.[mail] <> adt.[mail]

       or u.[msExchHomeServerName] <> adt.[msExchHomeServerName]

       or u.[physicalDeliveryOfficeName] <> adt.[physicalDeliveryOfficeName]

       or u.[telephoneNumber] <> adt.[telephoneNumber] or u.[department] <> adt.[department]

       or u.[homeDirectory] <> adt.[homeDirectory] or u.[directReports] <> adt.[directReports]

       or u.[homeDrive] <> adt.[homeDrive] or u.[targetAddress] <> adt.[targetAddress]

       or u.[c] <> adt.[c] or u.[l] <> adt.[l] or u.[st] <> adt.[st] or u.[postalCode] <> adt.[postalCode]

       or u.[mAPIRecipient] <> adt.[mAPIRecipient]

     

    Please help again. THANKS

  •           ......

              or u.c<>ADT.c

              or u.l<>ADT.l

              or u.st<>ADT.st

              or u.postalCode<>ADT.postalCode

              or u.mAPIRecipient<>ADT.mAPIRecipient

    will be times faster if it will be

              NOT (.....

              and u.c = ADT.c

              and u.l = ADT.l

              and u.st = ADT.st

              and u.postalCode = ADT.postalCode

              and u.mAPIRecipient = ADT.mAPIRecipient)

    _____________
    Code for TallyGenerator

  • To compare data in nullable columns everytime use such construction:

    isnull(u.telephone, 'NULL') = isnull(ADT.telephone, 'NULL')

    I use string 'NULL' because there are definitely no telephones like 'NULL'.

    For other columns if you are not sure about value 'NULL' use kind of 'There is no such fu... value here!' as second argument for isnull.

    _____________
    Code for TallyGenerator

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply