Slow Insert Performance ! See DDL Code

  • Hi All,

    I normally isnert into a table, which has the code below, but i have found out that the speed has been degrading recently, for example to isnert 30000 records can take between 15 - 60 minutes or sometimes even more.

     

    In case you wondered why i have got an gnore duplicate key, its because we are trying to rpevent duplicates from being inserted into the table, although the other way to going round this is doing a left join whilst inserting, not sure if this approach is any good, but i look forward to replies.

     

     

     

    CREATE TABLE [dbo].[COMPANYPROFILE] (

     [PersonURN_id] [int] NOT NULL ,

     [PPFingerPrint] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

     [EventType] [int] NOT NULL ,

     [ReqSeqNum] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [StartDate] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [StartTime] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [UserProvCPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [RouteIg] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [RouteOg] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [CalledPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [CallingPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [ElapsedTime] [int] NULL ,

     [ContCalledNum] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [CallRedirInd] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [CauseOutput] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [CalledPNType] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [TelServInd] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [SwitchName] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [TypeOfCall] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [NTSService] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [IDACode] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL

    )

     CREATE  UNIQUE  INDEX [UNQ_COMPANYPROFILE] ON [dbo].[COMPANYPROFILE](PersonURN_id,PPFingerPrint,EventType ) WITH  fillfactor = 20 IGNORE_DUP_KEY 

    BULK INSERT INTO #PROFILETEMP

    CREATE CLUSTERED INDEX ON  #PROFILETEMP KEYS =  (PersonURN_id,PPFingerPrint,EventType )

     

    INSERT COMPANYPROFILE(

       PersonURN_id,

       PPFingerPrint,

       EventType,

       ReqSeqNum,

       StartDate,

       StartTime,

       UserProvCPN,

       RouteIg,

       RouteOg,

       CalledPN,

       CallingPN,

       ElapsedTime,

       ContCalledNum,

       CallRedirInd, 

       CauseOutput,

       CalledPNType,

       TelServInd,

       SwitchName,

       TypeOfCall,

       NTSService,

       IDACode,

             ModBNumber )

     SELECT

       p.PersonURN_id,

       p.PPFingerPrint,

       p.EventType,

       p.ReqSeqNum,

       p.StartDate,

       p.StartTime,

       p.UserProvCPN,

       p.RouteIg,

       p.RouteOg,

       p.CalledPN,

       p.CallingPN,

       p.ElapsedTime,

       p.ContCalledNum,

       p.CallRedirInd,

       p.CauseOutput,

       p.CalledPNType,

       p.TelServInd,

       p.SwitchName,

       p.TypeOfCall,

       p.NTSService,

       p.IDACode,   

     From #PROFILETEMP P

     Where p.PersonURN_id = 1003649    and RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'

     


    Kindest Regards,

    John Burchel (Trainee Developer)

  • All I can spot is this :

    RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'

    That forces the server to do a table/index scan instead of doing a seek.  If the table is always growing larger, then that query will always take more and more time to run.  Try to use between or >= and < to check for the date and avoid doing any operation on SeizeDate.

  • anytime i done a test run to see the execution plan, I does an index seek/Clsutered index seek.

    The reason why: RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'

    is there is because #PROFILETEMP can possibly contain data for other days, and we want to be sure that we are only selecting the particular date i.e todays date.

    #PROFILETEMP contains Raw information, so we cant just insert everything into the table. We have to do some filtering....

     

    How about doing a left join on the table, I read somewhere that the order in which the data is being inserted might also have an affect, as I havnt got an increasing key like a Clustered Indentity Column, i think it means that inserts are done in a random manner instead of inserting data to the end of the table.

     

     


    Kindest Regards,

    John Burchel (Trainee Developer)

  • Why not just compare to a date instead of converting to a char?

  • On the #ProfileTemp table, the date is likely to be all thesame, but at thesame time, converting to a char means that all dates will be covered.

    eg: comparing 2006/09/04 to a table that could have 2006:09:04 10:00  etc, once this has been converted it would simply be 2006/09/04.


    Kindest Regards,

    John Burchel (Trainee Developer)

  • That's an easy fix :

    Where Date >= '2006/09/04' and Date < '2006/09/05'

     

    Can you run that query and send us the execution plan??? maybe we'll figure out what's taking so long...

  • The last insert which I monitored via QA, its shown here graphically, not sure how i can send it.

    As the code is withing a Cursor, I cannot use the code you sent me because I cannot tell what the date in the #PROFILE table will look like.

     


    Kindest Regards,

    John Burchel (Trainee Developer)

  • You can generate a text version of the plan or you have to take the print screen image and put it up on a webserver so that we can see it.  but I'd preffer the text version.

  • I hahve tried print screen, not working with the server, i really wanted to show you that nsert that took 1.30 hours.

    But I just cant get the execution plan out, Looking at my CPU usage, its not high at all, but i do feel that since the Personprofile tablke has about 14Million records in it, for every insert, its trying to order it.

    From what i understant, 14 million rols already in the table, it only has a Unique non Clustered index, whilst its trying to insert, instead of trying to insert at the end of the table, I think its trying to insert just anywhere.

    As the table hasnt got an incremental Clustered key eg Identity Column, I dont think the inserts are ordered, I just need to underdstant what i need to do to stop this problem.

    Inserts on tables with smaller number of records is quite quick using thesame code, but ones with millions of records, it starts to take time.

     

     

     

     


    Kindest Regards,

    John Burchel (Trainee Developer)

  • I'm sure it can be a problem... but without seeing the data (both inserted and in the table) and the plan it's hard to make an accurate diagnosis.

    Here's how to produce the plan in text mode :

     

    SET

    SHOWPLAN_ALL ON

    GO

    --Query here

    Select

    * from sys.SysObjects

    GO

    SET

    SHOWPLAN_ALL OFF

    GO

     

    This will not execute the query... only create the actual plan.

  • what what do you think about the indexes on the Ignore Dupe keys, I know thats where the problem lies, I am not sure if I am using the right indexes.

    Is it meant to be clustered or non, I will work to geting you the execution plan.

    Thanks


    Kindest Regards,

    John Burchel (Trainee Developer)

  • I need to know way more about the process and the data before deciding on indexes and keys... However it is considered a best practice to have a clustered index, or actually a worst practice not to have any (except in very few rare cases).

  • I will get the execution plan out to you in a minute. please bear with me


    Kindest Regards,

    John Burchel (Trainee Developer)

  • I'm not under any deadline .

  • ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

      |--Assert(WHEREIf ([Expr1003]<'Aug 30 2006 12:00AM' OR [Expr1003]>='Aug 31 2006 12:00AM') then 0 else NULL))

           |--Table Insert(OBJECT[DB].[dbo].[CompanyProfile]), SET[CompanyProfile].[Switch_id]=[P].[Switch_id], [CompanyProfile].[TelephoneIndex_id]=[P].[TelephoneIndex_id], [CompanyProfile].[Location_id]=[P].[CULocat

                |--Top(1)

                     |--Segment

                          |--Sort(ORDER BY[Expr1002] ASC, [P].[EventType] ASC, [Expr1003] ASC))

                               |--Assert(WHEREIf ([Expr1007] IS NOT NULL) then 0 else NULL))

                                    |--Nested Loops(Left Semi Join, OUTER REFERENCES[P].[EventType], [Expr1002], [Expr1003]), DEFINE[Expr1007] = [PROBE VALUE]))

                                         |--Top(ROWCOUNT est 0)

                                         |    |--Compute Scalar(DEFINE[Expr1002]=Convert([P].[CDRFingerPrint]), [Expr1003]=Convert([P].[SeizeDate])))

                                         |         |--Filter(WHERE[ToInsertTable].[SeizeDate]=NULL))

                                         |              |--Nested Loops(Left Outer Join, OUTER REFERENCES[P].[CDRFingerPrint], [P].[EventType], [P].[SeizeDate]))

                                         |                   |--Clustered Index Seek(OBJECT[tempdb].[dbo].[#CDRDataProcesstest_________________________________________________________________________________________________0000000281F6].[IDX_#CDRDATAPROCESS]

                                         |                   |--Filter(WHERESTARTUP EXPR([P].[SeizeDate]<'Aug 24 2006 12:00AM' AND [P].[SeizeDate]>='Aug 23 2006 12:00AM')))

                                         |                        |--Clustered Index Seek(OBJECT[DB].[dbo].[CDRRawDataFeed20060823].[PK_CDRRawDataFeed20060823] AS [ToInsertTable]), SEEK[ToInsertTable].[SeizeDate]=[P].[SeizeDate] AND [ToInsertTable].

                                         |--Row Count Spool

                                              |--Filter(WHERESTARTUP EXPR([Expr1003]<'Aug 31 2006 12:00AM' AND [Expr1003]>='Aug 30 2006 12:00AM')))

                                                   |--Index Seek(OBJECT[DB].[dbo].[CompanyProfile].[IGD_CompanyProfile]), SEEK[CompanyProfile].[PersonURN_id]=[Expr1002] AND [CompanyProfile].[EventType]=[P].[EventTyp

     


    Kindest Regards,

    John Burchel (Trainee Developer)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply