September 4, 2006 at 8:37 am
Hi All,
I normally isnert into a table, which has the code below, but i have found out that the speed has been degrading recently, for example to isnert 30000 records can take between 15 - 60 minutes or sometimes even more.
In case you wondered why i have got an gnore duplicate key, its because we are trying to rpevent duplicates from being inserted into the table, although the other way to going round this is doing a left join whilst inserting, not sure if this approach is any good, but i look forward to replies.
CREATE TABLE [dbo].[COMPANYPROFILE] (
[PersonURN_id] [int] NOT NULL ,
[PPFingerPrint] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[EventType] [int] NOT NULL ,
[ReqSeqNum] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[StartDate] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[StartTime] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[UserProvCPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RouteIg] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RouteOg] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CalledPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CallingPN] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ElapsedTime] [int] NULL ,
[ContCalledNum] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CallRedirInd] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CauseOutput] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CalledPNType] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[TelServInd] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SwitchName] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[TypeOfCall] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[NTSService] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IDACode] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
)
CREATE UNIQUE INDEX [UNQ_COMPANYPROFILE] ON [dbo].[COMPANYPROFILE](PersonURN_id,PPFingerPrint,EventType ) WITH fillfactor = 20 IGNORE_DUP_KEY
BULK INSERT INTO #PROFILETEMP
CREATE CLUSTERED INDEX ON #PROFILETEMP KEYS = (PersonURN_id,PPFingerPrint,EventType )
INSERT COMPANYPROFILE(
PersonURN_id,
PPFingerPrint,
EventType,
ReqSeqNum,
StartDate,
StartTime,
UserProvCPN,
RouteIg,
RouteOg,
CalledPN,
CallingPN,
ElapsedTime,
ContCalledNum,
CallRedirInd,
CauseOutput,
CalledPNType,
TelServInd,
SwitchName,
TypeOfCall,
NTSService,
IDACode,
ModBNumber )
SELECT
p.PersonURN_id,
p.PPFingerPrint,
p.EventType,
p.ReqSeqNum,
p.StartDate,
p.StartTime,
p.UserProvCPN,
p.RouteIg,
p.RouteOg,
p.CalledPN,
p.CallingPN,
p.ElapsedTime,
p.ContCalledNum,
p.CallRedirInd,
p.CauseOutput,
p.CalledPNType,
p.TelServInd,
p.SwitchName,
p.TypeOfCall,
p.NTSService,
p.IDACode,
From #PROFILETEMP P
Where p.PersonURN_id = 1003649 and RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'
September 4, 2006 at 8:43 am
All I can spot is this :
RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'
That forces the server to do a table/index scan instead of doing a seek. If the table is always growing larger, then that query will always take more and more time to run. Try to use between or >= and < to check for the date and avoid doing any operation on SeizeDate.
September 4, 2006 at 8:58 am
anytime i done a test run to see the execution plan, I does an index seek/Clsutered index seek.
The reason why: RTRIM(Convert(Char(10), p.SeizeDate , 112 )) = '20060830'
is there is because #PROFILETEMP can possibly contain data for other days, and we want to be sure that we are only selecting the particular date i.e todays date.
#PROFILETEMP contains Raw information, so we cant just insert everything into the table. We have to do some filtering....
How about doing a left join on the table, I read somewhere that the order in which the data is being inserted might also have an affect, as I havnt got an increasing key like a Clustered Indentity Column, i think it means that inserts are done in a random manner instead of inserting data to the end of the table.
September 4, 2006 at 9:15 am
Why not just compare to a date instead of converting to a char?
September 4, 2006 at 9:47 am
On the #ProfileTemp table, the date is likely to be all thesame, but at thesame time, converting to a char means that all dates will be covered.
eg: comparing 2006/09/04 to a table that could have 2006:09:04 10:00 etc, once this has been converted it would simply be 2006/09/04.
September 4, 2006 at 10:00 am
That's an easy fix :
Where Date >= '2006/09/04' and Date < '2006/09/05'
Can you run that query and send us the execution plan??? maybe we'll figure out what's taking so long...
September 4, 2006 at 10:07 am
September 4, 2006 at 10:13 am
You can generate a text version of the plan or you have to take the print screen image and put it up on a webserver so that we can see it. but I'd preffer the text version.
September 4, 2006 at 10:35 am
I hahve tried print screen, not working with the server, i really wanted to show you that nsert that took 1.30 hours.
But I just cant get the execution plan out, Looking at my CPU usage, its not high at all, but i do feel that since the Personprofile tablke has about 14Million records in it, for every insert, its trying to order it.
From what i understant, 14 million rols already in the table, it only has a Unique non Clustered index, whilst its trying to insert, instead of trying to insert at the end of the table, I think its trying to insert just anywhere.
As the table hasnt got an incremental Clustered key eg Identity Column, I dont think the inserts are ordered, I just need to underdstant what i need to do to stop this problem.
Inserts on tables with smaller number of records is quite quick using thesame code, but ones with millions of records, it starts to take time.
September 4, 2006 at 11:17 am
I'm sure it can be a problem... but without seeing the data (both inserted and in the table) and the plan it's hard to make an accurate diagnosis.
Here's how to produce the plan in text mode :
SET
SHOWPLAN_ALL ON
GO
--Query here
Select
* from sys.SysObjects
GO
SET
SHOWPLAN_ALL OFF
GO
This will not execute the query... only create the actual plan.
September 4, 2006 at 11:37 am
September 4, 2006 at 11:40 am
I need to know way more about the process and the data before deciding on indexes and keys... However it is considered a best practice to have a clustered index, or actually a worst practice not to have any (except in very few rare cases).
September 4, 2006 at 11:46 am
September 4, 2006 at 11:49 am
I'm not under any deadline .
September 4, 2006 at 12:12 pm
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Assert(WHEREIf ([Expr1003]<'Aug 30 2006 12:00AM' OR [Expr1003]>='Aug 31 2006 12:00AM') then 0 else NULL))
|--Table Insert(OBJECT[DB].[dbo].[CompanyProfile]), SET[CompanyProfile].[Switch_id]=[P].[Switch_id], [CompanyProfile].[TelephoneIndex_id]=[P].[TelephoneIndex_id], [CompanyProfile].[Location_id]=[P].[CULocat
|--Top(1)
|--Segment
|--Sort(ORDER BY[Expr1002] ASC, [P].[EventType] ASC, [Expr1003] ASC))
|--Assert(WHEREIf ([Expr1007] IS NOT NULL) then 0 else NULL))
|--Nested Loops(Left Semi Join, OUTER REFERENCES[P].[EventType], [Expr1002], [Expr1003]), DEFINE[Expr1007] = [PROBE VALUE]))
|--Top(ROWCOUNT est 0)
| |--Compute Scalar(DEFINE[Expr1002]=Convert([P].[CDRFingerPrint]), [Expr1003]=Convert([P].[SeizeDate])))
| |--Filter(WHERE[ToInsertTable].[SeizeDate]=NULL))
| |--Nested Loops(Left Outer Join, OUTER REFERENCES[P].[CDRFingerPrint], [P].[EventType], [P].[SeizeDate]))
| |--Clustered Index Seek(OBJECT[tempdb].[dbo].[#CDRDataProcesstest_________________________________________________________________________________________________0000000281F6].[IDX_#CDRDATAPROCESS]
| |--Filter(WHERESTARTUP EXPR([P].[SeizeDate]<'Aug 24 2006 12:00AM' AND [P].[SeizeDate]>='Aug 23 2006 12:00AM')))
| |--Clustered Index Seek(OBJECT[DB].[dbo].[CDRRawDataFeed20060823].[PK_CDRRawDataFeed20060823] AS [ToInsertTable]), SEEK[ToInsertTable].[SeizeDate]=[P].[SeizeDate] AND [ToInsertTable].
|--Row Count Spool
|--Filter(WHERESTARTUP EXPR([Expr1003]<'Aug 31 2006 12:00AM' AND [Expr1003]>='Aug 30 2006 12:00AM')))
|--Index Seek(OBJECT[DB].[dbo].[CompanyProfile].[IGD_CompanyProfile]), SEEK[CompanyProfile].[PersonURN_id]=[Expr1002] AND [CompanyProfile].[EventType]=[P].[EventTyp
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply