Data Insertion very slow (91k rows)

  • Hi,

    I have a got two tables and the relation is one to many - meaning a single entry in the main table will have exact 12 records in the secondary table.

    Data to these tables are fed from external CSV file which is uploaded from web application. Just to clarify, the slowness is not due to file upload. File upload is completed instantly.

    Total Number of Rows I have been testing is 7,000 records in main table and 7000 * 12 = 84000 records in secondary table.

    The loading process creates a dynamic insert statements and are passed to stored procedure to execute in a batch of 1000 records at a time.

    SQL Server is taking around more than 40 minutes to insert 7,000 + 84,000 = 91,000 records which not acceptable.

    Could someone please hint me how can I make the date insertion process faster?

    Thanks a lot.

    Milan

  • Please post schema of the tables involved and the code being used to do the insertions, and perhaps someone can help you.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • milan (2/9/2009)


    Hi,

    I have a got two tables and the relation is one to many - meaning a single entry in the main table will have exact 12 records in the secondary table.

    Data to these tables are fed from external CSV file which is uploaded from web application. Just to clarify, the slowness is not due to file upload. File upload is completed instantly.

    Total Number of Rows I have been testing is 7,000 records in main table and 7000 * 12 = 84000 records in secondary table.

    The loading process creates a dynamic insert statements and are passed to stored procedure to execute in a batch of 1000 records at a time.

    SQL Server is taking around more than 40 minutes to insert 7,000 + 84,000 = 91,000 records which not acceptable.

    Could someone please hint me how can I make the date insertion process faster?

    Thanks a lot.

    Milan

    Can you use bcp or BULK INSERT ?


    * Noel

  • The table scripts for Main table and secondary table are:

    CREATE TABLE [dbo].[TblBudget_Exception](

    [BudgetExceptionID] [int] IDENTITY(1,1) NOT NULL,

    [BudgetBatch] [int] NOT NULL,

    [Budget_DATETIME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_Client] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_Business] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_Division] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_Brand] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_BusinessPillar] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_BusinessPillar2] [varbinary](50) NULL,

    [Budget_BusinessPillar3] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [Budget_Market] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [MarketingPlanNumber] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [MarketingPlanName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [PlanStatus] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [CampaignType] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [CampaignTypeGroup] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [CostElement] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [ConsumerActivityID] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [ConsumerActivityName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [EntryDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [PlanStartDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [PlanEndDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [RevisionDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [CurrencyCode] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetUser] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [ExceptionNotes] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [HOLD] [bit] NOT NULL CONSTRAINT [DF__TblBudget___HOLD__2EB2A197] DEFAULT ((0)),

    [Record_RevNo] [int] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2942188C] DEFAULT ((1)),

    [Record_Entered] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2A363CC5] DEFAULT (getdate()),

    [Record_User] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF__TblBudget__Recor__2B2A60FE] DEFAULT (suser_sname()),

    [Record_Updated] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2C1E8537] DEFAULT (getdate()),

    CONSTRAINT [PK__TblBudget_Except__2DBE7D5E] PRIMARY KEY CLUSTERED

    (

    [BudgetExceptionID] ASC

    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    CREATE TABLE [dbo].[TblBudgetMonth_Exceptions](

    [BudgetMonthExceptionID] [int] IDENTITY(1,1) NOT NULL,

    [BudgetBatch] [int] NULL,

    [Budget] [int] NOT NULL,

    [BudgetMonth] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetYear] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetInitial] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetRevised] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetCommitted] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [BudgetActual] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [HOLD] [bit] NOT NULL CONSTRAINT [DF_TblBudgetMonth_Exceptions_HOLD] DEFAULT ((0)),

    [Record_RevNo] [int] NOT NULL CONSTRAINT [DF__TblBudget__Recor__355F9F26] DEFAULT ((1)),

    [Record_Entered] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__3653C35F] DEFAULT (getdate()),

    [Record_User] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF__TblBudget__Recor__3747E798] DEFAULT (suser_sname()),

    [Record_Updated] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__383C0BD1] DEFAULT (getdate())

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    USE [NBSFairfield2005]

    GO

    ALTER TABLE [dbo].[TblBudgetMonth_Exceptions] WITH CHECK ADD CONSTRAINT [FK_TblBudgetMonth_Exceptions_tblDataBatch] FOREIGN KEY([BudgetBatch])

    REFERENCES [dbo].[tblDataBatch] ([BatchID])

    GO

    ALTER TABLE [dbo].[TblBudgetMonth_Exceptions] WITH CHECK ADD CONSTRAINT [fkBudgetMonthsExBudget] FOREIGN KEY([Budget])

    REFERENCES [dbo].[TblBudget_Exception] ([BudgetExceptionID])

    INSERT statements are generated from the asp.net code and passed to a stored procedure which simply executes it using EXEC (@InsertStatement)

    Thanks.

    Milan G

  • I don't see anything odd about your schema at first glance.

    Do you have indexes on the columns used as foreign keys?

    Does the asp.net code generate an INSERT statement for each row to be inserted, or does it generate a single INSERT statement for the entire set of rows to be inserted? If it is generating one INSERT per row, that's a problem.

    Please paste the code that is being passed to SQL to be executed.

    To speed things up in the future, please follow the guidelines in these two articles. Doing this will give you a much better chance of getting a correct answer quickly. 🙂

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspx

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hi,

    Yes asp.net code creates a insert statement for each row. It seems creating a single insert statement using UNION-all does not affect much in the performance.

    Please refer to the article

    http://blog.sqlauthority.com/2007/06/08/sql-server-insert-multiple-records-using-one-insert-statement-use-of-union-all/

    Thanks for the hints though.

    Cheers,

    Milan G

  • So are we ever going to see the mysterious INSERT statement or not? 😉

    I'll read the article, but the fact remains that SOMETHING is dragging your performance down and we have the blind leading the blind here. Still looking for more information ... if it is generating 10,000 UNION ALLS, just show the first 10 or so.

    Other thoughts:

    Repeating an earlier question, are there indexes to support the foreign keys?

    Have you run a test to see how long it takes to do 10 inserts using the generated code?

    What does the execution plan look like?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hi,

    Many thanks for showing interest...

    Here is the codesnippet that generates the statement...

    budgetDataList is a List of custom class ..basically used to stored all the values related to main table...and the AddBudgetMonthData function returns 12 insert statement for each record in the main list which is budgetDataList.

    Cheers.

    Milan G

    ..............................

    'Dynamic SQL Insert statements - Batch of 1000 @ a time

    Dim budgetExpSQL As String = ""

    Dim identitySQL As String = ""

    Dim bulkSQL As New StringBuilder()

    Dim monthSQL As String

    Dim counter As Integer = 0

    For Each budget In budgetDataList

    budgetExpSQL = "INSERT INTO TblBudget_Exception"

    budgetExpSQL = budgetExpSQL + ("( [BudgetBatch], [Budget_DATETIME], [Budget_Client],[Budget_Business],")

    budgetExpSQL = budgetExpSQL + ("[Budget_Division], [Budget_Brand], [Budget_BusinessPillar], [Budget_Market],")

    budgetExpSQL = budgetExpSQL + ("[MarketingPlanNumber],[MarketingPlanName], [PlanStatus],[CampaignType],")

    budgetExpSQL = budgetExpSQL + ("[CampaignTypeGroup], [CostElement], [ConsumerActivityID], [ConsumerActivityName],")

    budgetExpSQL = budgetExpSQL + (" [EntryDate], [RevisionDate], [PlanEndDate], [PlanStartDate], [CurrencyCode],")

    budgetExpSQL = budgetExpSQL + (" [BudgetUser] )")

    budgetExpSQL = budgetExpSQL + (" VALUES ")

    budgetExpSQL = budgetExpSQL + (" ( ")

    budgetExpSQL = budgetExpSQL + (budget.BatchID & ",'" & budget.FileCreatedOn & "','" & budget.Client & "','" & budget.Business & "',")

    budgetExpSQL = budgetExpSQL + ("'" & budget.Category & "','" & budget.Brand & "','" & budget.BusinessPillar & "','" & budget.MarketName & "',")

    budgetExpSQL = budgetExpSQL + ("'" & budget.MarketingPlanNumber & "','" & budget.MarketingPlanName & "','" & budget.PlanStatus & "','" & budget.CampaignType & "',")

    budgetExpSQL = budgetExpSQL + ("'" & budget.CampaignTypeGroup & "','" & budget.CostElement & "','" & budget.ConsumerActivityID & "','" & budget.ConsumerActivityDesc & "',")

    budgetExpSQL = budgetExpSQL + ("'" & budget.CreatedOn & "','" & budget.RevisionDate & "','" & budget.PlanEndDate & "','" & budget.PlanStartDate & "','" & budget.Currency & "',")

    budgetExpSQL = budgetExpSQL + ("'" & budget.Username & "'")

    budgetExpSQL = budgetExpSQL + (" ); ")

    If counter = 0 Then

    '@expID needs to be declared only once...

    identitySQL = "DECLARE @expID INT;"

    identitySQL = identitySQL + "SET @expID = SCOPE_IDENTITY();"

    Else

    'Multiple...

    identitySQL = "SET @expID = SCOPE_IDENTITY();"

    End If

    monthSQL = AddBudgetMonthsData(batchID, "@expID", budget.InitialPlanList, budget.RevisedPlanList, _

    budget.CommittedCostList, budget.ActualCostList)

    bulkSQL.Append(budgetExpSQL + identitySQL + monthSQL.ToString())

    'reset

    budgetExpSQL = ""

    If counter = 1000 Then

    dbDataBatch.InsertBulkBudgetExceptions(bulkSQL.ToString())

    bulkSQL.Remove(0, bulkSQL.Length)

    counter = 0

    Else

    counter = counter + 1

    End If

    Next

    dbDataBatch.InsertBulkBudgetExceptions(bulkSQL.ToString())

    .....................................

    Private Function AddBudgetMonthsData(ByVal batchID As Integer, ByVal budgetExceptionID As String, _

    ByVal initialPlanList As List(Of BudgetPlanCost), ByVal revisedPlanList As List(Of BudgetPlanCost), _

    ByVal committedCostList As List(Of BudgetPlanCost), ByVal actualCostList As List(Of BudgetPlanCost) _

    ) As String

    'Insert record into TblBudgetMonth_Exceptions

    'Assuming Number of months will be same for Initial, Revised, Committed and Actual Costs..

    Dim monthsCount As Integer

    monthsCount = initialPlanList.Count

    Dim planInitial As Double = 0

    Dim revisedPlan As Double = 0

    Dim committedCost As Double = 0

    Dim actualCost As Double = 0

    Dim month As Integer

    Dim year As Integer

    Dim i As Integer

    Dim budgetMonthExp As New StringBuilder()

    Dim dbDataBatch As New DBDataBatch()

    For i = 0 To monthsCount - 1

    month = initialPlanList(i).Month 'applies to all other columns as well

    year = initialPlanList(i).Year ' applies to all other columns as well

    planInitial = initialPlanList(i).Value

    revisedPlan = revisedPlanList(i).Value

    'This is for partner biz file - which does have have committedCost and actualCost

    If committedCostList.Count > 0 And committedCostList.Count >= i Then

    committedCost = committedCostList(i).Value

    Else

    committedCost = 0

    End If

    If actualCostList.Count > 0 And actualCostList.Count >= i Then

    actualCost = actualCostList(i).Value

    Else

    actualCost = 0

    End If

    'dbDataBatch.InsertBudgetMonthExceptionsV2(batchID, budgetExceptionID, month, year, planInitial, revisedPlan, committedCost, actualCost)

    budgetMonthExp.Append("INSERT INTO TblBudgetMonth_Exceptions")

    budgetMonthExp.Append("( BudgetBatch,Budget, BudgetMonth, BudgetYear, ")

    budgetMonthExp.Append("BudgetInitial, BudgetRevised, BudgetCommitted, BudgetActual ) ")

    budgetMonthExp.Append(" VALUES ")

    budgetMonthExp.Append("( ")

    budgetMonthExp.Append(batchID & "," & budgetExceptionID & "," & month & "," & year & ",")

    budgetMonthExp.Append(planInitial & "," & revisedPlan & "," & committedCost & "," & actualCost)

    budgetMonthExp.Append(");")

    Next

    Return budgetMonthExp.ToString()

    End Function

  • Milan... I don't see a UNION ALL.

    It doesn't matter if all the lines are being generated by the user interface at one time or not, there is an enormous performance difference between these two examples, which is the point of the article you referred to.

    ---------------------------------------------------------------

    INSERT INTO SomeTable -- this is a single insert statement

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    .

    .

    .

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z UNION ALL

    select x,y,z

    --------------------------------------------------------------------------------------

    INSERT INTO SomeTable VALUES(x,y,z) -- these are all individual insert statements

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    INSERT INTO SomeTable VALUES(x,y,z)

    .

    .

    .

    Why are you having the the UI generate so many statements? If you are just reading a file and inserting into a table, there are faster (and maybe easier) ways to load data into SQL.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • [font="Verdana"]Have you tried getting your application to generate two files (one for parent table, one for child table), and then using bulk insert to load the files?[/font]

  • Bruce, I would guess not, since noeld asked that question earlier. But the question bears repeating.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hi,

    In reply to Union All statements,

    Generated statement would look like

    1. First cycle...

    INSERT INTO tblBudgetException

    SELECT d1,d2,d3

    INSERT INTO tblBudgetMonth_Exception

    --I need the id of the just entered record to insert into secondary table..

    -- therefore, it would like SET @x = scope_identity()

    1. Select @x,d2,d3

    2. Select @x,d2,d3

    3. Select @x,d2,d3

    4. Select @x,d2,d3

    5. Select @x,d2,d3

    6. Select @x,d2,d3

    7. Select @x,d2,d3

    8 .Select @x,d2,d3

    9. Select @x,d2,d3

    10.Select @x,d2,d3

    11 .Select @x,d2,d3

    12. Select SCOPE_Identity(),d2,d2

    2. Second cycle...

    INSERT INTO tblBudgetException

    SELECT d1,d2,d3

    INSERT INTO tblBudgetMonth_Exception

    --I need the id of the just entered record to insert into secondary table..

    -- therefore, it would like SET @x = scope_identity()

    1. Select @x,d2,d3

    2. Select @x,d2,d3

    3. Select @x,d2,d3

    4. Select @x,d2,d3

    5. Select @x,d2,d3

    6. Select @x,d2,d3

    7. Select @x,d2,d3

    8 .Select @x,d2,d3

    9. Select @x,d2,d3

    10.Select @x,d2,d3

    11 .Select @x,d2,d3

    12. Select SCOPE_Identity(),d2,d2

    ...it goes until the last one ---7,000 cycles..

    I don't think in the case it will help a lot..

    Yeah, I have thought of using SSIS package to load the data in to some table.. and work thr' it...but then it's not a one time job..it has to be done every now and then....users upload new files with into the system every often.

    Thanks.

    Milan G

  • [font="Verdana"]Two thoughts:

    You could develop an SSIS package, and your application could ensure that the files that the users upload are in the right place, with the right name, and then start the SSIS package to do the uploading.

    Or if you particularly need to parse the input, and you're not comfortable doing that within an SSIS package (for which I don't blame you: I don't recommend an environment that requires you to be fluent in three programming languages at once!) then just get your own code to generate two new files (in the right place with the right names), and then call a stored procedure to bulk insert the files.

    However you approach it, generating SQL statements to insert bulk data is going to be slow. SSIS and bulk insert can also avoid most of the overhead of transaction logging.

    Edited to add:

    Rather than using scope_identity() per row, have you had a look at the output clause of the insert statement? You can use this to return a set of the new identity values, and then insert the corresponding set of child records. Quicker than doing it row by row.

    [/font]

  • Hi Bruce,

    When I saw your first post related to the problem, I thought of creating a delimited text file and use BULK INSERT to dump data into the relevant tables...But it won't work..why?

    Because to insert records into secondary table..I need to have to have the ID (IDENTITY Column) of just inserted row from Primary table. So With out knowing the ID..I can't insert record into secondary table. To get the ID I have to insert the record into the table..

    Thanks.

    Milan G

  • 1. Bruce is absolutely right. If possible, rethink this whole idea of writing the code on the fly in your UI. That approach will ALWAYS run slower.

    But I am going to continue with your problem, so that you will understand something of why it is slower. If you are determined to generate SQL code from your UI, you need to understand what your SQL code is doing.

    2. Borrowing from your code

    -- one insert, that has overhead associated with launching an insert as well as overhead associated with logging

    INSERT INTO tblBudgetException

    SELECT d1,d2,d3

    -- followed by insert to secondary table, which also incurs overhead and gets logged

    INSERT INTO tblBudgetMonth_Exception

    --I need the id of the just entered record to insert into secondary table..

    -- therefore, it would like SET @x = scope_identity()

    1. Select @x,d2,d3

    2. Select @x,d2,d3

    3. Select @x,d2,d3

    4. Select @x,d2,d3

    5. Select @x,d2,d3

    As I read the second block of code, it would only insert one row, because of the absence of UNION ALL following each SELECT LINE. (But I see you putting line numbers in front, so I'll assume it's just a quickly typed example which is unexecutable instead of the actual code.)

    Lets assume you were to put the UNION ALLS into your code. Instead of doing a single insert of 7000 rows to one table, and a single insert of 84000 rows to the second table, you are doing 7000 separate inserts to your base table, followed by 7000 inserts (of 12 rows apiece) to your secondary table. That's 14,000 INSERTS, instead of two. A certain amount of overhead is incurred every time an SQL statement is launched. SQL is blindingly fast at delivering large sets of data with a single statement, but you cripple it when you start to handle one row at a time. You will often find this discussed in terms of "set-based" programming as contrasted to "procedural" programming.

    What you want is to insert 7000 rows into your primary table followed by 84000 rows into your secondary table. If you will read up on the OUTPUT clause to the INSERT statement, you will see that you can catch the id of each row inserted into the primary table in a temporary table. You can then use that table as the basis for inserting rows into the secondary table. You can "explode" the single row into twelve rows using a CROSS JOIN, if necessary. Then you are only doing two inserts and it will run faster.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply