February 9, 2009 at 10:37 am
Hi,
I have a got two tables and the relation is one to many - meaning a single entry in the main table will have exact 12 records in the secondary table.
Data to these tables are fed from external CSV file which is uploaded from web application. Just to clarify, the slowness is not due to file upload. File upload is completed instantly.
Total Number of Rows I have been testing is 7,000 records in main table and 7000 * 12 = 84000 records in secondary table.
The loading process creates a dynamic insert statements and are passed to stored procedure to execute in a batch of 1000 records at a time.
SQL Server is taking around more than 40 minutes to insert 7,000 + 84,000 = 91,000 records which not acceptable.
Could someone please hint me how can I make the date insertion process faster?
Thanks a lot.
Milan
February 9, 2009 at 12:58 pm
Please post schema of the tables involved and the code being used to do the insertions, and perhaps someone can help you.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 9, 2009 at 1:19 pm
milan (2/9/2009)
Hi,I have a got two tables and the relation is one to many - meaning a single entry in the main table will have exact 12 records in the secondary table.
Data to these tables are fed from external CSV file which is uploaded from web application. Just to clarify, the slowness is not due to file upload. File upload is completed instantly.
Total Number of Rows I have been testing is 7,000 records in main table and 7000 * 12 = 84000 records in secondary table.
The loading process creates a dynamic insert statements and are passed to stored procedure to execute in a batch of 1000 records at a time.
SQL Server is taking around more than 40 minutes to insert 7,000 + 84,000 = 91,000 records which not acceptable.
Could someone please hint me how can I make the date insertion process faster?
Thanks a lot.
Milan
Can you use bcp or BULK INSERT ?
* Noel
February 10, 2009 at 1:52 am
The table scripts for Main table and secondary table are:
CREATE TABLE [dbo].[TblBudget_Exception](
[BudgetExceptionID] [int] IDENTITY(1,1) NOT NULL,
[BudgetBatch] [int] NOT NULL,
[Budget_DATETIME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_Client] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_Business] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_Division] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_Brand] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_BusinessPillar] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_BusinessPillar2] [varbinary](50) NULL,
[Budget_BusinessPillar3] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Budget_Market] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[MarketingPlanNumber] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[MarketingPlanName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[PlanStatus] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CampaignType] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CampaignTypeGroup] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CostElement] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ConsumerActivityID] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ConsumerActivityName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[EntryDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[PlanStartDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[PlanEndDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[RevisionDate] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CurrencyCode] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetUser] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ExceptionNotes] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[HOLD] [bit] NOT NULL CONSTRAINT [DF__TblBudget___HOLD__2EB2A197] DEFAULT ((0)),
[Record_RevNo] [int] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2942188C] DEFAULT ((1)),
[Record_Entered] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2A363CC5] DEFAULT (getdate()),
[Record_User] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF__TblBudget__Recor__2B2A60FE] DEFAULT (suser_sname()),
[Record_Updated] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__2C1E8537] DEFAULT (getdate()),
CONSTRAINT [PK__TblBudget_Except__2DBE7D5E] PRIMARY KEY CLUSTERED
(
[BudgetExceptionID] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
CREATE TABLE [dbo].[TblBudgetMonth_Exceptions](
[BudgetMonthExceptionID] [int] IDENTITY(1,1) NOT NULL,
[BudgetBatch] [int] NULL,
[Budget] [int] NOT NULL,
[BudgetMonth] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetYear] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetInitial] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetRevised] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetCommitted] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[BudgetActual] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[HOLD] [bit] NOT NULL CONSTRAINT [DF_TblBudgetMonth_Exceptions_HOLD] DEFAULT ((0)),
[Record_RevNo] [int] NOT NULL CONSTRAINT [DF__TblBudget__Recor__355F9F26] DEFAULT ((1)),
[Record_Entered] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__3653C35F] DEFAULT (getdate()),
[Record_User] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF__TblBudget__Recor__3747E798] DEFAULT (suser_sname()),
[Record_Updated] [datetime] NOT NULL CONSTRAINT [DF__TblBudget__Recor__383C0BD1] DEFAULT (getdate())
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
USE [NBSFairfield2005]
GO
ALTER TABLE [dbo].[TblBudgetMonth_Exceptions] WITH CHECK ADD CONSTRAINT [FK_TblBudgetMonth_Exceptions_tblDataBatch] FOREIGN KEY([BudgetBatch])
REFERENCES [dbo].[tblDataBatch] ([BatchID])
GO
ALTER TABLE [dbo].[TblBudgetMonth_Exceptions] WITH CHECK ADD CONSTRAINT [fkBudgetMonthsExBudget] FOREIGN KEY([Budget])
REFERENCES [dbo].[TblBudget_Exception] ([BudgetExceptionID])
INSERT statements are generated from the asp.net code and passed to a stored procedure which simply executes it using EXEC (@InsertStatement)
Thanks.
Milan G
February 10, 2009 at 9:37 am
I don't see anything odd about your schema at first glance.
Do you have indexes on the columns used as foreign keys?
Does the asp.net code generate an INSERT statement for each row to be inserted, or does it generate a single INSERT statement for the entire set of rows to be inserted? If it is generating one INSERT per row, that's a problem.
Please paste the code that is being passed to SQL to be executed.
To speed things up in the future, please follow the guidelines in these two articles. Doing this will give you a much better chance of getting a correct answer quickly. 🙂
http://www.sqlservercentral.com/articles/Best+Practices/61537/
http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspx
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 10, 2009 at 10:16 am
Hi,
Yes asp.net code creates a insert statement for each row. It seems creating a single insert statement using UNION-all does not affect much in the performance.
Please refer to the article
Thanks for the hints though.
Cheers,
Milan G
February 10, 2009 at 10:24 am
So are we ever going to see the mysterious INSERT statement or not? 😉
I'll read the article, but the fact remains that SOMETHING is dragging your performance down and we have the blind leading the blind here. Still looking for more information ... if it is generating 10,000 UNION ALLS, just show the first 10 or so.
Other thoughts:
Repeating an earlier question, are there indexes to support the foreign keys?
Have you run a test to see how long it takes to do 10 inserts using the generated code?
What does the execution plan look like?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 10, 2009 at 10:30 am
Hi,
Many thanks for showing interest...
Here is the codesnippet that generates the statement...
budgetDataList is a List of custom class ..basically used to stored all the values related to main table...and the AddBudgetMonthData function returns 12 insert statement for each record in the main list which is budgetDataList.
Cheers.
Milan G
..............................
'Dynamic SQL Insert statements - Batch of 1000 @ a time
Dim budgetExpSQL As String = ""
Dim identitySQL As String = ""
Dim bulkSQL As New StringBuilder()
Dim monthSQL As String
Dim counter As Integer = 0
For Each budget In budgetDataList
budgetExpSQL = "INSERT INTO TblBudget_Exception"
budgetExpSQL = budgetExpSQL + ("( [BudgetBatch], [Budget_DATETIME], [Budget_Client],[Budget_Business],")
budgetExpSQL = budgetExpSQL + ("[Budget_Division], [Budget_Brand], [Budget_BusinessPillar], [Budget_Market],")
budgetExpSQL = budgetExpSQL + ("[MarketingPlanNumber],[MarketingPlanName], [PlanStatus],[CampaignType],")
budgetExpSQL = budgetExpSQL + ("[CampaignTypeGroup], [CostElement], [ConsumerActivityID], [ConsumerActivityName],")
budgetExpSQL = budgetExpSQL + (" [EntryDate], [RevisionDate], [PlanEndDate], [PlanStartDate], [CurrencyCode],")
budgetExpSQL = budgetExpSQL + (" [BudgetUser] )")
budgetExpSQL = budgetExpSQL + (" VALUES ")
budgetExpSQL = budgetExpSQL + (" ( ")
budgetExpSQL = budgetExpSQL + (budget.BatchID & ",'" & budget.FileCreatedOn & "','" & budget.Client & "','" & budget.Business & "',")
budgetExpSQL = budgetExpSQL + ("'" & budget.Category & "','" & budget.Brand & "','" & budget.BusinessPillar & "','" & budget.MarketName & "',")
budgetExpSQL = budgetExpSQL + ("'" & budget.MarketingPlanNumber & "','" & budget.MarketingPlanName & "','" & budget.PlanStatus & "','" & budget.CampaignType & "',")
budgetExpSQL = budgetExpSQL + ("'" & budget.CampaignTypeGroup & "','" & budget.CostElement & "','" & budget.ConsumerActivityID & "','" & budget.ConsumerActivityDesc & "',")
budgetExpSQL = budgetExpSQL + ("'" & budget.CreatedOn & "','" & budget.RevisionDate & "','" & budget.PlanEndDate & "','" & budget.PlanStartDate & "','" & budget.Currency & "',")
budgetExpSQL = budgetExpSQL + ("'" & budget.Username & "'")
budgetExpSQL = budgetExpSQL + (" ); ")
If counter = 0 Then
'@expID needs to be declared only once...
identitySQL = "DECLARE @expID INT;"
identitySQL = identitySQL + "SET @expID = SCOPE_IDENTITY();"
Else
'Multiple...
identitySQL = "SET @expID = SCOPE_IDENTITY();"
End If
monthSQL = AddBudgetMonthsData(batchID, "@expID", budget.InitialPlanList, budget.RevisedPlanList, _
budget.CommittedCostList, budget.ActualCostList)
bulkSQL.Append(budgetExpSQL + identitySQL + monthSQL.ToString())
'reset
budgetExpSQL = ""
If counter = 1000 Then
dbDataBatch.InsertBulkBudgetExceptions(bulkSQL.ToString())
bulkSQL.Remove(0, bulkSQL.Length)
counter = 0
Else
counter = counter + 1
End If
Next
dbDataBatch.InsertBulkBudgetExceptions(bulkSQL.ToString())
.....................................
Private Function AddBudgetMonthsData(ByVal batchID As Integer, ByVal budgetExceptionID As String, _
ByVal initialPlanList As List(Of BudgetPlanCost), ByVal revisedPlanList As List(Of BudgetPlanCost), _
ByVal committedCostList As List(Of BudgetPlanCost), ByVal actualCostList As List(Of BudgetPlanCost) _
) As String
'Insert record into TblBudgetMonth_Exceptions
'Assuming Number of months will be same for Initial, Revised, Committed and Actual Costs..
Dim monthsCount As Integer
monthsCount = initialPlanList.Count
Dim planInitial As Double = 0
Dim revisedPlan As Double = 0
Dim committedCost As Double = 0
Dim actualCost As Double = 0
Dim month As Integer
Dim year As Integer
Dim i As Integer
Dim budgetMonthExp As New StringBuilder()
Dim dbDataBatch As New DBDataBatch()
For i = 0 To monthsCount - 1
month = initialPlanList(i).Month 'applies to all other columns as well
year = initialPlanList(i).Year ' applies to all other columns as well
planInitial = initialPlanList(i).Value
revisedPlan = revisedPlanList(i).Value
'This is for partner biz file - which does have have committedCost and actualCost
If committedCostList.Count > 0 And committedCostList.Count >= i Then
committedCost = committedCostList(i).Value
Else
committedCost = 0
End If
If actualCostList.Count > 0 And actualCostList.Count >= i Then
actualCost = actualCostList(i).Value
Else
actualCost = 0
End If
'dbDataBatch.InsertBudgetMonthExceptionsV2(batchID, budgetExceptionID, month, year, planInitial, revisedPlan, committedCost, actualCost)
budgetMonthExp.Append("INSERT INTO TblBudgetMonth_Exceptions")
budgetMonthExp.Append("( BudgetBatch,Budget, BudgetMonth, BudgetYear, ")
budgetMonthExp.Append("BudgetInitial, BudgetRevised, BudgetCommitted, BudgetActual ) ")
budgetMonthExp.Append(" VALUES ")
budgetMonthExp.Append("( ")
budgetMonthExp.Append(batchID & "," & budgetExceptionID & "," & month & "," & year & ",")
budgetMonthExp.Append(planInitial & "," & revisedPlan & "," & committedCost & "," & actualCost)
budgetMonthExp.Append(");")
Next
Return budgetMonthExp.ToString()
End Function
February 10, 2009 at 10:52 am
Milan... I don't see a UNION ALL.
It doesn't matter if all the lines are being generated by the user interface at one time or not, there is an enormous performance difference between these two examples, which is the point of the article you referred to.
---------------------------------------------------------------
INSERT INTO SomeTable -- this is a single insert statement
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z UNION ALL
.
.
.
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z UNION ALL
select x,y,z
--------------------------------------------------------------------------------------
INSERT INTO SomeTable VALUES(x,y,z) -- these are all individual insert statements
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
INSERT INTO SomeTable VALUES(x,y,z)
.
.
.
Why are you having the the UI generate so many statements? If you are just reading a file and inserting into a table, there are faster (and maybe easier) ways to load data into SQL.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 10, 2009 at 12:29 pm
[font="Verdana"]Have you tried getting your application to generate two files (one for parent table, one for child table), and then using bulk insert to load the files?[/font]
February 10, 2009 at 12:35 pm
Bruce, I would guess not, since noeld asked that question earlier. But the question bears repeating.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 10, 2009 at 2:27 pm
Hi,
In reply to Union All statements,
Generated statement would look like
1. First cycle...
INSERT INTO tblBudgetException
SELECT d1,d2,d3
INSERT INTO tblBudgetMonth_Exception
--I need the id of the just entered record to insert into secondary table..
-- therefore, it would like SET @x = scope_identity()
1. Select @x,d2,d3
2. Select @x,d2,d3
3. Select @x,d2,d3
4. Select @x,d2,d3
5. Select @x,d2,d3
6. Select @x,d2,d3
7. Select @x,d2,d3
8 .Select @x,d2,d3
9. Select @x,d2,d3
10.Select @x,d2,d3
11 .Select @x,d2,d3
12. Select SCOPE_Identity(),d2,d2
2. Second cycle...
INSERT INTO tblBudgetException
SELECT d1,d2,d3
INSERT INTO tblBudgetMonth_Exception
--I need the id of the just entered record to insert into secondary table..
-- therefore, it would like SET @x = scope_identity()
1. Select @x,d2,d3
2. Select @x,d2,d3
3. Select @x,d2,d3
4. Select @x,d2,d3
5. Select @x,d2,d3
6. Select @x,d2,d3
7. Select @x,d2,d3
8 .Select @x,d2,d3
9. Select @x,d2,d3
10.Select @x,d2,d3
11 .Select @x,d2,d3
12. Select SCOPE_Identity(),d2,d2
...it goes until the last one ---7,000 cycles..
I don't think in the case it will help a lot..
Yeah, I have thought of using SSIS package to load the data in to some table.. and work thr' it...but then it's not a one time job..it has to be done every now and then....users upload new files with into the system every often.
Thanks.
Milan G
February 10, 2009 at 2:34 pm
[font="Verdana"]Two thoughts:
You could develop an SSIS package, and your application could ensure that the files that the users upload are in the right place, with the right name, and then start the SSIS package to do the uploading.
Or if you particularly need to parse the input, and you're not comfortable doing that within an SSIS package (for which I don't blame you: I don't recommend an environment that requires you to be fluent in three programming languages at once!) then just get your own code to generate two new files (in the right place with the right names), and then call a stored procedure to bulk insert the files.
However you approach it, generating SQL statements to insert bulk data is going to be slow. SSIS and bulk insert can also avoid most of the overhead of transaction logging.
Edited to add:
Rather than using scope_identity() per row, have you had a look at the output clause of the insert statement? You can use this to return a set of the new identity values, and then insert the corresponding set of child records. Quicker than doing it row by row.
[/font]
February 10, 2009 at 2:52 pm
Hi Bruce,
When I saw your first post related to the problem, I thought of creating a delimited text file and use BULK INSERT to dump data into the relevant tables...But it won't work..why?
Because to insert records into secondary table..I need to have to have the ID (IDENTITY Column) of just inserted row from Primary table. So With out knowing the ID..I can't insert record into secondary table. To get the ID I have to insert the record into the table..
Thanks.
Milan G
February 10, 2009 at 3:02 pm
1. Bruce is absolutely right. If possible, rethink this whole idea of writing the code on the fly in your UI. That approach will ALWAYS run slower.
But I am going to continue with your problem, so that you will understand something of why it is slower. If you are determined to generate SQL code from your UI, you need to understand what your SQL code is doing.
2. Borrowing from your code
-- one insert, that has overhead associated with launching an insert as well as overhead associated with logging
INSERT INTO tblBudgetException
SELECT d1,d2,d3
-- followed by insert to secondary table, which also incurs overhead and gets logged
INSERT INTO tblBudgetMonth_Exception
--I need the id of the just entered record to insert into secondary table..
-- therefore, it would like SET @x = scope_identity()
1. Select @x,d2,d3
2. Select @x,d2,d3
3. Select @x,d2,d3
4. Select @x,d2,d3
5. Select @x,d2,d3
As I read the second block of code, it would only insert one row, because of the absence of UNION ALL following each SELECT LINE. (But I see you putting line numbers in front, so I'll assume it's just a quickly typed example which is unexecutable instead of the actual code.)
Lets assume you were to put the UNION ALLS into your code. Instead of doing a single insert of 7000 rows to one table, and a single insert of 84000 rows to the second table, you are doing 7000 separate inserts to your base table, followed by 7000 inserts (of 12 rows apiece) to your secondary table. That's 14,000 INSERTS, instead of two. A certain amount of overhead is incurred every time an SQL statement is launched. SQL is blindingly fast at delivering large sets of data with a single statement, but you cripple it when you start to handle one row at a time. You will often find this discussed in terms of "set-based" programming as contrasted to "procedural" programming.
What you want is to insert 7000 rows into your primary table followed by 84000 rows into your secondary table. If you will read up on the OUTPUT clause to the INSERT statement, you will see that you can catch the id of each row inserted into the primary table in a temporary table. You can then use that table as the basis for inserting rows into the secondary table. You can "explode" the single row into twelve rows using a CROSS JOIN, if necessary. Then you are only doing two inserts and it will run faster.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply