July 16, 2011 at 4:31 pm
Hi,
This is my first post here - I’m hoping someone can give me some advice. I've included the procedure code along with definitions and execution plan, but I wonder whether my whole approach to this is wrong.
I'm trying to carry out bulk updates on a large table (currently 1.5 million rows but expected to grow to 10x that). Updates can be any number from 1 to 100,000 or more. I'm a leaner, so my approach has been to Google and see what others are doing. I'm using the approach below, which is not performing well. The procedure is run from a desktop application and frequently times out.
1. A 'merchant' at a time, use SqlBulkCopy (from VB.NET) to write the rows to be updated to a separate table
2. Create a View of the existing records in the main table belonging to the relevant merchant
3. Use the procedure below to update the records in the View
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:Paul Weston
-- Create date: 08/06/2011
-- Description:Performs a 'bulk' update of the products from a single merchant feed.
-- =============================================
CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts]
-- Add the parameters for the stored procedure here
@Result int OUTPUT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
BEGIN TRANSACTION
CREATE TABLE #ProductsUpdated
(
ProductID nvarchar(60),
MerchantFeedID int,
ProductName nvarchar(512),
BrandName nvarchar(128),
DescriptionBrief nvarchar(MAX),
DescriptionFull nvarchar(MAX),
Currency nchar(3),
PriceNormal money,
PriceRecommendedRetail money,
PriceInStore money,
PriceSale money,
WebOffer bit,
DeepLink nvarchar(1024),
ImpressionURL nvarchar(1024),
ImageLargeURL nvarchar(1024),
ImageThumbURL nvarchar(1024),
PromotionalText nvarchar(MAX),
SpecialOffer bit
)
declare @sql varchar(8000);
set @sql = 'SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,
PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,
PromotionalText,SpecialOffer
FROM ProductTemp EXCEPT
SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,
PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,
PromotionalText,SpecialOffer
FROM ProductTempView'
INSERT #ProductsUpdatedexec (@SQL)
UPDATEProductTempView WITH (TABLOCKX, HOLDLOCK)
SETProductTempView.MerchantFeedID = SourceTable.MerchantFeedID,
ProductTempView.ProductName = SourceTable.ProductName,
ProductTempView.BrandName = SourceTable.BrandName,
ProductTempView.DescriptionBrief = SourceTable.DescriptionBrief,
ProductTempView.DescriptionFull = SourceTable.DescriptionFull,
ProductTempView.Currency = SourceTable.Currency,
ProductTempView.PriceNormal = SourceTable.PriceNormal,
ProductTempView.PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,
ProductTempView.PriceInStore = SourceTable.PriceInStore,
ProductTempView.PriceSale = SourceTable.PriceSale,
ProductTempView.WebOffer = SourceTable.WebOffer,
ProductTempView.DeepLink = SourceTable.DeepLink,
ProductTempView.ImpressionURL = SourceTable.ImpressionURL,
ProductTempView.ImageLargeURL = SourceTable.ImageLargeURL,
ProductTempView.ImageThumbURL = SourceTable.ImageThumbURL,
ProductTempView.PromotionalText = SourceTable.PromotionalText,
ProductTempView.SpecialOffer = SourceTable.SpecialOffer
FROMProductTempView AS TargetTable INNER JOIN
#ProductsUpdated AS SourceTable
ON TargetTable.ProductID=SourceTable.ProductID
SELECT@Result = @@ROWCOUNT
IF @@ERROR = 0
BEGIN
COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
SELECT@Result = -1
END
DROP TABLE #ProductsUpdated
END
I decided to use a View because I understood (read) that large updates do not play well with multiple indexes - perhaps my understanding is wrong. The View has no indexes.
/****** Object: View [dbo].[ProductTempView] Script Date: 07/16/2011 22:56:30 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE VIEW [dbo].[ProductTempView] AS (SELECT * FROM Product WHERE MerchantFeedID = 184)
The ProductTemp definition is:
/****** Object: Table [dbo].[ProductTemp] Script Date: 07/16/2011 23:30:51 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ProductTemp](
[ProductID] [nvarchar](60) COLLATE Latin1_General_CI_AS NULL,
[OurProductID] [int] NULL,
[MerchantFeedID] [int] NULL,
[ProductName] [nvarchar](512) COLLATE Latin1_General_CI_AS NULL,
[BrandName] [nvarchar](128) COLLATE Latin1_General_CI_AS NULL,
[DescriptionBrief] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[DescriptionFull] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[Currency] [nchar](3) COLLATE Latin1_General_CI_AS NULL,
[PriceNormal] [money] NULL,
[PriceRecommendedRetail] [money] NULL,
[PriceInStore] [money] NULL,
[PriceSale] [money] NULL,
[WebOffer] [bit] NULL,
[DeepLink] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImpressionURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImageLargeURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImageThumbURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[PromotionalText] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[SpecialOffer] [bit] NULL
) ON [PRIMARY]
I've considered trying batching the updates in a loop, but I'm not clear on how (or if) I should approach that.
Can this be refined or should I try a different approach?
Thanks again,
Paul
July 18, 2011 at 6:30 am
try to use output t-sql
July 18, 2011 at 1:21 pm
Some inline comments...
bpw (7/16/2011)
Hi,This is my first post here - I’m hoping someone can give me some advice. I've included the procedure code along with definitions and execution plan, but I wonder whether my whole approach to this is wrong.
Nice job posting the problem and supporting code.
I'm trying to carry out bulk updates on a large table (currently 1.5 million rows but expected to grow to 10x that). Updates can be any number from 1 to 100,000 or more. I'm a leaner, so my approach has been to Google and see what others are doing. I'm using the approach below, which is not performing well. The procedure is run from a desktop application and frequently times out.
You may want to change the CommandTimeout to prevent the timeout. With table growth will come longer run times. Prepare your user's expectations with a "Please Wait" message or something of the like.
1. A 'merchant' at a time, use SqlBulkCopy (from VB.NET) to write the rows to be updated to a separate table
2. Create a View of the existing records in the main table belonging to the relevant merchant
3. Use the procedure below to update the records in the View
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:Paul Weston
-- Create date: 08/06/2011
-- Description:Performs a 'bulk' update of the products from a single merchant feed.
-- =============================================
CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts]
-- Add the parameters for the stored procedure here
@Result int OUTPUT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
BEGIN TRANSACTION
CREATE TABLE #ProductsUpdated
(
ProductID nvarchar(60),
MerchantFeedID int,
ProductName nvarchar(512),
BrandName nvarchar(128),
DescriptionBrief nvarchar(MAX),
DescriptionFull nvarchar(MAX),
Currency nchar(3),
PriceNormal money,
PriceRecommendedRetail money,
PriceInStore money,
PriceSale money,
WebOffer bit,
DeepLink nvarchar(1024),
ImpressionURL nvarchar(1024),
ImageLargeURL nvarchar(1024),
ImageThumbURL nvarchar(1024),
PromotionalText nvarchar(MAX),
SpecialOffer bit
)
declare @sql varchar(8000);
set @sql = 'SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,
PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,
PromotionalText,SpecialOffer
FROM ProductTemp EXCEPT
SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,
PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,
PromotionalText,SpecialOffer
FROM ProductTempView'
INSERT #ProductsUpdatedexec (@SQL)
UPDATEProductTempView WITH (TABLOCKX, HOLDLOCK)
SETProductTempView.MerchantFeedID = SourceTable.MerchantFeedID,
ProductTempView.ProductName = SourceTable.ProductName,
ProductTempView.BrandName = SourceTable.BrandName,
ProductTempView.DescriptionBrief = SourceTable.DescriptionBrief,
ProductTempView.DescriptionFull = SourceTable.DescriptionFull,
ProductTempView.Currency = SourceTable.Currency,
ProductTempView.PriceNormal = SourceTable.PriceNormal,
ProductTempView.PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,
ProductTempView.PriceInStore = SourceTable.PriceInStore,
ProductTempView.PriceSale = SourceTable.PriceSale,
ProductTempView.WebOffer = SourceTable.WebOffer,
ProductTempView.DeepLink = SourceTable.DeepLink,
ProductTempView.ImpressionURL = SourceTable.ImpressionURL,
ProductTempView.ImageLargeURL = SourceTable.ImageLargeURL,
ProductTempView.ImageThumbURL = SourceTable.ImageThumbURL,
ProductTempView.PromotionalText = SourceTable.PromotionalText,
ProductTempView.SpecialOffer = SourceTable.SpecialOffer
FROMProductTempView AS TargetTable INNER JOIN
#ProductsUpdated AS SourceTable
ON TargetTable.ProductID=SourceTable.ProductID
SELECT@Result = @@ROWCOUNT
IF @@ERROR = 0
BEGIN
COMMIT TRANSACTION
END
ELSE
BEGIN
ROLLBACK TRANSACTION
SELECT@Result = -1
END
DROP TABLE #ProductsUpdated
END
I decided to use a View because I understood (read) that large updates do not play well with multiple indexes - perhaps my understanding is wrong. The View has no indexes.
A word about standard VIEWs...each is nothing more than a stored SELECT statement, nothing less either. They offer no performance advantage and do not abstract away indexes. At execution time the query referencing the VIEW is referred to the underlying table(s). If indexes exist there, they will be used to answer the query where possible.
/****** Object: View [dbo].[ProductTempView] Script Date: 07/16/2011 22:56:30 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE VIEW [dbo].[ProductTempView] AS (SELECT * FROM Product WHERE MerchantFeedID = 184)
The ProductTemp definition is:
/****** Object: Table [dbo].[ProductTemp] Script Date: 07/16/2011 23:30:51 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ProductTemp](
[ProductID] [nvarchar](60) COLLATE Latin1_General_CI_AS NULL,
[OurProductID] [int] NULL,
[MerchantFeedID] [int] NULL,
[ProductName] [nvarchar](512) COLLATE Latin1_General_CI_AS NULL,
[BrandName] [nvarchar](128) COLLATE Latin1_General_CI_AS NULL,
[DescriptionBrief] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[DescriptionFull] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[Currency] [nchar](3) COLLATE Latin1_General_CI_AS NULL,
[PriceNormal] [money] NULL,
[PriceRecommendedRetail] [money] NULL,
[PriceInStore] [money] NULL,
[PriceSale] [money] NULL,
[WebOffer] [bit] NULL,
[DeepLink] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImpressionURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImageLargeURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[ImageThumbURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,
[PromotionalText] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[SpecialOffer] [bit] NULL
) ON [PRIMARY]
I've considered trying batching the updates in a loop, but I'm not clear on how (or if) I should approach that.
Can this be refined or should I try a different approach?
Thanks again,
Paul
1. Why did you go with dynamic SQL to populate #ProductsUpdated? I don't see a need for
2. Why are you using @@ERROR? As of SQL 2005 TRY/CATCH is preferred.
3. What made you decide to use these hints? WITH (TABLOCKX, HOLDLOCK)
As an aside re: item 2, if you;re stuck using @@ERROR then your implementation is flawed. This sequence will always guarantee that @@ERROR will equal 0 because it is the result of the SELECT, not of the UPDATE as you probably intended:
SELECT @Result = @@ROWCOUNT
IF @@ERROR = 0
This would be proper where @ERROR is a new variable to hold the value of @@ERROR:
SELECT @Result = @@ROWCOUNT,
@ERROR = @@ERROR
IF @ERROR = 0
CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts] (@Result INT OUTPUT)
AS
BEGIN
-- =============================================
-- Author: Paul Weston
-- Create date: 08/06/2011
-- Description: Performs a 'bulk' update of the products from a single merchant feed.
-- =============================================
SET NOCOUNT ON ;
BEGIN TRY
WITH ProductTempCTE
AS (
SELECT *
FROM Product
WHERE MerchantFeedID = 184
)
UPDATE Product WITH (TABLOCKX, HOLDLOCK)
SET MerchantFeedID = SourceTable.MerchantFeedID,
ProductName = SourceTable.ProductName,
BrandName = SourceTable.BrandName,
DescriptionBrief = SourceTable.DescriptionBrief,
DescriptionFull = SourceTable.DescriptionFull,
Currency = SourceTable.Currency,
PriceNormal = SourceTable.PriceNormal,
PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,
PriceInStore = SourceTable.PriceInStore,
PriceSale = SourceTable.PriceSale,
WebOffer = SourceTable.WebOffer,
DeepLink = SourceTable.DeepLink,
ImpressionURL = SourceTable.ImpressionURL,
ImageLargeURL = SourceTable.ImageLargeURL,
ImageThumbURL = SourceTable.ImageThumbURL,
PromotionalText = SourceTable.PromotionalText,
SpecialOffer = SourceTable.SpecialOffer
FROM ProductTempCTE AS TargetTable
INNER JOIN (
SELECT ProductID,
MerchantFeedID,
ProductName,
BrandName,
DescriptionBrief,
DescriptionFull,
Currency,
PriceNormal,
PriceRecommendedRetail,
PriceInStore,
PriceSale,
WebOffer,
DeepLink,
ImpressionURL,
ImageLargeURL,
ImageThumbURL,
PromotionalText,
SpecialOffer
FROM ProductTemp
EXCEPT
SELECT ProductID,
MerchantFeedID,
ProductName,
BrandName,
DescriptionBrief,
DescriptionFull,
Currency,
PriceNormal,
PriceRecommendedRetail,
PriceInStore,
PriceSale,
WebOffer,
DeepLink,
ImpressionURL,
ImageLargeURL,
ImageThumbURL,
PromotionalText,
SpecialOffer
FROM ProductTempCTE
) AS SourceTable ON TargetTable.ProductID = SourceTable.ProductID
SET @Result = @@ROWCOUNT
END TRY
BEGIN CATCH
SET @Result = -1
END CATCH
END
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 18, 2011 at 2:35 pm
Thanks for your detailed response.
You may want to change the CommandTimeout to prevent the timeout. With table growth will come longer run times. Prepare your user's expectations with a "Please Wait" message or something of the like.
Ok, have done that.
Some questions for you:
1. Why did you go with dynamic SQL to populate #ProductsUpdated? I don't see a need for
2. Why are you using @@ERROR? As of SQL 2005 TRY/CATCH is preferred.
3. What made you decide to use these hints? WITH (TABLOCKX, HOLDLOCK)
The answer to these questions is essentially the same - I Googled 'batch update' or similar and copied bits from other examples, some of them a bit dated perhaps?
Having said that, I understood that 'TABLOCKX, HOLDLOCK' would speed things up? The process will run at daft o'clock at night, so having the table locked for a short period wouldn't be a disaster.
See if your proc rewritten without the use of temp tables and views sheds any light as to why this might be taking a long time:
I appreciate the time you've taken to do this and I'll happily try it. However, the query won't compile for me:
"Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon."
Can you advise?
Thanks again,
Paul
July 18, 2011 at 2:44 pm
Sorry. I don't have all your tables here so did not bother compiling it. I did not intend for you to compile it anyay...it was just to strip away the unnecessary use of temp tables and views so you could see what you would really be doing to the database. My version should not have changed what you were doing before, it just simplifies the code so you can see which tables you are actually working with. If you want to compile it, you can add a semi-colon after the BEGIN TRY...CTE's require the preceding line to be terminated with a semi-colon due to the multiple uses of the keyword WITH in T-SQL..
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply