Bulk Update

  • Hi,

    This is my first post here - I’m hoping someone can give me some advice. I've included the procedure code along with definitions and execution plan, but I wonder whether my whole approach to this is wrong.

    I'm trying to carry out bulk updates on a large table (currently 1.5 million rows but expected to grow to 10x that). Updates can be any number from 1 to 100,000 or more. I'm a leaner, so my approach has been to Google and see what others are doing. I'm using the approach below, which is not performing well. The procedure is run from a desktop application and frequently times out.

    1. A 'merchant' at a time, use SqlBulkCopy (from VB.NET) to write the rows to be updated to a separate table

    2. Create a View of the existing records in the main table belonging to the relevant merchant

    3. Use the procedure below to update the records in the View

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- =============================================

    -- Author:Paul Weston

    -- Create date: 08/06/2011

    -- Description:Performs a 'bulk' update of the products from a single merchant feed.

    -- =============================================

    CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts]

    -- Add the parameters for the stored procedure here

    @Result int OUTPUT

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    -- Insert statements for procedure here

    BEGIN TRANSACTION

    CREATE TABLE #ProductsUpdated

    (

    ProductID nvarchar(60),

    MerchantFeedID int,

    ProductName nvarchar(512),

    BrandName nvarchar(128),

    DescriptionBrief nvarchar(MAX),

    DescriptionFull nvarchar(MAX),

    Currency nchar(3),

    PriceNormal money,

    PriceRecommendedRetail money,

    PriceInStore money,

    PriceSale money,

    WebOffer bit,

    DeepLink nvarchar(1024),

    ImpressionURL nvarchar(1024),

    ImageLargeURL nvarchar(1024),

    ImageThumbURL nvarchar(1024),

    PromotionalText nvarchar(MAX),

    SpecialOffer bit

    )

    declare @sql varchar(8000);

    set @sql = 'SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,

    PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,

    PromotionalText,SpecialOffer

    FROM ProductTemp EXCEPT

    SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,

    PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,

    PromotionalText,SpecialOffer

    FROM ProductTempView'

    INSERT #ProductsUpdatedexec (@SQL)

    UPDATEProductTempView WITH (TABLOCKX, HOLDLOCK)

    SETProductTempView.MerchantFeedID = SourceTable.MerchantFeedID,

    ProductTempView.ProductName = SourceTable.ProductName,

    ProductTempView.BrandName = SourceTable.BrandName,

    ProductTempView.DescriptionBrief = SourceTable.DescriptionBrief,

    ProductTempView.DescriptionFull = SourceTable.DescriptionFull,

    ProductTempView.Currency = SourceTable.Currency,

    ProductTempView.PriceNormal = SourceTable.PriceNormal,

    ProductTempView.PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,

    ProductTempView.PriceInStore = SourceTable.PriceInStore,

    ProductTempView.PriceSale = SourceTable.PriceSale,

    ProductTempView.WebOffer = SourceTable.WebOffer,

    ProductTempView.DeepLink = SourceTable.DeepLink,

    ProductTempView.ImpressionURL = SourceTable.ImpressionURL,

    ProductTempView.ImageLargeURL = SourceTable.ImageLargeURL,

    ProductTempView.ImageThumbURL = SourceTable.ImageThumbURL,

    ProductTempView.PromotionalText = SourceTable.PromotionalText,

    ProductTempView.SpecialOffer = SourceTable.SpecialOffer

    FROMProductTempView AS TargetTable INNER JOIN

    #ProductsUpdated AS SourceTable

    ON TargetTable.ProductID=SourceTable.ProductID

    SELECT@Result = @@ROWCOUNT

    IF @@ERROR = 0

    BEGIN

    COMMIT TRANSACTION

    END

    ELSE

    BEGIN

    ROLLBACK TRANSACTION

    SELECT@Result = -1

    END

    DROP TABLE #ProductsUpdated

    END

    I decided to use a View because I understood (read) that large updates do not play well with multiple indexes - perhaps my understanding is wrong. The View has no indexes.

    /****** Object: View [dbo].[ProductTempView] Script Date: 07/16/2011 22:56:30 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[ProductTempView] AS (SELECT * FROM Product WHERE MerchantFeedID = 184)

    The ProductTemp definition is:

    /****** Object: Table [dbo].[ProductTemp] Script Date: 07/16/2011 23:30:51 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[ProductTemp](

    [ProductID] [nvarchar](60) COLLATE Latin1_General_CI_AS NULL,

    [OurProductID] [int] NULL,

    [MerchantFeedID] [int] NULL,

    [ProductName] [nvarchar](512) COLLATE Latin1_General_CI_AS NULL,

    [BrandName] [nvarchar](128) COLLATE Latin1_General_CI_AS NULL,

    [DescriptionBrief] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [DescriptionFull] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [Currency] [nchar](3) COLLATE Latin1_General_CI_AS NULL,

    [PriceNormal] [money] NULL,

    [PriceRecommendedRetail] [money] NULL,

    [PriceInStore] [money] NULL,

    [PriceSale] [money] NULL,

    [WebOffer] [bit] NULL,

    [DeepLink] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImpressionURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImageLargeURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImageThumbURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [PromotionalText] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [SpecialOffer] [bit] NULL

    ) ON [PRIMARY]

    I've considered trying batching the updates in a loop, but I'm not clear on how (or if) I should approach that.

    Can this be refined or should I try a different approach?

    Thanks again,

    Paul

  • try to use output t-sql

  • Some inline comments...

    bpw (7/16/2011)


    Hi,

    This is my first post here - I’m hoping someone can give me some advice. I've included the procedure code along with definitions and execution plan, but I wonder whether my whole approach to this is wrong.

    Nice job posting the problem and supporting code.

    I'm trying to carry out bulk updates on a large table (currently 1.5 million rows but expected to grow to 10x that). Updates can be any number from 1 to 100,000 or more. I'm a leaner, so my approach has been to Google and see what others are doing. I'm using the approach below, which is not performing well. The procedure is run from a desktop application and frequently times out.

    You may want to change the CommandTimeout to prevent the timeout. With table growth will come longer run times. Prepare your user's expectations with a "Please Wait" message or something of the like.

    1. A 'merchant' at a time, use SqlBulkCopy (from VB.NET) to write the rows to be updated to a separate table

    2. Create a View of the existing records in the main table belonging to the relevant merchant

    3. Use the procedure below to update the records in the View

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- =============================================

    -- Author:Paul Weston

    -- Create date: 08/06/2011

    -- Description:Performs a 'bulk' update of the products from a single merchant feed.

    -- =============================================

    CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts]

    -- Add the parameters for the stored procedure here

    @Result int OUTPUT

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    -- Insert statements for procedure here

    BEGIN TRANSACTION

    CREATE TABLE #ProductsUpdated

    (

    ProductID nvarchar(60),

    MerchantFeedID int,

    ProductName nvarchar(512),

    BrandName nvarchar(128),

    DescriptionBrief nvarchar(MAX),

    DescriptionFull nvarchar(MAX),

    Currency nchar(3),

    PriceNormal money,

    PriceRecommendedRetail money,

    PriceInStore money,

    PriceSale money,

    WebOffer bit,

    DeepLink nvarchar(1024),

    ImpressionURL nvarchar(1024),

    ImageLargeURL nvarchar(1024),

    ImageThumbURL nvarchar(1024),

    PromotionalText nvarchar(MAX),

    SpecialOffer bit

    )

    declare @sql varchar(8000);

    set @sql = 'SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,

    PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,

    PromotionalText,SpecialOffer

    FROM ProductTemp EXCEPT

    SELECT ProductID,MerchantFeedID,ProductName,BrandName,DescriptionBrief,DescriptionFull,Currency,PriceNormal,

    PriceRecommendedRetail,PriceInStore,PriceSale,WebOffer,DeepLink,ImpressionURL,ImageLargeURL,ImageThumbURL,

    PromotionalText,SpecialOffer

    FROM ProductTempView'

    INSERT #ProductsUpdatedexec (@SQL)

    UPDATEProductTempView WITH (TABLOCKX, HOLDLOCK)

    SETProductTempView.MerchantFeedID = SourceTable.MerchantFeedID,

    ProductTempView.ProductName = SourceTable.ProductName,

    ProductTempView.BrandName = SourceTable.BrandName,

    ProductTempView.DescriptionBrief = SourceTable.DescriptionBrief,

    ProductTempView.DescriptionFull = SourceTable.DescriptionFull,

    ProductTempView.Currency = SourceTable.Currency,

    ProductTempView.PriceNormal = SourceTable.PriceNormal,

    ProductTempView.PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,

    ProductTempView.PriceInStore = SourceTable.PriceInStore,

    ProductTempView.PriceSale = SourceTable.PriceSale,

    ProductTempView.WebOffer = SourceTable.WebOffer,

    ProductTempView.DeepLink = SourceTable.DeepLink,

    ProductTempView.ImpressionURL = SourceTable.ImpressionURL,

    ProductTempView.ImageLargeURL = SourceTable.ImageLargeURL,

    ProductTempView.ImageThumbURL = SourceTable.ImageThumbURL,

    ProductTempView.PromotionalText = SourceTable.PromotionalText,

    ProductTempView.SpecialOffer = SourceTable.SpecialOffer

    FROMProductTempView AS TargetTable INNER JOIN

    #ProductsUpdated AS SourceTable

    ON TargetTable.ProductID=SourceTable.ProductID

    SELECT@Result = @@ROWCOUNT

    IF @@ERROR = 0

    BEGIN

    COMMIT TRANSACTION

    END

    ELSE

    BEGIN

    ROLLBACK TRANSACTION

    SELECT@Result = -1

    END

    DROP TABLE #ProductsUpdated

    END

    I decided to use a View because I understood (read) that large updates do not play well with multiple indexes - perhaps my understanding is wrong. The View has no indexes.

    A word about standard VIEWs...each is nothing more than a stored SELECT statement, nothing less either. They offer no performance advantage and do not abstract away indexes. At execution time the query referencing the VIEW is referred to the underlying table(s). If indexes exist there, they will be used to answer the query where possible.

    /****** Object: View [dbo].[ProductTempView] Script Date: 07/16/2011 22:56:30 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE VIEW [dbo].[ProductTempView] AS (SELECT * FROM Product WHERE MerchantFeedID = 184)

    The ProductTemp definition is:

    /****** Object: Table [dbo].[ProductTemp] Script Date: 07/16/2011 23:30:51 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[ProductTemp](

    [ProductID] [nvarchar](60) COLLATE Latin1_General_CI_AS NULL,

    [OurProductID] [int] NULL,

    [MerchantFeedID] [int] NULL,

    [ProductName] [nvarchar](512) COLLATE Latin1_General_CI_AS NULL,

    [BrandName] [nvarchar](128) COLLATE Latin1_General_CI_AS NULL,

    [DescriptionBrief] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [DescriptionFull] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [Currency] [nchar](3) COLLATE Latin1_General_CI_AS NULL,

    [PriceNormal] [money] NULL,

    [PriceRecommendedRetail] [money] NULL,

    [PriceInStore] [money] NULL,

    [PriceSale] [money] NULL,

    [WebOffer] [bit] NULL,

    [DeepLink] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImpressionURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImageLargeURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [ImageThumbURL] [nvarchar](1024) COLLATE Latin1_General_CI_AS NULL,

    [PromotionalText] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,

    [SpecialOffer] [bit] NULL

    ) ON [PRIMARY]

    I've considered trying batching the updates in a loop, but I'm not clear on how (or if) I should approach that.

    Can this be refined or should I try a different approach?

    Thanks again,

    Paul


    Some questions for you:

    1. Why did you go with dynamic SQL to populate #ProductsUpdated? I don't see a need for

    2. Why are you using @@ERROR? As of SQL 2005 TRY/CATCH is preferred.

    3. What made you decide to use these hints? WITH (TABLOCKX, HOLDLOCK)

    As an aside re: item 2, if you;re stuck using @@ERROR then your implementation is flawed. This sequence will always guarantee that @@ERROR will equal 0 because it is the result of the SELECT, not of the UPDATE as you probably intended:

    SELECT @Result = @@ROWCOUNT

    IF @@ERROR = 0

    This would be proper where @ERROR is a new variable to hold the value of @@ERROR:

    SELECT @Result = @@ROWCOUNT,

    @ERROR = @@ERROR

    IF @ERROR = 0


    See if your proc rewritten without the use of temp tables and views sheds any light as to why this might be taking a long time:

    CREATE PROCEDURE [dbo].[bpw_Product_UpdateModifiedProducts] (@Result INT OUTPUT)

    AS

    BEGIN

    -- =============================================

    -- Author: Paul Weston

    -- Create date: 08/06/2011

    -- Description: Performs a 'bulk' update of the products from a single merchant feed.

    -- =============================================

    SET NOCOUNT ON ;

    BEGIN TRY

    WITH ProductTempCTE

    AS (

    SELECT *

    FROM Product

    WHERE MerchantFeedID = 184

    )

    UPDATE Product WITH (TABLOCKX, HOLDLOCK)

    SET MerchantFeedID = SourceTable.MerchantFeedID,

    ProductName = SourceTable.ProductName,

    BrandName = SourceTable.BrandName,

    DescriptionBrief = SourceTable.DescriptionBrief,

    DescriptionFull = SourceTable.DescriptionFull,

    Currency = SourceTable.Currency,

    PriceNormal = SourceTable.PriceNormal,

    PriceRecommendedRetail = SourceTable.PriceRecommendedRetail,

    PriceInStore = SourceTable.PriceInStore,

    PriceSale = SourceTable.PriceSale,

    WebOffer = SourceTable.WebOffer,

    DeepLink = SourceTable.DeepLink,

    ImpressionURL = SourceTable.ImpressionURL,

    ImageLargeURL = SourceTable.ImageLargeURL,

    ImageThumbURL = SourceTable.ImageThumbURL,

    PromotionalText = SourceTable.PromotionalText,

    SpecialOffer = SourceTable.SpecialOffer

    FROM ProductTempCTE AS TargetTable

    INNER JOIN (

    SELECT ProductID,

    MerchantFeedID,

    ProductName,

    BrandName,

    DescriptionBrief,

    DescriptionFull,

    Currency,

    PriceNormal,

    PriceRecommendedRetail,

    PriceInStore,

    PriceSale,

    WebOffer,

    DeepLink,

    ImpressionURL,

    ImageLargeURL,

    ImageThumbURL,

    PromotionalText,

    SpecialOffer

    FROM ProductTemp

    EXCEPT

    SELECT ProductID,

    MerchantFeedID,

    ProductName,

    BrandName,

    DescriptionBrief,

    DescriptionFull,

    Currency,

    PriceNormal,

    PriceRecommendedRetail,

    PriceInStore,

    PriceSale,

    WebOffer,

    DeepLink,

    ImpressionURL,

    ImageLargeURL,

    ImageThumbURL,

    PromotionalText,

    SpecialOffer

    FROM ProductTempCTE

    ) AS SourceTable ON TargetTable.ProductID = SourceTable.ProductID

    SET @Result = @@ROWCOUNT

    END TRY

    BEGIN CATCH

    SET @Result = -1

    END CATCH

    END

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks for your detailed response.

    You may want to change the CommandTimeout to prevent the timeout. With table growth will come longer run times. Prepare your user's expectations with a "Please Wait" message or something of the like.

    Ok, have done that.

    Some questions for you:

    1. Why did you go with dynamic SQL to populate #ProductsUpdated? I don't see a need for

    2. Why are you using @@ERROR? As of SQL 2005 TRY/CATCH is preferred.

    3. What made you decide to use these hints? WITH (TABLOCKX, HOLDLOCK)

    The answer to these questions is essentially the same - I Googled 'batch update' or similar and copied bits from other examples, some of them a bit dated perhaps?

    Having said that, I understood that 'TABLOCKX, HOLDLOCK' would speed things up? The process will run at daft o'clock at night, so having the table locked for a short period wouldn't be a disaster.

    See if your proc rewritten without the use of temp tables and views sheds any light as to why this might be taking a long time:

    I appreciate the time you've taken to do this and I'll happily try it. However, the query won't compile for me:

    "Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon."

    Can you advise?

    Thanks again,

    Paul

  • Sorry. I don't have all your tables here so did not bother compiling it. I did not intend for you to compile it anyay...it was just to strip away the unnecessary use of temp tables and views so you could see what you would really be doing to the database. My version should not have changed what you were doing before, it just simplifies the code so you can see which tables you are actually working with. If you want to compile it, you can add a semi-colon after the BEGIN TRY...CTE's require the preceding line to be terminated with a semi-colon due to the multiple uses of the keyword WITH in T-SQL..

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply