How to optimize an UPDATE query on a table with a columnstore index on all columns and a nonclustered index on the same column

  • Hi Champs,

    I have an update query against a table with around 500 million rows (50 crore). The full table (120 columns) has a clustered columnstore index, and one nonclustered index with 5 columns, created on the same column that I want to update. It is a data warehouse table (a fact table). The indexes are helpful for SELECTs, so I can't drop or disable them. I understand that the indexes are why the update is taking so long (9 hours).

    I still need to tune this query. Any help, please?

    The query looks like this:
    WITH cte AS
    (
       SELECT
          [Local_Date_Key],
          [Associate_Profile_key],
          AP.Associateid,
          F.Record_update_date,
          F.Record_Updated_By
       FROM [Fact_table] AS F
       JOIN dimentions AS AP
          ON F.[Associate_Profile_key] = AP.Associate_key
          AND local_date_key >= 20160101
    )
    UPDATE c
       SET c.Associate_Profile_key = AP.Associate_key,
           c.Record_update_date = SYSDATETIME(),
           c.Record_Updated_By = SUSER_NAME()
    FROM cte AS c
    JOIN dimentions AS AP
       ON c.Associateid = AP.Associateid
       AND c.[Associate_Profile_key] <> AP.Associate_key;

    Regards,
    Ravi

  • First, you need to improve the SELECT part of your query. Also, from your query it is difficult to tell which table the field [Local_Date_Key] belongs to.
    Can you show the execution plan of this query?
    You may need indexes on AP.Associate_key and F.[Associate_Profile_key], with AP.Associate_key, [Associate_Profile_key], and local_date_key as included columns.
    If the query touches more than 20-30% of all records, the most efficient approach would be: remove all indexes -> run the update -> re-create the indexes. A sketch of that pattern follows.
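
    A minimal sketch, assuming the nonclustered index is named IX_Fact_Associate and that a supporting dimension index is missing; all object names here are placeholders, not taken from the real schema:

      -- Take the nonclustered index offline so the UPDATE doesn't maintain it
      -- row by row. The clustered columnstore index must stay in place:
      -- disabling it would make the whole table inaccessible.
      ALTER INDEX IX_Fact_Associate ON dbo.Fact_table DISABLE;

      -- ... run the UPDATE here ...

      -- REBUILD re-enables the index in one bulk pass.
      ALTER INDEX IX_Fact_Associate ON dbo.Fact_table REBUILD;

      -- A supporting dimension index of the kind suggested above:
      CREATE NONCLUSTERED INDEX IX_dimentions_Associate_key
         ON dbo.dimentions (Associate_key)
         INCLUDE (Associateid);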

  • Updates to a columnstore are not a good idea. Columnstore handles inserts just fine, but updates and deletes slow down access to the entire table. Columnstores were made to be populated and read; they should not be used for OLTP workloads.

    If you must do regular updates/deletes, change the table to use a rowstore clustered index or heap, and then create a nonclustered columnstore index for your reporting (see the sketch below).
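
    A rough sketch of that conversion; the index names and the choice of clustering key are assumptions for illustration, not taken from your schema:

      -- Drop the clustered columnstore (assumed name) and re-cluster as rowstore.
      DROP INDEX CCI_Fact_table ON dbo.Fact_table;

      CREATE CLUSTERED INDEX CIX_Fact_table
         ON dbo.Fact_table (Local_Date_Key, Associate_Profile_key);

      -- Reporting queries can still get batch-mode columnstore scans from an
      -- updatable nonclustered columnstore index (SQL Server 2016+).
      CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Fact_table
         ON dbo.Fact_table (Local_Date_Key, Associate_Profile_key,
                            Record_update_date, Record_Updated_By);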

    If the updates/deletes are sporadic, select the updated data into a new columnstore table, then swap the old table with the new one (sketched below). I once had to run an update against a 100-billion-row table. I ended up using this approach, and each day's slice (100 million records) took only about 90 seconds to copy over.
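
    A hedged sketch of the copy-and-swap, assuming a staging table dbo.Fact_table_new already exists with the same 120-column definition and its own clustered columnstore index. Only the changed columns are spelled out, and the joins mirror the logic of the original UPDATE:

      -- Apply the "update" while copying: rows with a replacement key get the
      -- new values, everything else passes through unchanged.
      INSERT INTO dbo.Fact_table_new WITH (TABLOCK)   -- TABLOCK enables bulk-load optimizations
             ([Local_Date_Key], Associate_Profile_key,
              Record_update_date, Record_Updated_By /* ...remaining columns... */)
      SELECT f.[Local_Date_Key],
             COALESCE(ap2.Associate_key, f.Associate_Profile_key),
             CASE WHEN ap2.Associate_key IS NULL
                  THEN f.Record_update_date ELSE SYSDATETIME() END,
             CASE WHEN ap2.Associate_key IS NULL
                  THEN f.Record_Updated_By ELSE SUSER_NAME() END
             /* ...remaining columns selected as f.<column>... */
      FROM dbo.Fact_table AS f
      LEFT JOIN dbo.dimentions AS ap1
             ON f.Associate_Profile_key = ap1.Associate_key
      LEFT JOIN dbo.dimentions AS ap2
             ON ap1.Associateid = ap2.Associateid
            AND f.Associate_Profile_key <> ap2.Associate_key;

      -- Swap the tables with metadata-only renames (near-instant, brief schema lock).
      BEGIN TRAN;
      EXEC sp_rename 'dbo.Fact_table', 'Fact_table_old';
      EXEC sp_rename 'dbo.Fact_table_new', 'Fact_table';
      COMMIT;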

    Wes
    (A solid design is always preferable to a creative workaround)

  • Thanks a lot for your suggestion.

    Can you please elaborate more on this: "If the updates/deletes are sporadic, select the updated data into a new column store table, then swap the old table with the new one"? I need to run this update query every day.
    This is my update query; please refer to my first post for the complete query.

      UPDATE c
         SET c.Associate_Profile_key = AP.Associate_key,
             c.Record_update_date = SYSDATETIME(),
             c.Record_Updated_By = SUSER_NAME()
      FROM cte AS c
      JOIN dimentions AS AP
         ON c.Associateid = AP.Associateid
         AND c.[Associate_Profile_key] <> AP.Associate_key;

  • I would consider updating once daily to be sporadic rather than a traditional OLTP model. It is feasible to create a process that handles updating the data and then cleans up the indexes. However, handling updates for large volumes of data is not something that can be easily addressed through a forum post. You need to educate yourself on how data is stored in SQL Server, logically and physically, and on the trade-offs between the index types.

    Bottom line: you should not use a clustered columnstore index to store data that you may need to update or delete. Columnstore is tuned for high-speed inserts and reads only. That's it. If you're doing anything else with the data, you're causing yourself performance problems.

    I suggest you look into using a clustered rowstore index with a nonclustered columnstore index, as sketched in my earlier reply.

    Wes
    (A solid design is always preferable to a creative workaround)

  • ravi@sql - Monday, November 20, 2017 10:19 PM


    -- Reduce the query to its simplest form
    UPDATE f
       SET Associate_Profile_key = ap2.Associate_key,
           Record_update_date = SYSDATETIME(),
           Record_Updated_By = SUSER_NAME()
    FROM Fact_table AS f
    INNER JOIN dimentions AS ap1
       ON f.Associate_Profile_key = ap1.Associate_key
       AND ap1.local_date_key >= 20160101
    INNER JOIN dimentions AS ap2
       ON ap1.Associateid = ap2.Associateid
       AND f.Associate_Profile_key <> ap2.Associate_key;
    -- Examine the execution plan for obvious tuning opportunities...
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

