Splitting table into multiple tables

  • I have a requirement where we have a table of 1 billion rows, which is causing some performance issues.

    so we decided to split the table into multiple tables

    this is the table structure, there are no primary keys, constraints. There is just one clustered columnstore index

    name(varchar)  date(smalldatetime)  category(varchar)

    A                   2021-09-09                 Protein

    B                      2020-08-08               Dairy

    C                    2008-04-04                Fat

    D                   2009-09-09                Dairy

     

    so we want to split tables based on Category..For example Category which has protein goes to its own table.

    and we want to add a new column called datekey(int) and update this new column with the date column values. Because we noticed it gives us significant performance improvement with this datatype

    what do you think is the best approach to do this?

    Thanks

    • This topic was modified 3 years, 2 months ago by  SQLAddict01.
  • why is it causing performance issues? and what are those issues?

    could it be that instead of splitting the table you may need to partition it (if all your queries use the date field for example) or maybe you need other indexes added?

    and is your datetime composed of just date or date + time? if date + time a possible workaround for performance is to split into date and time components which behave better with columnstore columns

  • Performance issues with maintenance tasks like update stats takes forever

    This is SQL 2014, so we can have only 1 index on the table..

    I tried implementing partitionning in dev, but didn't see any significant performance improvement.

     

    This is just date.It is not capturing time.

  • SQLAddict01 wrote:

    This is SQL 2014, so we can have only 1 index on the table..

    Where have you heard this piece of misinformation???  A table can certainly have more than one index.  Unless it's a clustered index.

    How are you updating stats?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Yes that is a clustered column store index. I tried creating a regular nc index but it didn’t allow.

    I am updating stats using ola hallengren script

  • SQLAddict01 wrote:

    Yes that is a clustered column store index. I tried creating a regular nc index but it didn’t allow.

    I am updating stats using ola hallengren script

    I'm not sure what is preventing you from creating a non clustered index, Columnstore is not in my scope of understanding!

    How many statistics are on this table? Are you using the "Only modified stastistics" flag in the IndexOptimize proc?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • SQLAddict01 wrote:

    Performance issues with maintenance tasks like update stats takes forever

    This is SQL 2014, so we can have only 1 index on the table..

    I tried implementing partitionning in dev, but didn't see any significant performance improvement.

    This is just date.It is not capturing time.

    1 clustered index. 999 non-clustered indexes

    https://docs.microsoft.com/en-us/previous-versions/sql/2014/sql-server/maximum-capacity-specifications-for-sql-server?view=sql-server-2014

    • This reply was modified 3 years, 2 months ago by  homebrew01.
    • This reply was modified 3 years, 2 months ago by  homebrew01.
  • Yes I am using that option.

  • The version I am using is 2014. Here is the error I get.

    Msg 35303, Level 16, State 1, Line 5

    CREATE INDEX statement failed because a nonclustered index cannot be created on a table that has a clustered columnstore index. Consider replacing the clustered columnstore index with a nonclustered columnstore index.

  • you really need to consider upgrading your SQL to most recent one - 2014 is now on extended support for just another 3 years.

    OP is correct in that in SQL 2014 if you have a clustered columnstore index you can't add any more non clustered indexes - this was the biggest limitation on SQL 2014 of using these indexes.

    This post should have been placed on that forum instead of the 2019 for clarity.

    regarding the performance issues below - is it only maintenance that has issues or do you have query performance issues as well?

    and when you tried partitioning did query performance stay the same/improved or decreased? and how did you implement partitioning? eg. which column

    and when you say you didn't see significant improvement are you talking only with regards to the maintenance tasks? if so it may have been to lack of performing the rebuilds/reorgs at a partition level (needs to be specified on Ola scripts)

    SQLAddict01 wrote:

    Performance issues with maintenance tasks like update stats takes forever

    This is SQL 2014, so we can have only 1 index on the table..

    I tried implementing partitionning in dev, but didn't see any significant performance improvement.

    This is just date.It is not capturing time.

  • How do you search the table?  It's likely you would get better performance using partitioning based on how you search the table, esp. if you can partition / cluster first on date.  You would then only need to update stats on the newest partition.

    Do you have Enterprise Edition or Standard Edition? (I ask because Ent has data compression but Std does not; in SQL 2016, compression is available in Std.)

     

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The performance issues are with the maintenance tasks and also with the queries

    I partitioned on the date column, and in a dev region and ran couple of queries and compared the results. the performance seems to be the same.

  • We use enterprise edition.

    Yes I think partitioning would be helpful for stats to run fast, but I did not see any performance improvements in the queries.

  • Again, how do you search the table?  That is, what are the common WHERE clauses that you use on the table?  That is what should control the clustering.

    If you (almost) always include a specific column in the search criteria, then cluster first on that column.  For example, if you (almost) always specify date and fairly often specify name, then cluster on ( date, name ).  If date and category, then ( date, category ).  Finally, if by chance category is by far the most common and then date, then cluster on (category, date ).

    You might also consider testing page compression vs. columnstore, esp. in 2014.  You won't get as much data compression, but you still might get better overall performance results.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • homebrew01 wrote:

    SQLAddict01 wrote:

    Performance issues with maintenance tasks like update stats takes forever

    This is SQL 2014, so we can have only 1 index on the table..

    I tried implementing partitionning in dev, but didn't see any significant performance improvement.

    This is just date.It is not capturing time.

    1 clustered index. 999 non-clustered indexes

    https://docs.microsoft.com/en-us/previous-versions/sql/2014/sql-server/maximum-capacity-specifications-for-sql-server?view=sql-server-2014

    That's not for ColumnStore in 2014, though.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply