Best Practice vs Best Performance of Splitting a Table

  • Due to the nature of a specific table and the needs of different development groups we need to split a table into two separate but closely related tables. Without getting too indepth on the details of this the basic question came up. Is it best to use the identity key of the primary (or base) table as the primary unique foreign key of its secondary table?  Or would it be better to create an identity key within the secondary table and reference the identity key of the primary table as only a unique foreign key?

    Thoughts?

     

  • If 1-to-1, I can't think of any reason not to use the identity key of the original/primary table as the primary key and foreign key reference in the secondary table.

  • IF the split means that you can "reduce" the number of rows in one of the tables due to "duplicate information" then you should do it if savings are significant - and on this case new Identity and FK on the other table.

    if  its not possible to do the above, then do as ratbak mentioned as it will make life easier to everyone.

  • frederico_fonseca wrote:

    IF the split means that you can "reduce" the number of rows in one of the tables due to "duplicate information" then you should do it if savings are significant - and on this case new Identity and FK on the other table.

    Okay I cannot say I understand this comment, my understanding is that when you split a table you are removing columns from the primary table and placing them within the secondary table and perhaps adding additional columns to the secondary. I am not understanding per se how you would reduce duplicate information.

    The following link demonstrates doing this where there becomes a 1-to-many relationship (and perhaps this is what you meant above) between the primary and secondary table.

    https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fhelp.red-gate.com%2Fhelp%2FSQLRefactor1%2F0%2Fen%2FTopics%2FSRF_SplitTable.html%23%3A~%3Atext%3DTo%2520split%2520a%2520table%253A%25201%2520On%2520the%2520SQL%2Cbuttons.%2520...%25206%2520Click%2520Next.%2520...%2520More%2520items&data=05%7C01%7Cdjensen%40triafed.com%7Cfc02580331d346fe7d5208db87120344%7C448649d2625b40eb9d855e1e14d01f6c%7C0%7C0%7C638252283684094383%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EZisFnoBxCKJfRz97tHxGudJ%2FC%2B0ZD%2B8Kvk7rJmNHgU%3D&reserved=0

    However, in our instance there will only be a 1-to-1 relationship between the primary and secondary table as the reason for the split is due to business functionality/clarity rather than expanding the data into a 1-to-many situation. So far there does not seem to be a reason, for this case, to create an separate identity column in the secondary table.

    However, if anyone has content to the contrary please do share. As searching out this question on the internet was less then robust in its provision.

  • This was removed by the editor as SPAM

  • What performance problem are you trying to solve by splitting the table? Reads are likely to go up, not down (depending on quite a few details). Writes may go down, some, but may also go up. Where, & what, are the performance problems that make creating a 1-1 relationship look attractive? If there are lots of NULL values, look to SPARSE as a way to deal with that. Otherwise, if your data is already appropriately normalized (or you have a well defined star schema), you really shouldn't be splitting columns off the table.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Because of the row size limitation in SQL Server, I had to split the columns of a data warehouse table. The combined size of the rows exceeded the permitted limit.

    To resolve this problem, we divided the columns between two tables. The primary table contained the more frequently accessed columns, while the secondary table stored the less frequently accessed columns.

    Both tables were indexed using the identity column from the primary table, which was also copied to the secondary table.

  • I am with Grant - I don't see any reason to be splitting a table.  No matter what - all development groups will still need access to the original table to at least create the row.  Moving additional columns to a separate table doesn't really make sense at that point.

    If there is a need to reduce the number of columns that some development groups need to be able to see, then just create a view with only those columns.  If there is a requirement for those other development groups to be able to insert/update/delete without having to view the original table - then maybe you can use an instead of trigger on the view.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    I am with Grant - I don't see any reason to be splitting a table.  No matter what - all development groups will still need access to the original table to at least create the row.  Moving additional columns to a separate table doesn't really make sense at that point.

    If there is a need to reduce the number of columns that some development groups need to be able to see, then just create a view with only those columns.  If there is a requirement for those other development groups to be able to insert/update/delete without having to view the original table - then maybe you can use an instead of trigger on the view.

    In cases where the row size exceeds the limit, it may be necessary to divide the rows into two separate tables. Additionally, even if you approach the row size limit and perform updates, the occurrence of page splits can significantly slow down the process, potentially bringing it to a near standstill.

  • Jonathan AC Roberts wrote:

    Because of the row size limitation in SQL Server, I had to split the columns of a data warehouse table. The combined size of the rows exceeded the permitted limit.

    To resolve this problem, we divided the columns between two tables. The primary table contained the more frequently accessed columns, while the secondary table stored the less frequently accessed columns.

    Both tables were indexed using the identity column from the primary table, which was also copied to the secondary table.

    Ah, OK. That's different.

    At that point... the paranoid person in me says give the table its own key and a foreign key back to the main table. The reason I say that is, if you're splitting it away, chances are, other structural changes may occur and you may see different access paths to the data than just through that original PK. Fewer structural changes down the road then.

    However, honestly, from a performance standpoint, you'd be fine with having the same key. Just, do put the FK in place because it will drive choices by the optimizer.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant I am with your first statement but ours was not a database related issue but more of a business application issue. The splitting was decided as being the best method to keep the fox out of the hen house so-to-speak.

    Basically the primary table is a purely historical table and is only ever appended to, while the secondary table which is directly related to the historical table will have columns that are updated now and again. So while it does not improve anything database wise it does help reduce issues on the application side of the house as it keeps the secondary process (owned by group B) from alterring records within the primary purely historical table (owned by group A).

    Further it is by far are our biggest table and we are probably going to have to fracture it some how as things move forward, where upon we will most likely have to address the secondary process as the primary process would have only minor issues (if any) when it comes time for that.

    Lastly thank you all for the feedback, it was greatly appreciated.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply