Best Practice vs Best Performance of Splitting a Table

Question

Best Practice vs Best Performance of Splitting a Table

Dennis Jensen

SSChasing Mays

Points: 606
More actions
July 17, 2023 at 5:27 pm

#4237419

Due to the nature of a specific table and the needs of different development groups we need to split a table into two separate but closely related tables. Without getting too indepth on the details of this the basic question came up. Is it best to use the identity key of the primary (or base) table as the primary unique foreign key of its secondary table? Or would it be better to create an identity key within the secondary table and reference the identity key of the primary table as only a unique foreign key?
Thoughts?
July 18, 2023 at 12:37 pm

This was removed by the editor as SPAM

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

ratbak Ten Centuries Points: 1381 More actions · Answer 1

If 1-to-1, I can't think of any reason not to use the identity key of the original/primary table as the primary key and foreign key reference in the secondary table.

frederico_fonseca SSCoach Points: 16167 More actions · Answer 2

IF the split means that you can "reduce" the number of rows in one of the tables due to "duplicate information" then you should do it if savings are significant - and on this case new Identity and FK on the other table.

if its not possible to do the above, then do as ratbak mentioned as it will make life easier to everyone.

Dennis Jensen SSChasing Mays Points: 606 More actions · Answer 3

frederico_fonseca wrote:

IF the split means that you can "reduce" the number of rows in one of the tables due to "duplicate information" then you should do it if savings are significant - and on this case new Identity and FK on the other table.

Okay I cannot say I understand this comment, my understanding is that when you split a table you are removing columns from the primary table and placing them within the secondary table and perhaps adding additional columns to the secondary. I am not understanding per se how you would reduce duplicate information.

The following link demonstrates doing this where there becomes a 1-to-many relationship (and perhaps this is what you meant above) between the primary and secondary table.

https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fhelp.red-gate.com%2Fhelp%2FSQLRefactor1%2F0%2Fen%2FTopics%2FSRF_SplitTable.html%23%3A~%3Atext%3DTo%2520split%2520a%2520table%253A%25201%2520On%2520the%2520SQL%2Cbuttons.%2520...%25206%2520Click%2520Next.%2520...%2520More%2520items&data=05%7C01%7Cdjensen%40triafed.com%7Cfc02580331d346fe7d5208db87120344%7C448649d2625b40eb9d855e1e14d01f6c%7C0%7C0%7C638252283684094383%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EZisFnoBxCKJfRz97tHxGudJ%2FC%2B0ZD%2B8Kvk7rJmNHgU%3D&reserved=0

However, in our instance there will only be a 1-to-1 relationship between the primary and secondary table as the reason for the split is due to business functionality/clarity rather than expanding the data into a 1-to-many situation. So far there does not seem to be a reason, for this case, to create an separate identity column in the secondary table.

However, if anyone has content to the contrary please do share. As searching out this question on the internet was less then robust in its provision.

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 4

What performance problem are you trying to solve by splitting the table? Reads are likely to go up, not down (depending on quite a few details). Writes may go down, some, but may also go up. Where, & what, are the performance problems that make creating a 1-1 relationship look attractive? If there are lots of NULL values, look to SPARSE as a way to deal with that. Otherwise, if your data is already appropriately normalized (or you have a well defined star schema), you really shouldn't be splitting columns off the table.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jonathan AC Roberts SSCoach Points: 18361 More actions · Answer 5

Because of the row size limitation in SQL Server, I had to split the columns of a data warehouse table. The combined size of the rows exceeded the permitted limit.

To resolve this problem, we divided the columns between two tables. The primary table contained the more frequently accessed columns, while the secondary table stored the less frequently accessed columns.

Both tables were indexed using the identity column from the primary table, which was also copied to the secondary table.

Jeffrey Williams SSC Guru Points: 90304 More actions · Answer 6

I am with Grant - I don't see any reason to be splitting a table. No matter what - all development groups will still need access to the original table to at least create the row. Moving additional columns to a separate table doesn't really make sense at that point.

If there is a need to reduce the number of columns that some development groups need to be able to see, then just create a view with only those columns. If there is a requirement for those other development groups to be able to insert/update/delete without having to view the original table - then maybe you can use an instead of trigger on the view.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Jonathan AC Roberts SSCoach Points: 18361 More actions · Answer 7

Jeffrey Williams wrote:

I am with Grant - I don't see any reason to be splitting a table. No matter what - all development groups will still need access to the original table to at least create the row. Moving additional columns to a separate table doesn't really make sense at that point.
If there is a need to reduce the number of columns that some development groups need to be able to see, then just create a view with only those columns. If there is a requirement for those other development groups to be able to insert/update/delete without having to view the original table - then maybe you can use an instead of trigger on the view.

In cases where the row size exceeds the limit, it may be necessary to divide the rows into two separate tables. Additionally, even if you approach the row size limit and perform updates, the occurrence of page splits can significantly slow down the process, potentially bringing it to a near standstill.

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 8

Jonathan AC Roberts wrote:

Because of the row size limitation in SQL Server, I had to split the columns of a data warehouse table. The combined size of the rows exceeded the permitted limit.
To resolve this problem, we divided the columns between two tables. The primary table contained the more frequently accessed columns, while the secondary table stored the less frequently accessed columns.
Both tables were indexed using the identity column from the primary table, which was also copied to the secondary table.

Ah, OK. That's different.

At that point... the paranoid person in me says give the table its own key and a foreign key back to the main table. The reason I say that is, if you're splitting it away, chances are, other structural changes may occur and you may see different access paths to the data than just through that original PK. Fewer structural changes down the road then.

However, honestly, from a performance standpoint, you'd be fine with having the same key. Just, do put the FK in place because it will drive choices by the optimizer.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Dennis Jensen SSChasing Mays Points: 606 More actions · Answer 9

Grant I am with your first statement but ours was not a database related issue but more of a business application issue. The splitting was decided as being the best method to keep the fox out of the hen house so-to-speak.

Basically the primary table is a purely historical table and is only ever appended to, while the secondary table which is directly related to the historical table will have columns that are updated now and again. So while it does not improve anything database wise it does help reduce issues on the application side of the house as it keeps the secondary process (owned by group B) from alterring records within the primary purely historical table (owned by group A).

Further it is by far are our biggest table and we are probably going to have to fracture it some how as things move forward, where upon we will most likely have to address the secondary process as the primary process would have only minor issues (if any) when it comes time for that.

Lastly thank you all for the feedback, it was greatly appreciated.

This reply was modified 1 year, 5 months ago by Dennis Jensen.