October 14, 2024 at 5:59 am
Hi All,
I am currently testing the Table Partitioning to implement in SQL server 2022. We have daily data coming to staging table that needs to be loaded to main table. The data load is currently by agent job runs every 6 hours. Lets say we have to load data daily 3 to 6 times when new data comes in.
Also, we are deleting the data completely from database older than 90 days.
I have tested monthly partition for current year. In my test, I understand that Switch data from staging table to Partition table the partition must be empty. Since we always have data in current month or day.
Can anyone suggest, how to achieve this?
October 14, 2024 at 5:56 pm
-- Find the first non-empty partition ready for switch
SELECT @firstpartition = MIN(p.partition_number)
FROM sys.partitions p
JOIN sys.tables t ON t.object_id = p.object_id
WHERE t.name = 'table_name'
AND p.rows > 0;
--Switch
SET @CMD = N'ALTER TABLE [dbo].[table_name] SWITCH PARTITION ' + CAST(@firstpartition AS NVARCHAR(20)) + ' TO [dbo].[table_name]';
--cleanup
Prepare the STG table for next switch by truncation or dropping the tabel
Dynamic SQL is your friend every step of the way
October 14, 2024 at 10:06 pm
What do you want to achieve by partitioning the table? Monthly partitions help manage large tables and would allow you to truncate a historical partition at the end of a month, but they won't allow switching in from stage if you have multiple batches per day for the same partition(s).
If your goal is to avoid large inserts and your batches contain a timestamp that does not overlap other batches you could create a new partition for every load, but it's a fair bit of work for a very specific gain. You could identify the max timestamp from the existing data and use it to split the range of the partition function and create an empty partition. If you use the appropriate range direction (left?) it will avoid data movement. I think you would need to do this before importing the batch to the stage table because it's easier if the stage table is on the same partition scheme. When you remove the data that's older than 90 days it would make sense to truncate the partitions and then merge them so the total number of partitions remains fairly constant.
Any indexes on the table would need to be partition aligned, and created on the stage table, to allow switching. This makes them less useful unless you can take advantage of partition elimination. I would be clear what you hope to achieve before deciding to use partitioning.
October 15, 2024 at 5:26 am
Thanks very much ED, this give very good starting point for me to try and test.
I have test with range right. The main aim is to delete old data that take more time than insert. I have update statement to the non partition column. After the batch load will it cause or make any delay in update statement after partitioning. I do not think so, just wanted to clarify that as well.
If you have any readymade script, please share it.
Thank again for your time.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply