November 11, 2024 at 5:05 pm
I’m working with a SQL Server database that supports a multi-tenant application. We have row-level security based on a TENANTID column, which restricts data access by tenant.
Here’s the setup:
All tables include an identity column named ID that is currently the first key in the clustered index (defined as (ID, TENANTID)).
The tables are partitioned by TENANTID using a partition scheme (TenantIDPartitionScheme) that maps partitions across multiple filegroups.
The ID column is not used as a foreign key in any other table and isn't referenced by other tables.
I’m considering whether the clustered index should instead be ordered as (TENANTID, ID) to better align with row-level security and the partitioning strategy, as queries frequently/always filter by TENANTID. My goals are to optimise performance for tenant-based data retrieval, reduce fragmentation, and improve maintenance efficiency.
Specific Questions:
Any insights into the best approach for indexing and partitioning in this multi-tenant, row-level security scenario would be greatly appreciated!
November 11, 2024 at 9:07 pm
If the queries tend to be by TENANTID -- and I would expect that they are -- then I would definitely uniquely cluster by ( TENANTID, ID ). If lookups are done by ID only, you could a nonclus index to cover that. Thus, from my standpoint:
(1) Yes.
(3) No.
(2) is somewhat more involved to analyze. But generally you partition only to be able to easily work with individual partitions. With the table properly clustered (i.e. by ( TENANTID, ID )), you could easily find that you don't need to partition at all.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 12, 2024 at 6:24 pm
Thank you, Scott, for the helpful advice! I’d like to dive a bit deeper into point (2) to better understand how partitioning impacts performance in a multi-tenant setup. I’m new to working with both row-level security and partitioning in the same database.
Based on your experience:
Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?
If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?
I believe the original implementers may have chosen ID as the leading column in the clustered index because, in general, the most selective columns are usually placed at the start of an index. However, in our multi-tenant setup with row-level security, it seems that making TENANTID the leading column might better serve our needs. Since TENANTID is frequently used for filtering (due to row-level security) and aligns with the partitioning scheme, having it as the first column in the clustered index could improve performance by allowing SQL Server to target relevant partitions directly for tenant-based queries. Additionally, since the clustered index on (TENANTID, ID) remains unique, we wouldn’t lose selectivity with this change.
November 12, 2024 at 6:54 pm
I'll address your qs/ concerns as (A) ["Does partitioning ..."], (B) and (C):
(A) Yes, absolutely. You only need to partition then for other aspects, such as manageability.
(B) In the vast majority of cases, the clus index alone is enough.
(C) You've got it exactly right. ( TENANTID, ID ) will be vastly better if most of your queries specify a WHERE on TENANTID. And, as you noted, it is important to include the ID so that all key values are already unique. I have one table clustered on 5 keys because data is always selected based on at least the first 3 and up to 5 of them (depending on the security level of the id doing the accessing).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 12, 2024 at 7:05 pm
Thank you for the detailed reply - it’s very helpful!
Just to clarify, the queries themselves don’t specify TENANTID in the WHERE clause. Instead, row-level security handles tenant filtering through
EXEC sp_set_session_context @key=N'TENANTID', @value=...
which restricts access based on the tenant ID set in the session context.
Given this setup, would you still expect (TENANTID, ID) to perform better than (ID, TENANTID), since the TENANTID filter isn’t explicitly in the query but is applied via the session context?
November 12, 2024 at 7:19 pm
Interesting. Look at the query plan and see if it is doing a SEEK as part of the clus index scan. I'll read up on the mechanics of row-level security too.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 12, 2024 at 7:32 pm
It's just a predicate / search, it's not an index SEEK.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 12, 2024 at 8:09 pm
if it is partitioned what matters is if it is has the partition operator on the plan or not - if it has then being partitioned is helping the query to reduce data even if it isn't the first column on the clustered index.
November 12, 2024 at 9:16 pm
Since it's not an index seek, partitioning should still help the query by reducing the number of pages it has to search. Once again, verify that by looking at the query plan. I would still definitely cluster the table by TENANTID first.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 12, 2024 at 10:49 pm
Thank you, Scott and Frederico, for the helpful advice!
I’ll take a closer look at the query plan tomorrow to verify whether it’s performing a seek or just a scan. I’ll also check if there’s a partition operator in the plan to confirm whether partitioning is effectively reducing the data scanned.
My plan is to change all indexes on our test database to use (TENANTID, ID) as the clustered index order and then run before-and-after load tests to compare performance. I’ll monitor the execution plans closely to see how the changes impact filtering and page scans.
Thanks again for your guidance.
November 19, 2024 at 11:49 pm
I've created a stored procedure that takes a table name as a parameter and checks the clustered index column order. If the order is ID, TENANTID, the procedure drops the primary key and recreates it as TENANTSID, ID. It also handles foreign keys: it drops any foreign keys in other tables that point to the table and recreates them with the reversed column order.
I ran this procedure on the tables and performed a quick test using SET STATISTICS IO, TIME ON with the following query:
exec sp_set_session_context @key = N'TENANTID', @value = 18
select * into #a from myTable
The results were remarkable:
Metric | Database A (TENANTID, ID) | Database B (ID, TENANTID) |
---|---|---|
Logical Reads | 51 | 40,950 |
CPU Time (ms) | 0 | 4,485 |
Elapsed Time (ms) | 17 | 5,392 |
Rows Retrieved | 984 | 984 |
If I can implement this change in production on all their tables, I’m optimistic it will significantly reduce the high CPU usage on the database.
November 19, 2024 at 11:56 pm
Nice.
Also, you don't have to change the FKs in other tables if you need one for just ID. Instead, just add a unqiue nonclus index on ( ID ); it can still be the PK if you prefer, it will just be NONCLUSTERED rather than clustered.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 20, 2024 at 12:01 am
Nice.
Also, you don't have to change the FKs in other tables if you need one for just ID. Instead, just add a unqiue nonclus index on ( ID ); it can still be the PK if you prefer, it will just be NONCLUSTERED rather than clustered.
All the FKs in this case are on (ID, TENANTID). I tried enabling them without reordering, but it failed. The only solution I found was to drop the FKs and recreate them in the opposite order. Fortunately, the stored procedure I wrote handles this automatically, so it doesn’t require any additional effort on my part.
November 20, 2024 at 1:05 am
Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?
If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?
Just to add my 2 cents...
I built one large partitioned table years back. It was based on the "Created_Date" and "ID" columns. I was also involved in a thread on SSC long ago about partitioning that I can no longer find. I do, however, remember the results. Changing the Clustered Index on the monolithic table added some good performance in both cases. Once converted to a partitioned table, performance for the same queries dropped. I don't remember the percentage but I remember being mildly shocked and then thinking that the trade offs for maintenance and the serious reduced backup times (I made the older partition file groups read only so that I only needed to backup active file groups, which consisted of only one month and the empty "next month").
In the thread I was talking about, the monolithic structure was faster than the partitioned version.
All that stuff you hear about "partition elimination" requires that you actually have the date range in the criteria of your code and that your partitions are based on whatever defines a partition.... which is exactly the way a proper clustered index works on a monolith table without the extra layer of overhead of having to go through the portioning function, etc.
To wit, partitioning is almost never for reasons of query performance. It's usually for index maintenance and backups an other similar things. It can also be used for security as in your multi-tenant scenario.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 20, 2024 at 1:58 am
Jonathan AC Roberts wrote:Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?
If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?
Just to add my 2 cents...
I built one large partitioned table years back. It was based on the "Created_Date" and "ID" columns. I was also involved in a thread on SSC long ago about partitioning that I can no longer find. I do, however, remember the results. Changing the Clustered Index on the monolithic table added some good performance in both cases. Once converted to a partitioned table, performance for the same queries dropped. I don't remember the percentage but I remember being mildly shocked and then thinking that the trade offs for maintenance and the serious reduced backup times (I made the older partition file groups read only so that I only needed to backup active file groups, which consisted of only one month and the empty "next month").
In the thread I was talking about, the monolithic structure was faster than the partitioned version.
All that stuff you hear about "partition elimination" requires that you actually have the date range in the criteria of your code and that your partitions are based on whatever defines a partition.... which is exactly the way a proper clustered index works on a monolith table without the extra layer of overhead of having to go through the portioning function, etc.
To wit, partitioning is almost never for reasons of query performance. It's usually for index maintenance and backups an other similar things. It can also be used for security as in your multi-tenant scenario.
I checked the partitioning, all the partitions are on the same file group [PRIMARY] so I think they are only logically partitioned.
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply