Best Clustered Index Configuration for Partitioned Multi-Tenant Table with RLS

Question

Best Clustered Index Configuration for Partitioned Multi-Tenant Table with RLS

Jonathan AC Roberts

SSCoach

Points: 18377
More actions
November 11, 2024 at 5:05 pm

#4480199
I’m working with a SQL Server database that supports a multi-tenant application. We have row-level security based on a TENANTID column, which restricts data access by tenant.
Here’s the setup:
All tables include an identity column named ID that is currently the first key in the clustered index (defined as (ID, TENANTID)).
The tables are partitioned by TENANTID using a partition scheme (TenantIDPartitionScheme) that maps partitions across multiple filegroups.
The ID column is not used as a foreign key in any other table and isn't referenced by other tables.
I’m considering whether the clustered index should instead be ordered as (TENANTID, ID) to better align with row-level security and the partitioning strategy, as queries frequently/always filter by TENANTID. My goals are to optimise performance for tenant-based data retrieval, reduce fragmentation, and improve maintenance efficiency.
Specific Questions:
1. Would changing the clustered index order to (TENANTID, ID) improve performance for this setup?
2. How much does the partition scheme on TENANTID alone help in query performance with the current clustered index (ID, TENANTID)?
3. Are there any potential downsides to switching the clustered index to (TENANTID, ID) with the existing partition scheme?
Any insights into the best approach for indexing and partitioning in this multi-tenant, row-level security scenario would be greatly appreciated!

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101126 More actions · Answer 1

If the queries tend to be by TENANTID -- and I would expect that they are -- then I would definitely uniquely cluster by ( TENANTID, ID ). If lookups are done by ID only, you could a nonclus index to cover that. Thus, from my standpoint:

(1) Yes.

(3) No.

(2) is somewhat more involved to analyze. But generally you partition only to be able to easily work with individual partitions. With the table properly clustered (i.e. by ( TENANTID, ID )), you could easily find that you don't need to partition at all.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 2

Thank you, Scott, for the helpful advice! I’d like to dive a bit deeper into point (2) to better understand how partitioning impacts performance in a multi-tenant setup. I’m new to working with both row-level security and partitioning in the same database.

Based on your experience:

Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?

If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?

I believe the original implementers may have chosen ID as the leading column in the clustered index because, in general, the most selective columns are usually placed at the start of an index. However, in our multi-tenant setup with row-level security, it seems that making TENANTID the leading column might better serve our needs. Since TENANTID is frequently used for filtering (due to row-level security) and aligns with the partitioning scheme, having it as the first column in the clustered index could improve performance by allowing SQL Server to target relevant partitions directly for tenant-based queries. Additionally, since the clustered index on (TENANTID, ID) remains unique, we wouldn’t lose selectivity with this change.

ScottPletcher SSC Guru Points: 101126 More actions · Answer 3

I'll address your qs/ concerns as (A) ["Does partitioning ..."], (B) and (C):

(A) Yes, absolutely. You only need to partition then for other aspects, such as manageability.

(B) In the vast majority of cases, the clus index alone is enough.

(C) You've got it exactly right. ( TENANTID, ID ) will be vastly better if most of your queries specify a WHERE on TENANTID. And, as you noted, it is important to include the ID so that all key values are already unique. I have one table clustered on 5 keys because data is always selected based on at least the first 3 and up to 5 of them (depending on the security level of the id doing the accessing).

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 4

Thank you for the detailed reply - it’s very helpful!

Just to clarify, the queries themselves don’t specify TENANTID in the WHERE clause. Instead, row-level security handles tenant filtering through

EXEC sp_set_session_context @key=N'TENANTID', @value=...

which restricts access based on the tenant ID set in the session context.

Given this setup, would you still expect (TENANTID, ID) to perform better than (ID, TENANTID), since the TENANTID filter isn’t explicitly in the query but is applied via the session context?

ScottPletcher SSC Guru Points: 101126 More actions · Answer 5

Interesting. Look at the query plan and see if it is doing a SEEK as part of the clus index scan. I'll read up on the mechanics of row-level security too.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

ScottPletcher SSC Guru Points: 101126 More actions · Answer 6

It's just a predicate / search, it's not an index SEEK.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

frederico_fonseca SSCoach Points: 16167 More actions · Answer 7

if it is partitioned what matters is if it is has the partition operator on the plan or not - if it has then being partitioned is helping the query to reduce data even if it isn't the first column on the clustered index.

ScottPletcher SSC Guru Points: 101126 More actions · Answer 8

Since it's not an index seek, partitioning should still help the query by reducing the number of pages it has to search. Once again, verify that by looking at the query plan. I would still definitely cluster the table by TENANTID first.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 9

Thank you, Scott and Frederico, for the helpful advice!

I’ll take a closer look at the query plan tomorrow to verify whether it’s performing a seek or just a scan. I’ll also check if there’s a partition operator in the plan to confirm whether partitioning is effectively reducing the data scanned.

My plan is to change all indexes on our test database to use (TENANTID, ID) as the clustered index order and then run before-and-after load tests to compare performance. I’ll monitor the execution plans closely to see how the changes impact filtering and page scans.

Thanks again for your guidance.

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 10

I've created a stored procedure that takes a table name as a parameter and checks the clustered index column order. If the order is ID, TENANTID, the procedure drops the primary key and recreates it as TENANTSID, ID. It also handles foreign keys: it drops any foreign keys in other tables that point to the table and recreates them with the reversed column order.

I ran this procedure on the tables and performed a quick test using SET STATISTICS IO, TIME ON with the following query:

exec sp_set_session_context @key = N'TENANTID', @value = 18
select * into #a from myTable

The results were remarkable:

Metric	Database A (TENANTID, ID)	Database B (ID, TENANTID)
Logical Reads	51	40,950
CPU Time (ms)	0	4,485
Elapsed Time (ms)	17	5,392
Rows Retrieved	984	984

If I can implement this change in production on all their tables, I’m optimistic it will significantly reduce the high CPU usage on the database.

ScottPletcher SSC Guru Points: 101126 More actions · Answer 11

Nice.

Also, you don't have to change the FKs in other tables if you need one for just ID. Instead, just add a unqiue nonclus index on ( ID ); it can still be the PK if you prefer, it will just be NONCLUSTERED rather than clustered.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 12

ScottPletcher wrote:

Nice.
Also, you don't have to change the FKs in other tables if you need one for just ID. Instead, just add a unqiue nonclus index on ( ID ); it can still be the PK if you prefer, it will just be NONCLUSTERED rather than clustered.

All the FKs in this case are on (ID, TENANTID). I tried enabling them without reordering, but it failed. The only solution I found was to drop the FKs and recreate them in the opposite order. Fortunately, the stored procedure I wrote handles this automatically, so it doesn’t require any additional effort on my part.

Jeff Moden SSC Guru Points: 1004419 More actions · Answer 13

Jonathan AC Roberts wrote:

Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?
If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?

Just to add my 2 cents...

I built one large partitioned table years back. It was based on the "Created_Date" and "ID" columns. I was also involved in a thread on SSC long ago about partitioning that I can no longer find. I do, however, remember the results. Changing the Clustered Index on the monolithic table added some good performance in both cases. Once converted to a partitioned table, performance for the same queries dropped. I don't remember the percentage but I remember being mildly shocked and then thinking that the trade offs for maintenance and the serious reduced backup times (I made the older partition file groups read only so that I only needed to backup active file groups, which consisted of only one month and the empty "next month").

In the thread I was talking about, the monolithic structure was faster than the partitioned version.

All that stuff you hear about "partition elimination" requires that you actually have the date range in the criteria of your code and that your partitions are based on whatever defines a partition.... which is exactly the way a proper clustered index works on a monolith table without the extra layer of overhead of having to go through the portioning function, etc.

To wit, partitioning is almost never for reasons of query performance. It's usually for index maintenance and backups an other similar things. It can also be used for security as in your multi-tenant scenario.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18377 More actions · Answer 14

Jeff Moden wrote:

Jonathan AC Roberts wrote:
Does partitioning by TENANTID improve query performance when most queries are filtered by tenant? For instance, will partitioning reduce the amount of data SQL Server scans, or is it mainly useful for manageability, like simplifying maintenance and backups?
If I use (TENANTID, ID) as the clustered index, would partitioning still provide any additional performance benefits? Or would the clustered index be sufficient for efficiently handling tenant-filtered queries on its own?
Just to add my 2 cents...
I built one large partitioned table years back. It was based on the "Created_Date" and "ID" columns. I was also involved in a thread on SSC long ago about partitioning that I can no longer find. I do, however, remember the results. Changing the Clustered Index on the monolithic table added some good performance in both cases. Once converted to a partitioned table, performance for the same queries dropped. I don't remember the percentage but I remember being mildly shocked and then thinking that the trade offs for maintenance and the serious reduced backup times (I made the older partition file groups read only so that I only needed to backup active file groups, which consisted of only one month and the empty "next month").
In the thread I was talking about, the monolithic structure was faster than the partitioned version.
All that stuff you hear about "partition elimination" requires that you actually have the date range in the criteria of your code and that your partitions are based on whatever defines a partition.... which is exactly the way a proper clustered index works on a monolith table without the extra layer of overhead of having to go through the portioning function, etc.
To wit, partitioning is almost never for reasons of query performance. It's usually for index maintenance and backups an other similar things. It can also be used for security as in your multi-tenant scenario.

I checked the partitioning, all the partitions are on the same file group [PRIMARY] so I think they are only logically partitioned.