Hi all
Our BI team is using a lot of schemas on tables to appropriately manage the reporting audience.
What effect - if any - does creating a large number of schemas on a database have on performance, database capacity (space and size), etc.
Grateful for your comments please.
Thank you
Hirantha
January 10, 2022 at 2:56 pm
Define "a large number"?
Are you creating/allowing user-specific schemas?
A schema is an object like a table. SQL is limited to 2,147,483,647 objects. As long as you are several orders of magnitude below that, I wouldn't expect any perceptible performance impacts.
More importantly (for correctness more so than performance), make sure you are schema-qualifying all references, even to the default "dbo".
To add to what ratbak said, if you have a lot of schemas, you MAY not have a default set to dbo for a specific user. This is configurable, so you will want to make sure that everyone has an appropriate default set OR (the better approach) make sure to schema qualify all of your queries. No more doing "SELECT * FROM TABLE", but "SELECT * FROM SCHEMA.TABLE".
Other than that, we have not noticed any performance issues. We have an "applications" database (internally developed applications) and we use 1 schema per application. The data is semi-rated, so the application that owns the data gets the schema while applications that use it will either call it directly or will call a view with the name of the calling application. So if application A owns the data (handles the insert, update, and delete operations for the data), the schema would be A. If application B uses the data, it would USUALLY have a view in the schema B on the table in schema A pulling only the data it cares about. In the event 2 applications both need to modify the data, the table would get a "dbo" schema and two views would be created - one per application... usually. This is not a hard rule, just something we were trying out and seem to be happy with it so far.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
January 10, 2022 at 8:46 pm
Thank you so much for this, RatBak.
We are WAAAAAY below that 2,147,483,647 number. About 30 on this particular DB.
This server is a VM and the SAN is full only less than 10% contingency left. Plus the resources are quite stretched: Processes and RAM. Due to number of ETL jobs about 50 and running SQL 2012 Ent.
Hence my concern of any possible adverse impacts. Thank you once again.
January 10, 2022 at 8:49 pm
Thank you so much Brian Gale for this detailed description.
I believe the BI Team do address the queries as schema.table.
This is a reporting database created through a number of ETL processes. Not an operational database.
Thanks so much for the very descriptive answer.
Hirantha
January 11, 2022 at 5:47 pm
I would say give it a shot and see what happens.
Clone your system to a test/dev box and do some performance testing after cloning to get some baseline numbers. Then make your changes on a test system and do some performance comparisons.
My expectation is that you won't see much difference, but it is ALWAYS good to test things rather than just trust strangers on the internet. You may have some odd case where having multiple schemas will cause problems. It's like upgrading to SQL 2019 - for MOST people, things just got better and performance was good as before or better. BUT for some people, performance tanked after upgrading.
I can't think of any reason why it would perform worse by having more schemas, but it never hurts to test! Just make sure you are doing a 1:1 test. What I mean by that is if your dbo schema table has 10 indexes, your new schema version of the table must have those same 10 indexes. Otherwise your performance comparison will be inaccurate.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
January 20, 2022 at 6:11 am
This was removed by the editor as SPAM
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply