Creating a shared stored procedures accessing data from different schemas

  • We have a technical requirement to implement a multi tenant architecture where each of the tenants are separated using schema's. Our solution is to only allow access to the tables through stored procedures.

    How can we implement it in such as way that we have shared stored procedures accessing data from different schemas depending on the user who logins. Btw, we are excluding dynamic SQL as 1 of the solutions.

    Example:

    Table:

    Tenant1.TenantInfo

    Tenant2.TenantInfo

    Stored Procedure

    dbo.sp_retrieveTenantInfo

    1. When Tenant1 logins, it should retrieve the data from Tenant1.TenantInfo

    2. When Tenant2 logins, it should retrieve the data from Tenant2.TenantInfo

    Anyone who could advice me on this would be so much of a help.

    Thank you.

  • You can use SELECT CURRENT_USER and SYSTEM_USER

    You can use directly permission setting in SQL and select where you have access or no,

    or create some helping table with columns USER and TABLE and something else what you need and write yourself...

    I found some select for first option

    select permission_name, state_desc, type_desc, U.name, OBJECT_NAME(major_id)

    from sys.database_permissions P

    JOIN sys.tables T ON P.major_id = T.object_id

    JOIN sysusers U ON U.uid = P.grantee_principal_id

    and based on what you will choose, you will write some condition..

    Hope that will help

  • So you want a different schema for each tenant, a single procedure with the tenant as a parameter, and no dynamic SQL?

    Prepare to write lengthy IF ... ELSE IF ... structures, that you have to change everytime a tenant is added.

    You might want to rethink some of the requirements and boundary conditions.

    Also, please do not use sp_ as a prefix for your stored procedures.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Multi-tenant usually means there is some requirement for separation of data. If you have to do that, then you should look at using a source control system and separate databases for each client. Using schemas in the way you are attempting is not a good security implementation as they are not owned by database users, but are more equivalent to buckets of sql. As you have discovered already, you are writing all sorts of code to work the issue dynamically in the SQL layer when all that should be happening in the application layer. Eventually a client will leave and what then? How are you going to give them their data or cleanse it from the database without potentially breaking other clients?

    If there is no separation of data, then all you need is a Client table and and a client_id in related tables, and the application should do the rest.

  • Thanks WHug. 1 of the other considerations we went for schema is a additional resources that maintaining multiple database causes. This is what we have collectively gathered online, so that assumption we had is with multiple database, the resources required would be linear to the # of tenants.

    Is this true? Or maybe, overhead (additional resources in RAM and CPU) will having multiple databases entail?

  • Kenneth-1467 (2/29/2016)


    We have a technical requirement to implement a multi tenant architecture where each of the tenants are separated using schema's. Our solution is to only allow access to the tables through stored procedures.

    How can we implement it in such as way that we have shared stored procedures accessing data from different schemas depending on the user who logins. Btw, we are excluding dynamic SQL as 1 of the solutions.

    Example:

    Table:

    Tenant1.TenantInfo

    Tenant2.TenantInfo

    Stored Procedure

    dbo.sp_retrieveTenantInfo

    1. When Tenant1 logins, it should retrieve the data from Tenant1.TenantInfo

    2. When Tenant2 logins, it should retrieve the data from Tenant2.TenantInfo

    Anyone who could advice me on this would be so much of a help.

    Thank you.

    The idea of using stored procedures is absolutely the proper one. The idea of using a schema per tenant is going to be really painful. As Hugo pointed out, you'll need a lot of "IF" code and, at best, the will drive performance into the ground because the optimizer has problems with such parameter driven code execution.

    My recommendation would be to NOT use multiple schemas and build the tables as monoliths. There should be a tenant ID in each table and that would be passed as a parameter for criteria in every stored procedure.

    Even building separate databases and pushing code to all the databases would be easier and better performing. It would also allow for easy customization for each tenant without taking a chance on such customizations screwing up all the other clients for security or performance.

    The separate databases would also easily allow you to do restores by client instead of possibly taking the whole world out. Think of it as an easy way to do partitioning... and, no... I don't believe you want to go through the pains of proper portioning for any of this because that would have the same security problems as having common code with different schemas.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    I do have the same opinion with you guys that multiple databases are the most ideal. But this has been challenged by our architects that it would require more server resources and scaling the database to more tenants would be costly. What I'm having difficulty proving my case is a good comparison that shows us the resource utilization for both separated database and separated schema models will be identical or might be slightly higher but not linear?

    Thank you.

    Jeff Moden (3/1/2016)


    Kenneth-1467 (2/29/2016)


    We have a technical requirement to implement a multi tenant architecture where each of the tenants are separated using schema's. Our solution is to only allow access to the tables through stored procedures.

    How can we implement it in such as way that we have shared stored procedures accessing data from different schemas depending on the user who logins. Btw, we are excluding dynamic SQL as 1 of the solutions.

    Example:

    Table:

    Tenant1.TenantInfo

    Tenant2.TenantInfo

    Stored Procedure

    dbo.sp_retrieveTenantInfo

    1. When Tenant1 logins, it should retrieve the data from Tenant1.TenantInfo

    2. When Tenant2 logins, it should retrieve the data from Tenant2.TenantInfo

    Anyone who could advice me on this would be so much of a help.

    Thank you.

    The idea of using stored procedures is absolutely the proper one. The idea of using a schema per tenant is going to be really painful. As Hugo pointed out, you'll need a lot of "IF" code and, at best, the will drive performance into the ground because the optimizer has problems with such parameter driven code execution.

    My recommendation would be to NOT use multiple schemas and build the tables as monoliths. There should be a tenant ID in each table and that would be passed as a parameter for criteria in every stored procedure.

    Even building separate databases and pushing code to all the databases would be easier and better performing. It would also allow for easy customization for each tenant without taking a chance on such customizations screwing up all the other clients for security or performance.

  • Kenneth-1467 (3/1/2016)


    I do have the same opinion with you guys that multiple databases are the most ideal. But this has been challenged by our architects that it would require more server resources and scaling the database to more tenants would be costly. What I'm having difficulty proving my case is a good comparison that shows us the resource utilization for both separated database and separated schema models will be identical or might be slightly higher but not linear?

    If your architects make a wild claim, why does the burden of proof lie on you? Ask them to support their claim with links to trustwirthy websites or other credible sources.

    When you have hundreds or thousands of databases on a single instance, you will use some extra memory per open database - depending on usage patterns, the AUTO_CLOSE option might be useful.

    Opening the list of all databases in SSMS might take a while. Probably not going to be your biggest concern.

    You definitely want to script and automate all maintenance task, and standardize the databases. Create scripts to roll out a new database for a new tenant, and scripts to remove a database when a tenant cancels. And you'll probably want to learn powershell, or use one of the improved replacements for sp_MSForEachDB (do not use sp_MSForEachDP itself, it has some issues!!)

    The suggestion to use TenantID as part of the key in every table is also a good one. It simplifies management because you have a single database (or a few if you want to prepare for load-balancing once you're big enough to need multiple servers). But it makes security harder, that's something you get almost for free in the multi-database model.

    The schema-per-tenant model your architects suggest has neither of the benefits of either of these models. (In fact, the only sensible recommendation I have so far seen from your architects is to avoid dynamic SQL, but at the same time they impose something that can only be done by using dynamic SQL)


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • If you have multiple databases you would naturally have less data in each. Also, the common database with the IF statements and schemas will require more indexes.

    As far as quantity of databases...at some point you will need another app server, so at that point get another database server too. You can put 100's of databases on one server if needed. Its all about the hardware.

    The trick is to keep a good source control of each database. I also recommend using sql server data tools (SSDT) for all development work.

  • Using a schema per tenant seems to be a solution that is complicated for the sake of being complicated. As others have said: If you're unable to use Dynamic SQL (or at least advised against it), then you're not going to be able to access schemas dynamically without a lot of trouble.

    Here's an example of why the multi schema or multi database wouldn't work: If you needed to get some financial data for each tenant, then you'll have to start writing some impressively large statements with lots of UNIONs to collect data from each schema or database. With one database, gather the data in one query and then output it to your favourite reporting/data visualisation tool.

    The benefit of having multiple databases is again as others have said: data separation. Complexities will arise when tenants start wanting their own modifications, and then you get fracturing of code bases. It will happen. Whether it's allowed or not is another matter.

    The real question is though: Why do tenants need to be separated? If it's for security, fair enough - go with separate databases. If it's just so they are separated without any other reason, then one database to rule them all, so to speak.

    If you want to compare overhead, create one database containing 10 tables with 1,000,000 rows of data in each, and then create 10 databases with 10 tables with 100,000 rows of data in each table. Set up two agent jobs - one to rebuild indexes, update statistics and backup one database, and then another that does the same for the 10 databases. Make sure you output the data generated by the job to a file so you can review it. See how long each takes. Change more than 10% of the data, run the job again. Look at the size of the databases, how large each table is, how long it takes to get the data. Provide that data as some sort of baseline to your architects.

    SQL Server is pretty good at handling one large database with lots of users (Stackoverflow as an example), or lots of smaller databases with fewer users for each.

    My personal suggestion without having much information to go on: One database, well designed with referential integrity enforced through proper foreign key/unique constraints, good covering indexes, data accessed via stored procedures, sitting on the best drives you can afford/get access to. Use Ola Hallengrens scripts to simplify maintenance, and regularly review query plans to look for improvements that can be made to queries through re-writes or updating indexes (because things WILL change, and the database sometimes needs to change too).

  • I have been reading all the comments and yes, I am not one of the sharpest pencils in the packet when it comes to sql but I have an idea. Nearly everybody had the idea of an extra column for the tenant id and I fully agree. The other issue is security. I created an employee leave system once where the employees could log into the system and apply for or sell their leave. No-one was supposed to access each other's leave details so, I added a column on the employee table called loginuser. Each employee's user name was entered there and the stored procedure listed only the leave for the user that logged in. Easy as that and no major if statements and no dynamic sql. Unless a user allows someone to use their log in their data was safe.

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • How big is this system going to get? Are you going to run into scale-up issues eventually? If you know things are going to get huge, it's certainly easier to design for scale-out at the beginning vs. trying to retrofit later.

    There's also operational support to consider. It's pretty painful to be in a multi-tenant single server solution where you have 1 tenant causing all the tenants issues. (Like the inability to patch or resource constraints).

    I've supported a multi-tenant global architecture where due to the time-zone rolling we had patch scheduling issues and performing "nightly" maintenance for once customer impacted production hours for another.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply