Synopsis
Modern relational databases have comprehensive security mechanisms to enforce fine-grained access control for tables and views. But when it comes to stored procedures, access control is rudimentary: either you can execute a stored procedure, or you can't.
In this article, we show a more flexible approach, using a programmable proxy, that allows precise control of stored procedure invocations based on all available information, such as parameter values, returned values, and more. This approach may be of interest to anyone who needs more granularity and flexibility in managing access to their enterprise databases.
Fine-grained access control
Most enterprise databases provide elaborate security mechanisms to control who can do what to which data, down to the level of individual rows and columns. For instance:
- Microsoft SQL Server has row-level security and column-level grants
- IBM DB2 has Row and Column Access Control and a fairly elaborate Label-Based Access Control
- Oracle has Fine-Grained Access Control, Oracle Label Security, and a few other mechanisms
In all cases, the notion is the same: you should be able to specify, to a very granular level, which operations can be performed on what data by which users.
When it comes to stored procedures, though, all databases have a simple access/no access mechanism. Either you can execute a given stored procedure, or you can't.
Do we need fine-grained access control for stored procedures?
Many organizations use stored procedures heavily, some to the point where direct access to tables and views is completely blocked, and all data access must go through stored procedures.
The stored procedures themselves often have to provide complex access control as part of their implementation, even if the underlying data is itself protected by fine-grained access control. This makes the stored procedures more complex, more expensive to invoke, and more onerous to change.
For many people, that's just the cost of doing business, but in many cases a proxy-based approach is useful when:
- the constraints depend on external data which is difficult or impossible to access from SQL
- the stored procedures cannot easily be modified to reflect the requirements, for instance because they are part of a third-party package that you don't control
- you don't have privileged access to the database
- as an architect, you prefer to externalize fine-grained access control rather than embed it in the stored procedures
- you need to programmatically change some parameter values or redirect the invocation to a different stored procedure
Even if none of these apply to you, you may want to read on and see how this approach can open new avenues for your database systems.
What can we control with a proxy?
A proxy can control just about anything that goes between database servers and clients, but in this article we'll focus on stored procedures.
In that context, a programmable proxy can control three critical things:
- the invocation itself
- the values of the parameters being passed to the stored procedure
- the values or result set(s) being returned by the stored procedure
Controlling the invocation
The proxy can reject or modify the invocation of the stored procedure. This can be for a number of reasons:
- the parameter values are not acceptable
- the context is not right: the invocation comes from an unexpected address, the proxy has detected an unexpected pattern of behavior, etc...
- the invocation needs to be redirected to a different stored procedure, and the parameters adjusted accordingly
- the invocation needs to actually call more than one stored procedure, and combine the results
Controlling parameter values
The proxy can also execute logic on the parameters being passed by the client:
- parameter values can be logged or recorded
- the logic can verify that the parameters have acceptable values
- the logic can modify the values of these parameters
- the logic can reject the parameters, either with an error message, or silently
Controlling returned values and result sets
Once a stored procedure has executed, it may return some data, either as individual values, or as one or more result set.
Based on these values or result sets, a proxy can then:
- let everything flow back to the client
- stop the invocation and return an error to the client, for instance if the proxy determines that the client is not authorized to see a specific piece of data
- modify the values or result sets, hide them, remove rows and column values from result sets, etc...
- return null values or empty result sets if it determines that the client should not have access, but also should not be aware that it does not have access
What does it look like?
Adding a proxy to your database is typically a simple matter of spinning up one or more proxies, and directing the clients to the proxies. So instead of the usual connection:
you add the proxy in the middle, and start adding your logic in the proxy:
Let's look at a simple example. We'll use SQL Server as the database, and Gallium Data as the proxy.
Given a simplistic stored procedure:
CREATE PROCEDURE DEMO.CREATE_PRODUCT ( IN NAME VARCHAR(50), IN PRICE DECIMAL(10,2), IN TYPEID INT)
we want to implement the following requirements:
- only users in the MGMT group can create a product of type 98 or 99
- only products with type > 100 can have a price greater than $500
- products of type 16 and 17 must actually be created using the internal CREATE_SPECIAL_PRODUCT procedure
The first requirement is easy to satisfy with an RPC filter in Gallium Data:
let typeId = context.packet.parameters[2].value; if (typeId === 98 || typeId === 99) { let rs = context.mssqlutils.executeQuery("select is_member('MGMT') as res"); let isMember = rs.rows[0].res; if ( ! isMember) { context.result.errorMessage = "User is not a member of the MGMT group"; return; } }
This will cause the database client to receive an error if requirement 1 is not satisfied.
The second requirement is a straightforward extension:
let price = context.packet.parameters[1].value; if (price > 500 && typeId <= 100) { context.result.errorMessage = "Price is too high for this type of product"; return; }
The third requirement is even easier:
if (typeId === 16 || typeId === 17) { context.packet.procName = "CREATE_SPECIAL_PRODUCT"; }
Here, we assume that the CREATE_SPECIAL_PRODUCT procedure takes the same parameters as CREATE_PRODUCT, but of course our logic could change the parameters as required if that is not the case.
These are simplistic examples, but Gallium Data gives us full access to any JavaScript and Java libraries, so the sky is the limit.
Benefits of proxy-based access control
Having a programmable proxy between your clients and your database offers significant advantages:
1 - More flexibility in the logic
Although SQL stored procedures usually have a fair amount of latitude, they do have some limitations:
- libraries are limited, especially compared to e.g. JavaScript and Java
- creating or modifying a stored procedure requires privileged access to the database
- JavaScript is familiar to many more programmers than SQL procedural languages
- it is much easier to interface with other systems from Java or JavaScript than from SQL
2 - Separation of concerns
By implementing some or all access control in the proxy, you allow the stored procedures to focus on their core functionality. This also allows you to modify access control without touching the stored procedures.
3 - No changes required in the server or clients
This is one of the most attractive aspects of this approach: it can easily fit into an existing system with no disruption.
In fact, in cases where you cannot change the stored procedures, either because you don't have access, or because they are in a third-party package, a proxy may be your only choice.
4 - Reduce workload for the database server
Executing stored procedures is not free -- it takes processing power on the database. Offloading this type of access control to a proxy frees up cycles on the database servers, and the proxy can be scaled more easily than the database.
Downsides of proxy-based access control
Obviously this approach is not a panacea -- it's appropriate in some scenarios, but there are two main downsides:
1 - The system is more complex
Adding another component (the proxy) to a system makes, by definition, for a more complex system: you have to maintain it, secure it, monitor it, etc... In the case of Gallium Data, a proxy is a single Docker container, which minimizes this issue, but it's still one more thing to worry about. On the other hand, if you need to add access control, this additional complexity should be compared to the alternative.
2 - Slight latency penalty
Because a proxy, by definition, adds a network hop between database clients and servers, there is a penalty in latency, though it's usually of the order of 5% - 10%, depending of course on the complexity of the logic in the proxy. Again, this should be compared to the alternative, which might be even more expensive.
Conclusion
Clearly, this approach is not limited to stored procedures: any request (and response) can be controlled, modified or rejected by a programmable proxy.
Should you add a proxy to your database systems? That's entirely dependent on your particular circumstances, but this approach should at least be considered when it might be simpler than the alternative, such as adding masses of code to your stored procedures, or inserting a heavy middle tier.
In any case, having this type of proxy in your toolbox will give you more options when designing database systems.