Separate Reads

Question

Separate Reads

Steve Jones - SSC Editor

SSC Guru

Points: 736172
More actions
November 13, 2024 at 12:00 am

#4475140

Comments posted to this topic are about the item Separate Reads

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

David.Poole SSC Guru Points: 76000 More actions · Answer 1

When I was developing web sites using a CMS we had the CMS Authoring suite pointing at one DB and the web site consuming a subscribing DB, effectively separating reads and writes.

Authoring traffic was low but web traffic was very high (by late 1990s standards). It worked well enough but I am not 100% convinced it was needed. It was also where I cut my teeth with SQL Server replication so this wasn't as reliable as expected. I suspect my lack of experience with replication was a factor in the reliability problems.

In other areas I have seen "when this use that else use t'other" confuse developers, myself included. I think a solution is to abstract the separation using a library or API. That way the developer consumes an API end point and that API end point has one responsibility which will be either read or write but almost never both. That separation means that the underlying DB user can be assigned a role that has ONLY the permissions it needs, not one user with every permission the app needs.

LinkedIn Profile

Rod at work SSC-Dedicated Points: 34007 More actions · Answer 2

This is the first time I've ever read of such a concept. A read-only database connection and a regular database connection. I've never seen this done anywhere. And I certainly can see how people would get confused. Even if the read-only connection was opened with the concept of making it read-only, I can see some developer getting frustrated when they couldn't use that connection to write to the database, so they'd get into the code and change it so that it could write to the database, etc.

Interesting idea, though. I'm going to keep it in the back of my mind.

Kindest Regards, Rod Connect with me on LinkedIn.

kevin77 SSCarpal Tunnel Points: 4118 More actions · Answer 3

I'm surprised there aren't more replies, and I'm surprised by Rod's comment.

I have always considered this. But only optimize when necessary. A somewhat alternative approach that we use in our application, that is along the lines of what David mentioned, is that you can get similar benefits without relying on developer code, by using a cluster. While I'm not completely familiar with SQL Server Availability Groups, the concept that our application relies upon is that it must be deployed in a 3 or 5 node cluster. Of which, one node is designated as the "primary", and is the only node that supports database writes.

Users access the web application that sits behind a load balancer. Most of an end user's work requires only read access. For a few things that do need to be written, one of the replica web server nodes forwards the request to the primary node.

When an end user needs to do things that involve database writes, that user is going to be an administrator of some portion of the application and that user will know to have to log in to the primary node of the cluster in order to perform that administrative work.

That probably wasn't the best description. There are many more details. But hopefully just the idea of a cluster and load balancer is the point I want to make as a possible alternative to managing two separate types of connections in code.

Steve Jones - SSC Editor SSC Guru Points: 736172 More actions · Answer 4

I'm surprised as well. I'd have guessed lots of people would say "no, haven't tried" or "it's hard to separate".

I think of all the things apps read that rarely change, which could be separated with something like replication to other locales. I supposed some use Redis/new-OSS-version or some caching to handle things, but I'd even think some of the reporting/querying of less-used things could be easily handled by RO nodes.

Licensing certainly is a consideration as well, as another node can be $$$$

Rod at work SSC-Dedicated Points: 34007 More actions · Answer 5

@kevin77, your description of using a cluster is very interesting to me. I'm here to learn. So, given the description you said how many users does that cluster support? I'm reasonably certain we don't have anything like that here. Maybe our workload is small? Either that or what you're describing has never been considered. Anyway, thank you for sharing this is benefiting me. 🙂

Kindest Regards, Rod Connect with me on LinkedIn.

kevin77 SSCarpal Tunnel Points: 4118 More actions · Answer 6

Yeah, all good, Rod. I just meant I've read some of your posts before and think you're pretty knowledgeable and experienced.

I can try to give more details, but as you know, answering the question of, "how many users does that cluster support", is always an open ended question with an answer of "it depends". While the clustered architecture does provide some horizontal scaling, there are some aspects and components of the application that plateau after a certain number of nodes.

The application is intended for enterprise businesses to be hosted on-prem. It is not a publicly available web site or anything. So the total number of users of a cluster is often bound by the number of employees, typically in the thousands or tens of thousands. However, customers often integrate with the application's API, which counts as extra load, and they never do it efficiently.

The clustered architecture offers not only some performance benefits, but also the disaster recovery benefits. Our application takes care of setting up and managing the cluster, over a private Wireguard VPN between the nodes. So the customers don't have to worry about the administrative overhead of managing the cluster. For example, while it is still recommended to take regular backups, the chance of ever actually using one is slim, at least in the case of a hardware failure or something, because of the already existing redundancy of the cluster. With just a click of a button, you can have the cluster "failover" and designate a new primary node, if necessary.

Obviously this kind of architecture isn't for every application. But yeah, I wanted to throw it out there as a similar alternative to what Steve mentioned.

Rod at work SSC-Dedicated Points: 34007 More actions · Answer 7

Hi Kevin, thank you for your very kind words. 🙂

The clustering you're describing I think could really work for us, in some situations. It's good to know that clustering isn't for all situations. We have a third party application that is one of our vital apps. Recently, for reasons we haven't yet figured out, it is running slower than normal. The users are complaining about that. It is an Intranet application, with some external users around the state. I would estimate we have hundreds of users on a daily basis. Anyway, clustering sounds like it might help in situations like this. Now I'm going to show my ignorance here, but I want to learn. Let's say that this internal app has a URL of https://ourapp. How does a cluster distinguish between two endpoints with one URL?

Kindest Regards, Rod Connect with me on LinkedIn.

kevin77 SSCarpal Tunnel Points: 4118 More actions · Answer 8

Yeah, that is where the load balancer comes into play. Load balancers are a pretty popular combination with web servers. You can get software based load balancers, like nginx, or hardware based load balancers (or even just simple DNS based load balancing). Then, you'll setup your DNS such that you have one URL of https://widget.acme.com that points to an IP address, 10.6.166.10. This will be the URL and IP address of the load balancer, which you'll give to all your users. But typically you'll also setup something like, https://widget1.acme.com, https://widget2.acme.com, and https://widget3.acme.com to point to 10.6.166.11, 10.6.166.12, and 10.6.166.13, respectively. This will go "around" the load balancer, instead of "through" the load balancer to be able to connect directly to individual nodes of the cluster, typically for administrative purposes (or testing and debugging).

It can be a fair amount of work to design and setup/configure. But, good news is there are a lot of options and a lot of information available, and you can start off with just a small part. For example, load balancers, or more specifically a web application proxy like nginx or a Web Application Firewall (WAF), allow you to enter "rules" based on the incoming URL path pattern matching. So if you only want to load balance your reports, you could configure it such that all requests to https://widget.acme.com/reports/*, can go to nodes 2 and 3, while all other requests go to node 1.

Rod at work SSC-Dedicated Points: 34007 More actions · Answer 9

Thank you, Kevin, for the explanation of load balancers. I've never worked with one. Thinking about my work environment, there may very well be load balancers involved. I'm not the one who's set those up, but it's likely someone else has. Still, thank you for this explanation. I'm saving this discussion in an Outlook folder for future reference.

Kindest Regards, Rod Connect with me on LinkedIn.