I saw a great question on Twitter from Frank Pachot, a developer advocate of Yugabyte. He wrote: Without thinking how your preferred database deals with it, what do you expect if:
- session 1 starts to reads table T
- session 2 drops table T
- session 1 continues to read
The choices in his poll were: session 2 waits, session 2 fails, session 1 fails, both fail. My first thought was SQL Server and the default need for session 2 to get an exclusive lock. In that case, session 2 would wait. Most people answered that same way, but then Frank posted a follow-up with a link to his blog. The answer for Yugabyte is that session 1 fails as it gets the message that the table was deleted.
Leaving aside the decision to drop a table, imagine this is some schema change instead. In the blog, some good points are raised about how to handle high concurrency changes, and the potential problems with having session 2 wait. On a busy system, this could cause lots of blocking as threads stack up behind session 2.
It's an interesting read about the challenges of distributed system design and how to handle changes. In some sense, I get that this makes sense, but I wonder where this causes issues. If any schema change on the table by session 2 were to cause an error in session 1, that would be bad. However, does this mean that the database engine must now evaluate whether a column change impacts a query in flight? Then decide to send an error? What about evaluating views or procedures/functions that depend on
Does this mean that all nodes need to sync up the schema changes quickly, and at a higher priority than data movements? I don't know exactly how Yugabyte distributes data, and if there are copies on multiple nodes, but I assume there are. This adds complexity to the communication between nodes, which is likely needed. Honestly, if someone drops a table and they should have, we probably don't want clients getting results. If they do this accidentally, I'd like to know about it quickly.
The question is interesting, and there are multiple ways to look at this, but I found it fascinating to spend a few minutes thinking about the complexities of data in distributed systems and the challenges involved. This also made me think that the people who keep data safe and fix problems when they occur are invaluable in the modern world.