October 27, 2009 at 3:01 pm
The idea that SQL DBMS and SQL language is much less than adequate for current and future enterprise data management needs is pretty mainstream now (some examples: alternative cloud storage models, "persistence-ignorant" development tools, business rules engines, OLAP, XML stores). I've long been of the view that we badly need a permanent replacement for the SQL model in core DBMS software. I strongly believe that its replacement(s) should be Relational, rather than some of the more fashionable, non-relational alternatives.
The odd thing is there seems to be plenty of discussion about this topic outside the SQL database professional community but almost total silence within the SQL community (at least in the circles I move in)!
So I'm wondering: Do other people in this forum recognise a need to move beyond the limitations of the SQL paradigm? What do we see as its potential future or future replacement? Are we doing enough to influence future trends and products through thought leadership and by feeding back our views to the DBMS software vendors.
This is not just a Microsoft SQL Server question of course. It's a question about SQL DBMS software generally.
October 27, 2009 at 5:36 pm
Maybe there is an assumption within the community that the Next Great Leap will be an extension of the Relational Model. I'd certainly like to believe that. It would be nice to have a fully functional system, similar to the one Date and Darwen describe in The Third manifesto, even if I have to forget SQL and learn some flavor of D (D#?).
But I'm also afraid that reason doesn't drive the market and too often hype and hope become substitutes for critical thinking. The fact that we're still talking about the object-relational impedence mismatch is telling. The momentum is to appease the development community with tools that make relational constructs look more like objects.
So, we should probably take a lesson from those folks (Date, Darwen, Pascal, etc.) who have argued from positions of theoretical strength, and argue from a marketing viewpoint instead; going from a stand that says "I won't buy this unless it has true relational support" which the vendors know isn't true, to a stand that says "I'd buy more of this if it had true relational support", which will light up the fires of avarice and potentially provide us with better systems.
Sorry, having a completely cynical day. Please feel free to ignore. 🙂
October 27, 2009 at 7:09 pm
I'm game. I don't care what I program in. My issue, and it's been my issue for a while, is that the "replacements" for SQL don't seem to be attempting to solve the problems of SQL but rather try to ignore the problems of SQL. Too many of them have that little cloud in the middle of the equation that reads "and then a miracle occurs" instead of an answer that actually provides a mechanism of pulling data out of the relational model in sets.
I agree that most of the tools and solutions presented these days are oriented towards making relational look like objects, even if it's not. As long as that continues, I suspect we're stuck with SQL, for good or for ill.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 17, 2009 at 1:29 pm
I agree with the comments so far. However, I do wonder if we database professionals working with SQL are collectively partly to blame for the lack of momentum and new solutions. Too often it seems that database professionals appear to be defending the SQL status quo rather than standing up and acknowledging that "SQL is broken. We need something better".
For example I've attended many sessions over the years with the SQL Server product group where the feedback of SQL Server professionals was being sought by Microsoft. I can only recall one ocassion when someone other than myself raised the question of the need to break with the SQL past and move on to something new. Yet this is a regular topic of discussion among IT people whose job is not primarily writing SQL!
So if we do want to move on, what features ought we to wish for and be demanding from the DBMS systems of the future? My own wishlist would include:
A truly relational data model and language
General purpose integrity constraints with optimiser support
Strong typing and type inheritence
Self-optimising storage engine with more flexible physical storage options
November 17, 2009 at 2:09 pm
The problem I've seen is that every non-relational data engine I've tested has had worse problems than SQL RDBMSs do. Sure, some of them solve some of the problems with SQL DBs, but in every case I've seen, the problems they replace them with are much, much worse.
For example, XML data stores. Yeah, XML translates nicely into screen objects for Web apps. But the performance, etc., I've seen on those are tragically bad. And let's not ignore the issues with transactional integrity that they seem to run into.
For example, Intersystems' Cache, which they promote as a "post-relational database". Yes, its performance and data integrity are excellent. Zero object-relational impedence, since you can directly reference data as a property of an object while at the same time applying joins and such as if it were relational. At the same time, development in it of anything more complex than a phone book database is nightmarishly bad, referential integrity is "we don't need no stinking referential integrity", and just forget tables of lookup values.
Pure OODBs work very nicely with OO applications, but the moment you need to run reports against one or try to do any data mining in it, you can expect a relatively simple project to be something you finish a week before you retire.
I've dealt with a number of these technologies, and the problems I've seen have been at least as bad as RDBMSs, usually much worse.
The day I see something come along that's better than Oracle, SQL Server, DB2, et al, in ways that matter, and not worse in ways that matter more, I'll consider it a good evolution. But I haven't seen such yet.
As for silence on the subject: I see it come up a couple of times a year in various forums, etc. Never gets past the exact phenomenon I'm writing about.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 17, 2009 at 2:26 pm
GSquared (11/17/2009)
The problem I've seen is that every non-relational data engine I've tested has had worse problems than SQL RDBMSs do. Sure, some of them solve some of the problems with SQL DBs, but in every case I've seen, the problems they replace them with are much, much worse.
Agreed. I strongly believe SQL ought to be replaced by new Relational DBMSs and languages. The trouble is, people who recognise the advantages of the relational model have traditionally defended its problem child: SQL. Even though SQL DBMSs are not truly relational at all.
If this situation continues then for better or worse the non-relational lobby will win the day. No question. They will deserve to succeed because the advocates of the relational model have mostly not proposed better relational alternatives to SQL. True there are some who have (Date and Darwen most notably of course), but they are mostly not associated with the "mainstream" of the industry and they tend to come in for some harsh criticism from the SQL lobby who go on the defensive when anyone starts pointing to SQL's glaring flaws.
November 17, 2009 at 2:35 pm
Hold on David, I take exception to some things:
Too often it seems that database professionals appear to be defending the SQL status quo rather than standing up and acknowledging that "SQL is broken. We need something better".
SQL isn't broken, but that isn't to say that it solves EVERY need. What I see most is tweaks to existing features.
I can only recall one ocassion when someone other than myself raised the question of the need to break with the SQL past and move on to something new.
Like what? The statement "we need to change" is useless unless there is some alternative. I think most of us would welcome a discussion is less abstract terms, something new like......
My own wishlist would include:
A truly relational data model and language
As long as you don't lose the set based operations otherwise all you get is RBAR and quite frankly THAT is a step backward.
General purpose integrity constraints with optimiser support
Please elaborate, this sounds interesting..
Strong typing and type inheritence
Maybe..
Self-optimising storage engine with more flexible physical storage options
Ok, first part self-optimising, most try to do this now, sounds like a side discussion of the post about "shouldn't it just work". More flexible storage options, ok, sounds good, what have you got in mind?
I like to work in specifics, I think that the deadening silence from the DBA crowd is from the fact that nobody has really offered up anything really as an alternative.
You brought up cloud technologies, the use of those has appeal when what you want to do fits into what they provide, many of them are relational like Azure and the new Amazon DB service. I looked at Amazon Simple Db and found that it didn't fit ANY of my requirements. You can often find a technology that fills a very definable niche, some of the non-sql variants do this, but very few of them fill the role that SQL Servers do, so when someone comes up with something that does that, THEN I think there will be a discussion..
CEWII
November 17, 2009 at 3:33 pm
Elliott W (11/17/2009)
SQL isn't broken, but that isn't to say that it solves EVERY need. What I see most is tweaks to existing features.
I don't think tweaking or enhancing SQL is a viable way to proceed any more. Much is fundamentally wrong with SQL. For example the need to support duplicate rows, type coercian and three-value logic. If you fix those things then what you would be left with would not look much like SQL. Having gone that far, why even try to maintain the illusion that you are still using something called SQL? Especially given the more absurdly dated trappings of the SQL language, such as the byzantine complexities of the SELECT syntax.
I long since came to the conclusion that the only way forward for the long term is to break with SQL entirely. Perhaps we could keep SQL as an alternative data access language on top of our future RDBMSs for the sake of supporting legacy code, but for future purposes something better is needed.
Like what? The statement "we need to change" is useless unless there is some alternative.
Like an RDBMS based on Date and Darwen's "D" model. There are some open source versions already and some practical languages have been developed that go beyond their simple Tutorial D syntax. I'd welcome an industrial-grade implementation of an RDBMS based on D.
As long as you don't lose the set based operations otherwise all you get is RBAR and quite frankly THAT is a step backward.
Row-by-row processing is part of the wretched legacy of SQL. It just isn't an issue in a proper relational database model because everything is set-based.
Elliott W (11/17/2009)
General purpose integrity constraints with optimiser support
Please elaborate, this sounds interesting..
Support for declarative referential integrity and declarative constraints generally is very poor in SQL. One obvious consequence is that customers have to spend large sums on business rules engines, middle tier code and data quality tools to implement integrity rather than use the DBMS. For example try implementing a constraint that every order in the Order table must have at least one OrderItem. Or try to write a join dependency or inclusion dependency constraint in SQL. What's needed is the ability to write any query expression as a constraint and have the database engine not only enforce it properly but to optimise it and use it to enable query rewrites and therefore optimise other code.
Strong typing and type inheritence
Maybe..
Every other major programming language benefits from strong typing. No C++, Java or C# developer I know of would dream of doing without it. Yet SQL's type system is still in the stone age.
Self-optimising storage engine with more flexible physical storage options
The most important feature here is that we shouldn't have just one storage option per table. The idea that logical determines physical storage cripples database design. Why should I be forced to "denormalize" a table just to change the way that the data is stored on disc? That works against the whole notion of logical / physical independence. It is quite feasible to have a separate "mapping" tier for translating a single logical table or tables into any number of underlying physical storage structures.
November 17, 2009 at 7:03 pm
David Portas (11/17/2009)
Much is fundamentally wrong with SQL. For example the need to support duplicate rows, type coercian and three-value logic.
Well, I disagree that much is fundamentally wrong, so I don't see any point in debating that. You can have duplicate rows right now, they aren't very useful but you can have them.. Type coercion exists now, but not for everything, which it doesn't in other languages either. I find three-value logic only useful for booleans and it is available now, through the use of NULL, Yes, no, I don't know(NULL). So in my view the three things you mention are already handled, could they be better, sure, but that doesn't support your position..
I long since came to the conclusion that the only way forward for the long term is to break with SQL entirely. Perhaps we could keep SQL as an alternative data access language on top of our future RDBMSs for the sake of supporting legacy code, but for future purposes something better is needed.
I still haven't seen a compelling argument for this..
Like what? The statement "we need to change" is useless unless there is some alternative.
Like an RDBMS based on Date and Darwen's "D" model. There are some open source versions already and some practical languages have been developed that go beyond their simple Tutorial D syntax. I'd welcome an industrial-grade implementation of an RDBMS based on D.
I reviewed the "D" model just to be sure I understood what exactly we were discussing and overall I was unimpressed. The single item that impressed me was this phrase "Tutorial D is a computationally complete programming language with fully integrated database functionality." It is unfortunate that it is followed by "It is deliberately not meant to be industrial strength; rather, it is a toy language, whose principal purpose is to serve as a teaching vehicle."
As long as you don't lose the set based operations otherwise all you get is RBAR and quite frankly THAT is a step backward.
Row-by-row processing is part of the wretched legacy of SQL. It just isn't an issue in a proper relational database model because everything is set-based.
I have to say that this statement couldn't be MORE wrong, row-by-row is part of the wretched legacy of procedural languages that only offer the option of row-by-row processing and when those developers start writing SQL they tend to want to do things row-by-row because they haven't been taught to think in sets.
General purpose integrity constraints with optimiser support
Please elaborate, this sounds interesting..
Support for declarative referential integrity and declarative constraints generally is very poor in SQL. One obvious consequence is that customers have to spend large sums on business rules engines, middle tier code and data quality tools to implement integrity rather than use the DBMS. For example try implementing a constraint that every order in the Order table must have at least one OrderItem. Or try to write a join dependency or inclusion dependency constraint in SQL. What's needed is the ability to write any query expression as a constraint and have the database engine not only enforce it properly but to optimise it and use it to enable query rewrites and therefore optimise other code.
These sound like nice features, but not a compelling argument to pitch it all and start over.
Strong typing and type inheritence
Maybe..
Every other major programming language benefits from strong typing. No C++, Java or C# developer I know of would dream of doing without it. Yet SQL's type system is still in the stone age.
I do think that the type system in SQL could be enhanced and within reason you have the capability through things like SQLCLR. Perhaps we should all read this Wiki article. It seems that much of this is done during initial compilation. What would you like to see here?
Self-optimising storage engine with more flexible physical storage options
The most important feature here is that we shouldn't have just one storage option per table. The idea that logical determines physical storage cripples database design. Why should I be forced to "denormalize" a table just to change the way that the data is stored on disc? That works against the whole notion of logical / physical independence. It is quite feasible to have a separate "mapping" tier for translating a single logical table or tables into any number of underlying physical storage structures.
Now this is a point I like, I would like to be able to do this too, but I still don't see a whole new paradigm to do it..
Listen, I have tried to see it from your perspective and I gave it the benefit of the doubt, but until there is a compelling reason AND a viable alternative, all the hype in the world isn't going to change what we are doing.
CEWII
November 17, 2009 at 9:32 pm
Elliott W (11/17/2009)
So in my view the three things you mention are already handled, could they be better, sure, but that doesn't support your position..
Oh dear! I wasn't very clear maybe. My point was that duplicate rows, type coercian and three-value logic DO all exist in SQL. They are fundamental to the way SQL works but they are three major flaws that should be got rid of. I think the problems they cause are pretty well-known and extensively covered in textbooks.
November 18, 2009 at 7:11 am
Again, the problem with replacing SQL with a more "relational" database engine and language is that, despite numerous attempts to do so, nobody has been able to come up with one that actually solves any real-world issues without causing more problems than it solves.
For example, on the point of 3-value logic, there are very, very few situations in this world that can be answered with only "yes" or "no" and that don't allow for "not sure" or "unknown at this time". Even Codd recognized this, despite his early opposition to the concept of Null values in relational data.
The problems with allowing nulls are purely academic. The differences between academia and reality are even more well-documented than the problems with allowing three-value logic in a relational database.
Here's my challenge to you on this subject: Write up, in detail, the actual situations in real databases, where the problems you have sited with regards to SQL have stopped you from modeling a real-world situation into a database. Specifically, I want to know the exact situation where 3-value logic being possible in a database has made it more difficult to model a business (or other real entity) situation. Note that that solution you were forced to hack must be more complex than simply setting the column to not allow Nulls. I want to see the same for each of the other "fundamental problems" you are citing. These must not be hypothetical situations, nor academic exercises, they must be real situations that caused real problems to real people.
I've seen this same set of arguments forwarded dozens of times over the years. It's not "met with resounding silence", it's met with "well, what real world problem are you trying to solve", which is usually answered by "well, nothing, but it's just not right".
If you can prove, in the real world, that these "fundamental problems" really are fundamental, and neither academic, incidental, nor trivial, then I'll buy your argument. Thus far, in a decade of seeing your exact statements brought up dozens of times, I've never seen anyone do this yet.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 8:25 am
I didn't want this to become a discussion about nulls because if you fix the other issues (such as providing decent type support and a truly relational language) then the "null problem" can go away and you can get on just as well without them.
You are of course correct that nulls don't actually prevent you from achieving any desired results from the database because it is just about possible to avoid them anyway. You can wrap every SUM() in a COALESCE() function in case it returns a null. You can eliminate nulls from outer joins using CASE expressions. Even if you never use them however, they still have a demonstrable cost. A SQL database engine needs to acommodate nulls internally even if they are never actually returned in a query. That means that certain query rewrites and optimisations just aren't possible. So your DBMS is in fact much less effective than it could be just because SQL needs nulls (the same also applies to duplicate rows).
A compromise might be to have a switch that allows a user to turn off null support. That would have serious consequences for the SQL language however because it would have to either force errors or change the meaning of existing code. It would also leave customers asking why they should have to pay for software complexity that they don't actually use.
All information can be represented without nulls or anything like them and I don't recall that Codd ever said otherwise. Science, mathematics and logic were able to describe the real world without using any symbol like a null for thousands of years before SQL came along. They continue to do so today. In any case, SQL's model of null certainly doesn't match real world requirements. Where in the "real world" (or in common sense for that matter) is x = x anything other than a True proposition (whether x is known or not)? Where in the real world does the sum of nothing equal something other than Zero?
You asked for real world examples of problems that can't be solved by SQL. I've given one already: the ability to enforce a constraint that an Order must contain at least one Item in the item table. That's an example of a problem I've encountered many times professionally and is a well-known textbook limitation of SQL. This and other business rule type of problems have to be solved by procedural code or by other software outside the SQL DBMS - frequently at great expense. A RDBMS with more general constraint capabilities would eliminate or reduce that expense.
A number of RDBMSs and languages have been built and are being used. They include Dataphor, Muldis and various other open source efforts. I'm not sure what problems you are referring to with those systems when compared to SQL DBMSs. Maybe you could elaborate.
I'd like to know what alternatives you think there are. The deficiencies of SQL are a pressing and very topical problem for a great many people right now. That's why there is so much interest and investment in alternative non-relational models. How else do you propose to improve the capabilities of our DBMSs? Do you really think current problems can be solved within the SQL model and if so, how? Or do you propose using alternative (non-relational) data models instead?
November 18, 2009 at 8:45 am
All information can be represented without nulls or anything like them and I don't recall that Codd ever said otherwise. Science, mathematics and logic were able to describe the real world without using any symbol like a null for thousands of years before SQL came along. They continue to do so today. In any case, SQL's model of null certainly doesn't match real world requirements. Where in the "real world" (or in common sense for that matter) is x = x anything other than a True proposition (whether x is known or not)? Where in the real world does the sum of nothing equal something other than Zero?
Null is not "nothing". Null is "unknown value". Nothing is 0.
Keep in mind that the invention of the concept of 0 in mathematics took tens of thousands of years to develop, and revolutionized the whole subject radically.
And, in the real world of real science x != x if x is a real world object/particle/phenomenon. No two apples are equal to a physicist, chemist or biologist. Only in the world of mathematics are they equal, and that's a mental construct used as an abstraction of reality.
As for science not using "unknown value" (null) for thousands of years, you're just dead wrong about that. The whole basis of all higher mathematics is that you can have unknown values. They are called "variables" in math. If you have two that you can't resolve in a particular situation, you end up with "unknown value". Math simply assigns these to non-numeric symbols in order to be able to write them down. They've been in use for thousands of years. Different word, same concept.
On the question of, "Where in the real world does the sum of nothing equal something other than Zero?", you're asking the wrong question. What's the sum of 5 + unknown value? What's the average of 2, 30, and unknown? Null doesn't necessarily equal 0. Null means "unknown". You're using the wrong definition of the concept.
Outside of numeric situations, what would you put for my first, middle and last names in you contact database, if you had one? You could enter an alias, "GSquared" with certainty. If you do a little digging around on this site, you could easily find that I answer to "Gus" as a first name, and you might be able to find my last name if you look hard enough. But what about my middle name? Can you enter into your database that I don't have one? That's pretty much what an empty string would mean. Or would it be more useful to leave the MiddleName column in your Contacts table listed as "Unknown". In database parlance, you'd leave it null, since entering the string "Unknown" could end up with a letter being addressed with that as the middle name, which would look stupid. Also, if you put "Gus" as my first name, you'd technically be incorrect, and you'd be better off entering that as a valid nickname and leaving the first name null as well as the middle name, till you gather further information. So, how would you record my name in a "truly relational database", without something that indicates "unknown/unverified value"?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 8:51 am
You asked for real world examples of problems that can't be solved by SQL. I've given one already: the ability to enforce a constraint that an Order must contain at least one Item in the item table. That's an example of a problem I've encountered many times professionally and is a well-known textbook limitation of SQL. This and other business rule type of problems have to be solved by procedural code or by other software outside the SQL DBMS - frequently at great expense. A RDBMS with more general constraint capabilities would eliminate or reduce that expense.
Interesting problem, but easily solved by adding a foreign key constraint to the Orders table, referencing the OrdersItems table (a join table between Orders and Items). Expense = one line of code, if you've defined your primary keys correctly and modeled the tables and data correctly. If the people writing those textbooks can't come up with something as simple as that, then they've got bigger problems than they think.
Next unsolvable problem?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 9:08 am
GSquared (11/18/2009)
Interesting problem, but easily solved by adding a foreign key constraint to the Orders table, referencing the OrdersItems table (a join table between Orders and Items).
Then how would you insert a new Order or populate the OrderItems table?
Null does not (accurately) represent the property of being unknown. That's just your assumption or convention, not anything actual or implied by standard SQL. That is demonstrable by the straightforward fact that x = x is True in mathematics and reality if x is unknown but is NOT true in SQL.
If you are seriously going to use quantum mechanics as a criteria for the design of database systems then I have nothing much more to say! If you are really going to be that silly then you would have to disallow ALL deterministic results of any kind from the database wouldn't you? No, don't bother to answer that... :crying:
I didn't say null equals nothing. What I meant was that SQL wrongly returns a null as the sum of an empty set, instead of the mathematical and real world answer of zero.
Viewing 15 posts - 1 through 15 (of 36 total)
You must be logged in to reply to this topic. Login to reply