NewSQL Can Save the World

  • Eric M Russell (7/26/2011)


    cengland0 (7/26/2011)


    Revenant (7/26/2011)


    cengland0 (7/26/2011)


    Does anyone have a link to the performance differences for all the different database products out there? I heard a rumor that the MS SQL license agreement prevents people from posting performance comparisons but I haven't confirmed that myself.

    I'd like to see (using the same data set) the performance for:

    * MS Sql

    * MySQL

    * Oracle

    * Teradata

    * DB2

    * PostgreSQL

    That would be a MAJOR exercise: results may, and likely will, significantly vary with the hardware. Even if we agreed that the machine will have say 8 GB, and the same (or roughly same) clock speed, there will be differences betweeen Nehalem and Sandy Bridge and - I have not tried that - AMD.

    Can't all of those run on the same hardware? In fact, if the hardware wasn't exactly the same, I wouldn't be interested in the results anyway.

    You would have thought someone would have done this comparison already but I haven't been successful in finding one.

    TPC publishes database processing benchmarks for a handful of the big name database vendors running on specific hardware configurations.

    http://www.tpc.org/tpcc/results/tpcc_results.asp?orderby=dbms

    Of course some databases are optimized to perform very well under a narrow range of case usages, like querying EAV or BLOB data. Also, some of the more mainstream RDMS engines (I think that MySQL InnoDB is one example) can be configured for very fast data loads in a *staging* environment, because it's possible to totally disable tranaction logging. Of course that only useful in a staging environment or for populating write/once read/many tables referenced by a website or data mart.

    Still, you would have to agree on a particular hardware, and you results would be valid only for that particular configuration.

    I specifically mentioned Sandy Bridge and Nehalem because if you compare processors from the two families, at the same clock speed Sandy Bridge will be likely faster thanks to its improved caching and hardware-assisted parallelization. Depending on their design, various DBMSs may or may not take advantage of these new features. In other words, one db may be faster than the other on Nehalem while on Sandy Bridge this will be reversed.

  • richardd (7/26/2011)


    So a man who runs a company which sells a competitor to "oldSQL" products has given an interview saying the "oldSQL" is bad, and his "newSQL" competitor is doubleplusgood?

    OK, I'll trust every word he says then. :rolleyes:

    He's a respected database researcher, including credits in developing the Ingres and Informix, worked on/owned Streambase, which provides a competitor to StreamInsight.

    He makes valid points, and addresses some database problems that other platforms don't handle. While he does have an interest in promoting his products, I don't think he is trying to say his product handles every domain of problems. But in some edge cases, I think his solutions, and similar ones, do a better job.

    NoSQL doesn't handle some problems well, and certainly OldSQL doesn't handle others well.

  • cengland0 (7/26/2011)


    Revenant (7/26/2011)


    cengland0 (7/26/2011)


    Does anyone have a link to the performance differences for all the different database products out there? I heard a rumor that the MS SQL license agreement prevents people from posting performance comparisons but I haven't confirmed that myself.

    I'd like to see (using the same data set) the performance for:

    * MS Sql

    * MySQL

    * Oracle

    * Teradata

    * DB2

    * PostgreSQL

    That would be a MAJOR exercise: results may, and likely will, significantly vary with the hardware. Even if we agreed that the machine will have say 8 GB, and the same (or roughly same) clock speed, there will be differences betweeen Nehalem and Sandy Bridge and - I have not tried that - AMD.

    Can't all of those run on the same hardware? In fact, if the hardware wasn't exactly the same, I wouldn't be interested in the results anyway.

    You would have thought someone would have done this comparison already but I haven't been successful in finding one.

    The question would then be - whose hardware requirements do you favor? The various big ones act differently enough that the "preferred" config may not be the same. One might prefer more cores on a single process: the other, more separate processors, but with fewer cores each. etc...

    Similarly - as TPC has already shown with their testing - is it fair to presume that only one test scenario is THE end-all, be-all to pick the "best" product. TPC E and TPC C are dominated unilaterally by one of the DB vendors, essentially proving that one vendor's solution would be markedly better than the other in that given case.

    At the end of the day there is no one single clear victor. Oracle is better at some things, SQL is better at others; this should be no surprise.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Eric M Russell (7/26/2011)


    TPC publishes database processing benchmarks for a handful of the big name database vendors running on specific hardware configurations.

    http://www.tpc.org/tpcc/results/tpcc_results.asp?orderby=dbms

    Of course some databases are optimized to perform very well under a narrow range of case usages, like querying EAV or BLOB data. Also, some of the more mainstream RDMS engines (I think that MySQL InnoDB is one example) can be configured for very fast data loads in a *staging* environment, because it's possible to totally disable tranaction logging. Of course that only useful in a staging environment or for populating write/once read/many tables referenced by a website or data mart.

    Thanks but that link does not appear to compare the database speeds. Sure it has a lot of computer configurations listed and you can click on a specific computer and it will tell you the price and what's installed. That's not what I was looking for.

    I would like to know, given a set of specific queries on the same set of data, how long each "brand" of database would take to return the results.

  • cengland0 (7/26/2011)


    Thanks but that link does not appear to compare the database speeds. Sure it has a lot of computer configurations listed and you can click on a specific computer and it will tell you the price and what's installed. That's not what I was looking for.

    I would like to know, given a set of specific queries on the same set of data, how long each "brand" of database would take to return the results.

    That's so dependent on the indexing, the query structure in places, it's hard.

    I supposed you could set up the same hardware, same data set returned, and compare, but likely they would all be so close as to make it silly to test. They all can work in 80% of the situations if you've sized hardware and written good code for that platform.

  • Steve Jones - SSC Editor (7/26/2011)


    That's so dependent on the indexing, the query structure in places, it's hard.

    I supposed you could set up the same hardware, same data set returned, and compare, but likely they would all be so close as to make it silly to test. They all can work in 80% of the situations if you've sized hardware and written good code for that platform.

    I think it would still be a fair comparison because all of those database "brands" can have the same indexes, dataset, hardware, etc.

    I work for a bank and we use Oracle, Teradata, MS SQL Server, and MySQL (limited). I see pro's and cons with all of them and have always wondered why one data center has chosen Teradata over Oracle, for example.

  • cengland0 (7/26/2011)


    Steve Jones - SSC Editor (7/26/2011)


    That's so dependent on the indexing, the query structure in places, it's hard.

    I supposed you could set up the same hardware, same data set returned, and compare, but likely they would all be so close as to make it silly to test. They all can work in 80% of the situations if you've sized hardware and written good code for that platform.

    I think it would still be a fair comparison because all of those database "brands" can have the same indexes, dataset, hardware, etc.

    I work for a bank and we use Oracle, Teradata, MS SQL Server, and MySQL (limited). I see pro's and cons with all of them and have always wondered why one data center has chosen Teradata over Oracle, for example.

    In a perfect test environment where all things are equal, except the RDMS engine, it would be possible to see exactly how SQL Server, Oracle, or VoltDB compare at specific tasks. However, in the real world the biggest limiting factor is always the database developer or DBA and in some cases budget. It would be interesting to hear from folks whose database project failed because they hit some hard limitation of the database engine itself, something that could not have been otherwise solved had they made a different data model, hardware, SQL coding, or configuration decision earlier on.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • cengland0 (7/26/2011)


    . . . I work for a bank and we use Oracle, Teradata, MS SQL Server, and MySQL (limited). I see pro's and cons with all of them and have always wondered why one data center has chosen Teradata over Oracle, for example.

    Speed is only one factor in evaluation of which database to use. Price per processor will be another one, and the responsiveness and cost of tech support might be next. I do not know how it is today, but Oracle used to be known for sub-par support.

  • Eric M Russell (7/26/2011)


    . . . It would be interesting to hear from folks whose database project failed because they hit some hard limitation of the database engine itself, something that could not have been otherwise solved had they made a different data model, hardware, SQL coding, or configuration decision earlier on.

    It ties with my previous post stressing the importance of tech support.

    The only time I had an important project way late and over budget was in 1997. There was a bug in Oracle and their tech support took two weeks to reply. It took them three weks to acknowledge there was a bug and almost five weeks to suggest a circumvention. Not coincidentally, since 1997 I have been exclusively on SQLS.

  • Interesting. I hope that Azure is scalable.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply