Done or Almost Done?

  • Vertica Systems

    Is RDBMS technology obsolete? I saw an article by Michael Stonebraker arguing that we might need to rethink our database architectures. There's a blog post that briefly describes his thoughts. As a blatant plug, we follow this blog on Database Weekly, so if new posts appear, we'll put them out.

    Specifically, in this case he's talking about a columnar database from Vertica Systems, of which he's a founder. He's also the CTO of Streambase, which builds a database to handle high volumes of streaming data. Stonebraker has an impressive resume, having architected Ingres and POSTGRES, which helped form the foundations for DB2, Oracle, and SQL Server. I think he knows what he's talking about, and he might have some great ideas here.

    However, being a little cynical, I'm concerned when an expert talks about how we need change and, oh by the way, has a company that can help with the change. It concerns me because many people take advantage of their position or fame for personal profit.

    In this case it's hard to know if he's pushing a new idea because he believes it is needed and can really help, or because it will increase his personal wealth. I'd like to think that it's both, and that Stonebraker is placing a bet on something he sees benefiting other companies.

    I do think that having some specialized database systems can improve performance, but at what cost? Already the SQL Server platform is so wide and varied that few, if any, of us can be an expert in the entire thing. Having to train staff to work with stream databases, columnar databases, cubes, and OLTP is a large burden for most companies.

    Slow and, well, not so steady

    http://users.cwnet.com/xephyr/rich/dzone/hoozoo/images/toby2.gif

    I've seen a few complaints lately and experienced the slowdowns myself. What used to be a fairly responsive site, considering how much stuff loads in multiple panels, has been very slow for the last week. Which site?

    MSDN, of course. The reference for those of us in the Microsoft world. I use the site for research, usually for the Question of the Day section, but also for articles and to learn something here and there. The last week has seen some timeouts and very, very slow page loads. As in 3-4 minutes to get to (DON'T CLICK THE LINK) the SQL Server developer center and almost that long to get Books Online to load. That's if it loads at all. Hopefully these aren't early releases from Patch Tuesday being applied 😉

    Come on, Microsoft, you're trying to get us developers to believe in your technology. Believe that you have well-run systems. The site needs to be up to prove that, and not make us think that it's the software that's failing.

    And if it's human error, let us know what went wrong. Don't throw anyone under the bus, but in general terms tell us what went wrong. You might help us prevent outages at our places.

    Slowing Down

    This was a fairly quiet week on the SQL Server front. I'm guessing that between PASS and the new CTP for SQL Server 2008, most people were busy elsewhere and not a lot of writing got done. The SQLServerCentral.com / Database Weekly crew (meaning me) will be at the PASS Summit, so be sure you register with the "SSC" source code and come to Denver if at all possible.

    A number of KB articles appeared over the last week, most of which are covered by Cumulative Update 3, so if you haven't installed that, you might want to get a copy and keep it handy.

    Steve's Pick of the Week:

    Former network engineer faces jail time for sabotaging patient data - This guy is a bum and I have another editorial on this at SQLServerCentral.com, but keep this link handy. The next time someone wants to sidestep a security rule, show them this.

  • I first encountered Sybase IQ when it was at v11.5, at a Sybase Techwave conference. I believe the MS SQL Server version I was working with then was v6.0 or v6.5. At the time I dismissed it as more marketing hype than substance. However, a few years later I had the opportunity to work with it from v12.0 to v12.4.3 and found everything that was hyped about the product to be very 'real'.

    From a data perspective, it achieved data compression on load of anywhere from 40-60% for starters. Yes, that is correct: 40-60%! From a traditional DBA perspective there were other wondrous features. Each and every column of data is indexed as it is loaded. Yup, every one! There are other indexing options available in addition to the defaults. By the way, the defaults are good from the get-go about 95% of the time.

    The wonders do not stop here. Imagine no index maintenance, no defragmenting, no statistics updates whatsoever. Just an occasional index type change (to one of about 11 or 12 types, as I recall at the time) when the cardinality of the data in a given column surpasses certain thresholds. As a DBA, imagine too that UNION ALL in a VIEW is the developer's most powerful and useful tool. Data partitioning is a breeze and is the standard modus operandi. There are built-in, configurable governors on result set creation and temp resource utilization as well.

    90% of Sybase IQ is in the developers' hands. From a production support perspective, all a DBA needs to manage is backup, disk space, user creation, user permissions, and index cardinality checking. Oh, lest I forget, the not-so-occasional SP migration.

    Imagine something like a SELECT COUNT(*) on a 1.5 billion row table returning a result set in less than 3 seconds wall time. Better yet, a 2 or 3 table JOIN where 2 of the 3 tables have 1 billion plus rows returning query results in 5 to 10 seconds, with 10 seconds being on the poor-performing side.

    Yes, there are still query plans to optimize, although this is really quite rare since they are produced by default all of the time. The major need for this is for Sybase to modify the engine via fixes in future releases.

    Oh, I almost forgot a thing called 'Multiplex'. This is where your primary data resides on one server, usually with a huge SAN, alongside a number of 'other' servers, complete with their own temp resources, that handle user connectivity and query activity. 'Multiplex' makes the product extremely scalable very rapidly.

    Now all of this sounds too good to be true, doesn't it? Well, now for the bad news. This software product, while ideal for decision support systems reporting, data analysis (OLAP, etc.), and plain raw, huge data storage, is not a 'true' transactional DBMS in the OLTP sense. Yes, it can do it, and it can also extract data in real time from Sybase ASE, but it is abysmally slow. I actually think a small team of data entry clerks could beat it in a race. But that is not what it was designed to do, so it is not its strength.

    All in all, I believe that column-level DBMSs are here to stay. However, the row-based DBMSs are not through yet!

     

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

  • I am a bit skeptical, what with his new company and all.

    Claiming that his new product has a 150x performance advantage smacks of heavily engineered benchmark tests. Perhaps on certain hardware, for a specific set of tests, a column-store can outperform a row-store database, but unless a more general benchmark is released I won't believe it is a generally better solution. I'm willing to bet that many of the developers at Microsoft or Oracle could create a test proving the opposite if they were so inclined.

    In the linked blog post, Mr. Stonebraker criticized "one-size-fits-all" RDBMS technology as old and obsolete. What he seems to have missed is that there are many different uses for an RDBMS, and many different possible designs for each of them. A "one-size-fits-all" database product can be used for all of them; even though it might sometimes be difficult (or impossible) to get the most fully optimized solution up and running, it will do well for most situations. A new product that excels at a portion of what another product already handles decently is always welcome of course (I am sure there are many data warehouses out there that are pushing the envelope as far as table size etc), but this "row-based storage is DEAD" attitude is just marketing hype IMO.

    What would be neat is if the major vendors allowed for (or more likely supplied) various swappable storage engines, as MySQL does. I haven't played with MySQL for a while now (PostgreSQL is my current free pet RDBMS) but I always thought that part of it was extremely nifty. If SQL Server allowed you to decide whether you wanted traditional row-based storage for OLTP or a column-based engine for your DW we could have the best of both worlds. It doesn't sound so difficult to do either (of course nothing does when you are speculating about some unknown group of engineers working out the details).
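    To make the swappable-engine idea a bit more concrete, here is a minimal sketch in Python. It is a toy illustration only, not how MySQL, SQL Server, or any real product implements storage engines. The names `RowStore`, `ColumnStore`, and `make_engine` are made up for this example. Both engines expose the same three operations; the row engine keeps each record together (handy for OLTP-style single-row access), while the column engine keeps each column together (so a DW-style aggregate only touches the one column it needs).

```python
from typing import Dict, List


class RowStore:
    """Hypothetical row-based engine: each record is stored together."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = list(columns)
        self.rows: List[Dict[str, int]] = []

    def insert(self, row: Dict[str, int]) -> None:
        # One append stores the whole record.
        self.rows.append({name: row[name] for name in self.columns})

    def fetch_row(self, index: int) -> Dict[str, int]:
        # A single lookup returns the complete record.
        return dict(self.rows[index])

    def column_sum(self, column: str) -> int:
        # An aggregate has to walk every record to pick out one field.
        return sum(row[column] for row in self.rows)


class ColumnStore:
    """Hypothetical column-based engine: each column is stored together."""

    def __init__(self, columns: List[str]) -> None:
        self.data: Dict[str, List[int]] = {name: [] for name in columns}

    def insert(self, row: Dict[str, int]) -> None:
        # One insert touches every column list.
        for name, values in self.data.items():
            values.append(row[name])

    def fetch_row(self, index: int) -> Dict[str, int]:
        # Rebuilding a record means visiting every column list.
        return {name: values[index] for name, values in self.data.items()}

    def column_sum(self, column: str) -> int:
        # An aggregate reads only the one column it needs.
        return sum(self.data[column])


def make_engine(kind: str, columns: List[str]):
    """Pick a storage engine by name, as a stand-in for a per-table setting."""
    engines = {"row": RowStore, "column": ColumnStore}
    if kind not in engines:
        raise ValueError(f"unknown storage engine: {kind}")
    return engines[kind](columns)


if __name__ == "__main__":
    for kind in ("row", "column"):
        table = make_engine(kind, ["id", "amount"])
        table.insert({"id": 1, "amount": 10})
        table.insert({"id": 2, "amount": 20})
        print(kind, table.fetch_row(1), table.column_sum("amount"))
```

    The calling code is identical for both engines; only the layout behind it changes, which is the "best of both worlds" part. Of course, a real engine also has to deal with transactions, indexes, and disk layout, none of which this sketch attempts.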

    -- Stephen Cook
