Falling Over our Assumptions

  • george sibbald (4/8/2009)


    Manie Verster (4/8/2009)


    DBAs have the assumption that all developers are bad for a database, or bad SQL programmers.

    that's a common assumption about DBAs in itself, Manie 🙂

    Nah. Both of those are more like theories than assumptions. And I mean the full-on scientific definition of "theory", as in a body of related data that is mathematically true and which holds in all reliably observed cases. 🙂

    "Sir, please step away from the database with your hands in full view. We don't want anyone to get hurt."

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GilaMonster (4/8/2009)


    Robert Frasca (4/8/2009)


    Actually, if memory serves, in SQL Server 2005, table variables are stored in TempDB as well. In SQL Server 2000 they were stored in RAM. There's another incorrect generalization.

    Not at all. Even back in SQL 2000, table variables could spill to TempDB if needed, and both temp tables and table variables were kept in memory as much as possible.

    http://support.microsoft.com/kb/305977

    From above kb article:

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).
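
    A4 is easy to check for yourself. A minimal sketch (assuming SQL 2005 or later, where the tempdb.sys.tables catalog view exists; run as a single batch):

    -- A table variable gets a real object in tempdb, just like a temp table.
    DECLARE @t TABLE (id int NOT NULL);

    -- The variable shows up in tempdb's catalog under a system-generated
    -- name (something like #BF12AB34) and vanishes when the batch ends.
    SELECT name, create_date
    FROM tempdb.sys.tables
    ORDER BY create_date DESC;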

    Gail,

    The point I was trying to make is that the assumption that table variables are better than temp tables because they are stored in RAM is false. Yes it's true that they might end up in RAM. I can't assume that when I'm coding. As is true of most things in the world, sometimes table variables are better than temp tables and vice-versa. Here's something from a blog posting by Kimberly Tripp that explains the difference fairly clearly.

    "First - a bit of understanding of TempDB - what goes there?

    * Internal temporary objects needed by SQL Server in the midst of other complex operations. For example, worktables created by a hash aggregate will be stored in TempDB, as will interim tables used in hash joins (almost anything that shows as "hash" something in your query plan output is likely to go to tempdb).

    * User objects created with either # (local temporary objects), ## (global temporary objects) or @ (table variables)

    o # = Local temporary object

    Local temp objects are accessible ONLY in the session that created them. They are also removed automatically when that session ends (unless manually dropped).

    o ## = Global temporary object

    Global temporary objects are accessible to ANYONE who can log in to your SQL Server. They persist only as long as the session that created them (unless manually dropped), but anyone who logs in during that time can directly query, modify or drop them. They are removed automatically when the creating session ends (unless manually dropped); if another session is still using the object at that point, it is dropped once that session finishes with it (and only for as long as any locks are held). If other sessions need more permanent use of a temporary object, you should consider creating a permanent object and dropping it manually.

    o @ = User-defined Table Variable

    User-defined Table Variables were introduced in SQL Server 2000 (or, wow - was it 7.0?) and provide an alternative to temporary tables by allowing you to create a variable defined as type TABLE and then you can populate and use it in a variety of ways. There has been A LOT of debate over whether or not you should always use table variables or always use temp tables. My response is that I ALWAYS avoid the word always! My point is that table variables are NOT always better nor are temp tables always better. There are key uses to each. I tend to like temp tables in scenarios where the object is used over a longer period of time - I can create non-key indexes on it and it's more flexible to create to begin with (SELECT INTO can be used to create the temp table). I also have the ability to use the temporary table in nested subprocedures because it's not local to the procedure in which it was created. However, if you don't need any of those things then a table variable might be better. When it is likely to be better - when you have smaller objects that don't need to be accessed outside of the procedure in which it was created and when you only need KEY indexes (a table variable ONLY supports the indexes created by a create table statement - meaning PRIMARY KEY and UNIQUE KEY).

    * Objects created by client applications - this is possibly a large part of your problem... Profiling can help you to determine if there's a lot of TempDB usage from your client applications."
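
    To make Kimberly's distinction concrete, here's a minimal sketch (table and column names are illustrative, not from her post):

    -- Temp table: can be built with SELECT INTO and indexed after the fact.
    SELECT CustomerID, OrderDate, TotalDue
    INTO #recent_orders
    FROM dbo.Orders                -- illustrative source table
    WHERE OrderDate >= '20090101';

    CREATE NONCLUSTERED INDEX ix_ro_customer ON #recent_orders (CustomerID);

    -- Table variable: declared up front; the only indexes it gets (in
    -- SQL 2000/2005) are those implied by PRIMARY KEY / UNIQUE constraints.
    DECLARE @recent_orders TABLE
    (
        OrderID    int      NOT NULL PRIMARY KEY,  -- implicit clustered index
        CustomerID int      NOT NULL,
        OrderDate  datetime NOT NULL
    );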

    "Beliefs" get in the way of learning.

  • A good article.

    I've been using that trick of breaking large operations down into smaller ones since, oh, SQL Server 6.5 days?

    Compared with re-assembling the small steps into one or two humungaloid SQL expressions, sometimes you end up with better performance from the smaller steps (statistics on the temporary tables), and sometimes with better performance from the combined statements (less of the additional I/O needed for creating the temporary tables), but it's normally not a huge difference.

    So I err on the side of what's easier to maintain.

    Oh, my favourite hated expression is "disc is cheap". Uhuh. By the time you add the need to add the same space to multiple environments (Dev, QA, UAT, DR, Prod), add in RAID requirements, add in the time to order, the time to fit and the time for outages (if required), sometimes the need for new disc housing, sometimes the need for additional controllers...


  • Robert Frasca (4/8/2009)


    The point I was trying to make is that the assumption that table variables are better than temp tables because they are stored in RAM is false.

    I agree with you there. I was just challenging your statement that in SQL 2000 table variables were stored in memory. The disk/memory placement of table variables did not change between 2000 and 2005. In both versions they were created in TempDB, kept in memory as much as possible and spilled to disk if they got too big.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GSquared (4/8/2009)


    andrew gothard (4/8/2009)


    Manie Verster (4/8/2009)


    DBAs have the assumption that all developers are bad for a database, or bad SQL programmers. Well, I am a developer cum DBA, but before I had to operate as a DBA I am sure I wrote some bad queries.

    What I am trying to say is, sometimes it is not bad developers but insufficient training.

    My other assumption is "don't pass the buck".... oh no! That is not an assumption, it's a fact!

    From what I've seen, all the top bods on this site seem to have come from the same type of background. Oh, and me. BUT, quite frankly, there does seem to be a significant issue with a certain type of developer who, because they can bung "SELECT * FROM SomeTable" into a bit of C# code, thinks they know the lot about the database side, irrespective of actual clue level.

    Even worse, when you attempt to explain to them - well, they're not interested. I have no idea why someone not interested in learning wants to go into coding - but it does seem to be becoming worryingly common.

    My pre-specialist DBA background includes management/executive, sales, marketing, proofreading/editing, typesetting, and I was a file clerk for a while. I'm not sure that comes under the heading of "same type of background"... 🙂

    Oh - as well as the IT background - including COBOL, SQL Server 6.5 to 2005, Access 1.0 +, Clipper and various other forms of xBase, some C, VB 4+, Oracle, Auto ID Solutions, Informix, BaaN; I could add in sales, running a boozer (voluntary), welfare rights advice, labouring, assistant gardener, and marketing as part of a post grad course. A varied background is all grist to the mill when it comes to dealing with people effectively, IMO.

    One thing I'd say is, my background on the building sites is one of the things that makes me worry about IT ever becoming a genuine, respected profession. Any buffoon on a site strutting about ranting "F*** the foundations, we need to get the walls up ASAP to show we're doing something" would be thrown offsite - probably bruised, into the nearest skip. Now, in an IT project, in most cases the database schema is the foundations. And yet ...........

    A thought suddenly comes to mind .....

    DBA is talking to an AGILE bod ...

    DBA: Hey - there's a party on tonight, fancy dress, cowboys theme, are you coming?

    AGILE Bod: Duuude, how AM I going to get home and change for that?

    DBA: Change? Why? Just go in your work clothes!

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • GSquared


    The old "table variables are faster because they're in RAM" is a very, very common false datum.

    Or "Bollocks" as we like to call it in the UK. Surprising how often that canard crops up in interview questions, mind. It was even in BOL.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Bruce W Cassidy (4/8/2009)



    A good article.

    I've been using that trick of breaking large operations down into smaller ones since, oh, SQL Server 6.5 days?

    Compared with re-assembling the small steps into one or two humungaloid SQL expressions, sometimes you end up with better performance from the smaller steps (statistics on the temporary tables), and sometimes with better performance from the combined statements (less of the additional I/O needed for creating the temporary tables), but it's normally not a huge difference.


    Ah - it can be. I'm not really much of a fan of temp tables. Probably, in all honesty, as a Pavlovian reaction to some of the deranged drivel I've seen inflicted upon the engine, apparently by inebriated baboons, where they've been used so incorrectly an amoeba could probably spot it.

    On the other hand - the optimiser, at heart, is just a guesstimate based upon a handful of electrons bumbling around a bit of sand someone's taught to do tricks. When you have several tables, particularly large ones, which may be referenced more than once in a complex query - the optimiser can go totally hatstand.

    One example - I put together and tuned 4 derived tables - each returned its results within 1 second. Linked together in a single query - 15 minutes. Loaded into indexed temp tables, with the ORDER BY in each query matching the clustered index on its temp table - 2-6 seconds.
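
    A minimal sketch of that pattern, with all object names illustrative:

    -- Materialise each derived table once, clustered to match the join
    -- that follows, instead of letting the optimiser juggle all four at once.
    CREATE TABLE #cust_totals
    (
        CustomerID int   NOT NULL PRIMARY KEY CLUSTERED,
        TotalDue   money NOT NULL
    );

    INSERT INTO #cust_totals (CustomerID, TotalDue)
    SELECT CustomerID, SUM(TotalDue)
    FROM dbo.Orders                  -- illustrative source table
    GROUP BY CustomerID
    ORDER BY CustomerID;             -- ordered to suit the clustered index

    CREATE TABLE #cust_region
    (
        CustomerID int         NOT NULL PRIMARY KEY CLUSTERED,
        Region     varchar(32) NOT NULL
    );
    -- (populated the same way from its own tuned query)

    -- Then join the small, indexed intermediates rather than the monolith:
    SELECT ct.CustomerID, ct.TotalDue, cr.Region
    FROM #cust_totals AS ct
    JOIN #cust_region AS cr
      ON cr.CustomerID = ct.CustomerID;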

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • andrew gothard (4/8/2009)


    I'm not really much of a fan of temp tables. Probably, in all honesty, as a Pavlovian reaction to some of the deranged drivel I've seen inflicted upon the engine, apparently by inebriated baboons, where they've been used so incorrectly an amoeba could probably spot it.

    Something about baths and babies and water... assemble as required. :w00t:

  • andrew gothard (4/8/2009)


    On the other hand - the optimiser, at heart, is just a guesstimate based upon a handful of electrons bumbling around a bit of sand someone's taught to do tricks. When you have several tables, particularly large ones, which may be referenced more than once in a complex query - the optimiser can go totally hatstand.

    One example - I put together and tuned 4 derived tables - each returned its results within 1 second. Linked together in a single query - 15 minutes. Loaded into indexed temp tables, with the ORDER BY in each query matching the clustered index on its temp table - 2-6 seconds.

    This reminds me of my favourite false assumption (which I think I first heard on a course about Oracle SQL)...

    SQL is a declarative rather than a procedural language. So you just need to tell the database engine what you want and it will figure out the best way to get it. Focus on 'what' not 'how'.

    The instructor then went on to explain how badly formed cross joins could cripple performance... 😀
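
    The classic badly formed cross join looks something like this (tables hypothetical):

    -- The missing join predicate turns two 10,000-row tables into a
    -- 100,000,000-row intermediate result before the WHERE clause bites.
    SELECT o.OrderID, c.CustomerName
    FROM dbo.Orders AS o, dbo.Customers AS c   -- old-style comma join
    WHERE c.CustomerName LIKE 'A%';            -- forgot o.CustomerID = c.CustomerID

    -- What was intended:
    SELECT o.OrderID, c.CustomerName
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c
      ON c.CustomerID = o.CustomerID
    WHERE c.CustomerName LIKE 'A%';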

    20-odd years later, the optimiser still seems to need hand-holding at every opportunity!

    In its defence, of course, queries are often written in a development environment where the data is only a rough approximation to production, so the optimiser is often working with one hand tied behind its back!

    Derek

    As others have posted, I recently had great success breaking down a large single SQL query into smaller bits (among other changes). In this case, intermediate results were stored in a table variable, with a note about converting it to a temp table in the future if it looks like it could use some of the strengths of temp tables.

    One of the ways in which this change improved performance was that I was able to get rid of several CTEs which had been used more than once in the query. *Thanks* to SQLServerCentral, where I recently read an article explaining that a CTE is re-evaluated with every reference within a query. I had not known that before and had subconsciously assumed that SQL Server would create the data once and re-use it with each call. Bad assumption on my part.
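
    A minimal sketch of the difference (dbo.Orders is a hypothetical table):

    -- The CTE is referenced twice below, and SQL Server will usually
    -- expand and evaluate it once per reference:
    WITH totals AS
    (
        SELECT CustomerID, SUM(TotalDue) AS TotalDue
        FROM dbo.Orders
        GROUP BY CustomerID
    )
    SELECT t.CustomerID, t.TotalDue
    FROM totals AS t
    WHERE t.TotalDue > (SELECT AVG(TotalDue) FROM totals);  -- second evaluation

    -- Materialising the intermediate result once avoids the repeat work:
    DECLARE @totals TABLE (CustomerID int NOT NULL PRIMARY KEY, TotalDue money NOT NULL);

    INSERT INTO @totals (CustomerID, TotalDue)
    SELECT CustomerID, SUM(TotalDue)
    FROM dbo.Orders
    GROUP BY CustomerID;

    SELECT t.CustomerID, t.TotalDue
    FROM @totals AS t
    WHERE t.TotalDue > (SELECT AVG(TotalDue) FROM @totals);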

    My favorite assumption is along the lines of: "Any electronic application is better than no electronic application at all." A poorly designed database and application can be a worse choice than a paper method. An electronic application is not automatically better just because it is electronic. Thus it is up to us back- and front-end programmers to significantly improve users' lives, not just to create "anything that works."

  • JJ B (4/13/2009)

    Any electronic application is better than no electronic application at all.

    Among several examples - the guy who ran a gym I used to go to (a pretty grotty backstreet spit-and-sawdust place; I only feel safe saying that because he's not likely to end up on this thread) asked for a word once when I was on the way in. "Andrew - you work in computers - keeping track of when all the memberships are due is a pain in the arse. Tell me what computer and what programs to buy."

    "er - have you got £2-3 grand to spend, that you have nothing better to do with, on something you are prepared to leave overnight in here, that may well as a result get stolen, with some of your financial details on?"

    I paraphrase somewhat here ....

    "NO!" ....

    "I suppose I could get the lads to take it home with them every night"

    "Mate, what about something cheaper and easier .... "

    "Like what"

    "You only charge £10 a year, make it £5 every 6 months, payable first session after 1 Jan n 1 June, put it in a book. Everyone pays at the same time, you're good at faces - if someone starts 3 weeks before one of those dates you can decide whether or not to waive it"

    Now - would a PC based solution have served him better, been as cost effective or efficient?

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • from andrew gothard: "Among several examples - ..."

    Andrew: What a fun post! I'm all smiles.

    What a smart guy you are. You saved him a ton of money and hassle.

  • JJ B (4/15/2009)


    from andrew gothard: "Among several examples - ..."

    Andrew: What a fun post! I'm all smiles.

    What a smart guy you are. You saved him a ton of money and hassle.

    Dude - if you'd seen the size of this guy ......

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

    Heh, nothing changes. I find the licks in the article to still be very relevant today, and the more-than-a-decade-old discussion sounds as if it occurred today.

    Nothing changes except the words used to describe things. My least favorite words lately boil down to "A developer's time is much too valuable to worry about doing things right".

    <headdesk>

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    I've learned that the biggest assumption is that there must be a technical solution. I've seen a few that looked like an attempt to automate the devisings of Captain Stupid, Major Wrong and Colonel Cockup. Revising, and if possible eliminating, the process was the answer, not automation.

    A few more I've come across recently:

    • You don't have to do data modelling with NoSQL
    • Schema on read allows us to put that off until later
    • We can perform superior data referential integrity checks using POJOs
    • MongoDB will support all our BI needs.

     
