Hidden RBAR: Triangular Joins

  • TheSQLGuru (1/20/2009)


    is that real world coders do and always will write suboptimal code

    Maybe when I first get there, but a month or two after I get done training them... viola.

    There's just no excuse. If you let people write suboptimal code, they will. If you don't and you give them the training to do otherwise, they won't because they all have the pride and want to do a good job. And, the difference between doing a good job and a good enough job isnt that much. Just a little training goes a long way. My "real world" is that everyone wants to do very well and, given the correct training and incentives, can and will. It sometimes takes a bit of patience from the resident DBA, but it's time well spent.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • LeeBear35 (1/20/2009)


    It is a shame that anyone would cut such corners in project planning and development. Yes you may need to get the software out the door, but I watched as the company I work for spent over a million dollars on consulting because of this type of short sighted coding with a 10,000 row result set and then everything either failed in production or took days to run.

    Seems like there is never time in the project to do things right, but there is always time to do it over...

    Exactly...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Charles Kincaid (1/20/2009)


    I'd much rather say, "Congratulations! Our hardware folks will be calling you but your software is ready to handle the additional load right now.

    YEEE-HAAA! Now we're talking!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GSquared (1/20/2009)


    Jeff Moden (1/17/2009)


    ...CLR's are better at most RegEx and file handling. With those two exceptions (so far)....

    The other place I've found CLRs to be "better" is using CLR as opposed to OLE automation when a proc needs to access DLLs outside of SQL.

    The sp_OACreate, sp_OAMethod, et all, method of calling outside code is pretty inefficient, generally unstable, doesn't clean up after itself well, and is a pain to implement compared to CLR.

    Exactly... that's pretty much what I meant when talking about file handling, but your expansion on the sp_OA* procs, in general, is more correct. In fact, I still don't know if they've actually fixed all of the memory leaks associated with OLE atuomation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David.Poole (1/20/2009)


    I understand the argument about not developing for 6 million products because getting a product to market, even with diminished functionality, will allow you to grab the high ground.

    I disagree with it, but I understand it. My job is now to clean up the results of such an approach. I thought I had seen every possible example of bad database design/code but I've come across a few new ones, and that is after 20 years in the business. Recursive procedure calls where the procedure calls itself within a cursor?:w00t: The SQL Injection proc from hell, object names with mathematical symbols in them, rampant denormalisation, you name it, it is there to be cleaned up.

    Microsoft wasn't the first to develop......well just about anything really.

    First spreadsheet? Hell even Lotus didn't do that.

    First word processor? Wasn't that Wordstar or Word Perfect or maybe an ancestor?

    What Microsoft focussed on was how people use their products. Under the hood their products may not have been as technically good as competitors products but I can remember thinking that Microsoft Multiplan and Microsoft Chart were far easier to use than Lotus 123 and CA Supercalc.

    If you are going to take the chuck the 10K product version out of the door you better make sure the 6 million product version is going to be ready very shortly.

    Heh.. dang, David... you might even be older than me! 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Apparently, I'm not the only one who has a different "real world" than some and I'm very happy to see that. Everywhere that I've gone, people (including the managers with whacko schedules) are embarrassed to get a call pointing out a failure due to performance or scalability. They DO want to make a real difference. Many of the managers I've worked with, simply didn't realize how little it takes to make the difference between the proverbial 10k row package and the 6M row package and neither did the developers. With a bit of encouragement and thoughtful in-house training, those who were not capable suddenly become capable and the pride in doing a job extremely well and still on time or nearly on time (at the beginning) becomes infectious. Many people consider "zero defects" to be a pipe dream. Yet, every shop I've gone to has been able to come very close to not only the "zero defects" mark, but have also done so with performance and scalability included.

    As Charles said, even the managers and their bosses, who used to be part of the problem with insane scheduling, take great pride in being able to say, "Our hardware folks will be calling you but your software is ready to handle the additional load right now."

    And, no competitor can beat that or the reputation for integrity that builds.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • TheSQLGuru (1/20/2009)


    GSquared, I think you are off-base on many fronts.

    1) Again, just because someone can't or won't design/write optimal software does not mean he/she should not be working in development. You and Jeff agree on that, I very very strongly disagree and that will just have to be that I suppose. At a minimum I believe that far less software would be developed if your point were reality.

    2) Many mentions in the last few posts have spoken about BUGGY software being put out quickly then losing in the long run. I NEVER said to put out software that did the WRONG thing - just do the right thing SUBOPTIMALLY or less efficiently than is optimum. There is HUGE difference between the two. For my point about suboptimal code you only need to look at SAP, Lawson, Peoplesoft, etc. Even the relatively new Biztalk is a dog from a database perspective. Actually Sharepoint is probably the best example of all. That system is a frickin' nightmare on the database and processing logic side up one side and down the other (as are the ERP packages from what I have seen/heard of them - RBAR, missing indexes, inefficient data access and data types, yada yada yada). Yet Sharepoint has and continues to garner significant marketshare in a very short span of time as far as enterprise servers go (IMHO).

    I understand that you disagree. I don't understand why.

    Would you accept that kind of standard of workmanship in ANY other field of endeavor?

    Would you continue to get medical advice from someone who "has trouble understanding the germ theory of disease, so he continues to use leaches to ballance your humors"?

    Would you take your car to a mechanic who insists that vegetable oil is good enough for lubricating your engine, because he just can't seem to get that motor oil isn't the same thing?

    Those are, from what I can read in your posts, the same thing as accepting a database developer who just can't get set-based code, and has similar results.

    You say, "not wrong code, just code that doesn't work as well". I've never yet seen poorly written code that actually accomplished what the customer actually wants. Even just "it's too slow" is a bug. I guess we just disagree on the meaning on the term "bug". The definition I use is from Webster's, and is "noun: a fault or defect in a system or machine". Running too slowly is, to my understanding, a "fault or defect". What definition do you use? Do you limit your definition of "bug" to "produces incorrect output"?

    To clarify, would you consider it a "bug" if Windows took 3 days to boot up, and 2 days to launch programs, but was otherwise "perfect"? How about if your web browser took over an hour to load any given web page? Bug? Or just "the developer didn't know the perfect method, but what he did got the job done, even if not perfectly"?

    If you say "yes" to either of those, what's the cut-off for it being a bug vs it being, "well, that's the best these ignorant developers can do, and it's good enough"? An hour? A minute? Thirty seconds?

    Is it not a specific amount of time, but a ratio? If a person who knows what he's doing can get it to run in 10 milliseconds, and a person who uses table variables, cursors and recursive UDFs can't get it to run in less than 8 minutes, is that "buggy"? Even if the 8+ minute code does, eventually, produce the right results? If 8 minutes is too long, what's "fast enough" to no longer be a bug? 10 milliseconds is the "perfect" version, what's the "good enough" version? 8+ minutes? 1 minute? 30 seconds? Do you have a defined threshhold or is it just, "when it's as good as this particular developer can do it"?

    That's what I don't understand in what you're saying. I don't think any other field of endeavor with the kind of pay scale that professional developers get would accept that level of "good enough". For that matter, most people won't accept that level of service from fast food restaurants.

    You seem to be asking me to forgive people for doing poorly at jobs they are being very well paid for. I just don't get that.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • As an addendum on the point of Sharepoint, et al: Have you tried refactoring the databases to test overall performance and operation with the missing indexes, non-RBAR, etc., all fixed?

    Keep in mind that what's optimum for loading data may be non-optimum for retrieving it, or vice versa, and sometimes "standard practices" need to be compromised for specific functionality.

    For example, I've got tables with no indexes at all (clustered or otherwise), no constraints other than data type and a default or two, no PK, no FKs, etc. Normally, that would be considered a really bad table design. In this particular case, the overriding consideration is speed of inserts, and extensive testing showed that this was the best way to achieve that.

    I have another table that consists of nothing but a varchar(max) column named "Column1". Bad design? In most cases, absolutely. In this case, it's a staging table for importing text files that have an inconsistent column format and inconsistent column delimiters. Once the data is in there, various procs and functions figure out what format its in, which sometimes is different from row to row, figure out the columns, work out what target tables the various rows go into, and get it in there. Building that system took quite a bit of work, but it ended up saving about 16 man-hours per day for the company, reduced the error factor to essentially zero and thus saved a few thousand dollars per month in refunds, and so on.

    In both cases, any "standards purist" who looked at the tables without looking at the whole system would probably say, definitively, that I'm a moron and shouldn't be allowed anywhere near a database. In both cases, they'd be dead wrong.

    Is it possible, perhaps, that the people who built Sharepoint knew more about the gestalt of the system than the people who are complaining about specific non-standard details? I'd have to go with, "it is not only possible, it is essential".

    But that's the whole point of what Jeff and I are writing about. You have to know what you're doing before you know what rules to break and when to break them. An unskilled cook had better follow the recipe exactly as it's written, or you end up with cookies that taste like soap (true story). A highly skilled chef can change the seasoning every time he cooks something, and you can bet that it'll taste good every time!

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Heh.. dang, David... you might even be older than me! 😉

    Thanks a bunch. I'll start saving for a mid-life crisis. Oh hang on, interest rates have gone to hell in a hand cart, I may be gone a while:D

    I used to do a lot of work with content management systems and remember one where the only manual was in badly translated German. Due to our ignorance we wrote our own database code to perform key functions that we later found to be available as standard library functions.

    The first stress test showed against a comparible out-of-the-box system showed that our bespoke DB work yielded a 5x boost in performance!

    The thing I don't understand is why commercially written software seems to have universally poor database design. Most developers I work with are pretty clued up on the database side of things. I'm beginning to think there are special recruitment agencies for crap DB developers.

  • David.Poole (1/21/2009)


    The first stress test showed against a comparible out-of-the-box system showed that our bespoke DB work yielded a 5x boost in performance!

    Now, that's exactly what Gus (GSquared) and I are talking about. You improved the system performance, in those areas you touched, by 5X... and that's not even what you set out to do! You just had Developers that knew what they were doing.

    The thing I don't understand is why commercially written software seems to have universally poor database design. Most developers I work with are pretty clued up on the database side of things. I'm beginning to think there are special recruitment agencies for crap DB developers.

    Heh... take a look back at what some folks have said in this very thread for the answer to that! The main theme for those seems to be something along the lines of "get the product out at all cost (ie before the supposed competitors)". Time-to-Market and the insane schedules levied in product development is (almost) the sole reason for shrink-wrapped or third party crap code.

    The other problem isn't the problem of recruitment agencies... it's the dummy managers that don't know how to interview for quality developers. Most of them think that if you have mile of letters after your name, that you actually know something. I had a guy that claimed to have an MS/CS, DBA certification in both, and also claimed more than 10 years of experience in both SQL Server and Oracle [font="Arial Black"]and [/font]claimed, on his resume, to be (and I quote), a "9 out of 10" in both database engines. My boss wanted to hire him on the spot... I asked him one question which I thought to ask just to "break the ice" by giving him a simple question... "Using a SELECT statement, how do you return the current system date and time as part of the return in either?" He couldn't tell me!!! Interview was over in less than a minute.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/21/2009)


    ... simple question... "Using a SELECT statement, how do you return the current system date and time as part of the return in either?" ....

    Bad to the bone 😉

    We old farts always try to "break the ice" using tricky questions.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (1/22/2009)


    Jeff Moden (1/21/2009)


    ... simple question... "Using a SELECT statement, how do you return the current system date and time as part of the return in either?" ....

    Bad to the bone 😉

    We old farts always try to "break the ice" using tricky questions.

    I thought I was being nice and friendly by giving such a simple question to start with. After that, it became the first question I asked on every interview for SQL Developers...

    "Which T-SQL system function returns the current date and time?"

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/22/2009)


    ALZDBA (1/22/2009)


    Jeff Moden (1/21/2009)


    ... simple question... "Using a SELECT statement, how do you return the current system date and time as part of the return in either?" ....

    Bad to the bone 😉

    We old farts always try to "break the ice" using tricky questions.

    I thought I was being nice and friendly by giving such a simple question to start with. After that, it became the first question I asked on every interview for SQL Developers...

    "Which T-SQL system function returns the current date and time?"

    I had a question about how many tables it takes to do a many-to-many joins in my last tech interview. I gave the standard "it takes 3 tables" answer, and that was accepted.

    I checked afterwards what would have happened if I had given the actually correct answer of "one". (You have to violate 1NF to do it, but you can do a many-to-many in a single table joined to itself by a delimited list stored in a single column, or other variations on that.) I wouldn't have passed the interview.

    You can do a two-table many-to-many without violating even 5NF if it's a table being joined to itself through another table. That also wouldn't have flown in the interview.

    But I can understand not knowing those ones. Those are exceptions and/or violations, of the basic idea of the query. But not knowing about GetDate?? That's more than a little weird to me. My wife knows about GetDate, and she's not even an IT person, she just thinks it's funny that database nerds are so clueless that the command returns the date and time, instead of hooking you up with a hot person of the opposite gender!

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have my red-light questions that fire off warning bells for me.

    Any DBA who does not know what a temporary table or table variable is.

    Any DBA who can't see why WHERE DATEDIFF(mi,DateCreated,GETDATE())<2 doesn't perform well.

    Anyone who things SELECT * is a good idea.

    Any DBA who thinks an index on a bit field is a good idea.

    A surprising number of alegedly experienced DBAs seem to get tripped by these. I don't think I'm being harsh with these sorts of questions. In fact I think of these as basic warm up questions to settle the candidate down for the proper questions.

  • Joe Celko (1/22/2009)


    getdate()?? Didn't you mean to say CURRENT_TIMESTAMP? We want to use all the ANSI/ISO stuff that Microsoft will give us instead of UNIX dialect museum exhibits.

    Nah. That one's not what makes my wife laugh. That's far more important! 🙂

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 196 through 210 (of 255 total)

You must be logged in to reply to this topic. Login to reply