The T-SQL Paradigm

  • Jeff Moden (4/7/2009)


    ... just don't want anyone forcing such rules on me.

    Now, THAT'S what I'm talking about! The use of brackets, though optional, is one of the rules of thumb that I was referring to earlier that is of the aesthetic category. Just as in the case with aliases, though it may be a good idea, it won't make any difference on performance whatsoever. I certainly don't want rules such as that imposed upon me unless my organization requires it.

  • I'm almost of the opinion that the brackets around identifiers would be better required. It might speed the parser. We surround strings with single quotes and you don't see much griping about that. Think of all those key strokes saved by EMPEROR=Charles rather than EMPEROR = 'Charles'.

    Could we also require the white space around operators too? We could also use the double equals for testing and the single for assignment. That one saves me so much when writing C++, C#, PHP, and the like. Context sensing of operator meaning is so outmoded.

    ATBCharles Kincaid

  • Jeff Moden (4/6/2009)


    Aaron N. Cutshall (4/6/2009)


    David Walker (4/3/2009)


    Although this is off the original topic, I still haven't seen a SQL statement that was easier to read with the aliases than without them. Please show me some, if you have any.

    I don't have any short examples to give you because that rather defeats the purpose. Some of our queries, if printed, are several pages long due to extensive use of CTEs, inner joins and conditional clauses. They are quite difficult enough to follow as it is without adding large table names repeatedly through the statement.

    The problem we found with using full table names in our queries is that it quickly becomes full of table names that, because of the number of fields being referenced, get repeated throughout the query and make it difficult to follow the logic because the eye tends to focus on the table names due to their repetition and not on the field names. By using a known set of aliases, we've reduce the problem dramatically. It becomes one of those situations where it's difficult to see the forest due to the trees.

    I don't mean to impinge upon David's experience and from a certain point of view I can understand his position. As a rule of thumb, it is a good idea to use as full of a declaration as possible for unambiguity. Unfortunately, as most everyone can attest to, the development world is not black and white and not all rules can be applied in all situations. All we can do is to balance all of these good ideas with each situation and make the best of it that we can. At the moment, I can't think of any "must be followed" rules that should be used in all situations regardless of circumstances. The true measure of a developer is in the application of good judgment in each given situation. That, more than anything else, will determine the success or failure of a project.

    Spot on. I normally use table aliases to add simple clarity to the code. For example and to make a much longer story shorter, we had a rather hairy query that was supposed to return a report of failures but only if the failure hadn't been followed by a success. The failures could happen many times as the code being logged could do up to 10 retries based on whether or not a particular download failed or succeeded.

    Obviously the log table had to be queried not once, but twice. Once for failures and once for success and then joined for exclusion. The ProcLog table was given two aliases... "fail" and "pass". It made life a whole lot easier to follow. And, obviously, you couldn't use the full table name for the two instances because it would have given errors in ambiguity.

    But, like you said, to each their own.

    As for "rules that must be followed", I can think of a few. For example, in a joined update, the object of the update must be included in the FROM clause or bad things can sometimes happen depending on a bazillion different factors such as scale, indexes available (or not), and whether or not parallelism is spawned. We had such an UPDATE slam 4 CPU's into the wall for over 2 hours. When we made the correction, the code ran in about 20 seconds.

    Of course, if you end up in my shop, the "rules that must be followed" increase quite a bit. 😉 The first rule that many developers seem to omit is that the code actually has to work. 😉

    With self-joins, which is what you have with "pass" and "fail", the aliases are absolutely required. I have no problem with that.

    For complex queries, the more complex they are, the more the presence of any *unneccessary* aliases make it harder to read. (Self-joins are not unneccessary aliases.) At least that is what I have found. Having the table names there, even though the eye tends to focus on them, makes it easier than seeing just the aliases and having to match up the alias with the real table name.

    It might be easier for the person who wrote the code originally to see it better if there are table aliases, but I think that ANYONE who comes in later and tries to read complex code will find it easier if the table names are not aliased (unless they have to be).

    I won't begrudge you doing whatever you need with multi-page SQL statements (yech) but in your case, I would try a few common table expressions and see what happens. (Try to find an EXAMPLE that shows TWO common table expressions defined. They exist, but they are rare.)

    I wish that it WASN'T case that 99.999% of the examples in the wild used table aliases to teach joins. Again, this teaches people that aliases are part of the required join syntax, and they are not.

    I have probably said enough on this subject now, so I think I'll stop.

    David Walker

  • Charles Kincaid (4/7/2009)


    I'm almost of the opinion that the brackets around identifiers would be better required. It might speed the parser. We surround strings with single quotes and you don't see much griping about that. Think of all those key strokes saved by EMPEROR=Charles rather than EMPEROR = 'Charles'.

    Could we also require the white space around operators too? We could also use the double equals for testing and the single for assignment. That one saves me so much when writing C++, C#, PHP, and the like. Context sensing of operator meaning is so outmoded.

    If you don't surround strings with single-quotes, they are column names instead of string literals. Necessary differentiation.

    I find reading stuff with brackets more difficult than without. So I use object names that don't require them, and avoid the issue as much as possible. Just the way my eyes have been trained to read. Probably could change it, but I don't see a reason to. No ROI on that.

    White space around operators is good, but why make it required?

    And on the point about operators, why would the second version save you time?

    set @Variable = value;

    set @Variable == value;

    Maybe I'm missing something, but I don't see any value to the second one. Doesn't make it more readable, doesn't save time or effort, there's certainly no advantage to the compiler either way. The first one has the value of being more English-familiar. I don't think I've run into a newbie dev yet who had to ask what the first one meant, but I've run into plenty of people who thought the double-equals was a typo.

    And, if we're going to go for explicit assignment, why "==" and not ":="? They both have plenty of history and tradition and all that.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Sorry. My attempt at sarcastic humor was missed. In the languages I mentioned the assignment operator is the single quote and the equality test is the double equal. Leaving the second equal out turns your expression into an assignment. The result of the assignment is TRUE. So

    If (A = B) {delete_data(ID);}

    will copy B into A, return TRUE, and perform your delete_data without much regard for the content of A or B. This is almost guaranteed NOT to be what was intended and is hard to spot by the "Mark One" eye-ball.

    ATBCharles Kincaid

  • Charles Kincaid (4/7/2009)


    I'm almost of the opinion that the brackets around identifiers would be better required. It might speed the parser. We surround strings with single quotes and you don't see much griping about that. Think of all those key strokes saved by EMPEROR=Charles rather than EMPEROR = 'Charles'.

    Could we also require the white space around operators too? We could also use the double equals for testing and the single for assignment. That one saves me so much when writing C++, C#, PHP, and the like. Context sensing of operator meaning is so outmoded.


    Now you're getting somwhere! bring in some of the C++/C#/Java operators. I'd be good with ++, -- and I like == for comparison and = for assignment. I'm not to keen on using || in place of "or".

    I think your example with strings is off base though. Pick your language and you have to enclose a string in something to designate extent (and sometimes type). You need to indicate that "Charles Kincade" is the variable's assigned value. I don't give a hoot about making life easy for the parser... just clear. That's why I dislike any shift+ key combo (except for capital letters).

    I hope somebody from the SQL team wades thru this post for suggestions.

    --Paul Hunter

  • David Walker (4/7/2009)


    I won't begrudge you doing whatever you need with multi-page SQL statements (yech) but in your case, I would try a few common table expressions and see what happens. (Try to find an EXAMPLE that shows TWO common table expressions defined. They exist, but they are rare.)

    BWAA-HAAA... 🙂 We're stuck with SQL Server 2000 at work so that won't happen in production any time soon. Also, many of us have written both serial and parallel CTE examples for lot's of folks on this forum. The reason why they're hard to find is almost none of us say "TWO common table expressions" anywhere in the text or the code.

    I agree, for the most part, that multi-page SQL statements can be pretty ugly and, yeah, "Divide'n'Conquer" usually works much faster to boot.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jason Miller (4/7/2009)


    Jeff Moden (4/7/2009)


    Other people love the darned things and that's fine for them... just don't want anyone forcing such rules on me.

    I'm stepping out on a limb here....

    Didn't you earlier state that there are some rules of thumb that you force others to work with / through?

    Absolutely... The code has to be readable, properly documented, and basically devoid of all brackets. 😉 What I don't want is for the IDE or the "compiler" to force those choices on me or my developers because, as has been said so many times on this thread, there are always exceptions to the rules of thumb... even if they're my rules of thumb.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/7/2009) there are always exceptions to the rules of thumb... even if they're my rules of thumb.

    Hehe, the exceptions prove the usefulness of the rule.

    I've written way more than my fair share of silly little compilers and interpreters that have never seen the light of day. Apart from the fundamental issue of "should I parse LTR or RTL?", there is only one real show stopper of a problem that I've seen.

    Should the instruction set be escaped, or should the data be escaped, or should the interpretation be context sensitive?

    Natural languages are heavily context sensitive, particularly in their aural aspects.

    Escaping the the instruction set is the opposite pole from a natural language so is likely to be the most annoying to write in.

    Escaping [the data] is probably a generally acceptable middle path. But there are always exceptions to any interpretation of a rule, aren't there.

    :Whistling:

    Peter Edmunds ex-Geek

  • Charles, you must be talking about the way Java, C# and other languages is working. To me it is plain stupid that you have to use == for one thing and = for another thing. I wonder what the developers of Java etc. thought when they developed these languages. I like to think that = is equals in anyway and <> is not equals and not !=. This confuses the hell out of me and that is why I am glad that T-SQL was not developed in that way. Maybe that's because I was trained in languages like Visual Basic and T-SQL.

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • Steve Jones - Editor (4/7/2009)I think T-SQL is relatively easy to read as well, once it's formatted the way I see it.

    At least as easy to read as some of the C++ I've seen,

    But that was half of my point - C++ is neither terse nor verbose. Variable naming conventions tend to make things worse, but the SQL language is more verbose than C++, therefore logically you would expect it to be easier to read.

    If SQL is one of the first languages that you learn, then it is always going to be more readable than anything else, but for me, I came across C++ and SQL at much the same time, as probably my fifth and sixth languages, so they should be on an equal footing. Incidentally, previous languages were Hypertalk, Fortran, Basic, PERL, COBOL and stuff like that, most of which have a greater similarity to SQL than to C++

    Throw away your pocket calculators; visit www.calcResult.com
  • Steve Jones - Editor (4/6/2009)


    davidandrews13 (4/6/2009)


    there's nothing that i find unintuitive about it.

    I think you're ready to be a DBA some day!

    thanks! although i'm not sure whethre thats a good thing or not 😀

  • Manie Verster (4/7/2009)


    Charles, you must be talking about the way Java, C# and other languages is working. To me it is plain stupid that you have to use == for one thing and = for another thing. I wonder what the developers of Java etc. thought when they developed these languages. I like to think that = is equals in anyway and <> is not equals and not !=. This confuses the hell out of me and that is why I am glad that T-SQL was not developed in that way. Maybe that's because I was trained in languages like Visual Basic and T-SQL.

    I admit I used to get quite confused by the == operator in the C languages. But after a few compile errors, or rather core dumps, I acclimated very quickly! I haven't yet come across any rational as to why the == operator was chosen as the equality operator instead of overloading the = operator and relying on context to see if its assignment or equality. I do however prefer != over <> because ! is the not operator in C and = is equals (or should be, hehe), so it seems logical to have != be not equals.

  • I haven't yet come across any rational as to why the == operator was chosen as the equality operator instead of overloading the = operator and relying on context to see if its assignment or equality.

    If you overload the = operator then you can't do:

    if ((A=B)==(C=D))

    Because:

    if((A==B)==(C==D))

    Is a different, but legitmate statement. While I would probably never use the first condition, I could imagine a circumstance where I would want to use the second one [Which, admittedly, is just a brief form of if (((A==B) && (C==D)) || ((A!=B) && (C!=D))) ]

    It also makes all the equality operators have 2 characters while bitwise operators have 1 character which is a useful mnemonic.

    But the real reason is probably that it made the grammar easier.

    --

    JimFive

  • So if we are going to talk about how NOT to construct a language try RPG. That's not "Role Playing Game" or "Rocket Propelled Grenade" but "Report Program Generator". (Search for RPG 400) Things mean different things based upon what column in the line they are.

    But then again "Real Men Code In Assembler". I worked at a place where the assembler was to SLOW so we chucked it an wrote in machine code. Him, "What's the difference between MASM and ASM?" Her, "You trying to sound like Golem." One's particular coding preferences might be "precious". "Yes, we likes it."

    ATBCharles Kincaid

Viewing 15 posts - 121 through 135 (of 266 total)

You must be logged in to reply to this topic. Login to reply