Group by

  • Brandie Tarvin (3/11/2011)


    Now that's an interesting solution I never would have thought of.

    But Parsename() only works because it's treating the IPAddress as a Server.Database.Schema.Object item (because of the periods). It wouldn't work if Roy was using some other data or delimiter. And it "fails" if I stick in data that contains more than 4 periods. It returns a NULL for the value '22.28.23.1.5'.

    Parsename will fail if you try to use it for IPv6 addresses. Any code using it to parse out IP addresses is already obsolete because of that, and has a built-in Y2K type issue as they become more common.

    Also, using parsename in this fashion requires more complex documentation, since it's being used in a fashion that it isn't designed for. Anyone looking up "parsename" in BOL/MSDN is potentially going to be confused by the code.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (3/15/2011)


    Brandie Tarvin (3/11/2011)


    Now that's an interesting solution I never would have thought of.

    But Parsename() only works because it's treating the IPAddress as a Server.Database.Schema.Object item (because of the periods). It wouldn't work if Roy was using some other data or delimiter. And it "fails" if I stick in data that contains more than 4 periods. It returns a NULL for the value '22.28.23.1.5'.

    Parsename will fail if you try to use it for IPv6 addresses. Any code using it to parse out IP addresses is already obsolete because of that, and has a built-in Y2K type issue as they become more common.

    Also, using parsename in this fashion requires more complex documentation, since it's being used in a fashion that it isn't designed for. Anyone looking up "parsename" in BOL/MSDN is potentially going to be confused by the code.

    Exactly the point I was trying to make, but much better said. Thank you, Gus.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • GSquared (3/15/2011)


    Also, using parsename in this fashion requires more complex documentation, since it's being used in a fashion that it isn't designed for. Anyone looking up "parsename" in BOL/MSDN is potentially going to be confused by the code.

    Ummmm.... you've seen my code. I have this weird table with nothing in it but a wad of sequential numbers that I use all the time. It's not documented in BOL, it's not taught in MS classes, and there are no certifications that require knowledge of it. I even Cross-Join it to variables for who knows what and Cross-Joins are supposed to be one of the worst things you can do in a database1

    I also Cross-Self-Join a table two or more times and never use anything from either table and people have actually started to do the same!

    I cascade variables in the same SELECT and sometimes update both a column and a variable at the same time across millions of rows and none of that is in BOL or any certs.

    I use two copies of different row numbers in the same query to group on-off sequences. I subtract whole numbers from dates to find contiguous date ranges. I use MAX to pivot things that only have one value.

    And the list goes on ad infinitum.

    How do people know what I'm doing? It's easy. I use one of the best documented yet least used options in SQL Server there is...

    -- text_of_comment

    How would they know what I'm doing with PARSENAME? It certainly doesn't require any complex documentation.

    --===== Split the IP4 String into 4 parts using PARSENAME

    Ya gotta trust me on this. Using PARSENAME for something it wasn't designed for just isn't a problem

    😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/15/2011)


    GSquared (3/15/2011)


    Also, using parsename in this fashion requires more complex documentation, since it's being used in a fashion that it isn't designed for. Anyone looking up "parsename" in BOL/MSDN is potentially going to be confused by the code.

    Ummmm.... you've seen my code. I have this weird table with nothing in it but a wad of sequential numbers that I use all the time. It's not documented in BOL, it's not taught in MS classes, and there are no certifications that require knowledge of it. I even Cross-Join it to variables for who knows what and Cross-Joins are supposed to be one of the worst things you can do in a database1

    I also Cross-Self-Join a table two or more times and never use anything from either table and people have actually started to do the same!

    I cascade variables in the same SELECT and sometimes update both a column and a variable at the same time across millions of rows and none of that is in BOL or any certs.

    I use two copies of different row numbers in the same query to group on-off sequences. I subtract whole numbers from dates to find contiguous date ranges. I use MAX to pivot things that only have one value.

    And the list goes on ad infinitum.

    How do people know what I'm doing? It's easy. I use one of the best documented yet least used options in SQL Server there is...

    -- text_of_comment

    How would they know what I'm doing with PARSENAME? It certainly doesn't require any complex documentation.

    --===== Split the IP4 String into 4 parts using PARSENAME

    Ya gotta trust me on this. Using PARSENAME for something it wasn't designed for just isn't a problem

    😉

    Except that I've actually seen overloading functions like that be a problem.

    And you ignored the first (and more important) part of my post, about Parsename being incompatible as a solution for resolving octets out of an IPv6 address.

    Every solution you've built by using Parsename for IP address parsing is going to have to be completely rewritten the second you have to deal with those. Since they are already in use, so you better get to work doing something that could have been, to use your own words, "done right once since you don't have time to do it twice".

    If you use a string-based parsing routine, you can change a single parameter value (from "." to ":"), and get pieces of an IPv6 address. If you use Parsename for it, you have to rewrite the whole function.

    I could almost see overloading Parsename like that in the 90s, but IPv6 has been on the horizon as an issue for forward-compatibility for a LONG time. You know as well as I do that building code that won't work next year just to shave a minute off of development time is "penny-wise, dollar-foolish".

    As for the documentation, I hold to a policy that documenting "what" is nowhere near so important as documenting "why". Anyone who can use Google can parse out the What on a piece of code. "--===== Split the IP4 String into 4 parts using PARSENAME" is a "what". Anyone who can look up Parsename online and can look at what columns/variables you're using it on can already figure out what it's doing. "-- Parsing IPv4 address octects for location information. Overloading Parsename for convenience. N.B.: Not compatible with IPv6. ToDo: Rewrite for forward compatibility when time allows." That would be useful documentation.

    I can't count the number of times I've had to thrash my way through pages and pages of code to reverse engineer business rules out of it, because every line of code said, quite clearly, exactly what it was doing, but nobody ever bothered to say why. "--Inserts into Customers table", "--Updates Orders table", "--Inserts into Inventory table": These kind of comments are about as useless for documentation as it's possible to get, but I see them ALL over the place.

    If Why weren't more important that What, you'd never have written the articles for SSC that you've written. You document like crazy in those articles. "I'd also been using the INDEX(0) hint to force the use of the clustered index in an attempt to quell the fears that some folks may have in using the method. Judging from the discussions on several other forums about the method, the use of the index hint hasn't quelled any fears at all. Ironically, that's a good thing because all it ever did was slow things down... a lot. It's actually not needed and won't be included in any of the examples in this article." That's all Why, not What, and it's a beautiful bit of documentation. You'd want to summarize like crazy before stating something like that inside a code-comment, for speed and ease purposes, but it says Why you're doing something, or not doing something expected. That's documentation!!!!! That's worth the time it took to compose and type it!! It's Tolkein or Homer or Dostoyevsky, and "Inserts into MyTable" is the menu at McDonalds!

    So, sorry, but I still see overloading Parsename as a liability when it comes to documentation, even disregarding the fact that it's a broken solution, and has been since 1996.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Before I answer, I don't know what you specifically mean by "overloading" in this case. Do you simply mean using something for which it was not intended? Please explain.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • "Overloading" is a programming term for using a function for more than one thing.

    The most common example that most encounter is that the "=" symbol has two functions, one to test for equality ("if a = b") and another to assign value ("set a = b").

    Not about whether it's designed for it or not, it just means using one function for two things. Parsename can parse out object names in the database, and it can parse out IPv4 address octets (pieces), hence, it's "overloaded" with more than one use.

    Edit: I should mention that overloading is the source of the documentation problem I mentioned, not any particular habit of yours. Every time a word in the dictionary has two completely different meanings, it adds complexity to communication. One of the best examples is "bi" as a prefix: It can mean "two", or "half", hence, "bi-annual" can technically mean every two years, or twice every year (every half year), which makes use of that prefix inherently less clear. Since Parsename has multiple functions, one of which isn't actually officially documented, that makes its use inherently less clear.

    Since you're using it for something which, to go back to the dictionary example, can't be looked up in the source material, that adds to the confusion. If someone isn't familiar with Parsename, and looks it up in BOL, they're still left with "he did what with that?" for at least a few seconds.

    I'm not saying it automatically generates the kind of confusion that, for example, Microsoft documentation suffered from in the early '90s, I'm just saying it adds somewhat to the complexity that someone has to deal with when working with code that uses Parsename that way.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • But that's no reason to not use a function for something not included or even contrary to established documentation especially if it greatly simplifies the resolution to a problem or provides and reasonable increase in performance. Whether we use a home grown splitter or something like ParseName, there is really no difference in the amount of documentation required in the code that splits out the first two octets of an IPv4 or IPv6. I'll also say that every person who I've introduced ParseName as a splitter to has immediately gleaned that ParseName is being used as a splitter without further explanation from me even if the delimiter is a non-period character.

    As for the differences between IPv4 and IPv6, I have to admit that I'd likely have a function for each even in the presence of a decent splitter function just as I have more than one splitter function depending on the condition of the input data. For example, a tight cteTally splitter is great for shorter length CSV elements (eg: < 10 character) whereas an rCTE splitter is much better for longer elements (eg: ~80 characters) even when the overall length of a CSV string is nearly identical.

    they're still left with "he did what with that?" for at least a few seconds.

    They only have to look it up once and if they can learn something new in a few seconds, then I've really done my job very well. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/19/2011)


    But that's no reason to not use a function for something not included or even contrary to established documentation especially if it greatly simplifies the resolution to a problem or provides and reasonable increase in performance. Whether we use a home grown splitter or something like ParseName, there is really no difference in the amount of documentation required in the code that splits out the first two octets of an IPv4 or IPv6. I'll also say that every person who I've introduced ParseName as a splitter to has immediately gleaned that ParseName is being used as a splitter without further explanation from me even if the delimiter is a non-period character.

    As for the differences between IPv4 and IPv6, I have to admit that I'd likely have a function for each even in the presence of a decent splitter function just as I have more than one splitter function depending on the condition of the input data. For example, a tight cteTally splitter is great for shorter length CSV elements (eg: < 10 character) whereas an rCTE splitter is much better for longer elements (eg: ~80 characters) even when the overall length of a CSV string is nearly identical.

    they're still left with "he did what with that?" for at least a few seconds.

    They only have to look it up once and if they can learn something new in a few seconds, then I've really done my job very well. 🙂

    We're going to have to agree to disagree on this one.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (3/21/2011)


    We're going to have to agree to disagree on this one.

    Heh... agreed. Thanks, Gus.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply