SQL query help

  • CELKO (4/27/2015)


    Calling either a scalar is outrageous: neither is "a quantity having magnitude but no direction, and representable by a single real number" since neither is a quantity and neither has magnitude (neither has direction and each can be represented by a single real number in all sorts of ways, so the rest of the definition is satisfied); and if you are using some other definition of scalar you are completely out of line both with mathematics and with decent English dictionaries.

    This used to be measurement theory, but now it is “data theory” to go with “data Science”, etc. I miss being just a statistician sometimes. Look at Joe Celko's Data, Measurements and Standards in SQL [ISBN-13: 978-0123747228].

    “Scalar” means measured with a scale in this discipline. In case of identifiers, it is a nominal scale. We could also have categorical, rank, ordinal, interval and ratio scales for other data elements.

    Well, an identifier for a position is just as much an identifier as an identifier for a book with a paricular media-type and edition. So if the ISBN, because it's an identifier, can be a scalar, despite being a quartet or a quintet, because it's an identifier, the position identifier too can be a scalar, despite being a pair (or a triplet), because it too is an identifier. So even with that interpretation of scalar it's clear that your claim that ISBNs and position identifiers have an essential difference that an ISBN is scalar while a position identifier can't be a scalar falls down by your own argument.

    Tom

  • CELKO (4/22/2015)


    An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings.

    Credit card number is not an identifier.

    When you card is expired you get a replacement one with the same number. So, you get 2 or more cards with the same number, but the cards are not identical: you cannot use them the same way.

    For a credit card it's a combination of string(number) and date(expiry date) values which identify a card. And expiry date is a DATE, NOT STRING, because expressions 04/2015, 04-2015, 04/15, 04-15 present the same value.

    Therefore, identifiers may be of any data type.

    Whatever is suitable in any particular situation.

    _____________
    Code for TallyGenerator

  • MAC address - is it a string?

    By all means - NO!

    Compare E8:AB:... and e8-ab-...

    These are completely different strings, but they present the same MAC address.

    Just another example of an identifier which cannot be stored as a string.

    _____________
    Code for TallyGenerator

  • CELKO (4/23/2015)


    The ISBN is atomic; it identifies one and only one book. The book is my “unit of work” which cannot be decomposed. The language and publisher are levels of aggregation up the hierarchy.

    "http://www.sqlservercentral.com" is a code which identifies one and only web site. Just like ISBN identifies a book.

    Is it atomic? We all know the answer.

    As well as ISBN codes web addresses are built from separate blocks - segment, protocol, etc.

    Are they "levels of aggregation up in the hierarchy"? Well, for someone who collects statistics, it may look like that. But we all know it actually goes another way around.

    _____________
    Code for TallyGenerator

  • Sergiy (4/27/2015)


    MAC address - is it a string?

    By all means - NO!

    Compare E8:AB:... and e8-ab-...

    These are completely different strings, but they present the same MAC address.

    Just another example of an identifier which cannot be stored as a string.

    As much fun as it is to see people pushing back on Mr Celko, and as much as I may disagree with a few of his examples,

    create table #test

    (

    minimac varchar(5) PRIMARY KEY CLUSTERED

    )

    GO

    insert #test (minimac) values ('E8:AB')

    GO

    insert #test (minimac) values ('e8:ab')

    GO

    drop table #test

    Violation of PRIMARY KEY constraint 'PK__#test_____40C5B956C93DE1AB'. Cannot insert duplicate key in object 'dbo.#test'. The duplicate key value is (e8:ab).

    The strings aren't completely different as far as SQL server is concerned except with specific collations.

    Just sayin'

  • Nevyn (4/28/2015)


    Sergiy (4/27/2015)


    MAC address - is it a string?

    By all means - NO!

    Compare E8:AB:... and e8-ab-...

    These are completely different strings, but they present the same MAC address.

    Just another example of an identifier which cannot be stored as a string.

    As much fun as it is to see people pushing back on Mr Celko, and as much as I may disagree with a few of his examples,

    create table #test

    (

    minimac varchar(5) PRIMARY KEY CLUSTERED

    )

    GO

    insert #test (minimac) values ('E8:AB')

    GO

    insert #test (minimac) values ('e8:ab')

    GO

    -- d r o p table #test

    Violation of PRIMARY KEY constraint 'PK__#test_____40C5B956C93DE1AB'. Cannot insert duplicate key in object 'dbo.#test'. The duplicate key value is (e8:ab).

    The strings aren't completely different as far as SQL server is concerned except with specific collations.

    Just sayin'

    You are only correct when you are using a case insensitive collation. The code above works just fine on my SQL 2012 DE instance where I use a case sensitive collation.

  • Good point.

    But the point being argued was whether you could put a MAC address in a string and have it handle uniqueness properly, and you clearly can, you just need to be careful about the collation used for that string.

  • Nevyn (4/28/2015)


    Sergiy (4/27/2015)


    MAC address - is it a string?

    By all means - NO!

    Compare E8:AB:... and e8-ab-...

    These are completely different strings, but they present the same MAC address.

    Just another example of an identifier which cannot be stored as a string.

    As much fun as it is to see people pushing back on Mr Celko, and as much as I may disagree with a few of his examples,

    create table #test

    (

    minimac varchar(5) PRIMARY KEY CLUSTERED

    )

    GO

    insert #test (minimac) values ('E8:AB')

    GO

    insert #test (minimac) values ('e8:ab')

    GO

    drop table #test

    Please note the difference between the strings in my example and in your code.

    I hope it was an honest mistake on your side.

    _____________
    Code for TallyGenerator

  • Ah, the seperators. Did not notice that.

    Well then that one gets back to the argument Lynn and Scott were having about ZIP codes and other postal codes.

    Or more appropriately it is analogous to storing phone numbers. Usually the formatting is stripped out before storing, but I don't think I've ever seen one stored as an integer.

  • Nevyn (4/28/2015)


    Or more appropriately it is analogous to storing phone numbers. Usually the formatting is stripped out before storing, but I don't think I've ever seen one stored as an integer.

    Storing phones as strings as they entered, with delimiters or without them, is totally lame programming.

    When you enter a phone number it must be split to the logical parts according to the rules of the country it's been entered in. You may have a peek how to do it from your smart phone.

    And every part of a number must be stored separately.

    Then, when you need to dial it, the phone can build a proper dialling sequence of digits according to the dialling rules of the country you're currently in.

    Believe me, Americans would not be able to use their phones in, say, France, if numbers would be dialled just as they've been entered.

    _____________
    Code for TallyGenerator

  • Sergiy (4/27/2015)


    CELKO (4/22/2015)


    An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings.

    Credit card number is not an identifier.

    When you card is expired you get a replacement one with the same number. So, you get 2 or more cards with the same number, but the cards are not identical: you cannot use them the same way.

    For a credit card it's a combination of string(number) and date(expiry date) values which identify a card. And expiry date is a DATE, NOT STRING, because expressions 04/2015, 04-2015, 04/15, 04-15 present the same value.

    Therefore, identifiers may be of any data type.

    Whatever is suitable in any particular situation.

    ALthough I agree with your conclusion, I can't agree with your premises. My credit cards change their numbers on replacement (whether after expiry on time, or after looss and cancelation). So do my debit cards.

    Is the US banking system still using the old thoroughly insecure renewal/replacement system which I seem to remember from the 1960s and 70s? If it does that would be pretty appalling.

    Tom

  • TomThomson (4/28/2015)


    Is the US banking system still using the old thoroughly insecure renewal/replacement system which I seem to remember from the 1960s and 70s? If it does that would be pretty appalling.

    I'm not in US.

    I'm in NZ.

    And if you think this is appaling - go to Australia.

    They still accept credit cards with no PIN code, just signature.

    _____________
    Code for TallyGenerator

  • Sergiy (4/28/2015)


    Nevyn (4/28/2015)


    Or more appropriately it is analogous to storing phone numbers. Usually the formatting is stripped out before storing, but I don't think I've ever seen one stored as an integer.

    Storing phones as strings as they entered, with delimiters or without them, is totally lame programming.

    When you enter a phone number it must be split to the logical parts according to the rules of the country it's been entered in. You may have a peek how to do it from your smart phone.

    And every part of a number must be stored separately.

    Then, when you need to dial it, the phone can build a proper dialling sequence of digits according to the dialling rules of the country you're currently in.

    Believe me, Americans would not be able to use their phones in, say, France, if numbers would be dialled just as they've been entered.

    Well I checked how my smartphone stores contacts and it lets me enter a lot of consecutive numbers with no logic or rhyme or reason behind them, saves it for me and lets me retrieve it. Being that I stopped at about 80 digits I have reasonable confidence these are being stored in a large string.

    Now the contacts app will happily apply a display mask while im entering valid numbers. And it will happily recognize when ive entered a number that does or does not need a country code or long distance indicator at the moment I tell it to dial. I am sure based on the native country selected for the phone vs the country of the network the phone is connecting to, the front end can apply all sorts of cool rules to validate and dial the number.

    But what it stored was a string.

    And in all seriousness, I have gotten to see Oracle and SQL server databases for any number of commercial applications (some good, some bad), and I have never encountered one that stores telephone numbers in the database the way you describe.

    If you have seen a relational database with a phone number implemented this way, I'd love to see the schema. Does it have separate tables for each country with different rules (so it can get the right number of integers to store for each one and apply appropriate constraints), or one table with 4 or 5 numeric columns, and which columns actually have values and what the contents of each means varies country to country? Not being sarcastic, if you are saying this is widely done I'd love to see DDL of an example.

    edit:As a side note, checked how adventureworks2012 implemented it, and it has an nvarchar(25) null user defined type, and seems to have formatting included.

  • PSB sure did get a lot more than bargained for!!

  • Nevyn (4/28/2015)


    But what it stored was a string.

    Most of the numbers stored in my phone contain country codes, even for my own home number. Just in case I'll use the phone when abroad.

    Nevertheless, I'm not charged with international rates when I call those numbers locally. That means that the phone identifies country codes in the numbers and excludes them from the dialling sequence when it matches the country code for my current location.

    Another example.

    I can store the same local number in 2 different ways:

    +6495553322 or

    095553322

    Resulting dialing sequences are the same for both of them.

    It clearly indicates that the phone identifies and separates country code (64 and NULL in this example), area code (9) and the local number when saving those strings. And it generates dialing sequences by building it from saved "blocks".

    P.S. I've seen plenty of crappy database designs myself.

    And AdventureWorks (all of its versions) is one of them. I cannot say it's the worst one I ever saw, but it's in top 10 of the "hall of shame".

    _____________
    Code for TallyGenerator

Viewing 15 posts - 31 through 45 (of 76 total)

You must be logged in to reply to this topic. Login to reply