"Real" copy of Codd's 12 (13) Rules for RDBMS

  • The count of members in a set is not the same as memberwise addition. Taking two 2 balls from a set with two balls in it results in a set with a COUNT of balls =0.

    Quite right. But if you write numbers on the balls then the sum of those numbers is just as valid whether there are 2, 1 or 0 balls. You don't need 2 balls to make a sum. The sum of any set is equal to the sum of the sums of its subsets - that is simply the definition of the addition operation. It follows that the summation of set A less set A must equal zero. There is no other possible result that preserves the commutativity of the sum operation and there is certainly no other possible result that matches the physical reality of sets of things.

  • David Portas (8/8/2010)


    The count of members in a set is not the same as memberwise addition. Taking two 2 balls from a set with two balls in it results in a set with a COUNT of balls =0.

    Quite right. But if you write numbers on the balls then the sum of those numbers is just as valid whether there are 2, 1 or 0 balls. You don't need 2 balls to make a sum. The sum of any set is equal to the sum of the sums of its subsets - that is simply the definition of the addition operation. It follows that the summation of set A less set A must equal zero. There is no other possible result that preserves the commutativity of the sum operation and there is certainly no other possible result that matches the physical reality of sets of things.

    Again you are getting yourself into trouble by bringing up the physical reality. Tell me - what EXACTLY is the physical reality of an empty set?

    And also "sums of sets" is STILL not the same as "memberwise sums of sets". You just cannot keep switching vocabulary and treat them the same.

    Look - I know what the usual conventions say about memberwise operations on sets with 1 or no members, and why this was done. It's just healthy to recognize them for what they are, and that we could just as justifiably have gone some other direction.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (#4) (8/8/2010)


    Again you are getting yourself into trouble by bringing up the physical reality. Tell me - what EXACTLY is the physical reality of an empty set?

    And also "sums of sets" is STILL not the same as "memberwise sums of sets". You just cannot keep switching vocabulary and treat them the same.

    Look - I know what the usual conventions say about memberwise operations on sets with 1 or no members, and why this was done. It's just healthy to recognize them for what they are, and that we could just as justifiably have gone some other direction.

    Fair point about terminology. I was talking only about the memberwise sum of sets. Let me try one more explanation. Take any set of numbers and randomly divide it up into disjoint subsets. Then calculate the memberwise sum of these smaller sets individually. The sum of the memberwise sums of those subsets always equals the memberwise sum of the original set - no matter how you divided it up (the division always being disjoint of course). What you are saying is that if one of the subsets you choose happens to be empty then the sum of the memberwise sums might be something different. That cannot be though, because the members are exactly the same in each case. The sum of numbers does not change just because you divide them up differently - that's a basic property of numbers.

    Yes I do think modelling reality is important because that is what people expect databases to do successfully. That's why databases are useful. Empty sets are all around us. In stock control if I have an empty stockroom then I have an empty set. The set of stock is empty and the value of that stock is zero. In the database if I have a table of stock and that table is empty then the sum of the value of that stock table ought to be zero too.

  • David Portas (8/8/2010)


    PaulB-TheOneAndOnly (8/8/2010)


    So... at reservation time... which value do you use to populate date_pickup_actual and date_return_actual considering business specs are stating that those values are "not known" at that time? ... it sounds like Null value would do the trick and that's exactly what better translates the business requirements.

    If the requirement is to represent cars which have not been picked up then it makes perfect sense to put them in a table without a pickup date (or a return date). The fact that a car has been picked up could perfectly well go in a different table. No nulls required. The database designer may decide to add some nulls to the model for various reasons but those reasons are not part of any business requirement - they are simply a design choice made by the designer. Notice that the word "null" did not appear in your statement of the business requirement. Null is not part of any properly stated business requirement.

    It doesn't comes as a surprise poster is once again resorting to shifting sands arguing process e.g. taking small pieces out of context, making unsupported assumptions, etc. I'm too old for this, time for me to graciously leave the room. 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • David Portas (8/9/2010)


    Matt Miller (#4) (8/8/2010)


    Again you are getting yourself into trouble by bringing up the physical reality. Tell me - what EXACTLY is the physical reality of an empty set?

    And also "sums of sets" is STILL not the same as "memberwise sums of sets". You just cannot keep switching vocabulary and treat them the same.

    Look - I know what the usual conventions say about memberwise operations on sets with 1 or no members, and why this was done. It's just healthy to recognize them for what they are, and that we could just as justifiably have gone some other direction.

    Fair point about terminology. I was talking only about the memberwise sum of sets. Let me try one more explanation. Take any set of numbers and randomly divide it up into disjoint subsets. Then calculate the memberwise sum of these smaller sets individually. The sum of the memberwise sums of those subsets always equals the memberwise sum of the original set - no matter how you divided it up (the division always being disjoint of course). What you are saying is that if one of the subsets you choose happens to be empty then the sum of the memberwise sums might be something different. That cannot be though, because the members are exactly the same in each case. The sum of numbers does not change just because you divide them up differently - that's a basic property of numbers.

    Yes I do think modelling reality is important because that is what people expect databases to do successfully. That's why databases are useful. Empty sets are all around us. In stock control if I have an empty stockroom then I have an empty set. The set of stock is empty and the value of that stock is zero. In the database if I have a table of stock and that table is empty then the sum of the value of that stock table ought to be zero too.

    But the DB modelling isn't just about numbers, and trying to model DB behavior against "reality" means you have to deal with every permutation of every kind of sets you could have. Not everything is about sets of "numbers" anymore (what happens when you add an empty set of blue circles to another empty set of yellow spacecrafts?), nor can you extend that behavior of simple numbers blindly without running into other inconsistencies.

    While your stock example is certainly valid in its own setting, it isn't a universal case. For example, if I am looking to find a baseline heart rate on a set of patients, and that particular measurement wasn't performed on some patients, I had better damn well NOT return 0 (or else I will have a LOT of lawsuits on my hands for calling patient familiews and telling them their Johnny is dead).

    We're going to keep going around and around with these. The reality is that neither position is entirely consistent when you get down to specific details, and the amount of modeling to get past such issues takes you out of "reality" (to levels of detail that are usually not achievable).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I just wanted to say "thank you" for this fascinating, even if to some extent, academic discussion on NULLs.

    From a practical point of view, I think that they're more trouble than they're worth. When a NULL is actually needed - it can always be created by the developer.

    The blood-pressure example is a good one, yet the developer has to shoulder some of the responsibility for interpreting the data. If there were zeros there and you had to guard against returning zeros in an application, then there should be a built-in mechanism to catch that - such as the last date that the blook-pressure was measured. Or, in this case a zero could in effect represent a null.

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • mtillman-921105 (8/9/2010)


    I just wanted to say "thank you" for this fascinating, even if to some extent, academic discussion on NULLs.

    From a practical point of view, I think that they're more trouble than they're worth. When a NULL is actually needed - it can always be created by the developer.

    The blood-pressure example is a good one, yet the developer has to shoulder some of the responsibility for interpreting the data. If there were zeros there and you had to guard against returning zeros in an application, then there should be a built-in mechanism to catch that - such as the last date that the blook-pressure was measured. Or, in this case a zero could in effect represent a null.

    May I?

    In my opinion Zeros and Nulls are not the same and do not represent the same thing.

    Lets go back to the blood pressure example.

    Picture an ER where an automated system is measuring blood pressure on Pete - the poor guy lying on the bed - alongside other vitals like temperature, etc. every 10 minutes.

    You check the status of Pete and the report shows blood pressure has been Zero during the last two tries... so, is Pete dead? meaning the device worked fine and the guy just went to the other side or, the device failed and the system put there a Zero because somebody doesn't like Nulls?

    Same thing again but blood pressure shows N/A (Null) during the last two tries - in this case, most likely something is wrong with the device 'cause you know the system will record any value (including zero) if the device was working fine.

    Can you see the difference? Null means unknown while Zero means nothing. 😀

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • duplicate post - deleted

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (8/9/2010)


    In my opinion Zeros and Nulls are not the same and do not represent the same thing.

    Lets go back to the blood pressure example.

    Picture an ER where an automated system is measuring blood pressure on Pete - the poor guy lying on the bed - alongside other vitals like temperature, etc. every 10 minutes.

    You check the status of Pete and the report shows blood pressure has been Zero during the last two tries... so, is Pete dead? meaning the device worked fine and the guy just went to the other side or, the device failed and the system put there a Zero because somebody doesn't like Nulls?

    Same thing again but blood pressure shows N/A (Null) during the last two tries - in this case, most likely something is wrong with the device 'cause you know the system will record any value (including zero) if the device was working fine.

    Can you see the difference? Null means unknown while Zero means nothing. 😀

    Paul,

    I agree with you that null and zero mean different things and should not be used as a substitute for each other. You didn't seem to be impressed with my previous response to the scenario you proposed, so this time I will just ask questions. In your example, why store nulls when the machine doesn't have a value to record? What purpose is served by doing that rather than (for example) generating a row in an error table instead?

  • PaulB-TheOneAndOnly (8/9/2010)


    mtillman-921105 (8/9/2010)


    I just wanted to say "thank you" for this fascinating, even if to some extent, academic discussion on NULLs.

    From a practical point of view, I think that they're more trouble than they're worth. When a NULL is actually needed - it can always be created by the developer.

    The blood-pressure example is a good one, yet the developer has to shoulder some of the responsibility for interpreting the data. If there were zeros there and you had to guard against returning zeros in an application, then there should be a built-in mechanism to catch that - such as the last date that the blood-pressure was measured. Or, in this case a zero could in effect represent a null.

    May I?

    In my opinion Zeros and Nulls are not the same and do not represent the same thing.

    Lets go back to the blood pressure example.

    Picture an ER where an automated system is measuring blood pressure on Pete - the poor guy lying on the bed - alongside other vitals like temperature, etc. every 10 minutes.

    You check the status of Pete and the report shows blood pressure has been Zero during the last two tries... so, is Pete dead? meaning the device worked fine and the guy just went to the other side or, the device failed and the system put there a Zero because somebody doesn't like Nulls?

    Same thing again but blood pressure shows N/A (Null) during the last two tries - in this case, most likely something is wrong with the device 'cause you know the system will record any value (including zero) if the device was working fine.

    Can you see the difference? Null means unknown while Zero means nothing. 😀

    You're right PaulB, I shouldn't have implied that zero = NULL in that case.

    But if the machine returns zero and I was the developer, I believe I'd be tempted to replace the zero with negative one (or a similar error-designating number) instead to signify an issue with the data (or maybe the patient!). :w00t: If this method was used, a zero would be the NULL, and a negative one a numeric result. But no built-in, explicit NULL is really necessary.

    Actually, this breaks a cardinal rule in database design - we are using one column for more than one kind of data. It answers two questions: Was there a test taken? and What were the results of the test? Ideally, there should be one column signifying if the blood pressure was taken and another for the results. Actually, a child table would handle this well so that no test would mean no rows in the table for a given patient.

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • mtillman-921105 (8/9/2010)


    Actually, this breaks a cardinal rule in database design - we are using one column for more than one kind of data. It answers two questions: Was there a test taken? and What were the results of the test? Ideally, there should be one column signifying if the blood pressure was taken and another for the results.

    So... let me ask the question... have you actually implemented two columns like that and for the reason you gave when only one is needed? Be honest now...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • PaulB-TheOneAndOnly (8/9/2010)


    In my opinion Zeros and Nulls are not the same and do not represent the same thing.

    PAUL FOR PRESIDENT! 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mtillman-921105 (8/9/2010)


    I just wanted to say "thank you" for this fascinating, even if to some extent, academic discussion on NULLs.

    From a practical point of view, I think that they're more trouble than they're worth.

    Or, in this case a zero could in effect represent a null.

    That makes it a non academic discussion. I, for one, love NULLs and the tri-state logic they sometimes offer.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Good to know we are in agreement 🙂

    Answering question below...

    David Portas (8/9/2010)


    In your example, why store nulls when the machine doesn't have a value to record? What purpose is served by doing that rather than (for example) generating a row in an error table instead?

    I wouldn't generate a row in an error table because it is not an error.

    The reason this is not an error is that at reservation time the system knows neither actual-pickup-date nor actual-return-date (those values are in the future and not even Oracle has invented yet flashforward-query) therefore those values are unknown at the time. The best way I know to represent an unknown value is to set it to Null.

    I know you proposed to create an actual-pickup-date and an actual-return-date tables to hold that information but that would end with tables having a 1-to-1 relationship with Reservations table and, I do not like 1-to-1 relationships - those only add complexity and overhead so I opt to have all the attributes of a given reservation in a single row.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • mtillman-921105 (8/9/2010)


    You're right PaulB, I shouldn't have implied that zero = NULL in that case.

    But if the machine returns zero and I was the developer, I believe I'd be tempted to replace the zero with negative one (or a similar error-designating number) instead to signify an issue with the data (or maybe the patient!). :w00t: If this method was used, a zero would be the NULL, and a negative one a numeric result. But no built-in, explicit NULL is really necessary.

    mmmhhh I think there is a conceptual issue here.

    Following your line of thinking data in the database is not reflecting the real world, the imagination of a developer is needed to show a picture of the real world by modifying data in the front end.

    In my opinion raw data in the database has to be (the close as possible to a) true representation of the real world therefore I model having that into consideration. In this particular case a Zero will tell the poor guy has no blood pressure while a Null will tell that for some reason data collection failed - which are two very different things and require no data manipulation in the application layer to represent them.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 15 posts - 61 through 75 (of 106 total)

You must be logged in to reply to this topic. Login to reply