NULLS

  • > If we do not know his middlename then he does not have one.

    If we do not know his middlename then it DOES NOT MEAN he does not have one.

    If Middlename = '' then WE KNOW THAT THARE IS NO MIDDLENAME.

    Fill the difference.

    There is an exersise.

    Fill a column with values 2,4,NULL and select AVG from this column.

    Than replace NULL with 0 and repeat. See the difference?

    It's for you to decide - which result is right, which is wrong.

    After that do it again with COUNT.

    _____________
    Code for TallyGenerator

  • vladan, can u tell me what benefit or what we are going to minimize when you are avoiding NULLS

  • How about a medical database where a row is entered for each test that is ordered for a patient.  The existance of the row gives information about when and who ordered the test.  The NULL result indicates the lab has not yet entered the test result.  The result cannot be considered True/False (positive/negative).  It is still unknown.  And the fact that it is unknown is valid.  If necessary a medical decision can be made based on the absence of a test result.

    ron

  • I didn't say that I am avoiding NULLs, just that I don't like it if a table has say 20 columns of which 12 have a value supplied only in 10% of rows. If you know that there is a set of several values that will only be supplied in certain cases, you can divide the information into two tables - main for obligatory (or mostly known) information, and one for special info that is known say in 10% of cases. The main table will have less columns, be smaller and quicker to access. "Special" table has only 10% of the number of rows in main table - the difference is what would be filled with NULL if you have one table. There still will be some NULLs in the special table, since not all values will always be known.

    Many NULLs will be eliminated by correct design - if your tables seem to have lots of null values, then probably you should think again. That is what I meant by minimizing. If I should simply answer a question "is NULL or empty string filled through default better?", I would say that nullable column with NULL is better in most cases (also see Sergiy's and ron k's posts).

    Maybe the problem with my previous post was that I was slightly off topic - I was more discussing the issue with Chris, not answering your question. Sorry for confusion

     

  • >If Middlename = '' then WE KNOW THAT THARE IS NO MIDDLENAME.

    I do not agree. You only know that the stored value for middlename is an empty string. What does that tell you?

    Let me take another approach at this. A database is a collection of facts. From these you can answer questions. For instance, you want to know which users do not have a middle name stored. To answer this, why do we need to store the fact that we do not know his middlename? The fact that we do not know his middlename is a responsibility for the client to state. The database does not need to do more than be able to answer that it does not have that fact stored.

    I am not sure if the example you supplied where directed at me. I agree that it shows the difference between using default values and nulls. But it does not change anything of what I am saying. With a correct design you do not need to store either NULL or a default dummy value.

  • Hey Chris, you didn't answer my question yet!!!

  • Simple, until the results are in you do not store them. Something like:

    TEST { testid , orderdate , orderedby , ... }

    TESTRESULT { testid , result }

    Until we have a result there is no row in TESTRESULT for that test. So the result is therefore unknown. You can still take any medical decision based on the fact that there is no result stored (it is unknown).

    When we get the result we store it in TESTRESULT and can now answer if the result was true or false (or whatever range of results there is).

  • Yes, I know, I am getting there. It is a difficult question to formulate an answer for, and I should also have gone home from work more than 30 min ago.. Anyway:

    Sure, it is definitely more difficult to create this design than to simply allow nulls.I will not argue with that. But by avoiding allowing nulls you avoid problems down the road.

    Chris Date says that Codd was actually wrong when he stated the need for NULLs (or rather two tokens, for missing and unknown information) and that there should not be any such concept in a relational database. If that was the case many things would be easier. One of them would be no more trilogic, only true/false.

    Anyway, what I am really trying to get through here is that there is a difference between being able to say that some information is missing and/or unknown and storing that the information is missing.

  • My argument was more performance-wise, but I'll save you the trouble >> performance shouldn't come in the way of a correct design (which usually performs better anyways).

  • Yes, that design will work but at the cost of always joining to the results table.  I would agree if it were common to not have results.  But in this case there will always be a result at some time.  So the creation of the second table seems to be a little inefficient in this case.

    BTW, I have used the design you illustrated but in other situations where it seem more appropriate.

    I guess as others have stated, it depends.

    ron

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply