Database Design- IS NOT NULL always better choice?

Question

Post reply

Database Design- IS NOT NULL always better choice?

sawantpl@gmail.com

Old Hand

Points: 326
More actions
January 29, 2010 at 9:28 am

#215141

Suppose, we are going to design a system & i will put numeic values as 0, Character valueas as '' i.e. blank instead of NULL. Will it be a good design? Please help regarding this with propper justification.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 741102 More actions · Answer 1

It's not really clear what you are asking here. You need to provide some more details.

[Moved to design forum]

Lowell SSC Guru Points: 323755 More actions · Answer 2

i think he's asking is it a good practice to make columns NOT NULL and insert default values, vs leaving them nullable.

as usual, the answer is "It depends"; this is not a question that can be answered with a generic yes/no; it depends on your business and logic;

I'd say that generally, nullable is betterm but SOME items that have a relevance to a business process will benefit from not null/default values.

I cannot think of something that absolutely must have a value offhand, but maybe someone else can think of one.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 3

I think you are asking the wrong question. Information can always be represented either with or without nulls. When designing a database you have to make a choice about whether to design it with nulls or without. It is not really a question of whether to use something "instead of" nulls.

I would recommend that you either minimise or avoid altogether the use of nulls. Nulls generally add complexity and invariably cause incorrect or ambiguous results in the database.

As to whether to use numerics and strings to represent information, I suppose that all depends on what the intended meaning is. There is no single, universally agreed sematics for null so it's not certain what meaning you intended by using such values in place of null. For example using a magic value for attribute A to represent the fact that A is uknown is probably not a good idea. Better to not have a row for attribute A at all (partition the table) and/or to represent the fact that A is unknown by using another attribute.

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 4

It's a matter of implementation, I suppose. I've had the opposite experience from what Davd reports: whenever someone implements something with a value to represent "invalid data" or a flag, eventually someone forgets the rule and invariably implements based on the bad data marker and Boom goes the application.

Again - there are good points and not so good in both cases. There have been religious wars about the merits on both sides, so suffice it to say - you may get a variety of answers.....

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 5

As has been suggested, you could start a real religious war over this.

I believe there are two questions here... the first is "should we use nulls" and the second is also "should we use nulls". 😉

For the first question, if all you're doing is replacing NULLs with zeros or empty strings, you might see some minor benefit in performance here and there but you may be ignoring the underlying problem... denormalization. Whether you have a lot of repeated NULLs or you change them to something else, if there are a lot of columns of "empty" data, you probably have a bit of a problem with design.

For the second question, to me, NULL means "UNKNOWN" and can have a lot of value compared to zero or an empty/blank string. For example, NULLs are eliminated by aggregates and when it comes to taking a simple count of things, can be instrumental in getting the correct count.

I believe it was David Portas (one of the respondents on this post) that brought up the subject of an "Address" table on a different post. More specifically, he brought up the "2nd address line" column that many Address tables have. If you KNOW that a person doesn't have a 2nd address line in their address, should that column contain a NULL or a blank? He and I both agreed that since we KNOW that given person has no 2nd address line, then NULL, which means "UNKNOWN", is inappropriate and should be made to hold an empty string which says "We KNOW this person has no 2nd address line". Others may suggest that the 2nd address line should somehow be normalized in a separate table to prevent even that duplication amongst rows.

So, the answer ultimately falls back to the ol' standby of "It Depends". Replacing NULLs with zero data or empty strings should never be used as a substitute for proper database design but it can be useful to show that you KNOW something has no value by changing a NULL to a known value of zero or an empty string.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 6

Jeff Moden (1/29/2010)
to me, NULL means "UNKNOWN" and can have a lot of value compared to zero or an empty/blank string. For example, NULLs are eliminated by aggregates and when it comes to taking a simple count of things, can be instrumental in getting the correct count.

YOU may understand null to mean "unknown" but SQL doesn't. For example X = X should be TRUE if X is unknown but it is not true if X is null. Similarly in SQL the SUM(Z) of an empty set is null even though it is a known value (ie. zero).

So the problem is that if you use null to represent an unknown value then some of your queries may give right results but some will give wrong results. Or at least they are "wrong" according to common sense and everyday reality - reality being what databases are normally expected to model!

This is what I mean when I say that null has no universally agreed semantics. Everyone chooses null for their own purposes but none of them match what SQL actually does. The ISO SQL Standard is no help because it does not define any particular meaning for null.

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 7

True enough. If you define "X" as being unknown and compare the same two unknowns, they are equally unknown... by definition. If SQL Server really knew what Unknown is and X where unknown, then SQL Server would know that X = X+1 are two different unknowns and the test for equality should fail.

None the less, the way nulls have been defined in SQL Server as a default is that they are treated as unknown and X=X where X is NULL, produces an unknown answer and so does X=X+1.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 8

Sorry... posted a duplicate post.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 9

Jeff Moden (1/30/2010)
None the less, the way nulls have been defined in SQL Server as a default is that they are treated as unknown and X=X where X is NULL, produces an unknown answer and so does X=X+1.

I think you may be missing the point. If X is unknown then X=X is still TRUE (in reality I mean). X=X is always true whether we know X or not. Similarly X=X+1 is always FALSE. X=X+1 is always false whether we know X or not.

SQL does not give the answer TRUE for X=X when X is null, or FALSE for X=X+1 when X is null. So SQL is not treating null as meaning "X is unknown". The way nulls are defined in SQL is therefore not a sensible representation of the "unknown" case. If you choose to use it that way then you will get wrong answers to some queries ("wrong" in reality I mean).

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 10

Heh... no, I'm not missing the point. I said virtually the same thing in the first paragraph of my last post.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 11

David Portas (1/30/2010)
Jeff Moden (1/30/2010)
None the less, the way nulls have been defined in SQL Server as a default is that they are treated as unknown and X=X where X is NULL, produces an unknown answer and so does X=X+1.
I think you may be missing the point. If X is unknown then X=X is still TRUE (in reality I mean). X=X is always true whether we know X or not. Similarly X=X+1 is always FALSE. X=X+1 is always false whether we know X or not.
SQL does not give the answer TRUE for X=X when X is null, or FALSE for X=X+1 when X is null. So SQL is not treating null as meaning "X is unknown". The way nulls are defined in SQL is therefore not a sensible representation of the "unknown" case. If you choose to use it that way then you will get wrong answers to some queries ("wrong" in reality I mean).

NULL isn't a value. Remember - it's a state, representing a lack of known value (which interestingly enough is what your flag idea was supposed to be as well). This is why any of the operations involving comparing two values are not and cannot be valid.

But these have been belabored for many moons before... Might as well post up the reference material for those who haven't read it yet....

http://www.sqlservercentral.com/columnists/mcoles/2829.asp

The WikiPedia entry on this is actually very good:

http://en.wikipedia.org/wiki/Null_(SQL)

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 12

Matt Miller (#4) (1/30/2010)
NULL isn't a value. Remember - it's a state, representing a lack of known value (which interestingly enough is what your flag idea was supposed to be as well). This is why any of the operations involving comparing two values are not and cannot be valid.

I realise null isn't a value. But in what sense does null "represent" the lack of a known value if the behaviour of nulls in expressions does not match the way unknown values behave? Why even bother to pretend that such a thing is being represented when it clearly is not?

I think it's more accurate to say that Null is marker which causes a certain non-intuitive set of results in queries and expressions. That set of results does NOT accurately model unknown values, missing values or anything else based in reality or common sense. It just is what it is. Either you find it useful or you don't. Whether you use null or not, please don't pretend that it represents something which it does not.

sqlvogel SSCrazy Eights Points: 9526 More actions · Answer 13

Jeff Moden (1/30/2010)
Heh... no, I'm not missing the point. I said virtually the same thing in the first paragraph of my last post.

The thing that made me doubt it was this statement:

"If you define "X" as being unknown and compare the same two unknowns, they are equally unknown"

It's not a matter of Xs being "equally unknown". If X is a value (known or not) then it is equal to itself because all values are equal to themselves. If I say to you "X is my age. Is X equal to X?" then you can confidently reply Yes, whether or not you know how old I am. This is the behaviour which is not accurately modelled if you use null to indicate an unknown value. I've no clear idea of what you mean by "equally unknown".

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 14

David Portas (1/31/2010)
Matt Miller (#4) (1/30/2010)
NULL isn't a value. Remember - it's a state, representing a lack of known value (which interestingly enough is what your flag idea was supposed to be as well). This is why any of the operations involving comparing two values are not and cannot be valid.
I realise null isn't a value. But in what sense does null "represent" the lack of a known value if the behaviour of nulls in expressions does not match the way unknown values behave? Why even bother to pretend that such a thing is being represented when it clearly is not?
I think it's more accurate to say that Null is marker which causes a certain non-intuitive set of results in queries and expressions. That set of results does NOT accurately model unknown values, missing values or anything else based in reality or common sense. It just is what it is. Either you find it useful or you don't. Whether you use null or not, please don't pretend that it represents something which it does not.

I think we're confusing "unknown" with "partially known with an indeterminate value". When I say X is NULL, it means I know nothing about it: I don't even know if it's something that can HAVE a value. Like Jeff said earlier, if you know that X can have a value, then you can do things like evaluate X=X or X=X+1. NULL just happens to represent a state prior to that.

This is back to that same argument about the address line 2. In the same way, the most correct way to represent that if you know there will not be a second line is '' (empty string). If on the other hand you do NOt know if a second line should be supplied, then it's an unknown (a.k.a. NULL).

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?