The Number that shouldn't be a number

  • I think the marketing hype of NoSQL has made a lot of managers and educators start to think that the RDBMS is dead. They said the same thing about the mainframe and COBOL in the 90's.

    Both are alive and well, as is SQL.

  • jarick 15608 (2/10/2015)


    I think the marketing hype of NoSQL has made a lot of managers and educators start to think that the RDBMS is dead. They said the same thing about the mainframe and COBOL in the 90's.

    Both are alive and well, as is SQL.

    I'm perfectly willing to give up ecommerce shopping carts and twitter data mining to the NoSQL crowd, so we SQL developers can focus on more important database applications.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • jarick 15608 (2/10/2015)


    I think the marketing hype of NoSQL has made a lot of managers and educators start to think that the RDBMS is dead.

    Exactly. Little do they realize that while SQL is a poor realization of a "true" RDBMS, it's the best we've got (and is getting better). The "NoSQL" databases are a recreation of the data storage mechanisms we threw away 40 years ago - every single piece of data that went in had to be run through a validation program.

    Fabian Pascal has a great post about this (the quote at the beginning is priceless - and this was *years* ago).

    Link

  • jarick 15608 (2/2/2015)


    Oh I still see tons of developers who love to argue that that is a best practice to make your primary keys all uniqueidentifier data types.

    It's the same group that believes in the code first approach.

    I once was guilty of the former (about a decade ago). Never the later (except to drive stored procedure requirements - database schemas need to support those requirements NOT be driven by them but there is negotiation to be done too).

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • My example:

    XML - using a RDBMS as an object database is backward as you do not get the true benefits of either style yet lose the benefits of relational data.

    Also, I have seen databases that stored gender as a bit that later on struggled with needed extension of gender to other states. Most of us are lucky enough to have the simplicity of being constantly and permanently either male or female, however, for some people this is not the case.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (2/12/2015)

    ... I have seen databases that stored gender as a bit that later on struggled with needed extension of gender to other states. Most of us are lucky enough to have the simplicity of being constantly and permanently either male or female, however, for some people this is not the case.

    Indeed. and at the point where the transition takes place fields such as GenderOnAdmission and GenderAtDischarge could be used. No reason to persist the previous gender unless you're planning to use it for illegal purposes (IMHO).

  • t.pinder (2/12/2015)


    Gary Varga (2/12/2015)

    ... I have seen databases that stored gender as a bit that later on struggled with needed extension of gender to other states. Most of us are lucky enough to have the simplicity of being constantly and permanently either male or female, however, for some people this is not the case.

    Indeed. and at the point where the transition takes place fields such as GenderOnAdmission and GenderAtDischarge could be used. No reason to persist the previous gender unless you're planning to use it for illegal purposes (IMHO).

    This was not a medical database but (validly) needed gender. I can see from what you say that it is even more complicated when the point of transition is relevant!!!

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • SQL is a poor realization of a "true" RDBMS

    Can you explain this a little more. Why is it a "poor" realization? It seems fine in most cases to me.

  • This is an unusual example of a common problem. People include attributes without necessarily thinking about what they mean. Unfortunately "Sex" or "Gender" is one of those things which get added because someone "thinks it ought to be there". Quite rightly, that annoys the data protection people. If we do need to hold the information, we need to ask some questions about it: What does the business use it for? Can it change? If it changes, do we care? And if it changes, do we need to know previous values?

    For many systems we really don't care about the previous values of attributes. These systems actually have quite a blinkered view of the world, but that is OK, so long as we understand the consequences. If we do need something more sophisticated, then either we have to look really closely at the thing in question (which is potentially a whole new entity) and we have to understand the consequences, in terms of either more complicated data structures and processing or restrictions in what the system can do.

    Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]

  • Hello Ron, I agree that SQL is pretty good in most cases. It is certainly better than some of the alternatives which relied on knowing how the physical data structures were constructed. However, Ted Codd originally envisaged something rather different. The access language he originally designed was called "Alpha" and as far as I am aware was never seen outside of the Santa Teresa lab. SQL was developed by another team and I have been told that it always had "bits missing". The most obvious of these was the absence of "Relational Integrity" in the early implementations of Relational Databases.

    The mathematics are beyond me, but Codd's ideas focussed on defining and manipulating data entirely by "declarative programming". The most accessible reference I have found on this is Chris Date's "What not How" which interestingly is subtitled "The Business Rules Approach to Application Development". It's a short book (130 pages) and it was published in 2000, so I wonder how much influence it had.

    Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]

  • RonKyle (2/12/2015)


    SQL is a poor realization of a "true" RDBMS

    Can you explain this a little more. Why is it a "poor" realization? It seems fine in most cases to me.

    It is fine in many cases. It's at the edges where we start to see problems (particuarly with most implementations not fully working with constraints). Also, the 3VL is a mess. Chris Date's book, SQL and Relational Theory, talks about the good parts and bad parts, and how to use the good parts responsibly.

  • I don't think so. Four digits of precision of money is not enough. You can use convention that 1 means 0,00001€, but do You really want to do that? Numeric(19,9) is usable, but for 9B You get less practical precision than 8B float. Due to different facturation systems with different rounding steps the costs are actually calculated, so they are actually real numbers. Fixed decimal point types don't solve the rounding problem. They make it worse.

  • sparky-407434 - Saturday, January 31, 2015 2:36 AM

    On the plus side, storing telephone number as int uses less storage than a varchar. Ok you won't need to use any numeric functions on it, but you probably won't need to use any string functions on it either. I don't think this is as bad as putting a date into a varchar.

    Hi,

    You cannot use an int for phone numbers in the UK since virtually all the numbers have a leading 0. The storage is less, of course, but selecting the correct data type is a constraint on the data to protect its integrity. As the title of the post says the number is not a number, it is a code. In fact the first part of the phone number is called the code, international dialling code or area code, depending on the scope of your operation. With a column for phone numbers you can add on additional constraints on the string for validation. For example, all mobile (cell phone) numbers in the UK begin with 07. I can use this to ensure that only mobile numbers go in the mobile number column.

    Less serious, but still annoying is using datetimes when you only want to keep a record of the date. Or using datetime2(7) or time(7) when you are only interested in recording a time to the nearest second.

    John

  • Most of my pet peeves have already been mentioned here and it was almost always caused by the developers not caring about the database when they were developing. Not every field needs to be int, nvarchar(max) or datetime2(7). Why anybody would consider using a float when a smallint would do is beyond me.
     Exactly for the reason mentioned above, I replace all bit fields with tinyint or char(1) whenever I am doing a code review. I sometimes question the necessity of tables with only a couple of entries and which is not expected to grow in the future (e.g. we have a YesNo table with two entries...).
     And as for telephone numbers as strings, they need strict CHECK-constraints to either only allow numbers or only accept strings in a specific format. It gets more complicated when many standards are at play (e.g. telephone numbers from different countries).
      We have one database that makes me cry: the tables are seemingly generic, have 120 columns with the datatype varchar(900), 15 columns with the datatype text and a few otehr fields, all with generic names and numbers. There are a 100 such tables. There are no foreign keys. At least there is a clustered primary key.

  • sparky-407434 - Saturday, January 31, 2015 2:36 AM

    On the plus side, storing telephone number as int uses less storage than a varchar. Ok you won't need to use any numeric functions on it, but you probably won't need to use any string functions on it either. I don't think this is as bad as putting a date into a varchar.

    Except--storing it as a varchar(255) gives you the ability to have "Ext 1234 call only after 5" as part of the phone number field--and you can still use a mask to format it properly on the front end. 🙂

    The OP is right, phone numbers aren't a number--1-800-Flowers anyone?--and formatting should happen on the front end anyway but VARCHAR when used with phone numbers is only a little less efficient than multiple INT fields and is infinitely more flexible.

    As for the argument of "what about storing composite data in fields?" I would argue that yes, a phone number arguably is 4 different fields but crucially (unlike addresses, say) they are used largely as a unit and never divided. If they need to be divided a regular expression can do so fairly easily.

Viewing 15 posts - 76 through 90 (of 121 total)

You must be logged in to reply to this topic. Login to reply