The Number that shouldn't be a number

  • That raises a good point.

    I'll have to admit that I am also clueless on the code-first thing. It just sounds really scary to a DBA when these applications generate the database queries and commands on the fly.

  • jarick 15608 (1/31/2015)


    The one I dont get is the use of float in a financial application. I just don't see the reason. Someone help me.

    I've seen some big money financial packages do this.

    Not sure if someone already said this but was Money an original ANSI SQL standard data-type? I know float was and was also in Fortran.

    Here are the things that seem to drive data-type selection in the order I have experienced them.

    Size on disk.

    How hard is it to convert from type used by poorly developed code into one that make sense to data storage?

    What uses the least amount of CPU cycles to do CRUD from our application and is ACID.

    Size.

    The person using the data.

    Size.

    Yes I put size at the top and the bottom and did not include "does this make sense from a SQL data storage or data type". We know building an application database first is like making something from the feet up. However 90% of the production world I have lived in does it database last.

  • Stephanie Giovannini (2/2/2015)


    What is best practice for storing time intervals in a table?

    I would use an integer and define that it's a number of minutes, seconds or hours as necessary

    Is TIME a valid choice for intervals or not a good idea?

    Time is a point in time, not a time span. It's a terrible choice for an interval because it cannot go over 24 hours and you can't add or subtract two times without needing to involve DateDiffs

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • bdcoder (1/31/2015)


    Ha! - store *every* column as VARBINARY(max) and never worry about data types again!

    I've heard that suggested, and received enthusiastically. In all seriousness. ... "Over *your* dead body!"

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Jeff Moden (2/1/2015)


    Michael Valentine Jones (2/1/2015)


    I worked at a place where consultants came up with a design for a high volume OLTP application that would completely replace SQL Server with XML flat files to store all data. They said it was more flexible because you could change the schema at any time.

    When they presented this plan to the CTO he looked at them for a few seconds and said "You're fired. Get out of here by the end of the day."

    <HeadDesk> At least your CTO caught that and stopped it. I'm going through similar (really stupid DB Design) with a 3rd party product that, according to the powers that be, is "desperately needed" and "only this 3rd party company can do it". Unfortunately, that also means that I've been hog-tied by everyone from the Dev Manager up through the GM. I've fired out the strong warnings both in my company and to the 3rd party company but there's an idiot lead "developer" there that has managed to convince everyone that he's following someone's idea of "best practices". He's also the one that designed the tables with no clustered indexes or DRI (heh... in his words, "to make it portable") and thinks it's ok to change column names every time it's used in a different table.

    Aye. Just had one of those bought from a third party. And another one at the other extreme. Many, many indexes. As in 41 indexes on 20 column table, and 33 indexes on another 21 column table (no. not mistypes) - and the general pattern is repeated throughout the database. Vast majority on a single column. Ohhh, the number with names starting _dta gives much away. This is OLTP, in case you're wondering. There are There are, in total, 2650 indexes in the database. Of which during the testing, initial setup and training fewer than 5% have been hit.

    Still - at least the tables have PK's. Every. Single. One. A. GUID. Clustered! <- This bit's on topic!

    Apparently "The Box Is Slow".

    10 seconds to add a record at times, with the db practically empty - certainly compared to the volumes it's going to have to take. So even on a massively over specced machine for what it SHOULD need, with a PLE of 42 days with only 80% of memory used and a buffer cache hit ratio of 100% - performance has been raised as a concern.

    I've pointed out this is never going to scale, at all, in that state. Reply came there non.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • GilaMonster (2/3/2015)


    Stephanie Giovannini (2/2/2015)


    What is best practice for storing time intervals in a table?

    I would use an integer and define that it's a number of minutes, seconds or hours as necessary

    Is TIME a valid choice for intervals or not a good idea?

    Time is a point in time, not a time span. It's a terrible choice for an interval because it cannot go over 24 hours and you can't add or subtract two times without needing to involve DateDiffs

    Thanks for the clarification!

  • Some stuff that I've seen:

    --USA zip codes stored as integers

    --USA social security numbers stored as integers

    --Date/time stored as strings (those were A LOT of fun to deal with)

    --When told that his tables needed unique keys, a guy added an autonumber to every table

    --Unlimited text fields stored in 50 character chunks, requiring a UDF to piece them back together

    --Process loops of hundreds/thousands of lines of code rather than modular programming (cut processing time to about 5% of the original run after I re-wrote it)

    --Turning on the third bit of the last byte of a number to make it a letter when negative values were stored so you didn't have the overhead of a negative sign

    --An overtime/leave slip system that we bought a special OCR system for that required precise filling out of boxes, then buying amazingly slow desktop-grade scanners for all of the payroll clerks so it was faster to type everything in by hand than to scan (not a DDL issue, but still pretty awesomely stupid)

    Oddly, I haven't run in to phone numbers stored as integers.

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

  • There are cases where using a GUID as the primary key is valid. The problem comes if you also make it your clustered index.

    These cases are rare. To say the problem comes in if it's also the clustered index makes it sound like it could otherwise be okay. Unless you are in a very specific situation, it's not.

  • RonKyle (2/3/2015)


    There are cases where using a GUID as the primary key is valid. The problem comes if you also make it your clustered index.

    These cases are rare. To say the problem comes in if it's also the clustered index makes it sound like it could otherwise be okay. Unless you are in a very specific situation, it's not.

    Not so rare. Replication is one area, and it is more frequent than some people think, including me at one point.

  • RonKyle (2/3/2015)


    There are cases where using a GUID as the primary key is valid. The problem comes if you also make it your clustered index.

    These cases are rare. To say the problem comes in if it's also the clustered index makes it sound like it could otherwise be okay.

    It is otherwise OK.

    There's nothing wrong with a uniqueidentifier primary key. If the system is going to use an artificial key, it's not much better or worse than an identity. The main extra difficulty of a uniqueidentifier primary key is remembering the values 'The transaction with an ID of 32899' vs trying to remember and match a GUID.

    Any form of distributed system, anywhere where the app design is such that the key is created in the application, not the DB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It is otherwise OK.

    There's nothing wrong with a uniqueidentifier primary key. If the system is going to use an artificial key, it's not much better or worse than an identity. The main extra difficulty of a uniqueidentifier primary key is remembering the values 'The transaction with an ID of 32899' vs trying to remember and match a GUID.

    Any form of distributed system, anywhere where the app design is such that the key is created in the application, not the DB.

    Never comfortable disagreeing with a known master, but I have to disagree here. The joins are slower, the indexes are larger, and I'm just getting started. There is a place for GUIDs and a problem they solve. But in each case I've seen them, the app design should have inserted the parent record, retrieved the id, and used it to populate the children. It's an extra pass, but it's a one-time extra pass that doesn't create a bunch of permanent headaches. I've seen transaction logs more than twice the size of the full database backup because of these GUIDs.

  • bdcoder (1/31/2015)


    Ha! - store *every* column as VARBINARY(max) and never worry about data types again!

    And cast a maniacal laugh when the table is schema locked for two days because it has to expand the value of every column in every row because someone inserted one row with 8000 spaces as the value for each column.

  • I read this in a quote from Jeff Moden:

    "Well, pre-optimization is the root of all evil"

    How stupid. The saying is actually this:

    "post-optimization is the root of all evil"

    It is in several text books and programing language books from the 80's and 90's.

    We came up with it while having to re-write all the C and C++ code that was not optimized properly or did not take into consideration the limitations of the host systems.

    Jeff, Next time some idiot that is doing development as a second or third career choice tells you that ask him if he even know the difference between IF EXISTS and IF NOT NULL. Let the silence build and then say "enough to pay for a real developer"

    Developers pretending to be architects that know nothing about databases, computations, distributed application design are the root of all evil.

  • Thanks people. There are some thought-provoking ideas there. I'm going to update some of my "mental guidelines" as a result.

    The main thing I am going to add to my thinking is: "consider the operations someone may want to perform using this piece of data" (add it, concatenate it etc). I'm rehearsing in my own mind how to explain the consequences of bad choices to those who find it hard to understand.

    The discussion started from "The number that shouldn't be a number". I think there is some scope for some new forked discussions or editorials around: "appropriate use of Date and time datatypes" and "Choosing the appropriate key". Regarding that last topic, I've learned a bit from the to-and-fro about GUIDs.

    Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]

  • I have to work with a database where a numeric code that can be one, two or three digits long and be listed in reports in proper numerical order is stored as varchar(3) without the leading zeroes. PITA!

Viewing 15 posts - 46 through 60 (of 121 total)

You must be logged in to reply to this topic. Login to reply