January 31, 2015 at 12:09 am
Comments posted to this topic are about the item The Number that shouldn't be a number
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 31, 2015 at 12:27 am
I remember the pain of working with a system where a developer had chosen the TIME data type to store the number of hours worked in a day. Anyone want to figure out how you add two TIME columns together? I'll wait…
Wwwhhhuuuuttttt???? You don't believe in picking your data-types for purposes of final formatting? I'm shocked! :-P:-D:-P:-D:-P:-D And "everybody" knows that it's best to store currency amounts as NVARCHAR(255), right? :laugh:
--Jeff Moden
Change is inevitable... Change for the better is not.
January 31, 2015 at 1:32 am
More popular is sticking date time values in nvarchar datatyped columns, of course accepting any client format without storing the culture nor format.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
January 31, 2015 at 2:36 am
On the plus side, storing telephone number as int uses less storage than a varchar. Ok you won't need to use any numeric functions on it, but you probably won't need to use any string functions on it either. I don't think this is as bad as putting a date into a varchar.
January 31, 2015 at 3:42 am
Storing currency values in a FLOAT is my least favourite, as I am constantly having to do stupid things like this
WHERE ABS(table1.column1 - table2.column2)<0.01
just to check for equality...
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 31, 2015 at 6:20 am
I agree with the Gail's point. I describe myself as an analyst rather than a DBA or a developer and I would agree: Keep it Simple (S)! Certainly never try and be "clever", because if you are, then you can be sure it will come back and bite you or someone else.
It is true that telephone numbers are "numbers", but a lot of people provide them formatted with spaces or dashes (0123-123-4567) and who are we to tell them they are wrong? If we really care that much that we want to exclude invalid characters then I suppose one could provide a constraint, but then what about the companies who have numbers which they have gone to the trouble of choosing their number to make a point (0800-SQLDBA)? The whole point is that what people remember is the letters.
Postal Codes are another example where unnecessary restrictions are sometimes applied. USA Zip codes and Russian codes are all numeric. I learned from Gail's piece that South African ones are numeric with significant leading zeros. British post codes are alpha-numeric with a particular structure (AB12 34XY but with exceptions).
Too often people don't put enough thought into choosing the datatypes. When in doubt I try to talk to a DBA who is experienced with the database manager we are going to be using.
As for "weirdest", this one wasn't weird, in fact the choice of datatype was appropriate, but it was funny (or at least my reaction was). I had a job looking for data errors in a production database (I had access to a redacted copy etc etc). The transaction or account values were held in columns which had been defined with a type of CURRENCY (or something like that). l found a candidate error in the production data and nearly had a heart attack! After a few deep breaths, and before pressing the "big red alarm button", I went and spoke with the application support people and they put me right. I had found a real candidate error, but the accounting application worked in "pennies" not "currency units" (Euro/Pounds/Dollars), so I had seen the error as being 100 times bigger than it actually was! It turned out the error had actually been trapped by the application anyway, so it was all ok in the end, but I think I lost some weight that day.
Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]
January 31, 2015 at 7:52 am
1) Storing dates as strings.
2) Storing financial values as floats.
I feel dirty now...
January 31, 2015 at 9:03 am
The one I dont get is the use of float in a financial application. I just don't see the reason. Someone help me.
I've seen some big money financial packages do this.
January 31, 2015 at 9:14 am
jarick 15608 (1/31/2015)
The one I dont get is the use of float in a financial application. I just don't see the reason. Someone help me.I've seen some big money financial packages do this.
Lack of understanding of data types is the only reason I can think of.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 31, 2015 at 9:19 am
I'm currently working on a warehouse where someone decided it was a good idea to use numeric(2,0) for a cancellation code, works fine for the current system, but now breaks in the new system where cancel codes are 3 chars long!
January 31, 2015 at 9:20 am
mister.magoo (1/31/2015)
Storing currency values in a FLOAT is my least favourite, as I am constantly having to do stupid things like this
WHERE ABS(table1.column1 - table2.column2)<0.01
just to check for equality...
Wow. Pretty bad. That could become a computational least favorite for me. Fortunately, that normally doesn't happen for me.
What I really hate are the most common mistakes that really consume memory and disk space. My "favorite" is that it appears that much "automatic design" software used by front-enders create every integer column as a NUMERIC(18,0). Yeah... I like using 9 bytes instead of 4, 2, or even 1. Of course, storing phone numbers (etc) as NVARCHAR(anything) is insane. Another "favorite" of mine is VARCHAR(1).
What really lifts the stops on my steam valve is when you approach some "developers" on the subject, they sometimes get really snotty and give you that stupid retort about "Well, pre-optimization is the root of all evil". While that expression certainly isn't stupid and is painfully true, it's been inappropriately applied for such mistakes. They don't understand the difference between "pre-optimization" and "just" writing good code/proper design.
My other "favorites" are the subjects of indexing, DRI, and constraints. I recently had one 3rd party lead "developer" tell me that his database design, which includes no clustered indexes, no PKs, no FKs, and not even nullability constraints was all "best practice" because such things should all be enforced in the "business layer" and to make the database "portable". This idiot also doesn't see a problem with changing the name of a column every time it's stored in a different table. The really bad part is that this is supposed to be for a really high performance app that our company hired those moroffs for and I can't convince the 3rd party management (they don't employ a DBA or even a good database developer) nor our own ("Well, they wouldn't be in business if they were that bad") that all of this falls under the category of "worst practices" and that the app will be anything but high performance. Of course, when it fails to perform, they going to blame me for the "database performing slowly" at which time I'm going to redeliver the mountain of recommendations that I previously gave all of them one at a time in a rapid fire session of point blank high velocity pork chops. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
January 31, 2015 at 9:57 am
Just take a look at a JD Edwards database and you will find all the dont´s mentioned in this topic.
You will find floats for financials data, integers for date and time, numbers displayed on the screen as 1 but stored as 1000. You name it, JDE has done it.
January 31, 2015 at 9:58 am
mister.magoo (1/31/2015)
jarick 15608 (1/31/2015)
The one I dont get is the use of float in a financial application. I just don't see the reason. Someone help me.I've seen some big money financial packages do this.
Lack of understanding of data types is the only reason I can think of.
Ditto. I suppose someone may think they can handle "bigger numbers" this way, but that is to misunderstand floating point numbers. FP numbers should always be thought of as an approximation. I would like my accountant and bank to get things _right_, not _approximately right_. On the other hand, if I am measuring a distance or a temperature then various forms of float are entirely appropriate.
I suppose it might just be appropriate for economists to use floating point numbers, but I have been known to have doubts about what economists do ;-).
I can remember a case where a company had two interfaced PoS systems which did rounding related to VAT (Value Added Tax, sort-of percentage based Sales Tax) very slightly differently. That caused problems from time to time, but that was an application problem not caused by anything wrong in the databases.
Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]
January 31, 2015 at 10:33 am
Nice article Gail...
Which brings up MY question regarding data types...
Using MS SQL Server -- I'll be working on a project in the near future where we need to store latitude and longitude ( NO -- we do NOT need to use the geography data type, so please exclude from any suggestions ) -- but I've seen FLOAT, DECIMAL(9,6), DECIMAL(19,16) and many other used -- what it YOUR choice for storing latitude and longitude values; and for the sake of argument -- let's say to a maximum of 6 digits of precision ?
Just interested in knowing what others have used. Thanks in advance!
January 31, 2015 at 1:11 pm
My two favorites:
1. Worked at one client where they stored Guid's in image datatype. At least they understood Guids are binary and not text, but still had to do a cast anytime it was referenced. And, the value was a logically a foreign key that could not be indexed.
2. Another client stored Xml data in a text datatype. Granted this database was originally designed before xml was a datatype in SQL Server, but was running on a server where we could have modified the table to use xml datatype. I had to implement an application enhancement to provide a count of certain types of elements in this xml data. Size of these values ranged from 200 characters to some over 10mb. Converting the text value to xml to do a XPath query for those rows that were over 10mb took 3-4 minutes. I ended up scheduling a SQL job to run a sproc to convert to xml then count using xpath, 100 rows at a time then end, then run the job every 5 minutes (to satisfy the dba that this wouldn't interfere with production). It took almost 4 weeks to run through all 18 million rows.
Mark
Viewing 15 posts - 1 through 15 (of 121 total)
You must be logged in to reply to this topic. Login to reply