June 28, 2007 at 5:27 pm
Hello all,
We are getting ready to redo an old text file system in which each data row was basically a slab of data (think icons in Office) that was parsed in code based on various positional codes at the beginning of the data. If we normalize the data into SQL Server, we are going to end up with a few dozen little tables and a bunch of left joins, because much of the data is variable and may or may not exist for a given row.
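To give a rough idea of what I mean, here's a cut-down sketch (the table and column names are just placeholders, not our real schema): a core table that always has a row, little satellite tables that may or may not, and LEFT JOINs to put a row back together:

    -- Core row that always exists
    CREATE TABLE dbo.DataRow (
        DataRowID int NOT NULL PRIMARY KEY,
        RowType   char(2) NOT NULL   -- the old positional code
    );

    -- Optional attribute groups: one little table each,
    -- and a given DataRowID may or may not have a row here
    CREATE TABLE dbo.RowAddress (
        DataRowID int NOT NULL PRIMARY KEY REFERENCES dbo.DataRow (DataRowID),
        Street    varchar(50) NOT NULL,
        City      varchar(50) NOT NULL
    );

    CREATE TABLE dbo.RowPhone (
        DataRowID int NOT NULL PRIMARY KEY REFERENCES dbo.DataRow (DataRowID),
        Phone     varchar(20) NOT NULL
    );

    -- Reassembling a full row means a LEFT JOIN per optional piece;
    -- with a few dozen such tables, that's a lot of joins
    SELECT d.DataRowID, d.RowType, a.Street, a.City, p.Phone
    FROM dbo.DataRow d
    LEFT JOIN dbo.RowAddress a ON a.DataRowID = d.DataRowID
    LEFT JOIN dbo.RowPhone   p ON p.DataRowID = d.DataRowID;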
Are there performance considerations with that many joins? I've always worked under the assumption that normalization is almost always the best way to go. To make things less cumbersome, would we be wise to create some views for common combinations or are there other ideas? Thanks in advance for any ideas and/or suggestions!
June 28, 2007 at 7:51 pm
Overuse of joins usually results in longer-running queries and wastes system resources. You have to find out (I know it's not easy) which transactions are going to be run most often against your tables here. MS says that frequent operations requiring 5 or more table joins should be avoided by going back to your db design and denormalising. It really depends on how often a particular query will be run, the row length, whether or not you have good indexes, and whether you have good hardware; I have had queries with 12 joins and still got good performance, but again, if none of the other factors above help, then you should go back and denormalise.
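For what it's worth, the 12-join queries I mentioned only performed well because every join column was indexed. Something along these lines (the table and column names here are made up for illustration):

    -- Index the foreign key columns used in the joins, so the optimizer
    -- can seek into each table instead of scanning it for every join
    CREATE NONCLUSTERED INDEX IX_OrderDetail_OrderID
        ON dbo.OrderDetail (OrderID);

    CREATE NONCLUSTERED INDEX IX_OrderDetail_ProductID
        ON dbo.OrderDetail (ProductID);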
About views, I would say there is not much benefit for a DBA in keeping views in the database just to store frequently used queries. A view adds unnecessary overhead and, in most cases, causes more data than necessary to be returned. The rule is: keep your queries as short as possible. Unless there is a good reason to select data through views, for example when implementing application security through views, it's better to use T-SQL code to select exactly the fields required from the tables directly.
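To illustrate the point (again with made-up names), compare a catch-all view against selecting just the required fields directly:

    -- A "catch-all" view that carries every column
    CREATE VIEW dbo.vCustomerFull
    AS
    SELECT CustomerID, CustomerName, Street, City, Phone, Fax, Notes
    FROM dbo.Customer;
    GO

    -- Easy to write, but hauls back columns nobody asked for
    SELECT * FROM dbo.vCustomerFull WHERE City = 'Boston';

    -- Better: exactly the fields required, directly from the table
    SELECT CustomerID, CustomerName
    FROM dbo.Customer
    WHERE City = 'Boston';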
Good luck.
June 28, 2007 at 10:14 pm
Not being a smart guy here... was very interested in what you said...
"MS says that frequent operations requiring 5 or more table joins should be avoided by going back to your db design and denormalising"
Do you happen to have the Microsoft link where MS said that?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2007 at 11:56 pm
Jeff, please find the link here http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/oltp-performance-issues.mspx
Also, I would appreciate it if you could explain what "Not being a smart guy here... was very interested in what you said..." actually means. Thank you for your consideration.
June 30, 2007 at 6:55 am
Hello Michaela,
I took Jeff's comment to mean that he wasn't trying to insult you by questioning what you said, but that he would like to see your statement documented, which you did when you provided the link.
June 30, 2007 at 8:54 am
Heh, thanks Michaela,
I don't mind explaining at all... Some folks really take things the wrong way when asked for the source of what they said, and I wanted to make sure no one thought I was saying anything like "Really? Prove it!" to you.
The reason I really wanted to see the source of the information is that I have a constant war going on between myself and some of the so-called "designers" at work and other places. They insist that it's always appropriate to denormalize those portions of the database that have "performance problems". I keep telling them that's a lot of hooey and that they simply need to write better code that will correctly use 3rd normal form... long story... basically, I have spent, and continue to spend, much time teaching my "single SELECT to do the whole thing" folks that it's much better to "divide and conquer" in the code on complicated batch jobs and GUI lookups than to put up with the future problems that will appear when the DB is denormalized (below 3rd normal form) for performance reasons (of course, there's always an exception or two... it just shouldn't be the general rule).
Anyway, I beat the normalization drum pretty hard at work and on these forums, and when I saw your statement that "MS says that frequent operations requiring 5 or more table joins should be avoided by going back to your db design and denormalising", I thought I was going to have to make a road trip to Microsoft so I could kick someone's butt, or make a serious re-evaluation of what I've been stressing so much. In the vein of "know thy enemy", I wanted to see the MS article before I opened my big mouth at work about how wrong the article was, if it ever came up, or studied up on a new mantra if I thought the article was actually correct.
Fortunately, I can save the gas it would take to drive to Microsoft, and I don't have to change my mantra... nowhere in the article do they actually recommend denormalizing as a suggested method to improve performance (whew!). I can see a lot of people taking it that way (so I still have to prepare for the potential onslaught), but MS never said to denormalize for performance. They do emphasize good database design, which, to me anyway, usually means 3rd normal form. If denormalizing means bringing the database design back down to a correct 3rd normal form, then I'll support that... if denormalizing means violating 3rd normal form, then I have to put my battle gear on, gather up sharp sticks, and wade into battle.
Again, there are some exceptions (like staging tables and certain report tables in the absence of a data mart), but, as a general rule for most database operations, denormalizing tables for performance is usually a bad idea, so far as I'm concerned.
What is very cool about the article is that they (MS) list the number of joins as the very first bullet in the very first section as one of the "top performance bottlenecks", and that's what I've been preaching at work... "divide and conquer" in a proc, instead of a single "do all" SELECT, is my prime recommendation/solution for solving that type of problem.
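Just so there's no mystery about what I mean by "divide and conquer", here's a bare-bones sketch (I'm borrowing the placeholder tables from the first post, so the names are made up, not anyone's real schema): isolate the small set of rows you actually care about first, then join the detail tables to that much smaller set:

    -- Step 1: isolate the small working set of rows the query cares about
    SELECT d.DataRowID, d.RowType
    INTO #Working
    FROM dbo.DataRow d
    WHERE d.RowType = 'AB';   -- whatever criteria narrows the set

    -- Step 2: join the detail tables to the much smaller temp table
    -- instead of joining everything to everything in one giant SELECT
    SELECT w.DataRowID, w.RowType, a.Street, a.City, p.Phone
    FROM #Working w
    LEFT JOIN dbo.RowAddress a ON a.DataRowID = w.DataRowID
    LEFT JOIN dbo.RowPhone   p ON p.DataRowID = w.DataRowID;

    DROP TABLE #Working;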
Anyway, thanks for the article, Michaela... it's a great summary that covers a LOT of ground on performance issues... it also lists one of my other "battle subjects" as a performance problem... deadlocks... whole 'nother subject there.
--Jeff Moden
Change is inevitable... Change for the better is not.