June 5, 2014 at 5:23 am
I'm designing a new database that will be the back-end to a heavily used web-based application (all these terms are relative - I guess the use won't be that heavy in the grand scheme of things; I'm only talking about 100 users or so at the very most). Data from the old application database will be migrated to this one, and the old database is around 7GB in size after 5 years of use.
I have two different ways of linking some tables in mind, one slightly more complex than the other but with potential benefits over the simpler method. However, I'm concerned that I might be 'over-cooking' the design and that performance would suffer as a result, so I've created two versions of the database (the part of it I'm concerned with, anyway), one for each of the solutions I have in mind, migrated the data into the relevant tables, and run some queries on the data to collect statistics.
The problem is that, whilst I can see that the more complex method is more expensive, as expected, I don't really understand whether the difference is significant. Since I don't know what the numbers in the Client Statistics window actually mean (there are no units! I'm guessing times are in milliseconds?), or how much of a real-world impact the difference will have, I'm finding it hard to interpret my statistics and come to a decision.
Querying the entirety of my tables to return ~20,000 records, listing one column from each of the main tables I'm working with, the simpler method had a Total Execution Time of 199 and the more complex method a Total Execution Time of 272. Is that the statistic I should be most concerned with? Is that a difference I should be worried about? Is the difference likely to be magnified when the database is much larger and in use, such that a difference of 73 milliseconds in this test scenario could end up being as much as a whole second in production, for example?
I understand that the answers to some of these questions might fall into the "How long is a piece of string?" category, but some pointers and general indications of how concerned or otherwise I should be would be greatly appreciated!
June 5, 2014 at 7:45 am
Those times are in milliseconds.
Usually, the way I measure performance for a query is to SET STATISTICS IO ON and SET STATISTICS TIME ON in the query window I'm operating in. Neither measure by itself is adequate to express performance, so I take both into account when trying to determine whether one query is outperforming the other. I also do several runs of each query to get its average behavior, because the first run may be loading so much into memory that it gives a false impression of speed or the lack thereof. Multiple runs have the added benefit of averaging out any blocking or contention issues. In addition, I'd suggest checking the execution plans of each query to see how the optimizer is resolving them, so you can understand how your indexes and constraints are being used by your queries, if they're being used at all.
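As a rough example of that kind of test harness (the table and column names here are just placeholders, not taken from your design):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the candidate query several times; treat the first (cold-cache) run with suspicion
SELECT o.OrderDate, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON c.CustomerID = o.CustomerID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

The Messages tab will then report logical reads per table and CPU/elapsed time in milliseconds, which you can compare directly between your two designs.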
In terms of "linking the tables", I find that straightforward referential constraints against normalized storage usually perform the best.
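As a minimal sketch of what I mean by a straight referential constraint (these tables are invented for the example, not your actual schema):

CREATE TABLE dbo.Customers
(
    CustomerID   int IDENTITY(1,1) NOT NULL,
    CustomerName varchar(100)      NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerID)
);

CREATE TABLE dbo.Orders
(
    OrderID    int IDENTITY(1,1) NOT NULL,
    CustomerID int               NOT NULL,
    OrderDate  datetime          NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID),
    CONSTRAINT FK_Orders_Customers FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
);

Declaring the relationship as a simple foreign key like this keeps the joins direct and gives the optimizer the most information to work with; extra intermediate "linking" tables are generally only worth it when a many-to-many relationship genuinely requires them.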
There are lots and lots more details around this. I cover a lot of it in my books (see the links below in my signature).
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning