May 31, 2012 at 12:11 pm
For a University project I'm writing applications that read data from a home weather station and add it to a database. These updates occur every minute so obviously after a while there are going to be hundreds of thousands of records. This database will also be available online so if I want to display the max/average values of various data what is the most efficient way? or will loading the entire table and querying as normal be best?
Here is some sample data: (I have only one table)
12012-05-31 19:01:03.000000021.615.112.458840.0337.5NNW15.10.000.512.06991.800FallingRainy
22012-05-31 19:02:03.000000021.615.012.358840.0247.5WSW15.00.000.512.06992.000FallingRainy
Many thanks,
Mike
May 31, 2012 at 12:23 pm
depends on how you want to display the data? average and max per what? per hour, day,week month? post in some table definitions and data and exactly what you want to achieve and i will try help you
***The first step is always the hardest *******
May 31, 2012 at 1:18 pm
*Cluster* the table on the add date/time (rather than on, say, an identity column, which may still be a good idea to have on this table, just don't cluster by it :-)).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 1:40 pm
ScottPletcher (5/31/2012)
*Cluster* the table on the add date/time (rather than on, say, an identity column, which may still be a good idea to have on this table, just don't cluster by it :-)).
While I may agree, I have to play devils advocate, why not cluster on the identity column?
May 31, 2012 at 1:49 pm
While I may agree, I have to play devils advocate, why not cluster on the identity column?
Because the overwhelming majority of queries will specify the datetime.
I wish I had a memory-erasing device to get rid of the horrible notion that identity is a good default clustered key. More db performance issues caused by that one notion than anything else I've ever seen.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 1:53 pm
ScottPletcher (5/31/2012)
While I may agree, I have to play devils advocate, why not cluster on the identity column?
Because the overwhelming majority of queries will specify the datetime.
I wish I had a memory-erasing device to get rid of the horrible notion that identity is a good default clustered key. More db performance issues caused by that one notion than anything else I've ever seen.
May not be a good default, but it may server a viable purpose. Good designs are a must first.
May 31, 2012 at 1:57 pm
There are vastly better ways to stress that in general a table needs a clustered key than to have the damaging and false notion that it should be an identity column.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 1:59 pm
ScottPletcher (5/31/2012)
There are vastly better ways to stress that in general a table needs a clustered key than to have the damaging and false notion that it should be an identity column.
Still going to stick with the old dependable, "It depends."
May 31, 2012 at 2:05 pm
Other than logging-type tables, where the clus key value is essentially irrelevant, I can't imagine any case where identity would be the proper clus key on business data.
Sure, for some "master" tables, like a "customer number", identity is useful. But it's vastly overused as a clus key.
If the key is so irrelevant/meaningless, do you really have a business need for that data in the first place??
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 2:11 pm
There may be architectual reasons for using an identity column for the clustered index. For instance, you could potentially have a very large and active table that is in a database mirrored over a slow wan connection. Due to the activity on the table other viable clustered indexes may result in high fragmentation of the index requiring frequent rebuilds. Not necessarily something you may want to do in this environment.
As I said, "it depends."
May 31, 2012 at 2:14 pm
True, there could be special situations.
you could potentially have a very large and active table that is in a database mirrored over a slow wan connection.
If you're trying to mirror a very large and active table over a slow wan, you've got bigger issues than key choices to worry about.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 2:24 pm
ScottPletcher (5/31/2012)
True, there could be special situations.you could potentially have a very large and active table that is in a database mirrored over a slow wan connection.
If you're trying to mirror a very large and active table over a slow wan, you've got bigger issues than key choices to worry about.
This may be something you have no control over as well. We work with what we have and have to learn and adapt.
May 31, 2012 at 2:30 pm
An occassional highly unusual exception doesn't validate the horrible idea in general of identity as the default for a clus key.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 31, 2012 at 2:33 pm
ScottPletcher (5/31/2012)
An occassional highly unusual exception doesn't validate the horrible idea in general of identity as the default for a clus key.
Nevermind, we aren't going anywhere and doing it really fast.
May 31, 2012 at 3:33 pm
ScottPletcher (5/31/2012)
*Cluster* the table on the add date/time (rather than on, say, an identity column, which may still be a good idea to have on this table, just don't cluster by it :-)).
Thanks for the reply, I don't have in depth knowledge of indexes/clusters, I assume using one on the datetime column will increase query efficiency as like you say, many of the queries will be date ranges and things like warmest/coldest temps, highest wind speeds etc etc
Many thanks,
Mike
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply