July 23, 2009 at 7:01 am
Hello,
I am dealing with 12 months of sales data 2-3M rows per month. I initially set them up as 12 seperate tables and have dealt with my queries through stored procedures (rarely change them). However, recently I have been running many ad hoc queries that require all 12 months of data....this makes for lenghty SQL.
I am wondering if 12 2-3 million row tables perform better than one 30 million row table (indexes being equal)? Should I combine these into a single table for convenience sake?
Please advise.
Thank you,
Lonnie
July 23, 2009 at 7:30 am
In this situation I would be inclined to put all of the data into one table and then partition the table with one partition for each month. Thereafter, when you query the table SQL Server will take care of which partitions it needs to read from and give you optimum performance, with the added benefit that you can query as many or as few months as you'd like to.
Having the data split across 12 tables isn't very scalable and will provide more management overhead than using partitions - although you'll still need to manage the partitions and possibly use a sliding window mechanism with each new month.
Hope that helps.
July 23, 2009 at 7:23 pm
I'll second that.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 25, 2009 at 8:36 am
SQLZ (7/23/2009)
In this situation I would be inclined to put all of the data into one table and then partition the table with one partition for each month. Thereafter, when you query the table SQL Server will take care of which partitions it needs to read from and give you optimum performance, with the added benefit that you can query as many or as few months as you'd like to.Having the data split across 12 tables isn't very scalable and will provide more management overhead than using partitions - although you'll still need to manage the partitions and possibly use a sliding window mechanism with each new month.
Hope that helps.
Partitioning requires Enterprise Edition.
Partitioning performance can be disappointing, especially in 2005.
Splitting the data into 12 tables and using a partitioned view (available in all editions) will almost certainly perform better.
Table partitioning in SQL Server is internally implemented as separate table structures anyway.
Paul
edit: also, a database containing a partitioned table (or even a partition scheme or function) prevents that database being attached, or restored, to an edition of SQL Server which is not Enterprise or equivalent. This can be extremely tedious if you use copies of the database for reporting purposes (for instance).
July 25, 2009 at 10:30 am
Good points, Paul.
I like the idea of using a view to join the tables together if you need that.
Are you querying by month that often? I'm just wondering if you need a current and archived table instead, letting SQL separate out things by table based on a date or a couple of columns (yr/mn)
July 25, 2009 at 9:35 pm
Paul White (7/25/2009)
Splitting the data into 12 tables and using a partitioned view (available in all editions) will almost certainly perform better.
Steve Jones - Editor (7/25/2009)
I'm just wondering if you need a current and archived table instead...
I absolutely agree with both of those suggestions... we had a similar problem and I split the data at the 120 day mark (Telephony and Call Detail Records (CDR's))... we would call the table directly for the short term stuff and a partitioned view for the long term stuff. It helped even really good code never mind the junk code.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 29, 2009 at 9:02 am
Hello All,
Thanks for the suggestions. I have created three views of the data (based on the time frames I generally query). This will suffice for now but I believe I will eventually create one large table with partitions. I am curious about the archiving idea of Steve Jones....but not sure how to implement?
Thanks again,
Lonnie
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply