October 20, 2011 at 7:58 am
what ever amount your company need really , generally i see places have up to two years detailed data and going back from there it is in summary form. though this really depends on your situation.
with a table that sized and the new adition of data each month, you could look at partitioning. Possible partition based on month and year could give you the ability to easily switch out partitions.
October 20, 2011 at 7:58 am
I'm guessing your table was imported from excel or access at some point.
Yes, if you plan to store 126M rows in a table, you need to do some serious optimization.
October 20, 2011 at 8:02 am
@steve-2 Cullen
You are right. The ODBC Connection (read query) to pull data from the table was originally based in an excel. Now since I am the only person who is handling pretty much everything, I was just too lazy to do any conversions etc to the query/data types.
Good thing is, I still havent loaded all the data yet. My project is still in the "proof of concept" stages but I need to make sure that when it goes live, it doesnt face any of these avoidable issues.
October 20, 2011 at 8:03 am
Ninja's_RGR'us (10/20/2011)
rajiv.varma (10/20/2011)
Oops..Bad math hits me again :D. You are right Ninja. Your count seems to be right.Maybe but your count can be right too.
There's a point where you should be able to archive the older data to another table making the recenter data easier to access.
I never unavailable. I said archived. it can be in the same db in a different table. other db, other server... take your pick
October 20, 2011 at 8:23 am
Another question for you guys. Will it really matter if I choose NVARCHAR or VARCHAR for the data type? If I keep it as unicode data type, I dont have to do any conversions, which means faster loading.
Converting 90 something columns to varchar is something that I am reluctant to do. But will do it if its best for performance.
October 20, 2011 at 8:26 am
If you can convert, it save a heck of a lot of space giving more ressources available for the same cost (data takes less space on disks, ram, cpu...)
October 20, 2011 at 8:29 am
You are the guru Ninja. I am going to stick by your wisdom š
October 20, 2011 at 8:38 am
rajiv.varma (10/20/2011)
Thats a design issue that I need to take up with my boss. He wants all the data (Yes, the entire 7 years) ready to be reported. No archival. Now I am not sure how that would work, but I will try to educate him a bit about the Data warehousing principal.Generally speaking, what would be good amount of data to keep for reporting
???
That's what Views utilizing UNION are good for. It's also what partitioning is good for 12 months * 1.5 mil = 18 mil rows... With schema binding, you can even index the views.
that's not too bad with a yearly partition. But a clustered index first to get rid of the big heap table, followed by non-clustered indexes will do you wonders.
October 20, 2011 at 8:41 am
Just to make this 100% clear : UNION ALL.
October 20, 2011 at 8:47 am
Ninja's_RGR'us (10/20/2011)
Just to make this 100% clear : UNION ALL.
Sorry, but yes. UNION ALL, UNION except when actually needed is bad ju-ju š
October 20, 2011 at 8:47 am
Thats a design issue that I need to take up with my boss.
That's the best idea. š
He wants all the data (Yes, the entire 7 years) ready to be reported. No archival. Now I am not sure how that would work, but I will try to educate him a bit about the Data warehousing principal.
Well, he is your manager (& if he is also a technical person) you should try to understand his solution first. He might have a very good reason for such a crazy idea which we can't guess here.
Generally speaking, what would be good amount of data to keep for reporting???
Again, this question can be best answered by your manager / tech lead. Why??? Because it depends on business requirements. Some business houses which generate millions of records in a day / week / month go for weekly / monthly / quarterly reports. I call it HOT data (my own term ;-)) & it should be readily available for reporting.
Does it really mean these business houses don't have reports for last 5 years? They do have it but this is COLD data might be pulled once a while. So it can be pulled from archived tables.
October 20, 2011 at 8:51 am
Yes, if you plan to store 126M rows in a table, you need to do some serious optimization
Agree. And best way to achieve it would be partitioning (transparent to end users).
October 20, 2011 at 8:53 am
The reason for 7 years of data archival is because sometimes the executives query for a particular customer. For example what was the purchase amount for customer X between a period A and B. Thats the kind of questions people here are looking to answer.
October 20, 2011 at 9:01 am
That's what Views utilizing UNION are good for
Is it not a CLASSIC implementation in SQL Server (I believe SS2K), when we didn't have partitioning options?
What about maintenance of these tables? Logically adding one column in design translates to adding a column to āNā tables Physically.
October 20, 2011 at 9:10 am
rajiv.varma (10/20/2011)
The reason for 7 years of data archival is because sometimes the executives query for a particular customer. For example what was the purchase amount for customer X between a period A and B. Thats the kind of questions people here are looking to answer.
This is not sufficient information. Let's assume if MD of a company is looking for this report he would expect it in seconds (or milliseconds perhaps). This is defines the WEIGHT of report. Another example is if the same report is pulled by Sales Executives (100 Executives, 3 times a day as an average). This reports needs to come fast as well. This is FREQUENCY that is adding value. There are many more factors... I can tell you more but would be Paid Service. š
Viewing 15 posts - 31 through 45 (of 58 total)
You must be logged in to reply to this topic. Login to reply