August 14, 2022 at 12:38 pm
I have a 1.3 terabyte database and growing, in which 1 table a month collects all of the new data... The process is collecting machine data and creates a new Table each month to log it's data. It logs over 2 billion records each month. The past data which doesn't change obviously is included in my full backup. This not only makes the Backup large, but increases backup run time.
I'm looking for suggestions on Backup model\methods how to keep storage requirements in check, and of course recover options in case of Restore would ever be needed. They need the past data Online for review so no option to archive off previous data tables.
Suggestions\Comments appreciated .. I'm running STD MsSql not enterprise..
Thx.
August 15, 2022 at 2:15 am
I have a 1.3 terabyte database and growing, in which 1 table a month collects all of the new data... The process is collecting machine data and creates a new Table each month to log it's data. It logs over 2 billion records each month. The past data which doesn't change obviously is included in my full backup. This not only makes the Backup large, but increases backup run time.
I'm looking for suggestions on Backup model\methods how to keep storage requirements in check, and of course recover options in case of Restore would ever be needed. They need the past data Online for review so no option to archive off previous data tables.
Suggestions\Comments appreciated .. I'm running STD MsSql not enterprise..
Thx.
First of all page compress those tables. If they have LOB columns, use the COMPRESS function on those.
Second, make sure there's no wasted space within the month tables. Unless they're being queried all the time, I'd drop all non-clustered indexes and make sure that the clustered index had very high page density. After that...
Third, they no longer have any business living in the database they were born in. I would move them to another database with their own separate file and their own separate filegroup, set set the file group to READ ONLY, back them up one final time, and call it a day.
If the month tables are queried often, post back and I can tell you how to do that nicely even though you're in the standard edition. And don't let anyone talk you into using Partitioned Tables until we've had a chance to talk about a "safer to restore" method.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 10:54 am
Thanks...
The monthly tables are queried often..
August 15, 2022 at 4:43 pm
Is there a script that will report savings for Page Compression?
Thanks.
August 15, 2022 at 5:58 pm
Thanks...
The monthly tables are queried often..
Ok... how are people doing that when there's one table per month? Are you already using partitioning and, if so, what kind?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 6:01 pm
Bruin wrote:Is there a script that will report savings for Page Compression?
Thanks.
yes - its called "read the manual"
Sometimes, that's difficult because a lot of technical manuals aren't written for people to make discoveries and neither are the searches. It's like the same irony of using a dictionary to find the correct spelling for a word you don't know how to spell. A good example of that is finding the word that sounds like "seg-way". 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 6:04 pm
Is there a script that will report savings for Page Compression?
Thanks.
Better than that, there's a system stored procedure that will estimate the savings as bytes before'n'after. You can find that at the following link...
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 6:05 pm
The application has access to use previous tables in trending and analysis... it just has a feature that when a new month starts it defines a new Sql table inside the DB.
THx.
August 15, 2022 at 6:08 pm
So that means that the application could be modified to look in a different database for a different table, then... correct?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 6:36 pm
frederico_fonseca wrote:Bruin wrote:Is there a script that will report savings for Page Compression?
Thanks.
yes - its called "read the manual"
Sometimes, that's difficult because a lot of technical manuals aren't written for people to make discoveries and neither are the searches. It's like the same irony of using a dictionary to find the correct spelling for a word you don't know how to spell. A good example of that is finding the word that sounds like "seg-way". 😀
perhaps - but on this case the manual for adding page compression also states what script to run - so reading the manual is a good thing.
August 15, 2022 at 7:21 pm
Jeff Moden wrote:frederico_fonseca wrote:Bruin wrote:Is there a script that will report savings for Page Compression?
Thanks.
yes - its called "read the manual"
Sometimes, that's difficult because a lot of technical manuals aren't written for people to make discoveries and neither are the searches. It's like the same irony of using a dictionary to find the correct spelling for a word you don't know how to spell. A good example of that is finding the word that sounds like "seg-way". 😀
perhaps - but on this case the manual for adding page compression also states what script to run - so reading the manual is a good thing.
"It Depends". The first returns on a FireFox search for "Page Compression" mentions it only at the very bottom. It's not even in "See Also". It's below that in "Recommended Reading" and neophytes to MS Documentation are very likely to miss it. When a link does show up in the text of a document, the link looks almost identical to the text. MS colors them a very dark, almost black, Blue and they don't underline it like in the good ol' days of easy recognition.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2022 at 8:07 pm
Do you query the history tables directly? or does the application do that exclusively? Depending on the way the tables are used, the history tables may be candidates for a clustered columnstore. The space savings can be enormous. The current month would probably not be a candidate, unless it it loaded in batches and has no updates or deletes.
The size of your monthly table might make it tricky to convert. I don't think I would try to create a columnstore on 2 billion rows at once, but inserting a new table one day at a time would probably be manageable. I was skeptical when we moved our fact tables to column stores as people seemed to think they are magic, but it was worth it for the space savings alone, plus it is pretty cool to be able to aggregate columns over billions of rows without a covering index.
August 15, 2022 at 10:49 pm
Ed, what's your take on using Column Store for column-oriented (they record the entity and the column that changed as well as the "OldValue" and the "NewValue {big mistake on storing the old value every time IMHO) audit tables where they're not interested in "aggregation" per se` but are interesting in seeing how individual values progressed for a "full row" that needs to be "assembled" from the data?
I ask because I have a few similar table as what the OP does and the nature of the way they stored things means they have to keep all rows for ever if they keep the same "principle" of how they're storing the data (which I'm trying to get them to understand a change to).
To summarize, do you believe that column store would be worthwhile not only for compression but for lookup performance?
It's also worth saying that I've not yet used Column Store for anything, yet, and so want to know from someone that has.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2022 at 10:29 am
What sort of overhead does it have on the server when running, and any idea of duration on a 2.1 billion ow table?
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply