Compression in an OLTP System

  • I have another topic going that started out focused on clustered index modification but turned into more of a compression discussion. Ordinarily I'd just let it continue, but because the questions are completely changing scope here, I figured I'd create a new topic to keep things organized.

    Here's my setup.

    Server: Dual quad-core, 32GB RAM (3 RAID arrays for temp/data/log; I don't have the exact specs handy)

    Database Size: 70GB to start, 20GB after applying the compression generated by Paul Nielsen's db_compress .25

    Database Type: Heavy OLTP. I'm not sure what my read-to-write ratio is, but there is *heavy* data modification happening constantly; I'd guess around 60/40 R/W. I'm starting to look into a way to get real numbers now (the current production system is SQL 2000, so no DMVs; I'd need something that could give me results after just a day or two of activity, if anyone knows of anything. See the sketch below this list for what I have in mind.)

    SQL Version: Currently SQL 2000 (moving to 64-bit SQL 2008 Enterprise Edition)
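    In case it helps frame the question, here's the kind of approach I'm considering for getting real numbers on SQL 2000, since there are no DMVs there: snapshot the cumulative file I/O counters from ::fn_virtualfilestats and diff them after a day or two. This is just a rough sketch; the snapshot table, database name, and file id are placeholders.

    -- Holding table for periodic snapshots of the cumulative counters
    CREATE TABLE dbo.IoSnapshot (
        SampleTime   datetime NOT NULL DEFAULT GETDATE(),
        FileId       int,
        NumberReads  bigint,
        NumberWrites bigint,
        BytesRead    bigint,
        BytesWritten bigint
    )

    -- Run this once now and again after a day or two of normal activity.
    -- Repeat per data file; file ids come from sysfiles in the target database.
    INSERT dbo.IoSnapshot (FileId, NumberReads, NumberWrites, BytesRead, BytesWritten)
    SELECT FileId, NumberReads, NumberWrites, BytesRead, BytesWritten
    FROM   ::fn_virtualfilestats(DB_ID('MyDatabase'), 1)    -- placeholder db/file

    -- The counters are cumulative since the instance started (and reset on restart),
    -- so the delta between the newest and oldest snapshot = activity in between.
    SELECT FileId,
           MAX(NumberReads)  - MIN(NumberReads)  AS Reads,
           MAX(NumberWrites) - MIN(NumberWrites) AS Writes
    FROM   dbo.IoSnapshot
    GROUP  BY FileId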

    The way our system typically accesses data is not normal, so it might change the rules somewhat. We have triggers on... everything. Typically 4-5 triggers on every table. Most of these triggers run cursors and call stored procedures inside those cursors. Nested Triggers = ON. These triggers often perform updates to other tables, which have their own triggers, and so on; you probably get the idea. So, basically, a large amount of our activity is very small and not set-based.
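    To give a rough, completely made-up illustration of the pattern (none of these object names are real):

    -- Made-up example only: an AFTER UPDATE trigger that walks the affected rows
    -- with a cursor and calls a stored procedure per row, which in turn updates
    -- other tables that have their own triggers (Nested Triggers = ON).
    CREATE TRIGGER trg_Orders_Update ON dbo.Orders
    AFTER UPDATE
    AS
    BEGIN
        DECLARE @OrderID int

        DECLARE order_cur CURSOR LOCAL FAST_FORWARD FOR
            SELECT OrderID FROM inserted

        OPEN order_cur
        FETCH NEXT FROM order_cur INTO @OrderID

        WHILE @@FETCH_STATUS = 0
        BEGIN
            -- per-row proc that updates other tables, firing their triggers too
            EXEC dbo.usp_RecalcOrderTotals @OrderID
            FETCH NEXT FROM order_cur INTO @OrderID
        END

        CLOSE order_cur
        DEALLOCATE order_cur
    END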

    Currently we're seeing almost no CPU pressure, but we're often I/O-capped and regularly run into table contention/blocking issues, as well as occasional buffer issues with the heavier queries.

    Any advice on the best approach to implementing compression here? I've seen claims that page-level compression on heavy OLTP systems can lead to issues, but I don't know whether that will apply here. I'd also have more RAM than the size of our entire database under the compression (mostly page) generated by that whole-db proc, so I'm not sure whether that would alleviate a lot of the theoretical CPU strain. I'll be reading over a lot more compression documentation in the next few days (thanks to Paul White for the link to a lot of great info on compression), but any tips/pointers/advice/thoughts are appreciated.

    If you want any other info, just let me know, I'm happy to furnish it.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • If you really want to look at compression on heavily used tables, you might want to consider ROW level rather than PAGE level. You obviously won't get the same reduction in space usage, but it has a lot less impact on the system.
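    Once you're on 2008 you can at least estimate both settings per table before committing to anything, and change your mind later. Something along these lines should work; the table and index names are just placeholders.

    -- Estimate the savings for one table under each setting (SQL 2008 Enterprise).
    -- 'dbo'/'MyBigTable' are placeholders; NULL = all indexes / all partitions.
    EXEC sp_estimate_data_compression_savings
         @schema_name      = 'dbo',
         @object_name      = 'MyBigTable',
         @index_id         = NULL,
         @partition_number = NULL,
         @data_compression = 'ROW'

    EXEC sp_estimate_data_compression_savings
         @schema_name      = 'dbo',
         @object_name      = 'MyBigTable',
         @index_id         = NULL,
         @partition_number = NULL,
         @data_compression = 'PAGE'

    -- Apply whichever looks right for that table; it can be changed later.
    ALTER TABLE dbo.MyBigTable REBUILD WITH (DATA_COMPRESSION = ROW)

    -- Nonclustered indexes are set individually.
    ALTER INDEX IX_MyBigTable_SomeColumn ON dbo.MyBigTable
        REBUILD WITH (DATA_COMPRESSION = ROW)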

    Are you going to be able to get a good baseline analysis on your system and replicate workloads so you can test different scenarios without impacting production?



    Shameless self promotion - read my blog http://sirsql.net

  • Nope, I'm not, and that's the problem. We're currently running SQL 2000 for our entire production system, and there are pretty significant infrastructure changes between that database and the new 2008 ones, so I can't replicate that workload over. This is a best-guess scenario for go-live, with tweaking after the fact. I'm trying to optimize it as much as I possibly can with the least chance of making it inoperable. I don't have a huge issue with backing tables/indexes down from page to row or no compression afterwards for a performance improvement, as long as the difference isn't so big that it drags the system to a crawl. Basically, I'm trying to make the best guess I can about performance, because I can't load test this in any realistic way beforehand, and the less I have to do afterwards the better, as I like sleeping occasionally =).

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • I was doing some investigation into page-level and row-level compression when we first started looking at moving our PeopleSoft systems to SQL Server 2008. One of the things I noticed in my testing was that the 4 largest tables dropped disk usage by about 4 GB, the next 4 largest tables dropped it by only 2 GB, and the progression continued like that for every 4 tables.

    This could have been due to the type of data stored in each table, or simply less bang for the buck as the tables got smaller. I did see better compression with page level than with row level.

  • Take a look at this best practices article from the SQLCAT team on MSDN: http://msdn.microsoft.com/en-us/library/dd894051.aspx. It has some really in-depth information and a graph showing the difference in time taken for an OLTP workload at various compression levels.



    Shameless self promotion - read my blog http://sirsql.net

  • Nicholas Cain (3/25/2010)


    Take a look at this best practices article from the SQLCAT team on MSDN: http://msdn.microsoft.com/en-us/library/dd894051.aspx. It has some really in-depth information and a graph showing the difference in time taken for an OLTP workload at various compression levels.

    I have a printed copy of that one sitting on my desk at the moment. I've skimmed it, but I need to go back and read it more closely.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • The differences in our database are pretty drastic: from 70GB down to 20GB using a good amount of page-level compression. I'm thinking that's because we have a lot of nchar/nvarchar columns with repetitive data in them, plus a fair number of fields that use 'left padding' of spaces or zeroes. I guess I should go back and look at a table-by-table comparison of the size differences, though. It may not be worth compressing tables that only have a few rows in them.

    I just actually looked at the distribution, and it's not compressing as many things as I'd thought. The following counts cover tables + indexes:

    NONE    2273
    ROW       99
    PAGE     621
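    If anyone wants to pull the same kind of breakdown on a 2008 database, a query along these lines against sys.partitions should do it. Just a rough sketch; it counts per partition, so partitioned objects show up more than once.

    -- Count table/index partitions by compression setting, skipping system objects
    SELECT p.data_compression_desc,
           COUNT(*) AS partition_count
    FROM   sys.partitions AS p
    JOIN   sys.objects    AS o ON o.object_id = p.object_id
    WHERE  o.is_ms_shipped = 0
    GROUP  BY p.data_compression_desc
    ORDER  BY partition_count DESC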

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation
