Partitioning to Address Resource Contention

Question

Partitioning to Address Resource Contention

paul_ny_murray

SSC Enthusiast

Points: 164
More actions
October 8, 2017 at 9:15 pm

#340596

HI. I'm new to partitioning so please bear with some naive questions.
We have a large application that does daily pricing of products for multiple clients. At around 4 pm in the afternoon client data hits the system and once all the relevant data is in for a particular client, a user triggers the calculation process for all of that clients products. the basic entities are:
Client 1-m with products 1-m with product items
Other data that comes in from third parties are prices, fx rates and accrued interest.
the general work flow for a client is as follows:
1. Client submits the data for their products
2. The data is stored in staging tables where it is cleaned.
3. A request message is created on SSB to process the data from staging to actuals
4. Once all data is in the actuals the user requests via another SSB message to process all the products for that client resulting in a final daily report per product.
The flaw in the workflow is that if Client A has 3000 products and Client B has only 5, if Client A gets in the queue first, Client B will have to wait. With 30 clients and 1000's of products system performance appears to lag and we have a lot of unhappy users. We are addressing this issue in a number of ways.
We are creating processing queues for each client and one catch all queue for new clients. This takes care of the bottleneck in terms of message processing however, if we try to multi thread each queue with more than one instance of their activation procedure, we don't really get an improvement in actual performance since all the queues are still competing for the same processing tables.
This is where we hope partitioning can help. We are thinking if we create a file group per client grouping the partitions for each of the processing tables this would relieve cross client contention. So each processing table will have a numeric column called partition_no. This will be referenced in SQL code to direct execution to the file group relevant to that client. We're hoping this will create the affect of having tables per client.
Most of the examples I see with partitioning are concerned with dates or numeric ranges. I haven't seen any examples attempting what we have in mind.
Has anyone tried this approach? We would very much appreciate feed back and opinions.
Thanks in Advance
Paul

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

frederico_fonseca SSCoach Points: 16155 More actions · Answer 1

Partitioning will most likely not help here.

where exactly is the contention happening?
And are server specs up to scratch? sometimes performance is due to server either being too slow or too fast.

From what you explained the different clients should not be competing within row level locks as data should not be the shared.
With multi-thread for the same client locking can occur but you didn't give any detail of where and why that contention is happening e.g. several threads trying to update the same records or just waiting to read for records that may be locked even if what is being updated is not important for some particular selects.

One thing that may help on these locks but that needs due consideration is enabling RCSI - Read Committed Snapshot Isolation - on the database . Will impact your tempdb usage and coding will need to take this in consideration but it may be what you need.

Ordering the threads to process the data in different order or in logical blocks - e.g. common product records get processed by the same thread - can also improve performance by minimizing the probability of locking.

paul_ny_murray SSC Enthusiast Points: 164 More actions · Answer 2

Hi Frederico, Thanks for the reply and sorry it's taken awhile to get back. .

where exactly is the contention happening?
And are server specs up to scratch? sometimes performance is due to server either being too slow or too fast.
>>Yes of course we're planning regular upgrades that this will help.

From what you explained the different clients should not be competing within row level locks as data should not be the shared.
With multi-thread for the same client locking can occur but you didn't give any detail of where and why that contention is happening e.g. several threads trying to update the same records or just waiting to read for records that may be locked even if what is being updated is not important for some particular selects.

>>so each filegroup contains the necessary partitions to isolate the processing data for each client.
>>FileGroup1 FileGroup2
>>Prices for client A Partition 1 Prices for client b Partition 2
>>Product info for client A Partition 1 Product for for client b Partition 2 etc
>>I might also note that the partitions are segmented by client. There is also a date and a product identifier involved, but we didn't think that we should get so granular since that would involve a partition >>for each product/date with a calculated column combining the two.
>>So the problem in the architecture is that each ssb activation procedure will process a particular product for a given client. Ideally we'd want to multi thread with a number of activation procedures. >>However, since each activation procedure would be competing for the same partition.. it seems like it would be the original issue we had just on a client/partition level instead of a whole table.

One thing that may help on these locks but that needs due consideration is enabling RCSI - Read Committed Snapshot Isolation - on the database . Will impact your tempdb usage and coding will need to take this in consideration but it may be what you need.
>>We do have the DB set to RCSI.

Ordering the threads to process the data in different order or in logical blocks - e.g. common product records get processed by the same thread - can also improve performance by minimizing the probability of locking.
>> We actually have a queue per client that corresponds to the file groups. That's about as similar as it gets. Each product is unique.

>>I'm starting to think there maybe no other solution other than to partition using product and date.

Thanks Again
Paul

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 3

Perhaps partitioning isn't the answer. For example, unless you can isolate and guarantee that each filegroup is on a separate spindle (or spindles), you're still going to get head thrashing at the disk level. From what I've seen, partitioning not only isn't a panacea of performance, it frequently makes performance slower for multiple reasons. Rumor has it that it's because it has to go through more than one B-TREE to get where it's going. I've observed an increase in slowness on one particularly large table that I made the mistake of partitioning using Partitioned Tables technology.

I've also found that, for those reasons and more, parallel processing is also NOT a panacea of performance. If you do have guaranteed separation at the spindle level, then you might get some increase in performance.

I say that because you still only have on log file, which is a performance bottleneck if you have poorly written code.

Also remember that data must be in memory to be processed. Separate filegroups aren't really going to help there because whatever data you're working with still has to be in memory before you can do anything with it.

That leaves really only one thing... the code. And that's where the performance needs to be.

I'd stop looking for the proverbial "magic bullet" and start looking closely at where the performance actually needs to be... in the code. Find your top 10 most costly queries (not to be confused with the longest running queries) and fix them. If the queries are generated by ORM or managed code, consider moving that functionality to stored procedures.

Either that or do like EBay did... buy hundreds of servers and assign just a couple of micro services to each one. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)