partitioning and filegroups

  • I'm working to design a new system and there are several tables we determined can be combined together - same structures, just for different clients, but housing the same data. Because this table has the potential to be a hot spot, I've been looking heavily at partitioning this table. The thought is to go with one filegroup per client. The issue is that we have about 100 clients and are growing quickly.

    A look at current data has about 8m records that would be spread out of 100 filegroups and every time a new client would come on board a new filegroup would be created (incorporating the ClientID) with that data being assigned to the new partition. I'm not sure of the growth rate on these tables for 2 reasons - 1, I haven't been tracking it and 2, it's hard to predict clients we don't have yet

    Does anyone out there have any experience with databases and filegroups up in the hundreds? Am I maybe overdesigning this? Could this be an administration nightmare? I'm not 100% sold on doing it this way, but I do like the idea of partitioning certain tables by ClientID.

    I'm interested to hear thoughts. And let me know if you want more details from me...

    TIA....

    Peter

  • Peter,

    Maybe a better way to handle this would be to normalize the design a little further?

    If you think this could be a hot spot then maybe BCNF if not already could solve the hot spot.

    If someone is going to be adding a that many customer, the partitioned Id could be admin nightmare

  • 456789psw (8/11/2010)


    Peter,

    Maybe a better way to handle this would be to normalize the design a little further?

    If you think this could be a hot spot then maybe BCNF if not already could solve the hot spot.

    If someone is going to be adding a that many customer, the partitioned Id could be admin nightmare

    This is still in the very early stages, but let me post a sample design. This is for medical record charts. One row = one chart. There are additional tables that store more info about the charts. The type of client/chart determines which secondary table to get additional info from.

    For new clients, we might get 3 or 4 in a month, but might go 3-4 months without bringing on a new client. We are currently at about 100 clients and I think we're bringing on about 5 new ones by the end of the year. So, you can see we're not talking about constant administration in this area, but it's enough that I need to think about different options.

    /**********************************************

    CREATE TABLE [dbo].[tblEdpChart]

    (

    [ChartID] Uniqueidentifier NOT NULL,

    [ClientID] Int NOT NULL,

    [AccountNumber] [dbo].[udtAccountNumber] NOT NULL,

    [MRN] [dbo].[udtMRN] NOT NULL,

    [DOS] [dbo].[udtDOS] NOT NULL,

    [ChartStatusID] Int NOT NULL,

    [PrintedChartLocation] Varchar(255) NOT NULL,

    [UACReasonID] Char(1) NULL,

    [HL7Prof] Char(1) DEFAULT 'Y' NULL,

    [HL7Tech] Char(1) DEFAULT 'N' NULL,

    [Abstract] Varchar(11) DEFAULT No NULL,

    [BatchNum] Int NULL

    )

    go

    -- Add keys for table dbo.tblEdpChart

    ALTER TABLE [dbo].[tblEdpChart] ADD CONSTRAINT [pk_tblEdpChart_ChartID] PRIMARY KEY ([ChartID])

    go

    **********************************************/

    Additional notes:

    - ClientID is FK to a client table

    - there is no CI, only a PK

    The more I think about how this table will be used on a daily basis, it won't be a hot spot for the application we'll be building around it. The supporting tables I mentioned could be (so maybe they're a candidate for this). This table could be a hot spot on data loads. For example, if we receive a batch of charts from 5 different clients all at the same time, if they are all imported in parallel, this could be a problem.

    I guess I need to spend more time kicking around ideas with the application developer. If he plans to process in parallel, partitions might be necessary. I may also group clients together, versus one partition for every client.

    Your post just got the wheels turning even more, so I'm just thinking "out loud" here. More comments/suggestions are welcome!

    Thanks!

  • Little tough trying to break it down since I dont know what each col does. this would be my best guess on table one....the sec table I am just leaving left over col there....but its a start

    just trying to reduce the tables to avoid locking and blocking on batch inserts

    CREATE TABLE [dbo].[tblEdpChart]

    (

    [ChartID] Uniqueidentifier NOT NULL, -- link to a patient table where you ACCt#

    [ClientID] Int NOT NULL,

    [ChartStatusID] Int NOT NULL,

    [PrintedChartLocation] Varchar(255) NOT NULL,

    [BatchNum] Int NULL

    )

    go

    CREATE TABLE [dbo].[tblEdpChart2]

    (

    [AccountNumber] [dbo].[udtAccountNumber] NOT NULL,

    [MRN] [dbo].[udtMRN] NOT NULL,

    [DOS] [dbo].[udtDOS] NOT NULL,

    [[UACReasonID] Char(1) NULL,

    [HL7Prof] Char(1) DEFAULT 'Y' NULL,

    [HL7Tech] Char(1) DEFAULT 'N' NULL,

    [Abstract] Varchar(11) DEFAULT No NULL,

    )

    go

  • It may not matter, but keep in mind in the current SQL architecture you only get 1000 partitions per object.

    You mention you've got 100 clients. How good is your sales staff?

    Though if you think they can get you to 10x let me know so I can invest in your company. 😉

  • wiseguy (8/13/2010)


    It may not matter, but keep in mind in the current SQL architecture you only get 1000 partitions per object.

    You mention you've got 100 clients. How good is your sales staff?

    Though if you think they can get you to 10x let me know so I can invest in your company. 😉

    Yeah, I've decided to scrap it to that extent. I think rather than many filegroups per client for a given table, I'm going to create filegroups for specific tables, one per table. No breaking out of clients. At least that way I can alleviate some of the activity on PRIMARY. I'll have to work with the application developers on how best to do data loads and processing.

    Funny thing is we don't really have a sales staff yet. It's all been word of mouth. They are trying to create that group, though. And while 1000 clients is rather aggressive, I wouldn't be shocked if we had 700-750 in 10 years. We've taken on about 40 in the last 12 months.

  • In 10 years SQL (probably) would support more partitions... but what you're discussing is "multi-tenant" architecture.

    Good article here re pros/cons of different strategies:

    http://msdn.microsoft.com/en-us/library/aa479086.aspx

  • wiseguy (8/13/2010)


    In 10 years SQL (probably) would support more partitions... but what you're discussing is "multi-tenant" architecture.

    Good article here re pros/cons of different strategies:

    http://msdn.microsoft.com/en-us/library/aa479086.aspx

    SPOT ON!!!!

    So nice there is a term that goes with what I'm trying accomplish here. I haven't been very good at remembering the names of strategies, techniques, etc. (especially with BI stuff), I just know how to do them. Now that I have a term to go with the design, I'm finding all sorts of good stuff out there.

    Thanks so much for the link. I see me spending a good amount of time on MS's architecture site in the upcoming weeks.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply