June 10, 2008 at 2:26 pm
Hi gang,
Does anyone out there have good experience with Federating on SQL Server 2K5?
I am building a rather large (>1B rows) data warehouse for a client, and am looking into a federated solution. There would be a small amount of OLTP fallover in that one of the tables will contain critical lookup data that drives the presentation of a store front, however there would only be one query against the DB per user per session.
What I am looking for is:
1) Is federating the right answer? I am looking into this just because of the sheer volume of data and that it will be growing significantly on a yearly basis, and we need to be able to do timely data mining (usually instant to < 1hr turns on reports).
2) If federation is the path, what are the ramifications of creating a "federation of 1" as a starting point? Is this any different than setting up a solitary server?
3) Any gotchas I should be aware of during the build and deployment of the federation.
Thanks a ton for any input!
Chris
June 11, 2008 at 5:31 am
Setting this up is really not that complex. distributed partitioned views can replace a table relatively easily. You will not take a performance hit using views to access your database on a single server, so configuring it initially will not really hurt anything other than increasing your maintenance complexity (you have to change views if you change tables). It can make security a bit more complicated, but it is not too bad.
If you plan on using distributed partitioned views, you must make sure that you will have a kay that can be used for the partitioning that will work correctly. If you do not, your queries will check every server for data rather than just the one that has the data.
June 11, 2008 at 9:01 am
Hi Michael,
Thanks for the response! When you say I need a key (I am assuming that was a typo?) that will allow the cross-view lookups, will a typical auto-incrementing PK work or are there other details I should be aware of?
Cheers,
Chris
June 11, 2008 at 9:55 am
Yup - typo.
An Identity column becomes a problem because you will have a view hitting multiple databases, so to use an identity column you will have to specify an identity range on each server so they do not conflict.
In addition to this, you really want to try to get queries to be satisfied on a single server, so grouping data in a way in which you can get all of the records you need from one server is the goal. Then, once you have come up with some kind of partitioning key, you have to actually get this to work with your partitioned view. In SQL 2000, this was always a bit of a problem because of the way the partitioning key needed to be part of the primary key for the tables. I have never tried to set this up in 2005, so I am not sure if it works better now.
But, that's usually the biggest snag.
June 12, 2008 at 4:26 pm
Hmmm... now you have me thinking...
If the goal is to try to structure data and queries to hit only one server, it seems that (my perception of) the benefits of partitioning are lessened. I was hoping for a solution where as our data store grows and becomes a performance drag, we throw another server into the federation and reap the rewards of more horsepower, more storage space, etc.
Is this not the case?
Also, as far as keys go, what about looking at the data chronologically? Is that common/best practice for this? Perhaps structuring the data to span servers but time-period (2-year-old data on box 1, 1-year-old data on box 2, and current data on box 3)?
Thanks!
Chris
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply