Question about setting up an 'Info Hub' type environment...Cliffs and Details Inside

  • Cliffs

    1. Was asked about creating a HUB SQL server that would be the source of multiple other servers data to make the data available to multiple systems.

    2. The various systems, all varying in age and settings have different collations and data subsets.

    3. I am not sure how all the destination databases that would reside on the HUB server would potentially interact with each other with their collations being different. I would assume that each database would match the collation from it source server.

    4. Not sure what the server collation would be for the HUB server when taking into account all the different types of collations that would be replicating to the HUB server (collation conflicts is the first thing that comes to mind when dealing with different database collations vs. default server collation).

    5. Looking for advice.

    If you were asked to set up a 'Info Hub / Data warehouse' type server that would be used to share information across multiple systems how would you go about setting up the server in regards to collation?

    The number of systems that would interact with this 'Hub' have a variety of different collations: default, custom, case sensitive, unicode, non-unicode.

    The scope and requirements are still not defined to the best of my knowledge but I would presume that specific subsets of data would be replicated to the hub from each source server...to keep things separated from each other, they would go to their own database that resides on the HUB sql server.

    My initial thought would be that each 'database would be the same collation as the source of the data (simple enough)...

    But what would that mean as far as what collation the Hub sever is? I would guess you would want it to be as agnostic as possible so all the various databases could exist on the server and possibly interact with each other (for queries where the tempdb would be involved and you get the collation cannot be resolved error).

    In my own bubble I have dealt with individual collation conflicts and understand the concepts at that level, but when presented with a challenge of typing multiple systems together to act as a HUB to share data I do NOT know all the big picture items to keep in mind when installing the system from the ground up.

    If anyone has recommendations or comments I am all ears.

  • If you need tempdb to be able to deal with different collations, you're probably better off with either separate instances of SQL Server, or (probably best bet) multiple virtual servers.

    The first question I'd be inclined to ask is, what's the perceived benefit of consolidating all the separate servers into one "hub"?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I am told the direction / benefit of a 'hub' would that it would allow for a 'plug and play' environment where you can add another system or change out a system.

    Or put another way to me it removes the need for vendor dependencies...

    I.E. the main HR system isn't the source of xxx data, that data is moved to the HUB...removing the dependency from the HR system...and allowing all that need to share that information access to the HUB vs. the HR system.

    These are probably not the best examples of why this is the chosen approach, but I hope it is enough of a description to give you an idea...

    I am kind of on the outside of the planning side of this at the moment so I can't speak in-depth...I was asked how it would function behind the scenes...how I would get SQL 7 legacy machines to replicate to a SQL 2008 R2 box...etc...how I would deal with different collations...etc...

    I know the replication answer / limitation btw...just giving the questions that are running through my mind...

    Still a lot of planning ahead...I think the thought was the HUB would be one system (SQL server) but it might have to be grouped by what plays well together...or by systems that don't have to be co-mingled and can operate independent...

  • Not knowing exactly what you're dealing with, it almost sounds like you're looking at an enterprise data store or an operational data store setup. Chances are the data will need to go out in .csv, .xml, or .txt type feeds, unless you use web services to communicate between systems.

    Speaking from experience with a similar setup, there is a LOT of maintenance, architecture mapping, etc, that will be needed. Creating your hub is a large project best dealt with in small chunks (1 system at a time) and will require a lot of resources to put together. It may take 5 years or more, depending on how complicated it is and how many people you have working on it.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • All very good points indeed...

    The scope of this 'HUB' has a very large reach...or that is my initial impression...

    I hadn't thought about an 'export / import' type process, I guess it would depend on a lot of variables for each system...size, refresh schedule, etc...

    Thanks for the input.

  • In the case you're talking about, I think your best bet is build a proof-of-concept server, using Developer Edition probably, and test what happens if you have multiple databases with different collations set and all that. Build it right, and you can test a lot of things before you do a lot of work on the actual project. That'll allow you at least estimate the total effort needed, and get an idea of what will/won't work.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I had pondered on that thought...

    My only fear is that you go ahead and get all these different databases set up...and then how they are used isn't decided until x time frame down the road...or...additional functionality that wasn't originally in the plan is introduced that now breaks the previously working model...

    I guess that might be some of the boundary testing to verify what will and won't break process X

  • You might also want to make the point to those who are asking you about it that you can't even begin to give time or cost estimates without having more detail. This is when a project manager, a data architect, and an analyst would come in handy.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Later this week is a large kick off meeting so I will hopefully find out if there are other details or if I am going to be involved...

    Thanks for the helpful comments, it will definitely aid me when the discussion starts up.

    I will post back if there are additional details.

    Thanks again,

    Lee

  • Leeland (1/4/2012)


    I had pondered on that thought...

    My only fear is that you go ahead and get all these different databases set up...and then how they are used isn't decided until x time frame down the road...or...additional functionality that wasn't originally in the plan is introduced that now breaks the previously working model...

    I guess that might be some of the boundary testing to verify what will and won't break process X

    There is no way to ever tell what some business will decide to do with data outside of the normal constraints. It's like worrying about being struck by lightning. Anticipate the things you can, learn enough to have the flexibility to deal with the things you can't anticipate, and assume Murphy will slap you upside the head periodically with something that seems designed to knock all your plans askew.

    The best insurance against "nobody ever anticipated X" is to have as broad and deep a skillset as you can manage, and have lots of contacts with people who have other skillsets. This site (SQLServerCentral), LinkedIn, and a zillion others will be your best friends when some manager comes up with "a brilliant idea" that you absolutely, positively cannot support, because it's a pretty fair bet, no matter how zany and original the idea, someone, somewhere has been through it before you.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply