April 12, 2004 at 9:42 am
We currently have a single Microsoft SQL Server db-driven .Net web app for a single city. It handles much data storage and statistical analysis. Soon we will be increasing this to several other cities in several states, each city needing its own separate database. However, we then need to provide access from both state and national levels for some users, allowing these users to see and query across certain databases.
So my question is basically what is the best overall design to implement for this scenerio. I have never done anything like this, but I would assume there is a "standard" general way to implement this. I am not asking for every detail to the answer, but need to get started in the right direction. Whatever information/advice someone could give me would be greatly appreciated. Even directions to the best book or training resources would be great.
Some details that might come in handy.
1. The individual cities MUST have their own db.
2. My thought is to replicate the single dbs into one huge db each night, allowing the individual cities' web apps to always refer to the individual db and then having one "overall" web app refer to the one "overall" db. Hopefully, this would allow us to use the same web code for ALL situations.
3. I suppose the other alternative is to keep all the dbs separate and query across multiple dbs. But this would seem to me to be too slow and require much change in web or db coding (don't want to support more than one app).
4. To combine multiple dbs, this would seem to necessitate many tables to use GUIDs (yuck!) as keys so that they will still be unique after the combining. Sounds ugly (and perhaps slow), but don't know what else can be done.
Sorry for the verbosity of this.
Thanks.
April 13, 2004 at 3:18 am
couple of things to add to the pot for you.
1. what will be done with the national database over and above the cities versions will be key to what needs to be delivered. it sounds like you may want(need) to go down the route of using the 'analysis server'.
2. when merging the cities versions into the national picture, is it 'workable' to mix data from 2 (or more) different days?.....if not, then you cannot 'publish for use' a new national database until all cities have uploaded the same days data...or else you make accessible to the end users, the control informiton...that cityx and cityy is displaying current data, but that cityz is 2 days old....this way the users won't be bvugging you to investigate 'timing differences'
3. you may also need to seperate the loading of data structures from data....(in essence your lookup tables)...ie....what happens when you have a product-x from cityx and also a product-x from cityy....and they are different product-x's....what happens to the national picture....do you load into 2 different products or attempt to get the data cleaned?
Finally....."but I would assume there is a "standard" general way to implement this".....there's no such thing in life....each business situation usually has it's own pecularities!!!....
but asking here, will improve the chance of getting close to a workable solution....
April 13, 2004 at 8:40 am
Several thoughts come to mind.
Look at the special needs of your national users. What do they particularly want to look at? Are they looking for combined stats, or particular sales information? Do they have to drill down to the finest granularity or are summaries good enough?
Does each city have the same reporting time frame? Does each State have different regulations that make the data different? In general I call these distinct areas business zones. I did a package for US realestate a few years back where some business zones were state wide, while others could be as small as a zip code. New York is terrible for instance, with regulation combinations that boggle the mind.
Time can be divided differently in different areas as well. Even if they dont appear different, there are differences in holidays, buying patterns, and traditions that your database can be sensitive to. (or not, dont get more complex than the business case requires)
One of the questions that 'big' data users usually have is what are the differences in sales by region and why do they exist? Answering this can be hard. Good database design will give the data a shape that will help answer these sorts of questions without getting in the way by pre-digesting too much.
That is about as vague a statement as I have ever given as an answer to a design question, but it is accurate. Striking the ballance between helping and getting in the way is key to good large data.
Michael Rempel
April 13, 2004 at 11:26 am
Andrew: 1. not sure what an 'analysis server' is. 2. different days do not matter. 3. the concern over non-unique keys across dbs is a big one. I don't want to use GUIDS, but may have to. I suppose each city could have an id and then a composite key would work. But I would like to think there is another way.
Michael: 1. no real special needs of national or state users. They essentially will have the same ability as the city but with more data. Summaries will not be enough. 2. Time frame and regional differences don't matter.
I am admittedly new to this and apologize for any lack of clarity in info and questions. My main question is to determine what the overall scheme should be to implement the use of individual dbs while allowing access across the dbs for some users. I had at first assumed that nightly creating one huge db would be necessary. But I am beginning to think that I don't need to create one major db, but can query across the dbs for "higher up" access.
Here is a streamlined list of what has been expressed by "the boss": 1. Single db per single city. 2. Single web app for all city, state, and national access. 3. At login, determine access rights for user. 4. Single city user to be given specific city db only. 5. State/National user to be given access to user-specified multiple dbs and allowed to query across dbs as if all data in one db.
Any help in determining a general overall scheme is greatly appreciated. And any direction to what resources I can turn to for study and investigation is also appreciated.
Thanks.
April 14, 2004 at 2:37 am
1. analysis server http://www.microsoft.com/sql/evaluation/bi/bianalysis.asp
2. you could use 'own-codes' instead of GUID's.... ie prefix all data with the ZIP-CODE of the relevant city...
so you would have 90210-PRxx and 90111-PRxx and 90212-PRxx for 3 versions of PRxx from 3 different cities....by going this route, you could make some sense of the source of the data in the national database.
3. "1. no real special needs of national or state users. They essentially will have the same ability as the city but with more data."....is this completely true for all national users? do the city users require any ability to modify/add/delete data...and will the national users require the same style of functionality...or are they both just 'reporting'.....or where is the original raw data coming from?
You could load all data into 1 national database....and use VIEWS to filter/restrict city users to just their areas of relevance....having multiple databases (one for each city) could be an administrative nightmare....
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply