August 6, 2007 at 4:50 am
Does anyone have the patience to explain what fact tables and dimensions are? I have an idea and am very familar with the basics - tables, joins, queries. Why are these terms slightly subverted for analaysis & business services?
Surely all tables are fact tables
What is the equivilent level of granularity for a dimension? a column, a row?
Thanks in advance if you cater to this foolish question!
August 7, 2007 at 6:01 am
I'll start by saying you should go get a book about data warehousing. There is no way I can explain this in a few short posts.
A very common data warehouse strategy is to build a Data Mart. This is a database full of information from one or more OLTP systems that will be used to build OLAP cubes (Analysis Services). One commonly used structure for this Data Mart is called the Star Schema. This is basically a single table (Fact) that contains your rows of information - let's call them units and dollars for this example. In addition, you would find a number of foreign key fields in this Fact table referring to the Dimension tables. These dimension tables would hold the information that describes the attributes about these facts.
Since this schema is not very relational, it is very easy to traverse the joins (they are all a single level and can only return a single dimension record), this structure makes it very easy to calculate aggregations on the fact information (units and dollars) for any attribute in a dimension table. So, if you had a dimension for Customer and one for Product, it would be very easy to calculate the units and dollars for each Customer, for each Product, of for a comination of Custmers and Products.
This calculation of aggregations is basically what you get from an OLAP cube. It not onl can hold the information, but it can hold some or all of these aggregations making it easier to display them to a user very quickly.
Again, there are lots of books about this. If you plan to do any large-scale reporting projects, pick one up and at least read the first chapter.
August 8, 2007 at 6:51 am
You'll have to do some book learning for sure, but to answer your question here goes:
A fact table contains information about things you can measure so in an OLAP cube the numbers in there become measures. For instance sales dollars.
A dimension table contains different ways to look at your facts. Do you want to look at the sales dollars by date? Then you need a date dimension table. Do you want to look at sales by employee? Then you should have an employee dimension table. Want to look up sales by customer's astrological symbol? Then build a customer dimension table with their dates of birth and astrological symbol. These would be examples of dimension tables in your star schema (or you could build them as a snowflake schema). They would all be joined to the fact table hopefully by using small ID fields.
August 8, 2007 at 6:55 am
Mark -
Similar to what Michael is explaining, we build a Fact table for certain things we wish to measure. When we invoice a Sales order, we capture several key pieces of information about the transaction in 1 row for that invoice line. Qty, $, did it ship on time, etc. The fact table will contain things you want to sum up to measure.
I think of dimensions as ways to slice the data. An invoice record has an Item on it. If we were building Bikes, each bike might belong to a Product Category. So slicing the Invoices by a couple of Dimensions - which type of Bikes are selling in which Regions of the country would be a common question.
Cubes are a very efficient way to expose summarized data. They require more thought when designing as they are less forgiving with less than perfect data or relationships. SSAS2005 is much more flexible than AS2000, but also requires more understanding of the architecture and expects you to create the proper relationships.
The Books Online and Tutorials might be a good start. And I totally agree with Michael - a good book would be a great start. And pick off a bit at a time. There is definately a bit of art involved when designing a good data warehouse. And a couple of different styles of how to handle some things.
Greg E
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply