September 8, 2014 at 11:20 pm
Comments posted to this topic are about the item Basics of Dimensional Modeling
September 9, 2014 at 6:34 am
Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.
September 9, 2014 at 7:31 am
This is a great little article for beginners that have no idea what a dimensional model is. This really helps break it down in a short simple explanation.
September 9, 2014 at 9:30 am
Good quick read. And I have a question on the specific statement made at the end of the article.
First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.
So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?
Maybe I'm all confused. Thanks.
September 9, 2014 at 9:47 am
piotrka (9/9/2014)
Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.
I agree. Now what?
I suggest a 'basic' article maybe could also explain why 'dimensional' modeling is different from whatever else you might already know...
then show us how/why queries against this schema perform better than queries I might have written before I learned this new strategy.
September 9, 2014 at 10:47 am
qbrt (9/9/2014)
First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.
So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?
In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.
Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.
September 9, 2014 at 11:28 am
Chris Harshman (9/9/2014)
qbrt (9/9/2014)
First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.
So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?
In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.
Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.
Ok. Thank you for that explanation. I see were I was confused. The fact table is in "fact" Normalized. From what you're saying,
Your dimension tables are typically designed to incorporate columns from *multiple* tables...
the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.
Right? :unsure:
September 9, 2014 at 11:34 am
... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.
Right? :unsure:
Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]
September 9, 2014 at 11:41 am
Alvin Ramard (9/9/2014)
... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.
Right? :unsure:
Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.
LOL...:-D
Love your refund policy.
And thx for the reply.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply