September 15, 2014 at 1:49 am
Hi all
I am new to the BI side of SQL Server so I have a couple of newbie questions.
Am I correct in that ROLAP is when your data warehouse is stored as normal Relational SQL Database within your instance and MOLAP is when your data warehouse is stored as Cubes in Analysis services?
Is a "dimension" the same thing in both instances?
Thanks for the help.
September 15, 2014 at 8:29 am
SQLSACT (9/15/2014)
Am I correct in that ROLAP is when your data warehouse is stored as normal Relational SQL Database within your instance and MOLAP is when your data warehouse is stored as Cubes in Analysis services?
Is a "dimension" the same thing in both instances?
Broadly speaking you are right, but there are more nuances to it than just that. Have a read of this article which describes the three storage modes available http://www.sql-server-performance.com/2013/ssas-storage-modes/.
As for your dimension question, the term "dimension" is used to describe a set of data that is set out in a particular way to perform a particular function. For a more theoretical (and, you know, better) explanation of dimensions then head here http://databases.about.com/od/datamining/g/Dimension.htm or read The Data Warehouse Toolkit by Kimball.
In terms of an exact answer to your question - a dimension is a dimension whether it is stored on a database or on a potato 😀
September 16, 2014 at 12:48 am
Thanks for the help
I want to make sure that I have the basic architecture cleared up.
In a typical configuration, does MOLAP require a ROLAP database to be present? What I mean is, is the data taken from the OLTP database into a ROLAP database (Warehouse) and then moved into the cubes?
Thanks
September 16, 2014 at 3:13 am
Data can be taken from any database source - it's generally a good idea for it to be a data warehouse but it doesn't have to be. ROLAP is a storage mode for the cube once it's built - you wouldn't describe a data warehouse as "ROLAP" without this.
September 17, 2014 at 2:42 am
Thanks for the help
Getting my head around the architecture of data warehousing is proving to me tougher than I expected.
So basically, the data source is irrelevant when it comes to deciding between ROLAP and MOLAP as these options are only available once the cube has been created, and the only difference between them is the way the data in the cube is stored.
MOLAP - Multi-dimensional, creates its own db files based on the data source. - Correct?
ROLAP - Can you please elaborate on the topic of indexed views in this regard. Are the indexed views created on the data source or on the cube itself? In ROLAP, isn't the cube just a representation of the data source and not physically stored on the analysis server?
Thanks
September 17, 2014 at 4:23 am
It's a bit of a shift in thinking when you're used to something else. I really cannot recommend more to read The Data Warehouse Toolkit by Kimball. It will make a lot of things clear.
Remember though, a data warehouse is distinct from an OLAP cube. If it helps think of a data warehouse as just a data source for an OLAP cube and nothing more. Although the elements of both share nomenclature (dimensions, attributes etc.) they are architecturally very very different.
So MOLAP does store the data in it's own files, but they are not db files. As for ROLAP, to say the data source is irrelevant is misleading. If a data source for a cube is a working application database (not recommended by the way!) then the choice of ROLAP over MOLAP could prove to be a costly one. Less so if your source is an isolated data warehouse that is not constantly being updated and queried. As for indexed views, they are created on the data source and stored there. With ROLAP, the cube becomes a framework by which queries are made to the relational db so all data is stored there.
Also choosing between ROLAP and MOLAP is on an object by object basis. A cube can have elements of ROLAP and MOLAP (and even HOLAP) in it. For example, you might have a situation where a cube is taking a long time to process because of one partition. You could make this partition ROLAP and the rest MOLAP to solve the processing issue (at the sacrifice of query performance).
I've been working with SSAS for over 10 years now and I have to say I have never had to implement ROLAP at any point (it was considered for a client a few years back, to get around a storage limitation, but this was solved with a redesign). It's important to know about the different storage options and think of them at the design stage but generally speaking unless you have a special case, MOLAP is the way you will proceed.
Some resources:
September 19, 2014 at 2:57 am
Thanks for the help
I decided to attempt building my first cube using AdventureWorks, I used the FactInternetSales table and its related tables, I noticed a few things when looking at the cube:
For every measure group, there is a dimension - is this normal or did I fall off the wagon somewhere? What is the difference between the two?
I did pick up that each measure group can have its own storage mode and can be processed separately from the rest of the cube.
Thanks
September 19, 2014 at 3:13 am
You used the wizard didn't you? 😀 I think what's happened is that the wizard looks at your dsv and tries to determine what are fact tables (measure groups in the parlance of SSAS) and what are dimensions. If you're not careful it'll put everything into each category (since it creates a your table name record count for the tables it can't find any aggregate columns for) so you have to pick and choose what you want as measure groups and what you want as dimensions.
Step through this tutorial from MS - it will give you an idea of how to do it all: http://msdn.microsoft.com/en-us/LIBRARY/ms170208%28v=sql.105%29.aspx
The difference between measure groups and dimensions is the same as the difference between facts and dimensions. A fact table (measure group) is where transactional events are stored (sales, for example). The fact tables have attributes that reference dimension tables (order date, for example) and these dimensions are used to "slice" the fact data for more meaningful. Well that's a simplistic explanation - measure groups can be augmented by different types of calculations including non-additive and custom, but that's getting ahead of ourselves a bit!
Have a run through of that tutorial and give a shout back if you have more questions 🙂
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply