April 24, 2009 at 11:52 am
I just got through the first cube creation tutorial. The hand holding is very comforting, all seems logical. Then, the thought occurred to me: I could duplicate all that with this (admittedly trivial) SELECT from the Adventure Works DW database:
SELECT c.NumberChildrenAtHome,
c.MaritalStatus,
p.EnglishProductName,
SUM (i.SalesAmount) total_sales
FROM dbo.FactInternetSales i,
dbo.DimCustomer c,
dbo.DimProduct p
WHERE i.CustomerKey = c.CustomerKey
AND i.ProductKey = p.ProductKey
GROUP BY CUBE (c.NumberChildrenAtHome,
c.MaritalStatus,
p.EnglishProductName)
Am I missing something here? Wouldn't it be a whole heckuva lot easier and quicker just to write this kind of statement, then export it to Excel? Is there anything else besides to make all that clicking and dragging worth the time and carpal tunnel syndrome, aside from resume enhancement?
April 24, 2009 at 12:12 pm
You can't duplicate the scale and speed up multiple aggregations across different measures and dimensions. They cube pre-processes this, often from thousands or millions, or rows, and retrieves the results much quicker.
You can duplicate everything in Excel or T-SQL, but not the speed or flexibility of retrieving the data and performing the calculations.
April 24, 2009 at 12:18 pm
For a very simple cube like that tutorial, no, there's no real point to it.
For complex cubes working on millions of rows of relational data, having it pre-aggregated for various pivots into cubes can make the difference between queries that take hours to run and queries that take seconds to run.
The tutorial uses extremely simple data. Real life data warehouses can include hundreds (even thousands) of tables of data, and are often terrabytes in size. If you tried to run some of the data mining that they are used for on a relational database using traditional aggregates and group-bys, you'd die of old age before you got to the interesting information. (Well, maybe that's a slight exaggeration.)
In smaller businesses, cubes set up behind dynamic reports (SSRS), can allow management to view data in various ways that would otherwise require a DBA to write dozens of queries.
In the real world, SSAS doesn't get anywhere near the use that it's built to handle, so far as I can tell.
I've built a system with it, with real cubes and real data from a real OLTP database with valuable data in it, built dynamic reports on top of it, and had the managers ignore the data completely and ruin a whole business. The information was there for them to use, but they didn't bother to understand it much less use it to guide to business. I think that's actually the most common use of it, unfortunately.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 24, 2009 at 12:35 pm
OK, I see, it's real worth comes in DBA type optimizations.
However G, you do make the following point:
"In the real world, SSAS doesn't get anywhere near the use that it's built to handle, so far as I can tell."
That's the problem I see with all these fancy ways to optimize GROUP BY's. They get too complicated to understand, and they are certainly too complicated for 99.8% of all users to used dynamically.
Finally, I've found in real life that the overwhelming majority of reporting needs can be well satisfied by just doing a SELECT on the normalized tables. I just don't see that much demand for the drill down stuff that is so much fun to do.
April 24, 2009 at 1:38 pm
In small businesses, that's going to be true.
Now, try analyzing Microsoft's sales figures that way...:-)
For the most part, the heavy duty features are used in larger organizations with a lot more data.
If you're a one-person company, "analyzing your sales and marketing to work out target demographics", probably means looking at a paper file of last month's invoices, and sending thank you letters to each customer.
If you're a working in a 10-person company, it's more likely to involve some sort of database, and a simple select from. Might be a "database" like QuickBooks, but it's probably going to be electronic these days.
If you're working for a 200-person company, it's probably going to be a relational query of relational tables, and it's going to be a bit more complex. There's going to be a need for a report for each product line, some analysis of demographics of repeat customers, and so on. Real, in-depth market analysis is going to be annual, and is going to take a couple of weeks to get done, because T-SQL experts are going to be hand-building ad hoc queries as managers realize they need some data in a particular format.
If you're working for a 10,000-person company, it had better be a data warehouse, and the managers and analysts and consultants and such had better be able to slice and dice the data on their own, because otherwise the annual review will take more than a year to get done. And it won't be annual, it'll be on-demand, and often ad odd schedules because of managers and offices in different time zones and even on different continents.
If you're looking at SSAS from the viewpoint of a 10-person company, or something in that scale, you're probably looking at hunting rabbits with an elephant gun, and it sure does seem silly to have an elephant gun when you're doing that. If you're looking at it from the viewpoint of a multi-thousand employee company, it better not be an elephant gun, it better be an armada of battleships!
So, you get a free elephant gun when you buy an SQL Server license, but it's no-harm-no-foul if you put it in your gun rack and never use it.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply