This article is intended to get you started on Multi-Dimensional eXpressions,
the Microsoft OLAP query language. MDX allows you to do complex reports quickly
on Very Large Databases. We have reports that run on many millions of records
with over 20 fields in under a second! OLAP is a powerful technology.
OLAP is a way of looking at data that has been totaled up, but allowing any
report on that data you require.
Suppose you were logging sales in SQL server, you could have a table [sales] which had [value], [customer], [product], [timeofsale], [salesrep], [store], [county] etc. If you wanted to know how many sales each sales rep had made you might use: select s.[salesrep], count (*) as 'total' from [sales] s group by s.[salesrep]
If you had a lot of sales this report would take a long time. Suppose you
wanted to report on something else at the same time? You might want to
see a cross-tab report of [salesrep] by [product]. This is messy to
write in SQL as you have to use case statements for each value of whichever
field you want on the columns.
What we may need to do is build all these totals at a time of low load and then
make them easily accessible for reporting all the time. This is what
analysis services does. In OLAP we would build a cube of this [sales] table. We call them cubes
because they can be visualized as such for simpler reports. As a cube can
have up to 128 dimensions however this metaphor quickly breaks down.
Supposing that we only want to report on 3 fields ([timeofsale], [salesrep] and
[product]); we could think of all the reports you could want on those fields as
a cube. Each edge of the cube would have all the values of a field along it,
and each face would be a cross tab of two of the fields.
An introduction to the rest of the terminology may be useful at this point:
- Dimension
- A dimension is one of the fields that you want to report on. Dimensions have a
tree structure, allowing complicated data to be reported on at different
levels. For instance [timeofsale] could be a dimension if you wanted to report
on it
- Measure
- What we are actually reporting on, be it sum, count or average.
- Level
- A level is a step along a dimension. In [timeofsale] we could have [Year],
[Month], [Day] and [Hour] allowing us to report on sales per hour or products
year on year.
- Member
- A member is a value of a level. In [timeofsale].[Year] we might have [2001],
[2002], [2003], or in [timeofsale].[Month] we have [March], [April], [May] etc
- Axis
- This is what you set to put [product] across the columns, [timeofsale] down the
rows or report on a [salesrep] to a page. An MDX statement can have up to 128
axis, although it is rare to use more than 2. The first 5 have names:
0 Columns
1 Rows
2 Pages
3 Chapters
4 Sections
- Dimensionality
- This is an attribute of a collection of members or levels which describes what
dimensions they are from and what order those dimensions are in.
- Tuple
- A tuple is a collection of members which all have different dimensionality
(so each is from a different dimension). Tuples have () brackets around them,
but don't need them when there is only one member. For instance
( [timeofsale].[Year].[2001], [product].[all products] )
- Set
- A set is a collection of tuples which all have the same dimensionality
(so all have the same dimensions in the same order). Sets have {} brackets
around them, and always need them. For instance
{ ( [timeofsale].[Year].[2001], [product].[all products]
) , ( [timeofsale].[Year].[2002], [product].[all products] ) }
- Function
- Functions can return Sets, Tupels, Members, Levels, Dimensions or values. We'll
come across some of the more useful ones later.
- Slice
- We may want to cross tab by two fields for some specific value of a third, for
instance [timeofsale] by [product] for a particular [salesrep].
When we picture the report as a cube we think of this filter as a slice into
the cube, to show the values on a new face.
So that's the lexicon done, now how do we use it? Well here is the structure of a
statement
select
{set 0} on axis(0) , /* this would be a block comment */
{set 1} on axis(1) , // this is a line comment
...
{set n} on axis(n)
from [cube]
where (tupel) | No axis or the where statement can share any of the same dimensions. |
So if we wanted a report of [product] on columns by [salesrep] on rows we would
execute:
select { ( [product].[productname].[product1] ) , ( [product].[productname].[product2] ) } on columns , { [salesrep].[repname]. members } on rows from [sales] | Note that I've used on columns and on rows rather than on axis(n) , because it is more clear. |
On the columns I have a set with two tupels from the same dimension. The ()
brackets are not required in this case because each tupel contains just one
member. The {} are required.
The rows has a function . members , which returns a set
with all the member of that level it. As . members returns
a set we don't need the {} brackets but again I've put them in.
Here is another one:
select { [product].[productname].[product1] : [product].[productname].[product20] } on columns , { [timeofsale].[Year].[2002]. children } on rows from [sales]
In this example I've used a range to give me a set of all the products
inclusive between [product1] and [product20] on columns. On rows I've used
another function called . children to give me all the
months in [timeofsale].[Year].[2002]
. members works on a level to give all the members at
that level.
. children works on a member to give all the members
below it (assuming there are any).
Two more useful features before we look at slices:
select non empty { [product].[productname]. members } on columns , { { [timeofsale].[Year].[2002]. children } * { [salesrep].[repname]. members } } on rows from [sales]
First of all the keyword non empty excludes any values
from that axis where no values are returned.
The * operator does a cross join between the two sets,
and works in a similar way to a cross join in sql. The final set will be made
up of every possible combination of the tuples in the two sets.
Now we will add a slice:
select { [product].[productname]. members } on columns , { [timeofsale].[Year].[2002]. children } on rows from [sales] where ( [salesrep].[repname].[Mr Sales Rep1] )
Note that the where criteria requires a tuple rather than a slice and that tuple
cannot contain any of the same dimensions as the sets on the axis
And to finish off this introduction a list of all the MDX functions along with a
brief summary of what they do. For more detail look them up in SQL Books Online
or MDSN:
Returns a set | |
---|---|
AddCalculatedMembers | Adds calculated members to a set. |
AllMembers | Returns a set containing all members of a specified dimension or level, including calculated members. |
Ancestors | Returns all the ancestors of a member at a specified distance. |
Ascendants | Returns the set of the ascendants of the member, including the member itself. |
Axis | Returns the set associated with the main axis. |
BottomCount | Returns a specified number of items from the bottom of a set, optionally ordering the set first. |
BottomPercent | Sorts a set and returns the bottom n elements whose cumulative total is at least a specified percentage. |
BottomSum | Sorts a set and returns the bottom n elements whose cumulative total is at least a specified value. |
Children | Returns the children of a member. |
Correlation | Returns the correlation of two series evaluated over a set. |
Crossjoin | Returns the cross product of two sets. |
Descendants | Returns the set of descendants of a member at a specified level or at a specified distance from a member, optionally including or excluding descendants in other levels. |
Distinct | Eliminates duplicate tuples from a set. |
DistinctCount | Returns the count of tuples in a set, excluding duplicate tuples. |
DrilldownLevel | Drills down the members of a set, at a specified level, to one level below. Alternatively, drills down on a specified dimension in the set. |
DrilldownLevelBottom | Drills down the bottom n members of a set, at a specified level, to one level below. |
DrilldownLevelTop | Drills down the top n members of a set, at a specified level, to one level below. |
DrilldownMember | Drills down the members in a set that are present in a second specified set. |
DrilldownMemberBottom | Similar to DrilldownMember, except that it includes only the bottom n children. |
DrilldownMemberTop | Similar to DrilldownMember, except that it includes only the top n children. |
DrillupLevel | Drills up the members of a set that are below a specified level. |
DrillupMember | Drills up the members in a set that are present in a second specified set. |
Except | Finds the difference between two sets, optionally retaining duplicates. |
Extract | Returns a set of tuples from extracted dimension elements. The opposite of Crossjoin. |
Filter | Returns the set resulting from filtering a set based on a search condition. |
Generate | Applies a set to each member of another set and joins the resulting sets by union. |
Head | Returns the first specified number of elements in a set. |
Hierarchize | Orders the members of a set in a hierarchy. |
Intersect | Returns the intersection of two input sets, optionally retaining duplicates. |
LastPeriods | Returns a set of members prior to and including a specified member. |
Members | Returns the set of all members in a dimension, hierarchy, or level. |
Mtd | A shortcut function for the PeriodsToDate function that specifies the level to be Month. |
NameToSet | Returns a set containing a single member based on a string expression containing a member name. |
NonEmptyCrossjoin | Returns the cross product of two or more sets, excluding empty members. |
Order | Arranges members of a set, optionally preserving or breaking the hierarchy. |
ParallelPeriod | Returns a member from a prior period in the same relative position as a specified member. |
PeriodsToDate | Returns a set of periods (members) from a specified level starting with the first period and ending with a specified member. |
Qtd | A shortcut function for the PeriodsToDate function that specifies the level to be Quarter. |
Siblings | Returns the siblings of a member, including the member itself. |
StripCalculatedMembers | Removes calculated members from a set. |
StrToSet | Constructs a set from a string expression. |
Subset | Returns a subset of elements from a set. |
Tail | Returns a subset from the end of a set. |
ToggleDrillState | Toggles the drill state of members. This function is a combination of DrillupMember and DrilldownMember. |
TopCount | Returns a specified number of items from the top of a set, optionally ordering the set first. |
TopPercent | Sorts a set and returns the top n elements whose cumulative total is at least a specified percentage. |
TopSum | Sorts a set and returns the top n elements whose cumulative total is at least a specified value. |
Union | Returns the union of two sets, optionally retaining duplicates. |
VisualTotals | Dynamically totals child members specified in a set using a pattern for the total label in the result set. |
Wtd | A shortcut function for the PeriodsToDate function that specifies the level to be Week. |
Ytd | A shortcut function for the PeriodsToDate function that specifies the level to be Year. |
Returns a tupel | |
Current | Returns the current tuple from a set during an iteration. |
ItemItem | Returns a member from a tuple or a tuple from a set. |
StrToTuple | Constructs a tuple from a string. |
ValidMeasure | Returns a valid measure in a virtual cube by forcing inapplicable dimensions to their top level. |
Returns a member | |
Ancestor | Returns the ancestor of a member at a specified level or at a specified distance from the member. |
ClosingPeriod | Returns the last sibling among the descendants of a member at a level. |
Cousin | Returns the member with the same relative position under a member as the member specified. |
CurrentMember | Returns the current member along a dimension during an iteration. |
DataMember | Returns the system-generated data member associated with a nonleaf member. |
DefaultMember | Returns the default member of a dimension or hierarchy. |
FirstChild | Returns the first child of a member. |
FirstSibling | Returns the first child of the parent of a member. |
Ignore | Prevents further recursion along the dimension |
Lag | Returns a member prior to the specified member along the member's dimension. |
LastChild | Returns the last child of a member. |
LastSibling | Returns the last child of the parent of a member. |
Lead | Returns a member further along the specified member's dimension. |
LinkMember | Returns a hierarchized member. |
Members | Returns the member represented by the string expression |
NextMember | Returns the next member in the level that contains a specified member. |
OpeningPeriod | Returns the first sibling among the descendants of a member at a level. |
Parent | Returns the parent of a member. |
PrevMember | Returns the previous member in the level that contains a specified member. |
StrToMember | Returns a member based on a string expression. |
Returns a value or something else | |
Aggregate | Returns a calculated value using the appropriate aggregate function, based on the context of the query. |
Avg | Returns the average value of a numeric expression evaluated over a set. |
CalculationCurrentPass | Returns the current calculation pass of a cube for the current query context. |
CalculationPassValueCalculationPassValue | Returns the value of an MDX expression evaluated over the specified calculation pass of a cube. |
Call UDF (args) | Executes the string expression containing a user-defined function. |
CoalesceEmptyCoalesceEmpty | Coalesces an empty cell value to a string or number. |
Count | Returns the number of dimensions in a cube, the number of levels in a dimension, the number of cells in a set, or the number of dimensions in a tuple. |
Covariance | Returns the population covariance of two series evaluated over a set, using the biased population formula. |
CovarianceN | Returns the sample covariance of two series evaluated over a set, using the unbiased population formula. |
Generate | Evaluate a string expression for each member of a set |
Iif | Returns one of two numeric or string values determined by a logical test. |
LinRegIntercept | Calculates the linear regression of a set and returns the value of b in the regression line y = ax + b. |
LinRegPoint | Calculates the linear regression of a set and returns the value of y in the regression line y = ax + b. |
LinRegR2 | Calculates the linear regression of a set and returns R2 (the coefficient of determination). |
LinRegSlope | Calculates the linear regression of a set and returns the value of a in the regression line y = ax + b. |
LinRegVariance | Calculates the linear regression of a set and returns the variance associated with the regression line y = ax + b. |
LookupCubeLookupCube | Returns the value of an MDX expression evaluated over another specified cube in the same database. |
Max | Returns the maximum value of a numeric expression evaluated over a set. |
Median | Returns the median value of a numeric expression evaluated over a set. |
MemberToStr | Constructs a string from a member. |
Min | Returns the minimum value of a numeric expression evaluated over a set. |
Name | Returns the name of a dimension, hierarchy, level, or member. |
Ordinal | Returns the zero-based ordinal value associated with a level. |
Predict | Evaluates the string expression within the data mining model specified within the current coordinates. |
Properties | Returns a string containing a member property value. |
Rank | Returns the one-based rank of a tuple in a set. |
RollupChildren | Scans the children of the member parameter and applies the string expression operator to their evaluated value. |
SetToArray | Converts one or more sets to an array for use in a user-defined function. |
SetToStr | Constructs a string from a set. |
Stddev | Alias for Stdev. |
StddevP | Alias for StdevP. |
Stdev | Returns the sample standard deviation of a numeric expression evaluated over a set, using the unbiased population formula. |
StdevP | Returns the population standard deviation of a numeric expression evaluated over a set, using the biased population formula. |
StrToValueStrToValue | Returns a value based on a string expression. |
Sum | Returns the sum of a numeric expression evaluated over a set. |
TupleToStr | Constructs a string from a tuple. |
UniqueName | Returns the unique name of a dimension, level, or member. |
UserName | Returns the domain name and user name of the current connection. |
ValueValue | Returns the value of a measure. |
Var | Returns the sample variance of a numeric expression evaluated over a set, using the unbiased population formula. |
Variance | Alias for Var. |
VarianceP | Alias for VarP. |
VarP | Returns the population variance of a numeric expression evaluated over a set, using the biased population formula. |
A good place to go start trying the ideas in this article is the Foodmart
2000 database that comes as a sample when you install Analysis services.