This is my first blog for PBI – DAX. I lately started working on DAX and trying to understand it in depth.
While looking for a DBA hat, getting different mid-destination in between.
While working with DAX, First challenge which I faced is with RANKX function. I had a requirement of finding the Top and Bottom 10. This is doable through Rankx function but it is so confusing to work with, but once you know how PBI and DAX work it become more convenient.
Before I proceed, There is an assumption I made which is, You know about Ranking functions in SQL.
So let’s begin :
Top N\Bottom N can easily be doable through SQL Server Ranking functions. But there are situations when you must do calculation on PBI end. For ex. You might have to work on multiple calculation in single dataset. In that case it won’t be advisable to write different query for each calculation. You can create measures for such purpose.
Although I will try to cover all the important aspect of RANKX, but still if you need to know more details, please refer.
https://docs.microsoft.com/en-us/dax/rankx-function-dax
Definition for Rankx is
RANKX(<table>, <expression>, <value>, <order, <ties>)
Table : A data over which an expression is calculated.
Expression : Consider this as the column on which Rank have to be determined.
Value : This is the most confusing part in Rankx function. This is an optional value, Whatever value defines here, Rank of that value will be identified. If you keep it blank. Then whatever you defined as “Expression”, it will be considered as Value.
Order : Asc or desc.
Ties : Dense or skip, It has same meaning as Dense_Rank and Rank in SQL.
Lets us take a very small example.
Consider this table as t_Emp
EmpID | MonthlySalary |
1 | 2120 |
2 | 3705 |
3 | 8487 |
4 | 1552 |
5 | 6195 |
Now let us identify the Rank on basis of salary.
Using RANKX as Computed Column
Go by the definition of Rankx, Rank can be applied using below DAX
RANKX(t_Emp, MonthlySalary , , asc, dense)
We use this as a computed column.
EmpID | MonthlySalary | RankComputedCol= Rankx(t_Emp, MonthlySalary, ,asc, dense) |
1 | 2120 | 2 |
2 | 3705 | 3 |
3 | 8487 | 5 |
4 | 1552 | 1 |
5 | 6195 | 4 |
To elaborate on how this work, understand steps followed by PBI for any dataset.
Computed column work for each respective row, see the example below.
When Dataset is processed, It goes row wise. Rankx function is called each time row is included in dataset.
First Row came in and by default it will have Rank 1.
When second Row comes, it is compared with all existing rows and Ranking is re-ordered accordingly. This process continues till the last row is achieved.
I hope above explains how to use Rankx function as computed column but things get more complex when a RANKX is used in measure.
Using RANKX in MEASURE
If i use the same formulae for measure, it will fail with below error.
“A single value for column ‘MonthlySalary’ in table t_emp cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result.”
This is because of the difference in Computed Column and Measure. If you are working in PBI. Then it is very important to understand, the difference between these two.
As shown on previous images, a computed column is Row-Wise calculation. However Measure is something which takes account the whole dataset for its calculation.
Now in this formulae for measure; Rankx(t_Emp, MonthlySalary, ,asc, dense), Value will be MonthSalary (as it is blank and replaced by Expression). This is an issue because Monthsalary in case of measure is a whole column rather than a single scalar value.
An illustration below on how it is read by DAX.
Due to this DAX will fail.
To correct this, We need to make ‘Value’ as scalar.
For this we first create a separate Measure as below and use this in Rankx
Measure_MAX = MAX(MonthlySalary) (You can use SUM as well)
Rankx(t_emp, MonthlySalary, Measure_Max,asc,dense)
But still problem is not solved. Now you will see data like.
To correct this use ALL function for dataset, Reason being for each row Measure will treat it as individual identity and by default a single row will have Rank 1. We need to make DAX understand the it need to consider the whole table.
Hence the final DAX query should be like below.
Rankx(ALL(t_emp), MonthlySalary, Measure_Max,asc,dense)
I hope this helps you in creating amazing dashboards.
A highly recommend you read 3 post series of the below article. It is the best i encountered on internet.
http://radacad.com/how-to-use-rankx-in-dax-part-1-of-3-calculated-columns
This particular blog is tough for me as well, considering this function is very hard to be describe conceptually. If i have made any mistakes or you believe there are some corrections required, Feel free to shoot your comments.