We continue our examination of DAX function / iterator pairs, in this Level of the Stairway to PowerPivot and DAX series, with theCOUNT() and COUNTX() functions. COUNT() and COUNTX() support, respectively, the need to return a count of a number of cells in a column containing numbers, and a requirement to count the number of rows containing a number (or an expression that evaluates to a number) when evaluating a number over a table.
As a part of our introduction, we will discuss ways in which we can employ COUNT() and COUNTX(); we'll look at each function in standalone scenarios, as well as in situations where we combine it with other functions, to meet the business requirements of clients, employers or peers in our own environments. During our exploration of the COUNT() and COUNTX() functions, we will:
- Discuss DAX function / iterator function pairs, in general;
- Examine the syntax involved in exploiting each;
- Undertake illustrative examples of the use of each function in practice exercises;
- Briefly discuss the results datasets we obtain in each of the practice examples.
As is always the case in my introductions to DAX functions throughout this Stairway series, we will examine the output we obtain, in this specific Level employing COUNT() and COUNTX() via calculated columns we construct within the PowerPivot window, in the practice session that follows our overview and explanation of the purpose and operation of each function. We will further create calculations that put both COUNT() and COUNTX() to work in PivotTable calculated fields, to further examine the behavior that we have observed in the calculated columns created in the PowerPivot window. Along the way, as appropriate, we will follow another path we have taken consistently within the Levels of this series: we will compare and contrast differences in use and behavior of the functions in calculated columns and calculated members in general, suggesting criteria to consider in choosing which of these calculation types to employ to meet the business requirements we encounter.
Preparation for the Levels of the Stairway to PowerPivot and DAX Series
To obtain the most benefit from the Levels of the Stairway to PowerPivot and DAX series, you will need to have enabled the Excel 2013PowerPivot (embedded with Excel 2013 out-of-the-box) add-in on your PC. You can also use Microsoft Office Excel 2010 with the respective PowerPivot add-in - available for free at www.PowerPivot.com, as of this writing - to complete most steps of the levels of this Stairway, although naming conventions and screen appearances might differ a little from what you see in the prospective articles of this series. In many cases, you will also need to have installed SQL Server 2008 or above (including Analysis Services) and the respective relational and Analysis Services samples on your local machine, or have the client portions of each on your machine, with access to the rest on a server. To perform all steps of preparation for the Levels of Stairway to PowerPivot and DAX, see the installation section of the charter Level of this series, Level 1: Getting Started with PowerPivot and DAX.
NOTE: In this Level of the series, Excel 2013 with the PowerPivot add-in enabled, together with SQL Server 2012, is in use on a Windows 8.1 PC. The above installation instructions are still largely applicable for SQL Server, with the samples for SQL Server 2012 being obtainable from the same sources noted.
Preparation for the Practice Exercises in this Level
Once we've installed and configured Excel 2013, enabling the embedded PowerPivot add-in as part of the process, and have SQL Server 2008 (or later) and the samples noted above installed, we're ready to perform the practice exercises within this Level. Importing the sample database and taking other preparatory steps are critical to providing the environment required to successfully complete this Level, and should be performed before beginning.
Because we will work with different data sources throughout this series, the steps of environment preparation will be repeated in various Levels. I do this primarily to make each Level a freestanding session that can be completed independently of other Levels, so that you can focus upon gaining exposure to the specific skills required to meet your own immediate requirements efficiently, with all steps I undertook "in one place." Where appropriate, references to functions, procedures and other details will be supplied, but the objective of each Level is to focus upon one or more specific DAX functions and / or PowerPivot / PivotTable procedures to satisfy needs that mirror requirements we may encounter in our own business environments.
Let's open PowerPivot and import our data by taking the steps below. This will put us in position to begin learning the material in this Level.
- From the Start menu, select Microsoft Excel, and begin with a blank workbook.
- Above the Excel ribbon, from within the Home tab of a new worksheet, click the PowerPivot tab, as shown.
The PowerPivot ribbon appears.
- Click the Manage launch button, appearing at the left of the newly appearing PowerPivot toolbar, as shown.
The PowerPivot window, containing its own ribbon, opens atop the existing Excel spreadsheet, assuming the name of the workbook as its own name, as depicted.
As we noted in Level 1: Getting Started with PowerPivot and DAX, it is in the PowerPivot window that we load and prepare the data with which we will be working (or will continue working, if data is already added to the workbook). We will typically build a relational model here (even when we import other-than-relational sources, such as Analysis Services data, text files, RSS feeds, and so forth). As we also noted in Level 1, the PowerPivot window displays the tables on individual, tabbed sheets. This is the central place where we import tables, create relationships, maintain column data types and formats, and view the data that underlies our data model, among other actions.
- Next, we'll designate a source from which to import data.
- On the Home tab of the PowerPivot Window, click the downward-pointing selector arrow on the right side of the From Database button, within the Get External Data group on the ribbon.
Select From SQL Server on the cascading menu that appears next, as shown.
The Table Import Wizard dialog opens next.
- In the top input box, titled Friendly connection name, type (or copy and paste) the following (or another appropriate label):
AdventureWorksDW2012
- Click the selector (downward pointing arrow) on the right side of the box titled Server name.
PowerPivot begins a scan of the machine to detect, and return to the selector, the available server choices.
- Select the appropriate server for your local environment, or type in the server name / "localhost," as appropriate.
- Enter the authentication as required for the local environment (ideally selecting Use Windows Authentication).
- Select AdventureWorksDW2012 (or your appropriately versioned equivalent) using the dropdown selector to the right of the box titled, Database name, at the bottom of the dialog.
The Table Import Wizard dialog, with our input, appears similar to that depicted below.
- Click the Test Connection button underneath the Database name selector.
A message box appears, indicating that the test connection has succeeded.
- Click OK to dismiss the message box.
- Click the Next button at the bottom of the Table Import Wizard dialog.
- On the dialog that appears next, labeled Choose How to Import the Data, leave the radio button at its default of Select from a list of tables and views to choose the data to import, as shown.
- Click the Next button.
PowerPivot loads the tables and views from the AdventureWorksDW2012 database into the Select Tables and Views page that appears next.
- Select each of the tables listed in Table 1 by clicking the checkbox to the immediate left of the respective table listing in the Source Table column of the dialog, and change its name, via the Friendly Name in the dialog.
Table to Select (Source Table) | Change Name to (Friendly Name): |
DimCustomer | Customer |
DimDate | Date |
DimProduct | Product |
DimProductCategory | Product Category |
DimProductSubcategory | Product Subcategory |
DimPromotion | Promotion |
dimReseller | Reseller |
FactInternetSales | Internet Sales |
FactResellerSales | Reseller Sales |
The Select Tables and Views dialog appears, as partially shown, with the tables listed selected and designated with their new names.
The idea in these immediate steps, once again, is to import enough information into our model to allow us to do some illustrative analysis surrounding the business activities conducted by our hypothetical client, the Adventure Works Cycles organization.
- Click the Finish button.
The import process runs, and then we see a "Success" message, complete with a Details list that indicates population by our choices.
- Click Close.
The Table Import Wizard is dismissed, and we arrive at the PowerPivot window once again, where we see the imported data as partially depicted.
Note that a tab for each imported table has been created in the PowerPivot window (the tabs appear in the bottom left of the window, as seen above). What we are now seeing are not Excel tables / spreadsheets, but a view of the efficiently compressed columnar database that PowerPivot uses to store imported tables in memory. For more detailed information about the PowerPivot window itself, please see Level 1: Getting Started with PowerPivot and DAX.
- Click File, to the immediate left of the Home tab.
- Select Save As from the dropdown menu that appears, as shown.
- Save the worksheet as ST_DAX08-1.xlsx, placing it in a convenient location.
- Leave the PowerPivot Window open for the practice session below.
We are ready, at this stage, to get some hands-on exposure to another couple of DAX functions, COUNT() and COUNTX(). We will also revisit functions that we introduced in previous Levels, while continuing to build experience with the general use of PowerPivot for Excel to meet business requirements, in the sections that follow.
The COUNT() Function
Introduction
According to the Microsoft Developer Network (MSDN), the DAX COUNT() function "counts the number of cells in a column that contain numbers." COUNT() is a member of the Statistical function group, as displayed in the formula bar for PowerPivot and PowerPivot PivotTables, many of the members of which are identical to functions in Excel. The function, like many others, is often used in combination with other DAX functions.
Many of us have worked with the COUNT() function in standard Excel, where, like SUM(), it is a relative "cornerstone" function. And, while anytime we drag a field into a PivotTable (either a "classic" PivotTable or a PowerPivot PivotTable) Values drop zone the default summary that is applied is Sum, we can choose from other treatments, including Count, as we can verify by right clicking upon the newly created sum of a value we have dropped into the Values drop zone, and then selecting Summarize By (whereby we can change the type of aggregation of a given Value). In the example below, we note the default behavior of the Sum option being checked as being in force, but that we can modify the aggregation behavior by selecting one of the other options appearing below Sum on the cascading menu.
COUNT() simply generates a count of the number of cells in a column that contain numbers. The cells must contain either a numeric or date value, or COUNT() simply ignores it in coming to the total that it delivers. The value delivered byCOUNT() therefore is not necessarily the number of rows in a table with a column subjected to COUNT(). The DAX COUNTROWS() function is the appropriate one to use to return number of rows, regardless of the occupied / empty state of each.
NOTE: For more information about the COUNTROWS() function, together with numerous examples of its use, both standalone and in conjunction with other functions, see Level 2: The DAX COUNTROWS() and FILTER() Functions and More About Calculated Columns and Measures here in my Stairway to PowerPivot and DAX series.
It will seem obvious that, where there are no rows to count in a given column, COUNT() returns a blank. If rows do exist, but none of the rows meet the criteria for counting, COUNT() returns a zero ( 0 ) value. A row that contains text that cannot be translated into a number is not counted.
When we employ COUNT() across date values, the returned result is an integer, just as it is in a count of numerical values. When cells containing a value zero (0) exist inside columns specified within the COUNT() function – as opposed to cells that are blank (which are not counted at all) – those cells are included.
It is significant to note that the COUNT() function calculates the count of all values (with the above noted exceptions) in a specified column. If a need exists to count the number of rows containing a specific numeric value, or an expression evaluating to a specific numeric value, we can do so via the COUNTX() function (discussed in its own section of this Level).
We will explore the syntax for theCOUNT() function after a brief discussion in the next sections. We will then gain some hands-on exposure to its use, within practice examples constructed to support hypothetical business needs, in the Practice section later in this Level. This will allow us to activate what we explore in the Discussion and Syntax sections.
Discussion
To restate our initial explanation of its operation, the COUNT() function returns a count of the rows containing date or numeric values within a specified column. (We can specify columns containing other data types, but only rows with date or numeric values are counted.) The output of COUNT() is a whole number representing said row count, and the function can be used with equal utility within PowerPivot calculated columns or PivotTable measures. COUNT() is often employed within other DAX functions.
NOTE: As we address in the section below, the COUNTX() function allows us to extend the operation of COUNT(), and take, as an argument, an expression that is, in turn, evaluated over a table. This enables us to perform independent calculations, and to subsequently take a count of the rows containing either a number / date or for which the stipulated expression evaluates to a number / date.
Syntax
Syntactically, in using the COUNT() function to return a count of rows, we specify the column of values we wish to count within the parentheses to the right of the COUNT keyword. As an illustration, we'll cite the use of COUNT() within the Reseller Sales tab of the PowerPivot window which we have established in our earlier preparation. Let's say we employ COUNT(), via a calculated column, in the following manner to count the rows containing numbers in the TotalProductCost column:
=COUNT('Reseller Sales'[TotalProductCost])
Were we to examine the new calculated column on the tab, we would see something like this (I've inserted the calculated column in the tab to be adjacent to theTotalProductCost column, from which it is derived):
Reaching beyond the partial view we see above, and extending to all rows in the column, the calculation returns the desired outcome, based upon a count of rows (in this case all rows) in the TotalProductCost column containing values.
The COUNTX() Function
Introduction
According to the Microsoft Developer Network (MSDN), the DAX COUNTX() function "…takes two arguments. The first argument must always be a table, or any expression that returns a table. The second argument is the column or expression that is searched by COUNTX()." This enables us, in essence, to perform calculations and then take the count of rows containing the calculated values. Like COUNT(), COUNTX() is an "aggregate" member of the Statistics function group, as displayed in the formula bar for PowerPivot and PowerPivot PivotTables, many of whose members are identical to functions in Excel. COUNTX(), like other iterator ("X") functions, and by its very nature, lends itself to use in combination with other DAX functions.
As we've noted in other Levels within the Function / Iterator Function Pairs subseries of the Stairway to PowerPivot and DAX series, iterator functions typically contain a table expression (the first parameter specified within the function), for which a calculation, based upon an expression specified by a second parameter, is performed iteratively for each row of the table. The ultimate result is generated by the application of the respective aggregation function (including SUM, AVERAGE, COUNT, MAX, and MIN) to the dataset returned by the first and second expressions. We expose the behavior of other DAX "X" functions in independent examinations of each of the respective function / x-function pairs published in parallel to this review of the COUNT() and COUNTX() functions.
In cases where there are no "qualified" (meaning "containing number / date values") rows to count, COUNTX() returns a blank. If rows do exist, but none of the rows meet the criteria for counting, COUNTX() returns a zero ( 0 ) value. With regard to nonnumeric values, the COUNTX() function ignores any empty cells, or logical values contained within the specified column.
No count action at all can be accomplished for a column that solely contains text, and, in such cases, COUNTX(), like COUNT(), returns blanks. Finally, when cells containing a value zero (0) exist inside a column specified within the COUNTX() function – as opposed to cells that are blank (which are not counted at all) – those cells are included in the whole number total delivered by the function.
NOTE: The COUNTA() or COUNTAX() functions support counts of text or logical values, To gain exposure to the details involved with performing such counts, see the associated Level in this series.
COUNTX() is useful anytime we want to filter a table we specify, based upon criteria we specify in a second expression, and then return a count of all the "qualified" values in the resulting column. If we do not need to apply qualifying criteria, we use the DAX COUNT() function, which, like SUM(), is similar to the Excel function of the same name, except that it references a column, versus a range.
We will briefly discuss the COUNTX() function in the section that follows, after which we will explore COUNTX() from a syntax perspective. We will then activate what we have explored in the Discussion and Syntax sections, through hands-on exposure to its use, within practice examples constructed to support hypothetical business needs.
Discussion
As we have noted, the COUNTX() function iterates through a specified table expression – the function's first parameter. It then employs the second parameter, a column (or an expression that delivers scalar output) for which we seek to generate a count, performing the expression for each row, and then all the respective date / number results obtained.
Let's look at some syntax illustrations to further clarify the operation of COUNTX().
Syntax
Syntactically, in using the COUNTX() function to evaluate expressions for each row of a table, and then tabulate the resulting set of values and calculate its whole number count, we take a table expression (first argument of the function), and the expression (second parameter of the function) that we seek to apply to each row of the table (specified in the first argument), within parentheses, to the right of the COUNTX keyword. The general syntax is shown in the following string:
COUNTX(<table>, <expression>)
Putting COUNTX(), or any of the "X iterator" functions, for that matter, to work is intuitive, once we grasp the purpose. When using the function to return the count of the results of the expression ("<expression>" in the syntax string above) upon each of the rows of the specified table ("<table>" in the syntax string above), we simply enclose the table and expression to the right of the COUNTX keyword, as we have noted. The COUNT function is then applied to the values returned for the respective rows, resulting in a single value that represents a count of those values.
As an example, let's get a feel for the general use of COUNTX() within the Reseller Sales tab of the PowerPivot window, where we have performed the imports described in the Preparation section above. Let's say we want to put COUNTX() to use in a scenario where we wish to do something along the lines of this:
=COUNTX('Reseller Sales',[TaxAmt])
This calculates the count of all rows within the Reseller Sales table that have a Tax Amount (again, in this case, all rows of the table are populated with a number). A partial view of the results of this calculation appears as shown:
In this simple example, we can see how COUNTX() works., allowing us to take, as an argument, an expression that is, in turn, evaluated for each row in a table (in this case, Reseller Sales) . We'll see a more sophisticated example in the Practice section below, where we see a count derived from a table subset of one of our imported tables, which has been subjected to a filter.
We'll get some hands-on practice with COUNT() and COUNTX() in the sections that follow.
Practice
Let's reinforce our understanding of the basics by putting the COUNT() and COUNTX() functions to work within the definition of some calculations. We'll do this much as we have in the other Levels of the Stairway to PowerPivot and DAX series, first via calculated columns within a tab of the PowerPivot window, and then via calculated fields within a PivotTable of a worksheet within the associated Excel workbook.
Also in keeping with our approach in other Levels of the series, we will examine the use of each of the functions in combination, where useful, with functions we introduce in other Levels. The intent, as in all the practice sessions we undertake together within this series, is to demonstrate the operation of the functions we examine in a straightforward, memorable manner. We will turn to the Excel worksheet we prepared earlier as a platform from which to construct and execute the DAX we examine, and to view the results we obtain.
PowerPivot Calculated Columns
Let's start with a straightforward example: Let's say that we wish to perform a simple summarization to obtain the count of rows containing a value for the Product Standard Cost on the Product tab within the PowerPivot data sheet view that we established in our preparation steps earlier.
To do so, we will take the following steps:
- Return to the worksheet we created in the Preparation for the Practice Exercises in this Level section above, ensuring that the PowerPivot Window is open, where we left it earlier.
- Click the Product tab.
- Click the heading of the column labeled "FinishedGoodsFlag," which appears to the immediate right of the StandardCost column on the Product tab, as shown.
- Right-click the column label for the FinishedGoodsFlag column.
- Select Insert Column from the context menu that appears.
A new calculated column, labelled CalculatedColumn1, appears.
- Select the function button ("fx") to the left of the formula bar next.
The Insert Function dialog appears.
- Click the downward pointing arrow on the selector appearing atop the dialog under the Select a category label.
The following DAX function categories appear:
- All
- Date & Time
- Math & Trig
- Statistical
- Text
- Logical
- Filter
- Information
- Parent-Child
- Select the Statistical category, on the Select a category dropdown, as depicted.
- In the Select a function list that populates for the Statistical category, select the COUNT() function, as shown.
- Click OK to confirm the selection and dismiss the Insert Function dialog.
The COUNT keyword, proceeded by the "=" operator and followed by a left parenthesis ( "(" ), appears in the formula bar, with a function usage tooltip appearing just underneath the function.
We will first create a simple calculation that generates a count of the entire Standard Cost column of the Product tab, which we will name "Count - Product Cost."
- Following the left parenthesis inserted by our selection of COUNT, begin to type in the word "Product."
- Scroll down and double-click / select the Product[StandardCost]column within the dropdown list.
- Once the selector is dismissed, and =COUNT(Product[StandardCost] appears in the formula bar, type a right parenthesis ( ")" ) at the current end of the syntax, to close the COUNT() function.
The syntax, at this point, should be as shown here:
=COUNT(Product[StandardCost])
- Once the syntax shown above appears in the formula bar, press the ENTER key.
The new calculated column is populated, in every row, by the count of all values in the StandardCost column of the Product tab. Let's next rename the column to its intended title.
- Right-click the new column header label (currently at its default of "CalculatedColumn1"), and select Rename Column from the cascading menu that appears.
- Name the newly added column "Count - Product Cost" to provide a new, distinct label.
The newly-named column, with updated formula results, appears as partially depicted below.
We thus see how we can employ COUNT() to deliver a simple count within the PowerPivot window. Next we'll take a look at the same operation, in a similar scenario, using COUNTX().
COUNTX() the" X-function" version of COUNT(),is also a Statistical function. As we noted, in Level 6: Function / Iterator Function Pairs: The DAX SUM() and SUMX() Functions, was the case with the SUM() function, the COUNT() function, like other functions in both Excel and in DAX, is limited to the use of a single column as an argument. The X-function counterparts (this time, COUNTX() being the example) to several such "standard" functions (the respective example being COUNT(), in the present case) enable us to construct arguments with multiple columns.
Let's look at an example that shows how we can use COUNTX() to determine the number of a set of values in a given table, as filtered by criteria applied within a related table. In this example, we seek to count those Products listed in the Product table that belong to the Product Subcategory group (the membership of which is defined within the related Product Subcategory table) with the English name of "Bottles and Cages." Our objective will be to count the Standard Cost values appearing on the Product tab for those items meeting this criteria.
To meet the requirement, we'll take the following steps from the Product tab:
- Click the top of the column labeled "WeightUnitMeasureCode," which appears to the immediate right of the ProductSubcategoryKey column, as shown.
- Right-click the column label for the WeightUnitMeasureCode column.
- Select Insert Column from the context menu that appears, as we did in the previous exercise.
A new calculated column, labelled CalculatedColumn1, appears, as we noted in earlier steps.
- Select the function button ("fx") to the left of the formula bar next, as we did earlier.
The Insert Function dialog appears, once more.
- Click the downward pointing arrow, once again, on the selector appearing atop the dialog under the Select a category label.
- Select the Statistical category.
- In the Select a function list that populates for the Statistical category, select the COUNTX() function, as shown in Illustration 28.
Click OK to confirm the selection and dismiss the Insert Function dialog, as we did earlier.
The COUNTX keyword, proceeded by the "=" operator and followed by a left parenthesis ( "(" ), appears in the formula bar, with a function usage tooltip appearing just underneath the function, as we noted in working with the COUNT() function earlier.
We will continue with the creation of the calculation we have described; this time, we will generate "No. Std-Costed Selections" which returns the total number of Standard Cost values assigned to Products belonging to the Bottles and Cages Product Subcategory, within the Product tab. (Our intent is to count the "number of Standard Cost values appearing in the Products tab, filtered by the Product Subcategory Bottles and Cages.") So we are creating a calculation that performs count of the rows of a column, with the rows filtered for criteria that exists within another column in another table, which we will specify in the appropriate argument of the function, as we shall see.
- Following the left parenthesis inserted by our selection of COUNTX, begin to type in the word "FILTER."
NOTE: For a detailed explanation of the operation of the DAX FILTER() function, see Level 2: The DAX COUNTROWS() and FILTER() Functions and More About Calculated Columns and Measures.
- Select the FILTER function within the dropdown list by double-clicking the function, as depicted in Illustration 39.
- Once the selector is dismissed, and =COUNTX(FILTER( appears, begin to type in the word "Product."
- Scroll down and double-click the Product table within the dropdown list, as shown.
- Once the selector is dismissed, and =COUNTX(FILTER(Product appears in the formula bar, type a comma ( , ) after "Product".
The syntax in the function bar, at this point, should be as shown here:
=COUNTX(FILTER(Product,
After the comma that we have added at the end of the expression, begin to type in the word "Related."
- Double-click the RELATED function within the dropdown list, as depicted.
NOTE: For a detailed overview of the DAX RELATED() function, see Stairway to PowerPivot and DAX - Level 1: Getting Started with PowerPivot and DAX.
- Once the selector is dismissed, and =COUNTX(FILTER(Product,RELATED( appears, begin to type in the words "Product Subcategory."
- Scroll down and double-click / select the 'Product Subcategory'[EnglishProductSubcategoryName] column within the dropdown list, as shown.
- Once the selector is dismissed, and =COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName] appears in the formula bar, type a right parenthesis ( " ) " ), and then an equals ( " = " ) sign, after 'Product Subcategory'[EnglishProductSubcategoryName].
The syntax in the function bar, at this point, should be as shown here:
=COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName])=
After the equals sign that we have added at the end of the expression, begin to type in the following string:
"Bottles and Cages"),
The syntax in the function bar, at this point, should be as shown here:
=COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName])="Bottles and Cages"),
- After the comma at the end of the string we have added, begin to type in the word "Product," once again.
- Scroll down and double-click / select the Product[StandardCost] column within the dropdown list, as shown:
- Once the selector is dismissed, type a right parenthesis ( " ) " ) after Product[StandardCost].
The completed function should be as shown here:
=COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName])="Bottles and Cages"), Product[StandardCost])
…and appears as presented in Illustration 35.
In effect, we are submitting a filtered, related table as the first argument of the COUNTX() function – a table filtered, via the DAX FILTER() function, to return, via the RELATED() function, only the rows in the Product table for which the related rows in the Product Subcategory table meet the condition Product Subcategory = "Bottles and Cages." (Note that the FILTER() function is applied to the Product table using a value looked up in the Product Subcategory table.) The count is then performed on the rows in the resulting table for which a Standard Cost exists (in this case three (3), as we shall see).
- Press the ENTER key to accept the syntax we have input, and to populate the new calculated column.
Let's name the new calculated column to distinguish it, as we have done with the calculated columns we have created before.
- Right-click the new column header label (currently at its default of "CalculatedColumn1," once again), and select Rename Column from the cascading menu that appears, as we have done in earlier examples.
- Name the newly added column "No. Std-Costed Selections" to provide a unique column header name.
The newly-named column, with updated formula results, appears as partially shown.
Let's check our result with a simple test:
- On the Product table, click the heading for the ProductSubcategoryKey column, to the immediate left of the new calculated column, to select the column.
- Click the Sort Smallest to Largest button on the ribbon, as depicted.
- On the ProductSubcategoryKeycolumn, scroll down to the number 28. This is the ProductSubcategoryKeythat is related to the same key on the Product Subcategory tab – the key that coincides with the FILTER() expression ("Bottles and Cages") within our new calculated column:
RELATED('Product Subcategory'[EnglishProductSubcategoryName])="Bottles and Cages")
The isolated ProductSubcategoryKey, to which we have scrolled on the Product tab, appears as shown:
As we can verify that three rows exist for the related key on the Product tab, and that each row contains a StandardCost value, we can easily ascertain that our calculated column, which is counting the StandardCost values for the products meeting the filter criteria we have enforced, is returning the correct results ("3"). So our "quick and dirty" test shows that the COUNTX() function has performed as expected.
- Clear the sort that we placed on the OrderDateKey column of the Product table within the PowerPivot window.
In the section that follows, we will access the PowerPivot calculated columns we have created from within a PivotTable. Moreover, we will get some hands-on exposure to putting the DAX COUNT() function to work within a calculated measure we will create in the PivotTable.
PivotTable Measure
We will next work further – this time via calculated fields (called measures in Excel 2010 PivotTables using PowerPivot data) - with the DAX functions we have exposed in the PowerPivot calculated column section above. We do this keeping in mind the assertion we make throughout the Stairway to PowerPivot and DAX series that PowerPivot exists, in its fundamental role, to act as a data source for PivotTables. The difference, this time, will be that we will present values and calculations within a PivotTable.
As most of us know by now, we can begin our work with a PivotTable from either the Excel workbook or from the PowerPivot window. From where we are in the PowerPivot window, we'll proceed by taking the following steps:
- From the Product tab in the PowerPivot window, click the downward pointing arrow on the PivotTable button to cause the drop-down menu to appear, as shown.
As noted in other Levels of Stairway to PowerPivot and DAX, although we work with examples, from time to time, of some of the other options we see in the dropdown menu, we will work with a standalone PivotTable in this session. This will be the most efficient path to getting hands-on exposure to the underlying PowerPivot data and DAX, in general.
- Click PivotTable in the dropdown menu we've exposed (the top selection).
We are returned to Excel, where we encounter the Insert Pivot dialog.
- Ensure that the radio button labeled New Worksheet is selected, as depicted.
- Click OK to accept the setting and dismiss the dialog.
The PivotTable placeholder / canvas, together with the PivotTable Fields list, appear on the new worksheet as presented:
We'll leave the placement of the new PivotTable at default.
NOTE: Because we cover the details for the first time in an earlier Level of the series, and particularly for those not already familiar with standard Excel PivotTables, Table 1 of Level 2: The DAX COUNTROWS() and FILTER() Functions and More About Calculated Columns and Measures gives a description for each of the areas / zones on the PivotTable Fields list. Other PivotTable basics are discussed there as well.
To gain some exposure to the use of the DAX COUNT() and COUNTX() functions within the PivotTable, we will replicate some of the steps we took with regard to the Internet Sales tab, upon which we operated in the PowerPivot window in the earlier part of this Level. In accomplishing this, we'll see how to approach what we did with the calculated column earlier, but this time we'll do so with a calculated measure in the PivotTable. Moreover, we will touch upon more PivotTable – specific considerations as we encounter them. But first, we will pull the calculated columns we created earlier into a new PivotTable we construct, to allow us to see the output of our calculation when we select it from the PivotTable Fields list.
- Expand both of the Product and Product Subcategory tables within the PivotTable Fields list.
- Select the StandardCostcolumn from the Product tab, by placing a check in the checkbox to its immediate left.
Sum of StandardCost, the default name and summarization method, appears in the Values drop zone.
- Select, in order, the ProductSubcategoryKeyand the EnglishProductSubcategoryName from the Product Subcategory table, by right-clicking, and selecting Add to Row Labels from the context menu that appears, for each, as shown.
Our initial rows axis and a value are in place at this stage, with the overall view as partially depicted.
Now let's add a couple more values: the calculated columns with which we worked in the PowerPivot window in our earlier sections.
- Select Count – Product Cost and No. Std-Costed Selections from the Product table, in that order, by placing a check in the associated checkboxes, as shown.
Recall that the selections we have made from the Product table represent the calculated columns we created in the Product tab of the PowerPivot Window. The first, Count – Product Cost, relies upon the DAX COUNT() function, the syntax of which is as follows:
=COUNT(Product[StandardCost])
The second calculated column, No. Std-Costed Selections, exploits the DAX COUNTX() function, using the following syntax:
=COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName])="Bottles and Cages"), Product[StandardCost])
The PivotTable summary we have constructed to this point appears as shown:
We notice that the values presented in the columns, which we have populated by placing the respective calculated columns into the PivotTable Values drop zone, seem quite large. In a "close up" of one such value, ProductSubcategoryKey 28, with which we have become familiar in the last few paragraphs, the No. Std-Costed Selections (the far right column in the illustration presented below) indicates a count of nine (9), versus the three (3) that was returned by the No. Std-Costed Selections calculated column that we created earlier.
As we have done in our work with other function / iterator function pairs within this subseries of Levels, let's research why we have a difference from what we have seen earlier, not to mention to ascertain which values are correct. When we delve into the nature of our values a bit we notice in short order that the cause of the issue is at least partially indicated by the default names for the newly created values: Both have "Sum of" appended to the front of the names of the underlying calculated columns in PowerPivot. This is because values that are simply added to the Values drop zone are summed, by default. Our indiscriminate inclusion of the calculations in the Values drop zone has resulted in a default summation of a count – something that, on the surface, would be problematic.
Let's see how changing this summing behavior to more accurately handle a count delivers immediate results – results that would, ideally, coincide with results that we can easily verify via the underlying PowerPivot calculation results. We'll then take a look at recreating the No. Std-Costed Selections calculation, as an example, which is currently a calculated column from the Product tab in the underlying PowerPivot Window. As we have seen, both our calculated columns appear in the PivotTable Fields list, much as if they were any other columns in the Product tab (this isn't any way to distinguish calculated columns from standard columns within the PivotTable Fields list under these circumstances).
First, let's make the tweak to more properly treat the Count - Product Cost and No. Std-Costed Selections calculations as counts, in general, as we have discussed.
- Within the Values drop zone of the PivotTable Fields list, click the downward pointing arrow on the value labelled Sum of Count - Product Cost.
- Select Value Field Settings from the menu that appears, as depicted here.
The Value Field Settings dialog appears, with settings as shown.
Here we see that the Summarize value field by setting is Sum, as we discovered earlier.
- Modify the Summarize value field by setting to Count.
- Click OK to save changes and to dismiss the dialog.
Next, let's tweak the No. Std-Costed Selections calculation to reflect a count, versus a sum, in like manner to the way we modified the Count - Product Cost calculation.
- Within the Values drop zone of the PivotTable Fields list, click the downward pointing arrow on the value labelled Sum of No. Std-Costed Selections.
- Select Value Field Settings from the menu that appears, as we did for the Count - Product Cost calculation.
- Modify the Summarize value field by setting to Count.
- Click OK to save changes and to dismiss the dialog.
We see immediate changes in the PivotTable, affecting both the Count of Count - Product Cost and the Count of No. Std-Costed Selections:
Depending upon the summaries we are attempting to generate in the PivotTable, many are still not likely to be what we were expecting for every row – one circumstance that stands out is the fact that, while the newly modified calculated column Count of No. Std-Costed Selections appears to generate the "3" (the underlying value populating the calculated column in the PowerPivot window) that we are after, for the count in the rows containing ProductSubcategoryKey 28, as presented here:
…it would appear that Count of No. Std-Costed Selections is generating erroneous values for the other rows … these rows would not contain the values we see at present, as the COUNTX() function behind the No. Std-Costed Selections calculated column is being filtered to return counts only of rows bearing StandardCost values for the "Bottles and Cages" Product Subcategory (whose key, again, is 28) in the Product tab.
To get the desired results from our calculated column at the PivotTable level, we will have to go beyond subjecting the PowerPivot COUNTX() calculation behind the No. Std-Costed Selections calculated column to the count aggregation that we get by default here. We need to do a little more work, via a calculated field, to leverage the filter context of the PivotTable and generate the correct result. Let's add a calculated field in the PivotTable to give us an alternative, direct way to achieve the No. Std-Costed Selections. We'll leave the new Count of No. Std-Costed Selections in place, for comparison purposes as needed.
- Click the PowerPivot tab, if necessary, atop the worksheet containing our current PivotTables.
- Click the Calculated Fields button in the ribbon atop the tab (to the right of the PowerPivot Manage button in the upper left of the ribbon).
- Select New Calculated Field… from the menu that appears, as presented here:
The Calculated Field dialog appears.
- Select Product in the dropdown selector labeled Table name atop the dialog.
- Type (or copy and paste) the following into the Calculated Field Name box of the dialog.
No. Std-Costed Selections CF
(We are simply naming the calculated field in a way to distinguish it from the entrained calculated column already in the PivotTable.)
- Type (or copy and paste) the following syntax into the Formula input box of the dialog:
=COUNTX(FILTER(Product,RELATED('Product Subcategory'[EnglishProductSubcategoryName])="Bottles and Cages"), Product[StandardCost])
This is the syntax we used in creating the No. Std-Costed Selections calculated column in the Product tab of the PowerPivot window.
- Select Number in the Formatting Options section (the "Category" selections in bottom left corner) of the Calculated Field dialog.
- Select Whole Number in the Format selector.
The Calculated Field dialog appears, with our input, as depicted.
- Click OK to accept our input and dismiss the Calculated Field dialog.
The new calculated field, No. Std-Costed Selections (which finds its home at the bottom of the member listings of the Product tree), appears within the Values drop zone automatically, as shown.
At this juncture, the new No. Std-Costed Selections calculation appears, within its home PivotTable, as shown:
We note that the column containing the new calculated field seems to display many empty cells. Scrolling down in the PivotTable to ProductSubcategoryKey28, once again, however, we see that the only non-empty cells contain the value of "3," the count we expected, having obtained it in all rows of the No. Std-Costed Selections calculated column that we created earlier. Its purpose, we recall, was to count only the number of StandardCost values associated with the filter we enacted as an argument in the COSTX() function we employed.
In this example, we once again illustrate the importance of "choosing the right calculation" (that is, choosing between a calculated column in PowerPivot and a calculated field in the PivotTable) for presentation in a PivotTable. We have noted in other Levels of this series that there are multiple things to consider in choosing "which total to use" (underlying, exposed calculated column or calculated field) in a given PivotTable, in similar cases. In this example, we can see that the "sum of counts" or even "count of counts" we obtained using the underlying calculated column is hardly likely to deliver the intended outcome, while the calculated field delivers the desired results by exploiting filter context at the PivotTable level.
The examples we have used with COUNT()and COUNTX() in this Level of the series are only relatively simple instances of what we can accomplish with their use. As we consistently note in other Levels, particularly where "X-" iterator DAX functions are involved, we can analyze and report upon data based upon date ranges, among other criteria, and create conditions (from simple to complex) that further refine the data, and / or include data taken from multiple tables. Once again, it is easy to see how DAX empowers us to generate and support comprehensive BI within our Excel environments, transforming the relatively limited options we once had with simple spreadsheets into sophisticated and robust reporting and analysis solutions.
Summary
In this Level of the Stairway to PowerPivot and DAX series, we exposed the DAX COUNT() and COUNTX() functions. We discussed the general purposes and operation of each of these two DAX functions, and then focused upon using each in general, as well as comparing and contrasting ways that each can be employed, particularly in situations where we can combine them with other functions to achieve results like those we might need to generate in client or employer environments.
In like manner to our standard approach to every DAX function we explore within the Stairway to PowerPivot and DAX series, we examined the syntax surrounding each of the COUNT() and COUNTX() functions, undertaking illustrative examples of the uses of each in practice exercises, and then exploring the results datasets we obtained in the practice examples. We combined each of COUNT() and COUNTX() with other functions to deliver illustrative answers to business questions, within both PowerPivot calculated columns and PivotTable calculated fields we built in the section that followed. As we put calculations to use within the PivotTable, we compared and contrasted the values returned, explaining the reasons behind differences, and discussing considerations in choosing which calculation approaches to take in meeting business requirements in the business world.