Introduction
Report development has been an important part of my job. In my 7 years of work experience I have come across many requirements for financial business reports. The financial reports are full of numbers, formulas, sub-total, totals, averages and many more things. In an earlier time it had been a challenge for me to display the sub-totals and the totals at different hierarchy level. e.g. display the amount invested under the different schemes of a product, total amount invested under a product and total amount invested across all products etc.
One day I was reading SQL books in my downtime and I found the GROUPING with ROLL UP function. This function does the job that I was doing in a very efficient way. In this article I will explain how to use the GROUPING function with ROLL UP operation to generate totals and sub-totals and return it as a query result.
I will refer to GROUPING with ROLL UP as GROUPING further in the article for simplicity.
An Exercise
The basic requirement for any query to execute is a table and data in that table. So let's first define a table named @GroupTest and then fill it with some sample data.
DECLARE @GroupTest TABLE
(
Product_Name varchar(100),
Scheme_Name varchar(100),
Invest_Amount numeric(18,2),
Broker_Commission numeric(18,2),
Issue_Date datetime
)
Our next step is to populate the @GroupTest table with some data. I prefer to use a WHILE loop and some mathematical operations instead of writing N number of individual INSERT statements. Execute the following query to load the data into the table.
declare @cnt int, @idx int, @index int set @cnt = 1 set @idx = 1 set @index = 1 while @cnt <=100 begin insert into @GroupTest values('Product'+CAST(@idx AS VARCHAR),'Scheme'+CAST(@index AS VARCHAR), ((@cnt * @idx / @index)* 1524.15),((@cnt * @idx / @index)* 124.32), GETDATE()) if @cnt % 10 = 0 set @idx = @idx + 1 set @index = @index + 1 if @cnt % 5 = 0 set @index = 1 set @cnt = @cnt + 1 end
Execute the following select query to check the populated data in the table.
SELECT Product_Name, Scheme_Name, Invest_Amount, Broker_Commission, Issue_Date FROM @GroupTest
Note: I avoided a ' * ' after the SELECT. Using the ' * ' is a bad practice. It brings down the performance of queries. Smaller tables will not have any problem, but as a table grows (in volume of data) it will be an issue.
What do we need in the output?
We need a result set which will provide us the detail set of records, sub-totals and grand total. The output should show us the following:
- Individual investment record
- A separate row for the total amount invested and total broker commission amount under each scheme
- A separate row for the total amount invested and total broker commission amount under each product
- A single row for the total amount invested and total broker commission amount under all products.
The output should contain the investment amount totals as shown below:
Total amount invested in Product 1 --> Scheme1 is 10669.05
Total amount invested in Product 1 --> Scheme2 is 6096.60
Total amount invested in Product 1 --> Scheme3 is 4,572.45
Total amount invested in Product 1 is 30483.00
Method 1
This method doesn't use GROUPING function. The method uses four different queries combined with a UNION ALL clause.
Query 1:
Let's execute first query to get the individual records. This will list down investment done in each scheme.
SELECT Product_Name,Scheme_Name,Invest_Amount, broker_commission, '0 Detail' AS Comment FROM @GroupTest ORDER BY Product_Name,Scheme_Name,Comment
The output of the above query is as follows:
The comment column is added to sort the records so that the Detail will appear first followed by the Total and Grand Total respectively.
Query 2:
Second query lists down the total amount invested in each scheme of each product. I mean here that the query will return a single row for Scheme 1 under each product with the total investment amount as 10669.05. It should be the similar for rest of the schemes in all products. A normal GROUP BY clause is used in the query.
SELECT Product_Name,Scheme_Name,SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '1 Scheme_Total' FROM @GroupTest GROUP BY Product_Name,Scheme_Name ORDER BY Product_Name,Scheme_Name, Comment
The above query produces the following result:
Query 3:
The third query provides the amount invested at product level. Hence the investment amount for Product 1 will be the total of investment_amount of all schemes in Product 1, which is 30483.00
SELECT Product_Name, 'ZTotal' AS Scheme_Name, SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '2 Product_Total' AS Comment FROM @GroupTest GROUP BY Product_Name ORDER BY Product_Name, Scheme_Name, Comment
In the second column in place of the Scheme Name I have used a hard coded string value of 'ZTotal'. This will order the product level total row below the scheme total rows in the final query. If I leave the second column value as blank or NULL, the row will appear above all other rows. I found adding hard coded strings a simpler way to keep the row at the bottom. You will understand this when we put all these queries in a UNION.
Query 3 provides the following output.
Query 4:
This is the last query under Method 1. This query gives the total amount invested across all products. This is a Grand Total row for all of the output results above mentioned in Query 1, Query 2 and Query 3.
SELECT 'ZProductTotal' As Product_Name, 'ZTotal' AS Scheme_Name, SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '3 Grand_Total' AS Comment FROM @GroupTest ORDER BY Product_Name,Scheme_Name, Comment
The Query 4 returns following row:
Putting all four queries in a UNION and getting final output
All the above four queries are put into a UNION clause as shown below.
SELECT Product_Name,Scheme_Name,Invest_Amount, broker_commission, '0 Detail' AS Comment FROM @GroupTest UNION ALL SELECT Product_Name,Scheme_Name,SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '1 Scheme_Total' as Comment FROM @GroupTest GROUP BY Product_Name,Scheme_Name UNION ALL SELECT Product_Name, 'ZTotal' AS Scheme_Name, SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '2 Product_Total' AS Comment FROM @GroupTest GROUP BY Product_Name UNION ALL SELECT 'ZProductTotal' As Product_Name, 'ZTotal' AS Scheme_Name, SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', '3 Grand_Total' AS Comment FROM @GroupTest ORDER BY Product_Name,Scheme_Name, Comment
This query produces the scheme level and product level totals as described in the section 'What do we need in the output?'.
Look at the structure of the output. First an individual detail record is displayed. Then the Scheme_Total is displayed after it. Product_Total follows the Scheme_Total. The hard coded values 'ZProductTotal', 'ZTotal' and Comment column have helped to get the rows in this order.
Method 2 : Using GROUPING function
Until now we have seen how to generate the Sub-totals, Totals and Grand Totals using normal queries with the GROUP BY clause. The same output can be produced using the GROUPING function with less code. The GROUPING function is used as an additional column in the query, and this function will render the output value as '1' for the grand total row and '0' for the rest. This means the row that contains the investment amount across all the products will have value '1' under the column denoted by the GROUPING function. The new query will be much simpler though we still have to use GROUP BY clause for the grouping purpose.
Note: GROUPING function doesn't replace GROUP BY. We must have GROUP BY to get GROUPING to work. Also the GROUPING function needs to be supported with the ROLL UP attribute, which is placed after the GROUP BY. If we omit the ROLL UP the GROUPING function will not produce the desired output.
Two queries are combined with the UNION ALL operator to get the desired output. There is a slight difference in the output. The 'Comment' column has a numerical value instead of a string value as mentioned in the queries of Method 1 for the same purpose.
In following code, the first query will return the detail records (same as Query 1 of Method 1) and the second query will return the totals at different level i.e. scheme level, product level and a Grand Total.
SELECT Product_Name,Scheme_Name,Invest_Amount, broker_commission,-1 AS GrandTotal FROM @GroupTest UNION ALL SELECT ISNULL(Product_Name,'ZProductTotal'),ISNULL(Scheme_Name,'ZTotal'), SUM(Invest_Amount) AS 'Invest_Amount', SUM(broker_commission) AS 'Broker_Commission', GROUPING(Product_Name)'GrandTotal' FROM @GroupTest GROUP BY Product_Name,Scheme_Name WITH ROLLUP ORDER BY Product_Name,Scheme_Name, GrandTotal
The output of the above query is as follows:
What Grouping does?
Grouping separates the NULL values that are returned by ROLL UP from normal null values. In the above example (Method 2) the Totals and the Grand Total rows are actually generated by the ROLL UP operation. ROLL UP adds the NULL value for the summary row. Grouping separates this NULL by indicating 1 against the row. Hence in the above output (Method 2) the Grand Total summary row is indicated by a 1.
Here one question comes in the mind that why only one row is indicated by 1? The Scheme_Total and Product_Total is also aggregated but it is not indicated by 1. I have not yet received any satisfactory answer for this. I will post the answer here once I get it.
If we don't specify the hard coded string 'ZProductTotal' and 'ZTotal' then the output will look like this screen shot.
The first row is the Grand Total where Product_Name is NULL. The second row is total for Product 1 where the Scheme_Name is NULL. This should help you to understand why I use hard coded strings.
Summary
We got the output needed by writing the queries in two different ways. and each method has its advantages and disadvantages. Method 1 is useful when you have a number of non-aggregate fields in the SELECT list. If you have a few non-aggregate fields (more than 3 in my personal opinion) then it is better to use Method 1. If we have non-aggregate columns in the SELECT list we also need to mention all of them in GROUP BY clause.
The GROUPING function is useful for smaller queries. It is good to go with GROUPING function if we have less non-aggregate columns in the SELECT list. ROLL UP operation is the feature available in SQL server for aggregation and comparatively efficient than normal aggregating queries as written in Method 1. The GROUPING function is also easy to maintain and helps reducing number of code lines.