August 11, 2016 at 2:24 am
I am currently using SSRS 2008R2. This is the script I am using:
SELECT
a.name,
DATENAME(M, a.Date) as Month,
DATENAME(YEAR,a.Date) as Year,
DATEPART(YYYY, a.Date) as YearNum,
DATEPART (M, a.Date) as MonthNum,
COUNT(*) as [Total]
FROM Database1.Table1 as a
WHERE a.name = 'active'
GROUP BY a.name, DATENAME(month, a.Date),
DATENAME(year, a.Date), DATEPART(YYYY, a.date), DATEPART (M, a.date)
ORDER BY Month desc;
I have created a Tablix that has the name of the product as the row group and the month as the column group. This works fine in that it gives a monthly count, but I would also like to see the monthly count expressed as a %.
My question is, how do I create another group in the Tablix so that the monthly % figure appears by each product?
From this, I would like to create a column chart that displays monthly % totals.
I am getting stuck because I have added a column to the end of my Tablix and have added this expression:
=sum(Fields!Total.Value) / Sum(Fields!Total.Value, "DataSet2") to calculate the %.
This just gives a year-to-date figure and appears at the end of the Tablix.
Any help much appreciated – thanks!
August 11, 2016 at 6:46 pm
One way is to not use the totals query as the base for your report, and to use a Matrix instead of a plain tablix. You could create a variable to store the total number of records in your dataset, and then divide the counts you get by that variable. I did it a while ago... I was doing counts for some basic statistics, and I wanted to divide each count by the number of patients in my study... so my PatientsCount was something like
SELECT COUNT(*) FROM Patient; and I assigned that to my variable @PopulationSize
Then, you can divide your [Frequency] or count by that...
[Relative Frequency] = [Frequency]/@PopulationSize
(You just have to point at the Variables section when you're building it.)
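In T-SQL terms, the pattern was something along these lines (Diagnosis is just a stand-in for whatever column you're counting by):
DECLARE @PopulationSize INT;
SELECT @PopulationSize = COUNT(*) FROM Patient;
SELECT Diagnosis
      ,Frequency = COUNT(*)
      ,RelativeFrequency = COUNT(*) * 1.0 / @PopulationSize --the * 1.0 avoids integer division
FROM Patient
GROUP BY Diagnosis
;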
August 11, 2016 at 9:03 pm
faulknerwilliam2 (8/11/2016)
I am currently using SSRS 2008R2. This is the script I am using:
SELECT
a.name,
DATENAME(M, a.Date) as Month,
DATENAME(YEAR,a.Date) as Year,
DATEPART(YYYY, a.Date) as YearNum,
DATEPART (M, a.Date) as MonthNum,
COUNT(*) as [Total]
FROM Database1.Table1 as a
WHERE a.name = 'active'
GROUP BY a.name, DATENAME(month, a.Date),
DATENAME(year, a.Date), DATEPART(YYYY, a.date), DATEPART (M, a.date)
ORDER BY Month desc;
I have created a Tablix that has the name of the product as the row group and the month as the column group. This works fine in that it gives a monthly count, but I would also like to see the monthly count expressed as a %.
My question is, how do I create another group in the Tablix so that the monthly % figure appears by each product?
From this, I would like to create a column chart that displays monthly % totals.
I am getting stuck because I have added a column to the end of my Tablix and have added this expression:
=sum(Fields!Total.Value) / Sum(Fields!Total.Value, "DataSet2") to calculate the %.
This just gives a year-to-date figure and appears at the end of the Tablix.
Any help much appreciated – thanks!
Just trying to figure out what you have for table columns. It appears that you may have a mistake in your sample code that I need clarification on... You speak of "products" and I thought that the a.Name column was the name of a product, but your WHERE clause seems to be using a.Name as a status.
Please clarify because once we know, this could be a pretty easy problem to solve with a couple of COUNT() OVER functions.
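To give a rough idea of the shape I have in mind (purely illustrative until you clarify; names taken from your post), every detail row would carry both counts, and the percentage is just one divided by the other:
SELECT name
      ,[Date]
      ,MonthTotal = COUNT(*) OVER (PARTITION BY name, DATEDIFF(mm,0,[Date]))
      ,GrandTotal = COUNT(*) OVER ()
FROM Database1.Table1
;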
--Jeff Moden
Change is inevitable... Change for the better is not.
August 12, 2016 at 12:23 am
Dear Jeff - thanks for getting back
Code should be a.status = 'live';
a.name is indeed the name of the product.
Thanks.
August 15, 2016 at 4:12 pm
Apologies for missing your follow-up post on this. It IS a bit amazing to me that no one else picked up on this thread. I'll see what I can do after work.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2016 at 9:53 pm
Ok... Here we go. Let's do this "for real" using a test table of 7 million constrained but random rows over a period of 7 years (a million rows per year).
Here's the code to build such test data. It has 26 products lettered "A" thru "Z".
--===== If the test table already exists, drop it to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Table1','U') IS NOT NULL
DROP TABLE #Table1
;
--===== Create the test table
CREATE TABLE #Table1
(
Name CHAR(1)
,[Date] DATE
,[Status] CHAR(8)
)
;
--===== Declare some obviously named parameters to control test data generation.
-- Makes an average of about 1 million rows per year. 10% will be inactive.
DECLARE @StartDT DATE = '2010' --Inclusive
,@EndDT DATE = '2017' --Exclusive
,@Rows INT = 7000000
;
--===== Populate the test table with random, constrained data
-- This only takes about 15-30 seconds to build 7 million rows.
INSERT INTO #Table1
(Name,[Date],[Status])
SELECT TOP (@Rows)
Name = CHAR(ABS(CHECKSUM(NEWID())%26)+65) --A-Z
,[Date] = DATEADD(dd,ABS(CHECKSUM(NEWID())%DATEDIFF(dd,@StartDT,@EndDT)),@StartDT)
,[Status] = CASE WHEN ABS(CHECKSUM(NEWID())%10) > 0 THEN 'Active' ELSE 'InActive' END
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
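A quick sanity check on the generated data never hurts; this just confirms the roughly 90/10 Active/InActive split:
SELECT [Status]
      ,Rows = COUNT(*)
FROM #Table1
GROUP BY [Status]
;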
Before we get started on a solution, let's introduce a couple of concepts....
The first concept is "pre-aggregation" (as Peter "Peso" Larsson calls it). The idea is to aggregate the smallest number of things possible to support the rest of the calculations, which very quickly reduces internal row counts in the execution plan. Translation: it makes all the other stuff run NASTY FAST.
The second concept is DRY, which stands for "Don't Repeat Yourself". Although it makes the code longer, it allows you to "Divide'n'Conquer" the problem in logical steps without repeating the same formulas over and over (which are a bitch to read when they're nested even just once). It makes the code easy and logical to read, troubleshoot, and modify. It's also NASTY FAST. We're going to process more extra analysis columns than you can shake a stick at on all 7 million rows in about 5 seconds (on my little i5 4-core 6GB RAM laptop with SS 2008).
Here's the code to do that... you'll see that a couple of extra percentages have been calculated and then each product/month was ranked with ties (DENSE_RANK).
--===== Solve the problems
WITH ctePreaggByMonth AS
( --=== Preaggregates the count and DRYs out the MonthDate
-- THIS is incredibly important for performance.
-- Does all 7 million rows in just over a second.
-- The rest of the code only takes an additional 4 seconds
-- for a "first run" total of 5 seconds AND THAT'S WITH NO INDEXES!
-- "Second" runs only take 2 seconds total!
SELECT Name
,MonthDate = DATEADD(mm,DATEDIFF(mm,0,[Date]),0)
,ProductMonthTotal = COUNT(*)
FROM #Table1
WHERE [Status] = 'Active'
GROUP BY Name,DATEDIFF(mm,0,[Date])
)
,
cteDRYDateParts AS
( --=== DRYs out the date parts we want and part of the % calculation.
SELECT Name
,MonthDate
,[Month] = DATENAME(mm,MonthDate)
,[Year] = DATENAME(yy,MonthDate)
,YearNum = DATEPART(yy,MonthDate)
,MonthNum = DATEPART(mm,MonthDate)
,ProductMonthTotal
,ProductMonthTotal100 = ProductMonthTotal * 100.0
FROM ctePreaggByMonth
)
,
ctePercents AS
( --=== Produces the multiple DRY percentages for analysis and carries columns forward
SELECT Name
,MonthDate
,[Month]
,[Year]
,YearNum
,MonthNum
,ProductMonthTotal
,PctOfTotalMonth = ProductMonthTotal100 / SUM(ProductMonthTotal) OVER (PARTITION BY YearNum,MonthNum)
,PctOfProdYear = ProductMonthTotal100 / SUM(ProductMonthTotal) OVER (PARTITION BY Name,YearNum)
,PctOfYear = ProductMonthTotal100 / SUM(ProductMonthTotal) OVER (PARTITION BY YearNum)
,PctOfGrandTotal = ProductMonthTotal100 / SUM(ProductMonthTotal) OVER ()
FROM cteDRYDateParts
)
--===== Final output contains all the above plus ranking by percent
SELECT Name
,[Month]
,[Year]
,YearNum
,MonthNum
,ProductMonthTotal
,PctOfTotalMonth
,PctOfProdYear
,PctOfYear
,PctOfGrandTotal
,RnkOfTotalMonth = DENSE_RANK() OVER (PARTITION BY YearNum,MonthNum ORDER BY PctOfTotalMonth DESC)
,RnkOfProdYear = DENSE_RANK() OVER (PARTITION BY Name,YearNum ORDER BY PctOfProdYear DESC)
,RnkOfYear = DENSE_RANK() OVER (PARTITION BY YearNum ORDER BY PctOfYear DESC)
,RnkOfGrandTotal = DENSE_RANK() OVER ( ORDER BY PctOfGrandTotal DESC)
FROM ctePercents
ORDER BY MonthDate,Name
;
Obviously, the ability to generate and control a large amount of constrained, random data is important for these types of experiments. You can find out more about how to do that in the following articles.
http://www.sqlservercentral.com/articles/Data+Generation/87901/
http://www.sqlservercentral.com/articles/Test+Data/88964/
Let us know if you have any additional questions on this because <insert drumroll here>, you're the one that will need to support it. You might also want to have a look at the Windowed Aggregate Functions I used... notice that the first cCTE (Cascading CTE) is the only CTE with a GROUP BY in it even though we use SUM() in one of the other CTEs. The power there is in the OVER clause.
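If the OVER clause is new to you, here's the difference in miniature (runnable against the test table above): GROUP BY collapses rows, while an aggregate with OVER keeps every detail row and attaches the aggregate to it.
SELECT TOP (5)
       Name
      ,[Date]
      ,NameTotal = COUNT(*) OVER (PARTITION BY Name) --same value repeated on every row for that Name
      ,GrandTotal = COUNT(*) OVER ()                 --same value repeated on every row
FROM #Table1
;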
{EDIT: Spelling correction}
--Jeff Moden
Change is inevitable... Change for the better is not.
August 16, 2016 at 12:40 am
Dear Jeff
A thousand thanks for your help and your taking the time and trouble to help with my query. Your solution has worked perfectly. I love the ranking element.
I really appreciate the logical way your scripting evolves; and your notes make such a complex piece of scripting easy to follow.
Inspirational at many levels and I will be deconstructing / studying the script for my own development.
Many thanks again.
August 16, 2016 at 8:55 am
faulknerwilliam2 (8/16/2016)
Dear Jeff, a thousand thanks for your help and your taking the time and trouble to help with my query. Your solution has worked perfectly. I love the ranking element.
I really appreciate the logical way your scripting evolves; and your notes make such a complex piece of scripting easy to follow.
Inspirational at many levels and I will be deconstructing / studying the script for my own development.
Many thanks again.
Absolutely my pleasure and apologies for the delay in returning to your post. Thank you very much for the feedback.
What I'm really happy about is this part of your response...
I will be deconstructing / studying the script for my own development.
There are a lot of people who would take the code and apply it to solve their problem, but never take the time to understand it so that they can actually maintain it and solve similar problems on their own in the future. My hat is off to you, good Sir!
If you don't mind and if you have the time, I'd like to ask a couple of "usage" questions so that I might be able to help others in the future by setting up more realistic test data.
1. Obviously, this is a narrow test table and your real table may be quite a bit wider. How many columns are actually in this table, and what's the average width of the rows in bytes? sys.dm_db_index_physical_stats would be the tool to use for that number (see the sketch after this list).
2. How many rows do you have in the real table?
3. How many products do you have in the real table?
4. How many months do you have in the table?
5. What's the ratio of active to inactive rows?
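For question 1, something along these lines would get the average row width (substitute your real table name for dbo.Table1):
SELECT page_count
      ,record_count
      ,avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Table1'), NULL, NULL, 'SAMPLED')
;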
Thanks again for both the interesting problem and the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 17, 2016 at 2:20 am
I just noticed this thread: though Jeff's code is as always impressive (and as always, yet another plug for his Tally tables :rolleyes:), I can't help feeling it would be much easier to have a simple SQL query to return raw data rows and then have the Tablix do all the fancy summing and percentage calculations.
This is pretty easy to do: judicious use of Groups (OP has already done this I think?), putting sensible names on a couple of key cells in the tablix which you will be using in the calculations (e.g. the row Total column cell ;-)), and writing Expressions in other cells to do the percentage calculations — and all within the Tablix.
This is something I do a LOT and to my mind, it's easier to maintain than Jeff's brain-boggling T-SQL. (No, I never could get my head properly around CTEs … or OVER … or the whole Tally table thing … but that's my fault, not Jeff's. :-))
August 17, 2016 at 6:45 pm
cad.delworth (8/17/2016)
I just noticed this thread: though Jeff's code is as always impressive (and as always, yet another plug for his Tally tables :rolleyes:), I can't help feeling it would be much easier to have a simple SQL query to return raw data rows and then have the Tablix do all the fancy summing and percentage calculations.
This is pretty easy to do: judicious use of Groups (OP has already done this I think?), putting sensible names on a couple of key cells in the tablix which you will be using in the calculations (e.g. the row Total column cell ;-)), and writing Expressions in other cells to do the percentage calculations, and all within the Tablix.
This is something I do a LOT and to my mind, it's easier to maintain than Jeff's brain-boggling T-SQL. (No, I never could get my head properly around CTEs … or OVER … or the whole Tally table thing … but that's my fault, not Jeff's. :-))
You'll have to show me where there's a plug for the Tally Table anywhere in what I've posted on this thread. 😉 The closest thing to a Tally Table in any of the code I posted is in the test table generator as a constrained Cartesian Product used only as a "Pseudo-Cursor" row source, and that's not actually a part of the solution. It's just for generating the test data so that anyone, including SSRS users, can have something to play with.
Although I thank you for the compliment, I wouldn't classify the code that I used to solve the problem as "mind boggling", though. It's just some cascading CTEs, each relying on the outcome of the previous one, to take advantage of the DRY principle to keep the code simple and fast by "peeling just one potato at a time". Using SUM() OVER to prevent the need for additional GROUP BYs isn't a mind boggling concept, either. It's all just simple arithmetic and conservation of energy and the code works in all versions of SQL Server from 2005 and up.
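Stripped of the problem-specific columns, the whole pattern fits in a few lines (a bare-bones sketch against the test table I posted, not the full solution):
WITH ctePreagg AS
( --=== The one and only GROUP BY: preaggregate to one row per product per month.
 SELECT Name
       ,MonthDate = DATEADD(mm,DATEDIFF(mm,0,[Date]),0)
       ,Cnt = COUNT(*)
 FROM #Table1
 GROUP BY Name, DATEDIFF(mm,0,[Date])
)
SELECT Name
      ,MonthDate
      ,Cnt
      ,PctOfGrandTotal = Cnt * 100.0 / SUM(Cnt) OVER () --No second GROUP BY needed.
FROM ctePreagg
;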
That notwithstanding, since the OP IS using SSRS, I personally would love to see an example reporting package that would do this (same as the result I ended up with) using a Matrix instead of a call to T-SQL. Any chance of you posting a package that folks could load for 2008 or 2012 along with an explanation of what you did? Please feel free to use the test data generator I provided.
This is how I learn and I'm all for learning something better. It would be a big help to the OP and to others reading this thread, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 18, 2016 at 2:39 am
Jeff, there's no such thing as "JUST some CASCADING CTEs!"
I have enough trouble trying to understand using just ONE CTE!
Also, an OVER is not "just simple arithmetic!"
I was just kidding about the tally table, though you did say something about seven million records, so yeah I did assume that a tally table was in there somewhere. 😀 When I saw all the CTEs I confess my brain went into Standby so I didn't actually check right the way through your code because I knew it would be impenetrable to me (and having read all of it, it still is!).
I'm kinda up to the eyes at work right now so it may be several weeks or even months before I'll have enough time to organise an SSRS example, but I do plan to give it a shot. 🙂 What's the limit here on file sizes for attachments BTW? The only way I could post an example would be to post an entire .RDL file when I've got one to show you folks.
August 18, 2016 at 6:27 am
Hmm, seems the only (?) way to put an image on here is to add it as an attachment.
So, the tablix in the first image file is produced by a query which returns literally half a dozen columns. The columns used in the tablix are IncidentNumber, DummyForGrouping, TeamGroup, OwnerTeam (the full team name), and BreachPassed (the latter being a varchar having the value Successful or Unsuccessful set in the query, based on each Incident's outcome).
DummyForGrouping is set by a CASE in the query to D or blank. We need this to group by, so that the first set of rows including their subtotal row can form an outer group.
TeamGroup is set by a CASE in the query to 1 or 2 (or other values) according to the team name, and is used to add Row Groups to the tablix so we can include only some of the teams available, i.e. the ones we want to report on.
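To give you the flavour, the feeder query looks roughly like this (the table name and CASE conditions below are invented for illustration; the real ones depend on your own team names and outcome flag):
SELECT IncidentNumber
      ,DummyForGrouping = CASE WHEN OwnerTeam LIKE 'Service Desk%' THEN 'D' ELSE '' END
      ,TeamGroup = CASE WHEN OwnerTeam IN ('First Line','Second Line') THEN 1 ELSE 2 END
      ,OwnerTeam
      ,BreachPassed = CASE WHEN ResolvedWithinSLA = 1 THEN 'Successful' ELSE 'Unsuccessful' END --ResolvedWithinSLA is a stand-in
FROM dbo.Incident;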
Pretty much everything else you need to know is in the second image file, Setup in Report Builder: the one with the dodgy red lines and text on it (so sue me, I used Paint and I was in a hurry!).
The technique is to keep the 'feeder' query as simple as possible and let the tablix do the work for you. By uniquely naming the Total TextBox in each row, you can reference that TextBox as a member of the ReportItems collection, and hence use that reference in other tablix cells to do calculations with the Total TextBox's Value, such as percentage calculations. I hope the second image file makes it clear as to what's going on and how it was all put together. I always find it difficult to explain the ins and outs of a medium-complexity tablix like this one without using my hands … :rolleyes:
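As a concrete example: if the row's Total cell were named txtRowTotal (an invented name; use whatever you named yours), a percentage cell in the same row could use an expression along the lines of
=Count(Fields!IncidentNumber.Value) / ReportItems!txtRowTotal.Value
and be formatted as a percentage via the cell's Format property (e.g. P1).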
We'll leave how to make the sub-par cells coloured red on yellow for another time, or as an exercise for the interested student ;-).
So from my admittedly report-centric POV, I'd much rather have a Real Simple T-SQL query and set up a tablix to do as much calculation as possible than have a hugely advanced T-SQL query (IMHO) which I can barely comprehend — let alone ever hope to debug! — driving a very simple tablix.
Hopefully this has convinced some of you to give this a try, and I'm not in a minority of one (as usual LOL!).
August 18, 2016 at 6:50 am
cad.delworth (8/18/2016)
Hmm, seems the only (?) way to put an image on here is to add it as an attachment.
So, the tablix in the first image file is produced by a query which returns literally half a dozen columns. The columns used in the tablix are IncidentNumber, DummyForGrouping, TeamGroup, OwnerTeam (the full team name), and BreachPassed (the latter being a varchar having the value Successful or Unsuccessful set in the query, based on each Incident's outcome).
DummyForGrouping is set by a CASE in the query to D or blank. We need this to group by, so that the first set of rows including their subtotal row can form an outer group.
TeamGroup is set by a CASE in the query to 1 or 2 (or other values) according to the team name, and is used to add Row Groups to the tablix so we can include only some of the teams available, i.e. the ones we want to report on.
Pretty much everything else you need to know is in the second image file, Setup in Report Builder: the one with the dodgy red lines and text on it (so sue me, I used Paint and I was in a hurry!).
The technique is to keep the 'feeder' query as simple as possible and let the tablix do the work for you. By uniquely naming the Total TextBox in each row, you can reference that TextBox as a member of the ReportItems collection, and hence use that reference in other tablix cells to do calculations with the Total TextBox's Value, such as percentage calculations. I hope the second image file makes it clear as to what's going on and how it was all put together. I always find it difficult to explain the ins and outs of a medium-complexity tablix like this one without using my hands … :rolleyes:
We'll leave how to make the sub-par cells coloured red on yellow for another time, or as an exercise for the interested student ;-).
So from my admittedly report-centric POV, I'd much rather have a Real Simple T-SQL query and set up a tablix to do as much calculation as possible than have a hugely advanced T-SQL query (IMHO) which I can barely comprehend — let alone ever hope to debug! — driving a very simple tablix.
Hopefully this has convinced some of you to give this a try, and I'm not in a minority of one (as usual LOL!).
I have to admit my disappointment. I was hoping you'd attach a working package that folks could load and experiment with like I did with the T-SQL. That way, we could also performance test it against the 7 million rows. It might have also convinced me to give it a try. Does it take so long that there's no room to fit it into one of your busy days?
--Jeff Moden
Change is inevitable... Change for the better is not.