November 25, 2016 at 7:28 am
Hello,
I have the following table:
ID bigint,
Name varchar(50),
Address varchar(250),
RecDate smalldatetime,
PurPrice decimal(7,2),
PaidBy varchar(50)
I am trying to come up with a select statement that will group the results by ID and sums the PurPrice but also at the same time group the returned result by month extracted from RecDate.
Is this possible in one statement?
example of output would be something like this:
ID Name SumPrice Month
1 Test 1500 01
1 Test 1000 02
etc...
Thanks in advance
November 25, 2016 at 7:50 am
tonytohme (11/25/2016)
Hello,I have the following table:
ID bigint,
Name varchar(50),
Address varchar(250),
RecDate smalldatetime,
PurPrice decimal(7,2),
PaidBy varchar(50)
I am trying to come up with a select statement that will group the results by ID and sums the PurPrice but also at the same time group the returned result by month extracted from RecDate.
Is this possible in one statement?
example of output would be something like this:
ID Name SumPrice Month
1 Test 1500 01
1 Test 1000 02
etc...
Thanks in advance
Here's a guess:SELECT
ID
,Name
,SUM(CurPrice) AS SumPrice
,YEAR(RecDate) AS RecYear
,MONTH(RecDate) AS RecMonth
FROM MyTable
GROUP BY
ID
,Name
,YEAR(RecDate
,MONTH(RecDate)
John
November 25, 2016 at 7:50 am
Below query will give you the desired results:
declare @table table (ID bigint,
Name varchar(50),
Address varchar(250),
RecDate smalldatetime,
PurPrice decimal(7,2),
PaidBy varchar(50)
)
insert into @table
values (1, 'Peter', 'Street 1', '2016-11-22', 100, 'PayPal')
, (2, 'John', 'Lane 1', '2016-10-12', 225, 'PayPal')
, (2, 'John', 'Lane 1', '2016-11-12', 75, 'CreditCard')
, (1, 'Peter', 'Street 1', '2016-09-08', 80, 'PayPal')
, (3, 'Stephen', 'Park 1', '2016-09-05', 90, 'CreditCard')
, (3, 'Stephen', 'Park 1', '2016-11-15', 120, 'CreditCard')
, (1, 'Peter', 'Street 1', '2016-09-17', 15, 'CreditCard')
, (3, 'Stephen', 'Park 1', '2016-11-20', 60, 'PayPal')
select ID
, Name
, MONTH(RecDate) as month
, SUM(PurPrice) as Total_spend
from @table
group by ID
, Name
, MONTH(RecDate)
order by ID
, month
Btw: your sample table is not normalized. You should seperate the person details from the purchace records. The below structure is a better approach because you only have to store the person/address once.
declare @purchace table (ID bigint
, RecDate smalldatetime
, PurPrice decimal(7,2)
, PaidBy varchar(50)
)
declare @name table (ID bigint
, Name varchar(50)
, Address varchar(250)
)
insert into @purchace
values (1, '2016-11-22', 100, 'PayPal')
, (2, '2016-10-12', 225, 'PayPal')
, (2, '2016-11-12', 75, 'CreditCard')
, (1, '2016-09-08', 80, 'PayPal')
, (3, '2016-09-05', 90, 'CreditCard')
, (3, '2016-11-15', 120, 'CreditCard')
, (1, '2016-09-17', 15, 'CreditCard')
, (3, '2016-11-20', 60, 'PayPal')
insert into @name
values (1, 'Peter', 'Street 1')
, (2, 'John', 'Lane 1')
, (3, 'Stephen', 'Park 1')
select name.ID
, Name
, MONTH(RecDate) as month
, SUM(PurPrice) as Total_spend
from @purchace purchace
inner join @name name
on purchace.id = name.id
group by name.ID
, Name
, MONTH(RecDate)
order by ID
, month
November 25, 2016 at 7:53 am
Thank you so much guys for the quick reponce you are awesome.
The tables I mentioned in my post are not actuals its just an example to get an understading of how I can use group by. Actual tables are normalized/
Thanks again
November 25, 2016 at 8:36 am
tonytohme (11/25/2016)
Hello,I have the following table:
ID bigint,
Name varchar(50),
Address varchar(250),
RecDate smalldatetime,
PurPrice decimal(7,2),
PaidBy varchar(50)
I am trying to come up with a select statement that will group the results by ID and sums the PurPrice but also at the same time group the returned result by month extracted from RecDate.
Is this possible in one statement?
example of output would be something like this:
ID Name SumPrice Month
1 Test 1500 01
1 Test 1000 02
etc...
Thanks in advance
First, I knew a "Tony Tohme" way back when. Did you ever work for a company called "ACN"?
Second, there is a bit of a danger in one of the posted solutions if there are multiple years of data involved because any given month number is NOT unique across years.
So the question is, are you going to run into multi-year data with this query and, if so, what would you like to see in the output to differentiate the total of, say, January of 2016 from January of 2015? The solution posted by John Mitchell takes care of that problem but there's also the opportunity to generate Sub-Totals and maybe even "pivot" the data for reporting purposes by 1 row per ID per year with a column for each month, year totals, Sub-Totals for each ID for all months, and a Grand Total for all months/years.
Also, to help us help you in the future and to make things a bit more clear, please see the article at the first link under "Helpful Links" in my signature line below.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 25, 2016 at 12:33 pm
Hi Jeff
Firstly no I never even heard of ACN.
Second you are right I might be trying to get data from across multiple years in that case I would condition the select tot return only data where year would be equal to a parameter.
Unless you have a better solution :-).
Thanks
November 26, 2016 at 12:51 pm
tonytohme (11/25/2016)
Hi JeffFirstly no I never even heard of ACN.
Second you are right I might be trying to get data from across multiple years in that case I would condition the select tot return only data where year would be equal to a parameter.
Unless you have a better solution :-).
Thanks
Ok... then let's open up some reporting "possibilities". 🙂
First, let's simulate a more realistic bit of data. 1 Million transactions across 10,000 customers across 7 years should be fun. This takes about 12 seconds to generate if your test database, 4 seconds if your test database is TempDB.
--===== If the test table exists, drop it to make reruns in SSMS easier.
IF OBJECT_ID('dbo.AggTestTable','U') IS NOT NULL
DROP TABLE dbo.AggTestTable
;
--===== Create the test table.
CREATE TABLE dbo.AggTestTable
(
ID BIGINT NOT NULL
,Name VARCHAR(50) NOT NULL
,Address VARCHAR(250) NOT NULL
,RecDate SMALLDATETIME NOT NULL
,PurPrice DECIMAL(7,2) NOT NULL
,PaidBy VARCHAR(50) NOT NULL
)
;
--===== Add the Clustered Index to support performance and to keep fragmentation at bay
CREATE CLUSTERED INDEX IXC_TestTable ON dbo.AggTestTable (RecDate,ID)
;
GO
--===== Populate the test table with constrained random data.
-- ID - 1 THRU 10,000 (10,000) "customers"
-- Name - 'SomeName' + the ID left padded with zeros to 6 places.
-- Address - 'SomeAddress' + the ID left padded with zeros to 6 places.
-- RecDate - 2010-01-01 00:00 THRU 2016-12-31 23:59
-- PurPrice - 1.00 THRU 500.00
-- PaidBy - PayPal or CreditCard
WITH cteGenBaseData AS
(
SELECT TOP 1000000
ID = ABS(CHECKSUM(NEWID())%10000)+1
,RecDate = RAND(CHECKSUM(NEWID()))*DATEDIFF(DD,'2010','2017')+CONVERT(SMALLDATETIME,'2010')
,PurPrice = CONVERT(DECIMAL(7,2),RAND(CHECKSUM(NEWID()))*499+1)
,PaidBy = CASE ABS(CHECKSUM(NEWID())%2) WHEN 0 THEN 'PayPal' ELSE 'CreditCard' END
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
)
INSERT INTO dbo.AggTestTable WITH(TABLOCK) -- Part of requirement for "Minimal Logging"
(ID,Name,Address,RecDate,PurPrice,PaidBy)
SELECT ID = ID
,Name = 'SomeName' + RIGHT('000000'+CONVERT(VARCHAR(10),ID),6)
,Address = 'SomeAddress' + RIGHT('000000'+CONVERT(VARCHAR(10),ID),6)
,RecDate = RecDate
,PurPrice = PurPrice
,PaidBy = PaidBy
FROM cteGenBaseData
ORDER BY RecDate,ID-- Part of requirement for "Minimal Logging" with Clustered Index in Place
OPTION (RECOMPILE)
;
--===== Let's see what the first 10,000 rows look like
SELECT TOP 10000 * FROM dbo.AggTestTable
;
Ok... So we have 7 years of random transactions with 10,000 customers. Let's see what most all of the possibilities are. We need rows for the following.
For all the rows, we'll "pivot" the data to month columns for each year using pre-aggregation and an "ancient" but fast and easy to understand method known as a "CROSS TAB". The end of each row will also have a "Row Total".
With that in mind, we need rows for each of the following things and each sub-total and the grand total will all have the correct totals for every month AND the RowTotal. That means that the RowTotal of the Grand Total row will be the total of all PurPrice's.
1. Detail rows - basically, these show the "raw" data.
2. Sub-Total User/Year
3. Sub-Total User
4. Sub-Total Year/PaidBy
5. Sub-Total Year
6. Sub-Total PaidBy
7. Grand Total
This whole shebang takes 13-15 seconds to run. If you add a WHERE clause for just one or a couple of IDs, it's nearly instantaneous. If you add a WHERE clause for, say, the years 2015 and 2006 (>=2015 and < 2017 so the index can be used), it takes about 3 seconds.
I left the GID (GROUPING_ID) in the output just for this demonstration. It can be removed from the output. I also calculated and displayed a "LineType" column. Again, just for demonstration. It can be removed or changed to suppress (for example, the word "Detail").
Here's the code. I would NOT use it as a view but a clever person could use it as an iTVF (Inline Table Valued Function)
WITH ctePreAgg AS
( --=== This pre-aggregates the data and DRYs out some forumulas.
SELECT ID --I'm assuming this is some sort of Customer ID.
,Name = MAX(Name) --Assumes there's only one Name per ID
,Address = MAX(ADDRESS) --Assumes there's only one Address per ID (could be very wrong here)
,RecYear = DATENAME(yy,RecDate)
,RecMonth = DATEPART(mm,RecDate)
,PurPrice = SUM(PurPrice)
,PaidBy
FROM dbo.AggTestTable
--WHERE ID IN (1,2000)
--WHERE RecDate >= '2015' AND RecDate < '2017'
GROUP BY ID,DATENAME(yy,RecDate),DATEPART(mm,RecDate),PaidBy
)
SELECT GID = GROUPING_ID(ID,RecYear,PaidBy)
-- 4 2 1 --Each column grouping has a binary number assigned and are added together to for a GID.
--A GID of 6 (for example) means that ID (4) and RecYear(2)
--were **NOT** included in the grouping meaning it **WAS** grouped by PaidBy(1).
--Conversely, a GID of 1 means that PaidBy (1) was **NOT** included
--in the grouping meaning it **WAS** grouped by ID(4) and RecYear(2)
,LineType = CASE GROUPING_ID(ID,RecYear,PaidBy)
WHEN 0 THEN 'Detail'
WHEN 1 THEN 'Sub-Total User/Year'
WHEN 3 THEN 'Sub-Total User'
WHEN 4 THEN 'Sub-Total Year/PaidBy'
WHEN 5 THEN 'Sub-Total Year'
WHEN 6 THEN 'Sub-Total PaidBy'
WHEN 7 THEN 'Grand_Total'
END
,ID = ISNULL(CONVERT(VARCHAR(32),ID),'----------')
,Name = CASE WHEN GROUPING_ID(ID,RecYear,PaidBy) IN (0,1,3) THEN MAX(Name) ELSE '--------------------' END
,Address = CASE WHEN GROUPING_ID(ID,RecYear,PaidBy) IN (0,1,3) THEN MAX(ADDRESS) ELSE '--------------------' END
,[Year] = ISNULL(RecYear,'----')
,PaidBy = ISNULL(PaidBy ,'----------')
,Jan = SUM(CASE WHEN RecMonth = 1 THEN PurPrice ELSE 0 END)
,Feb = SUM(CASE WHEN RecMonth = 2 THEN PurPrice ELSE 0 END)
,Mar = SUM(CASE WHEN RecMonth = 3 THEN PurPrice ELSE 0 END)
,Apr = SUM(CASE WHEN RecMonth = 4 THEN PurPrice ELSE 0 END)
,May = SUM(CASE WHEN RecMonth = 5 THEN PurPrice ELSE 0 END)
,Jun = SUM(CASE WHEN RecMonth = 6 THEN PurPrice ELSE 0 END)
,Jul = SUM(CASE WHEN RecMonth = 7 THEN PurPrice ELSE 0 END)
,Aug = SUM(CASE WHEN RecMonth = 8 THEN PurPrice ELSE 0 END)
,Sep = SUM(CASE WHEN RecMonth = 9 THEN PurPrice ELSE 0 END)
,Oct = SUM(CASE WHEN RecMonth = 10 THEN PurPrice ELSE 0 END)
,Nov = SUM(CASE WHEN RecMonth = 11 THEN PurPrice ELSE 0 END)
,Dec = SUM(CASE WHEN RecMonth = 12 THEN PurPrice ELSE 0 END)
,RowTotal = SUM(PurPrice)
FROM ctePreAgg
GROUP BY GROUPING SETS
( --===== We're using GROUP SETS instead of a CUBE to shave off about 2/3rds the duration of the run.
(ID,RecYear,PaidBy) --Detail
,(ID,RecYear) --Sub-Total User/Year
,(ID) --Sub-Total User
,(RecYear,PaidBy) --Sub-Total Year/PaidBy
,(RecYear) --Sub-Total Year
,(PaidBy) --Sub-Total PaidBy
,() --Grand Total
)
ORDER BY GROUPING(ID) ,ID --In order to preserve the order of where the subtotals appear,
,GROUPING(RecYear),RecYear --you have to sort by the GROUPING of each column and the column
,GROUPING(PaidBy) ,PaidBy --that appear in the grouping or GROUPING SETS.
;
The imaginative person could use similar to do things like add on RowTotals by quarter and what percent a quarter was of each row and other "crazy" things that CFOs and CEOs like to know.
The really cool part is that you don't need SSRS for this and, if you persist the result set as a table and get a little imaginative with a spreadsheet that reads from the table, the C-Level folks can do all sorts of pretty things without having to bug you for another report. 😉
For more information on the "Black Arts" technique of "CROSSTABS" and "Pre-Aggregation" (thank you Peter "PESO" Larsson for the term), please see the following introductory articles on the subject. The first article also includes some performance tests against the relatively aweful PIVOT operator.
[font="Arial Black"]Cross Tabs and Pivots, Part 1 – Converting Rows to Columns[/font][/url]
[font="Arial Black"]Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs[/font][/url]
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply