March 6, 2012 at 4:06 am
I can't figure out how to split up a COUNT with GROUP BY into columns.
The desired result would be:
total  group1  group2  group3  group4
-----  ------  ------  ------  ------
8      2       2       1       3
Here's the table
CREATE TABLE #a(id INT PRIMARY KEY, name VARCHAR(20), igroup INT)
INSERT #a VALUES (1, 'Johan', 1)
INSERT #a VALUES (2, 'David', 1)
INSERT #a VALUES (3, 'Fredrik', 2)
INSERT #a VALUES (4, 'Mikael', 2)
INSERT #a VALUES (5, 'Noa', 3)
INSERT #a VALUES (6, 'Massimo', 4)
INSERT #a VALUES (7, 'Leo', 4)
INSERT #a VALUES (8, 'Sasha', 4)
-- what I have now
SELECT COUNT(id) FROM #a GROUP BY igroup
DROP TABLE #a
Thanks for your time.
March 6, 2012 at 4:23 am
SELECT COUNT(*) AS Total,
       -- CASE yields NULL when igroup doesn't match, and COUNT ignores NULLs
       COUNT(CASE WHEN igroup = 1 THEN 1 END) AS group1,
       COUNT(CASE WHEN igroup = 2 THEN 1 END) AS group2,
       COUNT(CASE WHEN igroup = 3 THEN 1 END) AS group3,
       COUNT(CASE WHEN igroup = 4 THEN 1 END) AS group4
FROM #a
March 6, 2012 at 4:23 am
SELECT
    SUM(1) AS Total,  -- SUM(1) adds 1 per row, equivalent to COUNT(*)
    SUM(CASE igroup WHEN 1 THEN 1 END) AS Group1,
    SUM(CASE igroup WHEN 2 THEN 1 END) AS Group2,
    SUM(CASE igroup WHEN 3 THEN 1 END) AS Group3,
    SUM(CASE igroup WHEN 4 THEN 1 END) AS Group4
FROM #a
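One small difference between these two answers, noted as an aside (my observation, not from the thread): if a group value never occurs, SUM(CASE ...) returns NULL while COUNT(CASE ...) returns 0, so the SUM variant may want an ISNULL wrapper. A minimal sketch against the sample table, using a hypothetical igroup = 5 that has no rows:

SELECT ISNULL(SUM(CASE igroup WHEN 5 THEN 1 END), 0) AS Group5  -- 0, not NULL
FROM #a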
March 6, 2012 at 7:13 am
This would be my solution:
CREATE TABLE #a (id INT PRIMARY KEY, name VARCHAR(20), igroup INT)
INSERT #a VALUES (1, 'Johan', 1)
INSERT #a VALUES (2, 'David', 1)
INSERT #a VALUES (3, 'Fredrik', 2)
INSERT #a VALUES (4, 'Mikael', 2)
INSERT #a VALUES (5, 'Noa', 3)
INSERT #a VALUES (6, 'Massimo', 4)
INSERT #a VALUES (7, 'Leo', 4)
INSERT #a VALUES (8, 'Sasha', 4);
WITH ConsolidatedCTE (iGroup, nCount) AS
(
    SELECT 0, COUNT(id) FROM #a
    UNION ALL
    SELECT igroup, COUNT(id) FROM #a GROUP BY igroup
)
SELECT COALESCE([0], 0) AS Total,
       COALESCE([1], 0) AS Group1,
       COALESCE([2], 0) AS Group2,
       COALESCE([3], 0) AS Group3,
       COALESCE([4], 0) AS Group4
FROM (SELECT iGroup, nCount FROM ConsolidatedCTE) #aTable
PIVOT (SUM(nCount) FOR iGroup IN ([0], [1], [2], [3], [4])) AS PivotTable
DROP TABLE #a
March 6, 2012 at 7:16 am
#aTable should be DataTable; my find and replace caused that problem.
Here is the corrected code:
CREATE TABLE #a (id INT PRIMARY KEY, name VARCHAR(20), igroup INT)
INSERT #a VALUES (1, 'Johan', 1)
INSERT #a VALUES (2, 'David', 1)
INSERT #a VALUES (3, 'Fredrik', 2)
INSERT #a VALUES (4, 'Mikael', 2)
INSERT #a VALUES (5, 'Noa', 3)
INSERT #a VALUES (6, 'Massimo', 4)
INSERT #a VALUES (7, 'Leo', 4)
INSERT #a VALUES (8, 'Sasha', 4);
WITH ConsolidatedCTE (iGroup, nCount) AS
(
    SELECT 0, COUNT(id) FROM #a
    UNION ALL
    SELECT igroup, COUNT(id) FROM #a GROUP BY igroup
)
SELECT COALESCE([0], 0) AS Total,
       COALESCE([1], 0) AS Group1,
       COALESCE([2], 0) AS Group2,
       COALESCE([3], 0) AS Group3,
       COALESCE([4], 0) AS Group4
FROM (SELECT iGroup, nCount FROM ConsolidatedCTE) DataTable
PIVOT (SUM(nCount) FOR iGroup IN ([0], [1], [2], [3], [4])) AS PivotTable
DROP TABLE #a
March 6, 2012 at 7:23 am
Thank you!
Both solutions worked great. It's also nice to see different approaches to the same problem.
March 6, 2012 at 7:34 am
siva 20997 (3/6/2012)
#aTable should be DataTable; my find and replace caused that problem.
Here is the corrected code:
CREATE TABLE #a (id INT PRIMARY KEY, name VARCHAR(20), igroup INT)
INSERT #a VALUES (1, 'Johan', 1)
INSERT #a VALUES (2, 'David', 1)
INSERT #a VALUES (3, 'Fredrik', 2)
INSERT #a VALUES (4, 'Mikael', 2)
INSERT #a VALUES (5, 'Noa', 3)
INSERT #a VALUES (6, 'Massimo', 4)
INSERT #a VALUES (7, 'Leo', 4)
INSERT #a VALUES (8, 'Sasha', 4);
WITH ConsolidatedCTE (iGroup, nCount) AS
(
    SELECT 0, COUNT(id) FROM #a
    UNION ALL
    SELECT igroup, COUNT(id) FROM #a GROUP BY igroup
)
SELECT COALESCE([0], 0) AS Total,
       COALESCE([1], 0) AS Group1,
       COALESCE([2], 0) AS Group2,
       COALESCE([3], 0) AS Group3,
       COALESCE([4], 0) AS Group4
FROM (SELECT iGroup, nCount FROM ConsolidatedCTE) DataTable
PIVOT (SUM(nCount) FOR iGroup IN ([0], [1], [2], [3], [4])) AS PivotTable
DROP TABLE #a
Have you checked the execution plan for the above?
1. Twice as many lines of code compared to conditional SUMs
2. Twice as many table scans due to the UNION
3. An unnecessary sort
4. An expensive PIVOT
Why would you do this, unless you want to slow down your query...
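For anyone who wants to verify this, here is a minimal sketch of how the two approaches can be compared on the sample table (standard SET STATISTICS options; re-create #a first if you have already dropped it, and expect the exact plan shapes and reads to vary by server):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Conditional aggregation: a single scan of #a
SELECT COUNT(*) AS Total,
       COUNT(CASE WHEN igroup = 1 THEN 1 END) AS group1,
       COUNT(CASE WHEN igroup = 2 THEN 1 END) AS group2,
       COUNT(CASE WHEN igroup = 3 THEN 1 END) AS group3,
       COUNT(CASE WHEN igroup = 4 THEN 1 END) AS group4
FROM #a;

-- CTE + PIVOT: two scans of #a because of the UNION ALL, plus the pivot
WITH ConsolidatedCTE (iGroup, nCount) AS
(
    SELECT 0, COUNT(id) FROM #a
    UNION ALL
    SELECT igroup, COUNT(id) FROM #a GROUP BY igroup
)
SELECT COALESCE([0], 0) AS Total, COALESCE([1], 0) AS Group1,
       COALESCE([2], 0) AS Group2, COALESCE([3], 0) AS Group3,
       COALESCE([4], 0) AS Group4
FROM (SELECT iGroup, nCount FROM ConsolidatedCTE) DataTable
PIVOT (SUM(nCount) FOR iGroup IN ([0], [1], [2], [3], [4])) AS PivotTable;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

The Messages tab then shows the scan counts and logical reads per statement, which is where the points above can be checked.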
March 6, 2012 at 7:57 am
I agree that my code is a little heavier on execution, but there are a few reasons:
1) I come from a Windows programming background, and making the execution plan as fast as possible has not been my motive. The amount of data selected is fairly low, so heavy execution plans have not made that much difference.
2) In some cases (not this one) I find the condensed code very difficult to understand, and I worry about maintaining it in the future, whether by me or someone else. People seem happy as long as it works now, without understanding it, and when they need to modify it they are stuck. I have had to write CTEs that cascade to a depth of 8; trying to condense that would have made it unmanageable.
3) Breaking the task down into logical, maintainable steps has been my motive. I agree that in this case the earlier code was easy to understand. I didn't think about what is faster for SQL to execute, but reasoned as follows:
- The task is primarily a pivot. I could have done it with the PIVOT command alone, but
- it needed a total count in addition. I could have just added the total as a UNION to the raw table and then pivoted it.
4) I have instances like this where I have to write for 12 periods and 3 or 4 sets of figures. I didn't like writing 36 or 48 lines of code, even if that was efficient for SQL Server.
I don't mind discussing this more, but I like to stay away from heated arguments because they sap the energy out of you.
March 6, 2012 at 8:27 am
siva 20997 (3/6/2012)
I agree that my code is a little heavier on execution, but there are a few reasons:
1) I come from a Windows programming background, and making the execution plan as fast as possible has not been my motive. The amount of data selected is fairly low, so heavy execution plans have not made that much difference.
2) In some cases (not this one) I find the condensed code very difficult to understand, and I worry about maintaining it in the future, whether by me or someone else. People seem happy as long as it works now, without understanding it, and when they need to modify it they are stuck. I have had to write CTEs that cascade to a depth of 8; trying to condense that would have made it unmanageable.
3) Breaking the task down into logical, maintainable steps has been my motive. I agree that in this case the earlier code was easy to understand. I didn't think about what is faster for SQL to execute, but reasoned as follows:
- The task is primarily a pivot. I could have done it with the PIVOT command alone, but
- it needed a total count in addition. I could have just added the total as a UNION to the raw table and then pivoted it.
4) I have instances like this where I have to write for 12 periods and 3 or 4 sets of figures. I didn't like writing 36 or 48 lines of code, even if that was efficient for SQL Server.
I don't mind discussing this more, but I like to stay away from heated arguments because they sap the energy out of you.
No offense, but all these reasons make it sound like you are lazy and don't care about the efficiency of the business.
1) A good execution plan does not just mean that the query executes faster; it also reduces the amount of resources needed. That affects your business. Data volumes change, and if you do not build your code to be scalable, you are doing your employer a disservice.
2) Proper code is easier to maintain than poor code. That's just a fact.
3) ?
4) You didn't "feel" like it? One proper script could be used across the board with small changes.
I'm not trying to demean you. If this works for you and your employer, then fine. However, I don't want to give people visiting this forum the impression that this is acceptable thinking in this profession. People come here to learn how to become better at what they do, not to do things that merely work.
Jared
CE - Microsoft
March 6, 2012 at 8:51 am
siva 20997 (3/6/2012)
I agree that my code is a little heavier on execution, but there are a few reasons:
1) I come from a Windows programming background, and making the execution plan as fast as possible has not been my motive. The amount of data selected is fairly low, so heavy execution plans have not made that much difference.
I came from an IBM mainframe Assembler background, so I want everything done on punch cards:
1 line - 1 card - and in Assembler... :cool:
Or, a more common one:
I came from a VB 6.0 background, so I don't care that T-SQL has a totally different paradigm and is not a procedural language... (Have you seen stubborn VB coders coding in C#?) :Whistling:
2) In some cases (not this one) I find the condensed code very difficult to understand, and I worry about maintaining it in the future, whether by me or someone else. People seem happy as long as it works now, without understanding it, and when they need to modify it they are stuck. I have had to write CTEs that cascade to a depth of 8; trying to condense that would have made it unmanageable.
Instead of learning and understanding how things should be done properly in T-SQL, it looks like you advise using a CTE everywhere possible... That is particularly wrong for this example: PIVOT is one of the least understandable and least manageable operators in T-SQL. ;-)
3) Breaking the task down into logical, maintainable steps has been my motive. I agree that in this case the earlier code was easy to understand. I didn't think about what is faster for SQL to execute, but reasoned as follows:
- The task is primarily a pivot. I could have done it with the PIVOT command alone, but
- it needed a total count in addition. I could have just added the total as a UNION to the raw table and then pivoted it.
Those are the wrong logical steps for T-SQL, and therefore the implementation is much less maintainable.
There is only one logical step here:
- Calculate the counts of igroup occurrences and the total count.
There is also an output requirement:
- Return the results in cross-tab style.
4) I have instances like this where I have to write for 12 periods and 3 or 4 sets of figures. I didn't like writing 36 or 48 lines of code, even if that was efficient for SQL Server.
What do you mean, "I have to write for 12 periods, 3 or 4 sets of figures"? What will you end up with if you ask a C# programmer to write an opera libretto? What's wrong with that? He is clever and can write...
I don't mind discussing this more, but I like to stay away from heated arguments because they sap the energy out of you.
Actually, there is hardly anything to discuss. It's obvious.
March 6, 2012 at 9:05 am
Line-by-line code in application software is not thought of as good practice. It should be done in logical For/Next loops etc. and broken down nicely.
I agree that here we have to think about what SQL Server will do and then plan how to write the code. We have to think on behalf of the machine rather than the human. I understand that this is the difference, and I will have to keep it in mind in future. However, I have seen replies on this forum where people have accepted the solution, and further discussion has shown me that they missed the point.
Most users on this forum seem to be DBAs who are manipulating large chunks of data and are thinking about efficiency. However, I think of myself as manipulating small chunks of data in a much more complex way. We can discuss that another day, but my reasoning for doing it my way is as follows:
If I had a million rows and speed was of the essence, I would use the first method, but if I had only a few thousand rows and a 36-column result, I would use my method.
Just because the original question had only 4 columns does not mean the actual problem had only 4 columns. He might have had more, and 12 would be a logical extension.
But what I would like to know is the purpose of the PIVOT command in T-SQL if the same work can be done more efficiently by line-by-line code.
March 6, 2012 at 9:20 am
siva 20997 (3/6/2012)
Most users on this forum seem to be DBAs who are manipulating large chunks of data and are thinking about efficiency. However, I think of myself as manipulating small chunks of data in a much more complex way. We can discuss that another day, but my reasoning for doing it my way is as follows:
Are you saying that you don't consider efficiency when coding? I would not be proud of thinking of myself as moving small chunks of data in a more complex way when there are easy ways of moving large amounts of data in a simple way. That is like being proud of writing hundreds of lines of inefficient code when 10 lines of efficient code would produce the same output.
If I had a million rows and speed was of the essence, I would use the first method, but if I had only a few thousand rows and a 36-column result, I would use my method.
Just because the original question had only 4 columns does not mean the actual problem had only 4 columns. He might have had more, and 12 would be a logical extension.
A couple of things here. First, it is obvious that the OP posted an example. None of us knows how much data the real tables hold; you assumed they are small, like the example.
The second thing is, trust me, performance is critical even at a few thousand rows. Even a few hundred. I have seen SELECT statements on tables with fewer than a thousand rows take forever because they were written poorly. I have also seen SELECT statements on tables with millions of rows take less than a second because they were efficient.
I would take fewer lines of simpler, faster code over many lines of slower code any day.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 6, 2012 at 9:30 am
siva 20997 (3/6/2012)
Line-by-line code in application software is not thought of as good practice. It should be done in logical For/Next loops etc. and broken down nicely.
What do you refer to as "line-by-line" code? In T-SQL it is absolutely not advisable to code line by line; it is a set-based, non-procedural language.
Most users on this forum seem to be DBAs who are manipulating large chunks of data and are thinking about efficiency. However, I think of myself as manipulating small chunks of data in a much more complex way. We can discuss that another day, but my reasoning for doing it my way is as follows:
That is a strange understanding of DBA responsibilities. DBAs do not manipulate large chunks of data; they are Database Administrators. Their primary responsibilities are to make sure that SQL Server runs smoothly: monitoring database and server health, maintenance, backups, and security.
It is T-SQL developers who manipulate data; depending on requirements, it can be small or large chunks. If you are a T-SQL developer, you do it the T-SQL way, not the C#, C++, Fortran, or Pascal way. This forum is here to show how things should be done the T-SQL way.
If I had a million rows and speed was of the essence, I would use the first method, but if I had only a few thousand rows and a 36-column result, I would use my method.
If you are a T-SQL developer, you use the first method; if not, you can use many other non-T-SQL ones (e.g. extracting everything into a text file and writing a command batch to do the work).
Actually, with 36 columns your PIVOT method would be even worse compared to a single SELECT with conditional SUMs. In the end, dynamic SQL can be used if you don't want to type 36 conditional SUM lines by hand; see the sketch below.
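To make that concrete, here is a minimal sketch of the dynamic SQL idea (the variable names and the choice of 4 columns are mine, purely for illustration; adjust @cols for 12, 36, or 48, and note the DECLARE-with-initializer syntax assumes SQL Server 2008 or later). The loop only builds the query text; what actually runs is still a single set-based SELECT, so the single-scan efficiency is kept however many columns are generated:

-- Build the conditional COUNT list dynamically instead of typing it
DECLARE @cols INT = 4;                        -- number of group columns wanted
DECLARE @sql  NVARCHAR(MAX) = N'SELECT COUNT(*) AS Total';
DECLARE @i INT = 1;
WHILE @i <= @cols
BEGIN
    SET @sql = @sql + N',
       COUNT(CASE WHEN igroup = ' + CAST(@i AS NVARCHAR(10))
                + N' THEN 1 END) AS group' + CAST(@i AS NVARCHAR(10));
    SET @i = @i + 1;
END;
SET @sql = @sql + N' FROM #a;';
EXEC sys.sp_executesql @sql;                  -- still one scan of #a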
But what I would like to know is the purpose of the PIVOT command in T-SQL if the same work can be done more efficiently by line-by-line code.
And finally, here comes the right question!
March 6, 2012 at 9:35 am
I accept that the first method is best.
Can we leave it at that?
March 6, 2012 at 9:47 am
siva 20997 (3/6/2012)
I accept that the first method is best. Can we leave it at that?
No one argued with that :hehe:
But I still advise you to read Jeff Moden's articles about cross-tab queries:
http://www.sqlservercentral.com/articles/T-SQL/63681/
http://www.sqlservercentral.com/articles/Crosstab/65048/
They are quite comprehensive...