February 24, 2015 at 2:29 pm
Hello Everyone,
I am looking for a query which can give me a count of combination of different columns of a table where count is greater than 1.
Here is the DDL and DML
CREATE TABLE [dbo].[DATA1](
[C1] [int] NULL,
[C2] [int] NULL,
[C3] [int] NULL,
[C4] [int] NULL,
[C5] [int] NULL,
[C6] [int] NULL,
[C7] [int] NULL,
[C8] [int] NULL,
[C9] [int] NULL,
[C10] [int] NULL,
[C11] [int] NULL,
[C12] [int] NULL
) ON [PRIMARY]
insert into data1 (c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12) Values
(1,3,4,5,6,7,9,18,20,22,23,24),
(1,2,3,6,8,13,14,15,16,20,21,22),
(2,4,8,11,12,13,14,16,17,18,23,24),
(1,3,5,6,8,9,10,15,20,21,23,24),
(1,3,4,7,9,10,11,15,20,21,22,23),
(1,4,5,7,11,12,14,17,18,21,22,23),
(5,6,9,10,11,12,13,14,16,18,19,20),
(2,3,7,9,12,13,15,16,17,18,19,24),
(1,4,5,6,7,13,14,19,21,22,23,24),
(1,2,3,4,6,7,11,13,14,18,21,24),
(2,3,4,6,7,8,10,16,19,20,21,22),
(2,4,5,7,9,13,16,17,18,20,22,24),
(3,4,5,7,10,11,12,13,16,18,23,24),
(1,2,3,6,8,11,12,13,15,19,21,22)
Sample output:
attached
Thanks a ton.
“If your actions inspire others to dream more, learn more, do more and become more, you are a leader.” -- John Quincy Adams
February 24, 2015 at 2:56 pm
What a bizarre requirement!!! You are going to have to write some code for this one. The only way I can think of is something like this.
select CAST(c1 as varchar(5)) + ',' + CAST(c2 as varchar(5))
, COUNT(*)
from DATA1
group by CAST(c1 as varchar(5)) + ',' + CAST(c2 as varchar(5))
having COUNT(*) > 1
union all
select CAST(c1 as varchar(5)) + ',' + CAST(c2 as varchar(5)) + ',' + CAST(c3 as varchar(5))
, COUNT(*)
from DATA1
group by CAST(c1 as varchar(5)) + ',' + CAST(c2 as varchar(5)) + ',' + CAST(c3 as varchar(5))
having COUNT(*) > 1
That gets you the first 3 columns. Just keep adding more result sets and off you go. This (like your sample output) does NOT take into account extended sets. In other words 1,2 is really the same thing here as 1,2,3 because they are effectively the same thing.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 24, 2015 at 5:04 pm
I wasn't sure about this, but it seems to work. It's resource intensive because there are lots of combinations. I don't have time to comment the code, but I strongly suggest that you do it.
WITH cteCounts AS
(
SELECT c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,
COUNT(*) counts
FROM #DATA1
GROUP BY CUBE( c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12)
)
,cteCombos AS(
SELECT STUFF( ISNULL( ',' + CAST(c1 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c2 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c3 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c4 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c5 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c6 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c7 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c8 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c9 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c10 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c11 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c12 AS varchar(5)), '') , 1, 1, '') combos,
SUM(counts) counts
FROM cteCounts
GROUP BY STUFF( ISNULL( ',' + CAST(c1 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c2 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c3 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c4 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c5 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c6 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c7 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c8 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c9 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c10 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c11 AS varchar(5)), '')
+ ISNULL( ',' + CAST(c12 AS varchar(5)), '') , 1, 1, '')
HAVING SUM(counts) > 1
)
SELECT *
FROM cteCombos
WHERE LEN(combos) - LEN(REPLACE( combos, ',', '')) > 0
ORDER BY combos;
February 24, 2015 at 5:09 pm
Unless I'm misinterpreting these odd requirements, this is really simple.
It depends on the number of selections you make from the set. So with 12 columns, and if you want to uniquely select 5 of those columns, the formula is:
12! / 5!*7! or in SQL:
SELECT 2*3*4*5*6*7*8*9*10*11*12/((2*3*4*5)*(2*3*4*5*6*7));
This assumes you really want combinations and not permutations. In the latter, order is important.
Reference: http://en.wikipedia.org/wiki/Combination
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply