January 3, 2013 at 11:57 am
Hi all.
I want to generate all possible combination on dividing N object to N person.
I create a complex query, but for dividing 4 object to 4 person it take 10 minute in my PC.
Are there some other fast solutions?
DDL of tables:
create table n_objects
(code int not null primary key,
name varchar(100) not null);
create table person
(code int not null primary key,
name varchar(100) not null unique);
some sample data:
code name
----------- ---------
1 David
3 Joe
2 Nancy
code name
----------- ---------
1 Apple
2 Orange
3 Orange
4 Banana
The wanted result is:
No combins
----- --------------------------------
1 David:Nothing!
Nancy:Nothing!
Joe:Apple.Orange.Orange.Banana
2 David:Nothing!
Nancy:Orange.Orange.Banana
Joe:Apple
3 David:Nothing!
Nancy:Orange.Banana
Joe:Apple.Orange
4 David:Nothing!
Nancy:Orange
Joe:Apple.Orange.Banana
....
33 David:Orange.Banana
Nancy:Orange
Joe:Apple
34 David:Banana
Nancy:Nothing!
Joe:Apple.Orange.Orange
35 David:Banana
Nancy:Apple
Joe:Orange.Orange
...
39 David:Banana
Nancy:Orange.Orange
Joe:Apple
40 David:Nothing!
Nancy:Apple.Banana
Joe:Orange.Orange
41 David:Nothing!
Nancy:Orange.Orange
Joe:Apple.Banana
And here is my query:
with
cnt_objects(c) as
(
select count(*) from n_objects
),
ArrangedObjects as
(
select row_number() over(order by code) as code,
name
from n_objects
),
cnt_persons(c) as
(
select count(*) from person
),
ArrangedPerson as
(
select row_number() over(order by code) as code,
name
from Person
),
PowerSet as
(
select n.code,
cast(n.code as varchar(50)) as id,
1 as nbr,
cast(n.name as varchar(500)) as name
from ArrangedObjects n
union all
select t.code,
cast(c.id+'.'+cast(t.code as varchar(50)) as varchar(50)),
nbr + 1,
cast(c.name +'.'+ t.name as varchar(500))
from PowerSet c
join ArrangedObjects t
on c.code < t.code
where c.nbr + 1 <= (select c from cnt_objects)
),
PowerSet2 AS
(
SELECT id,cast(ROW_NUMBER() OVER(ORDER BY id desc) as varchar(50)) AS row_nbr,
name
FROM (SELECT n.id, n.name
FROM PowerSet n
UNION ALL
SELECT '0', 'Nothing!'
FROM ArrangedPerson
where code > 1)d
),
Permutations AS
(
select cast(row_nbr as varchar(50)) as row_nbr,
cast(id as varchar(50)) id,
cast(name as varchar(500)) as name,
1 as nbr
from PowerSet2
union all
select cast(c.row_nbr +'.'+t.row_nbr as varchar(50)),
cast(c.id +'.'+t.id as varchar(50)),
cast(c.name+'*'+t.name as varchar(500)),
c.nbr + 1
from Permutations c
join PowerSet2 t
on charindex(t.row_nbr,c.row_nbr) = 0
where c.nbr+1 <= (select c from cnt_persons)
),
Checker As
(
select cast('2'as varchar(50)) ix,
id,
name
from permutations
where nbr = (select c from cnt_persons)
group by id, name
having len(id)-len(replace('.'+id, '.1',''))=1
union all
select cast(ix*1+1as varchar(50)),
id,
name
from Checker
where len(id)-len(replace('.'+id, '.'+ix,''))=len(ix)
),
CTE(ID, S, St) AS
(
SELECT Id,
CAST(name + '*' AS VARCHAR(500)) ,
CAST('' AS VARCHAR(500))
FROM checker
where ix = (select c from cnt_objects)+1
UNION ALL
SELECT Id,
SUBSTRING(S, CHARINDEX('*', S) + 1, LEN(S) - CHARINDEX('*', S) + 1),
SUBSTRING(S, 1, CHARINDEX('*', S))
FROM CTE
WHERE CHARINDEX('*', S) > 0
),
joins as
(
select g,p.name+':'+st as st
from (
select cast((rnk+(select c from cnt_persons)-1)/(select c from cnt_persons) as varchar(5)) g,
row_number()
over(partition by (rnk+(select c from cnt_persons)-1)/(select c from cnt_persons)
order by len(s)desc) rnk,
st
from
(
select row_number()over(order by id)rnk,
* from cte
where st <> ''
)d
)d
join person p
on p.code = d.rnk
)
select cast(row_number()over(order by min(d.g)) as varchar(5)) [No],
replace(stuff(c.l,1,6,''),'*','') as combins
from (select distinct g from joins)d
cross apply(select replicate(' ',6)+st+char(10)
from joins j
where j.g =d.g
for xml path(''))c(l)
group by replace(stuff(c.l,1,6,''),'*','');
January 3, 2013 at 12:46 pm
I'm not really sure what you are trying to do, but how can 'Nothing' be a result of doing calculations on columns with not null constraints?
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
January 3, 2013 at 12:48 pm
Edit: misread ddl. Said something stupid.
I tried to solve this and failed. Well done Dwain C.
-- Itzik Ben-Gan 2001
January 3, 2013 at 6:46 pm
What is confusing is that you are asking about 4:4 matches, but you show 3:3 in your sample data, with one duplicate.
Your results also don't seem to make sense in terms of 4:4 matches. If you are doing 4*4 or 4!, neither of those add up to 41 result sets.
Can you explain how you get the combinations of the results? Or what you expect to happen? It might be helpful if you start with 4 items matching 4, and then explain what happens if you don't have 4 of one set.
January 3, 2013 at 7:02 pm
I get 324 unique, fruity combos!
While I could have done this as a single query, I think it will be slightly more efficient using a divide and conquer approach. This will take a bit of explaining so we'll break it up and explain as we go along.
First, let's show the DDL (using temp tables) and sample data, plus one extra table we'll use to store intermediate results.
create table #n_objects
(code int not null primary key,
name varchar(100) not null);
create table #person
(code int not null primary key,
name varchar(100) not null unique);
INSERT INTO #n_objects
SELECT 1, 'Apple' UNION ALL SELECT 2, 'Orange'
UNION ALL SELECT 3, 'Orange' UNION ALL SELECT 4, 'Banana'
INSERT INTO #person
SELECT 1, 'David' UNION ALL SELECT 2, 'Nancy' UNION ALL SELECT 3, 'Joe'
CREATE TABLE #FruitCombos
(Name VARCHAR(100)
,Fruits VARCHAR(8000)
,FCodes VARCHAR(8000)
,FCount INT)
Now, drawing on a little piece I wrote some time back: (Generating n-Tuples with SQL[/url]), we're going to generate all of the unique combinations of fruits that can be created and assign each of those possible combinations to one of our names with the final CROSS JOIN. Actually, the UNIQUEnTuples rCTE is based on a slightly performance-improved version that appears towards the end of the discussion thread in that article. Note that we also added in the "Nothing" fruit allocation at this time.
;WITH UNIQUEnTuples (n, Tuples, ID, FCodes, FCount) AS (
SELECT 1, CAST(name AS VARCHAR(8000)), code
,'[' + CAST(code AS VARCHAR(8000)) + ']', 1
FROM #n_objects
UNION ALL
SELECT 1 + n.n, t.name + ',' + n.Tuples, code
,'[' + CAST(t.code AS VARCHAR(8000)) + ']' + n.FCodes, FCount + 1
FROM UNIQUEnTuples n
CROSS APPLY (
SELECT name, code
FROM #n_objects t
WHERE t.code < n.ID) t
)
INSERT INTO #FruitCombos
SELECT Name, Fruits=Tuples, FCodes, FCount
FROM UNIQUEnTuples
CROSS JOIN #person
UNION ALL
SELECT name, 'Nothing!', '', 0
FROM #person
You see we've put our intermediate results into our #FruitCombos table. We'll use 2 CROSS JOINs (for 3 tables - one for each person) to create all of the possible joined combinations as columns across and then some strategic eliminations based on the count of fruit (4) and the FCodes (which contain the fruit codes). The eliminations based on CHARINDEX may not actually be necessary, but I left them in just to be sure. Finally, we'll draw on another SQL Spackle article (actually the 4th in my signature line): An Alternative (Better?) Method to UNPIVOT [/url] to unpivot those columns into our final rows.
;WITH AllCombos AS (
SELECT DISTINCT Name1=a.Name, Fruits1=a.Fruits
,Name2=b.Name, Fruits2=b.Fruits
,Name3=c.Name, Fruits3=c.Fruits
FROM #FruitCombos a
CROSS JOIN #FruitCombos b
CROSS JOIN #FruitCombos c
CROSS APPLY (SELECT AllCodes=a.FCodes + b.FCodes + c.FCodes) d
WHERE a.FCount + b.FCount + c.FCount = 4 AND
a.Name <> b.Name AND b.Name <> c.Name AND a.Name <> c.Name AND
CHARINDEX(a.FCodes, b.FCodes) = 0 AND CHARINDEX(b.FCodes, c.FCodes) = 0 AND
CHARINDEX(b.FCodes, a.FCodes) = 0 AND CHARINDEX(c.FCodes, b.FCodes) = 0 AND
CHARINDEX(a.FCodes, c.FCodes) = 0 AND CHARINDEX(c.FCodes, a.FCodes) = 0 AND
LEN(REPLACE(AllCodes, '[1]', '')) + 3 = LEN(AllCodes) AND
LEN(REPLACE(AllCodes, '[2]', '')) + 3 = LEN(AllCodes) AND
LEN(REPLACE(AllCodes, '[3]', '')) + 3 = LEN(AllCodes) AND
LEN(REPLACE(AllCodes, '[4]', '')) + 3 = LEN(AllCodes)
)
SELECT ComboNo=e.n, Name, Fruits
FROM (
SELECT Name1, Fruits1, Name2, Fruits2, Name3, Fruits3
,n=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM AllCombos
) a
CROSS APPLY (
VALUES (n, Name1, Fruits1)
,(n, Name2, Fruits2)
,(n, Name3, Fruits3)) e(n, Name, Fruits)
ORDER BY ComboNo, Name
DROP TABLE #n_objects
DROP TABLE #person
DROP TABLE #FruitCombos
Overall, she's a mighty ugly baby, but then she's my baby! On the other hand, I believe she actually produces the right results, even though someone with a bit more time on their hands will probably be able to come up with a more efficient way to do this.
Note that this is clearly not designed to allocate n fruits across m people, but it could be modified to do this (the second DML query would need to by Dynamic SQL). With larger m's and n's this will be quite slow, although with just 3 people and 4 fruits it runs in less than one second on my Core i5 laptop.
Edit: One other note. This is sort of a bin packing problem where each person represents a bin with the objective to identify all possible ways to pack the bins (which have no constraint on contents). This is not the first time I've applied the UniqueNTuples rCTE to a bin packing problem and made it work.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 3, 2013 at 8:03 pm
Alan.B (1/3/2013)
Edit: misread ddl. Said something stupid.I tried to solve this and failed. Well done Dwain C.
Thanks Alan!
One further point of note. Because of the fact there are two oranges in our fruit cocktail, it appears in the results as if there are several duplicates because the solution I provided segregates each fruit by its code. These could be eliminated if desired but I chose to not do so.
I also validated that the CHARINDEX eliminations are indeed not required.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 3, 2013 at 8:48 pm
I get 324 unique, fruity combos!
I think your result is not correct. Run you query with following sample data:
INSERT INTO #n_objects
SELECT 1, 'Apple' UNION ALL
SELECT 2, 'Orange' UNION ALL
SELECT 3, 'Banana'
INSERT INTO #person
SELECT 1, 'David' UNION ALL
SELECT 2, 'Nancy'
your result is empty set. Why?!
Steve Jones - SSC Editor (1/3/2013)
What is confusing is that you are asking about 4:4 matches, but you show 3:3 in your sample data, with one duplicate.Your results also don't seem to make sense in terms of 4:4 matches. If you are doing 4*4 or 4!, neither of those add up to 41 result sets.
Can you explain how you get the combinations of the results? Or what you expect to happen? It might be helpful if you start with 4 items matching 4, and then explain what happens if you don't have 4 of one set.
I did not ask about 4:4! I said when I run query with 4 rows in Peson table and 4 rows in Object table query return result after more than 10 minute.
No! I have 4 rows in Object table at sample data. 2*Orange + 1*Apple + 1*Banana.
I can not explain in English, but the result is completely illustrative. I prepare some other sample data and desire result:
Object Table:
code name
----------- ------------
1 Apple
2 Orange
3 Orange
Person Table:
code name
----------- --------------
1 David
2 Nancy
Wanted Result is:
No combins
----- ----------------------------
1 David:Nothing!
Nancy:Apple.Orange.Orange
2 David:Apple
Nancy:Orange.Orange
3 David:Apple.Orange
Nancy:Orange
4 David:Apple.Orange.Orange
Nancy:Nothing!
5 David:Orange*
Nancy:Apple.Orange*
6 David:Orange.Orange*
Nancy:Apple*
January 3, 2013 at 8:52 pm
zombieisdead2020 (1/3/2013)
I get 324 unique, fruity combos!
I think your result is not correct. Run you query with following sample data:
INSERT INTO #n_objects
SELECT 1, 'Apple' UNION ALL
SELECT 2, 'Orange' UNION ALL
SELECT 3, 'Banana'
INSERT INTO #person
SELECT 1, 'David' UNION ALL
SELECT 2, 'Nancy'
your result is empty set. Why?!
As I said, my solution is not generalized to m fruits x n people.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 3, 2013 at 9:19 pm
I forgot to say thank you.
The generalizability is necessary.
I get 324 unique, fruity combos!
Because your solution return duplicate combos! you need to filter them.
For example see ComboNo IN (1, 11, 145, 156, 319)
January 3, 2013 at 9:51 pm
zombieisdead2020 (1/3/2013)
I forgot to say thank you.The generalizability is necessary.
I get 324 unique, fruity combos!
Because your solution return duplicate combos! you need to filter them.
For example see ComboNo IN (1, 11, 145, 156, 319)
I see what the problem is and I believe it can be fixed.
Fixing it and generalizing it will be challenging but I'll work on it.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 4, 2013 at 2:21 am
By George I think I've got it.
Stick with the first two queries provided earlier. Use the following (that sets up Dynamic SQL) to complete the divide and conquer solution.
DECLARE @sql NVARCHAR(MAX)
,@SQL1 NVARCHAR(MAX) = ''
,@SQL2 NVARCHAR(MAX) = ''
,@SQL3 NVARCHAR(MAX) = 'a1.FCount'
,@SQL4 NVARCHAR(MAX) = ''
,@SQL5 NVARCHAR(MAX) = ''
,@SQL6 NVARCHAR(MAX) = ''
,@SQL7 NVARCHAR(MAX) = ''
,@SQL8 NVARCHAR(MAX) = ''
SELECT @SQL1 = @SQL1 + ' ,Name' + a.code + '=a' + a.code + '.Name, Fruits' +
a.code + '=a' + a.code + '.Fruits'
FROM #person b
CROSS APPLY (SELECT CAST(code AS NVARCHAR)) a(code)
WHERE b.code > 1
SELECT @SQL2 = @SQL2 + ' CROSS APPLY (SELECT * FROM #FruitCombos b WHERE a' +
CAST(code-1 AS NVARCHAR) + '.Name < b.Name) a' + CAST(code AS NVARCHAR)
FROM #person
WHERE code > 1
SELECT @SQL3 = @SQL3 + '+ a' + a.code + '.FCount'
FROM #person b
CROSS APPLY (SELECT CAST(code AS NVARCHAR)) a(code)
WHERE b.code > 1
SELECT @SQL4 = @SQL4 + 'a' + CAST(a.code AS VARCHAR) + '.Name <> a' +
CAST(b.code AS VARCHAR) + '.Name AND '
FROM #person a CROSS JOIN #person b
WHERE a.code <> b.code
SELECT @SQL5 = @SQL5 + 'LEN(REPLACE(AllCodes, ''[' + a.code + ']'', '''')) + 3 = LEN(AllCodes)'
+ CASE b.code WHEN 1 THEN '' ELSE ' AND ' END
FROM #n_objects b
CROSS APPLY (SELECT CAST(code AS NVARCHAR)) a(code)
ORDER BY b.code DESC
--SELECT @SQL1, @SQL2, @SQL3, @SQL4, @SQL5
SELECT @SQL6 = @SQL6 + 'a' + a.code + '.FCodes ' +
CASE b.code WHEN 1 THEN '' ELSE ' + ' END
,@SQL7 = @SQL7 + 'Name' + a.code + ',Fruits' + a.code +
CASE b.code WHEN 1 THEN ' ' ELSE ', ' END
,@SQL8 = @SQL8 + '(n, Name' + a.code + ',Fruits' + a.code + ')' +
CASE b.code WHEN 1 THEN ' ' ELSE ', ' END
FROM #person b
CROSS APPLY (SELECT CAST(code AS NVARCHAR)) a(code)
ORDER BY b.code DESC
--SELECT @SQL6, @SQL7, @SQL8
SELECT @sql = '
;WITH AllCombos AS (
SELECT DISTINCT Name1=a1.Name, Fruits1=a1.Fruits ' + @SQL1 + '
FROM #FruitCombos a1 ' + @SQL2 + '
CROSS APPLY (SELECT AllCodes=' + @SQL6 + ') d
WHERE ' + @SQL3 + (SELECT '='+CAST(COUNT(*) AS VARCHAR) FROM #n_objects) +
' AND ' + @SQL4 + @SQL5 + '
)
SELECT ComboNo=e.n, Name, Fruits
FROM (
SELECT ' + @SQL7 + '
,n=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM AllCombos
) a
CROSS APPLY (
VALUES ' + @SQL8 + '
) e(n, Name, Fruits)
ORDER BY ComboNo, Name'
PRINT @sql
EXEC (@SQL)
DROP TABLE #n_objects
DROP TABLE #person
DROP TABLE #FruitCombos
You could probably improve a little on the way I set up the Dynamic SQL (now its a really ugly baby!) but it seems to produce the correct results for the two cases you posted:
- 3 names, 4 fruits (one repeated): 54 delectable fruit cocktails
- 2 names, 3 unique fruits: 8 delicious fruit salads
Note that the final query that is actually EXECuted will appear in the results pane due to the PRINT and, of course, it is different for each case.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply