Searching for n-categories

Question

Post reply

Searching for n-categories

ThomasSQL

Grasshopper

Points: 10
More actions
September 30, 2013 at 6:57 am

#299303

Hi there
I have a challenge where I need some help. May someone can help me?
I have a table where a project can have n-different categories. That means that a project (referenced by ID) can have zero or multiple categories.
I am looking for a sql statement to find every project id which contains a specific combination of category. For example I'm looking for every project which contains the category 3 AND 6. As a result I should get the following result (based on the example below):
Project
2
3
The statement should by dynamic in the sens of looking for one or n-categories.
Below you see an example of the table:
-- Create table
USE [test]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Project_Category](
[Project] [int] NOT NULL,
[Category] [int] NOT NULL,
CONSTRAINT [PK_Project_Category] PRIMARY KEY CLUSTERED
(
[Project] ASC,
[Category] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
--Insert values
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 2)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 7)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 6)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 7)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (3, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (3, 6)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (4, 1)
GO
Thanks a lot for you help!
Regards, Thomas

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 1

WITH CategoriesWanted([Category]) AS (

SELECT [Category]

FROM ( VALUES (3),(6)) x([Category])

)

SELECT [Project]

FROM [dbo].[Project_Category]

WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)

GROUP BY [Project]

HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

ThomasSQL Grasshopper Points: 10 More actions · Answer 2

Great, works perfect. That was really what I'm looking for!

Thanks for your support.

Regards, Thomas

Dwain Camps SSC Guru Points: 86958 More actions · Answer 3

Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);

Mark,

I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 4

dwain.c (9/30/2013)
Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
Mark,
I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).

Yep, you're right. Nice one.

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Dwain Camps SSC Guru Points: 86958 More actions · Answer 5

Mark-101232 (10/1/2013)
dwain.c (9/30/2013)
Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
Mark,
I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).
Yep, you're right. Nice one.

And sorry for not saying initially that otherwise I thought the query you suggested was an excellent solution for the problem the OP posted.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

ccavaco Old Hand Points: 302 More actions · Answer 6

Similar -

WITH CTE AS (

SELECT Project

FROM Project_Category

WHERE Category = '3')

SELECT c.Project

FROM CTE c JOIN Project_Category p ON p.Project = c.Project AND p.Category = '6'

Hunchback SSCrazy Points: 2649 More actions · Answer 7

This problem is known as Relational Divsion[/url].

There is the exact division (only 3 and 6) and the one with a remainder (at least 3 and 6). Yours seems to be the second one.

DECLARE @C TABLE (

[Category] int NOT NULL PRIMARY KEY

);

INSERT INTO @C ([Category]) VALUES (3), (6);

SELECT

PC.Project

FROM

[dbo].[Project_Category] AS PC

INNER JOIN

@C AS C

ON PC.Category = C.Category

GROUP BY

PC.Project

HAVING

COUNT(DISTINCT PC.Category) = (SELECT COUNT(*) FROM @C);

GO

Notice that I am constraining the categories I am looking for to be unique (primary key in this case).

This is not the only way of solving relational division neither the one with best performance.