September 30, 2013 at 6:57 am
Hi there
I have a challenge where I need some help. May someone can help me?
I have a table where a project can have n-different categories. That means that a project (referenced by ID) can have zero or multiple categories.
I am looking for a sql statement to find every project id which contains a specific combination of category. For example I'm looking for every project which contains the category 3 AND 6. As a result I should get the following result (based on the example below):
Project
2
3
The statement should by dynamic in the sens of looking for one or n-categories.
Below you see an example of the table:
-- Create table
USE [test]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Project_Category](
[Project] [int] NOT NULL,
[Category] [int] NOT NULL,
CONSTRAINT [PK_Project_Category] PRIMARY KEY CLUSTERED
(
[Project] ASC,
[Category] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
--Insert values
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 2)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (1, 7)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 6)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (2, 7)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (3, 3)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (3, 6)
GO
INSERT [dbo].[Project_Category] ([Project], [Category]) VALUES (4, 1)
GO
Thanks a lot for you help!
Regards, Thomas
September 30, 2013 at 7:31 am
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 30, 2013 at 8:33 am
Great, works perfect. That was really what I'm looking for!
Thanks for your support.
Regards, Thomas
September 30, 2013 at 7:08 pm
Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
Mark,
I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 1, 2013 at 1:33 am
dwain.c (9/30/2013)
Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
Mark,
I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).
Yep, you're right. Nice one.
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537October 1, 2013 at 9:14 pm
Mark-101232 (10/1/2013)
dwain.c (9/30/2013)
Mark-101232 (9/30/2013)
WITH CategoriesWanted([Category]) AS (
SELECT [Category]
FROM ( VALUES (3),(6)) x([Category])
)
SELECT [Project]
FROM [dbo].[Project_Category]
WHERE [Category] IN (SELECT [Category] FROM CategoriesWanted)
GROUP BY [Project]
HAVING COUNT(DISTINCT [Category])=(SELECT COUNT(*) FROM CategoriesWanted);
Mark,
I don't think you need to COUNT DISTINCT Category within Project as Project, Category is the primary key (IGNORE_DUP_KEY = OFF).
Yep, you're right. Nice one.
And sorry for not saying initially that otherwise I thought the query you suggested was an excellent solution for the problem the OP posted.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 2, 2013 at 11:47 am
Similar -
WITH CTE AS (
SELECT Project
FROM Project_Category
WHERE Category = '3')
SELECT c.Project
FROM CTE c JOIN Project_Category p ON p.Project = c.Project AND p.Category = '6'
October 2, 2013 at 12:52 pm
This problem is known as Relational Divsion[/url].
There is the exact division (only 3 and 6) and the one with a remainder (at least 3 and 6). Yours seems to be the second one.
DECLARE @C TABLE (
[Category] int NOT NULL PRIMARY KEY
);
INSERT INTO @C ([Category]) VALUES (3), (6);
SELECT
PC.Project
FROM
[dbo].[Project_Category] AS PC
INNER JOIN
@C AS C
ON PC.Category = C.Category
GROUP BY
PC.Project
HAVING
COUNT(DISTINCT PC.Category) = (SELECT COUNT(*) FROM @C);
GO
Notice that I am constraining the categories I am looking for to be unique (primary key in this case).
This is not the only way of solving relational division neither the one with best performance.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply