September 26, 2013 at 7:17 am
Hi All,
I have RATE and SHIFT columns in a table, as shown below. When I pass RATE = 1, I get SHIFT = 1, 2, 3. In the second step, I need to write a query that finds all rates for which all three of these shifts exist (e.g. RATE 3 and 5 in the data below).
All of this must be done in a single query, and the initial RATE value will be dynamic. The data below is only an example; the real table holds a huge amount of data.
Rate  Shift
1     1
1     2
1     3
2     5
3     1
3     2
3     3
8     4
4     1
4     2
5     1
5     2
5     3
5     4
6     1
7     2
8     2
Regards,
Ashish
September 26, 2013 at 7:21 am
You have been around here long enough to know that posting consumable DDL and data is far more likely to get responses.
That being said, why are only 3 and 5 returned? What about 1?
Seems to me like all you need is
GROUP BY Col1
HAVING COUNT(*) >= 3
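Spelled out as a full query, that sketch would look something like the following. (This is a minimal sketch: the table and column names are assumed from the thread, since no DDL was posted, and note that it counts rows per Rate rather than checking for the specific shifts.)

```sql
-- Sketch: any Rate with three or more rows qualifies.
-- Assumes a Rate/Shift table as described in the original post.
SELECT Rate
FROM dbo.RateShift
GROUP BY Rate
HAVING COUNT(*) >= 3;
```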
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 26, 2013 at 1:34 pm
This problem is known as Relational Division.
SET NOCOUNT ON;
USE tempdb;
GO
DECLARE @T TABLE (
Rate int NOT NULL,
Shift int NOT NULL,
PRIMARY KEY (Rate, Shift)
);
INSERT INTO @T (
Rate,
Shift
)
VALUES
(1, 1),
(1, 2),
(1, 3),
(2, 5),
(3, 1),
(3, 2),
(3, 3),
(8, 4),
(4, 1),
(4, 2),
(5, 1),
(5, 2),
(5, 3),
(5, 4),
(6, 1),
(7, 2),
(8, 2);
DECLARE @Rate int = 1;
WITH C1 AS (
SELECT
Shift
FROM
@T
WHERE
Rate = @Rate
)
SELECT
A.Rate
FROM
@T AS A
INNER JOIN
C1 AS B
ON A.Shift = B.Shift
GROUP BY
A.Rate
HAVING
COUNT(*) = (SELECT COUNT(*) FROM C1);
GO
/*
Rate
1
3
5
*/
September 26, 2013 at 1:48 pm
Thanks to hunchback for the DDL and data. Here is another way to do the same thing.
select Rate
from @T
cross apply (select count(*) as RowsCount from @t where rate = @Rate) r
group by Rate
having COUNT(*) >= max(r.RowsCount)
September 26, 2013 at 7:47 pm
Sean Lange (9/26/2013)
Thanks to hunchback for the ddl and data. Here is another way to do the same thing.
select Rate
from @T
cross apply (select count(*) as RowsCount from @t where rate = @Rate) r
group by Rate
having COUNT(*) >= max(r.RowsCount)
Sean - Should this return Rate = 9? I think your approach simply returns any Rate that has 3 or more rows.
DECLARE @T TABLE (
Rate int NOT NULL,
Shift int NOT NULL,
PRIMARY KEY (Rate, Shift)
);
INSERT INTO @T (
Rate,
Shift
)
VALUES
(1, 1),
(1, 2),
(1, 3),
(2, 5),
(3, 1),
(3, 2),
(3, 3),
(8, 4),
(4, 1),
(4, 2),
(5, 1),
(5, 2),
(5, 3),
(5, 4),
(6, 1),
(7, 2),
(8, 2),
(9, 2),
(9, 3),
(9, 4);
DECLARE @Rate int = 1;
select Rate
from @T
cross apply (select count(*) as RowsCount from @t where rate = @Rate) r
group by Rate
having COUNT(*) >= max(r.RowsCount)
I think this works as the OP requested also:
SELECT Rate
FROM @T
WHERE Shift IN (SELECT Shift FROM @T WHERE Rate = @Rate)
GROUP BY Rate
HAVING COUNT(*) = (SELECT COUNT(*) FROM @T WHERE Rate = @Rate);
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
September 26, 2013 at 9:16 pm
I am a bit curious how hunchback's and my solutions would scale against a divide-and-conquer approach.
SELECT Shift
INTO #T
FROM @T
WHERE Rate = @Rate;
DECLARE @Rows INT = @@ROWCOUNT;
SELECT Rate
FROM @T
WHERE Shift IN (SELECT Shift FROM #T)
GROUP BY Rate
HAVING COUNT(*) = @Rows;
GO
DROP TABLE #T;
September 26, 2013 at 10:14 pm
Well, almost all the SQL masters have answered the question already; a simple GROUP BY with HAVING COUNT will serve the purpose.
September 26, 2013 at 11:05 pm
Thanks hunchback, Sean Lange and dwain.c for help.
I'm going with CROSS APPLY as it has the fewest logical reads and a lower cost than the other two.
Thanks Again.
September 26, 2013 at 11:08 pm
If I am reading this article (http://weblogs.sqlteam.com/peterl/archive/2010/07/02/Proper-Relational-Division-With-Sets.aspx) correctly, this solution should also be pretty fast:
WITH Target AS
(
SELECT Rate, Shift
FROM @T
WHERE Rate = @Rate
)
SELECT c.Rate
FROM
(
SELECT Rate, rc=COUNT(*)
FROM @T
GROUP BY Rate
) a
JOIN
(
SELECT rc=COUNT(*)
FROM Target
) b ON b.rc <= a.rc
JOIN @T AS c ON c.Rate = a.Rate
JOIN Target AS d ON d.Shift = c.Shift
GROUP BY c.Rate
HAVING COUNT(*) = MIN(b.rc);
September 26, 2013 at 11:16 pm
T.Ashish (9/26/2013)
Thanks hunchback, Sean Lange and dwain.c for help. I'm going with CROSS APPLY as it has the fewest logical reads and a lower cost than the other two.
Thanks Again.
I do not recommend that you judge a query's performance by its plan cost.
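One alternative (a sketch, not the only approach) is to measure candidates empirically with SET STATISTICS TIME and IO and compare CPU time, elapsed time, and logical reads directly. The table name here is assumed from the OP's later post:

```sql
-- Compare queries by measured cost rather than estimated plan cost:
-- capture CPU time, elapsed time, and logical reads for each candidate.
SET STATISTICS TIME, IO ON;

-- Candidate query goes here, e.g.:
SELECT Rate
FROM dbo.test12
WHERE Shift IN (SELECT Shift FROM dbo.test12 WHERE Rate = 4)
GROUP BY Rate
HAVING COUNT(*) = (SELECT COUNT(*) FROM dbo.test12 WHERE Rate = 4);

SET STATISTICS TIME, IO OFF;
```

Run each candidate several times and discard the first (cold-cache) run before comparing the numbers in the Messages tab.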
September 27, 2013 at 12:41 am
Continuing curiosity regarding this problem drove me to see how the proposed solutions scale, so I built a test harness.
CREATE TABLE #T (
Rate int NOT NULL,
Shift int NOT NULL,
PRIMARY KEY (Rate, Shift)
);
INSERT INTO #T (
Rate,
Shift
)
VALUES
(-1, 1),
(-1, 2),
(-1, 3);
WITH Tally (n) AS
(
SELECT TOP 500000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns a CROSS JOIN sys.all_columns b
),
STally (n) AS
(
SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns a
)
INSERT INTO #T
SELECT a.n, b.n
FROM Tally a
CROSS APPLY STally b;
DELETE
FROM #T
WHERE (Rate < 20000 AND Shift = 1) OR (Rate > 400000 AND Shift = 3)
OR (RATE BETWEEN 50000 AND 150000 AND Shift = 2);
DECLARE @Rate INT = -1
,@Holder INT
,@StartDT DATETIME;
SELECT @StartDT=GETDATE();
PRINT 'Solution by Hunchback';
SET STATISTICS TIME ON;
WITH C1 AS (
SELECT
Shift
FROM
#T
WHERE
Rate = @Rate
)
SELECT
@Holder=A.Rate
FROM
#T AS A
INNER JOIN
C1 AS B
ON A.Shift = B.Shift
GROUP BY
A.Rate
HAVING
COUNT(*) = (SELECT COUNT(*) FROM C1);
SET STATISTICS TIME OFF;
SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());
SELECT @StartDT=GETDATE();
PRINT 'Solution by Dwain.C';
SET STATISTICS TIME ON;
SELECT @Holder=Rate
FROM #T
WHERE Shift IN (SELECT Shift FROM #T WHERE Rate = @Rate)
GROUP BY Rate
HAVING COUNT(*) = (SELECT COUNT(*) FROM #T WHERE Rate = @Rate);
SET STATISTICS TIME OFF;
SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());
SELECT @StartDT=GETDATE();
PRINT 'Divide and Conquer Solution by Dwain.C';
SET STATISTICS TIME ON;
SELECT Shift
INTO #T1
FROM #T
WHERE Rate = @Rate;
DECLARE @Rows INT = @@ROWCOUNT;
SELECT @Holder=Rate
FROM #T
WHERE Shift IN (SELECT Shift FROM #T1)
GROUP BY Rate
HAVING COUNT(*) = @Rows;
SET STATISTICS TIME OFF;
SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());
SELECT @StartDT=GETDATE();
PRINT 'Solution by Peso/Adam Machanic';
SET STATISTICS TIME ON;
WITH Target AS
(
SELECT Rate, Shift
FROM #T
WHERE Rate = @Rate
)
SELECT @Holder=c.Rate
FROM
(
SELECT Rate, rc=COUNT(*)
FROM #T
GROUP BY Rate
) a
JOIN
(
SELECT rc=COUNT(*)
FROM Target
) b ON b.rc <= a.rc
JOIN #T AS c ON c.Rate = a.Rate
JOIN Target AS d ON d.Shift = c.Shift
GROUP BY c.Rate
HAVING COUNT(*) = MIN(b.rc);
SET STATISTICS TIME OFF;
SELECT ElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());
GO
DROP TABLE #T;
DROP TABLE #T1;
From the statistics, I get these CPU/Elapsed time results:
Solution by Hunchback
SQL Server Execution Times:
CPU time = 1637 ms, elapsed time = 452 ms.
Solution by Dwain.C
SQL Server Execution Times:
CPU time = 1467 ms, elapsed time = 473 ms.
Divide and Conquer Solution by Dwain.C
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 1169 ms, elapsed time = 301 ms.
Solution by Peso/Adam Machanic
SQL Server Execution Times:
CPU time = 1996 ms, elapsed time = 2235 ms.
And these results are quite similar to the elapsed-ms results SELECTed into the four results panes:
460
480
306
2263
Comments:
- I eliminated Sean's solution because I believe he's solving a different problem. But of course since the OP has confirmed that he prefers Sean's solution, I have to assume that both Hunchback and I read the requirements wrong.
- The result for my last suggestion (the Peso/Adam Machanic solution) was probably impacted by the <= I had to use in the first JOIN. I believe their solution was based on the no-remainder problem posed by Mr. Celko.
I was surprised that Divide and Conquer came in about 30% faster.
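For reference, the no-remainder (exact division) variant mentioned above differs from the "at least these shifts" problem in this thread: the candidate Rate must have the divisor's shifts and nothing else, so its total row count must also match. A sketch against the same #T table, not run through the harness above:

```sql
-- Exact relational division (sketch): a Rate qualifies only if it has
-- exactly the divisor's shifts -- all of them, and no extras.
WITH C1 AS (
    SELECT Shift FROM #T WHERE Rate = @Rate
)
SELECT A.Rate
FROM #T AS A
INNER JOIN C1 AS B
    ON A.Shift = B.Shift
GROUP BY A.Rate
HAVING COUNT(*) = (SELECT COUNT(*) FROM C1)                          -- has all divisor shifts
   AND COUNT(*) = (SELECT COUNT(*) FROM #T WHERE Rate = A.Rate);     -- and nothing else
```

With the sample data and @Rate = 1, Rate 5 should drop out of this version because it also has Shift 4, while the at-least version keeps it.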
September 27, 2013 at 3:39 am
Thanks Dwain.C
Kindly ignore my earlier post; I missed it completely. Actually, the other four solutions are giving accurate results.
My results are as follows:
1.hunchback
WITH C1 AS (SELECT Shift FROM test12 WHERE Rate = 4)
SELECT A.Rate FROM test12 AS A INNER JOIN C1 AS B ON A.Shift = B.Shift
GROUP BY A.Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM C1);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
2.dwain.c
SELECT Rate FROM test12
WHERE Shift IN (SELECT Shift FROM test12 WHERE Rate = 4)
GROUP BY Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM test12 WHERE Rate = 4);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
3. dwain.c (divide and conquer)
SELECT Shift INTO #T FROM @T WHERE Rate = @Rate;
DECLARE @Rows INT = @@ROWCOUNT;
SELECT Rate FROM @T
WHERE Shift IN (SELECT Shift FROM #T)
GROUP BY Rate HAVING COUNT(*) = @Rows;
GO
DROP TABLE #T;
Table 'test12'. Scan count 1, logical reads 2
'#T_____00000000000D'. Scan count 1, logical reads 22
Table 'test12'. Scan count 1, logical reads 2
Query Cost relative to batch = 23%
4.dwain.c
WITH Target AS
(SELECT Rate, Shift FROM test12 WHERE Rate = 4)
SELECT c.Rate
FROM
(SELECT Rate, rc=COUNT(*) FROM test12 GROUP BY Rate) a
JOIN
(SELECT rc=COUNT(*) FROM Target) b ON b.rc <= a.rc
JOIN test12 AS c ON c.Rate = a.Rate
JOIN Target AS d ON d.Shift = c.Shift
GROUP BY c.Rate
HAVING COUNT(*) = MIN(b.rc);
Table 'test12'. Scan count 8, logical reads 54,
Query Cost relative to batch = 35%
So, cost-wise, 1 and 2 are performing better than 3.
ElapsedMS is as below:
990
956
960 (Divide and Conquer)
2723
I could not get "Divide and Conquer" in first place.
Thanks again, everyone, for putting in your efforts.
September 27, 2013 at 3:46 am
T.Ashish (9/27/2013)
Thanks Dwain.C. Kindly ignore my earlier post; I missed it completely. Actually, the other four solutions are giving accurate results.
My results are as follows:
1.hunchback
WITH C1 AS (SELECT Shift FROM test12 WHERE Rate = 4)
SELECT A.Rate FROM test12 AS A INNER JOIN C1 AS B ON A.Shift = B.Shift
GROUP BY A.Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM C1);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
2.dwain.c
SELECT Rate FROM test12
WHERE Shift IN (SELECT Shift FROM test12 WHERE Rate = 4)
GROUP BY Rate HAVING COUNT(*) = (SELECT COUNT(*) FROM test12 WHERE Rate = 4);
Table 'test12'. Scan count 2, logical reads 48
Query Cost relative to batch = 21%
3. dwain.c (divide and conquer)
SELECT Shift INTO #T FROM @T WHERE Rate = @Rate;
DECLARE @Rows INT = @@ROWCOUNT;
SELECT Rate FROM @T
WHERE Shift IN (SELECT Shift FROM #T)
GROUP BY Rate HAVING COUNT(*) = @Rows;
GO
DROP TABLE #T;
Table 'test12'. Scan count 1, logical reads 2
'#T_____00000000000D'. Scan count 1, logical reads 22
Table 'test12'. Scan count 1, logical reads 2
Query Cost relative to batch = 23%
4.dwain.c
WITH Target AS
(SELECT Rate, Shift FROM test12 WHERE Rate = 4)
SELECT c.Rate
FROM
(SELECT Rate, rc=COUNT(*) FROM test12 GROUP BY Rate) a
JOIN
(SELECT rc=COUNT(*) FROM Target) b ON b.rc <= a.rc
JOIN test12 AS c ON c.Rate = a.Rate
JOIN Target AS d ON d.Shift = c.Shift
GROUP BY c.Rate
HAVING COUNT(*) = MIN(b.rc);
Table 'test12'. Scan count 8, logical reads 54,
Query Cost relative to batch = 35%
So, cost-wise, 1 and 2 are performing better than 3.
ElapsedMS is as below:
990
956
960 (Divide and Conquer)
2723
I could not get "Divide and Conquer" in first place.
Thanks again, everyone, for putting in your efforts.
#1 and #2 have effectively identical execution plans, so any timing variance you're seeing between them is probably random; multiple test runs would likely swing one way or the other.
#3 (Divide and Conquer) has an advantage that will likely grow with the number of rows in your table (the more rows, the more the initial INSERT capturing the row count will help). I think, anyway.
September 27, 2013 at 7:47 am
Well, I guess I didn't quite get the requirements. Chalk that up to a lack of details. This thread could serve as a good example of why it is so important to provide those details. 😉
September 27, 2013 at 7:54 am
#3 (Divide and Conquer) has an advantage that will likely grow depending on the number of rows in your table (the more rows, the more the initial INSERT capturing the row count will help). I think anyway.
I will keep that in mind.