March 26, 2015 at 8:33 am
Right I'm pulling blank today.
Number | Letter
1 A
1 B
2 A
3 A
3 B
4 A
4 C
5 A
I'm trying to retrieve all of the numbers that have the letter A AND B
For the above I'd be aiming for the result of number 1 and number 3.
Anyone point me in the right direction of achieving this.
Thanks
March 26, 2015 at 8:55 am
LittleDuck (3/26/2015)
Right I'm pulling blank today.Number | Letter
1 A
1 B
2 A
3 A
3 B
4 A
4 C
5 A
I'm trying to retrieve all of the numbers that have the letter A AND B
For the above I'd be aiming for the result of number 1 and number 3.
Anyone point me in the right direction of achieving this.
Thanks
Here's some DDL to create and populate a temp table:
if object_id('tempdb.dbo.#letters', 'u') is not null drop table #letters;
create table #letters (
num integer,
letter char(1));
insert into #letters(num, letter)
values(1, 'A'),
(1, 'B'),
(2, 'A'),
(3, 'A'),
(3, 'B'),
(4, 'A'),
(4, 'C'),
(5, 'A');
To pull the results you want, you want the results of your two queries where they intersect:
select num
from #letters
where letter = 'A'
intersect
select num
from #letters
where letter = 'b';
There are probably several ways to do this. This is just one of them.
March 26, 2015 at 8:59 am
Ah cool. My brain came alive and I came up with a couple of ways to achieve it.
Damn me for only have 4 hours sleep last night.
Cheers for the quick reply.
SELECT DISTINCT Number, letter
FROM Testing
WHERE EXISTS (SELECT 1
FROM Testing t WHERE Letter = 'A'
AND Testing.Number = t.Number) AND EXISTS (SELECT 1 FROM
Testing t WHERE Letter = 'B' AND Testing.Number = t.Number) AND Letter IN ('A', 'B')
And a messier one
SELECT DATA.Number, DATA.Letter FROM (SELECT Number, Letter
FROM [Testing]
WHERE Testing.Letter = 'A') DATA
INNER JOIN (SELECT Number, Letter
FROM [Testing]
WHERE Testing.Letter = 'B') B ON B.Number = DATA.Number
UNION ALL
SELECT B.Number, B.Letter FROM (SELECT Number, Letter
FROM [Testing]
WHERE Testing.Letter = 'A') DATA
INNER JOIN (SELECT Number, Letter
FROM [Testing]
WHERE Testing.Letter = 'B') B ON B.Number = DATA.Number
March 26, 2015 at 9:08 am
I know this is just test data, but take a look at the query plans for all of them.
March 26, 2015 at 9:42 am
Just another way to get this result.
SELECT num
FROM #letters
WHERE letter IN('A', 'B')
GROUP BY num
HAVING COUNT( DISTINCT letter) = 2
March 26, 2015 at 10:25 am
With no index, as in the sample data, here's a couple more options to add to your own, Ed's and Luis' :w00t:
SELECT a.num
FROM #letters a
CROSS APPLY ( SELECT TOP 1
1
FROM #letters b
WHERE a.num = b.num
AND b.letter = 'B'
) ca ( X )
WHERE a.letter = 'A';
SELECT a.num
FROM #letters a
WHERE EXISTS ( SELECT 1
FROM #letters b
WHERE a.num = b.num
AND b.letter = 'B' )
AND a.letter = 'A';
I thought we could do some performance testing, over 1,000,000 rows and with no indexes.
Test Environment Setup: -
IF OBJECT_ID('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment;
END;
SELECT TOP 1000000
( ABS(CHECKSUM(NEWID())) % 100000 ) + 1 AS num,
CHAR(( ( ABS(CHECKSUM(NEWID())) % 3 ) + 1 ) + 64) AS letter
INTO #testEnvironment
FROM master.dbo.syscolumns sc1
CROSS JOIN master.dbo.syscolumns sc2
CROSS JOIN master.dbo.syscolumns sc3;
So that's #testEnvironment populated with some semi-random data. Now lets' create a harness to run the queries in and record the timings: -
IF OBJECT_ID('tempdb..#performance') IS NOT NULL
BEGIN
DROP TABLE #performance;
END;
CREATE TABLE #performance
(
Query VARCHAR(100),
StartTime DATETIME,
EndTime DATETIME
);
GO
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = num
FROM ( SELECT num
FROM #testEnvironment
WHERE letter = 'A'
INTERSECT
SELECT num
FROM #testEnvironment
WHERE letter = 'B'
) a;
INSERT INTO #performance
( Query, StartTime, EndTime )
VALUES ( 'Intersect', @StartTime, GETDATE() );
GO 50
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = num
FROM #testEnvironment
WHERE letter IN ( 'A', 'B' )
GROUP BY num
HAVING COUNT(DISTINCT letter) = 2;
INSERT INTO #performance
(
Query,
StartTime,
EndTime
)
VALUES (
'Having Count',
@StartTime,
GETDATE()
);
GO 50
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = a.num
FROM #testEnvironment a
CROSS APPLY ( SELECT TOP 1
1
FROM #testEnvironment b
WHERE a.num = b.num
AND b.letter = 'B'
) ca ( X )
WHERE a.letter = 'A';
INSERT INTO #performance
(
Query,
StartTime,
EndTime
)
VALUES (
'Cross Apply',
@StartTime,
GETDATE()
);
GO 50
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = a.num
FROM #testEnvironment a
WHERE EXISTS ( SELECT 1
FROM #testEnvironment b
WHERE a.num = b.num
AND b.letter = 'B' )
AND a.letter = 'A';
INSERT INTO #performance
( Query, StartTime, EndTime )
VALUES ( 'Exists', @StartTime, GETDATE() );
GO 50
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = num
FROM ( SELECT DISTINCT
num,
letter
FROM #testEnvironment
WHERE EXISTS ( SELECT 1
FROM #testEnvironment t
WHERE letter = 'A'
AND #testEnvironment.num = t.num )
AND EXISTS ( SELECT 1
FROM #testEnvironment t
WHERE letter = 'B'
AND #testEnvironment.num = t.num )
AND letter IN ( 'A', 'B' )
) a;
INSERT INTO #performance
(
Query,
StartTime,
EndTime
)
VALUES (
'Distinct Exists',
@StartTime,
GETDATE()
);
GO 50
DECLARE @StartTime DATETIME,
@Holder INT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET @StartTime = GETDATE();
SELECT @Holder = num
FROM ( SELECT DATA.num
FROM ( SELECT num,
letter
FROM #testEnvironment
WHERE #testEnvironment.letter = 'A'
) DATA
INNER JOIN ( SELECT num,
letter
FROM #testEnvironment
WHERE #testEnvironment.letter = 'B'
) B ON B.num = DATA.num
UNION ALL
SELECT B.num
FROM ( SELECT num,
letter
FROM #testEnvironment
WHERE #testEnvironment.letter = 'A'
) DATA
INNER JOIN ( SELECT num,
letter
FROM #testEnvironment
WHERE #testEnvironment.letter = 'B'
) B ON B.num = DATA.num
) a;
INSERT INTO #performance
(
Query,
StartTime,
EndTime
)
VALUES (
'UNION ALL',
@StartTime,
GETDATE()
);
GO 50
So "#performance" will hold the timings of each query after 50 runs. I've used the batch "GO" to repeat each query because in testing, I've found that loops can affect the timing comparisons for some queries, so I now run them inside batch separators like that.
We can run the following to get a bit of a total time elapsed for 50 runs: -
SELECT a.Query,
[Total Time Taken (seconds)] = a.Total / 1000000.0,
[Average Time Taken (seconds)] = ( a.Total / 50.0 ) / 1000000.0
FROM ( SELECT Query,
Total = SUM(DATEDIFF(MICROSECOND, StartTime, EndTime))
FROM #performance
GROUP BY Query
) a;
On my box, the results are as follows: -
Query Total Time Taken (seconds) Average Time Taken (seconds)
------------------ --------------------------------------- ---------------------------------------
Cross Apply 109.038000000 2.180760000000000
Distinct Exists 10.292000000 0.205840000000000
Exists 5.277000000 0.105540000000000
Having Count 5.747000000 0.114940000000000
Intersect 4.209000000 0.084180000000000
UNION ALL 14.643000000 0.292860000000000
March 26, 2015 at 10:52 am
Nice job on the testing. Thanks. I half-expected a slew of other approaches to come from the OP. 😛
March 30, 2015 at 7:25 pm
I believe that this is relational division with remainder:
High Performance Relational Division in SQL Server [/url]
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply