Check that a specific Group By condition does not exist

Question

Check that a specific Group By condition does not exist

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply

webskater SSCarpal Tunnel Points: 4214 More actions · Answer 1

Lynn Pettis (9/7/2012)
Since all you want is the TemplateName where BaseStage is 1 in at least 1 record in a group, the following works:

That is not quite what I want.

I want the TemplateName where BaseStage is 1 in at least 1 record in EVERY group (that relates to a template).

So, I don't want Template 1 because, in the first group of three records (for Template 1), BaseStage is zero in all of them. Notwithstanding that BaseStage is 1 for the second group of three records (for Template 1). This doesn't matter. If any group of records for a template does not have a BaseStage of 1 - the Template is not complete so I don't want it returned.

Thanks again.

webskater SSCarpal Tunnel Points: 4214 More actions · Answer 2

Lynn Pettis (9/7/2012)
So what you are saying is that BaseStage must be 1 in at least one record of BOTH groups, correct?

Correct.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Try this:

CREATE TABLE #tblTemplates

(

TemplateID int,

TemplateName varchar(50)

)

GO

INSERT INTO #tblTemplates (TemplateID, TemplateName)

SELECT 1, 'Template 1' UNION ALL

SELECT 2, 'Template 2'

GO

CREATE TABLE #tblBlocks

(

TemplateID int,

BlockID int,

FieldID int,

BaseStage bit

)

GO

INSERT INTO #tblBlocks (TemplateID, BlockID, FieldID, BaseStage)

SELECT 1, 1, 1, 0 UNION ALL

SELECT 1, 1, 2, 0 UNION ALL

SELECT 1, 1, 3, 0 UNION ALL

SELECT 1, 2, 4, 0 UNION ALL

SELECT 1, 2, 5, 1 UNION ALL

SELECT 1, 2, 6, 0 UNION ALL

SELECT 2, 1, 7, 1 UNION ALL

SELECT 2, 1, 8, 0 UNION ALL

SELECT 2, 1, 9, 0 UNION ALL

SELECT 2, 2, 10, 1 UNION ALL

SELECT 2, 2, 11, 0 UNION ALL

SELECT 2, 2, 12, 0

GO

WITH BaseBlocks AS (

SELECT

tb.TemplateID,

tb.BlockID,

MAX(CAST(tb.BaseStage AS INT)) BaseStage

FROM

#tblBlocks tb

GROUP BY

tb.TemplateID,

tb.BlockID

)

SELECT

*

FROM

#tblTemplates t1

WHERE

NOT EXISTS (SELECT 1 FROM BaseBlocks tb1 WHERE tb1.TemplateID = t1.TemplateID AND tb1.BaseStage = 0);

go

DROP TABLE #tblBlocks

DROP TABLE #tblTemplates

GO

ScottPletcher SSC Guru Points: 101147 More actions · Answer 4

SELECT TemplateID

FROM #tblBlocks

GROUP BY TemplateID

HAVING COUNT(DISTINCT CASE WHEN #tblBlocks.BaseStage = 1 THEN BlockID END) = COUNT(DISTINCT BlockID)

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

webskater SSCarpal Tunnel Points: 4214 More actions · Answer 5

ScottPletcher (9/7/2012)
SELECT TemplateID FROM #tblBlocks GROUP BY TemplateID HAVING COUNT(DISTINCT CASE WHEN #tblBlocks.BaseStage = 1 THEN BlockID END) = COUNT(DISTINCT BlockID)

That does the job. Thanks very much.

webskater SSCarpal Tunnel Points: 4214 More actions · Answer 6

That does the job too. Again, thanks very much.

Just wish I understood both the solutions that work.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 7

You may want to compare the execution plans between the two solutions provided:

CREATE TABLE #tblTemplates

(

TemplateID int,

TemplateName varchar(50)

)

GO

INSERT INTO #tblTemplates (TemplateID, TemplateName)

SELECT 1, 'Template 1' UNION ALL

SELECT 2, 'Template 2'

GO

CREATE TABLE #tblBlocks

(

TemplateID int,

BlockID int,

FieldID int,

BaseStage bit

)

GO

INSERT INTO #tblBlocks (TemplateID, BlockID, FieldID, BaseStage)

SELECT 1, 1, 1, 0 UNION ALL

SELECT 1, 1, 2, 0 UNION ALL

SELECT 1, 1, 3, 0 UNION ALL

SELECT 1, 2, 4, 0 UNION ALL

SELECT 1, 2, 5, 1 UNION ALL

SELECT 1, 2, 6, 0 UNION ALL

SELECT 2, 1, 7, 1 UNION ALL

SELECT 2, 1, 8, 0 UNION ALL

SELECT 2, 1, 9, 0 UNION ALL

SELECT 2, 2, 10, 1 UNION ALL

SELECT 2, 2, 11, 0 UNION ALL

SELECT 2, 2, 12, 0

GO

WITH BaseBlocks AS (

SELECT

tb.TemplateID,

tb.BlockID,

MAX(CAST(tb.BaseStage AS INT)) BaseStage

FROM

#tblBlocks tb

GROUP BY

tb.TemplateID,

tb.BlockID

)

SELECT

*

FROM

#tblTemplates t1

WHERE

NOT EXISTS (SELECT 1 FROM BaseBlocks tb1 WHERE tb1.TemplateID = t1.TemplateID AND tb1.BaseStage = 0);

go

SELECT TemplateID

FROM #tblBlocks

GROUP BY TemplateID

HAVING COUNT(DISTINCT CASE WHEN #tblBlocks.BaseStage = 1 THEN BlockID END) = COUNT(DISTINCT BlockID)

go

DROP TABLE #tblBlocks

DROP TABLE #tblTemplates

GO

ScottPletcher SSC Guru Points: 101147 More actions · Answer 8

Lynn Pettis (9/7/2012)
You may want to compare the execution plans between the two solutions provided:
CREATE TABLE #tblTemplates
(
TemplateID int,
TemplateName varchar(50)
)
GO
INSERT INTO #tblTemplates (TemplateID, TemplateName)
SELECT 1, 'Template 1' UNION ALL
SELECT 2, 'Template 2'
GO
CREATE TABLE #tblBlocks
(
TemplateID int,
BlockID int,
FieldID int,
BaseStage bit
)
GO
INSERT INTO #tblBlocks (TemplateID, BlockID, FieldID, BaseStage)
SELECT 1, 1, 1, 0 UNION ALL
SELECT 1, 1, 2, 0 UNION ALL
SELECT 1, 1, 3, 0 UNION ALL
SELECT 1, 2, 4, 0 UNION ALL
SELECT 1, 2, 5, 1 UNION ALL
SELECT 1, 2, 6, 0 UNION ALL
SELECT 2, 1, 7, 1 UNION ALL
SELECT 2, 1, 8, 0 UNION ALL
SELECT 2, 1, 9, 0 UNION ALL
SELECT 2, 2, 10, 1 UNION ALL
SELECT 2, 2, 11, 0 UNION ALL
SELECT 2, 2, 12, 0
GO
WITH BaseBlocks AS (
SELECT
tb.TemplateID,
tb.BlockID,
MAX(CAST(tb.BaseStage AS INT)) BaseStage
FROM
#tblBlocks tb
GROUP BY
tb.TemplateID,
tb.BlockID
)
SELECT
*
FROM
#tblTemplates t1
WHERE
NOT EXISTS (SELECT 1 FROM BaseBlocks tb1 WHERE tb1.TemplateID = t1.TemplateID AND tb1.BaseStage = 0);
go
SELECT TemplateID
FROM #tblBlocks
GROUP BY TemplateID
HAVING COUNT(DISTINCT CASE WHEN #tblBlocks.BaseStage = 1 THEN BlockID END) = COUNT(DISTINCT BlockID)
go
DROP TABLE #tblBlocks
DROP TABLE #tblTemplates
GO

If you're concerned with performance, then do this:

SELECT TemplateID

FROM (

SELECT TemplateID, BlockID,

MAX(CASE WHEN BaseStage = 1 THEN 1 ELSE 0 END) AS BackStage_Found

--MAX(CAST(BaseStage AS tinyint)) AS BackStage_Found --or this, if your col is a bit

FROM #tblBlocks

GROUP BY TemplateID, BlockID

) AS derived

GROUP BY TemplateID

HAVING MIN(BackStage_Found) = 1

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

The real test will be against a 1,000,000 (or so) table. I will try and set up such a test tonight at home. One of the things I noticed between the first two solutions was the number of table scans. Have a large enough table, and I think having 2 scans against the same large table is going to be slower. I may be wrong and will admit it if my testing proves otherwise, but then that is the reason for testing isn't it.

What may appear logical isn't always that way.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 10

I do like the latest version. Looks like it should win the 1,000,000 row test.

ScottPletcher SSC Guru Points: 101147 More actions · Answer 11

Lynn Pettis (9/7/2012)
The real test will be against a 1,000,000 (or so) table. I will try and set up such a test tonight at home. One of the things I noticed between the first two solutions was the number of table scans. Have a large enough table, and I think having 2 scans against the same large table is going to be slower. I may be wrong and will admit it if my testing proves otherwise, but then that is the reason for testing isn't it.
What may appear logical isn't always that way.

Yes, it's two table scans but it avoids the loop join. It will be interesting, albeit that I didn't write the first query with a concern for performance but for SQL compactness / understandability / "self-describedness", since I didn't figure there would be enough TemplateIds to cause a performance issue anyway :-).

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

webskater SSCarpal Tunnel Points: 4214 More actions · Answer 12

Thanks again guys.

As a matter of interest - the real world scenario is a maximum of 5 templates.

Each template has 15 to 20 blocks.

Each block has anything from 1 to 30 fields.

A template is not useable unless every block has a StageBase set ... so the real world requirement is to look at 5 templates and only return the ones where every block within the template has a BaseStage set.

Cheers again.

ScottPletcher SSC Guru Points: 101147 More actions · Answer 13

Lynn Pettis (9/7/2012)
The real test will be against a 1,000,000 (or so) table. I will try and set up such a test tonight at home. One of the things I noticed between the first two solutions was the number of table scans. Have a large enough table, and I think having 2 scans against the same large table is going to be slower. I may be wrong and will admit it if my testing proves otherwise, but then that is the reason for testing isn't it.
What may appear logical isn't always that way.

So, the 1M rows testing, how did it go?

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

ScottPletcher (9/10/2012)
Lynn Pettis (9/7/2012)
The real test will be against a 1,000,000 (or so) table. I will try and set up such a test tonight at home. One of the things I noticed between the first two solutions was the number of table scans. Have a large enough table, and I think having 2 scans against the same large table is going to be slower. I may be wrong and will admit it if my testing proves otherwise, but then that is the reason for testing isn't it.
What may appear logical isn't always that way.
So, the 1M rows testing, how did it go?

Not enough hours in the day this weekend. I will have to work over the couple of days. Hit me up again on Wednesday for an update.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 15

Just a quick test using 6 million rows:

Lynn's method:

Elapsed Time (ms) 8276

ScottPletcher's first method:

Warning: Null value is eliminated by an aggregate or other SET operation.

Elapsed Time (ms) 12244

ScottPletcher's second method:

Elapsed Time (ms) 7464

I ran this several times with very similar results. The third method on 6 million rows is about 1 second faster.