Efficient solution using TSQL

Question

Post reply

Efficient solution using TSQL

SQLMAIN

SSC Eights!

Points: 936
More actions
March 25, 2008 at 10:49 am

#183627

I have a situation where a table carries data like detailed below
Region name sequence description
1 apple 1 It is round
1 apple 2 It is sweet
1 apple 3 it is colored
1 orange 1 It is yellow
1 orange 2 It is sour
1 lemon 1 It is green
1 nut 1 hard to crack
2 apple 1 It is round
2 apple 2 It is sweet
2 apple 3 it is colored
2 orange 1 It is yellow
2 orange 2 It is sweet and sour
2 lemon 1 It is green
2 pear 1 it is a fruit
3 apple 1 It is round
3 apple 2 It is sweet
3 apple 3 it is colored
3 orange 1 It is yellow
3 orange 2 It is sweet and sour
3 lemon 1 It is yelow
3 berry 1 It is small
3 nut 1 hard to crack
3 pear 1 it is vegetable
So there are 1 to many rows of definition for fruits in multiple regions. There can be more than 3 regions - above is just a sample.
I need to identify those fruits which are defined the same in all regions those are defined in (if not definied in a region we do not care) or are defined in just one region. If a fruit is defined in more than one regions than I need to select only if it is defined the same in all.
In the example above I need to select
Apple rows because it is same in all regions
1 apple 1 It is round
1 apple 2 It is sweet
1 apple 3 it is colored
NUT rows because it is same in all or is blank in region 2
1 nut 1 hard to crack
BERRY row becaue it is just there in one and blank in other
3 berry 1 It is small
Do not need orange or Lemon because these have different definitions
Do not need pear because it has diff definitions on the two regions it is present.
Do not want to do it using cursors - just a simple set operation solution is what I am looking for.
Thanks in advance.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Kent Waldrop Hall of Fame Points: 3064 More actions · Answer 1

Main:

I am a bit shaky on this, but this is what I came up with for a first pass:

declare @fruit table

( region integer,

name varchar(9),

sequence integer,

description varchar(25)

)

insert into @fruit

select 1, 'apple', 1, 'It is round' union all

select 1, 'apple', 2, 'It is sweet' union all

select 1, 'apple', 3, 'it is colored' union all

select 1, 'orange', 1, 'It is yellow' union all

select 1, 'orange', 2, 'It is sour' union all

select 1, 'lemon', 1, 'It is green' union all

select 1, 'nut', 1, 'hard to crack' union all

select 1, 'lime', 1, 'It is green' union all

select 1, 'lime', 2, 'It is sour' union all

select 2, 'apple', 1, 'It is round' union all

select 2, 'apple', 2, 'It is sweet' union all

select 2, 'apple', 3, 'it is colored' union all

select 2, 'orange', 1, 'It is yellow' union all

select 2, 'orange', 2, 'It is sweet and sour' union all

select 2, 'lemon', 1, 'It is green' union all

select 2, 'pear', 1, 'it is a fruit' union all

select 2, 'lime', 1, 'It is green' union all

select 3, 'apple', 1, 'It is round' union all

select 3, 'apple', 2, 'It is sweet' union all

select 3, 'apple', 3, 'it is colored' union all

select 3, 'orange', 1, 'It is yellow' union all

select 3, 'orange', 2, 'It is sweet and sour' union all

select 3, 'lemon', 1, 'It is yelow' union all

select 3, 'berry', 1, 'It is small' union all

select 3, 'nut', 1, 'hard to crack' union all

select 3, 'pear', 1, 'it is vegetable'

;with descriptionException as

( select

name,

sequence,

min(description) as min_description,

max(description) as max_description

from @fruit b

group by name, sequence

having min(description) <> max(description)

)

select distinct

a.name

from @fruit a

where not exists

( select 0 from descriptionException b

where a.name = b.name

)

/* ------- Sample Output: --------

name

---------

apple

berry

lime

nut

*/

Note that I added in the lime case because I was not sure what to do with a case like this. The LIME case has the same SEQ=1 entry in both regions 1 and 2; however, there is a SEQ=2 entry in region 1 that is missing in region 2. This code needs some adjustment if the LIME case should not be included.

SQLMAIN SSC Eights! Points: 936 More actions · Answer 2

Thanks for looking into this odd request.

Lime is also not acceptable.

Actually the definitions are later to be concatenated - so each region must have same number of lines for the definition.

BTW - I had come up w a solution also - which again is not handling the LIME case.

drop table #fruit

select distinct name, sequence, description into #fruit from fruit

--These are to be excluded the remaining I can keep

select name, sequence, count(*)

from #fruit

group by name, sequence

having count(*) > 1

This also does not exclude LIME.

John Rowan SSC Guru Points: 56440 More actions · Answer 3

I'm not sure I fully understand what you are after here, but here's a go using Kent's sample data:

SELECT f1.*

FROM @Fruit f1

LEFT JOIN (

SELECT f1.Region, f1.Name, f1.Sequence

FROM @Fruit f1

INNER JOIN @Fruit f2

ON f1.Name = f2.Name AND f1.Sequence = f2.Sequence

WHERE f1.Description <> f2.Description

) f2

ON f1.Region = f2.Region AND f1.Name = f2.Name AND f1.Sequence = f2.Sequence

WHERE f2.Name IS NULL

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 4

Try thisSELECTf1.Name

FROM@Fruit AS f1

LEFT JOIN@Fruit AS f2 ON f2.Name = f1.Name

AND f2.Sequence = f1.Sequence

AND f2.Description = f1.Description

AND f2.Region <> f1.Region

GROUP BYf1.Name

HAVINGSUM(CASE WHEN f2.Name IS NULL THEN 1 ELSE 0 END) IN (0, COUNT(*))

ORDER BYf1.Name

N 56°04'39.16"
E 12°55'05.25"

Kent Waldrop Hall of Fame Points: 3064 More actions · Answer 5

This version avoids the self join but I am not sure if it is as good:

select

name

from

( select

name,

sequence,

count(*) as sequence_count,

case when max(description) = min(description)

then 0 else 1 end as same_description

from @fruit

group by

name,

sequence

) x

group by name

having max(same_description) = 0

and max(sequence_count) = min(sequence_count)

SQLMAIN SSC Eights! Points: 936 More actions · Answer 6

SQLMAIN

SSC Eights!

Points: 936

March 26, 2008 at 9:24 am

#794129

This one from Kent is cool - thanks a lot.

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 7

If I change this line

select 3, 'nut', 1, 'hard to crack' union all

to

select 3, 'nut', 2, 'hard to crack' union all

it fails too...

N 56°04'39.16"
E 12°55'05.25"

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 8

Try this

SELECTd.Name

FROM(

SELECTf1.Name,

MIN(f1.Region) AS minRegion,

MAX(f1.Region) AS maxRegion,

SUM(CASE WHEN f2.Name IS NULL THEN 1 ELSE 0 END) AS Nulls,

COUNT(*) AS Items

FROM@Fruit AS f1

LEFT JOIN@Fruit AS f2 ON f2.Name = f1.Name

AND f2.Sequence = f1.Sequence

AND f2.Description = f1.Description

AND f2.Region <> f1.Region

GROUP BYf1.Name

) AS d

WHEREd.Nulls = 0

OR (d.Nulls = d.Items AND d.minRegion = d.maxRegion)

ORDER BYd.Name

N 56°04'39.16"
E 12°55'05.25"

Kent Waldrop Hall of Fame Points: 3064 More actions · Answer 9

This might eliminate the "different nut" probably; however, as before I am still not sure that Peso's pont isn't valid and you might be forced to perform the self-join. However, here is yet another try at avoiding the self-join:

select

name

from

( select

name,

sequence,

min(region) as min_region,

max(region) as max_region,

count(*) as sequence_count,

case when max(description) = min(description)

then 0 else 1 end as same_description

from @fruit

group by

name,

sequence

) x

group by name

having max(same_description) = 0

and max(sequence_count) = min(sequence_count)

and max(min_region) = min(min_region)

and max(max_region) = max(max_region)

You might be best off just taking the self join; this code is looking a bit mangy.