March 25, 2008 at 10:49 am
I have a situation where a table carries data like detailed below
Region name sequence description
1 apple 1 It is round
1 apple 2 It is sweet
1 apple 3 it is colored
1 orange 1 It is yellow
1 orange 2 It is sour
1 lemon 1 It is green
1 nut 1 hard to crack
2 apple 1 It is round
2 apple 2 It is sweet
2 apple 3 it is colored
2 orange 1 It is yellow
2 orange 2 It is sweet and sour
2 lemon 1 It is green
2 pear 1 it is a fruit
3 apple 1 It is round
3 apple 2 It is sweet
3 apple 3 it is colored
3 orange 1 It is yellow
3 orange 2 It is sweet and sour
3 lemon 1 It is yelow
3 berry 1 It is small
3 nut 1 hard to crack
3 pear 1 it is vegetable
So there are 1 to many rows of definition for fruits in multiple regions. There can be more than 3 regions - above is just a sample.
I need to identify those fruits which are defined the same in all regions those are defined in (if not definied in a region we do not care) or are defined in just one region. If a fruit is defined in more than one regions than I need to select only if it is defined the same in all.
In the example above I need to select
Apple rows because it is same in all regions
1 apple 1 It is round
1 apple 2 It is sweet
1 apple 3 it is colored
NUT rows because it is same in all or is blank in region 2
1 nut 1 hard to crack
BERRY row becaue it is just there in one and blank in other
3 berry 1 It is small
Do not need orange or Lemon because these have different definitions
Do not need pear because it has diff definitions on the two regions it is present.
Do not want to do it using cursors - just a simple set operation solution is what I am looking for.
Thanks in advance.
March 25, 2008 at 12:30 pm
Main:
I am a bit shaky on this, but this is what I came up with for a first pass:
declare @fruit table
( region integer,
name varchar(9),
sequence integer,
description varchar(25)
)
insert into @fruit
select 1, 'apple', 1, 'It is round' union all
select 1, 'apple', 2, 'It is sweet' union all
select 1, 'apple', 3, 'it is colored' union all
select 1, 'orange', 1, 'It is yellow' union all
select 1, 'orange', 2, 'It is sour' union all
select 1, 'lemon', 1, 'It is green' union all
select 1, 'nut', 1, 'hard to crack' union all
select 1, 'lime', 1, 'It is green' union all
select 1, 'lime', 2, 'It is sour' union all
select 2, 'apple', 1, 'It is round' union all
select 2, 'apple', 2, 'It is sweet' union all
select 2, 'apple', 3, 'it is colored' union all
select 2, 'orange', 1, 'It is yellow' union all
select 2, 'orange', 2, 'It is sweet and sour' union all
select 2, 'lemon', 1, 'It is green' union all
select 2, 'pear', 1, 'it is a fruit' union all
select 2, 'lime', 1, 'It is green' union all
select 3, 'apple', 1, 'It is round' union all
select 3, 'apple', 2, 'It is sweet' union all
select 3, 'apple', 3, 'it is colored' union all
select 3, 'orange', 1, 'It is yellow' union all
select 3, 'orange', 2, 'It is sweet and sour' union all
select 3, 'lemon', 1, 'It is yelow' union all
select 3, 'berry', 1, 'It is small' union all
select 3, 'nut', 1, 'hard to crack' union all
select 3, 'pear', 1, 'it is vegetable'
;with descriptionException as
( select
name,
sequence,
min(description) as min_description,
max(description) as max_description
from @fruit b
group by name, sequence
having min(description) <> max(description)
)
select distinct
a.name
from @fruit a
where not exists
( select 0 from descriptionException b
where a.name = b.name
)
/* ------- Sample Output: --------
name
---------
apple
berry
lime
nut
*/
Note that I added in the lime case because I was not sure what to do with a case like this. The LIME case has the same SEQ=1 entry in both regions 1 and 2; however, there is a SEQ=2 entry in region 1 that is missing in region 2. This code needs some adjustment if the LIME case should not be included.
March 25, 2008 at 3:47 pm
Thanks for looking into this odd request.
Lime is also not acceptable.
Actually the definitions are later to be concatenated - so each region must have same number of lines for the definition.
BTW - I had come up w a solution also - which again is not handling the LIME case.
drop table #fruit
select distinct name, sequence, description into #fruit from fruit
--These are to be excluded the remaining I can keep
select name, sequence, count(*)
from #fruit
group by name, sequence
having count(*) > 1
This also does not exclude LIME.
March 25, 2008 at 4:37 pm
I'm not sure I fully understand what you are after here, but here's a go using Kent's sample data:
SELECT f1.*
FROM @Fruit f1
LEFT JOIN (
SELECT f1.Region, f1.Name, f1.Sequence
FROM @Fruit f1
INNER JOIN @Fruit f2
ON f1.Name = f2.Name AND f1.Sequence = f2.Sequence
WHERE f1.Description <> f2.Description
) f2
ON f1.Region = f2.Region AND f1.Name = f2.Name AND f1.Sequence = f2.Sequence
WHERE f2.Name IS NULL
March 26, 2008 at 3:36 am
Try thisSELECTf1.Name
FROM@Fruit AS f1
LEFT JOIN@Fruit AS f2 ON f2.Name = f1.Name
AND f2.Sequence = f1.Sequence
AND f2.Description = f1.Description
AND f2.Region <> f1.Region
GROUP BYf1.Name
HAVINGSUM(CASE WHEN f2.Name IS NULL THEN 1 ELSE 0 END) IN (0, COUNT(*))
ORDER BYf1.Name
N 56°04'39.16"
E 12°55'05.25"
March 26, 2008 at 8:58 am
This version avoids the self join but I am not sure if it is as good:
select
name
from
( select
name,
sequence,
count(*) as sequence_count,
case when max(description) = min(description)
then 0 else 1 end as same_description
from @fruit
group by
name,
sequence
) x
group by name
having max(same_description) = 0
and max(sequence_count) = min(sequence_count)
March 26, 2008 at 9:24 am
This one from Kent is cool - thanks a lot.
March 26, 2008 at 9:34 am
If I change this line
select 3, 'nut', 1, 'hard to crack' union all
to
select 3, 'nut', 2, 'hard to crack' union all
it fails too...
N 56°04'39.16"
E 12°55'05.25"
March 26, 2008 at 10:27 am
Try this
SELECTd.Name
FROM(
SELECTf1.Name,
MIN(f1.Region) AS minRegion,
MAX(f1.Region) AS maxRegion,
SUM(CASE WHEN f2.Name IS NULL THEN 1 ELSE 0 END) AS Nulls,
COUNT(*) AS Items
FROM@Fruit AS f1
LEFT JOIN@Fruit AS f2 ON f2.Name = f1.Name
AND f2.Sequence = f1.Sequence
AND f2.Description = f1.Description
AND f2.Region <> f1.Region
GROUP BYf1.Name
) AS d
WHEREd.Nulls = 0
OR (d.Nulls = d.Items AND d.minRegion = d.maxRegion)
ORDER BYd.Name
N 56°04'39.16"
E 12°55'05.25"
March 26, 2008 at 11:46 am
This might eliminate the "different nut" probably; however, as before I am still not sure that Peso's pont isn't valid and you might be forced to perform the self-join. However, here is yet another try at avoiding the self-join:
select
name
from
( select
name,
sequence,
min(region) as min_region,
max(region) as max_region,
count(*) as sequence_count,
case when max(description) = min(description)
then 0 else 1 end as same_description
from @fruit
group by
name,
sequence
) x
group by name
having max(same_description) = 0
and max(sequence_count) = min(sequence_count)
and max(min_region) = min(min_region)
and max(max_region) = max(max_region)
You might be best off just taking the self join; this code is looking a bit mangy.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply