October 28, 2009 at 10:05 am
Below I have written two options to join two tables. It represents the data I am working with: The #descriptions table has entries where sometimes field A is enough to determine the recordtype, sometimes A+B is requred, and sometimes A+B+C is required.
Is a more involved join better than a bunch of unions?
I'm assuming so, but wanted to be sure. We can assume there are indexes on the tables.
Anything I'm missing to make this more efficient?
create table #largedataset (
a int,
b int,
c int,
manyotherfields int identity)
insert into #largedataset (a, b, c) values (1, 1, 1)
insert into #largedataset (a, b, c) values (1, 1, 2)
insert into #largedataset (a, b, c) values (1, 1, 3)
insert into #largedataset (a, b, c) values (1, 2, 1)
insert into #largedataset (a, b, c) values (1, 2, 2)
insert into #largedataset (a, b, c) values (1, 2, 3)
insert into #largedataset (a, b, c) values (1, 3, 1)
insert into #largedataset (a, b, c) values (1, 3, 2)
insert into #largedataset (a, b, c) values (1, 3, 3)
insert into #largedataset (a, b, c) values (2, 1, 1)
insert into #largedataset (a, b, c) values (2, 1, 2)
insert into #largedataset (a, b, c) values (2, 1, 3)
insert into #largedataset (a, b, c) values (2, 2, 1)
insert into #largedataset (a, b, c) values (2, 2, 2)
insert into #largedataset (a, b, c) values (2, 2, 3)
insert into #largedataset (a, b, c) values (2, 3, 1)
insert into #largedataset (a, b, c) values (2, 3, 2)
insert into #largedataset (a, b, c) values (2, 3, 3)
insert into #largedataset (a, b, c) values (3, 1, 1)
insert into #largedataset (a, b, c) values (3, 1, 2)
insert into #largedataset (a, b, c) values (3, 1, 3)
insert into #largedataset (a, b, c) values (3, 2, 1)
insert into #largedataset (a, b, c) values (3, 2, 2)
insert into #largedataset (a, b, c) values (3, 2, 3)
insert into #largedataset (a, b, c) values (3, 3, 1)
insert into #largedataset (a, b, c) values (3, 3, 2)
insert into #largedataset (a, b, c) values (3, 3, 3)
create table #descriptions (
a int,
b int,
c int,
Info varchar(100))
insert into #descriptions values (1, null, null, 'Defintion for 1')
insert into #descriptions values (2, 1, null, 'Defintion for 2, 1')
insert into #descriptions values (2, 2, null, 'Defintion for 2, 2')
insert into #descriptions values (2, 3, 1, 'Defintion for 2, 3, 1')
insert into #descriptions values (2, 3, 2, 'Defintion for 2, 3, 2')
insert into #descriptions values (2, 3, 3, 'Defintion for 2, 3, 3')
insert into #descriptions values (3, null, null, 'Defintion for 3')
-- option 1
select ds.*, d.Info
from #largedataset ds
inner join #descriptions d
on
ds.a = d.a and
(d.b is null or ds.b = d.b) and
(d.c is null or ds.c = d.c)
-- option 2
select ds.*, d.Info
from #largedataset ds
inner join #descriptions d
on
ds.a = d.a
where d.b is null
union all
select ds.*, d.Info
from #largedataset ds
inner join #descriptions d
on
ds.a = d.a and
ds.b = d.b
where d.c is null
union all
select ds.*, d.Info
from #largedataset ds
inner join #descriptions d
on
ds.a = d.a and
ds.b = d.b and
ds.c = d.c
October 28, 2009 at 10:25 am
I would try it both ways and see what is faster. Indexes would change the efficiency of both queries. I like the first one based on readability.
October 28, 2009 at 11:30 am
I would agree, try them both. There are a lot of factors, that make it difficult to tell which would be better from a 10,000 foot view.
Use the one that performs the best in the real world..
CEWII
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply