Efficiency question

  • Below I have written two options to join two tables. It represents the data I am working with: The #descriptions table has entries where sometimes field A is enough to determine the recordtype, sometimes A+B is requred, and sometimes A+B+C is required.

    Is a more involved join better than a bunch of unions?

    I'm assuming so, but wanted to be sure. We can assume there are indexes on the tables.

    Anything I'm missing to make this more efficient?

    create table #largedataset (

    a int,

    b int,

    c int,

    manyotherfields int identity)

    insert into #largedataset (a, b, c) values (1, 1, 1)

    insert into #largedataset (a, b, c) values (1, 1, 2)

    insert into #largedataset (a, b, c) values (1, 1, 3)

    insert into #largedataset (a, b, c) values (1, 2, 1)

    insert into #largedataset (a, b, c) values (1, 2, 2)

    insert into #largedataset (a, b, c) values (1, 2, 3)

    insert into #largedataset (a, b, c) values (1, 3, 1)

    insert into #largedataset (a, b, c) values (1, 3, 2)

    insert into #largedataset (a, b, c) values (1, 3, 3)

    insert into #largedataset (a, b, c) values (2, 1, 1)

    insert into #largedataset (a, b, c) values (2, 1, 2)

    insert into #largedataset (a, b, c) values (2, 1, 3)

    insert into #largedataset (a, b, c) values (2, 2, 1)

    insert into #largedataset (a, b, c) values (2, 2, 2)

    insert into #largedataset (a, b, c) values (2, 2, 3)

    insert into #largedataset (a, b, c) values (2, 3, 1)

    insert into #largedataset (a, b, c) values (2, 3, 2)

    insert into #largedataset (a, b, c) values (2, 3, 3)

    insert into #largedataset (a, b, c) values (3, 1, 1)

    insert into #largedataset (a, b, c) values (3, 1, 2)

    insert into #largedataset (a, b, c) values (3, 1, 3)

    insert into #largedataset (a, b, c) values (3, 2, 1)

    insert into #largedataset (a, b, c) values (3, 2, 2)

    insert into #largedataset (a, b, c) values (3, 2, 3)

    insert into #largedataset (a, b, c) values (3, 3, 1)

    insert into #largedataset (a, b, c) values (3, 3, 2)

    insert into #largedataset (a, b, c) values (3, 3, 3)

    create table #descriptions (

    a int,

    b int,

    c int,

    Info varchar(100))

    insert into #descriptions values (1, null, null, 'Defintion for 1')

    insert into #descriptions values (2, 1, null, 'Defintion for 2, 1')

    insert into #descriptions values (2, 2, null, 'Defintion for 2, 2')

    insert into #descriptions values (2, 3, 1, 'Defintion for 2, 3, 1')

    insert into #descriptions values (2, 3, 2, 'Defintion for 2, 3, 2')

    insert into #descriptions values (2, 3, 3, 'Defintion for 2, 3, 3')

    insert into #descriptions values (3, null, null, 'Defintion for 3')

    -- option 1

    select ds.*, d.Info

    from #largedataset ds

    inner join #descriptions d

    on

    ds.a = d.a and

    (d.b is null or ds.b = d.b) and

    (d.c is null or ds.c = d.c)

    -- option 2

    select ds.*, d.Info

    from #largedataset ds

    inner join #descriptions d

    on

    ds.a = d.a

    where d.b is null

    union all

    select ds.*, d.Info

    from #largedataset ds

    inner join #descriptions d

    on

    ds.a = d.a and

    ds.b = d.b

    where d.c is null

    union all

    select ds.*, d.Info

    from #largedataset ds

    inner join #descriptions d

    on

    ds.a = d.a and

    ds.b = d.b and

    ds.c = d.c

  • I would try it both ways and see what is faster. Indexes would change the efficiency of both queries. I like the first one based on readability.

  • I would agree, try them both. There are a lot of factors, that make it difficult to tell which would be better from a 10,000 foot view.

    Use the one that performs the best in the real world..

    CEWII

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply