Group By Price Ranges

  • rbarryyoung (6/18/2008)


    OK, fair enough, I can see that is was not caught up to the changes that you were making.

    To answer your question then: from a performance standpoint, 2 different version is preferred. If you want to consolidate the logic however, you can also apply your IS NULL technique to the join conditions:

    FROM item

    INNER JOIN tblPriceRanges

    ON (@priceRange IS NULL)

    OR (RangeOrder = @priceRange AND price BETWEEN RangeFrom AND RangeTo)

    Now why didn't I think of that??? Works perfectly! Its a great little technique, not sure where I learnt it, but it works nicely...just don't let the SSMS designer get its grubby mits on it, whilst what it changes works, it leaves it totally unreadable.

    Thanks again!

    Steve

  • I spoke too soon, it doesn't work perfectly, each row is repeated for each price range, so I am getting 6 times as many rows as I expected when @PriceRange IS NULL.

  • Dooza (6/18/2008)


    I spoke too soon, it doesn't work perfectly, each row is repeated for each price range, so I am getting 6 times as many rows as I expected when @PriceRange IS NULL.

    But this does work:

    FROM item

    INNER JOIN tblPriceRanges

    ON (@priceRange IS NULL

    OR RangeOrder = @priceRange) AND (price BETWEEN RangeFrom AND RangeTo)

  • FROM item I

    WHERE @priceRange IS NULL

    OR EXISTS (select 1 from tblPriceRanges R

    where R.RangeOrder = @priceRange

    AND I.price BETWEEN R.RangeFrom AND R.RangeTo

    )

    _____________
    Code for TallyGenerator

  • I wholeheartedly agree with using a table to hold the cutoff values. The slight performance hit is well worth the flexibility of the design. However, there are two flaws in the design. One is that if a price is exactly one of the cutoff values, it will get counted twice. A price of, say, £50 falls into both the £0 - £50 range and the £50 - £100 range. Also, the table can be simplified. If the lower range should change to £0 - £75, there are two rows that need to be changed. This could lead to synchronization problems.

    My cutoff table is just one column:

    CREATE TABLE dbo.Cutoffs(

    Limit money NOT NULL PRIMARY KEY CLUSTERED

    );

    insert dbo.Cutoffs(Limit)

    select 0 union all

    select 50 union all

    select 100 union all

    select 500 union all

    select 1000;

    This table cannot be messed up. Update the 50 to 75 to adjust a range or insert the value 75 to create a new range. A check constraint to make sure the value can't be negative is the only improvement I can think of right now.

    To make this into a comparison range:

    select c.Limit as LL, IsNull( c2.Limit, 1000000 ) as UL

    from dbo.Cutoffs c

    left join dbo.Cutoffs c2

    on c.Limit < c2.Limit

    and c2.Limit = (

    select Min( Limit )

    from dbo.Cutoffs

    where Limit > c.Limit

    )

    This may seem terribly complicated, a self-join and subquery, but I have used this pattern many times and on very large tables with quite impressive results (as long as the table is properly indexed -- which, in this case, it is). I use views to simplify the query but I'm leaving it all in here for illustration.

    Then the final query just falls into place:

    select Cutoff.LL as [From], Cutoff.UL as [Up To], Count(*) as [# In Range]

    from dbo.Prices p

    join (

    select c.Limit as LL, IsNull( c2.Limit, 1000000 ) as UL

    from dbo.Cutoffs c

    left join dbo.Cutoffs c2

    on c.Limit < c2.Limit

    and c2.Limit = (

    select Min( Limit )

    from dbo.Cutoffs

    where Limit > c.Limit

    )

    ) Cutoff

    on p.Price >= Cutoff.LL and p.Price < Cutoff.UL

    group by Cutoff.LL, Cutoff.UL

    order by Cutoff.LL;

    Notice the comparison is '>= and <' instead of 'between' to avoid double counting.

    (The wink 😉 in the code above is just a right parenthesis/bracket. Don't know how to make it stop doing that...)

    I created a Prices table, loaded it with 32768 random prices (but made sure there were prices right on the boundaries: 50, 100, etc.) and noticed almost no pause between pressing F5 and the result -- the timer on the status bar indicated 00:00:00. Only the price column is used in whatever table you have, so it should work anywhere. If you run into a performance issue, I would be interested in seeing the details.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply