Find which rows have a column value that is less than the previous row's column value

  • Hi Guys,

    I know this should be simple to code, but I've hit a wall trying to produce something simple that will do the following:

    I need to traverse a table and find which rows have a column value that is less than the previous row's column value.

    e.g.

    ColValue
    --------
    23
    123
    233
    32   <-- Bingo! This value is less than the previous value
    45
    124
    32   <-- Bingo! This value is less than the previous value

    and so on.

    Any ideas?

    Thanks,

    Pianoman

  • How about this?

    DECLARE @tab TABLE (ID INT, ColValue INT);

    INSERT INTO @tab (ID, ColValue)
    SELECT 1, 23
    UNION ALL SELECT 2, 123
    UNION ALL SELECT 3, 233
    UNION ALL SELECT 4, 32
    UNION ALL SELECT 5, 45
    UNION ALL SELECT 6, 124
    UNION ALL SELECT 7, 32
    ;

    WITH rCTE AS
    (
        SELECT T.ID, T.ColValue, 0 AS Indicator
        FROM @tab T
        WHERE T.ID = 1
        UNION ALL
        SELECT Base.ID, Base.ColValue,
               CASE WHEN Base.ColValue < R.ColValue THEN 1 ELSE 0 END
        FROM @tab Base
        JOIN rCTE R ON R.ID + 1 = Base.ID
    )
    SELECT *
    FROM rCTE
    WHERE Indicator = 1;

  • If you want something simpler than ColdCoffee's solution, but not nearly as efficient (according to the estimated execution plan, unless you index the join and include the return values), here is a version using the same @tab table variable...

    SELECT *
    FROM @tab A
    INNER JOIN @tab B ON A.ID = B.ID - 1
    WHERE A.ColValue > B.ColValue;

  • Thanks ColdCoffee and jeffem - appreciate the replies. Will give these suggestions a try.

    My original attempt to code this problem ended up being an ugly cursor - I knew there would be a more elegant way.

    Hope to return the favour (SQL engineering is more my forte).

    Cheers!

  • There can be a small problem with jeffem's and ColdCoffee's solutions: both assume the ID values are contiguous.

    If an ID value is missing (the most common scenario is when you delete a record and the identity value is never reused), the pairing by ID + 1 breaks down.

    For example:

    INSERT INTO @tab (ID, ColValue)
    SELECT 1, 23
    UNION ALL SELECT 2, 123
    UNION ALL SELECT 3, 233
    UNION ALL SELECT 5, 32
    UNION ALL SELECT 6, 45
    UNION ALL SELECT 7, 124
    UNION ALL SELECT 8, 32
    ;

    With that data, ColdCoffee's solution won't return any rows (the recursion stops at the gap) and jeffem's solution will only return one.

    I have the same problem and used something like this, but I'm concerned about the performance.

    DECLARE @tab TABLE (ID INT, ColValue INT);

    INSERT INTO @tab (ID, ColValue)
    SELECT 1, 23
    UNION ALL SELECT 2, 123
    UNION ALL SELECT 3, 233
    UNION ALL SELECT 5, 32
    UNION ALL SELECT 6, 45
    UNION ALL SELECT 7, 124
    UNION ALL SELECT 8, 32
    ;

    WITH rCTE AS
    (
        SELECT A.ID, A.ColValue, MAX(B.ID) AS prev_ID
        FROM @tab A
        INNER JOIN @tab B ON A.ID > B.ID
        GROUP BY A.ID, A.ColValue
    )
    SELECT *
    FROM rCTE A
    INNER JOIN @tab B ON A.prev_ID = B.ID
    WHERE A.ColValue < B.ColValue;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • A good point! It seems to me the best way to handle it might be to generate a ROW_NUMBER(), to get perfectly sequential IDs, and then use the solutions presented before. But that requires the data to be sortable exactly as the comparison is desired.

    I mainly wanted to supply the logic without getting into those details, which was a bit irresponsible. But I think ROW_NUMBER() is the best way to go.
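To make the ROW_NUMBER() idea concrete, here is a quick sketch. It runs the same logic through SQLite from Python purely as a portable cross-check (SQLite 3.25+ supports ROW_NUMBER(); the table name and the gapped sample data are taken from this thread, and the SQL translates to T-SQL almost verbatim):

```python
import sqlite3

# Sample data with a gap in the IDs (4 is missing), as in Luis's example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (ID INTEGER, ColValue INTEGER)")
con.executemany(
    "INSERT INTO tab (ID, ColValue) VALUES (?, ?)",
    [(1, 23), (2, 123), (3, 233), (5, 32), (6, 45), (7, 124), (8, 32)],
)

# Renumber with ROW_NUMBER() so the row numbers are perfectly sequential,
# then join each row to the one numbered immediately before it.
result = con.execute("""
    WITH numbered AS (
        SELECT ID, ColValue,
               ROW_NUMBER() OVER (ORDER BY ID) AS rn
        FROM tab
    )
    SELECT cur.ID, cur.ColValue
    FROM numbered AS cur
    JOIN numbered AS prev ON cur.rn = prev.rn + 1
    WHERE cur.ColValue < prev.ColValue
    ORDER BY cur.ID
""").fetchall()

print(result)  # the two "Bingo!" rows: [(5, 32), (8, 32)]
```

The gap between IDs 3 and 5 no longer matters, because the join is on the generated row numbers rather than the stored IDs.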

  • You can also do it with a subquery like this:

    DECLARE @tab TABLE (ID INT, ColValue INT);

    INSERT INTO @tab (ID, ColValue)
    SELECT 1, 23
    UNION ALL SELECT 2, 123
    UNION ALL SELECT 3, 233
    UNION ALL SELECT 5, 32
    UNION ALL SELECT 6, 45
    UNION ALL SELECT 7, 124
    UNION ALL SELECT 8, 32
    ;

    WITH CTE AS (
        SELECT ID, ColValue,
            (SELECT TOP 1 t2.ColValue
             FROM @tab t2
             WHERE t1.ID > t2.ID
             ORDER BY t2.ID DESC) AS PrevRow
        FROM @tab t1
    )
    SELECT ID, ColValue
    FROM CTE
    WHERE ColValue < PrevRow;

    (The WHERE clause excludes the first row automatically, since ColValue < NULL is never true, and it flags only strictly smaller values.)

    Just guessing, mind you, but because of the TOP clause in the subquery this should perform better than joining the table onto itself.

    Note that this solution also handles the case of missing IDs.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • Cheers Guys.

    Jeff - Yup, I thought about adding ROW_NUMBER() too, for the same reason Luis gave about missing IDs (thanks Luis).

    Dwain - Cheers for your feedback. Nice to see lovely bits of code with no cursor in sight 🙂

    Thanks,

    Pianoman

  • Dwain, good job on accounting for the missing IDs! 🙂

    Just a note about the performance: I did a comparison, and using a table variable will give a bit of a performance advantage (15% or so, for reasonable numbers of records) to your method. If this were in an actual table (temp, or otherwise) with a good, covering index, then the execution plan between our two solutions is essentially identical.

  • Try this too:

    DECLARE @tab TABLE (ID INT, ColValue INT);

    INSERT INTO @tab (ID, ColValue)
    SELECT 1, 23
    UNION ALL SELECT 2, 123
    UNION ALL SELECT 3, 233
    UNION ALL SELECT 4, 32
    UNION ALL SELECT 5, 45
    UNION ALL SELECT 6, 124
    UNION ALL SELECT 7, 32
    ;

    SELECT t1.ID, t1.ColValue
    FROM @tab t1
    LEFT JOIN @tab t2
        ON t1.ID = t2.ID + 1 AND t1.ColValue < t2.ColValue
    WHERE t2.ID IS NOT NULL;

    Should be pretty good on performance. For some reason I'm currently in love with LEFT JOINs.

  • It doesn't hurt to do it with a LEFT JOIN, but if you use an INNER JOIN there, you don't need the WHERE t2.ID IS NOT NULL filter.
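As a sketch of that point (again cross-checked through SQLite from Python, since the join logic is plain SQL; the table and data follow the thread's @tab example), the INNER JOIN's predicate does all the filtering on its own:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (ID INTEGER, ColValue INTEGER)")
con.executemany(
    "INSERT INTO tab (ID, ColValue) VALUES (?, ?)",
    [(1, 23), (2, 123), (3, 233), (4, 32), (5, 45), (6, 124), (7, 32)],
)

# INNER JOIN version: rows without a qualifying previous row simply
# drop out of the result, so no IS NOT NULL test is needed.
inner = con.execute("""
    SELECT t1.ID, t1.ColValue
    FROM tab t1
    JOIN tab t2 ON t1.ID = t2.ID + 1 AND t1.ColValue < t2.ColValue
    ORDER BY t1.ID
""").fetchall()

print(inner)  # [(4, 32), (7, 32)]
```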

  • I'm surprised it hasn't been mentioned before, but I do hope you have some way to actually define the ordering within the table. A common stumbling block is to think there is some physical order to the table (there isn't). Assuming you DO, a CTE with ROW_NUMBER() will perform fairly well, even on large sets. I'd think it would outperform something with a correlated subquery, but that will depend on the existing indexes on the correlated table.

    As an aside - if you have a healthy table (i.e. anything over a couple hundred records), you will want to avoid table variables. Table variables tend to play nasty tricks on execution plans (the estimated number of rows in a table variable stays pegged at 1), so the engine can't optimize for "large" table variables.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • jeffem (5/31/2012)


    But that requires the data to be sortable exactly as the comparison is desired

    Heh.... if you can sort exactly as the comparison is desired, you're dead no matter what. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jeffem (6/1/2012)


    Dwain, good job on accounting for the missing IDs! 🙂

    Just a note about the performance: I did a comparison, and using a table variable will give a bit of a performance advantage (15% or so, for reasonable numbers of records) to your method. If this were in an actual table (temp, or otherwise) with a good, covering index, then the execution plan between our two solutions is essentially identical.

    Can you post your comparison? I'm confused about whether you're comparing actual run results (e.g., with Jeff's 1M-row test harness) or just comparing execution plans. I'd do it myself, but I'm kinda tied up on something at the moment; if I don't see something posted, maybe I'll come back to it.

    I'm really just curious, because I've seen a subquery version like mine blast the heck out of an alternative like yours before (no comparison intended here, of course), even where the execution plans rated them about the same.

    I use table variables for convenience so I don't have temp tables littering my sandbox if I forget to drop them or something. I agree they don't always end up rating the same as using an actual table. And with good indexing, it could completely reverse the results.



  • Matt Miller (#4) (6/1/2012)


    A common stumbling block is to think there is some physical order to the table (there isn't).

    Of course there is. You just can't depend on results coming back in that physical order 🙂

    DECLARE @Example AS TABLE
    (
        ColValue integer NOT NULL
    );

    INSERT INTO @Example (ColValue)
    VALUES (23), (123), (233), (32), (45), (124), (32);

    SELECT
        e.*,
        file_num = CONVERT(integer, CONVERT(binary(2), REVERSE(SUBSTRING(%%PhysLoc%%, 5, 2)))),
        page_num = CONVERT(integer, CONVERT(binary(4), REVERSE(SUBSTRING(%%PhysLoc%%, 1, 4)))),
        slot_num = CONVERT(integer, CONVERT(binary(2), REVERSE(SUBSTRING(%%PhysLoc%%, 7, 2))))
    FROM @Example AS e
    ORDER BY file_num, page_num, slot_num
    OPTION (RECOMPILE);

    As an aside - if you have a healthy table (i.e anything over a couple hundred records), you will want to avoid table variables. Table variables tend to play nasty tricks on execution plans (estimated # of rows in a table variable stay pegged at 1), so the engine can't optimize for "large" table variables.

    Well it can a bit. See the example above (the number of estimated rows from the table variable in the actual plan is 7). If a statement-level recompile occurs (for whatever reason) the optimizer will see the run-time cardinality of the table variable. Table variables do not, however, currently support statistics (even on indexes). Where a logical query has cost-based physical execution choices that benefit from statistical information, temporary tables are a better choice. If a trivial plan is possible, creating statistics on the temporary table just adds overhead.
