Identifying consecutive values with constraints

  • Hi guys

    Just letting you know that I did manage to write some sequential code to do this. Don't think you can simply do it in a single sql statement..so I wrote a while loop (not a cursor though) which does the processing. It's a bit slow..takes about 20seconds to process a table with about 13500 rows..but it does the job.

    Cheers

  • thusi (11/23/2008)


    Hi guys

    Just letting you know that I did manage to write some sequential code to do this. Don't think you can simply do it in a single sql statement..so I wrote a while loop (not a cursor though) which does the processing. It's a bit slow..takes about 20seconds to process a table with about 13500 rows..but it does the job.

    Cheers

    Heh... wanna bet? Post your sequential code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok, here's the update I promised, and with your full set of requirements taken into account:

    DECLARE @BP_READINGS TABLE (

    MMID varchar(10),

    MESDATE DateTime,

    SYSTOLIC int,

    DIA int

    )

    INSERT INTO @BP_READINGS (MMID, MESDATE, SYSTOLIC, DIA)

    SELECT 'A006283', '2006-01-14', 148, 80 UNION ALL

    SELECT 'A006283', '2006-01-16', 130, 88 UNION ALL

    SELECT 'A006283', '2006-02-25', 150, 80 UNION ALL

    SELECT 'A006283', '2006-06-02', 120, 75 UNION ALL

    SELECT 'A006283', '2006-08-02', 144, 96 UNION ALL

    SELECT 'A006283', '2006-10-20', 140, 80 UNION ALL

    SELECT 'A006283', '2006-10-30', 130, 80 UNION ALL

    SELECT 'A006283', '2007-02-28', 140, 80 UNION ALL

    SELECT 'A006283', '2007-06-02', 120, 78 UNION ALL

    SELECT 'A006283', '2007-08-12', 144, 96 UNION ALL

    SELECT 'A006283', '2007-10-21', 140, 80 UNION ALL

    SELECT 'A006283', '2007-11-30', 130, 80 UNION ALL

    SELECT 'A006283', '2008-02-28', 140, 80 UNION ALL

    SELECT 'M020100', '2008-04-20', 130, 90

    SELECT *

    FROM @BP_READINGS

    ORDER BY MMID, MESDATE

    DECLARE @GRP_NUM AS int

    SET @GRP_NUM = 0

    DECLARE @GROUPED_READINGS TABLE (

    SEQUENCE int,

    MMID varchar(10),

    MESDATE DateTime,

    SYSTOLIC int,

    DIA int,

    QUALIFIED bit

    )

    INSERT INTO @GROUPED_READINGS (SEQUENCE, MMID, MESDATE, SYSTOLIC, DIA, QUALIFIED)

    SELECT ROW_NUMBER() OVER (PARTITION BY MMID ORDER BY MESDATE) AS SEQUENCE, *,

    CASE

    WHEN SYSTOLIC >= 130 AND DIA >= 80 THEN 1

    ELSE 0

    END AS QUALIFIED

    FROM @BP_READINGS

    ORDER BY MMID, MESDATE

    DECLARE @QUALIFIED_READINGS TABLE (

    QRN int,

    GRPID int,

    SEQUENCE int,

    MMID varchar(10),

    MESDATE DateTime,

    SYSTOLIC int,

    DIA int

    )

    INSERT INTO @QUALIFIED_READINGS (QRN, SEQUENCE, MMID, MESDATE, SYSTOLIC, DIA)

    SELECT ROW_NUMBER() OVER (PARTITION BY MMID ORDER BY MESDATE) AS QRN, SEQUENCE, MMID,

    MESDATE, SYSTOLIC, DIA

    FROM @GROUPED_READINGS

    WHERE QUALIFIED = 1

    SELECT *

    FROM @GROUPED_READINGS

    UPDATE Q1

    SET @GRP_NUM = Q1.GRPID =

    CASE

    WHEN EXISTS (

    SELECT MMID

    FROM @QUALIFIED_READINGS AS Q2

    WHERE Q2.GRPID IS NULL AND

    Q2.MMID = Q1.MMID AND

    Q1.SEQUENCE - Q2.SEQUENCE = 1

    ) THEN @GRP_NUM

    ELSE @GRP_NUM + 1

    END

    FROM @QUALIFIED_READINGS AS Q1

    ;WITH GROUP_COUNTS AS (

    SELECT GRPID

    FROM @QUALIFIED_READINGS

    GROUP BY GRPID

    HAVING COUNT(*) > 2 AND

    DATEDIFF(d,MIN(MESDATE),MAX(MESDATE)) > 119

    )

    SELECT Q.*

    FROM @QUALIFIED_READINGS AS Q INNER JOIN GROUP_COUNTS AS G

    ON Q.GRPID = G.GRPID

    Please test on a broader range of inputs so that we can be sure it does indeed do what I think it does, and also for performance. I don't quite have the means as yet to generate a large test data set and thus little to no means of testing performance.

    Steve

    (aka smunson)

    :):):)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Ah..great. Thanks Steve! It seems to work just fine. It was taking about 14s to run on the full table I've got with ~14k records, but then I filtered out the records at the very beginning instead of running the code on the entire table - ie. a patient should have to have at least 3 BPs >130/80 for him/her to have 3 consecutives..so if you filter out by this criteria at the very top, you end up with much fewer rows..which makes the query run much faster. Thanks again for posting the update.

    Just for completeness of this thread, to answer Jeff's question, this is the seq. code I wrote:

    declare @ConsistentlyHighBPs TABLE (

    MMID mmid_var,

    MESDATE DATE,

    SYS INT,

    DIA INT,

    sequence int

    )

    DECLARE @BpTableWithMinBpMeas TABLE (

    id int identity,

    MMID mmid_var,

    MESDATE DATE,

    SYS INT,

    DIA INT

    )

    -- Test data for debugging

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-05-06', 130, 80)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-05-15', 155, 88)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-08-08', 180, 96)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-10-02', 120, 96)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-10-02', 130, 96)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2006-11-08', 132, 96)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2008-02-18', 220, 80)

    INSERT @BpTableWithMinBpMeas (MMID, MESDATE, SYS, DIA) VALUES ('A006283', '2008-03-11', 110, 96)

    declare @id int = 1, @rows int=@@rowcount

    declare @currentPatientBps int, @currentPatient mmid_var, @bpCntr int

    declare @syst int, @dias int, @highBpCnt int = 0, @highBpSeq int

    -- Process the data

    WHILE @id<=@rows

    begin

    select @currentPatient = (select mmid from @BpTableWithMinBpMeas WHERE id=@id)

    select @currentPatientBps = (select COUNT(*) from @BpTableWithMinBpMeas WHERE MMID = @currentPatient)

    set @bpCntr = 0 -- this is the BP counter for a patient

    set @highBpSeq = 0 -- a patient can have multiple episodes of high BPs, this is to keep track of them

    set @highBpCnt = 0 -- this counts the number of high BPs as required for a single episode

    WHILE @bpCntr < @currentPatientBps

    begin

    select @syst = sys, @dias = dia from (select SYS, DIA from @BpTableWithMinBpMeas WHERE id=@id+@bpCntr) as somename

    IF (@syst >= @minSysThresh AND @dias >= @minDiaThresh)

    begin

    set @highBpCnt = @highBpCnt + 1

    end

    ELSE

    begin

    if (@highBpCnt >= @minHighBpsReqd)

    begin

    set @highBpSeq = @highBpSeq + 1

    INSERT INTO @ConsistentlyHighBPs

    SELECT MMID, MESDATE, SYS, DIA, @highBpSeq

    FROM @BpTableWithMinBpMeas

    WHERE id BETWEEN @id+@bpCntr-@highBpCnt AND @id+@bpCntr-1

    end

    --print @currentPatient + ': not high: ' + cast(@syst as varchar)

    set @highBpCnt = 0

    end

    set @bpCntr = @bpCntr + 1

    end

    if (@highBpCnt >= @minHighBpsReqd)

    begin

    -- if the high BP seq was at the end (and not followed by a 'normal' BP), then that seq gets caught here

    set @highBpSeq = @highBpSeq + 1

    INSERT INTO @ConsistentlyHighBPs

    SELECT MMID, MESDATE, SYS, DIA, @highBpSeq

    FROM @BpTableWithMinBpMeas

    WHERE id BETWEEN @id+@bpCntr-@highBpCnt AND @id+@bpCntr-1

    end

    set @id = @id + @currentPatientBps

    end

    It actually runs pretty fast!

  • Glad you have it working. Just a quick point - I realized too late in the day yesterday that I left in two SELECT statements in that list out the original data as well as that intermediate table, which I had in there solely to help me test my results. You can safely remove those, which I presume you probably realized and took care of already.

    By the way, Jeff is likely to ask "how fast?"... you should know better than to tease the man... 😀

    Steve

    (aka smunson)

    :):):)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • thusi (11/25/2008)


    It actually runs pretty fast!

    Heh... "How fast?" 😀 Looks interesting... lemme take a look.

    By the way, just to establish the scale of this project... right now you have about 14k rows... what's it likely going to get to?

    Also, do you really need to do this for all patients at the same time... or just when they show up for a Doctor's appointment?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thusi... you don't have these defined in your code anywhere... what are the datatypes and starting or constant values for the following variables?

    Server: Msg 137, Level 15, State 2, Line 47

    Must declare the variable '@minSysThresh'.

    Server: Msg 137, Level 15, State 1, Line 53

    Must declare the variable '@minHighBpsReqd'.

    Server: Msg 137, Level 15, State 2, Line 53

    Must declare the variable '@minDiaThresh'.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well, we are doing retrospective quality audit..so it really depends on how far back we want to extract the data. At the moment it's a single practice with roughly 18-months data. Not really sure at the moment whether we want to keep appending to the existing DB any new data we might extract..or simply use a completely new DB for the new data. Need to figure out how to copy all my queries, tables etc if we go with the latter option though. Will worry about that then..hehe

  • Oh sorry..I used this inside a UDF so that I can easily pass in different thresholds I need. Just use:

    declare @minHighBpsReqd int = 3, @minSysThresh int = 130, @minDiaThresh int = 80

  • Jeff Moden (11/25/2008)


    Thusi... you don't have these defined in your code anywhere... what are the datatypes and starting or constant values for the following variables?

    Server: Msg 137, Level 15, State 2, Line 47

    Must declare the variable '@minSysThresh'.

    Server: Msg 137, Level 15, State 1, Line 53

    Must declare the variable '@minHighBpsReqd'.

    Server: Msg 137, Level 15, State 2, Line 53

    Must declare the variable '@minDiaThresh'.

    Never mind... I got it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply