MAX not working with a CTE

  • I have a query that is supposed to return one patient id but even using MAX within my CTE I'm getting multiple patient ids when it seems like I should only get the record with the highest ProcedureCodeID... any ideas? (

    WITH cte
    AS (select
    pcs.pc_desc as category
    , a.pat_id as PatientId
    , a.pc_cd as ProcedureCode
    , max(a.pc_id) as ProcedureCodeId
    from vw_claims as a
    INNER JOIN Reporting.dbo.cc200_patients as b on a.pat_id = b.pat_id
    LEFT JOIN dbo.tb_proc_codes pcs on a.pc_id = pcs.pc_id
    group by
    pcs.pc_desc
    , a.pat_id
    , a.pc_cd
    )
    select distinct
    cte.PatientId
    , ProcedureCodeId
    from cte
    order by cte.PatientId

    Code enclosed. I can provide the DDL but it would need to be simplified):

    Here's what I need:

    PatientID       ProcedureCodeID

    1               12097

    3              12200

    4             12046

    Here's what I'm getting:

    PatientID       ProcedureCodeID

    1               12097

    3               57

    3              12200

    4             12044

    4             236

    4             202

    4             12046

     

    • This topic was modified 2 years, 3 months ago by  DaveBriCam.
  • Use CROSS APPLY to return the first record

    SELECT p.PatientID...

    ca.ProcedureCodeID

    FROM Patient p

    CROSS APPLY (SELECT TOP 1 ProcedureCodeID

    FROM SomeTable t

    WHERE t.PatientID = p.PatientID

    ORDER BY <some column>) ca

  • Your CTE does not tell the server to get only the row with the max pc_id. It tells the server to get the max pc_id for each combination of pcs.pc_desc, a.pat_id,  and a.pc_cd.

    If all you want is one patient ID, I think all you need is:

    select TOP 1
        a.pat_id as PatientId
        a.pc_id as ProcedureCodeId
    from vw_claims as a
    INNER JOIN Reporting.dbo.cc200_patients as b on a.pat_id = b.pat_id
    ORDER BY
        a.pc_cd DESC

    If there is foreign key integrity -- all patients in vw_claims really exists in cc200_patients -- then you don't even need the inner join either.

    Note: There was no need to return pcs.pc_desc and a.pc_cd in the CTE since they weren't used in the final result. And therefore, LEFT JOIN dbo.tb_proc_codes pcs on a.pc_id = pcs.pc_id was also of no value since you aren't filtering with it or returning any columns from in the final result.  But perhaps you were using those to execute just the CTE part of the query to validate data w/ recognizable values?

  • Should I be placing my Cross Apply within the CTE in place of MAX or place it in the selection at the bottom?

  • I need all the patient ids returned but no duplicates, only the record that has the highest ProcedureCodeId.

  • I need all the patient ids returned but no duplicates, only the record that has the highest ProcedureCodeId.

    SELECT PatientID, MAX(ProcedureCodeID) AS MaxCodeID

    FROM t

    GROUP BY PatientID

    If you need other columns too, then you need to join that back to the original table after the group by happens

    Some even fake data would probably help a lot here.

  • If you really only need the Patient ID and Procedure Code ID then you can just use the claims view and group by patient id. or you can use the cross apply suggested by pietlinden. The cross apply might be faster, I would try both. I modified the cross apply to return the procedure code with the max id, but if you need the id, just change the column.

    select  pat_id,
    max(pc_id) as pc_id
    from vw_claims
    group by pat_id
    order by pat_id

    SELECT p.pat_id,
    ca.pc_cd
    FROM Patient as p
    CROSS APPLY ( SELECT TOP (1) pc_cd-- replace with pc_id
    FROM vw_claims as t
    WHERE t.pat_id = p.pat_id
    ORDER BY pc_id DESC) ca

    If you need the IDs so you can then join to the other tables to get the detail, then I would use a row_number. I don't know if it will run as I don't have the tables, but the principle should be valid, but only if you need other columns, such as the member name, subscriber id from the patient table and the procedure code and category etc. The left join probably isn't helping, unless you have orphan pc_ids and you want the procedure code in your results.

    select  x.PatientId,
    x.ProcedureCode,
    x.pc_id,
    x.category
    from (
    select pcs.pc_desc as category,
    a.pat_id as PatientId,
    a.pc_cd as ProcedureCode,
    a.pc_id,
    ROW_NUMBER() OVER (PARTITION BY a.pat_id ORDER BY a.pc_id DESC) AS RowNum
    from vw_claims as a
    INNER JOIN Reporting.dbo.cc200_patients as b on a.pat_id = b.pat_id
    LEFT JOIN dbo.tb_proc_codes pcs on a.pc_id = pcs.pc_id
    ) as x
    where x.RowNum = 1
  • pietlinden wrote:

    I need all the patient ids returned but no duplicates, only the record that has the highest ProcedureCodeId.

    SELECT PatientID, MAX(ProcedureCodeID) AS MaxCodeID FROM t GROUP BY PatientID

    If you need other columns too, then you need to join that back to the original table after the group by happens Some even fake data would probably help a lot here.

    I would think that using ROW_NUMBER() in a CTE would be more efficient than joining back to the original table.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • that's what I realized a little bit ago.

  • Thanks!! I got it to work using cross apply.

  • I have the query bringing back good data but for just 5,204 records it is taking over 10 minutes. The SQL I shared for solving the problem with the MAX is actually an extremely simplified sample of a much larger query below if anyone has any ideas how to improve the performance many thanks:

    select distinct p.pat_id                                                    as PatientID
    , ca.category as Category
    , p.pat_dob as DateOfBirth
    , (datediff(hour, p.pat_dob, cast(getdate() as date)) / 8766) as PatientAge
    , ca.Race as Race
    , ca.gender as Gender
    , ca.cnty_nm as County
    , ca.pc_id as ProcedureCodeId
    , ca.ProcedureCode as ProcedureCode
    , case
    when (datediff(hour, p.pat_dob, cast(getdate() as date)) / 8766) < 18 then
    'Younger than 18'
    else
    'Older than 18'
    end as AgeGroup
    , ca.FundingSource
    , ca.prv_nm as ProviderName
    , ca.PatientName
    from dbo.cc200_patients p

    cross apply

    (
    select distinct top (1)
    t.pc_id
    , case
    when t.bp_id = 1 then
    'State'
    else
    'Federal'
    end as FundingSource
    , e.dd_value as Race
    , f.dd_value as gender
    , pcs.pc_desc as category
    , d.cnty_nm
    , pcs.pc_code as ProcedureCode
    , prov.prv_nm
    , case
    when b.pat_ln is null then
    ''
    else
    ltrim(rtrim(b.pat_ln))
    end + ', ' + case
    when b.pat_fn is null then
    ''
    else
    ltrim(rtrim(b.pat_fn))
    end + ' ' + case
    when b.pat_mn is null then
    ''
    else
    ltrim(rtrim(b.pat_mn))
    end as PatientName
    from dbo.vw_claims as t
    inner join dbo.cc200_patients as b
    on t.pat_id = b.pat_id
    left join dbo.tb_proc_codes as pcs
    on t.pc_id = pcs.pc_id
    inner join dbo.tb_providers as prov
    on t.prv_id = prov.prv_id
    left join dbo.tb_dd_list_values as e
    on b.pat_race_id = e.dd_val_id
    and e.dd_list_id = 73
    left join dbo.tb_dd_list_values as f
    on b.pat_gndr_id = f.dd_val_id
    and f.dd_list_id = 36
    inner join
    (
    select row_number() over (partition by pat_id, pat_ins_id order by end_dt desc) as rownum
    , X.pat_id
    , X.cnty_id
    , X.bp_id
    from tb_pat_ins X
    where (X.eff_dt <= '2022-04-30')
    and
    (
    X.end_dt is null
    or end_dt >= '2022-04-01'
    )
    ) as uu
    on t.pat_id = uu.pat_id
    and t.bp_id = uu.bp_id
    left join dbo.tb_counties as d
    on uu.cnty_id = d.cnty_id
    where t.pat_id = p.pat_id
    and d.cnty_id = 41
    and t.clm_dos
    between '2022-04-01' and '2022-04-30'
    order by pc_id asc
    ) ca;
  • Solved: Removing the 'DISTINCT' from the very first line made the query run much faster.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply