May 26, 2022 at 3:59 pm
I have a table of patients that has often multiple occurrences of a patient number but with differing dates of which I need to return only the latest date but I'm unsure if I need a subquery, in short I'm stuck as to what might work. Here is my table and some sample data.
The below gives output of:
123548, '2022-01-17'
123548, '2022-02-21'
985254, '2022-03-19'
774589, '2022-01-17'
754512, '2022-04-04'
754512, '2022-02-09'
But I need the output to be:
123548, '2022-02-21'
754512, '2022-04-04'
create table [dbo].[cases](
[case_id] [int] identity(1,1) not null,
[pat_id] [bigint] not null,
[eff_dt] [datetime] null
INSERT INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (123548, '2022-01-17')
INSERT INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (123548, '2022-02-21')
INSERT INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (985254, '2022-03-19')
INSERT INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (774589, '2022-01-17')
INSERT INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (754512, '2022-04-04')
insert INTO [dbo].[cases] ([pat_id],[eff_dt]) VALUES (754512, '2022-02-09')
select pat_id, eff_dt
from dbo.cases
May 26, 2022 at 4:09 pm
The standard approach for this is to use a CTE and ROW_NUMBER
to get the "top 1" in each group. You can also use a windowed COUNT
to allow you to filter to patients with multiple rows:
WITH CTE AS(
SELECT pat_id,
eff_dt,
COUNT(pat_id) OVER (PARTITION BY pat_id) AS C,
ROW_NUMBER() OVER (PARTITION BY pat_id ORDER BY eff_dt DESC) AS RN
FROM dbo.cases)
SELECT pat_id,
eff_dt
FROM CTE
WHERE C > 1
AND RN = 1;
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
May 26, 2022 at 4:24 pm
That works perfectly, many thanks!
May 26, 2022 at 4:55 pm
One question, what if I have some dates as NULL values and want them to return?
May 26, 2022 at 5:28 pm
One question, what if I have some dates as NULL values and want them to return?
The null date is ordered after the valid dates (in descending order) so will not be returned with row_number = 1 unless there are no other dates. You can use isnull(eff_dt, '1900-0101') if you really wanted to make sure, but applying functions like this could degrade performance.
The code excludes patients with only one row, so if the patient has one row and the eff_dt is null, they will be dropped along with any other patients with a single eff_dt. Your desired output implied that's what you wanted, but if that is not the case you could drop the C > 1 line.
If you have a patient with two rows and both have null eff_dt, or there are duplicate max eff_dt values, one of them will be returned, but the result will be non-deterministic. In this case I would order the row_number by eff_dt desc, case_id desc so that the same rows are always returned. In this example it doesn't matter, but if dupes are possible and you needed to return the case_id of the most recent eff_dt so you could join it to other tables etc, making it deterministic would be a good idea.
May 26, 2022 at 7:06 pm
Sorry, I was wrong with what my needed output should be, I needed this, in other words keeping those with just one date:
123548, '2022-02-21'
985254, '2022-03-19'
774589, '2022-01-17'
754512, '2022-04-04'
May 27, 2022 at 8:20 am
Sorry, I was wrong with what my needed output should be, I needed this, in other words keeping those with just one date:
123548, '2022-02-21'
985254, '2022-03-19'
774589, '2022-01-17'
754512, '2022-04-04'
All the information you need to get this answer is still in my code before; you just need to now not just return rows where there are more than 1 instance of a value.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
July 13, 2023 at 6:59 pm
I thought it best to add onto an existing thread since the issue is very similar with a new challenge. I need to return all the insurance numbers that have two different people listed such as:
pat_ID ins_num
111807 100449401H
461261 100449401H
106820 101159201B
35599 101203104H
85116 101859902J
445903 103703602G
199156 103839502A
342322 104206501C
457017 105117401F
98573 105450402E
Note that what would need to be returned would be only the number '100449401H'
Below is some code I hoped would work but too many records are being returned:
with CTE
as (select i.pat_id
, i.ins_num
, count(i.pat_id) over (partition by i.pat_id) as C
, row_number() over (partition by i.ins_num order by i.ins_num desc) as RN
from dbo.tb_pat_ins i
)
select pat_id
, ins_num
from CTE
where C > 1
and RN = 1;
July 13, 2023 at 7:13 pm
Does this work?
DROP TABLE IF EXISTS #tb_pat_ins
CREATE TABLE #tb_pat_ins
( pat_ID INT,
ins_num VARCHAR(20)
)
INSERT #tb_pat_ins (pat_ID,ins_num)
VALUES (111807 , '100449401H'),
(461261, '100449401H'),
(106820, '101159201B'),
(35599, '101203104H'),
(85116,' 101859902J'),
(445903, '103703602G'),
(199156, '103839502A'),
(342322, '104206501C'),
(457017, '105117401F'),
(98573, '105450402E')
SELECT i.ins_num
FROM #tb_pat_ins i
GROUP BY i.ins_num
HAVING COUNT(DISTINCT pat_ID) > 1
July 17, 2023 at 1:01 pm
That works until I try to add additional fields to be returned:
select distinct
i.ins_num
from dbo.tb_pat_ins i
where i.active = 1
and i.bp_id <> 1
group by i.ins_num
having count(distinct pat_id) > 1
order by i.ins_num desc; -- returns 442 records
select distinct
i.ins_num
, i.pat_id
, i.bp_id
, i.eff_dt
, i.end_dt
, i.card_ln
, i.card_fn
, i.card_mn
from dbo.tb_pat_ins i
where i.active = 1
and i.bp_id <> 1
group by i.ins_num
, i.pat_id
, i.bp_id
, i.eff_dt
, i.end_dt
, i.card_ln
, i.card_fn
, i.card_mn
having count(distinct pat_id) > 1
order by i.ins_num desc; -- returns no records
July 17, 2023 at 3:06 pm
Identify the ins_nums, then join/semi join back to the table.
SELECT i.ins_num
, i.pat_id
, i.bp_id
, i.eff_dt
, i.end_dt
, i.card_ln
, i.card_fn
, i.card_mn
from dbo.tb_pat_ins as i
where i.active = 1
and i.bp_id <> 1
and exists (select 1
from dbo.tb_pat_ins as i2
where i2.ins_num = i.imsnum
and i2.active = 1
and i2.bp_id <> 1
group by i2.ins_num
having count(i2.distinct pat_id) > 1
)
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply