I have a string_agg line in my code that is bringing back hundreds of duplicates in my service_cd field. Any ideas? Thx!:
mbr_name codes
abc 971, 971, 971, 971..............
xyz 978, 978, 978, 978..............
select distinct
s.MBR_NAME
, string_agg(cast(s.SERVICE_CD as varchar(max)), ', ') as codes
from MyReporting.dbo.EPISODE as s
where s.EPISODE_TYPE in ( x2, x3 )
group by s.MBR_NAME
, s.SERVICE_CD
December 19, 2023 at 3:22 pm
Your DISTINCT is applied to the result of the STRING_AGG, not to the elements it contains.
Remove the dupes in a CTE and select your STRING_AGG from that.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 19, 2023 at 3:25 pm
What is the datatype of SERVICE_CD? CASTing every element to VARCHAR(MAX) sounds like madness.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 19, 2023 at 3:30 pm
It's varchar(15). Casting as varchar(max) was the seemingly only way to avoid this error:
"STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation."
December 19, 2023 at 3:32 pm
It's varchar(15). Casting as varchar(max) was the seemingly only way to avoid this error:
"STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation."
After getting rid of the dupes, hopefully the error goes away without the need for a CAST.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 19, 2023 at 3:49 pm
Maybe like this?:
select
s.MBR_NAME
, string_agg(cast(s.SERVICE_CD as varchar(max)), ', ') as codes
from (
select distinct MBR_NAME, SERVICE_CD
FROM MyReporting.dbo.EPISODE
where EPISODE_TYPE in ( x2, x3 )
) as s
group by s.MBR_NAME
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 19, 2023 at 5:20 pm
Thx! I prefer, for now, to look for a subquery since my *real* SQL already has one CTE (I've used 20+ before). I should not have oversimplified my code, but below is what I'm really up against. I tried the below and the duplicates persist
declare @startdate date = '2023-08-01';
declare @enddate date = '2023-08-31';
with CTE_SARids
as -- this is for collecting all the 'SAR' numbers for a date range along with the patients
(select distinct
DH.ENC_IDN
, DH.MBR_IDN
, DH.HIST_ADDED_DATE
from MyReporting.dbo.V_MODEL_EPISODE_DECISION_HIST as DH
where DH.HIST_ADDED_DATE
between @startdate and @enddate
group by DH.ENC_IDN
, DH.MBR_IDN
, DH.HIST_ADDED_DATE)
-- Bringing back rest of needed data
select SAR.ENC_IDN as 'SAR ID'
, M.INDIVIDUALIZATION_NUMBER as 'Patient ID'
, M.MBR_NAME as 'Patient Name'
, DH.CURRENT_DECISION as 'Status'
, string_agg(cast(S.SERVICE_CD as varchar(max)), ', ') as 'PC Codes'
, convert(date, DH.HIST_ADDED_DATE, 101) as 'Initial Submission Date'
, convert(date, DH.LAST_CHANGED_DATE, 101) as 'Effective Date'
from CTE_SARids as SAR
inner join MyReporting.dbo.V_MODEL_EPISODES as e
on SAR.ENC_IDN = e.ENC_IDN
inner join
(
select distinct
SERVICE_CD
, ENC_IDN
, SVC_EXTN_IDN
from MyReporting.dbo.V_MODEL_EPISODE_SERVICES
) as S
on S.ENC_IDN = e.ENC_IDN
inner join MyReporting.dbo.V_MODEL_EPISODE_PROVIDERS as P
on P.ENC_IDN = e.ENC_IDN
inner join MyReporting.dbo.V_MODEL_AUTH_SERVICE_EXTN_HIST as EH
on EH.ENC_IDN = S.ENC_IDN
and EH.SVC_EXTN_IDN = S.SVC_EXTN_IDN
inner join MyReporting.dbo.V_MODEL_EPISODE_DECISION_HIST as DH
on DH.ENC_IDN = S.ENC_IDN
and DH.SVC_EXTN_IDN = S.SVC_EXTN_IDN
inner join MyReporting.dbo.V_MODEL_EPISODE_COVERAGE as EC
on EC.ENC_IDN = S.ENC_IDN
inner join MyReporting.dbo.V_MODEL_MEMBERS as M
on M.MBR_IDN = e.MBR_IDN
inner join JivaReporting.dbo.V_MODEL_MBR_COVERAGE as MC
on MC.MBR_IDN = e.MBR_IDN
and MC.ID_TYPE_CD = 'alt'
where e.EPISODE_TYPE in ( 'Behavioral Health Inpatient', 'Behavioral Health Outpatient', 'Inpatient', 'OutPatient' )
and cast(DH.UPDATED_DATE as date)
between @startdate and @enddate
and cast(e.RECEIVED_DATE as date) > '3/31/2023'
and DH.CURRENT_DECISION <> '-'
and DH.CURRENT_DECISION in ( 'Approved', 'Denied' )
and S.SERVICE_CD in ( '97151', '97152', '97153', '97154', '97155', '97156', '97157' )
group by M.INDIVIDUALIZATION_NUMBER
, M.MBR_NAME
, SAR.ENC_IDN
, DH.HIST_ADDED_DATE
, EC.CLIENT_NAME
, DH.LAST_CHANGED_DATE
, DH.CURRENT_DECISION
, S.SERVICE_CD
, SAR.ENC_IDN;
:
December 19, 2023 at 8:30 pm
Wow, don't be offended, but what a mess! It might be time to think about a divide and conquer approach. Build up your result sets individually and then assemble them at the end to output what you need, as this will be much easier to debug.
Also, in the CTE, you are doing a SELECT DISTINCT and a GROUP BY, on the same columns. Remove the GROUP BY and you'll get the same results.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
select SAR.ENC_IDN as 'SAR ID'
, M.INDIVIDUALIZATION_NUMBER as 'Patient ID'
, M.MBR_NAME as 'Patient Name'
, DH.CURRENT_DECISION as 'Status'
, S.SERVICE_CDS as 'PC Codes' --<<--
, convert(date, DH.HIST_ADDED_DATE, 101) as 'Initial Submission Date'
, convert(date, DH.LAST_CHANGED_DATE, 101) as 'Effective Date'
from ...
inner join --<<--
(
select ENC_IDN,
string_agg(SERVICE_CD, ', ') AS SERVICE_CDS
from
(
select distinct
SERVICE_CD
, ENC_IDN
from MyReporting.dbo.V_MODEL_EPISODE_SERVICES
where S.SERVICE_CD in ( '97151', '97152', '97153', '97154', '97155', '97156', '97157' )
) as S0
group by ENC_IDN
) as S
on S.ENC_IDN = e.ENC_IDN
...
/* shouldn't need nor want this
group by M.INDIVIDUALIZATION_NUMBER --<<--
, M.MBR_NAME
, SAR.ENC_IDN
, DH.HIST_ADDED_DATE
, EC.CLIENT_NAME
, DH.LAST_CHANGED_DATE
, DH.CURRENT_DECISION
, S.SERVICE_CD
, SAR.ENC_IDN;
*/
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 20, 2023 at 2:10 pm
Thanks Scott! That worked
December 20, 2023 at 3:42 pm
Great. Glad it helped, and that the abbreviated way I posted it made sense.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 6, 2024 at 3:02 pm
I have another pesky string_agg for which I'm trying to follow the example above but not getting the results like a field that has the aggregate like: "All, MH":
SELECT DISTINCT
clm.pat_id AS 'Consumer #',
TRIM(pat.pat_ln) AS 'LastName',
TRIM(pat.pat_fn) AS 'FirstName',
TRIM(pat.pat_mn) AS 'MiddleName',
ISNULL(
(
SELECT ins_desc
FROM AlphaMCS_shc.dbo.tb_insurances
WHERE clm.bp_id = ins_id
),
''
) AS 'Benefit Plan',
CAST(pat.pat_dob AS DATE) AS 'DOB',
cty.cnty_nm AS 'County',
ad.state AS 'State',
s.Diagnosis_Group AS 'DiagnosisGroup',
pcodes.pc_desc AS 'ServiceDescription',
clm.prv_id AS 'ProviderNK',
sit.site_id AS 'SiteProviderID1',
sit.site_nm AS 'SiteProviderName1',
CAST(clm.clm_dos AS DATE) AS 'Date Of Service'
FROM AlphaMCS_shc.dbo.vw_claims clm
INNER JOIN AlphaMCS_shc.dbo.tb_patients pat
ON clm.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_pat_addresses AS ad
ON ad.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_pat_diags AS pd
ON pd.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_diag_to_diag_groups AS ddg
ON ddg.dx_id = pd.dx_id
INNER JOIN AlphaMCS_shc.dbo.tb_diag_groups AS dg
ON dg.dx_grp_id = ddg.dx_grp_id
INNER JOIN AlphaMCS_shc.dbo.tb_counties CTY
ON cty.cnty_id = ad.cnty_id
INNER JOIN AlphaMCS_shc.dbo.tb_providers prv
ON prv.prv_id = clm.prv_id
INNER JOIN AlphaMCS_shc.dbo.tb_sites sit
ON sit.site_id = clm.site_id
INNER JOIN AlphaMCS_shc.dbo.tb_proc_codes pcodes
ON clm.pc_id = pcodes.pc_id
INNER JOIN
(
select dx_grp_id,
string_agg(dx_grp_desc, ', ') AS Diagnosis_Group
from
(
select distinct dx_grp_id,
dx_grp_desc
from AlphaMCS_shc.dbo.tb_diag_groups
WHERE dx_grp_desc IN ('All','AO','IDD','MH','SA','MH High End','MED')
) as S0
group BY dx_grp_id
) as S
on S.dx_grp_id = dg.dx_grp_id
WHERE clm.clm_dos
BETWEEN '2023-12-01' AND '2024-01-31'
AND clm.rev_code = '0911'
ORDER BY clm.pat_id, 'Date Of Service'
September 6, 2024 at 7:02 pm
Got it to work with some CTE acrobatics:
WITH First_CTE AS (
SELECT DISTINCT
clm.pat_id,
pat.pat_ln,
pat.pat_fn,
pat.pat_mn,
ISNULL(
(
SELECT ins_desc
FROM AlphaMCS_shc.dbo.tb_insurances
WHERE clm.bp_id = ins_id
),
''
) AS 'Benefit_Plan',
pat.pat_dob,
cty.cnty_nm,
ad.state,
dg.dx_grp_desc ,
pcodes.pc_desc,
clm.prv_id,
sit.site_id,
sit.site_nm,
clm.clm_dos
FROM AlphaMCS_shc.dbo.vw_claims clm
INNER JOIN AlphaMCS_shc.dbo.tb_patients pat
ON clm.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_pat_addresses AS ad
ON ad.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_pat_diags AS pd
ON pd.pat_id = pat.pat_id
INNER JOIN AlphaMCS_shc.dbo.tb_diag_to_diag_groups AS ddg
ON ddg.dx_id = pd.dx_id
INNER JOIN AlphaMCS_shc.dbo.tb_diag_groups AS dg
ON dg.dx_grp_id = ddg.dx_grp_id
INNER JOIN AlphaMCS_shc.dbo.tb_counties CTY
ON cty.cnty_id = ad.cnty_id
INNER JOIN AlphaMCS_shc.dbo.tb_providers prv
ON prv.prv_id = clm.prv_id
INNER JOIN AlphaMCS_shc.dbo.tb_sites sit
ON sit.site_id = clm.site_id
INNER JOIN AlphaMCS_shc.dbo.tb_proc_codes pcodes
ON clm.pc_id = pcodes.pc_id
WHERE clm.clm_dos
BETWEEN '2023-12-01' AND '2024-01-31'
AND clm.rev_code = '0911'
)
,Second_CTE
AS
(
SELECT
*
FROM First_CTE
GROUP BY
pat_id,
pat_ln,
pat_fn,
pat_mn,
Benefit_Plan,
pat_dob,
cnty_nm,
state,
dx_grp_desc,
pc_desc,
prv_id,
site_id,
site_nm,
clm_dos
)
SELECT
pat_id AS 'Consumer #',
TRIM(pat_ln) AS 'LastName',
TRIM(pat_fn) AS 'FirstName',
TRIM(pat_mn) AS 'MiddleName',
Benefit_Plan AS 'Benefit Plan',
CAST(pat_dob AS DATE) AS 'DOB',
cnty_nm AS 'County',
state AS 'State',
string_agg(dx_grp_desc, ', ') AS 'DiagnosisGroup',
pc_desc AS 'ServiceDescription',
prv_id AS 'ProviderNK',
site_id AS 'SiteProviderID1',
site_nm AS 'SiteProviderName1',
CAST(clm_dos AS DATE) AS 'Date Of Service'
FROM Second_CTE
GROUP BY
pat_id,
pat_ln,
pat_fn,
pat_mn,
Benefit_Plan,
pat_dob,
cnty_nm,
state,
pc_desc,
prv_id,
site_id,
site_nm,
clm_dos
ORDER BY [Consumer #]
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply