October 20, 2011 at 8:45 am
peterzeke (10/20/2011)
Makes sense to create a mapping table that can be leveraged for performance when doing reporting, but... doesn't the act of building such a mapping/crosswalk table return us back to my initial inquiry (query) of how to join the tables to get the correct mapping in the first place without performance totally dragging even for adding to the mapping table? At the very least, limiting the rows to evaluate being based on the latest insert_date would trim down the pool of data to be considered for appends to the mapping table, but the join itself would still be "flimsy" if charindex() or LIKE% is employed.For example, if the mapping table has three variations for "Citalopram", and a nightly run identifies a fourth variation of this drug, what technique/join is employed to identify this new drug description in the first place? It's simple enough to use an outer join to identify the new drug names, but somehow those new drug names need to map back to a key/primary drug description.
Lastly, before I forget to say it, thanks to all of your input and participation on this topic.
As Phil shared earlier your mapping table would have 2 columns, one for the incoming name and one for the common name. So your process would do a "lookup" on the mapping table and if a match is not found you'd insert that row with a null for the common name. Unfortunately, this is where the manual intervention comes in, you'd have to look at the mapping table for NULL in the common name column and manually update them. Then you'd have to re-run the process for the ones that you didn't map.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
October 20, 2011 at 8:52 am
This should give you some ideas, Peter.
SELECT
DrugNameString = c.DrugName,
DrugName = x.Item
FROM (
SELECT ClaimID = 123, DrugName = 'Oral Solution Citalopram HBR' UNION ALL
SELECT 124, 'CITALOPRAM TABLETS' UNION ALL
SELECT 125, 'Paracetamol TABLETS' UNION ALL
SELECT 126, 'Paracetamol solution' UNION ALL
SELECT 127, 'Paracetamol' UNION ALL
SELECT 128, 'Tab, Citalopram'
) c
CROSS APPLY(SELECT DrugString = REPLACE(c.DrugName, ' ', ',')) cl -- for clarity - avoids multiple REPLACE in below APPLY
CROSS APPLY (
SELECT
Item = CAST(LEFT(cl.DrugString, ISNULL(NULLIF(CHARINDEX(',', cl.DrugString, 1),0)-1,8000)) AS VARCHAR(8000))
UNION ALL
SELECT
Item = SUBSTRING(cl.DrugString,n,ISNULL(NULLIF(CHARINDEX(',',cl.DrugString,n),0)-n,8000))
FROM (
SELECT n = n+1
FROM (
SELECT TOP (1+ISNULL(DATALENGTH(cl.DrugString),0))
n = (n1 + n2 + n3 + n4)
FROM (((VALUES (0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) t3 (n3)
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t1 (n1))
CROSS JOIN (VALUES (0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) t2 (n2))
CROSS JOIN (VALUES (0),(1000),(2000),(3000),(4000),(5000),(6000),(7000),(8000),(9000)) t4 (n4) --
) n
WHERE SUBSTRING(cl.DrugString,n,1) = ','
) d
) x (Item)
INNER JOIN (SELECT DrugName = 'CITALOPRAM' UNION ALL SELECT 'Paracetamol') rft_DrugXrefDrugClass
ON rft_DrugXrefDrugClass.DrugName = x.Item
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 20, 2011 at 10:00 am
Chris -- interesting query. Looks like it's for SQL 2008?
So, if I understand it, the CROSS APPLY components transform a drug description into a comma-delimited array, which is then pivoted vertically to inner join the parsed values against the rft table values. The deeply nested derived table of "n" values is another form of a tally table, it seems -- clever.
I'll need to tweek the query to work in SQL 2005, but certainly promising.
Without turning this thread into a SQL 2008 inquiry, I have a couple quick questions:
1) what exactly does "t3 (n3)" and "t1 (n1)" ... and so on doing? Not familiar with parens () after an alias. Does the "t1" alias the "VALUES" as a derived table, and (n1) as a field name within the "VALUES"?
2) Similar to the first question, what's "x (Item)" at the end of the second CROSS APPLY all about?
Thanks
October 20, 2011 at 10:13 am
Hi Peter, I'm running out of time today but here's some notes.
The entire cross apply bit can - and should - be replaced by a string splitter function. This is the bit I mean:
CROSS APPLY (
SELECT
Item = CAST(LEFT(cl.DrugString, ISNULL(NULLIF(CHARINDEX(',', cl.DrugString, 1),0)-1,8000)) AS VARCHAR(8000))
UNION ALL
SELECT
Item = SUBSTRING(cl.DrugString,n,ISNULL(NULLIF(CHARINDEX(',',cl.DrugString,n),0)-n,8000))
FROM (
SELECT n = n+1
FROM (
SELECT TOP (1+ISNULL(DATALENGTH(cl.DrugString),0))
n = (n1 + n2 + n3 + n4)
FROM (((VALUES (0),(100),(200),(300),(400),(500),(600),(700),(800),(900)) t3 (n3)
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t1 (n1))
CROSS JOIN (VALUES (0),(10),(20),(30),(40),(50),(60),(70),(80),(90)) t2 (n2))
CROSS JOIN (VALUES (0),(1000),(2000),(3000),(4000),(5000),(6000),(7000),(8000),(9000)) t4 (n4) --
) n
WHERE SUBSTRING(cl.DrugString,n,1) = ','
) d
) x (Item)
And the reason? Firstly, it will drastically reduce the complexity of your code. Secondly, I'm not even sure if Row Constructor - VALUES () - works with 2k5 😉
t3 is the table alias, n3 is the column name/alias.
x is the alias of the cross apply, (item) is the column alias
The cross apply takes a delimited string and chops it into elements, a new row for each.
Use REPLACE to replace spaces with a designated character, ',' is good. ' ' doesn't always work well as the delimiter.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 20, 2011 at 10:18 am
peterzeke (10/20/2011)
Chris -- interesting query. Looks like it's for SQL 2008?So, if I understand it, the CROSS APPLY components transform a drug description into a comma-delimited array, which is then pivoted vertically to inner join the parsed values against the rft table values. The deeply nested derived table of "n" values is another form of a tally table, it seems -- clever.
I'll need to tweek the query to work in SQL 2005, but certainly promising.
Without turning this thread into a SQL 2008 inquiry, I have a couple quick questions:
1) what exactly does "t3 (n3)" and "t1 (n1)" ... and so on doing? Not familiar with parens () after an alias. Does the "t1" alias the "VALUES" as a derived table, and (n1) as a field name within the "VALUES"?
2) Similar to the first question, what's "x (Item)" at the end of the second CROSS APPLY all about?
Thanks
For 1, that basically builds an inline numbers table (derived). I've never seen it done that way before but it works.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
October 20, 2011 at 10:36 am
OK -- great tips, clever tricks (and new fangled features I might get to play with if I ever get beyond sql 2k5)!
Plenty to go on from here.
thanks!
October 20, 2011 at 11:44 am
Phil Parkin (10/20/2011)
Surely another possibility is building a mapping table, if this is likely to be a repeated process?Something like this:
select 'CITALOPRAM TABLETS' Pseudonym,
'Citalopram' GenericName
union
select 'Tab, Citalopram',
'Citalopram'
union
select 'Oral Solution Citalopram HBR',
'Citalopram'
where 'Pseudonym' is the PK.
Then joining on that to get to the Generic name (apologies - pharmaceutical nomenclature is not my strong point)
Agreed. There are also formularies that at least used to be public info (Medicare's formulary, some blue shields have formularies) which hold this very info (i.e. every specific "market" name and formulation mapped to the generic/pharmaceutical name). Leveraging one of those should substantially cut down on the amount you need to map (excluding typos, etc...)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 7 posts - 16 through 21 (of 21 total)
You must be logged in to reply to this topic. Login to reply