July 1, 2016 at 2:27 am
Hi,
I have a text string that appears in files as below:
'There were 6 discharges and 3 admission since reporting 21/04/2016'
However sometimes, admissions are swapped around with discharges in the file such that on the following day:
'There were 5 admissions and 3 discharges since reporting 22/04/2016'
There can be variable amounts of white space on either side and within the string.
Please note that occasionally discharges is spelt discharge and admissions as admission.
How can i always pull out the correct number for Admissions and the correct number for Discharges please?
Q
July 1, 2016 at 2:34 am
quentin.harris (7/1/2016)
Hi,I have a text string that appears in files as below:
'There were 6 discharges and 3 admission since reporting 21/04/2016'
However sometimes, admissions are swapped around with discharges in the file such that on the following day:
'There were 5 admissions and 3 discharges since reporting 22/04/2016'
There can be variable amounts of white space on either side and within the string.
Please note that occasionally discharges is spelt discharge and admissions as admission.
How can i always pull out the correct number for Admissions and the correct number for Discharges please?
Q
can you ALWAYS guarantee the number will come immediately before either "admissions" or "discharges"
what happens if someone types this "Since reporting 22/04/2016, the admissions are 5 and the discharges are 3"
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 1, 2016 at 2:39 am
Start by using PATINDEX to find the position of " admission" and " discharge" in the string. You can then work back from there to find the first instance of space followed by a number, and convert that to int. Have a go, and post back if there's anything you're struggling with.
John
July 1, 2016 at 4:01 am
How can i always pull out the correct number for Admissions and the correct number for Discharges please?
can you please give example of how you want the end result to be presented? rows, a single line of text, something else?
thanks
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 1, 2016 at 5:06 am
Hi,
The output would be a row that would populate a table with headers Date,Admissions,Discharges:
Date Admissions Discharges
22/04/2016 5 3
I had used patindex in existing code. I used a variable VarDisch to which i assigned the string based on finding the one occurrence in the file of 'disch'.
The problem is because i searched on finding a number as opposed to a string.
The code i have I am unsure how to post as each time i try i get the response: connection reset.
So any clues on how to do that would help!
Regards,
Q
July 1, 2016 at 5:12 am
can you ALWAYS guarantee the number will come immediately before either "admissions" or "discharges"
what happens if someone types this "Since reporting 22/04/2016, the admissions are 5 and the discharges are 3"
please can you advise?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 1, 2016 at 5:29 am
Hi,
I cannot guarantee that. In the template submitted for the last 7 months, the only changes i have seen to this text field have been:
1. number of spaces (white space within and without);
2. the spelling of admissions and discharges, and;
3. the ordering of admissions and discharges.
The number (of) has always preceded the text 'admissions' or 'discharges', and the date has always been at the end of the string.
The SQL is a workaround until validation is in place for the template file.
Q
July 1, 2016 at 3:45 pm
WARNING WARNING WARNING: You MUST get your inputs standardized. You are at risk until then.
That said. This was a fun problem, and the code below should get you started.
select * into #input
from (values (1,'There have been 5 admissions and 3 discharges since 04/14/2016.')
,(2,'There has been 1 admission and 2 discharges since 04/22/2016.')
,(3,'There have been 2 admissions and 1 discharge since 04/29/2016.')
,(4,'There have been 2 discharges and 2 admissions since 05/15/2016.')
,(5,'There have been 2 discharges and 1 admission since 05/24/2016.')) dt (Rowid,TextString);
select * from #input;
with cte as (select rowID, itemnumber,convert(varchar(50),item) as item
from #input i
cross apply DelimitedSplit8K(TextString, ' ')
)
,cte2 as(select rowID, Item
, casewhen Item like 'Admission%' or Item Like 'Discharge%'
then LAG(item,1,null) over(order by rowid,itemnumber) else null end as CountX
, TRY_CONVERT(date, replace(item,'.','')) as DateX
from cte )
selectMax(Case when Item Like 'Admission%' then CountX else null end) as Admissions
,Max(Case when Item Like 'Discharge%' then CountX else null end) as Discharges
,Max(DateX) as DateX
from cte2
where CountX is not null or DateX is not null
group by Rowid
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
July 2, 2016 at 12:27 pm
The Dixie Flatline (7/1/2016)
WARNING WARNING WARNING: You MUST get your inputs standardized. You are at risk until then.That said. This was a fun problem, and the code below should get you started.
select * into #input
from (values (1,'There have been 5 admissions and 3 discharges since 04/14/2016.')
,(2,'There has been 1 admission and 2 discharges since 04/22/2016.')
,(3,'There have been 2 admissions and 1 discharge since 04/29/2016.')
,(4,'There have been 2 discharges and 2 admissions since 05/15/2016.')
,(5,'There have been 2 discharges and 1 admission since 05/24/2016.')) dt (Rowid,TextString);
select * from #input;
with cte as (select rowID, itemnumber,convert(varchar(50),item) as item
from #input i
cross apply DelimitedSplit8K(TextString, ' ')
)
,cte2 as(select rowID, Item
, casewhen Item like 'Admission%' or Item Like 'Discharge%'
then LAG(item,1,null) over(order by rowid,itemnumber) else null end as CountX
, TRY_CONVERT(date, replace(item,'.','')) as DateX
from cte )
selectMax(Case when Item Like 'Admission%' then CountX else null end) as Admissions
,Max(Case when Item Like 'Discharge%' then CountX else null end) as Discharges
,Max(DateX) as DateX
from cte2
where CountX is not null or DateX is not null
group by Rowid
Hi Dixie
as you say a fun problem....but your code doesnt seem to cope with the white space issues as posted...
for example
select * into #input
from (values (1,'There have been 5 admissions and 3 discharges since 04/14/2016.')
,(2,'There has been 1 admission and 2 discharges since 04/22/2016.')
,(3,'There have been 2 admissions and 1 discharge since 04/29/2016.')
,(4,'There have been 2 discharges and 2 admissions since 05/15/2016.')
,(5,'There have been 2 discharges and 1 admission since 05/24/2016.')) dt (Rowid,TextString);
select * from #input;
with cte as (select rowID, itemnumber,convert(varchar(50),item) as item
from #input i
cross apply DelimitedSplit8K(TextString, ' ')
)
,cte2 as(select rowID, Item
, casewhen Item like 'Admission%' or Item Like 'Discharge%'
then LAG(item,1,null) over(order by rowid,itemnumber) else null end as CountX
, TRY_CONVERT(date, replace(item,'.','')) as DateX
from cte )
selectMax(Case when Item Like 'Admission%' then CountX else null end) as Admissions
,Max(Case when Item Like 'Discharge%' then CountX else null end) as Discharges
,Max(DateX) as DateX
from cte2
where CountX is not null or DateX is not null
group by Rowid
DROP TABLE #input
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 3, 2016 at 2:00 pm
well this is an interesting exercise....and I found it fun to have a go.
I appreciate that that this only a "workaround" until you get some sense into the data entry validation....here below is my "effort"...its far from foolproof and performant...but does cover a lot of permutations.
That said...we never know what a user is going to type in a "text field" 😉
SET DATEFORMAT DMY;
SELECT *
INTO #input
FROM (values
( 1, 'There were 5 admissions and 3 discharge since reporting 22/04/2016'),
( 2, 'today here were 6 discharges and 3 admission since reporting 21/04/2016'),
( 3, 'since reporting 30/04/2016 the discharges are 8 and the admissions are 10' ),
( 4, ' the discharges are 20 and the admissions are 30 on 10/05/2016 '),
( 5, ' the admission was 100 on 15/05/2016 and the discharge was 30 '),
( 6, ' on 15/08/2016 5 admissions and discharge 10' ),
( 7, ' 100 admission 200 discharges 25/12/2018' ),
( 8, ' discharge 500 on 01/01/2020 and 100 admission '),
( 9, ' discharge 1 2 admission 01/01/2020' ),
( 10, ' 20180101 100 admission 200 discharges' ),
( 11, '20160101 admission 1 2 discharges '),
( 12, 'here are the results up to 12/12/12 Admission 300 400 Discharge ' ) ,
( 13, ' Discharges 9000 20/May/2015 8000 Admissions ' ) ,
( 14, ' Discharges 99999 88888 Admissions 161216' ),
( 15, ' Discharges I think were 99 and maybe 88 possibly less Admissions on 2016-05-15' )
)
dt (rowID,TextString)
;
WITH cte AS (
SELECT *,
CASE WHEN ttype = 'tdate' then 3 ELSE ROW_NUMBER() OVER (PARTITION BY rowID,ttype ORDER BY Itemnumber) END rn
FROM (
SELECT i.rowid, i.TextString, s.ItemNumber, 'Admissions' as Item, 'category' as ttype
FROM #input i
CROSS APPLY dbo.DelimitedSplit8K(i.TextString,' ') s
where s.Item Like 'admission%'
UNION ALL
SELECT i.rowid, i.TextString, s.ItemNumber, 'Discharges', 'category'
FROM #input i
CROSS APPLY dbo.DelimitedSplit8K(i.TextString,' ') s
where s.Item Like 'discharge%'
UNION ALL
SELECT i.rowid, i.TextString, s.ItemNumber, s.Item, 'count'
FROM #input i
CROSS APPLY dbo.DelimitedSplit8K(i.TextString,' ') s
where s.Item Like '%[0-9]%' AND LEN(s.item) < 6 -- max count 99999
UNION ALL
SELECT i.rowid, i.TextString, s.ItemNumber, s.Item, 'tdate'
FROM #input i
CROSS APPLY dbo.DelimitedSplit8K(i.TextString,' ') s
where s.Item Like '%[0-9]%' AND LEN(s.item) >= 6)x
)
, cte_cat as (
SELECT rowID, Item, ttype, rn
FROM cte
WHERE (ttype = 'category')
)
, cte_cnt as (
SELECT rowID, Item, ttype, rn
FROM cte
WHERE (ttype = 'count')
)
, cte_grp as (
SELECT cte_cat.rowID,
cte_cat.Item,
cte_cnt.Item AS result
FROM cte_cat
INNER JOIN cte_cnt ON cte_cat.rowID = cte_cnt.rowID
AND cte_cat.rn = cte_cnt.rn
UNION ALL
SELECT rowID,
ttype,
item
FROM cte
WHERE(ttype = 'tdate')
)
SELECT rowID,
MAX(CASE WHEN Item = 'Admissions' THEN Result END) as Admissions,
MAX(CASE WHEN Item = 'Discharges' THEN Result END) as Discharges,
MAX(CASE WHEN Item = 'tdate' THEN TRY_CONVERT(date,Result) END) as tDate
FROM cte_grp
GROUP BY rowId
DROP TABLE #input
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 4, 2016 at 7:45 am
Thank you for the replies.
I will try to adapt what has been posted to suit our sandpit environment where i have discovered that some functions are not available (LAG and TRY_CONVERT).
Seems 2012 is not used for the sandpit which limits things.
🙁
July 4, 2016 at 8:45 am
quentin.harris (7/4/2016)
Thank you for the replies.I will try to adapt what has been posted to suit our sandpit environment where i have discovered that some functions are not available (LAG and TRY_CONVERT).
Seems 2012 is not used for the sandpit which limits things.
🙁
use this instead
MAX(CASE WHEN Item = 'tdate' THEN CAST(Result as DATE) END) as tDate
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 4, 2016 at 8:51 am
Thank you.
What is this line you are using:
dbo.DelimitedSplit8K?
July 4, 2016 at 8:56 am
quentin.harris (7/4/2016)
Thank you.What is this line you are using:
dbo.DelimitedSplit8K?
sorry...should have posted the link.
its a nasty fast "splitter"
http://www.sqlservercentral.com/articles/Tally+Table/72993/
the DelimittedSplit8K function (code is at end of article)
and I suggest you read the article and associated discussion....well worth it
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 4, 2016 at 9:06 am
Awesome.
It works. Now have to get my head round it what it actually all does.
Thank you!
Viewing 15 posts - 1 through 15 (of 37 total)
You must be logged in to reply to this topic. Login to reply