April 25, 2008 at 7:29 am
Gus,
This is an SQL Server 2000 forum... 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
April 25, 2008 at 7:32 am
April 25, 2008 at 7:41 am
Try this... it's Adrian's original code with some conditions added...
SELECT LEFT(raw_payload, CHARINDEX('+', raw_payload, 1)-1),
CASE
WHEN LEN(raw_payload)-LEN(REPLACE(raw_payload,'+','')) >= 2
THEN
CAST(SUBSTRING(raw_payload,
CHARINDEX('+', raw_payload, 1) + 1,
CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) - (CHARINDEX('+', raw_payload, 1) + 1))
AS FLOAT)
ELSE NULL END,
CASE
WHEN LEN(raw_payload)-LEN(REPLACE(raw_payload,'+','')) >= 3
THEN
CAST(SUBSTRING(raw_payload,
CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) + 1,
LEN(raw_payload) - CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) + 1)
AS FLOAT)
ELSE NULL END
FROM #SplitMe
--Jeff Moden
Change is inevitable... Change for the better is not.
April 25, 2008 at 8:10 am
CODE PASSED:
DECLARE @table TABLE (raw_payload VARCHAR(200))
INSERT INTO @Table SELECT raw_payload FROM raw_data
SELECT LEFT(raw_payload, CHARINDEX('+', raw_payload, 1)-1),
CASE
WHEN LEN(raw_payload)-LEN(REPLACE(raw_payload,'+','')) >= 2
THEN
CAST(SUBSTRING(raw_payload,
CHARINDEX('+', raw_payload, 1) + 1,
CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) - (CHARINDEX('+', raw_payload, 1) + 1))
AS FLOAT)
ELSE NULL END,
CASE
WHEN LEN(raw_payload)-LEN(REPLACE(raw_payload,'+','')) >= 3
THEN
CAST(SUBSTRING(raw_payload,
CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) + 1,
LEN(raw_payload) - CHARINDEX('+', raw_payload, CHARINDEX('+', raw_payload, 1) +1) + 1)
AS FLOAT)
ELSE NULL END
RESULTS:
(2551671 row(s) affected)
(624 row(s) affected)
Server: Msg 536, Level 16, State 3, Line 5
Invalid length parameter passed to the substring function.
SAMPLE RESULTS:
S9000NULLNULL
UV100.0NULL
UV101.0NULL
UV102.0NULL
UV103.0NULL
UV104.0NULL
Third column is not null in these cases but that is what is diplayed.
April 25, 2008 at 8:43 am
Jeff - that last code nulls out the 0's in the 3rd column - is it supposed to?
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 25, 2008 at 9:06 am
All right - one more variation on Adrian's original solution, avoiding the need for the CASE statements....
[font="Courier New"]SELECT
LEFT(raw_payload+'+++', CHARINDEX('+', raw_payload+'+++', 1)-1),
CAST(--NULLIF(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) + 1,
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload, 1) +1) -
(CHARINDEX('+', raw_payload+'+++', 1) + 1))
--,'')
AS FLOAT),
CAST(--NULLIF(
REPLACE(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1,
LEN(raw_payload) - CHARINDEX('+', raw_payload,
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1)
,'+','')
--,'')
AS FLOAT)
FROM #mytable[/font]
Now - I'm not crazy that the NULL columns are being converted to 0 (so you can't tell what was passed in as 0 versus no input), so you might want the NULLIF's uncommmented, but they're fairly costly (they double processing time).
Perf-wise - it looks like:
Jeff's: 5,907ms
This one (without NULLIF): 7,015ms
This one (with NULLIF): 11,536ms
I still haven't quite figured out where the flaw is in Jeff's that ends up nulling out the third column.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 25, 2008 at 9:16 am
Jeff Moden (4/25/2008)
Gus,This is an SQL Server 2000 forum... 😉
Well, that'll certainly change a few things! (Like maybe I'll start paying attention.... nah ... that'll never happen.)
For that, you'll need a numbers-based string parser (there's one on the scripts section of SQL Server Central), instead of the XML version.
Also, would need to use regular derived tables in the From clause, instead of a CTE. Should work with those modifications.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 25, 2008 at 9:38 am
SELECT raw_payload,
LEFT(raw_payload+'+++', CHARINDEX('+', raw_payload+'+++', 1)-1),
CAST(--NULLIF(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) + 1,
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload, 1) +1) -
(CHARINDEX('+', raw_payload+'+++', 1) + 1))
--,'')
AS FLOAT),
CAST(--NULLIF(
REPLACE(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1,
LEN(raw_payload) - CHARINDEX('+', raw_payload,
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1)
,'+','')
--,'')
AS FLOAT)
FROM raw_data
RESULT:
Server: Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to float.
April 25, 2008 at 9:52 am
n1pcman (4/25/2008)
SELECT raw_payload,LEFT(raw_payload+'+++', CHARINDEX('+', raw_payload+'+++', 1)-1),
CAST(--NULLIF(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) + 1,
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload, 1) +1) -
(CHARINDEX('+', raw_payload+'+++', 1) + 1))
--,'')
AS FLOAT),
CAST(--NULLIF(
REPLACE(
SUBSTRING(raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++',
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1,
LEN(raw_payload) - CHARINDEX('+', raw_payload,
CHARINDEX('+', raw_payload+'+++', 1) +1) + 1)
,'+','')
--,'')
AS FLOAT)
FROM raw_data
RESULT:
Server: Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to float.
hmm - it works for me as long as the second and third parameters are numeric or missing. Are you sure you don't have any of those that are NOT numeric?
Perhaps try commenting out one of the casts and then the other, just to find out which one is causing the headache.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 25, 2008 at 9:59 am
Yes that is the problem I do have some bad data, 1 or 2 lines per every 600 is stepped on data its un unavoidable buffering issue with the device dumping the data. I don't want that data if I do this process manually I delete it. I am just hoping to come up with a stored proc or something that cleans the data for my queries and reports. I want the first row text, second row number and third row number in the finished deal.
April 25, 2008 at 8:17 pm
Matt Miller (4/25/2008)
Jeff - that last code nulls out the 0's in the 3rd column - is it supposed to?
Nah... I screwed up.
The real key to this whole thing is that the data must be "equalized". That is, every row should be updated to have the same number of delimiters before any splitting is attempted. Sure, we can hack all day with CHARINDEX and the like, but if the original data is "equalized", this becomes a simple split and reassemble problem.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 28, 2008 at 12:39 pm
What if we could write a query that says pickup all data that meets this standard
text(*) '+' NUM(*) '+' NUM(*) copy to new table then parse into a new table with 3 fields
"Description Field would be Text" & "Usage Field would be a Number" & "Adjustable field would be a Number"
April 28, 2008 at 12:45 pm
n1pcman (4/25/2008)
Yes that is the problem I do have some bad data, 1 or 2 lines per every 600 is stepped on data its un unavoidable buffering issue with the device dumping the data. I don't want that data if I do this process manually I delete it. I am just hoping to come up with a stored proc or something that cleans the data for my queries and reports. I want the first row text, second row number and third row number in the finished deal.
Then - try one of the above methods, and dump the data into varchar(20) fields FIRST. Then figure out if what you got in those fields is convertible to FLOAT: if it is - keep it - if not - chunk it. As of now - the parsing is working, it's just that it's parsing non-numbers in a couple of records.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 28, 2008 at 12:51 pm
I dont know enough about SQL to any of this I just know what I want, the bad data or non standard data can be trashed with no problems.
April 29, 2008 at 6:14 am
Select *
FROM raw_data[raw_payload]
Where raw_payload not like '%+%+%'
This finds all the bad data is there some thing I can add to auto delete based on this query like a stored procedure or something and then run the parser on clean data all on a scheduled cycle???
or all the good data with this
Select *
FROM raw_data[raw_payload]
Where raw_payload like '%+%+%'
Viewing 15 posts - 16 through 30 (of 37 total)
You must be logged in to reply to this topic. Login to reply