October 25, 2007 at 3:20 pm
I'm trying to get parameter names and variables out of huge strings that I am importing from log files. The parameters look kind of like this:
Parameter Name: PN.someParName
In the file the parameter and respective value looks like this:
PN.someParName=thisIsTheValue&PN.someParName2=anotherValue&PN.someParName3=lastValue
...and goes on row after row after row.
I want to be able to return the parameter names as a row set and the values associated also. Basically I want to be able to get this...
Parameter Name Value
PN.someParName thisIsTheValue
PN.someParName2 anotherValue
PN.someParName3 lastValue
Would the best way to parse this be with regex code? What would be a suggested way to go about this? In SSIS or in CLR compiled procedure using Regex? ...or is there an easy way to do it in regular T-SQL?
One point of context, I'll be processing millions of rows at a time.
Thanks,
Adron
October 26, 2007 at 7:54 am
I'd think Regex using CLR integration. Specifically a CLR Table-valued function.
If you created the right Regex, then simply ran the MATCHES method - you'd end up with an array of all parameters and values, which you'd then need to turn into a table variable.
Here's a good running start at it:
http://blogs.msdn.com/stuartpa/archive/2005/07/21/441468.aspx
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 26, 2007 at 10:43 am
Thanks for the input Matt.
So far I just did some basic old C# string parsing using the string object and string builder object. That is a remarkable improvement over whatever is used in the previous methods.
So next I'm going to work in some regex work and see how things go.
So far I have the import and some parsing with the C# string parsing down to about 3 minutes for 600,000 rows of data. That is from text log file to clean well formed table.
I'm hoping I can get that down to about 1-2 minutes with full parsing. I think with Regular expressions I might just be able to do that. 😎
October 26, 2007 at 7:35 pm
How would you like to import 5 million rows in about a minute? Post, say, 100 rows of data as an attachment and I'll show you how to build a BCP format file and use BULK INSERT to just wail on files like this.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply