Parse string from URL between...

  • Hi,

    i want to get the num-Value between the 2 Delimiters (h_hmid - and the End of the numeric Value)

    in this case: 2907831 (var len of Value !)

    What is the fastest way, to get all Values from a big Table? Is there a Index on this Column suggest ?

    Example:

    http://www.mydomain.info/Customer.aspx?h_hmid=%5Bb%5D2907831%5B/b%5D&dcc=EUR&mobiredirect=no&cpn=3790

    Thanks

    Regards

    Nicole 😉

  • info 58414 (12/9/2015)


    Hi,

    i want to get the num-Value between the 2 Delimiters (h_hmid - and the End of the numeric Value)

    in this case: 2907831 (var len of Value !)

    What is the fastest way, to get all Values from a big Table? Is there a Index on this Column suggest ?

    Example:

    http://www.mydomain.info/Customer.aspx?h_hmid=%5Bb%5D2907831%5B/b%5D&dcc=EUR&mobiredirect=no&cpn=3790

    Thanks

    Regards

    Nicole 😉

    A little bit of string manipulation is pretty simple here.

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?h_hmid=2907831&dcc=EUR&mobiredirect=no&cpn=3790'

    select SUBSTRING(@URL, CHARINDEX('h_hmid=', @URL) + 7, CHARINDEX('&', @URL) - CHARINDEX('h_hmid=', @URL) - 7)

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • cool, but the final delimiter is not always the &. It must be the End of num Values.

    thanks

  • This should be safer.

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url) + 7, 100) ) initial(pos)

    The "(VALUES (@URL)) AS x(url)" should be replaced by an actual table.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (12/9/2015)


    This should be safer.

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url) + 7, 100) ) initial(pos)

    The "(VALUES (@URL)) AS x(url)" should be replaced by an actual table.

    I like this approach but not sure why it is safer. It is easier to maintain for sure since it removed the duplicate charindex function.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • info 58414 (12/9/2015)


    cool, but the final delimiter is not always the &. It must be the End of num Values.

    thanks

    How about some more sample data then that more closely represents the real situation?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Just a couple more approaches because parsing strings is fun.

    If the string you are searching for is always going to be 7 characters long you could do this (note my comments):

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    -- Formula to parse the string

    SELECT SUBSTRING(@URL,PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',@URL),7)

    -- using the formula against a table similar to what Luis posted

    SELECT *

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING(x.url,PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',x.url),7)) s(parsed);

    If the search string is not always going to be 7 characters long, to get the first series of numbers in the string you could do this:

    -- Where the first number in the string begins and ends

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT x.url, s2.parsed

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (VALUES (SUBSTRING(@URL,PATINDEX('%[0-9]%',@URL),1000))) s1(start) -- get where the number starts

    CROSS APPLY (VALUES (SUBSTRING(s1.start,1,PATINDEX('%[^0-9]%',s1.start)-1))) s2(parsed) -- get where it ends

    -- Note using two Cross APPLIES to eliminate the need to use this formula twice: SUBSTRING(@URL,PATINDEX('%[0-9]%',@URL),1000)

    GO

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Sean Lange (12/9/2015)


    I like this approach but not sure why it is safer. It is easier to maintain for sure since it removed the duplicate charindex function.

    It doesn't rely on h_hmid being the first parameter in the URL. It missed an additional "safety net".

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url + 'h_hmid=') + 7, 100)) initial(pos)

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Another solution... This will be a smidgen slower but the code will be much cleaner. Using the Pattern-Based splitter referenced in my signature you could get the first series of numbers in your string like this:

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT x.url, item

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY dbo.patternsplitCM(@url,'%[0-9]%')

    WHERE ItemNumber = 2;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Luis Cazares (12/9/2015)


    Sean Lange (12/9/2015)


    I like this approach but not sure why it is safer. It is easier to maintain for sure since it removed the duplicate charindex function.

    It doesn't rely on h_hmid being the first parameter in the URL. It missed an additional "safety net".

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url + 'h_hmid=') + 7, 100)) initial(pos)

    Ahh gotcha. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Cool Stuff !

    Thank @all..... 🙂

  • Using good old Tally table:

    DECLARE @tag VARCHAR(50)

    --SET @tag = 'cpn='

    SET @tag = 'h_hmid='

    SELECT url, SUBSTRING(url, ID_starts, ID_ends - ID_starts) h_hmid

    FROM (

    SELECT url, CHARINDEX(@tag, url)+LEN(@tag) ID_starts, MIN(N) ID_ends

    FROM (SELECT 'http://www.mydomain.info/Customer.aspx?h_hmid=2907831&dcc=EUR&mobiredirect=no&cpn=3790' url

    UNION SELECT 'http://www.mydomain.info/Customer.aspx?h_hmid=290783100++&dcc=EUR&mobiredirect=no&cpn=3790'

    UNION SELECT 'http://www.mydomain.info/Customer.aspx?h_hmid=290783199999?&dcc=EUR&mobiredirect=no&cpn=3790') SourceTable

    inner JOIN dbo.TallyGenerator(1, 1000, 1,1) tg ON N >=CHARINDEX(@tag, url)+LEN(@tag) AND N<=LEN(URL)+1

    WHERE PATINDEX('%[^0-9]%', SUBSTRING(url, N,1)) = 1 -- finding 1st non-numeric character after the ID

    OR N=LEN(URL)+1 -- in case there are no characters after ID in the url

    GROUP BY url

    ) DT

    _____________
    Code for TallyGenerator

  • Luis Cazares (12/9/2015)


    This should be safer.

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url) + 7, 100) ) initial(pos)

    The "(VALUES (@URL)) AS x(url)" should be replaced by an actual table.

    Safer, but not quite safe.

    1. "+ 7" must be really "+ LEN('h_hmid=')"

    2. it relies on "&" to be the delimiter ending h_hmid. As OP stated - it's not always the case.

    _____________
    Code for TallyGenerator

  • Sergiy (12/13/2015)


    Luis Cazares (12/9/2015)


    This should be safer.

    declare @URL varchar(100) = 'http://www.mydomain.info/Customer.aspx?dcc=EUR&h_hmid=2907831&mobiredirect=no&cpn=3790'

    SELECT *, LEFT( initial.pos, CHARINDEX( '&', initial.pos + '&') - 1)

    FROM (VALUES (@URL)) AS x(url)

    CROSS APPLY (SELECT SUBSTRING( x.url, CHARINDEX('h_hmid=', x.url) + 7, 100) ) initial(pos)

    The "(VALUES (@URL)) AS x(url)" should be replaced by an actual table.

    Safer, but not quite safe.

    1. "+ 7" must be really "+ LEN('h_hmid=')"

    2. it relies on "&" to be the delimiter ending h_hmid. As OP stated - it's not always the case.

    I'm pretty sure that LEN('h_hmid=') will always return 7, as that's a string literal with a constant value.

    On the second remark, the "&" is used to separate parameters in a URL. The only way to the value not being ended by an "&" is by being the last parameter in the list. That's why I use CHARINDEX( '&', initial.pos + '&'). In the example you quoted, I missed the validation for the lack of the parameter which I posted on my reply to Sean.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (12/15/2015)I'm pretty sure that LEN('h_hmid=') will always return 7, as that's a string literal with a constant value.

    It's constant only until they change the name of parameter.

    "7" is a length of the parameter name, so it must be coded correspondingly.

    Placing data into code is not a good habit.

    On the second remark, the "&" is used to separate parameters in a URL. The only way to the value not being ended by an "&" is by being the last parameter in the list. That's why I use CHARINDEX( '&', initial.pos + '&'). In the example you quoted, I missed the validation for the lack of the parameter which I posted on my reply to Sean.

    Your general knowledge about the way URL's are built must be correct, but you must have missed this requirement:

    info 58414 (12/9/2015)


    cool, but the final delimiter is not always the &. It must be the End of num Values.

    thanks

    _____________
    Code for TallyGenerator

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply