How to extract part of URL from varchar column?

  • I'm trying to figure out how to extract just the folder from a column. The table is called tblPosts and the column name is exprvar1, it's varchar(8000). The column has data like this: (this is just one of 6,000 rows with different folder names, but ALWAYS has http://www.websitename.com/myfoldername/gallery/ before the folder name I need and always has /index.html after)

    Jane Doe and John Doe enjoyed Saturday afternoon with their kids Child and Kid at Pan Pacific Park in Los Angeles, March 28th.\r<div style="text-align: center;"><a href="http://www.websitename.com/myfoldername/gallery/candids2/2009-03-28/index.html" target="_blank"><img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/001s.jpg" alt="" /> <img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/002s.jpg" alt="" /> <img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/003s.jpg" alt="" /> <img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/004s.jpg" alt="" /> <img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/005s.jpg" alt="" /> <img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/006s.jpg" alt="" /></a></div>

    All I need is the first URL folder part. Example: Candids/2009-03-28

    I really hope that makes sense. I'm trying to figure out which folders I need to upload to a server instead of uploading all of them.

    Thanks in advance!

  • Well you could do separate queries for ones starting with src= and href= since those add on characters. then as long as the count of characters is the same for each row, you can do left(myString. <however many chacaters it is).

    Or if you really wanted to get shfancy, try someting like this:

    declare @STR varchar(max) ='<img style="border: none;" src="http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/006s.jpg" alt="" /></a>'

    select PATINDEX('%y/%', @STR)

    select @STR = (replace(@str, LEFT(@str, 82), ''))

    select @STR = SUBSTRING(@str, 0, charindex('/', @STR, 0))

    select @STR

    Executive Junior Cowboy Developer, Esq.[/url]

  • the part i need is always 52 characters past the <a href=" part. I don't know exactly what sql query I'd use to still extract the folder name and date?

  • This is an interesting little problem. There are a few different ways you could go about doing it. Since the URLs are delimited by '/', you could probably write something like a split function to grab out the individual pieces using a loop.

    Another method I like a lot is to progressively cut down the original string, grab peices you need out of it, and repeat. It works really well for this sort of static string format you have.

    if object_id('tempdb.dbo.#FolderDates') > 0 drop table #FolderDates

    create table #FolderDates

    (

    FullString varchar(1000),

    EditedString varchar(1000),

    FolderName varchar(100),

    TheDate varchar(100)

    )

    insert into #FolderDates(FullString)

    select 'http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/index.html' union

    select 'http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/001s.jpg' union

    select 'http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/002s.jpg'

    update #FolderDates

    set EditedString = right(FullString, len(FullString) - 50) -- The number of characters you said stays static. You said it was 52, maybe I cut the string wrong. You can play with this.

    update #FolderDates

    set FolderName= substring(EditedString, 0, charindex('/', EditedString)),

    EditedString = stuff(EditedString, 1, charindex('/', EditedString), '')

    update #FolderDates

    set TheDate = substring(EditedString, 0, Charindex('/', EditedString))

    select *

    from #FolderDates

    What I've done here is put your original strings into a table, made a copy of the strings (EditedString) so that I dont destroy the original, and two columns for the pieces of information you said you were interested in.

    First, I'm cutting off everything to the right of the 52 characters you said you didn't want (i found it to be 50, but maybe I started with a slightly different string).

    Then, i'm taking a substring of the start of the string to the first '/'.

    then I'm removing that section I just cut off from the edited string,

    and finally I repeat the substring method for the date as well.

    Just be careful because this is fully dependent upon always having the same number of leading characters to cut off at the start. If it changes, you could do something like EditedString = stuff(EditedString, 1, patindex(EditedString, '%.com/%) + 4, '') to make that opening part more dynamic, but that may be beyond what you need.

    Hope this helps!

    Executive Junior Cowboy Developer, Esq.[/url]

  • That does correctly separate if the select only had a url. But it has tons of dynamic text before the url. If we can get it to find the URL and then go 50 over from that (like this query does) then it'd be perfect. Does that make sense?

    Thanks so much for all the help!

  • Try this:

    if object_id('tempdb.dbo.#FolderDates') > 0 drop table #FolderDates

    create table #FolderDates

    (

    FullString varchar(1000),

    EditedString varchar(1000),

    FolderName varchar(100),

    TheDate varchar(100)

    )

    insert into #FolderDates(FullString)

    select '\r<div style="text-align: center;"><a href=http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/index.html' union

    select '\r<div style="text-align: center;"><a href=http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/001s.jpg' union

    select 'ForThoseAboutToRockWeSaluteYou0oo0http://www.mywebsitename.com/myfoldername/gallery/candids2/2009-03-28/thumbnails/002s.jpg'

    update #FolderDates

    set EditedString = right(FullString, len(FullString) - patindex('%/gallery%', FullString) - 8) -- The number of characters you said stays static. You said it was 52, maybe I cut the string wrong. You can play with this.

    update #FolderDates

    set FolderName= substring(EditedString, 0, charindex('/', EditedString)),

    EditedString = stuff(EditedString, 1, charindex('/', EditedString), '')

    update #FolderDates

    set TheDate = substring(EditedString, 0, Charindex('/', EditedString))

    select *

    from #FolderDates

    The difference here is I have it searching for the pattern "/gallery' in the string as the point to start looking for the folder and the date, and then I am shifting the starting character index (so to speak) by the number of characters in the pattern (in this case, 8). You can change that pattern as you see fit for your particular needs.

    Executive Junior Cowboy Developer, Esq.[/url]

  • That works like a dream! You're a genius!!! Thank you so much for all of your help, sorry to be so difficult. You've just made my life WAY easier 😀

  • Glad it helped 🙂

    Executive Junior Cowboy Developer, Esq.[/url]

  • tetroxide (1/19/2012)


    you could probably write something like a split function to grab out the individual pieces using a loop.

    As a sidebar... loops are not required for such a thing. Please see the following article...

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That's amazing. I've never seen something like that before. Thanks!

    Executive Junior Cowboy Developer, Esq.[/url]

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply