Extract IP Address from text field

  • I have a nice problem. There is a char(255) field in a 3rd-party database that contains, among other things, an IP address that I would like to extract. But the IP address does not appear in the field at any fixed location; nor is it always preceded or followed by anydistinguishing characters.

    How can I extract the IP address easily?

    Regular expressions is one way, but all of the implementations I've found use functions, and I am on a SQL 7 server, so I cannot set up any user-defined functions.

    Any assistance would be wolcome.

    -steve

  • Easily? Without any distinguishing charachteristics? Can't be done "easily".

    Anyway, you'll have to loop through the entire string until you find the ip (or at least by best guess believe you have found it)

    The thought is this - find the first period in the string, see if the preceding 3 charachters are numbers, if they are, we hope this is the first octet in the ip, so we grab it from there. If it's not a number, we go look for the next period in the string. Repeat until there are no more periods.

    Since there can't be a function, you may implement it as a stored procedure instead..

    -- example

    declare @s-2 varchar(255), @i int, @j-2 int

    set @s-2 = 'long.. . . string from 123.123.123.123 that is placed here by chance. It could very well be somewhere else as well.'

    set @i = 1

    set @j-2 = len(@s)

    while @i <= @j-2

      begin

        if (select isnumeric(substring(@s, (charindex('.', @s-2, @i) - 3), 3))) = 1 -- triplet is a number

        begin

     select substring(@s, (charindex('.', @s-2, @i) - 3), 15) -- grab the ip string

         set @i = @j-2

        end

        else if isnumeric(substring(@s, (charindex('.', @s-2, @i) - 3), 3)) = 0 -- not a number

        begin

           set @i = @i + charindex('.', @s-2, @i) -- move offset so we can find the next period

        end

      end

    Good luck

    /Kenneth

  • Try this for a laugh

    SELECT COALESCE(

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),15),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),14),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),14),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),14),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),14),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9][0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9]%',[column]),8),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),12),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9]%',[column]),8),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),11),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),10),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9]%',[column]),8),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),9),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9][0-9]%',[column]),8),

      SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9]%',[column]),7),

      '')

    FROM 

    Far away is close at hand in the images of elsewhere.
    Anon.

  • I agree it is a messy search for an IP string. Consider a string like this:

      DECLARE @string varchar(255)

      SET @string = 'Title: Lawn Mower.  Cost: $125.67.  IP Address: 123.12.1.16'

    There's a false positive in this string.  I think the pure characteristics of an IP address

    are: The IP string is all numbers and periods (period: aka point aka full-stop). There are

    exactly 3 periods, the length of any numeric substring is between 1 and 3, and the total length of

    the IP string is between 7 and 15. In fact, the value of each numeric substring must be

    between 0 and 255.

    -- Here's the plan for the stored procedure:

    -- 1. Find iX = location of first '.'

    -- 2. Set iY = iX-1; decrement while char at iY is NUMERIC, setting iBegin = iY

    -- 3. Set iX2, iX3 = Location of second and third '.'

    -- 4. Set iY = iX3+1; increment while char at IY is NUMERIC, setting iEnd = iY

    -- 5. You have the begin and end. Calc IPLENGTH = end - begin + 1; get the candidate IP address

    -- 6. Additional checks; add these if the above returns strings that are not the IP address:

    -- - All characters from IX+1 to IX2-1 are numeric

    -- - All characters from IX2+1 to IX3-1 are numeric

    -- - Length of string is <= 15 (eg: 123.123.123.123)

    -- - Length of string is >=  7 (eg: 1.2.3.4)

    -- - Etc., etc. I'd code as few edits as necessary to get a reasonable answer.

    -- 7. If the found string fails the edit, then resume at the next period.

    DECLARE @work varchar(255), @string varchar(255)

    DECLARE @ix int, @ix2 int, @ix3 int, @iy int, @ibegin int, @iend int

    DECLARE @len int, @iplength int, @ipstring, @valid int

     

    -- 1. Find iX = Location of first '.'

    SET @work = @string

    SET @len = LEN(@work)

    SET @ix = CHARINDEX('.',@work,1)

    SET @valid = 1

    WHILE @valid = 1

    BEGIN

      -- 2. Set iY = iX-1; decrement while char at iY is NUMERIC, setting iBegin = iY

      SET @iy = @ix -1

      SET @ibegin = @ix

      WHILE ISNUMERIC(SUBSTRING(@work,@iy,1))

      BEGIN

        SET @ibegin = @iy

        IF @iy = 1 BREAK

        SET @iy = @iy - 1

      END

      -- 3. Set iX2, iX3 = Location of second and third '.'

      SET @ix2 = CHARINDEX('.',@work,@ix+1)

      SET @ix3 = CHARINDEX('.',@work,@ix2+1)

      -- 4. Set iY = iX3+1; increment while char at IY is NUMERIC, setting iEnd = iY

      SET @iy = @ix3 + 1

      SET @iend = @ix3

      WHILE ISNUMERIC(SUBSTRING(@work,@iy,1))

      BEGIN

        SET @iend = @iy

        IF @iy = @len BREAK

        SET @iy = @iy + 1

      END

      -- 5. You have the begin and end. Calc IPLENGTH = end - begin + 1; get the candidate IP address

      SET @iplength = @iend - @begin + 1

      SET @ipstring = SUBSTRING(@work,@ibegin,@iplength)

      SET @valid = 1

     

      -- 6. Additional checks; add these if the above returns strings that are not the IP address:

      -- - All characters from IX+1 to IX2-1 are numeric

      -- - All characters from IX2+1 to IX3-1 are numeric

      -- - Length of string is <= 15 (eg: 123.123.123.123)

      -- - Length of string is >=  7 (eg: 1.2.3.4)

     

      -- 7. IF THE STRING FAILS ANY EDIT(S) - SET @valid = 0

      --   and keep looking starting at the next '.'

     

      IF @valid = 0

      BEGIN

        SET @ix = @ix2 

      END

      

    END

    -- Return to the caller: @ipstring


    Regards,

    Bob Monahon

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply