Extracting specific data from a column

  • Hi, can we extract specific data from the text field with starting and ending text. For eg. if the text is "Hello sqlservercentral.com is wonderful site". Here if starting text is "Hello" and Ending text is "wonderful" the result should be "sqlservercentral.com is"

    For ease I have created table below

    CREATE TABLE extractExample(

    ItemNumber char(12) NOT NULL,

    Data varchar(50) NOT NULL,

    PRIMARY KEY

    (ItemNumber)

    )

    INSERT INTO extractExample VALUES ('1', "Hello sqlserver.com is wonderful site")

    INSERT INTO extractExample VALUES ('2', "Hello hotmail.com is wonderful site")

    INSERT INTO extractExample VALUES ('3', "Hello google.com is wonderful site")

    Thanks in advance.

  • Hi, here's something you can do:

    You can easily put this in a procedure, for reusing, it would be a good idea.

    Hope it helps,

    CREATE TABLE extractExample

    (

    ItemNumber CHAR(12) NOT NULL,

    Data VARCHAR(50) NOT NULL,

    PRIMARY KEY ( ItemNumber )

    )

    DECLARE @StartText VARCHAR(100),

    @Endtext VARCHAR(100)

    SET @StartText = 'Hello'

    SET @EndText = 'is wonderful site'

    INSERT INTO extractExample

    VALUES (

    '1',

    'Hello sqlserver.com is wonderful site'

    )

    INSERT INTO extractExample

    VALUES (

    '2',

    'Hello hotmail.com is wonderful site'

    )

    INSERT INTO extractExample

    VALUES (

    '3',

    'Hello google.com is wonderful site'

    )

    SELECT ItemNumber,

    SUBSTRING(data, LEN(@StartText) + 1,

    LEN(data) - ( LEN(@StartText) + LEN(@EndText) )) AS BetweenText,

    data

    FROM ExtractExample

    DROP TABLE ExtractExample

    Cheers,

    J-F

  • My bad, that does not work, it worked for an easy example, but when you put a text that does not have those values, it still works.. and it should not... lemme work on that!

    Sorry!

    Cheers,

    J-F

  • Finally got it to work.. hehe, sorry about that,

    Here it is:

    CREATE TABLE extractExample

    (

    ItemNumber CHAR(12) NOT NULL,

    Data VARCHAR(50) NOT NULL,

    PRIMARY KEY ( ItemNumber )

    )

    DECLARE @StartText VARCHAR(100),

    @Endtext VARCHAR(100)

    SET @StartText = 'Hello'

    SET @EndText = 'is wonderful site'

    INSERT INTO extractExample

    VALUES (

    '1',

    'Aello sqlserver.com is wonderful site'

    )

    INSERT INTO extractExample

    VALUES (

    '2',

    'HelloA hotmail.com is wonderful site'

    )

    INSERT INTO extractExample

    VALUES (

    '3',

    'ABC Hello google.com is wonderful site ABC'

    )

    INSERT INTO extractExample

    VALUES (

    '4',

    'ABC Hello gAoAle.cAm is wonderful site DEF'

    )

    SELECT ItemNumber,

    SUBSTRING(data, LEN(@StartText) + CHARINDEX(@StartText, data), --Start of the text

    LEN(data) - ( LEN(data) - CHARINDEX(@EndText, data) -- Size of the text

    + CHARINDEX(@StartText, data) + LEN(@StartText) )) AS BetweenText,

    data

    FROM ExtractExample

    WHERE data LIKE '%' + @StartText + '%'

    AND data LIKE '%' + @EndText + '%'

    DROP TABLE ExtractExample

    Cheers,

    J-F

  • Thank you very very much I will try this and let you know my result. Thanks again

  • Your code does work well for the given example but if I change the table as below then it gives the following error.

    "Argument data type text is invalid for argument 1 of len function."

    CREATE TABLE extractExample

    (

    ItemNumber CHAR(12) NOT NULL,

    Data TEXT NOT NULL,

    PRIMARY KEY ( ItemNumber )

    )

    I tried to figure it out but I was not successful.

    I think TEXT data type doesn't support LEN function. I hope there can be other alternatives.

    Thanks again.

  • Hi

    Try the DATALENGTH function. This should also work with TEXT data.

    DECLARE @T TABLE (txt TEXT)

    INSERT INTO @T VALUES ('hello world')

    SELECT DATALENGTH(txt) FROM @t

    Hint 1

    If you need to use NTEXT you have to divide the result by two because it returns the count of bytes, not really a text length.

    Hint 2

    You should think about changing TEXT/NTEXT/IMAGE to VARCHAR(MAX)/NVARCHAR(MAX)/VARBINARY(MAX). These data types are marked deprecated since SQL Server 2005 (as I just learned here 🙂 ). They are still available within SQL Server 2008 but will be removed in later versions.

    Greets

    Flo

  • Great it works! Thank you very much for your help.

    I need to change only DATALENGTH in place of LEN.

    Thanks again.

  • No problem, thanks for your feedback!

    Cheers,

    J-F

  • Our pleasure!

    The main job was done by J+F! 🙂

    Greets

    Flo

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply