Patern Matching

  • Hi,

    I have a text column that contains notes about a customer, in this column there are mutiple reference numbers of a customer in the format of xxxxxx-xxxxx-xx (e.g 123456-12345-12).

    For each row I want to get all the reference number present in the notes column.

    e.g

    CId, Notes

    1, These are the notes for customer 1 having first reference number 123456-12345-12, this customer came back again and we assign another reference number 654321-54321-21

    2, This is a brand new reference number 098765-09876-09

    3, This is the last example, please help me to extract this reference number 333222-11144-22

    I need

    1, 123456-12345-12

    1, 654321-54321-21

    2, 098765-09876-09

    3, 333222-11144-22

    Thanks

  • if your column is varchar or nvarchar, it's fairly easy; you know the exact length of the string to catch, and can use charindex or patindex to find where it starts in any string:

    results:

    (No column name)(No column name)

    18123456-12345-12

    48654321-54321-21

    10098765-09876-09

    27333222-11144-22

    code example:

    select ' a bunch of text 123456-12345-12 and more text' as descrip

    into #tmp union all

    select 'more stuff but obviously a different length 1, 654321-54321-21' union all

    select 'yaada 2, 098765-09876-09 fffffff' union all

    select 'stuff that i dont need 3, 333222-11144-22 blah blah'

    select

    --15 chars, due to dashes

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]%',descrip),

    substring(descrip,patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]%',descrip),15)

    from #tmp

    where patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]%',descrip) > 0

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (1/29/2010)


    if your column is varchar or nvarchar, it's fairly easy; you know the exact length of the string to catch, and can use charindex or patindex to find where it starts in any string:

    Good idea... but take a look at the first row of example data the OP posted... it has 2 reference numbers in it...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • UT- (1/29/2010)


    Hi,

    I have a text column that contains notes about a customer, in this column there are mutiple reference numbers of a customer in the format of xxxxxx-xxxxx-xx (e.g 123456-12345-12).

    For each row I want to get all the reference number present in the notes column.

    e.g

    CId, Notes

    1, These are the notes for customer 1 having first reference number 123456-12345-12, this customer came back again and we assign another reference number 654321-54321-21

    2, This is a brand new reference number 098765-09876-09

    3, This is the last example, please help me to extract this reference number 333222-11144-22

    I need

    1, 123456-12345-12

    1, 654321-54321-21

    2, 098765-09876-09

    3, 333222-11144-22

    Thanks

    Not a problem... welcome to one of the many uses for a Tally table...

    --===== Create a test table.

    -- THIS IS NOT A PART OF THE SOLUTION

    CREATE TABLE #TestData

    (

    RowNum INT PRIMARY KEY CLUSTERED,

    Notes VARCHAR(8000)

    )

    INSERT INTO #TestData

    (RowNum, Notes)

    SELECT 1, 'These are the notes for customer 1 having first reference number 123456-12345-12, this customer came back again and we assign another reference number 654321-54321-21' UNION ALL

    SELECT 2, 'This is a brand new reference number 098765-09876-09' UNION ALL

    SELECT 3, 'This is the last example, please help me to extract this reference number 333222-11144-22'

    --===== Demo the solution using a Tally table

    SELECT src.RowNum, SUBSTRING(src.Notes,t.N,15) AS RefNum

    FROM #TestData src

    CROSS JOIN dbo.Tally t

    WHERE t.N < LEN(src.Notes)

    AND SUBSTRING(src.Notes,t.N,15) LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]'

    If you don't have a Tally table or you would like to know more about how it works to replace certain loops/cursors, take a peek at the following article...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @lowell:

    Thank you so much for your solution it works but only if there is one reference number in the notes. If you look at my original post there were two reference number in the first row BUT that was just an example there could be many reference numbers in one row and i need all of them. The problem with PATINDEX is that it does not accept a start index and will only give the first occurence of the partern.

    @jeff:

    Thanks for this new idea, i am really looking forward to see the results BUT in your post you did not include the code to generate the tally table "dbo.Tally" so i am getting an error of "Invalid object name 'dbo.Tally'".

    Please tell me how did you generate this table for my sample data?

    Appreciate your help guys.

  • UT- (1/30/2010)


    ...

    @jeff:

    Thanks for this new idea, i am really looking forward to see the results BUT in your post you did not include the code to generate the tally table "dbo.Tally" so i am getting an error of "Invalid object name 'dbo.Tally'".

    Please tell me how did you generate this table for my sample data?

    Appreciate your help guys.

    He posted the link at the end of the post about Tally table ...once again there it is the location;

    http://www.sqlservercentral.com/articles/T-SQL/62867/[/url]

    http://www.sqlservercentral.com/articles/T-SQL/62867/[/url]

    http://www.sqlservercentral.com/articles/T-SQL/62867/%5B/url%5D

    The Tally table is very simple and very useful!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • UT- (1/30/2010)


    @Jeff:

    Thanks for this new idea, i am really looking forward to see the results BUT in your post you did not include the code to generate the tally table "dbo.Tally" so i am getting an error of "Invalid object name 'dbo.Tally'".

    Please tell me how did you generate this table for my sample data?

    Dugi is correct. I posted a link to the article near the end of my post that tells you how to the Tally table works as well as a couple of ways to build it instead of posting the code. I do that for two reasons... first, I wore out Ctrl-C/Ctrl-V on my keyboard making copies of the code 😛 and, second, I wanted to give folks the opportunity to see how the Tally table can be used to replace loops instead of just using a proverbial "black box" for a single task. It's an incredibly useful tool (although I must warn, it's not a panacea) and the only way it will be useful to folks is if they know how it actually works. It's the ol' "but teach a man to fish" thing. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • UT,

    It's been a couple of days... how'd all that work out for you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you so much for following up Jeff, and sorry for not providing the feedback. Actually i needed it for a one time import so i used Lowell's solution and it worked for me. But i really appreciate your solution and will definitely explore the tally tables.

    Thanks,

    UT

  • Absolutely no problem... Just wanted to make sure everything was ok with you. Thank you for the come back.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply