Search Millions of Column Values in Another Column

  • ...

  • I'd start by simply filtering out any "document" that doesn't have an "@" sign in it. That's bound to cut out some of the documents.

    The next step would be to split out each "@" together with the contiguous characters typical of email addresses that appear immediately before and after it. This would be like a two-part "split".

    How many characters in your largest VARCHAR(MAX) document?
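
    A minimal sketch of the two steps described above, not a tuned solution. The `#Documents` table, its columns, and the sample rows are assumptions made up for illustration. The inline tally here only covers documents up to 10,000 characters, so larger VARCHAR(MAX) values would need a bigger number source.

    ```sql
    -- Hypothetical sample data; table and column names are assumptions.
    CREATE TABLE #Documents (DocId INT PRIMARY KEY, Body VARCHAR(MAX));
    INSERT INTO #Documents (DocId, Body) VALUES
        (1, 'Contact bob@example.com for details'),
        (2, 'No address in this one'),
        (3, 'Send to amy@example.org or joe@example.net today');

    -- Step 1: discard documents that cannot contain an email address.
    SELECT DocId, Body
    INTO   #Candidates
    FROM   #Documents
    WHERE  CHARINDEX('@', Body) > 0;

    -- Step 2: for every "@", take the run of email-like characters on each side.
    DECLARE @MaxLen BIGINT = ISNULL((SELECT MAX(DATALENGTH(Body)) FROM #Candidates), 0);

    WITH E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(n)),
         E4(n) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c CROSS JOIN E1 d),
         Tally(n) AS (SELECT TOP (@MaxLen) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4),
         AtSigns AS (
             SELECT c.DocId, t.n AS Pos,
                    -- Up to 64 characters before the "@", reversed so the nearest one comes first.
                    REVERSE(CAST(SUBSTRING(c.Body,
                                           CASE WHEN t.n > 64 THEN t.n - 64 ELSE 1 END,
                                           CASE WHEN t.n > 64 THEN 64 ELSE t.n - 1 END)
                                 AS VARCHAR(64))) AS RevLeft,
                    -- Up to 255 characters after the "@".
                    CAST(SUBSTRING(c.Body, t.n + 1, 255) AS VARCHAR(255)) AS RightPart
             FROM   #Candidates AS c
             JOIN   Tally AS t ON t.n <= DATALENGTH(c.Body)
             WHERE  SUBSTRING(c.Body, t.n, 1) = '@'
         )
    SELECT a.DocId, a.Pos,
           REVERSE(LEFT(a.RevLeft, l.LocalLen)) + '@' + LEFT(a.RightPart, d.DomainLen) AS Mail
    FROM   AtSigns AS a
    -- The appended space guarantees PATINDEX finds a terminating character.
    CROSS APPLY (SELECT PATINDEX('%[^A-Za-z0-9._+-]%', a.RevLeft + ' ') - 1) AS l(LocalLen)
    CROSS APPLY (SELECT PATINDEX('%[^A-Za-z0-9.-]%', a.RightPart + ' ') - 1) AS d(DomainLen)
    WHERE  l.LocalLen > 0 AND d.DomainLen > 0;
    ```

    The character classes are deliberately loose. A sentence-ending period directly after an address would be kept as part of the domain, so real data would likely need a trimming or validation pass afterwards.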

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.



  • nm

    Maybe a homemade email index will give better performance.

    Use a regular expression, via CLR or sp_OACreate 'VBScript.RegExp', to build a #mailIndex (OriginalRowId, Pos, Mail) table. Index it and join it with the 9 million emails. For an email regex, see http://www.regular-expressions.info/email.html
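
    A rough sketch of what the index table and the join might look like. The `#KnownEmails` table, the column types, and the index name are assumptions. The regex extraction itself is not shown, so literal sample rows stand in for its output.

    ```sql
    -- Hypothetical stand-in for the 9 million known addresses; names are assumptions.
    CREATE TABLE #KnownEmails (Email VARCHAR(320) NOT NULL PRIMARY KEY);
    INSERT INTO #KnownEmails (Email) VALUES ('bob@example.com'), ('joe@example.net');

    -- The index table from the post: one row per address found in a document.
    CREATE TABLE #mailIndex (
        OriginalRowId INT          NOT NULL,  -- key of the source document row
        Pos           INT          NOT NULL,  -- character position of the match
        Mail          VARCHAR(320) NOT NULL   -- the extracted address
    );

    -- In practice these rows would come from the regex extraction (CLR or otherwise).
    -- Made-up sample values are used here only so the sketch runs on its own.
    INSERT INTO #mailIndex (OriginalRowId, Pos, Mail) VALUES
        (1, 9,  'bob@example.com'),
        (3, 9,  'amy@example.org'),
        (3, 28, 'joe@example.net');

    -- Index the extracted addresses so the join can seek instead of scanning.
    CREATE CLUSTERED INDEX IX_mailIndex_Mail ON #mailIndex (Mail, OriginalRowId, Pos);

    -- Documents that mention at least one known address.
    SELECT m.OriginalRowId, m.Pos, m.Mail
    FROM   #mailIndex   AS m
    JOIN   #KnownEmails AS k ON k.Email = m.Mail;
    ```

    The point of this design is that the expensive text scan happens once while the index is built. After that, matching against the address list is an ordinary equality join.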
