search a string that contain 6 or 7 characters but no spaces

  • How do i search a string that contains 6 or seven characters but no spaces this is to identify incorrect postcodes

    ie select pk,pcode

    from postcode

    where pcode like '%%%%%%'

    Thanks,

    Iain

  • icampbell (7/5/2012)


    How do i search a string that contains 6 or seven characters but no spaces this is to identify incorrect postcodes

    ie select pk,pcode

    from postcode

    where pcode like '%%%%%%'

    Thanks,

    Iain

    select pk,pcode

    from postcode

    where len(pcode) in (6,7) and len(pcode) = len(replace(pcode, ' ', ''))

    Won't run awfully fast, but should do the job.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • WHERE LEN(pcode) IN (6,7) AND CHARINDEX(' ',pcode) = 0

    This is probably not a good way of determining validity of post codes though, there are a lot of rules (well, there are in the UK) that make it more suited to a RegEx (would need to implement a CLR) or complicated PATINDEX matching.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • icampbell (7/5/2012)


    How do i search a string that contains 6 or seven characters but no spaces this is to identify incorrect postcodes

    ie select pk,pcode

    from postcode

    where pcode like '%%%%%%'

    Thanks,

    Iain

    This won't identify incorrect postcodes, it will identify postcodes with a missing separator - which is trivial to fix.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Phil and cadavre

    your posts have gave me exactly what i was looked for.

    Cheers Iain

  • consider also something like the following:

    x LIKE '[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]'

    or x LIKE '[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]'

  • David McKinney (7/6/2012)


    consider also something like the following:

    x LIKE '[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]'

    or x LIKE '[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]'

    Actually, this would probably be more accurate: -

    SELECT pk, pcode

    FROM postcode

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (7/6/2012)[hrActually, this would probably be more accurate: -

    SELECT pk, pcode

    FROM postcode

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    Depends where you live! Wouldn't help me in France much! I was referring principally to the users request, rather than my personal experience of postcodes. (But you're right that if you're on a case sensitive server, including the collation would be wise.)

    By the way, I think a CTE would be more appropriate than a CROSS APPLY for a set based approach.

    e.g.

    ;WITH fixedPCode (pc) as (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ',''))

    select pc FROM fixedPCode

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

  • David McKinney (7/6/2012)

    (But you're right that if you're on a case sensitive server, including the collation would be wise.)

    Actually I don't get the collate on the replace - I would have thought it more useful on the like comparison? Did I miss something?

  • Hi Guys thanks for all of the replies.

    I was thinking around the lines of

    left(Pcode,3) & " " & right(Pcode,len(Pcode)-3)

    i will try out your code later today against a test database.

    cheers Iain.

  • David McKinney (7/6/2012)


    Cadavre (7/6/2012)[hrActually, this would probably be more accurate: -

    SELECT pk, pcode

    FROM postcode

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    Depends where you live! Wouldn't help me in France much! I was referring principally to the users request, rather than my personal experience of postcodes. (But you're right that if you're on a case sensitive server, including the collation would be wise.)

    By the way, I think a CTE would be more appropriate than a CROSS APPLY for a set based approach.

    e.g.

    ;WITH fixedPCode (pc) as (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ',''))

    select pc FROM fixedPCode

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    Setting the collation is actually to do with a bug in replace for SQL Server rather than any other reason, however you're correct that the patterns I've used there are for UK postcodes.

    As for your comment on a CTE "set-based" solution. . . well, no πŸ˜‰

    First, let's fix your code so it's syntactically correct: -

    WITH fixedPCode (pc)

    AS (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2, ' ', '')

    FROM postcode)

    SELECT pc

    FROM fixedPCode

    WHERE pc LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR pc LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR pc LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]';

    Now, if you run this: -

    SELECT TOP 1000000 pcode

    INTO #test

    FROM (VALUES('SK13 8LY'),('M1 1AA'),('M60 1NW'),

    ('GIR 0AA'),('CR2 6XH'),('DN55 1PT'))a(pcode),

    master.sys.all_columns sc1,master.sys.all_columns sc2,master.sys.all_columns sc3;

    DECLARE @HOLDER VARCHAR(MAX); --Used to take the display time out of the equation

    WITH fixedPCode (pc)

    AS (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2, ' ', '')

    FROM #test)

    SELECT @HOLDER = pc

    FROM fixedPCode

    WHERE pc LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR pc LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR pc LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    SELECT @HOLDER = pcode

    FROM #test

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    With include execution plan turned on, you'll see the plans are identical for each query. I've used the CROSS APPLY to make the WHERE clause more readable just as your CTE does πŸ™‚


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I stand corrected about the cross apply, it performs better than the CTE - i guess with the Replace we're already 'line by line'.

    Also I note that the collation on the replace makes it perform better....interesting! Anything I should know?

  • I think our lasts posts crossed in the webosphere.

    Thanks for your feedback. Actually I did a similar test on "a big table I had lying about", but got slightly different execution plans for the two methods....but the cross apply did turn out more performant in this case.

  • Cadavre (7/6/2012)[hrActually, this would probably be more accurate: -

    SELECT pk, pcode

    FROM postcode

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    WHERE fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'

    Out of curiosity, would you care to comment your 'likes' i.e. give an example for each case?

    Thanks,

    David.

  • Sure. How's this?

    SELECT pk, pcode

    FROM postcode

    CROSS APPLY (SELECT REPLACE(pcode COLLATE Latin1_General_BIN2,' ','')) b(fixedPcode)

    /*---------------------------------*UK Postcodes are arranged like this: -

    SortingOfficeCode+Space+LocalCode

    e.g. SK13 8LY or M1 1AA

    The LocalCode is always 3 characters long,

    1 number followed by 2 letters.

    The SortingOfficeCode is between 2 and 4

    characters long, always starting with a letter.

    \*---------------------------------*/

    WHERE

    --This matches on 2 letter sorting office codes

    fixedPcode LIKE '[A-Z][A-Z0-9][0-9][A-Z][A-Z]'

    --This matches on 3 letter sorting office codes

    OR fixedPcode LIKE '[A-Z][A-Z0-9]_[0-9][A-Z][A-Z]'

    --This matches on 4 letter sorting office codes

    OR fixedPcode LIKE '[A-Z][A-Z0-9]__[0-9][A-Z][A-Z]'


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply