Need help with SQl syntax

  • I need to get the rows where the PrescriberID starts with 2 letters and then it can have numbers or anything. The first 2 chars have to be alpha

    The SQL below does not work I think that 'A-Za-z%' needs help

    Select top 100 RxClaims.prescriberId

    from MHPDW.TransferDB.[dbo].[RxClaims] RxClaims

    WHERE

    LEFT(RxClaims.prescriberId,2) LIKE 'A-Za-z%'

  • I think using a pattern like this will work: its saying the first two characters are A-Z, and then anything after that.

    Select top 100 RxClaims.prescriberId

    from MHPDW.TransferDB.[dbo].[RxClaims] RxClaims

    WHERE

    RxClaims.prescriberId LIKE '[A-Z][A-Z]%' --not case sensitive collation

    --RxClaims.prescriberId LIKE '[a-zA-Z][a-zA-Z]%' --case sensitive collation

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (3/31/2016)


    --RxClaims.prescriberId LIKE '[a-zA-Z][a-zA-Z]%' --case sensitive collation

    I think that would be for a binary collation. For case-sensitive collations, it would be either LIKE '[a-Z][a-Z]%', or LIKE '[A-z][A-z]%' - I can never remember if uppercase sorts before or after lowercase.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (4/2/2016)


    Lowell (3/31/2016)


    --RxClaims.prescriberId LIKE '[a-zA-Z][a-zA-Z]%' --case sensitive collation

    I think that would be for a binary collation. For case-sensitive collations, it would be either LIKE '[a-Z][a-Z]%', or LIKE '[A-z][A-z]%' - I can never remember if uppercase sorts before or after lowercase.

    Regardles of the sorting order (which is first upper, then lower), I think you're mistaking here, Hugo. [A-z] includes more than just the alpha characters in lower and upper case: your [A-z] includes not only the requested alphas but also the characters [, \, ], ^ and _. Lowell correctly specifies 2 times a single character out of 2 ranges from either a-z or A-Z, followed by any characters.

    If you're not sure about case sensitivity in your target db, I'd suggest to use the [a-zA-Z][a-zA-Z]% pattern anyway. This gives you the correct answer in both case sensitive and case insensitive environments.

    `



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • R.P.Rozema (4/5/2016)


    Hugo Kornelis (4/2/2016)


    Lowell (3/31/2016)


    --RxClaims.prescriberId LIKE '[a-zA-Z][a-zA-Z]%' --case sensitive collation

    I think that would be for a binary collation. For case-sensitive collations, it would be either LIKE '[a-Z][a-Z]%', or LIKE '[A-z][A-z]%' - I can never remember if uppercase sorts before or after lowercase.

    Regardles of the sorting order (which is first upper, then lower), I think you're mistaking here, Hugo. [A-z] includes more than just the alpha characters in lower and upper case: your [A-z] includes not only the requested alphas but also the characters [, \, ], ^ and _. Lowell correctly specifies 2 times a single character out of 2 ranges from either a-z or A-Z, followed by any characters.

    What you write is true for binary collations (which are always case sensitive), because they use the ASCII table for sorting characters, and those special characters are located between Z and a in the ASCII table.

    For non-binary collations, it is not true. They never have special characters between the letters. See the repro below.

    DECLARE @x varchar(10) = '[]x'

    IF @x COLLATE Latin1_General_CS_AS LIKE '[A-z][A-z]%'

    PRINT 'This is weird';

    ELSE

    PRINT 'See what I mean?';

    IF @x COLLATE Latin1_General_BIN2 LIKE '[A-z][A-z]%'

    PRINT 'Only binary collations have special characters between Z and a';

    ELSE

    PRINT 'Help! I found a bug';

    If you're not sure about case sensitivity in your target db, I'd suggest to use the [a-zA-Z][a-zA-Z]% pattern anyway. This gives you the correct answer in both case sensitive and case insensitive environments.

    The tempdb collation will always be equal to the instance collation. That being said, you are true that if you don't know the collation, using the [a-zA-Z][a-zA-Z]% pattern would be a reasonably safe choice. It will return the correct results in all collations commonly used in English-speaking countries. Not sure about other languages though - if I remember correctly, some Scandinavian countries use ae as an alternate representation for æ (ligature ae), and it is possible that æ sorts after z.

    Yes, trying to work around all possible collations can be very intersting/challenging/annoying.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (4/5/2016)Yes, trying to work around all possible collations can be very intersting/challenging/annoying.

    So true!



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply