Faing Issues with Collation

  • I am working on German data in a table. I am facing some problems. Before that

    let me set the context. Please execute the query within the

    Script begin and Script End

    --- Script Begin

    create table german

    (

    textdata nvarchar(100) collate Latin1_General_CI_AS

    )

    insert into german

    values (N'daß')

    insert into german

    values (N'dass')

    insert into german

    values (N'großen')

    insert into german

    values (N'grossen')

    insert into german

    values(N'läßt')

    insert into german

    values(N'lässt')

    insert into german

    values (N'Mehrweggebinden')

    --- Script End

    select * from german

    -- will give 7 rows

    select textdata,count(*) from german group by textdata

    -- Problem 1 : The german strings 'läßt' and 'lässt' are grouped together

    similarly 'großen' and 'grossen'

    similarly 'daß' and 'dass'

    How can we avoid this ?

    Consider the query below :

    select * from german where textdata like nchar(8223) + N'%' or textdata like N'[."“”,''-]%'

    returns 0 rows

    I tried changing the collation to SQL_Latin1_General_CP437_BIN

    alter table dbo.german alter column textdata nvarchar(100) collate SQL_Latin1_General_CP437_BIN

    Now group by works properly

    select textdata,count(*) from german group by textdata

    but the query which returned 0 rows previously returns 1 row

    select * from german where textdata like nchar(8223) + N'%' or textdata like N'[."“”,''-]%'

    Summary :

    I cannot do a group by and select with the same collation.

    in Latin1_General_CI_ASgroup by fails but select works properly

    in SQL_Latin1_General_CP437_BIN group by works propely but select fails

    Your Suggestion would help.

    -- Hari

  • Use collation casting in the SELECT list and the GROUP BY as follows:

    drop table #german

    create table #german

    (

    textdata nvarchar(100) collate Latin1_General_CI_AS

    )

    insert into #german values (N'daß')

    insert into #german values (N'dass')

    insert into #german values (N'großen')

    insert into #german values (N'grossen')

    insert into #german values(N'läßt')

    insert into #german values(N'lässt')

    insert into #german values (N'Mehrweggebinden')

    --select * from #german

    select textdata COLLATE SQL_Latin1_General_CP437_BIN, count(*)

    from #german

    group by textdata COLLATE SQL_Latin1_General_CP437_BIN

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris 🙂

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply