collation conflict with a split function

  • Hey all, stumbled across a new server today which uses CI_AI as the server collation and I'm trying to find a better way to resolve a problem where several of my admin objects break because of "Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "SQL_Latin1_General_CP1_CI_AI" in the equal to operation."

    I can resolve it in the objects themselves by simply using COLLATE database_default on the join to my split function, but I'd really like to figure out If I can fix the problem on the function so that I don't have to update all my objects as well as make this a band-aid moving forward.

    Example of my join:

    LEFT JOIN [dbo].[Split_fn](@DBName,',') s

    ON db.[name] = s.[item] --COLLATE database_default

    I assumed incorrectly that by using COLLATE database_default in the table variable would resolve this problem. Is there anything I can do to fix the function or do I simply have to use collate on any join to this function? I realize I could change the collation of my admin database to match that of the server, but I deploy my admin database and objects to the environment at large with CI_AS. I'm assuming that my options are limited being that I am breaking on a join and a join operation occurs in tempdb, which has a different collation (CI_AI) ... perhaps I need to rethink my admin database collation, but curious to as If I have any other options at this point.

    Thanks

    Here is the function (author unknown, found it quite some time ago).

    IF OBJECT_ID('Split_fn') IS NOT NULL

    DROP FUNCTION [dbo].[Split_fn]

    GO

    /*******************************************************************************************************

    **Name:[dbo].[Split_fn]

    **Desc:Split up a delimited list and return a recordset

    **Auth:Unknown (found online)

    **Date:?

    *******************************************************************************

    **Change History

    *******************************************************************************

    **Date:Author:Description:

    **-------------------------------------------------------

    **

    ********************************************************************************************************/

    CREATE FUNCTION [dbo].[Split_fn]

    (

    @sInputList VARCHAR(8000)-- List of delimited items

    ,@sDelimiter VARCHAR(8)= ','-- delimiter that separates items

    )

    RETURNS @List TABLE ([item] VARCHAR(8000) COLLATE database_default)

    AS

    BEGIN

    DECLARE @sItem VARCHAR(8000)

    WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0

    BEGIN

    SELECT

    @sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))) COLLATE database_default

    ,@sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList)))) COLLATE database_default

    IF LEN(@sItem) > 0

    INSERT INTO @List SELECT @sItem COLLATE database_default

    END

    IF LEN(@sInputList) > 0

    INSERT INTO @List SELECT @sInputList COLLATE database_default-- Put the last item in

    RETURN

    END

    GO

  • Ok, after some more testing/playing, I think I'm coming to realization that the objects themselves will have to have the COLLATE on the joins as I'm finding now that basically any join operation breaks when the calling database has a different collation than tempdb. This is new to me, haven't been in this situation before.

    At this point, I am thinking that I will have to change my database collation unless anyone else has something for me that I'm missing.

  • You're probably going to need to use the collation hint on the joins if you use that function.

    I don't use a function for string parsing, I use an inline query and a Numbers table. The template looks like this:

    set nocount on;

    declare @String varchar(max);

    select substring(@String + ',', Number, charindex(',', @String + ',', Number) - Number)

    from dbo.Numbers

    where Number <= len(@String)

    and substring(',' + @String, Number, 1) = ','

    order by Number;

    You'd just replace @String with your variable/parameter and then join to this. Can also be used in Cross/Outer Apply situations by plugging in a column name instead of the variable. I have the delimiter (a comma) hard-coded into this version, but you can use a variable there if you like.

    It's faster than a UDF, takes less I/O, and avoids the issues you're running into.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks GS, not a bad idea, something I'll have to consider. Currently though, I have so much dependancy upon that objects in many of my admin objects and reporting that I'm not sure I'm ready to make that change just yet. That and regardless, the problem would remain when joining between two different collations though right?

    So what I'm learning from my new environment in regards to collation is that these two are causing me grief:

    Latin1_General_BIN

    SQL_Latin1_General_CP1_CI_AI

    The Latin1_General_BIN have been relatively easy fixes in regards to case, but the _CI_AI is what I'm dealing with now and I'm thinking I may just leave my admin database in the server default collation as I am not seeing any of my objects breaking.

  • You won't end up with a collation conflict with an inline query, because they're both being done in the same context.

    For example:

    ;with CTE (ParsedString) as

    select substring(@String + ',', Number, charindex(',', @String + ',', Number) - Number)

    from dbo.Numbers

    where Number <= len(@String)

    and substring(',' + @String, Number, 1) = ',')

    select *

    from dbo.MyTable

    inner join CTE

    on MyTable.Col1 = CTE.ParsedString;

    Assuming @String was defined earlier in the proc/script, or was an input parameter, there's no possibility of collation conflict in this, because it's all being done in the same context. The Numbers table, even if it resides in a different database, just has a column of Integer data type, which doesn't have a collation, so even that doesn't impact the solution.

    I totally understand that this would require significant re-engineering. I've found it to be a worthwhile refactor, for a lot of reasons. Not the least of which is that it's a lot faster than a UDF with a While loop.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply