August 27, 2012 at 2:52 pm
Can the collation used by SSIS be changed or influenced during install or run time? We have found that our databases, that use a mandatory "LATIN1_GENERAL_BIN", have incorrect SSIS Merge Join output. Changing our database collation in testing didn't make a difference. What matters is the data. Which Windows collation is SSIS using?
Example Data:
FIRSTNAME
FIRSTNAME
FIRSTS-A-NAME
FIRSTS_A_NAME
FIRST_NAME
FIRST_NAME
FIRSTname
FIRSTname
FIRS_NAME
Our only solution right now is to put in a Sort task before the Merge Join task as setting advanced properties isn't enough (as described by Eric Johnson here --> http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx)
Can anyone please advise?
We are using 64-bit SQL Server 2008 R2 w/ SP1 in Windows Server 2008 R2 ENT w/ SP1.
UPDATE from ETL team: Explicitly ordering the source with "COLLATE Latin1_General_CS_AS" seems to have the same effect as using a separate sort task. We don't feel that we can rely on our findings, however, unless we have documentation that this collation is what is behind SSIS.
Can someone please advise?
October 19, 2015 at 12:55 pm
Hi,
Thanks for the info. I had the same issue where MERGE JOIN transformation drops close to a million records if not used with a SORT TRANSFORMATION. But if I used the SORT Transformation all the records go through, but the performance drops too much as i am dealing with close to 9 and half million records.
So after using the COLLATE, almost all the 1 million records went through except 25 records. There was no way of tracking those 25 from the MERGE JOIN as i had to use the INNER JOIN (OUTER JOIN's give millions of Duplicates).
Please let me know if there is something else which can further resolve this issue.
Thanks in advance
Gowtham
October 19, 2015 at 12:57 pm
SELECT
[DLCRTYPE], DLCASEID,
-- CONVERT(VARCHAR(20),DLCASEID) COLLATE Latin1_General_CI_AS DLCASEID,
[DLCASEPTY],
FROM dbo.DMDETAILP dp
INNER JOIN dbo.CASEMASP cm
ON cm.CMCRTYPE = dp.DLCRTYPE
AND cm.CMCASEID = dp.DLCASEID
AND cm.CMCASEPTY = dp.DLCASEPTY
ORDER BY CONVERT(VARCHAR(20),DLCASEID) COLLATE Latin1_General_CI_AS;
--ORDER BY DLCASEID COLLATE SQL_Latin1_General_CP1_CI_AS;
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply