January 14, 2009 at 4:41 pm
Running: Microsoft SQL Server 2005 - 9.00.3152.00 Enterprise Edition SP2 Enterprise Edition (Intel IA-64)
I need to export data from a table to a UTF-8 encoded text file. This is going to be scheduled for every 15 or 30 minutes. My first approach was using BCP and xp_Cmdshell but this server is locked down for security reasons; no go for xp_Cmdshell, and I'd rather not open that up.
So my revised approach is to call an SSIS task (that exports table data to text) from within a SQL JOB (other steps are required in the job before and after the SSIS is run). I think I will have to do something to be sure the SSIS is done before continuing to the next step (hrmm I think there may be a flag on the SSIS step that controls that).
The SSIS creates the text outfile which MUST be in quoted and comma separated format as shown here:
"1949","0","1","Something","Wow","","","",""
"2107","4","1","Something else","Cool","","","",""
The SSIS task works correctly-- EXCEPT the txt outfile is NOT in UTF-8.
To simplify the txt output, my SqlCommand concatenates the columns using quotes and commas, and that "single column" is wrapped in a CAST to NVARCHAR. Like this (shortened):
declare @dq char(1);
declare @qcq char(3);
set @dq='"'; -- a double quote
set @qcq='","'; -- double quote comma double quote
SELECT CAST(@dq + Col1 + @qcq + Col2 + @qcq + Col3 + @qcq + ReturnTime + @dq AS NVARCHAR(200)) as MyData
FROM MyTable
WHERE IsExported = 0
Looking at the SSIS, Connection Manager shows me that the "DestinationConnectionFlatFile" has CodePage=65001 which is supposed to set UTF-8... or so I thought.
Is there anything obvious I'm doing wrong here?
Has anyone successfully done this-- even if via a different mechanism?
January 15, 2009 at 8:17 am
In SSIS when you build the package you can specify the code page. Open the connection manager for the flat file. You should see a code page choice and be able to select it there. I believe that will store it out then in utf-8
January 15, 2009 at 9:56 am
Right... I'm already doing that, 3rd line from the end in the OP. Unless that's not the right CP...
January 16, 2009 at 5:27 pm
Has anyone out there tried this, either with success or failure?
Adding this 1/27:
One more whimper to try to get some additional responses.
January 27, 2009 at 11:48 am
Ah... the edit on the previous message didn't float this thread in the list of topics, so... has anyone else been able to output to UTF-8?
January 28, 2009 at 4:09 pm
I've not had a problem outputting a UTF-8 flat file. I just whipped up a quick test table containing 2 columns (int, nvarchar(256)) and populated it with some sample UTF-8 data, and exported to a flat file from an OLEDB source, and the result was in UTF-8 format.
For reference I'm running "Microsoft SQL Server 2005 - 9.00.3257.00 (X64)".
Did nothing special to the flat file connection manager - just set the codepage to 65001 and made sure that the Unicode tickbox was not selected.
Double-check that your column definition (under Advanced in the connection manager) for your NVARCHAR columns is set to "Unicode string [DT_WSTR]" and the OutputColumnWidth is correct.
I'd also try deleting your existing output file and run the package again to recreate it - I've found that if the file already exists it retains the existing format irrespective of what your connection manager settings say. If you had a previous version of the output file in ASCII format it could cause problems if you try outputting to that same file.
Regards,
Jacob
April 5, 2012 at 4:31 am
I totally agree with Jacob here. I also experience the same. But I have got different scenario.
Instead of the NVARCHAR column in the database, we have nTEXT. In the File connection, I have changed the Data type to DT_NTEXT for that column. But while running, it gives an error that:
"Input column is an NTEXT which is not supported with ANSI files."
I want the file to be in UTF-8 format only. By checking the UniCode Option in the File Connection, it works fine.
Please suggest.
November 28, 2012 at 7:09 am
I finally succeeded by writing the output to a temp.txt file and adding the next PowerShell command to convert it to UTF-8:
-- Change encoding to UTF-8 with PowerShell
SET @command = 'powershell -Command "Get-Content '+@Path+'\temp.txt -Encoding Unicode | Set-Content -Encoding UTF8 '+@path+'\'+@filename+'"';
EXEC xp_cmdshell @command;
December 2, 2012 at 7:48 pm
My first approach was using BCP and xp_Cmdshell but this server is locked down for security reasons; no go for xp_Cmdshell, and I'd rather not open that up.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply