January 14, 2009 at 4:41 pm
Running: Microsoft SQL Server 2005 - 9.00.3152.00 Enterprise Edition SP2 Enterprise Edition (Intel IA-64)
I need to export data from a table to a UTF-8 encoded text file. This is going to be scheduled for every 15 or 30 minutes. My first approach was using BCP and xp_Cmdshell but this server is locked down for security reasons; no go for xp_Cmdshell, and I'd rather not open that up.
So my revised approach is to call an SSIS task (that exports table data to text) from within a SQL JOB (other steps are required in the job before and after the SSIS is run). I think I will have to do something to be sure the SSIS is done before continuing to the next step (hrmm I think there may be a flag on the SSIS step that controls that).
The SSIS creates the text outfile which MUST be in quoted and comma separated format as shown here:
"1949","0","1","Something","Wow","","","",""
"2107","4","1","Something else","Cool","","","",""
The SSIS task works correctly-- EXCEPT the txt outfile is NOT in UTF-8.
To simplify the txt output, my SqlCommand concatenates the columns using quotes and commas, and that "single column" is wrapped in a CAST to NVARCHAR. Like this (shortened):
declare @dq char(1);
declare @qcq char(3);
set @dq='"'; -- a double quote
set @qcq='","'; -- double quote comma double quote
SELECT CAST(@dq + Col1 + @qcq + Col2 + @qcq + Col3 + @qcq + ReturnTime + @dq AS NVARCHAR(200)) as MyData
FROM MyTable
WHERE IsExported = 0
Looking at the SSIS, Connection Manager shows me that the "DestinationConnectionFlatFile" has CodePage=65001 which is supposed to set UTF-8... or so I thought.
Is there anything obvious I'm doing wrong here?
Has anyone successfully done this-- even if via a different mechanism?
January 15, 2009 at 8:17 am
In SSIS when you build the package you can specify the code page. Open the connection manager for the flat file. You should see a code page choice and be able to select it there. I believe that will store it out then in utf-8
January 15, 2009 at 9:56 am
Right... I'm already doing that, 3rd line from the end in the OP. Unless that's not the right CP...
January 16, 2009 at 5:27 pm
Has anyone out there tried this, either with success or failure?
Adding this 1/27:
One more whimper to try to get some additional responses.
January 27, 2009 at 11:48 am
Ah... the edit on the previous message didn't float this thread in the list of topics, so... has anyone else been able to output to UTF-8?
January 28, 2009 at 4:09 pm
I've not had a problem outputting a UTF-8 flat file. I just whipped up a quick test table containing 2 columns (int, nvarchar(256)) and populated it with some sample UTF-8 data, and exported to a flat file from an OLEDB source, and the result was in UTF-8 format.
For reference I'm running "Microsoft SQL Server 2005 - 9.00.3257.00 (X64)".
Did nothing special to the flat file connection manager - just set the codepage to 65001 and made sure that the Unicode tickbox was not selected.
Double-check that your column definition (under Advanced in the connection manager) for your NVARCHAR columns is set to "Unicode string [DT_WSTR]" and the OutputColumnWidth is correct.
I'd also try deleting your existing output file and run the package again to recreate it - I've found that if the file already exists it retains the existing format irrespective of what your connection manager settings say. If you had a previous version of the output file in ASCII format it could cause problems if you try outputting to that same file.
Regards,
Jacob
April 5, 2012 at 4:31 am
I totally agree with Jacob here. I also experience the same. But I have got different scenario.
Instead of the NVARCHAR column in the database, we have nTEXT. In the File connection, I have changed the Data type to DT_NTEXT for that column. But while running, it gives an error that:
"Input column is an NTEXT which is not supported with ANSI files."
I want the file to be in UTF-8 format only. By checking the UniCode Option in the File Connection, it works fine.
Please suggest.
November 28, 2012 at 7:09 am
I finally succeeded by writing the output to a temp.txt file and adding the next PowerShell command to convert it to UTF-8:
-- Change encoding to UTF-8 with PowerShell
SET @command = 'powershell -Command "Get-Content '+@Path+'\temp.txt -Encoding Unicode | Set-Content -Encoding UTF8 '+@path+'\'+@filename+'"';
EXEC xp_cmdshell @command;
December 2, 2012 at 7:48 pm
My first approach was using BCP and xp_Cmdshell but this server is locked down for security reasons; no go for xp_Cmdshell, and I'd rather not open that up.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy