How do I remove the column names and dashes in flat file?

  • Task: Create flat file extract from database of select fields as input to 3rd party reporting tool. Do this every month automatically (i.e., no manual intervention)

    Environment: SS 2008 SE, minimal installation (client tools, basic db only. No SSRS, no SSIS)

    Progress: In SSMS, created query, select output to file, successfully generate reports from resulting flat file

    Failure: Placing query directly into a job step and directing output to file under Advanced tab results in file with heading line and dashes line. 3rd party reporting tool cannot find anything to report on.

    Questions:

    What is the 'best' way to create automated process?

    Do I need to install SSIS to facilitate process?

    Additional Info: I will need to create different extract files as multiple job steps, so once I've solved this problem, it will need to be applied (as of today) at least 6 times, with more in future.

    Lament: One of the challenges with using GUI is that you don't get to see the 'invisible' steps used by the GUI to circumvent default processes. IF in SSMS the content can be suppressed, how is it done and why can't it be done in T-SQL (other products allow such controls using a command 'SET' not available in T-SQL!)

  • steve smith-401573 (4/8/2011)


    Environment: SS 2008 SE, minimal installation (client tools, basic db only. No SSRS, no SSIS)

    BCP is a good candidate for this kind of work. It can deliver a text file from a query with just about any shape you need and is standard equipment with the SQL Server Client tools, i.e. no SSIS or DB Engine required and can be kicked off from any the cmd shell or PowerShell.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • BCP would be an excellent solution IF I knew the output file format. Where will I find the definition(s) for truncation that SSMS provides? When I examine an output file from SSMS that works, it does NOT match the field specifications in the source table (e.g., a 2048 character length is dramatically shortened). In order to use BCP, since the default settings are NOT the same as used within SSMS for field delimiting (SSMS uses a space, not a tab, according to MS Word), I seem to need a different method to arrive at field definitions for the output file than counting dashes .

    Also, since BCP does a translation on null fields (although there should be no null fields, those of us with dirty hands all know the implications of 'should'), how can I be assured that if I use a defined Format when null fields or empty fields arise, that my results will be conforming to the undocumented input format?

    And, of course, this is my very first attempt at using BCP so I have no experience to guide me...

  • BCP would be an excellent solution IF I knew the output file format. Where will I find the definition(s) for truncation that SSMS provides? When I examine an output file from SSMS that works, it does NOT match the field specifications in the source table (e.g., a 2048 character length is dramatically shortened). In order to use BCP, since the default settings are NOT the same as used within SSMS for field delimiting (SSMS uses a space, not a tab, according to MS Word), I seem to need a different method to arrive at field definitions for the output file than counting dashes .

    I'm not sure I understand what you mean. Are you saying you do not know the format of your tables? Or the format the reporting system wants? That is critical information for a project like this 🙂

    SSMS truncates text output. You can change it in Tools > Options > Query Results > SQL Server. BCP can mimic any format. If you're just wanting delimited output you do not need a format file. If you want fixed-width you can get away with no format file by telling BCP not to have any column delimiter and casting all columns to CHAR in the query you supply.

    Also, since BCP does a translation on null fields (although there should be no null fields, those of us with dirty hands all know the implications of 'should'), how can I be assured that if I use a defined Format when null fields or empty fields arise, that my results will be conforming to the undocumented input format?

    "undocumented input format" ... please define where said input is coming from and why it is of concern 🙂 You can replace NULL with the string 'NULL' in your query if that will help your reporting tools know what's what. You have control over all that when you use the queryout option of BCP.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I apologize for the confusion. Let's try this: SS 2008 SE database TABLE aggregates system logs from multiple servers for multiple 'customers'. The requirement is to produce, for each customer, analytical reports for the month just ended.

    The flow is: SQL Server Database --> flat file extract --> 3rd party reporting tool ('3prt').

    Using SSMS with the 'output to file' GUI control selected, produces a flat file in appropriate format. I am attempting to duplicate that process.

    However, in my first attempts to create a 'simple' job to schedule the flat file extract automatically, additional 'stuff' appears at the front of the file and prevents the '3prt' from successfully extracting content.

    I have now eliminated most of that 'stuff' in front (changed a memory table lookup to use a select sub-clause in my where clause to define the month criterion, but log messages also display in the output file, which means that I must use a tool like bcp, since I don't know of any SQL Server equivalent to 'echo off' for batch jobs.

    So, the challenge is to provide an input file to the '3prt' which has poorly defined requirements. In fact, the only input format described assumes a Linux operating system instead of the Windows versions we are using. That is the basis for my describing 'undefined input' - the input to the '3prt' is opaquely described. The query that now works correctly converts the windows log records to Apache format.

    Hope that makes things a little clearer in terms of what I'm trying to accomplish.

    Disclaimer: My career predates GUI and I'm still frustrated by an inability to really 'see' what's going on behind the screen. Given a choice between code and button, I'll prefer code so I KNOW what's been said.

  • steve smith-401573 (4/8/2011)


    I apologize for the confusion. Let's try this: SS 2008 SE database TABLE aggregates system logs from multiple servers for multiple 'customers'. The requirement is to produce, for each customer, analytical reports for the month just ended.

    The flow is: SQL Server Database --> flat file extract --> 3rd party reporting tool ('3prt').

    Using SSMS with the 'output to file' GUI control selected, produces a flat file in appropriate format. I am attempting to duplicate that process.

    However, in my first attempts to create a 'simple' job to schedule the flat file extract automatically, additional 'stuff' appears at the front of the file and prevents the '3prt' from successfully extracting content.

    I have now eliminated most of that 'stuff' in front (changed a memory table lookup to use a select sub-clause in my where clause to define the month criterion, but have not tested to see what additional 'stuff' may now appear. If none, then job is done. If yes, then must resort to bcp or other method to weed away the 'stuff' leaving fixed length data records. (As it turns out, the original goal of removing the two header rows in the flat file is not needed after all; only fixed length records of the correct length.)

    So, the challenge is to provide an input file to the '3prt' which has poorly defined requirements. In fact, the only input format described assumes a Linux operating system instead of the Windows versions we are using. That is the basis for my describing 'undefined input' - the input to the '3prt' is opaquely described. The query that now works correctly converts the windows log records to Apache format.

    Hope that makes things a little clearer in terms of what I'm trying to accomplish.

    Disclaimer: My career predates GUI and I'm still frustrated by an inability to really 'see' what's going on behind the screen. Given a choice between code and button, I'll prefer code so I KNOW what's been said.

    Cool, off we go to the cmd prompt...try something like this:

    C:\>bcp "SELECT CAST(name AS CHAR(100)) AS name, ' ' AS a, CAST(object_id AS CHAR(20)) AS object_id, ' ' AS b, CAST(principal_id AS CHAR(18)) AS principal_id, ' ' AS c, CAST(SCHEMA_ID AS CHAR(11)) AS SCHEMA_ID FROM sys.objects;" queryout .\objects.txt -T -c -t "" -S SERVER_NAME\INSTANCE_NAME

    edit: add -t "" to override default field delimiter of tab

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Well, I finally figured out that bcp is an .exe requiring a command window. However, I want to specify an input file rather than enter the query, because the query has a subquery and, in its formatting of a column, uses the phrase { '"' } which would CLEARLY threaten to wreak havoc with quoted strings.

    1. How do I tell WHICH version of bcp.exe am I using? When I tried using the command { bcp queryout "C:\...\xxx.sql" -w -o "filename", -T } I receive a meaningless error response of "Copy direction must be either 'in', 'out', or 'format'. "queryout" does not appear to be an option, even thought it is listed as an exclusive choice in the command help. It seems that this is an older version of bcp?

    2. How do I debug my command string to figure out what rule I'm breaking? And, how will I integrate this into a SQL Server job schedule? Or will I simply need to use the Windows job scheduler, in which case, I'm really looking to use powershell instead?

    3. Are there limitations on the types of data transformations that are allowed in bcp? For instance, concatenating fields and character strings together to form one output field? bcp seems to be designed for high volume brute force rather than the lighter touch I need to use in filtering and reformatting.

    4. While bcp looks to be ideal for a method to take a huge database and extract out into a new table a subset of the original data (fewer fields, fewer records) without creating huge log entries, it does not seem to be ready to get me where I want to be - with an output flat file ready for the report tool.

    5. Should I be constructing a 'view' to use as an input string and then simply filter through the 'where' clause?

  • steve smith-401573 (4/8/2011)


    Well, I finally figured out that bcp is an .exe requiring a command window. However, I want to specify an input file rather than enter the query, because the query has a subquery and, in its formatting of a column, uses the phrase { '"' } which would CLEARLY threaten to wreak havoc with quoted strings.

    1. How do I tell WHICH version of bcp.exe am I using? When I tried using the command { bcp queryout "C:\...\xxx.sql" -w -o "filename", -T } I receive a meaningless error response of "Copy direction must be either 'in', 'out', or 'format'. "queryout" does not appear to be an option, even thought it is listed as an exclusive choice in the command help. It seems that this is an older version of bcp?

    C:\>bcp -v

    Also helpful:

    C:\>bcp /?

    2. How do I debug my command string to figure out what rule I'm breaking?

    - Read the docs. http://msdn.microsoft.com/en-us/library/ms162802.aspx (examples at bottom)

    - Look for examples online.

    - Trial and error. 🙂

    And, how will I integrate this into a SQL Server job schedule? Or will I simply need to use the Windows job scheduler, in which case, I'm really looking to use powershell instead?

    Make a SQL Agent Job Step with a Step Type of CmdExec which will execute any commands in the step as if they were executed from a cmd shell prompt.

    3. Are there limitations on the types of data transformations that are allowed in bcp? For instance, concatenating fields and character strings together to form one output field? bcp seems to be designed for high volume brute force rather than the lighter touch I need to use in filtering and reformatting.

    No. If you're using queryout then you are only limited by what T-SQL can do for you. The full set of T-SQL commands are there for you to use in your query. BCP only passes your query to the DB engine and waits for a tabular resultset to come back.

    4. While bcp looks to be ideal for a method to take a huge database and extract out into a new table a subset of the original data (fewer fields, fewer records) without creating huge log entries, it does not seem to be ready to get me where I want to be - with an output flat file ready for the report tool.

    I think BCP will do what you want. Don't get discouraged. Write your query first and get it working as you want in SSMS which is a much more comfortable env to write queries that on the cmd line...then tackle the BCP piece. At the end of the day it might be cleaner to create a stored procedure with your query in it and then just call the proc from bcp.

    5. Should I be constructing a 'view' to use as an input string and then simply filter through the 'where' clause?

    Don't try to do too much in a View. For complicated filtering logic or data manipulation before returning to the caller a stored procedure or user-defined table-valued function is usually a solid choice.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The sql query is written and works. To accommodate the limitations of bcp performing only a single select, I took my memory table and made it a permanent lookup table in the database - 12 records, two fields.

    My challenge is in making bcp work; the few examples I've reviewed so far are too simple to demonstrate all that I want to do in one line. But the saga will be continued on Monday....

    Thanks for all your help, advice and support! Have a great weekend.

  • steve smith-401573 (4/8/2011)


    The sql query is written and works. To accommodate the limitations of bcp performing only a single select, I took my memory table and made it a permanent lookup table in the database - 12 records, two fields.

    You are better off putting everything in a stored procedure and calling that from bcp.

    Like this:

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'dbo.work_table')

    AND type IN (N'U') )

    DROP TABLE dbo.work_table

    GO

    CREATE TABLE dbo.work_table

    (

    name sysname,

    object_id int,

    principal_id int,

    schema_id int

    ) ;

    GO

    ALTER PROC dbo.get_objects

    AS

    BEGIN

    SET NOCOUNT ON;

    INSERT INTO dbo.work_table

    (

    name,

    object_id,

    principal_id,

    schema_id

    )

    SELECT name,

    object_id,

    principal_id,

    schema_id

    FROM sys.objects ;

    /*

    do additional work on dbo.work_table

    .

    .

    .

    */

    -- deliver resultset with proper shape

    SELECT CAST(name AS CHAR(100)) AS name,

    ' ' AS a,

    CAST(object_id AS CHAR(20)) AS object_id,

    ' ' AS b,

    CAST(principal_id AS CHAR(18)) AS principal_id,

    ' ' AS c,

    CAST(SCHEMA_ID AS CHAR(11)) AS SCHEMA_ID

    FROM dbo.work_table ;

    TRUNCATE TABLE dbo.work_table;

    END

    GO

    My challenge is in making bcp work; the few examples I've reviewed so far are too simple to demonstrate all that I want to do in one line. But the saga will be continued on Monday....

    Here would be your bcp command line calling the proc above:

    [font="Courier New"]C:\>bcp "EXEC DATABASE_NAME.dbo.get_objects" queryout .\objects.txt -T -c -t "" -S SERVER_NAME\INSTANCE_NAME[/font]

    Thanks for all your help, advice and support!

    No problem at all...you are helping me as well.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Well, I'm happy to report that I have managed to get the bcp command to properly execute. I've even managed to experiment with the '.\..\.....' prefixing to correctly point the directory starting from the current directory, rather than absolute reference. Of course, in a batch job, I think I'd prefer the absolute reference, but it's nice to know how to move about in a relative way, as well...

    Still to be done is to figure out the 'best' way to automate the process every month. Probably a straight-up batch job in Windows scheduler.

  • That's great news! Windows Scheduler calling a batch file would work. SQL Agent doing the same, or calling bcp.exe directly would also work. SQL Agent jobs have a step type called CmdExec that will allow you to execute a batch of commands as if you were at a cmd shell prompt.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I intend to use SQL Agent processing to test the actual scripts just to confirm that, as written, the commands do what I want.

    I also need to experiment with the capture of output. Since I don't have any control over the server, using a scheduled batch job may allow me to ensure visibility over the log results.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply