How to import multiple CSV files.

  • Slowly yet surely my faith in microsoft products is being worn away.

    Latest rug pulled from under the feet is the scrapping of OPENROWSET from a csv file.

    Simplest task in the world using a free database MYSQL, having spent many days trying to find a workaround, I am still non the wiser as to how to get many large CSV files into SQL2008R2

    I have read and read about bulk insert as this seems like the only way to get the data into sql, yet it will not work with my csv files.

    Currently unable to add my attachment of the extract from csv file. Will post this and try to reply with an attachement, most bizare.

  • Using windows7 and IE 8.0, was unable to add an attachment:w00t: Have had to go onto win xp machine to add attachment?

    The attachment is the first 10,000 records of the csv file(s) I am trying to import, in actual fact each file cotains 500,000 rows. Below is the table definition, the dates I have defined as char as they sometimes do contain rubbish data that screws up the whole import.

    [SPORTS_ID] [int] NULL,

    [EVENT_ID] [int] NULL,

    [SETTLED_DATE] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [FULL_DESCRIPTION] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [SCHEDULED_OFF] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [EVENT] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [ACTUAL_OFF] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [SELECTION_ID] [int] NULL,

    [SELECTION] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [ODDS] [real] NULL,

    [NUMBER_BETS] [int] NULL,

    [VOLUME_MATCHED] [real] NULL,

    [LATEST_TAKEN] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [FIRST_TAKEN] [varchar](255) COLLATE Latin1_General_CI_AS NULL,

    [WIN_FLAG] [int] NULL,

    [IN_PLAY] [varchar](255) COLLATE Latin1_General_CI_AS NULL

    Any help would be greatly appreciated, it is driving me bonkers.

  • I am not any help, but i sure hope you find a good answer because I am going to be in your shoes in about 3 to 4 months with a bunch of CSV files.

    Never stop learning or you will be left behind.
  • This is what I tried from other postings on this subject from the forums:-

    I setup a format file (XML as advised by MSDN) and tried the OPENROWSET (BULK...... posted on another thread on here.

    I get a bunch of error messages: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row xx, column 1 (SPORTS_ID). The data is fine for column 1.

  • you could try running bcp from command prompt , or even create a ssis package which will give you better control flow.

    I would recommend these because it looks like your performing ETl and these bcp and SSIS are most robust.

    Jayanth Kurup[/url]

  • I recommend SSIS, a single data push task, a source and destination for each csv needed. Eezpeezee :-P:-P

    HTH :-):-)
    taybre

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • dji (6/23/2011)


    Using windows7 and IE 8.0, was unable to add an attachment:w00t: Have had to go onto win xp machine to add attachment?

    What is the path of the file?

    When you go to a different machine are you using a different login?

    Can you navigate to the file in Windows Explorer from the machine that you can't specify the file?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • The problem with the SSIS solution is that I need to import lots (100+) of differently named files. I even bought an application that supposedly joined multiple csv files so I could use SSIS on the joined files but my files were too big and blew the application, $19.99 down the chute!!

    I think the windows 7 problem is to do with ie8.0 preventing pop ups, the whole move to win7 64 bit has been a nightmare from start to finish.

    The easiest option for me is to migrate to MYSQL, the way in which MS simply remove features on a whim is crazy.

  • SSIS has a for each loop task which can be used to iterate thru as many files as required.

    Jayanth Kurup[/url]

  • one option might be if you really want it in SQL is import all the data to MYSQL then move it to SQL.

    Not for sure if this will work for SQL 2008, but might be worth a try.

    http://www.codeproject.com/KB/database/migrate-mysql-to-mssql.aspx

    Never stop learning or you will be left behind.
  • Jayanth_Kurup (6/23/2011)


    SSIS has a for each loop task which can be used to iterate thru as many files as required.

    You should consider this option.

    MySQL? :w00t:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Thanks for the further replies.

    The SSIS method would require learning a whole new language by the sounds of things? My only experience with SSIS is through the wizard which offers no options for loops etc.

    The codeproject solution relies on ODBC, which is part of the problem, MS decided to stop support for ODBC in 64 bit SQL. This is the reason OPENROWSET from a flat file no longer works.

    The move to 64 bit computing has been the biggest retrograde step in my lifetime, thanks to microsoft.

  • dji (6/23/2011)


    The SSIS method would require learning a whole new language by the sounds of things? My only experience with SSIS is through the wizard which offers no options for loops etc..

    The following is an example of a For Each Loop:

    http://www.sqlis.com/post/Looping-over-files-with-the-Foreach-Loop.aspx

    The codeproject solution relies on ODBC, which is part of the problem, MS decided to stop support for ODBC in 64 bit SQL. This is the reason OPENROWSET from a flat file no longer works..

    The fact that your CodeProject uses ODBC will not prevent you from using OLE DB in an SSIS Project.

    I would attempt to learn SSIS if I were in your shoes. 🙂

    The move to 64 bit computing has been the biggest retrograde step in my lifetime, thanks to microsoft.

    Try to not be discouraged, it's all good. 😎

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (6/23/2011)


    dji (6/23/2011)


    The SSIS method would require learning a whole new language by the sounds of things? My only experience with SSIS is through the wizard which offers no options for loops etc..

    The following is an example of a For Each Loop:

    http://www.sqlis.com/post/Looping-over-files-with-the-Foreach-Loop.aspx

    The codeproject solution relies on ODBC, which is part of the problem, MS decided to stop support for ODBC in 64 bit SQL. This is the reason OPENROWSET from a flat file no longer works..

    The fact that your CodeProject uses ODBC will not prevent you from using OLE DB in an SSIS Project.

    I would attempt to learn SSIS if I were in your shoes. 🙂

    The move to 64 bit computing has been the biggest retrograde step in my lifetime, thanks to microsoft.

    Try to not be discouraged, it's all good. 😎

    What happens when you try to use a ODBC connection? we have old apps using ODBC connected to SQL 2008 R2 x64

    I just tested a connection from win 7 x64 to a brand new SQL box with r2 x64 on it

    Microsoft SQL Server ODBC Driver Version 06.01.7601

    Running connectivity tests...

    Attempting connection

    Connection established

    Verifying option settings

    Disconnecting from server

    TESTS COMPLETED SUCCESSFULLY

    Never stop learning or you will be left behind.
  • ok, BULK insert's about a zillion times easier to use than SSIS.

    your original table definition that you posted is missing two columns that appear in your data, country, and selection i think it was..

    this works perfectly, 10K rows in 0 seconds, i'd have to trace it to get the milliseocds it was so quick::

    CREATE TABLE [dbo].[TBLEXAMPLE] (

    [SPORTS_ID] INT NULL,

    [EVENT_ID] INT NULL,

    [Country] VARCHAR(255) NULL,

    [SETTLED_DATE] VARCHAR(255) NULL,

    [FULL_DESCRIPTION] VARCHAR(255) NULL,

    [COURSE] VARCHAR(255) NULL,

    [SCHEDULED_OFF] VARCHAR(255) NULL,

    [EVENT] VARCHAR(255) NULL,

    [ACTUAL_OFF] VARCHAR(255) NULL,

    [SELECTION_ID] INT NULL,

    [SELECTION] VARCHAR(255) NULL,

    [ODDS] decimal(19,4) NULL,

    [NUMBER_BETS] INT NULL,

    [VOLUME_MATCHED] decimal(19,4) NULL,

    [LATEST_TAKEN] VARCHAR(255) NULL,

    [FIRST_TAKEN] VARCHAR(255) NULL,

    [WIN_FLAG] INT NULL,

    [IN_PLAY] VARCHAR(255) NULL)

    BULK INSERT tblexample FROM 'c:\data\abc.csv'

    WITH (

    DATAFILETYPE = 'char',

    FIELDTERMINATOR = ',',

    ROWTERMINATOR = '\n',

    FIRSTROW = 2

    )

    now from thhere, if you have thousands of files, if you have xp_cmdShell enabled, you could use that and a cursort to get the list of all files forma given directly, and then loop thru each file with sime dynamic SQL.

    I've adapted an old example (this part is untested!) to do exactly that ; use this as a model:

    --a table to loop thru filenames drop table ALLFILENAMES

    CREATE TABLE ALLFILENAMES(WHICHPATH VARCHAR(255),WHICHFILE varchar(255))

    --the source table: yours already exists, but needed for this example.

    CREATE TABLE BULKACT(RAWDATA VARCHAR (8000))

    --some variables

    declare @filename varchar(255),

    @path varchar(255),

    @sql varchar(8000),

    @cmd varchar(1000)

    --get the list of files to process:

    --#########################################

    SET @path = 'C:\DB\'

    SET @cmd = 'dir ' + @path + '*.csv /b'

    INSERT INTO ALLFILENAMES(WHICHFILE)

    EXEC Master..xp_cmdShell @cmd

    UPDATE ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null

    --#########################################

    --cursor loop

    declare c1 cursor for SELECT WHICHPATH,WHICHFILE FROM ALLFILENAMES where WHICHFILE like '%.txt%'

    open c1

    fetch next from c1 into @path,@filename

    While @@fetch_status <> -1

    begin

    --bulk insert won't take a variable name, so make a sql and execute it instead:

    set @sql = 'BULK INSERT TBLEXAMPLE FROM ''' + @path + @filename + ''' '

    WITH (

    DATAFILETYPE = 'char',

    FIELDTERMINATOR = ',',

    ROWTERMINATOR = '\n',

    FIRSTROW = 2

    ) '

    print @sql

    exec (@sql)

    fetch next from c1 into @path,@filename

    end

    close c1

    deallocate c1

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 15 posts - 1 through 15 (of 43 total)

You must be logged in to reply to this topic. Login to reply