July 12, 2010 at 11:39 am
A freind has asked me to import records from a CSV file. There are 10 fields and 208,000 records per file. There will be 30 files to be uploaded at a time.
I used All Tasks >> IMPORT to load the file into a table. This was quick, less than 30 seconds. But I forsee problems.
The Import routine creates 10 VarChar(50) fields. I need to define the field types ahead of time. There are decimal, character and DateTime fields.
There are no indexes.
I do not know how to IMPORT the second CSV file into the same table.
I only want to IMPORT 4 fields, not all 10.
I want all the files to be IMPORTed into the same table without interaction by the user.
As a VB programmer, my solution would be to write a VB progam to do this. But I do not believe it would be as fast as the IMPORT process in SQL 2008.
What would be the best way to get some of the fields in all these files into a single table with minimal user interaction and quickly?
I read the article "SQL Server 2008 Table Valued Parameters Performance " but dealt with XML files and not CSVs.
Thanks,
pat
July 12, 2010 at 12:50 pm
Do a books online search for BULK INSERT or BCP. They are exactly what you are looking for. You'll need a formatfile to describe the structure of the files and from there you just import the sections you need.
July 12, 2010 at 2:25 pm
Jeff,
Thanks for pointing me in the right direction. It looks like the Format File will do all that I need. I assume if I do not specify a field in the Format File that the field in the .txt file will be skipped. Correct?
However, BOL does not give a very good example of a Format File. Nor does it reference a good list of examples.
Do all Format Files use XML? Where can I find the specifics of how to write a Format File with the SQL data Types Decimal(9,4), Int, varChar(30) and smalldatetime? The BOL example only shows a Float data type, "SQLFLT8". How do I translate, e.g Decimal(9,4), into a format recognized by the XML Format File?
Thanks,
pat
July 14, 2010 at 8:30 pm
I haven't found the complete answer but I wanted to update this post with what I have found.
The following Bulk Insert command will move data from a text file to a SQL table.
BULK INSERT Price
FROM 'C:\Price.csv' WITH (FORMATFILE='C:\PriceFormat.txt')
Below is the table create statement and one line of data from the text file.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Price](
[name] [varchar](50) NULL,
[TimePoint] [smalldatetime] NULL,
[rtPrice] [decimal](9, 2) NULL,
[rtLoss] [decimal](9, 2) NULL,
[rtCongestion] [decimal](9, 2) NULL,
[daPrice] [decimal](9, 2) NULL,
[daCongestion] [decimal](12, 4) NULL,
[daLoss] [decimal](12, 4) NULL,
[versifyId] [bigint] NULL,
[externalNodeId] [bigint] NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
now the text file:
,2010-06-17 05:00:00,35.38,0.04,0.09,35,-0.07,0.01,1711,1
I have modified the original FormatFile to exclude some data. This excluded data can be identified by having a 0 column number in the SQL Field no column.
9.0
10
1SQLCHAR00","0name""
2SQLCHAR00","2TimesPoint""
3SQLCHAR00","3rtPrice""
4SQLCHAR00","0rtLoss""
5SQLCHAR00","0rtCongestion""
6SQLCHAR00","6daPrice""
7SQLCHAR00","0daCongestion""
8SQLCHAR00","0daLoss""
9SQLCHAR00","9versifyId""
10SQLCHAR00"\r"10externalNodeID""
Most of my research came from: http://msdn.microsoft.com/en-us/library/ms191479.aspx
What I still cannot do and would like some advice on is how do is to import 5 fields into a SQL table that has only 5 fields in the SQL table when the original text file has 10 fields.
Here is the new table Create statement
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Price1](
[TimePoint] [smalldatetime] NULL,
[rtPrice] [decimal](9, 2) NULL,
[daPrice] [decimal](9, 2) NULL,
[versifyId] [bigint] NULL,
[externalNodeId] [bigint] NULL
) ON [PRIMARY]
My attempt at the FormatFile. This DID NOT work.
9.0
10
2SQLCHAR00","1TimesPoint""
3SQLCHAR00","2rtPrice""
6SQLCHAR00","3daPrice""
9SQLCHAR00","4versifyId""
10SQLCHAR00"\r"5externalNodeID""
Can anyone help with what the FormatFile would look like to do this?
thanks,
pat
July 15, 2010 at 12:44 am
can try SSIS foreach container to loop the files and load into one table?
July 15, 2010 at 4:35 am
Changebluesky,
Thanks for the idea. I will be using VB to download them from the FTP site, to loop through the files and to do the Bulk Insert. The only thing I am having difficulty with at the moment is structuring the FormatFile file so that I only import 5 of the 10 text fields into the SQL Table that only contains 5 fields.
pat
July 15, 2010 at 5:07 am
Using OPENROWSET BULK you can specify the column list eg
INSERT Price1
([TimePoint],
[rtPrice],
[daPrice],
[versifyId],
[externalNodeId]
)
SELECT
TimesPoint,
rtPrice,
daPrice,
versifyId,
externalNodeID
FROM OPENROWSET (BULK 'C:\Price.csv', FORMATFILE = 'C:\PriceFormat.txt') AS Z
July 15, 2010 at 8:28 am
Steve,
Thank you for adding the Openrowset code. I tried it and it did not work. I received the message, "Invalid column number in the Format File c:\Formatfile.txt". I googled Openrowset csv but couldn't find any examples that import fewer columns than are in originating csv/txt file.
Could you explain what is wrong with the FormatFile.txt I listed above?
Thanks,
pat
P.S. There is an error in my post and it was copied into your solution. It should be TimePoint and not TimesPoint. Sorry.
July 15, 2010 at 8:36 am
It seems to me that the formatfile should look more like
9.0
10
1SQLCHAR00","0""
2SQLCHAR00","2TimePoint""
3SQLCHAR00","3rtPrice""
4SQLCHAR00","0""
5SQLCHAR00","0""
6SQLCHAR00","6daPrice""
7SQLCHAR00","0""
8SQLCHAR00","0""
9SQLCHAR00","9versifyId""
10SQLCHAR00"\r"10extermalNodeID""
If I do not include all the columns from the csv file in this definition, SQL server will not know when the csv file Field 1 begins or ends. In my other shortened FormatFile in the previous reply, the file begins with Field 2. It makes sense to me that this would fails since I didn't tell SQL server what Field 1 in the csv file looks like.
But I get the same "Invalid column..." error message with the complete FormatFile above.
This is all very confusing to a first time user of FormatFile.
July 15, 2010 at 8:39 am
Pat, you will still need to use the full file definition you posted. The format file still has to define each column but when you use OPENROWSET, you can only select the fields you really want.
9.0
10
1 SQLCHAR 0 0 "," 0 name ""
2 SQLCHAR 0 0 "," 2 TimesPoint ""
3 SQLCHAR 0 0 "," 3 rtPrice ""
4 SQLCHAR 0 0 "," 0 rtLoss ""
5 SQLCHAR 0 0 "," 0 rtCongestion ""
6 SQLCHAR 0 0 "," 6 daPrice ""
7 SQLCHAR 0 0 "," 0 daCongestion ""
8 SQLCHAR 0 0 "," 0 daLoss ""
9 SQLCHAR 0 0 "," 9 versifyId ""
10 SQLCHAR 0 0 "\r" 10 externalNodeID ""
/* Anything is possible but is it worth it? */
July 15, 2010 at 8:42 am
I believe you need to give each column a name.
mpdillon (7/15/2010)
It seems to me that the formatfile should look more like9.0
10
1SQLCHAR00","0""
2SQLCHAR00","2TimePoint""
3SQLCHAR00","3rtPrice""
4SQLCHAR00","0""
5SQLCHAR00","0""
6SQLCHAR00","6daPrice""
7SQLCHAR00","0""
8SQLCHAR00","0""
9SQLCHAR00","9versifyId""
10SQLCHAR00"\r"10extermalNodeID""
If I do not include all the columns from the csv file in this definition, SQL server will not know when the csv file Field 1 begins or ends. In my other shortened FormatFile in the previous reply, the file begins with Field 2. It makes sense to me that this would fails since I didn't tell SQL server what Field 1 in the csv file looks like.
But I get the same "Invalid column..." error message with the complete FormatFile above.
This is all very confusing to a first time user of FormatFile.
/* Anything is possible but is it worth it? */
July 15, 2010 at 8:42 am
Steve,
I have stumbled upon a half a##ed solution. If I first import the whole csv file into a SQL table that has all the rows. Then I do an Insert using from the intermediate table, Selecting only the columns I want, my final table. This produces the net effect I want but it seems like a round about way and kind of sloppy.
So I would still like to find the solution that skips the intermediate table and posts only 5 of the 10 fields in the csv file into a SQL table that has 5 fields.
July 15, 2010 at 8:45 am
Can you show us the code you used to import into your staging table?
mpdillon (7/15/2010)
Steve,I have stumbled upon a half a##ed solution. If I first import the whole csv file into a SQL table that has all the rows. Then I do an Insert using from the intermediate table, Selecting only the columns I want, my final table. This produces the net effect I want but it seems like a round about way and kind of sloppy.
So I would still like to find the solution that skips the intermediate table and posts only 5 of the 10 fields in the csv file into a SQL table that has 5 fields.
/* Anything is possible but is it worth it? */
July 15, 2010 at 8:52 am
Gatekeeper,
Thanks. I did not see your post while I continued to work on this issue. You were 100% correct. When i used the full FormatFile the import was successful.
This contradicts what I posted just a few minutes ago. Obviously, I made some mistake in my testing which led to that post. Please ignore my previous reply if you find this thread while searching for an answer to a similar problem.
Thank everyone.
pat
July 15, 2010 at 8:58 am
Glad to know you got it figured out. BTW, BCP format file issues are a pain to understand 🙂
You can use the http://msdn.microsoft.com/en-us/library/ms191479.aspx MSDN article to understand the format better.
mpdillon (7/15/2010)
Gatekeeper,Thanks. I did not see your post while I continued to work on this issue. You were 100% correct. When i used the full FormatFile the import was successful.
This contradicts what I posted just a few minutes ago. Obviously, I made some mistake in my testing which led to that post. Please ignore my previous reply if you find this thread while searching for an answer to a similar problem.
Thank everyone.
pat
/* Anything is possible but is it worth it? */
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply