June 3, 2009 at 2:39 pm
Hey Guys,
Need a suggestion on how to tackle this, or any existing code is even better.... I will have a table that holds the filename and the field delimiter... the first row of the file will have the column names... how can I create a table with the first row as the column names and then load the file into that table?
We will have multiple files in the directory at one time; however, the metatable will have the filename and delimiter info.
Thanks for the help guys.
June 3, 2009 at 6:05 pm
Bulk Insert into a staging table.
Use the "Quirky" update method to strip each row apart into its parts. A rough sketch of the staging load is below.
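A minimal sketch of the staging load, assuming a hypothetical #Staging table and file path, and a field terminator that never occurs in the data (so each whole line lands in the single column):
if object_id('tempdb..#Staging') is not null DROP TABLE #Staging
CREATE TABLE #Staging (
LineText varchar(max))
-- '\0' (the null terminator) is assumed never to appear in the file,
-- so BULK INSERT treats each line as one field and loads it whole
BULK INSERT #Staging
FROM 'C:\Import\somefile.txt' -- hypothetical path; read it from your metatable
WITH (FIELDTERMINATOR = '\0', ROWTERMINATOR = '\n')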
If you can provide sample data, people on this web site will bend over backwards to help you. Without it, well, you're pretty much on your own, since those on this site that help out are unpaid volunteers. See the link in my signature for how to do this.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 3, 2009 at 6:31 pm
Hey Wayne,
Thanks for the response... Let me see how far you can bend over backwards for this...
A sample file will be:
id, fname, lname, add1, add2
111, scott, tiger, 123 your way, apt 1
................................................
another file can be:
pac_id, pac_name, add1, add2, ...... max_pac_amt
999, Clinton for president, 333 washington way, # 333, .......... 7500
Files can have different columns, and the first row will have the column names... a control table will have the filename and the delimiter the file uses.
Thanks
June 3, 2009 at 8:02 pm
How's this?
if object_id('tempdb..#Temp') is not null DROP TABLE #Temp
CREATE TABLE #Temp (
LineText varchar(max))
-- you will need to replace this with a bulk insert.
-- I'm using this just to load some test data in.
-- I also added an extra row to ensure that the string concatenation works properly.
insert into #Temp
select 'id, fname, lname, add1, add2' UNION ALL
select '111, scott, tiger, 123 your way, apt 1' UNION ALL
select '222, kevin, lion, 456 my way, apt 102'
-- after the data is bulk-inserted, add a RowID identity column and make a PK on it.
ALTER TABLE #Temp ADD RowID int NOT NULL identity
ALTER TABLE #Temp ADD CONSTRAINT [PK_Temp] PRIMARY KEY CLUSTERED (RowID)
declare @TempStr varchar(max)
declare @Delimiter char(1)
set @Delimiter = ',' -- set from the table for this file
-- you don't have any field definitions, so assuming varchar(50) for all
-- this method does assume that they are all the same datatype.
-- Otherwise, you will have to create the table ahead of time.
-- In that case, you won't need these next 2 lines.
select @TempStr = 'CREATE TABLE Temp (' + replace(LineText, @Delimiter, ' varchar(50),') + ' varchar(50))' from #Temp where RowID = 1
exec (@TempStr)
-- now, time to insert all the other data.
-- build one huge string with all of the data
set @TempStr = (
select ' SELECT ' + replace(QuoteName(LineText, char(39)), @Delimiter, char(39) + ',' + char(39)) + ' UNION ALL'
from #Temp
where RowID > 1
FOR XML PATH(''))
set @TempStr = 'INSERT INTO TEMP ' + left(@TempStr, len(@TempStr)-9)
exec (@TempStr)
select * from temp
if object_id('temp') is not null drop table temp
if object_id('tempdb..#Temp') is not null DROP TABLE #Temp
Note that this even handles the case where the delimiter is not a comma AND a comma appears in the data (this is what the QuoteName function is used for). Also note that I used char(39), which is a single quote. I find it easier to use when dynamically building SQL code... it's easier to read and understand than '''' (4 quotes).... is that 4 quotes, a double quote surrounded by single quotes, or 2 double quotes?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 3, 2009 at 8:57 pm
Thanks Wayne.... this is great. It will certainly save me loads of time. I guess I can also adapt this to work with comma-delimited files that use quotes as a text qualifier...
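One simple way to handle that (a rough sketch, and it assumes the qualified fields never contain an embedded delimiter) would be to strip the qualifier from the staging rows before building the CREATE TABLE and INSERT strings:
-- remove the text qualifier; if quoted fields can contain the delimiter,
-- you would need real CSV parsing or a format file instead
UPDATE #Temp
SET LineText = replace(LineText, '"', '')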
June 4, 2009 at 9:12 am
I was wondering if varchar(max) can handle more than 2 GB of data, since the entire file will be loaded into one varchar(max) value? A few files will be as much as 10 GB in size...
June 4, 2009 at 10:21 am
A workaround I am thinking would work (rough sketch below):
--- bulk insert just the first record into #temp
--- create table temp using that info
--- bulk insert the rest of the data into temp using the delimiter and text qualifier
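A minimal sketch of that idea (file path assumed; the header is isolated with LASTROW and then skipped with FIRSTROW):
-- 1) read just the header line into a staging table
if object_id('tempdb..#Header') is not null DROP TABLE #Header
CREATE TABLE #Header (
LineText varchar(max))
BULK INSERT #Header
FROM 'C:\Import\somefile.txt' -- hypothetical path from the control table
WITH (FIELDTERMINATOR = '\0', ROWTERMINATOR = '\n', LASTROW = 1)
-- 2) build and exec the CREATE TABLE Temp (...) string from #Header,
--    exactly as in the earlier post
-- 3) load the data rows straight into Temp, skipping the header row
--    note: plain BULK INSERT has no text-qualifier option, so quoted
--    fields would still need a format file or post-processing
BULK INSERT Temp
FROM 'C:\Import\somefile.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)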
June 4, 2009 at 11:56 am
Ghanta (6/4/2009)
I was wondering if varchar(max) can handle more than 2 GB of data, since the entire file will be loaded into one varchar(max) value? A few files will be as much as 10 GB in size...
No, a bulk insert will load each line in the file into a separate row in the staging table.
The problem will be when you generate the string to do the inserts... even if your file is less than 2 GB, if it's close, the generated string could exceed the 2 GB limit, and that is all being loaded into one varchar(max) variable.
In this case, you have essentially two choices:
1. process the staging table in steps.
2. Add extra columns to the staging table and use the "quirky update" method to populate those fields. You could then either alter the staging table and drop the unnecessary columns, or select the desired columns into your new permanent table.
Personally, I'd go with the quirky update. I think it will be a lot faster.
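If the one-big-string approach does get too close to the 2 GB limit, a rough sketch of option 1 (building and running the INSERT in batches; batch size and names assumed, reusing the code from the earlier post) could look like:
declare @TempStr varchar(max)
declare @Delimiter char(1)
declare @FromRow int, @MaxRow int, @BatchSize int
set @Delimiter = ','   -- set from the control table for this file
set @BatchSize = 10000 -- assumed batch size; tune so each string stays small
select @MaxRow = max(RowID) from #Temp
set @FromRow = 2       -- skip the header row
while @FromRow <= @MaxRow
begin
    set @TempStr = (
        select ' SELECT ' + replace(QuoteName(LineText, char(39)), @Delimiter, char(39) + ',' + char(39)) + ' UNION ALL'
        from #Temp
        where RowID between @FromRow and @FromRow + @BatchSize - 1
        FOR XML PATH(''))
    set @TempStr = 'INSERT INTO Temp ' + left(@TempStr, len(@TempStr)-9)
    exec (@TempStr)
    set @FromRow = @FromRow + @BatchSize
end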
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 15, 2009 at 3:34 pm
Hey Wayne,
Thanks for your help... this process works great with a delimited file where the first row is the header... How can we deal with fixed-width files? They will provide us the field names and lengths in a dictionary file... is there a way to create a format file dynamically (for such files) using the info in the dictionary file, so that I can bulk insert into the data table using the format file option?
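Something like this is what I am picturing the generated format file would contain for a hypothetical three-field dictionary (field widths assumed; the version line depends on the server, e.g. 9.0 for 2005 or 10.0 for 2008), which BULK INSERT could then use via WITH (FORMATFILE = '...'):
9.0
3
1   SQLCHAR   0   10   ""       1   id      ""
2   SQLCHAR   0   25   ""       2   fname   ""
3   SQLCHAR   0   25   "\r\n"   3   lname   ""
The empty terminators make the fields fixed-width; only the last field carries the row terminator.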
Thanks for the help.