May 29, 2010 at 5:44 am
Hi All
I have a question that is really urgent!!! Please reply.
I have flat files to be inserted to a table using Bulk Insert.
Each flat file comes from a different source (retail stores), and each file name has the retail store ID
in it, like "SalesData0024"....0024 being the store ID. Each flat file may have one or more rows in it (comma separated ',' and new line '\n').
THE question is: while we import the data, it is required to extract the storeID from the file name and add the storeID as a new column. If the file has more than one row, the same store ID should be repeated for each row.
Let me give you the stored procedure, which perfectly loops through the files and inserts the rows into a table.
alter procedure [dbo].[BINS_ImportMultipleFiles]
    @filepath varchar(500),
    @pattern varchar(100),
    @TableName varchar(128)
as
set quoted_identifier off
declare @query varchar(1000)
declare @max1 int
declare @count1 int
declare @filename varchar(100)
set @count1 = 0

create table #x (name varchar(200))
set @query = 'master.dbo.xp_cmdshell ''dir ' + @filepath + @pattern + ' /b'''
insert #x exec (@query)
delete from #x where name is null

select identity(int,1,1) as ID, name into #y from #x
drop table #x
set @max1 = (select max(ID) from #y)

while @count1 <= @max1
begin
    set @count1 = @count1 + 1
    set @filename = (select name from #y where [id] = @count1)
    set @query = 'BULK INSERT ' + @TableName + ' FROM ''' + @filepath + @filename + '''
        WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'')'
    exec (@query)
end
drop table #y
Note: SSIS cannot be used. If each file had only a single row, creating an identity column and joining it to a store table (which would have a storeId and an identity column) would have been easier.
SQL 2000
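For what it's worth, here is one way the loop in the procedure above could be extended to stamp the store ID onto the loaded rows. This is only a sketch under two assumptions not stated in the post: the file names always end in a four-digit store ID with no extension, and the target table has a storeID column that is NULL only for freshly loaded rows.

```sql
-- Inside the WHILE loop, right after the BULK INSERT of @filename:
-- pull the last four characters of the file name as the store ID
-- (assumption: names like 'SalesData0024' with no extension).
declare @storeID varchar(5)
set @storeID = substring(@filename, len(@filename) - 3, 4)

-- Stamp the rows that were just loaded. This relies on storeID still
-- being NULL for the rows from the current file, so it is safest when
-- each file is loaded into an empty staging table first.
set @query = 'UPDATE ' + @TableName
           + ' SET storeID = ''' + @storeID + ''''
           + ' WHERE storeID IS NULL'
exec (@query)
```

Loading each file into an empty staging table per file, then moving the rows to the final table, removes the ambiguity in the WHERE clause entirely.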
Thanks,
May 29, 2010 at 6:23 am
Use an XML format file and OPENROWSET BULK. This allows you to modify the bulk load stream without losing any of the benefits of bulk loading - it will still be minimally logged if you follow the rules.
You did not provide data for me to test with, so I won't bother providing you with a tailored solution. You can find a pretty full example I wrote on a recent thread here on SSC:
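For reference, a minimal sketch of the approach being described (SQL Server 2005 and later only; the file path, format-file path, and column names here are assumptions based on this thread, not a tested solution):

```sql
-- The XML format file describes the four columns in the flat file; the
-- extra StoreID value is spliced into the bulk load stream as a constant
-- in the SELECT list, one value per source file.
INSERT INTO dbo.OrdersBulk WITH (TABLOCK)
        (CustomerID, CustomerName, OrderID, OrderDate, StoreID)
SELECT  s.CustomerID, s.CustomerName, s.OrderID, s.OrderDate, '0024'
FROM    OPENROWSET(
            BULK 'C:\Import\SalesData0024.txt',
            FORMATFILE = 'C:\Import\SalesData.xml'
        ) AS s;
```

In practice the file name and the StoreID constant would be built per file with dynamic SQL, one statement per incoming file.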
May 29, 2010 at 7:20 am
Thanks for responding
please see the sample data and the required result:
CREATE TABLE [dbo].[OrdersBulk](
[CustomerID] [int] NULL,
[CustomerName] [varchar](32) NULL,
[OrderID] [int] NULL,
[OrderDate] [smalldatetime] NULL,
[storeID] [varchar](5) NOT NULL
) ON [PRIMARY]
sample data
fileName1: saleData0024
1,foo,5,20031101
3,blat,7,20031101
5,foobar,23,20031104
fileName2: saleData0056
12,abc,5,20031101
result set expected:

CustomerID  CustomerName  OrderID  OrderDate            storeId
1           foo           5        2003-11-01 00:00:00  0024
3           blat          7        2003-11-01 00:00:00  0024
5           foobar        23       2003-11-04 00:00:00  0024
12          abc           5        2003-11-01 00:00:00  0056
and of course there are hundreds of files coming from the retail stores
thanks
May 29, 2010 at 8:21 am
Why don't you have a decent crack at it first?
See how you get on.
May 29, 2010 at 10:50 pm
Paul White NZ (5/29/2010)
Use an XML format file and OPENROWSET BULK. This allows you to modify the bulk load stream without losing any of the benefits of bulk loading - it will still be minimally logged if you follow the rules. You did not provide data for me to test with, so I won't bother providing you with a tailored solution. You can find a pretty full example I wrote on a recent thread here on SSC:
Did you notice near the bottom of the original post where it says "SQL 2000"? I hate it when they do that on a 2k8 forum.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 29, 2010 at 11:25 pm
Jeff Moden (5/29/2010)
Did you notice near the bottom of the original post where it says "SQL 2000"? I hate it when they do that on a 2k8 forum.
Gah! No, I didn't - but looking back, I'm not surprised I missed it! And, yes, it is a bit irritating. I did notice the 'SSIS cannot be used' comment, but I figured that was just a lack of knowledge thing. Sigh.
You're right then: The BULK option was added to OPENROWSET in 2005, so it's not going to help here 🙁
I can't immediately think of a neat solution for this problem in 2000.
I know there are a lot of 2000 systems out there, but come on people, you're two-and-a-half major versions back now, and out of support!
May 30, 2010 at 4:50 am
Unfortunately our company uses DB2 as its main database,
but they have SQL Server on the side, which is version 2000.
My apologies for posting it here, but that was the only option I had to get a quick answer, since it was a very urgent need for my company..........:-D:hehe:
any intelegent idea to solve this would be advisable
Thanks Guys
May 30, 2010 at 7:36 am
daniarchitect (5/30/2010)
Unfortunately our company uses DB2 as its main database, but they have SQL Server on the side, which is version 2000.
My apologies for posting it here, but that was the only option I had to get a quick answer, since it was a very urgent need for my company..........:-D:hehe:
any intelegent idea to solve this would be advisable
Thanks Guys
Wrong statement.
With just a little more effort you could have figured that this is a SS2K8 forum and that there is a SS2K forum, too.
Furthermore, since you have a request marked as "urgent", you might want to get a consultant in. Not only would this person be available almost immediately, you could rely on a fast answer, too. Please note that it's a holiday weekend in the U.S. This will probably limit the number of people seeing your post.
And yet another option to get help faster is to provide ready-to-use sample data...
May 30, 2010 at 9:00 am
Why is SSIS not an option? Your databases do not need to be SQL 2005 to be able to use SSIS on them.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
May 30, 2010 at 9:21 am
daniarchitect (5/30/2010)
Unfortunately our company uses DB2 as its main database, but they have SQL Server on the side, which is version 2000.
My apologies for posting it here, but that was the only option I had to get a quick answer, since it was a very urgent need for my company..........:-D:hehe:
any intelegent idea to solve this would be advisable
Thanks Guys
As you can see, posting in the wrong forum really doesn't help get an answer quicker. In fact, it makes things worse, because people offer solutions that won't work in the version you want (even though you posted the correct version near the end of your post) and it ends up pissing people off that they wasted their time trying to help someone.
You've also asked for an ETL system to handle hundreds of files, meaning it would be nice if it were automatic, so you shouldn't expect a "quick" answer on such a thing.
I'm working on a simple solution that you'll need to expand upon a bit... be patient. "intelegent" ideas take a bit. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2010 at 10:18 am
As requested, here's a very simple SQL Server 2000 solution and it will load "hundreds of files" provided that they are, in fact, all named the same except for StoreID and all have the same data structure.
I normally don't bring data directly into a "final" table. I normally bring the data into a "staging" table, validate the data, and THEN insert from the staging table to the final table. This problem is no exception. The only thing I haven't done is to write any validation code. You can do that based on whatever your requirements are.
As usual, the comments in the code are important. Make sure you read them.
As a side bar, if you have problems that require sample files, you should attach them to your post to save the people who are trying to help you some time. And don't try to "cheat" a quick answer anymore. Most of us monitor all forums. Always post your question in the correct forum.
{EDIT} Note that most of the following code is NOT dynamic SQL, but the software on this forum seems to think it is. I had to use CHAR(92) in one spot where I could have just used a backslash to "make it pretty" on this forum. You, of course, could change it back.
Here's the code...
--===========================================================================
-- Presets
--===========================================================================
--===== Conditionally drop any Temp Tables to make reruns easier.
-- Notice that #FileList will be built on the fly using SELECT/INTO
IF OBJECT_ID('TempDB..#DirList','U') IS NOT NULL
DROP TABLE #DirList
;
IF OBJECT_ID('TempDB..#FileList','U') IS NOT NULL
DROP TABLE #FileList
;
IF OBJECT_ID('TempDB..#OrdersBulkStaging','U') IS NOT NULL
DROP TABLE #OrdersBulkStaging
;
--===== This is the same as your original import table
-- with one small change
CREATE TABLE #OrdersBulkStaging
(
[CustomerID] [int] NULL,
[CustomerName] [varchar](32) NULL,
[OrderID] [int] NULL,
[OrderDate] [smalldatetime] NULL
)
;
--===== This table will collect all file names and directory
-- names from the "current" directory.
CREATE TABLE #DirList
(
ObjectName SYSNAME,
Depth TINYINT,
IsFile TINYINT
)
;
--===== Declare the local variables
DECLARE @Counter INT,
@FileCount INT,
@StoreID VARCHAR(5),
@sql NVARCHAR(4000)
;
--===========================================================================
-- Get the file names we need to process. In the process, we also
-- isolate the StoreID from the file name.
--===========================================================================
--===== Collect the file names in the "current" directory.
-- OBVIOUSLY, THE DIRECTORY NAME WILL NEED TO CHANGE.
INSERT INTO #DirList
(ObjectName, Depth, IsFile)
EXEC Master.dbo.xp_DirTree 'C:\Temp' , 1 , 1
;
--===== Isolate only the file names we want to work with.
-- Notice how the StoreID is split out at this time.
-- This is where #FileList is built/populated on the fly.
SELECT IDENTITY(INT,1,1) AS RowNum,
ObjectName AS FileName,
SUBSTRING(ObjectName, 9, 4) AS StoreID
INTO #FileList
FROM #DirList
WHERE ObjectName LIKE 'saleData[0-9][0-9][0-9][0-9]'
;
--===== Remember how many file names we need to address
SELECT @FileCount = @@ROWCOUNT
;
--===== Since we're working with SQL Server 2000, we don't have VARCHAR(MAX),
-- so we loop through the file names one at a time instead, picking up
-- the StoreID for each file as we go.
-- As a side bar, this is one of the very few places where a WHILE LOOP
-- should actually be used in SQL Server 2000.
SELECT @Counter = 1
;
WHILE @Counter <= @FileCount
BEGIN
--===== This loads the data, sets up the StoreID variable, and increments the counter
-- OBVIOUSLY, THE DIRECTORY NAME WILL NEED TO CHANGE.
SELECT @sql = N'BULK INSERT #OrdersBulkStaging FROM ''C:\Temp' + CHAR(92) + FileName + ''' '
+ 'WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', CODEPAGE = ''RAW'', DATAFILETYPE = ''CHAR'')',
@StoreID = StoreID,
@Counter = @Counter + 1
FROM #FileList
WHERE RowNum = @Counter
EXEC (@SQL)
;
--===== Do the insert to the final table from the staging table.
-- This step also applies the correct StoreID
INSERT INTO dbo.OrdersBulk
(CustomerID, CustomerName, OrderID, OrderDate, StoreID)
SELECT CustomerID, CustomerName, OrderID, OrderDate, @StoreID
FROM #OrdersBulkStaging
;
--===== All done for this file. Truncate the staging table.
TRUNCATE TABLE #OrdersBulkStaging
END
;
--===== Show that we've successfully loaded the OrdersBulk table from all files
-- with names like 'saleData[0-9][0-9][0-9][0-9]'
SELECT * FROM dbo.OrdersBulk
;
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2010 at 12:28 pm
CirquedeSQLeil (5/30/2010)
Why is SSIS not an option? Your databases do not need to be SQL 2005 to be able to use SSIS on them.
Even if SSIS isn't per se an option, I haven't heard of anything that couldn't be done through DTS.....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 30, 2010 at 3:27 pm
Matt Miller (#4) (5/30/2010)
CirquedeSQLeil (5/30/2010)
Why is SSIS not an option? Your databases do not need to be SQL 2005 to be able to use SSIS on them. Even if SSIS isn't per se an option, I haven't heard of anything that couldn't be done through DTS.....
True enough... someone please post a complete DTS solution or the steps to accomplish it. Thanks.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2010 at 9:35 pm
I no longer have a functional version of DTS (my 2K server is history). That said (from memory, so bear with me):
- First would be an ActiveX task to check the designated directory and pull all the file names. Dump the names into a table.
- Upon completion of step 1, fire a data-driven task, set up to fire a Bulk Insert task for each row in the table populated by step one (clearly you'd need a completion status, etc...).
- If you're inserting into a staging table: once everything has inserted, scrub the data appropriately, then push it to the permanent table. That would be a "regular" SQL Script task, then a SQL destination.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 30, 2010 at 10:25 pm
Matt Miller (#4) (5/30/2010)
I no longer have a functional version of DTS (my 2K server is history). That said (from memory, so bear with me):
- First would be an ActiveX task to check the designated directory and pull all the file names. Dump the names into a table.
- Upon completion of step 1, fire a data-driven task, set up to fire a Bulk Insert task for each row in the table populated by step one (clearly you'd need a completion status, etc...).
- If you're inserting into a staging table: once everything has inserted, scrub the data appropriately, then push it to the permanent table. That would be a "regular" SQL Script task, then a SQL destination.
Well done, especially from memory. You'd also need some loop code to decide whether you were done or not, plus the actual "go back to the beginning" loop code. It all takes "ActiveX" (VBS) just about every step of the way. A "decent" example of how to do this in DTS can be found at http://www.sqldts.com/246.aspx. I put "decent" in quotes because it takes a lot of code to make those little icons actually work properly.
I think it's long, ugly, requires more than one language, and is totally unnecessary. The tricks you have to go through in DTS (or even SSIS) to do something like a simple loop have always amazed me; they could have been so easy instead of having to write a module to enable/disable legs. Even though SSIS is an improvement, I feel much the same way about it... long, ugly, requires more than one language, and totally unnecessary. Heh... even a looping DOS batch file would be simpler. 😛
I guess I'm going to have to pick back up on VB.NET or C#, write a decent and permanent replacement for xp_DirTree, add a tight little "archive move" module, and make it so that it's so easy to do in T-SQL (already really easy) that no one will even want to import flat files or XML files using DTS or SSIS ever again.
Now, if we could just get MS to write a decent replacement for the JET drivers, we'd be all set. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.