August 20, 2010 at 12:38 pm
This approach will need to be refined to suit your requirements but it gives you the gist of a possible solution.
You can identify the files you need using an initial CTE based on xp_dirtree undocumented stored procedure. Using OPENROWSET allows you to convert xp_dirtree into a SELECT statement so you can use WHERE [File] = 1 to ensure that only files are selected. You could apply an additional condition to this filter to narrow down the files you want, if necessary. Now that you have the list of files in the cte, you can build up a UNION query string by using OPENROWSET BULK and FOR XML PATH. You will need a simple format file BulkFile.fmt which has a single field called BulkColumn defined as SQLCHAR 8000. By using LASTROW = 1, you will only get the header information selected in the OPENROWSET BULK statement. In this way, you obtain a dataset containing your filenames and the corresponding headers. You can then insert this data into a table, @t in this example. By iterating through this table based on your validation criteria, files which meet your criteria are copied to your validation folder. Since each copied file is sequentially removed from @t, this leaves behind a list of the failed files in @t for review.
DECLARE @SQL VARCHAR(MAX)
DECLARE @t TABLE (FileName VARCHAR(100), Header VARCHAR(8000))
DECLARE @FileName VARCHAR(100)
;WITH cte AS
(
SELECT subdirectory FROM OPENROWSET('SQLNCLI','SERVER=.;TRUSTED_CONNECTION=YES','SET FMTONLY OFF EXEC xp_dirtree ''C:\YourFolder'', 1, 1') WHERE [File] = 1
)
SELECT @SQL = STUFF((SELECT 'UNION ALL SELECT ''' + subdirectory + ''' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK ''C:\YourFolder\' + subdirectory + ''', FORMATFILE = ''C:\BulkFile.fmt'', LASTROW = 1) AS Z' + CHAR(10) FROM cte FOR XML PATH(''), TYPE).value('.[1]', 'varchar(MAX)'), 1, 10, '')
INSERT @t EXEC(@SQL)
WHILE EXISTS (SELECT * FROM @t WHERE Header = 'Title Forename Surname FullName JobTitle Department Organisation Address1')
BEGIN
SELECT TOP 1 @FileName = FileName FROM @t WHERE Header = 'Title Forename Surname FullName JobTitle Department Organisation Address1'
EXEC('xp_cmdshell ''Copy "C:\YourFolder\' + @FileName + '" "C:\YourFolder\YourValidatedFolder"''')
DELETE @t WHERE FileName = @FileName
END
SELECT FileName, Header FROM @t
August 20, 2010 at 1:06 pm
thisisdeepthy (8/20/2010)
cAN YOU PLEASE ALSO ADD THE SCRIPT WITH THIS TEXT VALIDATION IN THE CURSOR ? AND ALSO SLIGHT CHANGES IN THE REQ, NEED TO MOVE THE VALID FILES TO THE DB TABLE AND THE REJECTED TO LEAVE IN THE SAME FOLDER
What is the reason for yelling at us in general and Eralper specifically?
This is not a "spoon-feeding-solve-my-issue-immediately-for-free" site. It's a community of professionals helping each other to understand how coding SQL Server can be done efficiently.
If you want a script as per your need without doing much by yourself, get a consultant and pay him for as long as it takes until you're satisfied.
But stop yelling at volunteers, especially if you already got a solution that meets your requirement approx. 90%. You should be able to do the remaining part.
@Eralper: great job, given the limited information provided...
August 22, 2010 at 2:16 am
Thank you SSC Veteran. This works. Thanks a lot
August 22, 2010 at 8:53 pm
steve-893342 (8/20/2010)
This approach will need to be refined to suit your requirements but it gives you the gist of a possible solution.You can identify the files you need using an initial CTE based on xp_dirtree undocumented stored procedure. Using OPENROWSET allows you to convert xp_dirtree into a SELECT statement so you can use WHERE [File] = 1 to ensure that only files are selected. You could apply an additional condition to this filter to narrow down the files you want, if necessary. Now that you have the list of files in the cte, you can build up a UNION query string by using OPENROWSET BULK and FOR XML PATH. You will need a simple format file BulkFile.fmt which has a single field called BulkColumn defined as SQLCHAR 8000. By using LASTROW = 1, you will only get the header information selected in the OPENROWSET BULK statement. In this way, you obtain a dataset containing your filenames and the corresponding headers. You can then insert this data into a table, @t in this example. By iterating through this table based on your validation criteria, files which meet your criteria are copied to your validation folder. Since each copied file is sequentially removed from @t, this leaves behind a list of the failed files in @t for review.
DECLARE @SQL VARCHAR(MAX)
DECLARE @t TABLE (FileName VARCHAR(100), Header VARCHAR(8000))
DECLARE @FileName VARCHAR(100)
;WITH cte AS
(
SELECT subdirectory FROM OPENROWSET('SQLNCLI','SERVER=.;TRUSTED_CONNECTION=YES','SET FMTONLY OFF EXEC xp_dirtree ''C:\YourFolder'', 1, 1') WHERE [File] = 1
)
SELECT @SQL = STUFF((SELECT 'UNION ALL SELECT ''' + subdirectory + ''' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK ''C:\YourFolder\' + subdirectory + ''', FORMATFILE = ''C:\BulkFile.fmt'', LASTROW = 1) AS Z' + CHAR(10) FROM cte FOR XML PATH(''), TYPE).value('.[1]', 'varchar(MAX)'), 1, 10, '')
INSERT @t EXEC(@SQL)
WHILE EXISTS (SELECT * FROM @t WHERE Header = 'Title Forename Surname FullName JobTitle Department Organisation Address1')
BEGIN
SELECT TOP 1 @FileName = FileName FROM @t WHERE Header = 'Title Forename Surname FullName JobTitle Department Organisation Address1'
EXEC('xp_cmdshell ''Copy "C:\YourFolder\' + @FileName + '" "C:\YourFolder\YourValidatedFolder"''')
DELETE @t WHERE FileName = @FileName
END
SELECT FileName, Header FROM @t
NICELY DONE! Very nice!
Just a couple of tips (or thoughts or whatever you want to call them)...
Instead of using OPENROWSET, you could use BULK INSERT with no field delimiter (just a row terminator) to eliminate the need for a format file.
Also, instead of a DOS COPY and DELETE combination, you could just use MOVE instead.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 23, 2010 at 5:56 am
Thanks Jeff, yes it is rather annoying having to create a format file for OPENROWSET BULK, albeit a very simple one in this case. I wonder why that is when you don't necessarily have to for BULK INSERT.
August 23, 2010 at 8:38 am
I did mock up a BULK INSERT variation but, whilst eliminating the need for a format file, at the same time the specter of RBAR 🙁 was introduced. Or was it RBR since it wasn't too agonizing? I'm not sure :unsure:
August 23, 2010 at 8:57 am
I tried solving using OLE objects this time.
Created a procedure called validateheader to validate the header of each file with a template file.
create table validate_table (validate varchar(max), FileName1 varchar(100),FileName2 varchar(100) )
select * from validate_table
--- Now Create the procedure
ALTER PROCEDURE ValidateHeader
@FileName1 VARCHAR (1024),
@FileName2 VARCHAR (1024)
AS
DECLARE @OLEResult INT
DECLARE @FS INT
DECLARE @FileID INT
DECLARE @Message VARCHAR (8000)
DECLARE @out_msg varchar (8000)
DECLARE @OLEResult2 INT
DECLARE @FS2 INT
DECLARE @FileID2 INT
DECLARE @Message2 VARCHAR (8000)
DECLARE @out_msg2 varchar (8000)
-- Create an instance of the file system object
EXECUTE @OLEResult = sp_OACreate 'Scripting.FileSystemObject', @FS OUT
IF @OLEResult <> 0
BEGIN
PRINT 'Scripting.FileSystemObject'
PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult)
END
-- Open the text file for reading
EXEC @OLEResult = sp_OAMethod @FS, 'OpenTextFile', @FileID OUT, @FileName1, 1, 1
IF @OLEResult <> 0
BEGIN
PRINT 'OpenTextFile'
PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult)
END
-- Read the first line into the @Message variable
EXECUTE @OLEResult = sp_OAMethod @FileID, 'ReadLine', @Message OUT
---------------
-- Create an instance of the file system object
EXECUTE @OLEResult2 = sp_OACreate 'Scripting.FileSystemObject', @FS2 OUT
IF @OLEResult2 <> 0
BEGIN
PRINT 'Scripting.FileSystemObject'
PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult2)
END
-- Open the text file for reading
EXEC @OLEResult2 = sp_OAMethod @FS2, 'OpenTextFile', @FileID2 OUT, @FileName2, 1, 1
IF @OLEResult2 <> 0
BEGIN
PRINT 'OpenTextFile'
PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult2)
END
-- Read the first line into the @Message variable
EXECUTE @OLEResult2 = sp_OAMethod @FileID2, 'ReadLine', @Message2 OUT
if ((@OLEResult2 = @OLEResult) and (@Message = @Message2))
BEGIN
set @out_msg =@Message
set @out_msg2=@Message2
--PRINT 'FileHeader Matched-- SUCCESS '+@out_msg2
insert into validate_table values('FileHeader Matched-- SUCCESS ',@FileName1,@FileName2)
END
if ((@Message <> @Message2) OR (@OLEResult2 <> @OLEResult) )
BEGIN
set @out_msg =@Message
set @out_msg2=@Message2
--PRINT 'FileHeader Not Matched --REJECT '+ @out_msg2
insert into validate_table values('FileHeader Not Matched --REJECT ',@FileName1,@FileName2)
END
---------------
-- Keep looping through until the @OLEResult variable is < 0; this indicates that the end of the file has been reached.
WHILE @OLEResult >= 0
BEGIN
--PRINT @Message
EXECUTE @OLEResult = sp_OAMethod @FileID, 'ReadLine', @Message OUT
END
EXECUTE @OLEResult = sp_OADestroy @FileID
EXECUTE @OLEResult = sp_OADestroy @FS
---------------
-- Keep looping through until the @OLEResult2 variable is < 0; this indicates that the end of the file has been reached.
WHILE @OLEResult2 >= 1
BEGIN
--PRINT @Message2
EXECUTE @OLEResult2 = sp_OAMethod @FileID2, 'ReadLine', @Message2 OUT
END
EXECUTE @OLEResult2 = sp_OADestroy @FileID2
EXECUTE @OLEResult2 = sp_OADestroy @FS2
---------------
-- End of Procedure.
and created a cursor to call the procedure
declare @files table (ID int IDENTITY, FileName varchar(100))
insert into @files execute xp_cmdshell 'dir C:\Digitdb\Prospects\ /b'
select * from @files
-- then read the each file name from the table using a cursor and pass it to the ReadFromTextFile proc
declare @STR_fname VARCHAR(100)
declare @templ_fname varchar(100)
DECLARE fn_cursor CURSOR FOR
SELECT FileName
FROM @files
set @templ_fname='C:\Digitdb\prospects1\Template.txt'
OPEN fn_cursor
set @STR_fname=''
FETCH NEXT FROM fn_cursor INTO @STR_fname
WHILE @@FETCH_STATUS = 0
BEGIN
begin
EXEC ValidateHeader @templ_fname,@STR_fname
end
FETCH NEXT FROM fn_cursor INTO @STR_fname
print 'temp file : next file to read'+@templ_fname+' tally filename :'+@STR_fname
END
CLOSE fn_cursor
DEALLOCATE fn_cursor
but my validate_table is returning with 'fileheader not matched reject file' for the file in the prospects.
Any suggestions please?
August 23, 2010 at 1:08 pm
steve-893342 (8/23/2010)
I did mock up a BULK INSERT variation but, whilst eliminating the need for a format file, at the same time the specter of RBAR 🙁 was introduced. Or was it RBR since it wasn't too agonizing? I'm not sure :unsure:
How'z that? you're reading the LastRow = 1 for each file no matter which method you're using.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 23, 2010 at 2:14 pm
Well, at least with the BULK INSERT method I came up with, I first created 2 staging tables. Then for each each file, I needed to BULK INSERT into the first staging table, then INSERT into the second staging table and finally TRUNCATE the first staging table. This is done on a row by row basis which leads to multiple statements. Conversely the OPENROWSET BULK method resolves into a single UNION ALL SELECT statement. Maybe I'm missing something and there's a better way of doing the BULK INSERT :ermm:
DECLARE @SQL VARCHAR(MAX)
--BULK INSERT Method
IF NOT OBJECT_ID('tempdb.dbo.#t', 'U') IS NULL DROP TABLE #t
CREATE TABLE #t (Header VARCHAR(8000))
IF NOT OBJECT_ID('tempdb.dbo.#t2', 'U') IS NULL DROP TABLE #t2
CREATE TABLE #t2 (FileName VARCHAR(100), Header VARCHAR(8000))
;WITH cte AS
(
SELECT subdirectory FROM OPENROWSET('SQLNCLI','SERVER=.;TRUSTED_CONNECTION=YES','SET FMTONLY OFF EXEC xp_dirtree ''C:\YourFolder'', 1, 1') WHERE [File] = 1
)
SELECT @SQL = (SELECT 'BULK INSERT #t FROM ''C:\YourFolder\' + subdirectory + ''' WITH (LASTROW = 1) INSERT #t2 SELECT ''' + subdirectory + ''', Header FROM #t TRUNCATE TABLE #t' + CHAR(10) FROM cte FOR XML PATH(''), TYPE).value('.[1]', 'varchar(MAX)')
PRINT(@SQL)
--eg
BULK INSERT #t FROM 'C:\YourFolder\File1.txt' WITH (LASTROW = 1) INSERT #t2 SELECT 'File1.txt', Header FROM #t TRUNCATE TABLE #t
BULK INSERT #t FROM 'C:\YourFolder\File2.txt' WITH (LASTROW = 1) INSERT #t2 SELECT 'File2.txt', Header FROM #t TRUNCATE TABLE #t
BULK INSERT #t FROM 'C:\YourFolder\File3.txt' WITH (LASTROW = 1) INSERT #t2 SELECT 'File3.txt', Header FROM #t TRUNCATE TABLE #t
BULK INSERT #t FROM 'C:\YourFolder\File4.txt' WITH (LASTROW = 1) INSERT #t2 SELECT 'File4.txt', Header FROM #t TRUNCATE TABLE #t
--OPENROWSET BULK Method
;WITH cte AS
(
SELECT subdirectory FROM OPENROWSET('SQLNCLI','SERVER=.;TRUSTED_CONNECTION=YES','SET FMTONLY OFF EXEC xp_dirtree ''C:\YourFolder'', 1, 1') WHERE [File] = 1
)
SELECT @SQL = STUFF((SELECT 'UNION ALL SELECT ''' + subdirectory + ''' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK ''C:\YourFolder\' + subdirectory + ''', FORMATFILE = ''C:\BulkFile.fmt'', LASTROW = 1) AS Z' + CHAR(10) FROM cte FOR XML PATH(''), TYPE).value('.[1]', 'varchar(MAX)'), 1, 10, '')
PRINT(@SQL)
--eg
SELECT 'File1.txt' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK 'C:\YourFolder\File1.txt', FORMATFILE = 'C:\BulkFile.fmt', LASTROW = 1) AS Z
UNION ALL SELECT 'File2.txt' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK 'C:\YourFolder\File2.txt', FORMATFILE = 'C:\BulkFile.fmt', LASTROW = 1) AS Z
UNION ALL SELECT 'File3.txt' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK 'C:\YourFolder\File3.txt', FORMATFILE = 'C:\BulkFile.fmt', LASTROW = 1) AS Z
UNION ALL SELECT 'File4.txt' AS FileName, BulkColumn AS Header FROM OPENROWSET (BULK 'C:\YourFolder\File4.txt', FORMATFILE = 'C:\BulkFile.fmt', LASTROW = 1) AS Z
August 23, 2010 at 6:12 pm
steve-893342 (8/23/2010)
Well, at least with the BULK INSERT method I came up with, I first created 2 staging tables. Then for each each file, I needed to BULK INSERT into the first staging table, then INSERT into the second staging table and finally TRUNCATE the first staging table. This is done on a row by row basis which leads to multiple statements. Conversely the OPENROWSET BULK method resolves into a single UNION ALL SELECT statement. Maybe I'm missing something and there's a better way of doing the BULK INSERT :ermm:
The only thing missing is more coffee for me. You're spot on here.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 16 through 24 (of 24 total)
You must be logged in to reply to this topic. Login to reply