October 15, 2018 at 6:32 pm
Hi Guys,
I have two questions. I am automating a process through SSIS to load .txt files into a SQL table.
Both files are .txt in the same "Fixed Width" format with no header.
File 1 =
a) Header Row Delimiter = LF
b) Length of the data is 278 Characters
File 2 =
a) Header Row Delimiter = {CR}{LF}
b) Length of the data is 295 Characters
Questions:-
1) How can I read the Header Row Delimiter through C# in "Script Task" and pass through expression?
2) How can I get the length of the data row through C# in a "Script Task"? i.e., 278 or 295.
Any advice would be greatly appreciated.
Thank You.
October 15, 2018 at 7:18 pm
rocky_498 - Monday, October 15, 2018 6:32 PM
I guess a better question would be... why do you think you need C# to do anything with this task?
If you could post the record layout for each file (absolutely needed, since the rows are fixed-width), we can probably help quite a bit, and without a diversion into C#.
Also, in one breath you say the files have no headers, but then in the individual descriptions of the files you say there's a header. Which is it?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 7:39 pm
Hi Jeff, thank you for taking the time to reply.
I was thinking I could create two Data Flows:
1) a DF with Row Delimiter "LF" and 275 characters
2) a DF with Row Delimiter "CRLF" and 290 characters
Anyway, below is my sample data from both files and a sample SQL table. Just an FYI, the sample data is made up.
Please let me know if you have any questions.
Sample File 1
000001California912347601234565
000002California912357601234565
000003California912367601234565
Row Delimiter = LF
Sample File 2
000001California912347601234565XYZ
000002California912357601234565ABC
000003California912367601234565HYW
Row Delimiter = {CR}{LF}
Test SQL Table
CREATE TABLE Test1
(
ID VARCHAR(6)
,State VARCHAR(10)
,ZIP VARCHAR(5)
,Phone VARCHAR(10)
,Extra VARCHAR(10)
)
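As a sketch of the parsing the samples above imply (offsets inferred from the sample rows and the Test1 columns; shown in Python for brevity, though the eventual Script Task would be C#):

```python
# Sketch: split one fixed-width row into the columns of the Test1 table.
# Offsets are inferred from the sample rows: ID = 6 chars, State = 10,
# ZIP = 5, Phone = 10, and the optional Extra field is whatever remains.
def parse_row(raw):
    row = raw.rstrip("\r\n")          # tolerate either LF or CRLF endings
    return {
        "ID":    row[0:6],
        "State": row[6:16],
        "ZIP":   row[16:21],
        "Phone": row[21:31],
        "Extra": row[31:41] or None,  # missing in the shorter file
    }
```

For the first row of Sample File 2, this yields Extra = "XYZ"; for a row from Sample File 1, Extra comes back as None.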
October 15, 2018 at 7:53 pm
Yes, there are actually a couple of different ways to do this using a script task (assuming you mean T-SQL scripts). My biggest question, though, is whether there's anything in the file names that would give you a hint as to which type of file it is. If not, I'll give it a whirl with code.
Also, just to help me pick the best method for you, are you allowed to store BCP format files either in the same place as the data or on your server somewhere or anywhere that the server can read?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 8:09 pm
Getting access to BCP format files will be a bit of a challenge; I'd have to convince IT.
Once I come up with a solution, I also have to focus on performance, because more than 300 files will be processed through this daily.
The file names are not consistent, so I can't rely on them for this logic.
October 15, 2018 at 8:26 pm
rocky_498 - Monday, October 15, 2018 8:09 PM
Ah. Ok. Understood. For performance, BCP format files and the use of BULK INSERT would be absolutely the fastest but we can do well enough.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 8:28 pm
One more piece of info, please... do these files have exactly the same fields in exactly the same order, and are they simply different in field width? And both are supposed to end up in the same table?
Also, have you already got these things mapped out in SSIS?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 8:54 pm
If the answer to the last question above is "Yes... I already have these all mapped out in SSIS", there's no need to throw all that work away. Just make a temporary table with a single wide column of, say, VARCHAR(300) and use the LASTROW parameter of BULK INSERT to load only one row (the first row) from the file. The row terminator for the bulk insert should be "0x0A", which is hexadecimal for the LF character (\n was supposed to work, but they screwed that up a bit). Then just read the length of that one row, with the understanding that a row from the file with CRLFs will probably have a CHAR(13) as its last character.
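The same first-row inspection can be sketched outside the database as well; here is the equivalent check in Python, purely illustrative (the poster's Script Task would be C#, and the file path would come from the package):

```python
# Sketch: read only the first line of a file in binary mode, then infer the
# row terminator and the data-row length, mirroring the one-row measurement
# described above.
def inspect_first_row(path):
    with open(path, "rb") as f:
        first = f.readline()          # reads up to and including b"\n"
    if first.endswith(b"\r\n"):
        terminator = "CRLF"
    elif first.endswith(b"\n"):
        terminator = "LF"
    else:
        terminator = "NONE"           # single row with no newline at all
    return terminator, len(first.rstrip(b"\r\n"))
```

A 278-character row ending in LF would come back as ("LF", 278); a 295-character row ending in CRLF as ("CRLF", 295), which is exactly the pair of values the two questions at the top of the thread ask for.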
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 9:04 pm
Yes, exactly the same fields, and both files go into the same table, except one file has one extra field and the other doesn't.
Can you share your test .dtsx package?
October 15, 2018 at 9:07 pm
Sorry, I misunderstood your last question. Yes, I have already mapped those fields in SSIS.
October 15, 2018 at 9:41 pm
What expression value should I give for the "LastRow" parameter?
October 15, 2018 at 9:42 pm
Sorry... I don't have a "test dtsx package", Rocky. I would have done all of this using BULK INSERT from a stored procedure, including reading the file names. I wouldn't have gone anywhere near SSIS for this simple type of thing.
You said that you could use a script task, and I was hoping that meant a SQL script task. If so, a BULK INSERT of just one row using the "0x0A" row terminator and the LASTROW option would have read the first row for you to measure.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2018 at 9:53 pm
Oh, OK. Can you share your stored proc if you've already written it?
Maybe I can use the SQL script task to move the data from .txt to a staging SQL table and then go from there.
Thank You for your help.
October 16, 2018 at 1:11 am
Nah.... do what I do... don't even bother with SSIS for this.
I created files for your two sample tables, making sure that the first one's lines were terminated with LF only and the second's with CRLF. Then I ran this against the C:\Temp1 directory I had put them in. Of course, the details of what was done are in the comments in the code.
--=====================================================================================================================
-- Create the necessary objects for this code.
-- This only needs to be done once.
--=====================================================================================================================
--===== Careful! We're dropping objects for this demo!!!
-- That's why I commented it out before I posted the code.
-- I needed these conditional drops to make my testing easier.
-- IF OBJECT_ID('dbo.Staging' ,'U') IS NOT NULL DROP TABLE dbo.Staging;
-- IF OBJECT_ID('dbo.vStaging' ,'V') IS NOT NULL DROP VIEW dbo.vStaging;
-- IF OBJECT_ID('dbo.DataParser','V') IS NOT NULL DROP VIEW dbo.DataParser;
-- IF OBJECT_ID('dbo.Final' ,'U') IS NOT NULL DROP TABLE dbo.Final;
GO
--===== This is the target for the BULK INSERT.
-- It'll be a permanent object that will be TRUNCATEd before each run.
CREATE TABLE dbo.Staging
(
RawData VARCHAR(500)
,FilePath VARCHAR(1000) -- Added this
)
;
GO
--===== We'll need this named constraint later to save the file path for each file we loop through.
ALTER TABLE dbo.Staging
ADD CONSTRAINT DF_Final_FilePath DEFAULT ('') FOR FilePath
;
GO
--===== Create the insertable view that will do the clean BULK INSERT into the staging table
CREATE VIEW dbo.vStaging AS
SELECT RawData
FROM dbo.Staging
;
GO
--===== Create the view that will do the parsing work.
-- We're using a view to do the parsing because you said that you probably couldn't use BCP Format files.
-- There's always more than one way to skin a cat. ;-)
-- This is also a permanent object.
CREATE VIEW dbo.DataParser AS
SELECT ID = SUBSTRING(RawData, 1, 6)
,State = SUBSTRING(RawData, 7,10)
,ZIP = SUBSTRING(RawData,17, 5)
,Phone = SUBSTRING(RawData,22,10)
,Extra = NULLIF(REPLACE(SUBSTRING(RawData,32,10),CHAR(13),''),'') -- Strip the trailing CR left by CRLF files
,FilePath
FROM dbo.Staging
;
GO
--===== This is whatever your final table is.
-- Up to you if it's supposed to be TRUNCATEd before each run.
CREATE TABLE dbo.Final
(
ID CHAR(6)
,State CHAR(10)
,ZIP CHAR(5)
,Phone CHAR(10)
,Extra VARCHAR(10)
,FilePath VARCHAR(1000) -- Added this
)
;
/*@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Here's the code that does the imports.
You could turn this into a stored procedure.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@*/
--=====================================================================================================================
-- PRESETS
--=====================================================================================================================
--===== Environmental presets
SET NOCOUNT OFF; -- We want to see the line counts here.
--===== Drop any temporary tables.
-- This may be commented out if you turn this into a stored procedure.
-- Again, I just needed this to make testing easier.
IF OBJECT_ID('tempdb..#DirInfo' ,'U') IS NOT NULL DROP TABLE #DirInfo;
IF OBJECT_ID('tempdb..#FileControl','U') IS NOT NULL DROP TABLE #FileControl;
--====== Truncate the final target table. Comment this out if you don't want to Truncate at the beginning of each run.
TRUNCATE TABLE dbo.Final
;
--===== Create the table that will capture the output of xp_DirTree.
CREATE TABLE #DirInfo
(
ObjectName VARCHAR(500) PRIMARY KEY CLUSTERED
,Depth INT
,IsFile TINYINT
)
;
--===== Create the table that will hold all the file names that match the pattern.
CREATE TABLE #FileControl
(
FileNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
,[FileName] VARCHAR(500)
)
;
--===== Create some obviously named variables and presets
DECLARE @Dummy BIT
,@Counter INT = 1
,@FileCount INT
,@FilePath VARCHAR(500) = 'C:\Temp1\'
,@FileName VARCHAR(500)
,@FilePattern VARCHAR(50) = '%.txt'
,@SQL VARCHAR(600)
;
--=====================================================================================================================
-- Get the directory information, which may contain subdirectories and non .txt files
--=====================================================================================================================
--======= Let the operator know what we're doing.
PRINT REPLICATE('-',120);
RAISERROR ('Gathering directory information from "%s"...',0,0,@FilePath) WITH NOWAIT
;
--===== Populate the directory information table with file names and other directory info.
SELECT @SQL = 'xp_Dirtree ''' + @FilePath + ''',1,1' -- 1 Level deep and capture file names too!
;
INSERT INTO #DirInfo
EXEC (@SQL)
;
-- SELECT * FROM #DirInfo; --Just for troubleshooting
--=====================================================================================================================
-- Extract just the files we want to work with and number them so we can "loop" through the names to import them.
-- We'll also remember the number of files we loaded as an end point for the loop.
--=====================================================================================================================
--===== Let the operator know what we're doing.
PRINT REPLICATE('-',120);
RAISERROR ('Extracting files with a pattern of "%s"',0,0,@FilePattern) WITH NOWAIT
;
INSERT INTO #FileControl
([FileName])
SELECT ObjectName
FROM #DirInfo
WHERE ObjectName LIKE @FilePattern
ORDER BY ObjectName
;
SELECT @FileCount = @@ROWCOUNT
;
--=====================================================================================================================
-- Now we'll load each file into the "RawData" column of the staging table.
--=====================================================================================================================
--======= Let the operator know what we're doing.
PRINT REPLICATE('=',120);
RAISERROR ('Importing the files...',0,0,@FilePattern) WITH NOWAIT
;
--====== Make sure the staging table is clear
TRUNCATE TABLE dbo.Staging
;
--===== Now, load each file that's in the File Control table.
WHILE @Counter <= @FileCount
BEGIN
--===== Get the file name.
SELECT @FileName = [FileName]
FROM #FileControl
WHERE FileNum = @Counter
;
--===== Tell the operator what we're doing so they don't get bored.
PRINT REPLICATE('-',120);
RAISERROR ('Working on File # %u of %u : %s%s',0,0,@Counter,@FileCount,@FilePath,@FileName) WITH NOWAIT
;
--===== Drop the default file path constraint from the table so we can recreate it with the current file path.
ALTER TABLE dbo.Staging
DROP CONSTRAINT DF_Final_FilePath;
;
--===== Create the new default file path constraint so we know which file each row came from.
SELECT @SQL = '
ALTER TABLE dbo.Staging
ADD CONSTRAINT DF_Final_FilePath DEFAULT (''' + @FilePath+@FileName + ''') FOR FilePath;'
;
EXEC (@SQL)
;
--===== Now, load the file contents.
-- Note that the rows in both file types end with an LF or "newline" character and, because we're
-- not doing any data parsing yet, this works for both kinds of files without any other consideration.
SELECT @SQL = '
BULK INSERT dbo.vStaging
FROM ''' + @FilePath+@FileName + '''
WITH (
BATCHSIZE = 3000000
,CODEPAGE = ''RAW''
,DATAFILETYPE = ''char''
,FIELDTERMINATOR = ''''
,ROWTERMINATOR = ''0x0A''
,TABLOCK
)
;';
EXEC (@SQL)
;
--===== Bump the control counter
SELECT @Counter += 1
;
END
;
--======= Let the operator know what we're doing.
RAISERROR ('File loads complete.',0,0,@FilePattern) WITH NOWAIT;
PRINT REPLICATE('=',120);
--=====================================================================================================================
-- Use the view to parse the raw data into the final table.
--=====================================================================================================================
--======= Let the operator know what we're doing.
RAISERROR ('Parsing the data.',0,0,@FilePattern) WITH NOWAIT
;
INSERT INTO dbo.Final
(ID,State,ZIP,Phone,Extra,FilePath)
SELECT ID,State,ZIP,Phone,Extra,FilePath
FROM dbo.DataParser
;
--======= Let the operator know what we're doing.
RAISERROR ('RUN COMPLETE. %u files loaded into the dbo.Final table.',0,0,@FileCount) WITH NOWAIT;
PRINT REPLICATE('=',120)
;
--===== Let's see what we did...
SELECT * FROM dbo.Final
;
The output from that looks like the following, and the script could be turned into a proc and called from SSIS.
The output in the Messages tab will keep someone from getting bored. Here's what that looks like.
------------------------------------------------------------------------------------------------------------------------
Gathering directory information from "C:\Temp1\"...
(2 row(s) affected)
------------------------------------------------------------------------------------------------------------------------
Extracting files with a pattern of "%.txt"
(2 row(s) affected)
========================================================================================================================
Importing the files...
------------------------------------------------------------------------------------------------------------------------
Working on File # 1 of 2 : C:\Temp1\Sample1.txt
(3 row(s) affected)
------------------------------------------------------------------------------------------------------------------------
Working on File # 2 of 2 : C:\Temp1\Sample2.txt
(3 row(s) affected)
File loads complete.
========================================================================================================================
Parsing the data.
(6 row(s) affected)
RUN COMPLETE. 2 files loaded into the dbo.Final table.
========================================================================================================================
(6 row(s) affected)
--Jeff Moden
Change is inevitable... Change for the better is not.
October 16, 2018 at 10:01 am
Jeff,
I don't have the words to say THANK YOU for your help. A very professional way to help.
I appreciate it. I will apply this logic, run it through, and let you know if I have any questions.