DTS - CSV to Database

Question

DTS - CSV to Database

deb-128405

Old Hand

Points: 325
More actions
July 10, 2005 at 1:24 pm

#90148

Hi,
I am trying to import a csv file into a Database thorugh DTS.Every day i get a file with the name <file_name>.yyyymmdd.csv and i have to run a DTS job daily to get the data into the table. Before importing the data into DB , i need to check for duplicate data and if there is one , the entire import has to be cancelled.
ex: file has x,y,z column and table also has x,y,z column.Before inserting data i need to check if x+y (for all the rows in csv file) is already there in the database.
How should i go about it and i want to avoid opening a excel file object and do a checking or putting the data into a temporary table before actually inserting.Is it possible to loop thorugh csv columns and write a t-sql query within DTS to check duplicates?

any help on this would be highly appreciated .

thanks,
deb

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

Usually you import the data into a temp table in the first dts stepts. Then you can do validation and import/cancel the operation as needed from there.

deb-128405 Old Hand Points: 325 More actions · Answer 2

Thanks for the suggestion

how can i create a temoprary table every time the DTS is run without haivng a permanent temporary table and store the data from csv file into ti. Could you please outline that how it can be done.

regards,

deb

philcart SSC-Forever Points: 47798 More actions · Answer 3

Why can't you have a permanent table, its standard practice for loading data?

If you can't have a permanent table then add an ExecuteSQL step at the start to create the table and another ExecuteSQL step at the end to drop the table.

--------------------
Colt 45 - the original point and click interface

fmardani Valued Member Points: 52 More actions · Answer 4

Hi,

This seems to be the exact job that I am doing for a client.

.csv files come in, the DTS packages process them. In between there is a temp table that gets created each time to validate, etc before data is inserted or updated in the main live tables.

So, you can create temp tables.

Hope this is a good explanation.

CJohnson-232084 SSCommitted Points: 1512 More actions · Answer 5

"how can i create a temoprary table every time the DTS is run without haivng a permanent temporary table and store the data from csv file into ti. Could you please outline that how it can be done."

If I understand your question, I think you are asking how to use a table each time you run the DTS package without having the data from the previous run still in a table with the same name.

A temporary table is dropped as soon as it goes out of scope. If you name the table #CSVImport in your code, the # sign tells SQL Server to create a temporary table name #CSVImport which you can use just like a permanent table. When the package completes, this table is automatically dropped (deleted). When your code runs the next time, it is as if the first instance of the table never existed.

Hope this is what you were asking. If I can get a few minutes today, I will work up a short example.

deb-128405 Old Hand Points: 325 More actions · Answer 6

Thanks everybody . Superb responses. I am going for a temp table approach.

1.Execute SQL task (create a temp table at the same DB)

2. On Success - > migrate the data into that table.

3. Check for duplicates in the existing temptable( which is basically the.csv file data) with an inner join.

4. On success -> executing a VB script to check the datain the temp table with the existing data in the DB((a join with the exisitng table in the DB). avoided both cursor or recordset .I can also get the duplicate rows , if any, and write to a log file.

5. On success ( no dupes) -> Execute SQL task to do a plain insert from temp table into the existing temp table and Delete the temp table.

6. On error on any steps above, wrote a VBscript (using Cdonts) to send mail with appropiate error messages.

Thats it. Please let me know if it is an optimum solution.

Thanks again to everybody.

Regards,

Deb

Jeff Gray SSChampion Points: 10667 More actions · Answer 7

Hello,

This approach should work, but you might want to consider using T-SQL for steps 4 and 5 combined. I'm not sure what the advantage would be in using VBS to check for duplicates.

jg

CJohnson-232084 SSCommitted Points: 1512 More actions · Answer 8

Deb,

I took basically the same approach you describe but I combined steps 3,4,5 in one SQL task (text below).

If you want to use VB, I'm always in favor of using the method the developer finds easiest to maintain. Unless your datasets are huge or you do something really ineficient, the savings in processing time are usually erased by maintenance hassles.

This is the SQL, assumes a previous step imported the CSV file into a table named NewData with same structure as OldData (the existing table), and that you only want to cancel if col X + col Y is duplicated. Obviously you can alter the join condition if you are also concerned with Col Z.

IF EXISTS (SELECT N.ColX

FROM OldData O

INNER JOIN NewData N

ON N.ColX = O.ColX AND N.ColY = O.ColY)

BEGIN

TRUNCATE TABLE NewData

END

ELSE

BEGIN

INSERT INTO OldData SELECT * FROM NewData

END

Good luck.