December 8, 2008 at 4:51 pm
I'm writing an SSIS package to import data from DB2 into SQL Server 2005, and there are two DB2 fields which contain "date" and "time" data. I need to concatenate this data into one SQL Server 'datetime' field. I think I need to use a derived column transformation in the data flow, but I can't figure out the syntax to concatenate the 'date' and 'time' fields together.
Any suggestions?
Thanks in advance, Mike
December 9, 2008 at 4:40 am
Hi,
You can use the Derived Column task, or in the SELECT you can use the DATEADD and DATEPART T-SQL functions to add your time to your date.
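For example (just a sketch, assuming the two columns have already been staged into SQL Server as datetime values; the table and column names here are made up):
SELECT DATEADD(ms, DATEPART(ms, rec_add_time),
       DATEADD(ss, DATEPART(ss, rec_add_time),
       DATEADD(mi, DATEPART(mi, rec_add_time),
       DATEADD(hh, DATEPART(hh, rec_add_time), rec_add_date)))) AS rec_add_datetime
FROM dbo.staging_table;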
Franky L.
December 9, 2008 at 5:06 am
It depends a lot on what datatypes you have imported from the DB2 source.
If they are both datetime you can use the DATEADD and DATEPART functions as Franky says, for example:
DATEADD( "Hour", DATEPART( "Hour", [tme]) , [dte])
and so on for "Minute", "Second", and even "Millisecond" if you have to; written out in full, it looks like the expression below.
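Something like this (untested sketch, still assuming [dte] and [tme] are date/time typed columns in the pipeline):
DATEADD("Millisecond", DATEPART("Millisecond", [tme]),
 DATEADD("Second", DATEPART("Second", [tme]),
  DATEADD("Minute", DATEPART("Minute", [tme]),
   DATEADD("Hour", DATEPART("Hour", [tme]), [dte]))))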
If they are imported as strings you can try to concatenate and then cast to datetime:
(DT_DBTIMESTAMP)([dte_s] + " " + [tme_s])
You need to be pretty sure about the correct formatting of the date and time strings to use this one, though.
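For what it's worth, the string-to-DT_DBTIMESTAMP cast expects an ISO-style "YYYY-MM-DD HH:MM:SS" value. If the date string came through in some other layout, say a hypothetical "DD/MM/YYYY", you would have to rearrange it first, along these lines (untested sketch):
(DT_DBTIMESTAMP)(SUBSTRING([dte_s], 7, 4) + "-" + SUBSTRING([dte_s], 4, 2) + "-" + SUBSTRING([dte_s], 1, 2) + " " + [tme_s])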
Peter Rijs
BI Consultant, The Netherlands
December 9, 2008 at 7:03 am
Peter, thanks for your reply; here's what I ended up doing:
1 - In the data flow task, I created a data flow source that pointed to the DB2 database
2 - I created a data conversion transformation with two input columns and linked the data flow source to the data conversion transformation.
- Input Column: The names of the DB2 fields, REC_ADD_DATE and REC_ADD_TIME.
- Output Alias: "Copy of REC_ADD_DATE" and "Copy of REC_ADD_TIME", respectively
- Data Type: string [DT_STR] in both cases
- Length: 10 for REC_ADD_DATE and 8 for REC_ADD_TIME
- In my case, the REC_ADD_DATE data is formatted as 'YYYY-MM-DD' and the REC_ADD_TIME data is formatted as 'HH:MI:SS'. I assume this is standard formatting for 'date' and 'time' data types in DB2.
3 - I created a derived column transformation, linked the data conversion transformation to the derived column transformation, and entered the following code:
- Derived Column Name: REC_ADD_DATE_TIME (I created this as a new column)
- Derived Column: (This is the default value - do not change this)
- Expression: [Copy of REC_ADD_DATE] + " " + [Copy of REC_ADD_TIME]
- Data Type: Unicode string [DT_WSTR] (Default value, I did not alter this)
- Length: 19 (Default value, I did not alter this)
4 - I created a data flow destination that pointed to the SQL Server database and linked the derived column transformation to the data flow destination.
- In the 'Mappings' section, I mapped the derived column REC_ADD_DATE_TIME to the existing field REC_ADD_DATE, since the column REC_ADD_DATE in SQL Server was already the proper data type of 'datetime'.
Not bad for only having a week's experience with SSIS 😉 Now, what I'd like to know is: is there a more elegant / efficient way of doing this? It seems quite convoluted to me.
Thanks for the help!
Regards,
Mike
December 9, 2008 at 7:05 am
Franky, thanks for your reply - your suggestions helped get me on the right track. See my reply to Peter for what I came up with.
Regards, Mike
December 9, 2008 at 8:20 am
Is there "a more elegant / efficient way of doing this"? That's something I have asked myself a couple of times with these kind of questions.
Most of the times it should be possible to find a more elegant/efficient way, but is the performance of the found solutions sufficient, and finding a more elegant/efficient way would cost to much effort.
For your requirements I would probably build both the string concatenation and the DATEADD variant, and then do some testing to see what performs best.
For you're solution however, three of my rules of thumb come to my mind:
1) When I get data from an external system (i.e. not SQL Server itself or files in a share meant for this purpose), I try to get the data into SQL Server or into a RAW file as quickly as possible to minimize the duration of the connection with the external system. That also means that I do no calculations etc. in the pipeline from an external system (apart perhaps from adding an ID for the source it came from).
2) Try to minimize the number of columns in the pipeline, as that will allow more records per buffer, and therefore faster throughput. In your case this would mean doing the conversions from date and time to string in the same step as adding them together and converting back to datetime, for example (the DT_STR lengths and code page depend on your columns):
(DT_DBTIMESTAMP)((DT_STR, 10, 1252)[REC_ADD_DATE] + " " + (DT_STR, 8, 1252)[REC_ADD_TIME])
3) Avoid implicit conversions (which is what happens when the string from your step 3 is put into the datetime column of your step 4). Making the conversion explicit makes it easier to debug, and you might even consider redirecting conversion errors to a separate table/file, so the package can continue; see the sketch below.
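As an illustration of 3), one way to keep the cast explicit while guarding against obviously malformed values is an expression like this in the Derived Column (just a sketch; the *_STR column names are made up, and rows failing the length check end up as NULL rather than breaking the cast):
LEN(TRIM([REC_ADD_DATE_STR])) == 10 && LEN(TRIM([REC_ADD_TIME_STR])) == 8 ? (DT_DBTIMESTAMP)([REC_ADD_DATE_STR] + " " + [REC_ADD_TIME_STR]) : NULL(DT_DBTIMESTAMP)
Anything that still fails the cast itself can then be sent down the transformation's error output to a separate table or file.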
Peter Rijs
BI Consultant, The Netherlands