December 26, 2012 at 8:37 pm
Hi all,
I have created an SSIS package. In the control flow I have an Execute SQL task that truncates my staging table, followed by a Data Flow task that loads the source data into the staging table. Currently, my Data Flow consists of only an OLE DB Source and an OLE DB Destination. However, I would like to limit the time range of the data held in my staging table, and I do not know how to achieve that. That is, I want to extract only a specified range of data from the source and load it into the staging table. In my source there are two varchar fields, EVENT_D and EVENT_T. For example, if I run my package at 26/12/12 11:00:00, the data to be extracted and loaded into staging is the data with EVENT_D + EVENT_T between 23/12/12 07:00:00 and 26/12/12 08:00:00,
and if I run my package at 27/12/12 14:00:00, my loaded data will be between:
27 Dec 0800 - 73 hours TO 27 Dec 0800 (that is, 24/12/12 07:00:00 to 27/12/12 08:00:00),
and if I run my package at 27/12/12 07:59:00 (which is very rare), my loaded data will be between:
23/12/12 07:00:00 and 26/12/12 08:00:00
Is this achievable? I guess one of the problems is that the date and time are held in two varchar fields, in the format DD/MM/YYYY hh:mm:ss.
Thanks,
10e5x
December 26, 2012 at 9:37 pm
You can use a Conditional Split transformation in the Data Flow tab. You can write any number of conditions in this transformation, and each condition has its own output, so the rows that satisfy a given condition can be directed to whichever next step you want.
You can also use functions to change the data type of your columns as needed.
December 27, 2012 at 1:53 am
Depending on the nature of your OLEDB source (is it an RDBMS?), the most efficient way would be to CAST the varchar date/time columns to a single column with a datetime datatype and use a select query with an appropriate WHERE clause to provide the data (and not just the whole table).
An even better way would be to add a proper datetime column to the source data, but I'm assuming that's not allowed? If there's a lot of data in this table and it's growing, any method is going to gradually grind to a halt, unless you are somehow able to get a useful index on it (e.g., in SQL Server, on a computed datetime column added to the base table).
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
December 27, 2012 at 2:09 am
Hi Phil,
Once again, thanks for replying and helping. Yes, my source is an RDBMS. Your suggestions are too complicated for me, so I am trying a simpler way: maybe add derived columns that convert EVENT_D and EVENT_T to datetime first, then use a Conditional Split. By the way, are you able to help me with the expression?
December 27, 2012 at 2:18 am
Is the source a SQL Server database?
December 27, 2012 at 2:48 am
It's an Oracle view.
December 27, 2012 at 2:54 am
OK, then you need help from an Oracle developer to design your SELECT statement for the OLEDB source.
Select col1, col2
from table
where [convert varchar date and time to datetime] between [startdate] and [enddate]
The problem with trying to do this all in SSIS is that you will always have to process all of the rows in the source table. If the source table is growing, as I mentioned before, your process will get slower and slower.
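To make the bracketed "convert varchar date and time to datetime" step more concrete, here is a minimal Python sketch of the idea (it is not the Oracle SQL itself). It assumes EVENT_D holds text such as 26/12/2012 and EVENT_T holds text such as 07:30:00, as described in the original post; the function names and the sample rows are made up for illustration.

```python
from datetime import datetime

# Assumed layout: EVENT_D = "DD/MM/YYYY", EVENT_T = "hh:mm:ss" (24-hour clock).
EVENT_FORMAT = "%d/%m/%Y %H:%M:%S"


def event_datetime(event_d: str, event_t: str) -> datetime:
    """Combine the two varchar fields into a single datetime value."""
    return datetime.strptime(f"{event_d.strip()} {event_t.strip()}", EVENT_FORMAT)


def rows_in_range(rows, start, end):
    """Keep rows whose combined event datetime lies between start and end (inclusive)."""
    return [
        row for row in rows
        if start <= event_datetime(row["EVENT_D"], row["EVENT_T"]) <= end
    ]


# Hypothetical sample rows, for illustration only.
sample = [
    {"EVENT_D": "23/12/2012", "EVENT_T": "06:59:59"},
    {"EVENT_D": "23/12/2012", "EVENT_T": "07:00:00"},
    {"EVENT_D": "26/12/2012", "EVENT_T": "08:00:01"},
]
kept = rows_in_range(sample, datetime(2012, 12, 23, 7), datetime(2012, 12, 26, 8))
print(kept)
```

Doing the same conversion and comparison inside the source query means only the matching rows ever leave the database, which is the point being made above.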
December 27, 2012 at 3:06 am
You are right, there will definitely be overhead. I will try to get it working first before looking at efficiency. Actually, my problem is defining startDate and endDate. Thanks, Phil.
December 27, 2012 at 4:12 am
10e5x (12/27/2012)
You are right, there will definitely be overhead. I will try to get it working first before looking at efficiency. Actually, my problem is defining startDate and endDate. Thanks, Phil.
OK, I've looked at your original post again. I'm not sure I understand the logic for setting the start and end dates - can you explain it?
December 27, 2012 at 8:15 pm
Phil Parkin (12/27/2012)
10e5x (12/27/2012)
You are right, there will definitely be overhead. I will try to get it working first before looking at efficiency. Actually, my problem is defining startDate and endDate. Thanks, Phil.
OK, I've looked at your original post again. I'm not sure I understand the logic for setting the start and end dates - can you explain it?
The start date is always 73 hours before the end date. The end date is the most recent 8am, and it is always a datetime in the past. E.g.:
Datetime when package run: 24/12/12 0900
Startdate: 21/12/12 0700
Enddate: 24/12/12 0800
Datetime when package run: 25/12/12 2300
Startdate: 22/12/12 0700
Enddate: 25/12/12 0800
Datetime when package run: 26/12/12 0759
Startdate: 22/12/12 0700
Enddate: 25/12/12 0800
Datetime when package run: 26/12/12 0800
Startdate: 22/12/12 0700
Enddate: 25/12/12 0800
Datetime when package run: 26/12/12 0805
Startdate: 23/12/12 0700
Enddate: 26/12/12 0800
As you can see, I want the data loaded from my source into my staging table to span 73 hours. I want 73 hours' worth of event data, so EVENT_D + EVENT_T should be between the Startdate and the Enddate.
Thanks in Advance,
10e5x
December 27, 2012 at 10:33 pm
You could write an expression something like this in the Conditional Split:
(DT_DBTIMESTAMP)(event_d + " " + event_t) >= DATEADD("HH",-73,GETDATE()) && (DT_DBTIMESTAMP)(event_d + " " + event_t) <= DATEADD("HH",8,(DT_DBDATE)(GETDATE()))
December 28, 2012 at 12:33 am
Here's my pseudo-code expanded a bit to accommodate the start and end date bits:
declare @StartDate datetime
,@EndDate datetime
set @EndDate = dateadd(hour, 8, DATEADD(dd, 0, DATEDIFF(dd, 0, DATEADD(HOUR, - 8, getdate()))))
set @StartDate = dateadd(hour, - 73, @EndDate)
select col1
,col2
from table
where [convert varchar date and time to datetime] between @StartDate
and @EndDate
That's how it works in T-SQL. Just need to convert that to Oracle.
--Edit: fixed typo.
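For anyone who wants to check the date arithmetic outside the database, here is a minimal Python sketch that mirrors the T-SQL above: shift back 8 hours, drop the time part, add 8 hours back, then subtract 73 hours for the start. The function name extraction_window is made up for this example. It also adds one step that is not in the T-SQL: as far as I can tell, for a run at exactly 08:00:00 the T-SQL returns that same day's 08:00 as the end date, whereas the examples given earlier in the thread expect the previous day's 08:00, so the sketch steps back one day in that case.

```python
from datetime import datetime, timedelta
from typing import Tuple


def extraction_window(run_at: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the 73-hour window for a package run at run_at."""
    # Same arithmetic as the T-SQL: shift back 8 hours, truncate to midnight,
    # then add the 8 hours back to land on an 08:00 boundary.
    shifted = run_at - timedelta(hours=8)
    midnight = shifted.replace(hour=0, minute=0, second=0, microsecond=0)
    end = midnight + timedelta(hours=8)
    # Extra step (an assumption based on the thread's examples, not in the T-SQL):
    # a run at exactly 08:00 uses the previous day's 08:00 as the end date.
    if end == run_at:
        end -= timedelta(days=1)
    start = end - timedelta(hours=73)
    return start, end


# Run times taken from the examples earlier in the thread.
for run_at in (
    datetime(2012, 12, 24, 9, 0),
    datetime(2012, 12, 26, 7, 59),
    datetime(2012, 12, 26, 8, 0),
    datetime(2012, 12, 26, 8, 5),
):
    start, end = extraction_window(run_at)
    print(run_at, "->", start, "to", end)
```

If the exact-08:00 case does not matter in practice, the `if end == run_at` step can be dropped and the function then matches the T-SQL exactly.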
December 28, 2012 at 1:40 am
Thanks to both of you. I am currently out of the office and can't test either solution, but I will test them as soon as I get back. By the way, from what I read of both your solutions, I wonder if you took care of the scenario where the package is executed before or at 8am, meaning these examples:
Datetime when package run: 26/12/12 0759
Startdate: 22/12/12 0700
Enddate: 25/12/12 0800
Datetime when package run: 26/12/12 0800
Startdate: 22/12/12 0700
Enddate: 25/12/12 0800
Just clarifying. Thanks a lot.
December 28, 2012 at 2:04 am
I wonder if you took care of the scenario where the package is executed before or at 8am.
My solution handles that.
December 28, 2012 at 2:24 am
Phil Parkin (12/28/2012)
I wonder if you took care of the scenario where the package is executed before or at 8am.
My solution handles that.
Thanks! 😀