July 14, 2010 at 7:01 am
Hi everyone,
I have a strange requirement in SSIS, that I wanted to share with you all, and need some idea on how can I implement it...
I have to associate my source data with a date range and add a SEQUENCE column to the target based on that date range....
Actually I load data everyday(daily load) and it come in with an Account Number primary key(, but the day when I start loading it from from that day to + 5 days I have to maintain a sequence column (initial =1) and if the same account comes in that date range I have to increment the SEQUENCE column by 1....
Now after 5 days I have to restart the counts...and if same account comes again I have to SEQUENCE it as 1 now (now again wait 5 days , if it comes in 5 days increment sequence otherwise start new sequence)
Now my problem is that I have to maintain a date range separately for every account and I don't know how and where to maintain it... as from the src I am just getting one date , that tells me the record in date, I have to check this date against the date range for every account...
INPUT is like
ACcno InDAte
A1 1-Jul 2010
A2 1-Jul 2010
A2 3-Jul 2010
A2 4-Jul 2010
A2 6-Jul 2010
A3 6-Jul 2010
A2 11-Jul 2010
A2 19-Jul 2010
Output should be like
ACcno InDAte Sequence
A1 1-Jul 2010 1
A2 1-Jul 2010 1
A2 3-Jul 2010 2
A2 4-Jul 2010 3
A2 6-Jul 2010 1
A3 6-Jul 2010 1
A2 11-Jul 2010 1
A2 13-Jul 2010 2
You can see as the Indate for A2 changes sequence keeps on incrementing in 5 days range....
here is some sample data...
CREATE TABLE [abc1](
[I_KEY] [int] NULL,
[A_NO] [nvarchar](14) NULL,
[IN_DATE] [datetime] NULL,
[Sequence] [int] NULL
)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 1, '23', 'Jul 1 2010', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 2, '23', 'Jul 4 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 3, '23', 'Jul 5 2010', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 4, '23', 'Jul 6 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 5, '23', 'Jul 8 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 6, '23', 'Jul 15 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 7, '23', 'Jul 16 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 8, '23', 'Jul 21 2010 ', NULL)
Just to make sure that my INDATE will act as the starting point for the date range....
Thanks
Account no 23 is coming in 3 ranges out put will be like this
1232010-07-01 00:00:00.000 1
2232010-07-04 00:00:00.0002
3232010-07-05 00:00:00.0003
4232010-07-06 00:00:00.000 1
5232010-07-08 00:00:00.0002
6232010-07-15 00:00:00.0001
7232010-07-16 00:00:00.0002
8232010-07-21 00:00:00.0001
I already have this post in SSIS, I am looking for how to do it in T-SQL, istead of the SSIS big flow, shown in my ssis post here
http://www.sqlservercentral.com/Forums/Topic951163-148-1.aspx
Any help on this...
Thanks
July 14, 2010 at 8:15 am
You can use a quirky update to populate the sequence field
Table Setup
CREATE TABLE [#abc1](
[I_KEY] [int] NULL,
[A_NO] [nvarchar](14) not NULL,
[IN_DATE] [datetime] not NULL,
[Sequence] [int] NULL ,
PRIMARY KEY (A_NO, IN_DATE)
)
Sample Date
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 9, '22', 'Jul 1 2010', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 11, '22', 'Jul 5 2010', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 12, '22', 'Jul 8 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 10, '22', 'Jul 21 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 1, '23', 'Jul 1 2010', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 2, '23', 'Jul 4 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 3, '23', 'Jul 5 2010', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 4, '23', 'Jul 6 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 5, '23', 'Jul 8 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 6, '23', 'Jul 15 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 7, '23', 'Jul 16 2010 ', NULL)
INSERT INTO #abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 8, '23', 'Jul 21 2010 ', NULL)
Code
declare @a_no nvarchar(14), @In_date datetime, @Sequence int
begin tran
update #abc1
set @Sequence = Sequence =
case when @a_no = a_no THEN
Case when abs(DateDiff(d, In_date, @In_date)) < 5 then @Sequence + 1
else 1 end
ELSE 1
end,
@In_date = case when @a_no = a_no THEN
Case when abs(DateDiff(d, In_date, @In_date)) >= 5 then In_Date
else
@In_Date
end
else
In_Date
end,
@a_no = a_no
OPTION (MAXDOP 1)
select * from #abc1
rollback tran
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
July 14, 2010 at 1:35 pm
Code looks great
Thanks [/font]
July 14, 2010 at 1:39 pm
Response to Mike01
Thanks for your reply...
but there is no column named as Sequence in my source table..
I think its my mistake I gave code like that...
CREATE TABLE [#abc1](
[I_KEY] [int] NULL,
[A_NO] [nvarchar](14) not NULL,
[IN_DATE] [datetime] not NULL,
)
it is only this much from my Source table, I have to add a DERIVED column named as Sequence based on the Source data for Acc no and IN date....
Hope you got it..
July 14, 2010 at 1:43 pm
is the sequence just returned in the resultset? You don't want to store it anywhere? What is the significance of it then?
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
July 14, 2010 at 2:33 pm
I do want to store it, but first I want to derive sequence in my Source query and then in TARGET I want to store it, I dont have any sequence in source yet...
I have done the same in my SSIS package flow....
data is like this
INPUT is like
ACcno InDAte
A1 1-Jul 2010
A2 1-Jul 2010
A2 3-Jul 2010
A2 4-Jul 2010
A2 6-Jul 2010
A3 6-Jul 2010
A2 11-Jul 2010
A2 19-Jul 2010
Output should be like
ACcno InDAte Sequence
A1 1-Jul 2010 1
A2 1-Jul 2010 1
A2 3-Jul 2010 2
A2 4-Jul 2010 3
A2 6-Jul 2010 1
A3 6-Jul 2010 1
A2 11-Jul 2010 1
A2 13-Jul 2010 2
July 15, 2010 at 1:12 pm
Any help on this?????????????
July 15, 2010 at 7:03 pm
I changed the base data a bit to clear out junk, keep it simple and add some records:
CREATE TABLE #table
(A_NO nvarchar(14)
,IN_DATE datetime)
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 1 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 4 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 5 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 6 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 8 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 15 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 16 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctA','Jul 21 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 9 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 11 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 15 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 17 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 19 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 23 2010')
INSERT INTO #table (A_NO,IN_DATE) VALUES ('AcctB','Jul 25 2010')
Now, you're going to need an intermediate table. Normally you would use a temp table for this but I don't know if SSIS has issues using temp tables. If so, you simply create a permanent table instead and just make sure the data is deleted out before you begin the process. The important thing is that it has to be clustered (whether temp or perm) as shown below. The 'quirky update' method is not reliable without doing so.
The code for the update was taken from Mike's post above.
CREATE TABLE #workwithdata --create the table to work with the data
(RNINT PRIMARY KEY CLUSTERED
,A_NONVARCHAR(14)
,IN_DATE DATETIME
,SEQINT NULL)
INSERT INTO #workwithdata (RN, A_NO, IN_DATE)
SELECT RN = ROW_NUMBER() OVER(ORDER BY A_NO, IN_DATE),
A_NO,
IN_DATE
FROM #table --this is the source data
DECLARE @Sequence INT = 0,
@In_date DATETIME,
@a_no NVARCHAR(14) = 'none'
UPDATE #workwithdata --this is the 'Quirky Update' method
SET @Sequence = SEQ =
CASE WHEN @a_no = a_no THEN
CASE WHEN ABS(DATEDIFF(d, In_date, @In_date)) < 5 THEN @Sequence + 1
ELSE 1 END
ELSE 1 END,
@In_date =
CASE WHEN @a_no = a_no THEN
CASE WHEN ABS(DATEDIFF(d, In_date, @In_date)) >= 5 THEN In_Date
ELSE @In_Date END
ELSE In_Date END,
@a_no = a_no
OPTION (MAXDOP 1)
SELECT A_NO, IN_DATE, SEQ --this is the final query that will give you
FROM #workwithdata --the end result that you're after
ORDER BY RN
July 16, 2010 at 4:02 pm
riajain9 (7/14/2010)
Hi everyone,I have a strange requirement in SSIS, that I wanted to share with you all, and need some idea on how can I implement it...
I have to associate my source data with a date range and add a SEQUENCE column to the target based on that date range....
Actually I load data everyday(daily load) and it come in with an Account Number primary key(, but the day when I start loading it from from that day to + 5 days I have to maintain a sequence column (initial =1) and if the same account comes in that date range I have to increment the SEQUENCE column by 1....
Now after 5 days I have to restart the counts...and if same account comes again I have to SEQUENCE it as 1 now (now again wait 5 days , if it comes in 5 days increment sequence otherwise start new sequence)
Now my problem is that I have to maintain a date range separately for every account and I don't know how and where to maintain it... as from the src I am just getting one date , that tells me the record in date, I have to check this date against the date range for every account...
INPUT is like
ACcno InDAte
A1 1-Jul 2010
A2 1-Jul 2010
A2 3-Jul 2010
A2 4-Jul 2010
A2 6-Jul 2010
A3 6-Jul 2010
A2 11-Jul 2010
A2 19-Jul 2010
Output should be like
ACcno InDAte Sequence
A1 1-Jul 2010 1
A2 1-Jul 2010 1
A2 3-Jul 2010 2
A2 4-Jul 2010 3
A2 6-Jul 2010 1
A3 6-Jul 2010 1
A2 11-Jul 2010 1
A2 13-Jul 2010 2
You can see as the Indate for A2 changes sequence keeps on incrementing in 5 days range....
here is some sample data...
CREATE TABLE [abc1](
[I_KEY] [int] NULL,
[A_NO] [nvarchar](14) NULL,
[IN_DATE] [datetime] NULL,
[Sequence] [int] NULL
)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 1, '23', 'Jul 1 2010', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 2, '23', 'Jul 4 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 3, '23', 'Jul 5 2010', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 4, '23', 'Jul 6 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 5, '23', 'Jul 8 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 6, '23', 'Jul 15 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 7, '23', 'Jul 16 2010 ', NULL)
INSERT INTO abc1 ( I_KEY,A_NO,IN_DATE,Sequence) VALUES ( 8, '23', 'Jul 21 2010 ', NULL)
Just to make sure that my INDATE will act as the starting point for the date range....
Thanks
Account no 23 is coming in 3 ranges out put will be like this
1232010-07-01 00:00:00.000 1
2232010-07-04 00:00:00.0002
3232010-07-05 00:00:00.0003
4232010-07-06 00:00:00.000 1
5232010-07-08 00:00:00.0002
6232010-07-15 00:00:00.0001
7232010-07-16 00:00:00.0002
8232010-07-21 00:00:00.0001
I already have this post in SSIS, I am looking for how to do it in T-SQL, istead of the SSIS big flow, shown in my ssis post here
http://www.sqlservercentral.com/Forums/Topic951163-148-1.aspx
Any help on this...
Thanks
I know it seems like a trivial question but which day of the week does the 5 days start on and are the other two days of the week to be ignored?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 20, 2010 at 8:49 am
Hello Mike/Bt...
I cannot use this update as my data is very large more than 100000 and is always increasing...
I there any way to add it in the query as a DERIVED column with the same logic....
The 5 days period start from when I insert my data in the table FIRST time....
The day I start it getting from the SRC....
Thank you again
July 20, 2010 at 1:09 pm
riajain9 (7/20/2010)
Hello Mike/Bt...I cannot use this update as my data is very large more than 100000 and is always increasing...
I there any way to add it in the query as a DERIVED column with the same logic....
Do you know this or are you just guessing it would be a problem? I've used this method when working with hundreds of thousands of records at a time. If the underlying table is properly structured it should not be a problem.
There are other options, but it may help if you provide more information about the big picture of what you're trying to do ... how many records are coming in? ... what are your business requirements? ... etc.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply