March 2, 2006 at 8:44 am
Hi I am new at this. I am getting stuck with this SQL query. We are paasing from AS 400 database to informtica. And there are like arnd a million records. These records also have some duplicates. Now there is a column by the name RRN which is somewhat like row ID in oracle. The filter condition shd be in such a way that the duplicates should not be loaded and only 1 row from the dupluicates shd be loaded. The one which has the MAX RRN Value . So for eg there are 4 records which are duplicate then only one with the MAX RRN value shd be loaded. As that was the latest inserted . I need an SQL query which shd just pass rows without duplicates and amongs the duplicates only those with MAX RRN . Any help is highly appreciated.
March 2, 2006 at 9:02 am
Sorry if I'm mistaken, but you should be able to identify the rows that create uniqueness when you use informatica. That is what is for.
select foo, Bar
from mytable
join (select min(foo) as firstfoo
from mytable
Group by bar) T1 on firstfoo = foo
(in the Group by statement is where you put all the fields which cause uniqueness.)
March 2, 2006 at 9:26 am
I am sorry but I got confused with the uniqueness .. I will explain again as to whats happening.. There are 1 million records which are we are importing from the AS 400 database to Informatica via Source Qualifier .. And this is the initialization .so it would run only once. The SQL query would be used in the Source Qualifier which should then import data which is duplicate free. As there would be multiple sets of duplicate rows..The query should just pick 1 row from all duplcate sets and filter out the others. And this row shd be the one with MAX RRN Value among each set.
The fields are RRN which could be called as the primary key , Product ID, Product Name , SSNNO, SSNType, Product description.
So if there are 4 rows which are duplicates with RRN values 100, 200, 300, 666 then it should only pick the row with RRN value 666 and filter out the remaining. I appreciate ur prompt reply.. I hope I have made the question simpler for you guys .. reply awaited
March 2, 2006 at 9:44 am
I assume RRn values only will be differed and remaining columns have same values in the duplicate rows. In that case write Sql statement like this
select max(RRN) as RRN, ProductID, ProductName, SSNNO, SSNType, productdescription from <table_name>
group by ProductID, ProductName, SSNNO, SSNType, productdescription
March 2, 2006 at 9:51 am
THANKS A LOT GUYS .. IT SEEMS TO RUN .. WILL LET U GUYS KNOW IF THERES A PROBLEM..THANKS AGAIN
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply