May 28, 2009 at 11:37 am
hi,
I have an SQL source table with structure as below
col1, col2, col3, col4, col5
a1 b1 c1 d e
a1 b1 c1 f g
a2 b1 c1 h i
a2 b1 c1 j k
...
I need to import only col1, col2 and col3 fields into destination table and these should be unique.
col1, col2, col3
a1 b1 c1
a2 b1 c1
...
Currently I am using group by clause in OLE source sql cmd to do this. Is there a better way to accomplish this?
May 28, 2009 at 12:01 pm
Is this what you mean?
SELECT DISTINCT col1, col2, col3
FROM yourTable
[font="Arial Narrow"]bc[/font]
May 28, 2009 at 12:29 pm
select col1, col2, col3 from
group by col1, col2, col3
May 28, 2009 at 12:43 pm
[font="Arial Narrow"]bc[/font]
May 28, 2009 at 12:46 pm
don't know what happened to the prior post....here it is.
I misunderstood what you were asking for. Usually you would use a GROUP BY to get some aggregate like MIN, MAX, SUM, COUNT, etc. Either of these queries accomplishes what you stated, gets the unique combination of values in the 3 columns.
SELECT DISTINCT col1, col2, col3
FROM yourTable
SELECT col1, col2, col3
FROM yourTable
GROUP BY col1, col2, col3
[font="Arial Narrow"]bc[/font]
May 28, 2009 at 1:02 pm
currently i am building the recordset in OLE Source using T-SQL. I want to know if there is a way to do it in ssis itself. I am testing aggregate transform but want to know how its done usually.
May 28, 2009 at 1:45 pm
My two cents would be to use the TSQL. I've never tested the ssis Aggregate transform, but would be surprised if it outperformed sql server (of course it probably depends :-)). Plus you don't have as much data to port across by letting the database engine handle it.
Perhaps others could provide a counterpoint.
[font="Arial Narrow"]bc[/font]
May 28, 2009 at 7:37 pm
bc (5/28/2009)
My two cents would be to use the TSQL. I've never tested the ssis Aggregate transform, but would be surprised if it outperformed sql server (of course it probably depends :-)). Plus you don't have as much data to port across by letting the database engine handle it.Perhaps others could provide a counterpoint.
I agree with this.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 28, 2009 at 8:35 pm
I think that the aggregate transform component can do this also.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
May 29, 2009 at 7:59 am
I did a little test yesterday afternoon against a query with 2 varchar(5) columns, ~250k rows (109 - 123 distinct records*). I built 2 Data Flows, with one Data Flow using the Aggregate transform and one using a Select Distinct query. The Aggregate transform data flow consistently takes 8-9 seconds, the Select Distinct < .25 second in debug mode.
*The other difference that may or may not affect you. My database is case-insensitive, the aggregate transform even with all the comparison flags checked returned additional records that appeared to be duplicates (had blank values).
[font="Arial Narrow"]bc[/font]
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply