February 19, 2013 at 4:01 pm
I just used the Unpivot tool, so now I have more columns than I need. I know how to map the columns, but if I need to use the SQL command instead, I don't know how I'd write it. I know the above is an either/or situation.
Since my data has been unpivoted in an effort to normalize it, I need to make sure that the information I'm inserting into the destination is DISTINCT. What tool do I need to get DISTINCT data into my destination?
Thanks!
February 19, 2013 at 4:07 pm
The aggregate tool. Just group by everything.
Be aware that it's a stream-stopper. Because the data is unsorted, the Aggregate is fully blocking: it must receive every row before it releases any, so any downstream components in the data flow won't start until the aggregate completes.
Under most circumstances I tend to do aggregations at the database tier, as it's better able to handle the workload. If you're dropping this to a staging table immediately afterwards, you'll be better off (unless the volume difference is drastic) dumping everything and then SELECT DISTINCT'ing into your real table.
If it's for a flatfile or something it's just price of doing business, just try to make sure it's late in the stream.
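As a quick sketch of why "group by everything" gives you DISTINCT rows: grouping on every column returns exactly one row per unique combination, which is the same result set SELECT DISTINCT produces. The example below illustrates this with Python's sqlite3 standard library (table and column names are made up for illustration; your unpivoted columns will differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical unpivoted data with a duplicate row.
cur.execute("CREATE TABLE unpivoted (attr TEXT, val TEXT)")
cur.executemany("INSERT INTO unpivoted VALUES (?, ?)",
                [("color", "red"), ("color", "red"), ("size", "L")])

# GROUP BY every column: one row per unique combination.
grouped = cur.execute(
    "SELECT attr, val FROM unpivoted GROUP BY attr, val ORDER BY attr, val"
).fetchall()

# SELECT DISTINCT: same result set.
distinct = cur.execute(
    "SELECT DISTINCT attr, val FROM unpivoted ORDER BY attr, val"
).fetchall()

print(grouped == distinct)  # True
print(grouped)              # [('color', 'red'), ('size', 'L')]
```

This is exactly what the SSIS Aggregate transformation does when you add every input column to the Group By list, just performed in the pipeline instead of at the database.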
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
February 20, 2013 at 3:16 am
Jacob Pressures (2/19/2013)
I just used the Unpivot tool, so now I have more columns than I need. I know how to map the columns, but if I need to use the SQL command instead, I don't know how I'd write it. I know the above is an either/or situation. Since my data has been unpivoted in an effort to normalize it, I need to make sure that the information I'm inserting into the destination is DISTINCT. What tool do I need to get DISTINCT data into my destination?
Thanks!
Quick note: if your source is an RDBMS, it would be faster (probably much faster) to do all of this using SQL rather than SSIS.
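To illustrate the "do it in SQL" suggestion: the unpivot and the de-duplication can both happen in a single statement at the source. Here's a minimal sketch using sqlite3 so it's self-contained (T-SQL also offers a dedicated UNPIVOT operator; the wide table and its columns here are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical wide source table the SSIS Unpivot transform would consume.
cur.execute("CREATE TABLE src (id INTEGER, phone1 TEXT, phone2 TEXT)")
cur.executemany("INSERT INTO src VALUES (?, ?, ?)",
                [(1, "555-0100", "555-0100"),   # duplicate value across columns
                 (2, "555-0101", "555-0102")])

# Unpivot via UNION ALL, then de-duplicate with DISTINCT -- all in one query,
# so SSIS only has to move the already-clean rows.
rows = cur.execute("""
    SELECT DISTINCT id, phone FROM (
        SELECT id, phone1 AS phone FROM src
        UNION ALL
        SELECT id, phone2 AS phone FROM src
    ) ORDER BY id, phone
""").fetchall()

print(rows)  # [(1, '555-0100'), (2, '555-0101'), (2, '555-0102')]
```

With the source query doing the work, the SSIS package reduces to a simple source-to-destination data flow with no blocking Aggregate in the middle.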
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
February 20, 2013 at 7:32 am
Thanks, guys! It sounds like using a staging table is the easiest approach. This is just an exercise for my internship to help me understand how to use SSIS. I guess I'll do it both ways for practice.
If I use a staging table, I'm assuming I'd have to end the first data flow, add another Data Flow task to the Control Flow, and link the two. I'd then pull the distinct data out of the staging tables and place it in the real destination tables.
This is my understanding. Any alternatives? Once I put something into a destination table, can I take it out in the same data flow? This is why I'm thinking I'd need two data flows.
Thanks!
February 20, 2013 at 7:47 am
There are almost always alternatives, and knowing the pros and cons of each is really useful. In this case, once the data is in a staging table I would probably run a T-SQL MERGE to get it to its destination, unless it's all INSERTs, in which case your suggested method will work fine.
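For a sense of what the staging-to-destination step looks like: T-SQL MERGE expresses "update the rows that already exist, insert the ones that don't" in one statement. SQLite has no MERGE, so the sketch below uses its INSERT ... ON CONFLICT upsert to show the same WHEN MATCHED / WHEN NOT MATCHED idea (table names and the key column are illustrative assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE staging (id INTEGER, val TEXT)")
cur.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, val TEXT)")

# Staging holds duplicates; dest already has an older row for id 2.
cur.executemany("INSERT INTO staging VALUES (?, ?)",
                [(1, "a"), (1, "a"), (2, "b")])
cur.execute("INSERT INTO dest VALUES (2, 'old')")

# Upsert the DISTINCT staged rows: new keys are inserted, existing keys
# updated. (In T-SQL this would be a MERGE with WHEN MATCHED THEN UPDATE
# and WHEN NOT MATCHED THEN INSERT.) The "WHERE true" avoids a parsing
# ambiguity SQLite has when ON CONFLICT follows a SELECT source.
cur.execute("""
    INSERT INTO dest (id, val)
    SELECT DISTINCT id, val FROM staging WHERE true
    ON CONFLICT(id) DO UPDATE SET val = excluded.val
""")

result = cur.execute("SELECT id, val FROM dest ORDER BY id").fetchall()
print(result)  # [(1, 'a'), (2, 'b')]
```

In the package, this statement would sit in an Execute SQL task on the Control Flow, after the data flow that loads the staging table.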
February 20, 2013 at 9:56 am
Thanks very much!