January 29, 2010 at 11:37 am
I have a situation where I have a table with many columns. This table was originally a flat file, and each row has multiple columns
representing multiple instances of the same data.
I need to make each of those columns its own row, along with some common columns that will be used to identify the row.
Example:
CREATE TABLE [PeopleAndStuff](
[SSN] VARCHAR(11) NOT NULL,
[FirstName] VARCHAR(100) NOT NULL,
[LastName] VARCHAR(200) NOT NULL,
[Address] VARCHAR(500) NULL,
[Phone] VARCHAR(13) NULL,
[Item1] VARCHAR(300) NULL,
[Item1Value] MONEY NULL,
[Item2] VARCHAR(300) NULL,
[Item2Value] MONEY NULL
)
INSERT INTO [PeopleAndStuff]
SELECT '111-11-1111','John','Doe','111 Main St. Somewhere, SW','(123) 123-1234','Car','1000.00','Bicycle','50.00'
UNION ALL
SELECT '222-22-2222','Jane','Doe','222 Main St. Somewhere, SW','(222) 222-2222','Bicycle','10.00','car','500.00'
UNION ALL
SELECT '333-33-3333','Baby','Doe','333 Main St. Somewhere, SW','(333) 333-3333','Bottle','100.00','bib','5.00'
Now I want to create multiple tables:
CREATE TABLE [People] (
[SSN] VARCHAR(11) NOT NULL,
[FirstName] VARCHAR(100) NOT NULL,
[LastName] VARCHAR(200) NOT NULL,
[Address] VARCHAR(500) NULL,
[Phone] VARCHAR(13) NULL
)
CREATE TABLE [Stuff] (
[SSN] VARCHAR(11) NOT NULL,
[ItemDesc] VARCHAR(300) NULL,
[ItemValue] MONEY NULL
)
The end result should be that each row is evaluated and, if data exists in an "Item" column, the "Item" and its corresponding "Value"
are added to the new table as separate rows sharing the same common "SSN".
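In plain T-SQL (outside SSIS), the split can be sketched as a pair of set-based inserts; the table and column names below are the ones from the example DDL:

```sql
-- Populate the common columns once per person.
INSERT INTO [People] (SSN, FirstName, LastName, [Address], Phone)
SELECT SSN, FirstName, LastName, [Address], Phone
FROM [PeopleAndStuff];

-- Stack each Item/Value pair into its own row, skipping empty pairs.
INSERT INTO [Stuff] (SSN, ItemDesc, ItemValue)
SELECT SSN, Item1, Item1Value FROM [PeopleAndStuff] WHERE Item1 IS NOT NULL
UNION ALL
SELECT SSN, Item2, Item2Value FROM [PeopleAndStuff] WHERE Item2 IS NOT NULL;
```

With hundreds of column pairs, the UNION ALL branches could be generated from sys.columns rather than written by hand.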
What I have tried so far is to use a Multicast transformation, then evaluate each column and, if it is not NULL, insert the column and its value into
the destination table. But I get errors.
P.S.
This is an example. In my production environment, I have hundreds of columns to evaluate.
I am using SQL Server 2005 Standard w/ sp3
Any help would be greatly appreciated.
Thanks.
J.T.
January 29, 2010 at 11:58 am
Jason, does your string always have exactly two items, or is it a long string where the first five items are SSN/name/address and comma-delimited items 6 through N are items?
You mentioned hundreds of columns, but your example has just two "items", although they might have different descriptions.
My eyes blurred and I read the description wrong... sorry. You should be able to do this:
Note it is still not normalized; "Bicycle" and "Car" each appear twice because of different values... you might want to pull items out into a separate table and associate person-itemid-value instead.
select distinct [Item1],[Item1Value] from [PeopleAndStuff] union
select distinct [Item2],[Item2Value] from [PeopleAndStuff]
--results:
Item1 Item1Value
------- ---------------------
bib 5.00
Bicycle 10.00
Bicycle 50.00
Bottle 100.00
car 500.00
Car 1000.00
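Lowell's normalization suggestion, pulling distinct item descriptions into their own table, might look like this (the [Items] table and its IDENTITY key are assumptions, not part of the original schema):

```sql
-- Assumed lookup table for distinct item descriptions.
CREATE TABLE [Items] (
    [ItemId]   INT IDENTITY(1,1) PRIMARY KEY,
    [ItemDesc] VARCHAR(300) NOT NULL UNIQUE
);

-- UNION (without ALL) removes duplicate descriptions.
INSERT INTO [Items] (ItemDesc)
SELECT Item1 FROM [PeopleAndStuff] WHERE Item1 IS NOT NULL
UNION
SELECT Item2 FROM [PeopleAndStuff] WHERE Item2 IS NOT NULL;

-- [Stuff] would then hold SSN / ItemId / ItemValue instead of the description.
```

Note that under a case-insensitive collation, "car" and "Car" collapse into one row; under a case-sensitive one they remain distinct entries.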
Lowell
February 1, 2010 at 1:25 am
February 1, 2010 at 11:39 am
I figured out the error in the Lookup process (configure the error output and set it to "Ignore failure" for values not in the lookup table, such as NULLs). However, now I am getting zero rows written because of table locking. This is what I was afraid of. How can I set up one sequence of lookups and table writes to complete before the next set of lookups begins, when the evaluation is occurring on the same row, just different columns?
Here is what I am doing:
I take the row, evaluate the first item column against a lookup table, and then write the substitute value and the amount to the items table. A second process evaluates item column 2 against the same lookup table and (what I want) writes a new row to the destination table. Both lookup-and-write processes hang off a single Multicast transformation.
Wishful thinking?
February 1, 2010 at 1:08 pm
OK. So I put each column evaluation in its own Data Flow task. Problem solved for the time being. However, I have a new issue, a performance tuning one:
Some of these columns contain NULL values. When a NULL is evaluated, I want the process to NOT add the columns to the table but to move on to the next row. In other words (same data): when the general information is written to the first table, and the columns destined for the second (items) table evaluate to NULL, do not insert a row into the items table.
Any help is appreciated.
February 1, 2010 at 8:59 pm
Why wouldn't UNPIVOT (maybe, more than one) do the job here?
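For this example, a double UNPIVOT (available in SQL Server 2005) would do it; the cross product of the two unpivots is filtered back down by matching each description column name to its value column name. A sketch against the example table:

```sql
SELECT SSN, ItemDesc, ItemValue
FROM [PeopleAndStuff]
UNPIVOT (ItemDesc  FOR DescCol IN ([Item1], [Item2]))           AS d
UNPIVOT (ItemValue FOR ValCol  IN ([Item1Value], [Item2Value])) AS v
WHERE ValCol = DescCol + 'Value';  -- keep Item1 with Item1Value, Item2 with Item2Value
```

A side benefit: UNPIVOT silently drops NULL values, which takes care of the empty item columns without any extra filtering.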
--Jeff Moden
Change is inevitable... Change for the better is not.
February 1, 2010 at 11:59 pm
February 2, 2010 at 9:22 am
Thanks for taking the time to read and post, Willem. I cannot find any documentation that explains how to use a Conditional Split. It looks like a transformation used for routing rows to different tables based on a condition. But can it be used to write to NO table? Any experience with this transformation that I can (humbly) get your input on?
JT
February 3, 2010 at 12:45 am
Jason,
You will find some information on the conditional split transformation (CST) in BOL.
You don't have to use an output downstream of the CST, so basically those rows won't go anywhere (but the bit bucket :-).
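As a sketch of that setup: the Conditional Split gets one condition per item column, and only the matching outputs are wired to destinations. The expressions below use SSIS expression syntax; the output names are hypothetical:

```
Output: HasItem1    Condition: !ISNULL(Item1)
Output: HasItem2    Condition: !ISNULL(Item2)
Default output:     left unconnected, so all-NULL rows go nowhere
```

Rows that satisfy none of the conditions fall through the default output, and since nothing is attached to it, they are simply discarded.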