December 19, 2005 at 9:50 am
I'm going to try to simplify this a little. One of our business users has come to me wanting me to set something up to help them analyze information they will be getting on a regular basis. What they have is a flat file with some general information followed by 120 columns of survey-type data.
So the structure of their flat file looks something like this:
ID
Name
City
State
Q1
Q2
Q3
Q4
Q5
Q6
And so on through Q120
What I want to end up with is 2 tables that resemble the following:
Table 1
ID
Name
City
State
Table 2
ID
Q
These would obviously have a one-to-many relationship, but I think having Q1 - Q120 in a separate table like this will make the reporting I need to do much easier.
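To make it concrete, here is roughly the DDL I have in mind (column names and types are just a first guess, and I suspect Table 2 will also need a question-number column so I can tell the answers apart):
CREATE TABLE SurveyContact (
    ID int NOT NULL PRIMARY KEY,
    Name varchar(50),
    City varchar(50),
    State varchar(30)
)
CREATE TABLE SurveyAnswer (
    ID int NOT NULL REFERENCES SurveyContact (ID),
    Q_id int NOT NULL,   -- which question (1 - 120)
    Q varchar(10)        -- the answer itself
)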
Any suggestions on how to attack this?
Thanks
December 19, 2005 at 10:35 am
Well, the one-to-many design looks good, and you've already suggested it yourself.
What other info do you need on this?
December 19, 2005 at 10:39 am
What I'm looking for is the SQL code to actually transpose the data like this. I've tried a couple of things this morning without much luck, so I thought that if anyone has done this before, they could point me in the right direction.
December 19, 2005 at 11:37 am
Hi,
So, let's see if I understand: you need a query that does this transformation for you, so that you can then execute an INSERT INTO ... SELECT ... FROM statement?
As far as I know, there is no single statement in standard SQL that will do this generically; no plain query can create the view you need to insert into the target table.
I know of some ways to do it, maybe I can help if you provide us some details:
- How many rows are there in the source table?
- Do you plan to load the flat file into a database table?
- Is this a run-once problem or will you need to convert this structure continuously?
cheers,
paulo.
December 19, 2005 at 12:46 pm
Paulo,
You are correct; I'm looking for a query to help me get this into relational tables.
As for your other questions: the number of rows in the source table could range from a handful to hundreds. I can load the flat file into a database table first (a temporary holding area, if you will). Finally, this is something we will have to do on a continual basis; it is a file we receive from a survey company on a regular basis.
Thanks,
Curt
December 19, 2005 at 1:41 pm
You need to "unpivot" the data. This can be done in T-SQL using a UNION ALL query that converts columns to rows.
eg:
SELECT ID, 1 AS Q_id, Q1 AS Q
FROM YourTable
UNION ALL
SELECT ID, 2 AS Q_id, Q2 AS Q
FROM YourTable
UNION ALL
...
SELECT ID, 120 AS Q_id, Q120 AS Q
FROM YourTable
This produces 120 rows (one per question) for each ID in the source table. If the columns are named as you suggest (Q1 - Q120), you don't have to hand-code all of this; you can write T-SQL against the INFORMATION_SCHEMA views (or syscolumns) to generate the SQL for you.
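For example, something along these lines will build and run the whole statement (an untested sketch; it assumes the staging table is named YourTable, the target is Table2 (ID, Q_id, Q), and the question columns are the only ones starting with 'Q' followed by a digit):
DECLARE @sql varchar(8000)
SET @sql = ''
-- concatenate one SELECT per Q column, gluing them together with UNION ALL
SELECT @sql = @sql
    + CASE WHEN @sql = '' THEN '' ELSE ' UNION ALL ' END
    + 'SELECT ID, ' + SUBSTRING(COLUMN_NAME, 2, 10) + ' AS Q_id, '
    + COLUMN_NAME + ' AS Q FROM YourTable'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourTable'
  AND COLUMN_NAME LIKE 'Q[0-9]%'
SET @sql = 'INSERT INTO Table2 (ID, Q_id, Q) ' + @sql
EXEC (@sql)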
December 19, 2005 at 2:32 pm
Thanks. Can you get me started on how to do it by writing the T-SQL against the schema?
December 19, 2005 at 5:46 pm
You already got the UNION ALL solution. I presume that if the table does not fit in memory, it can be slow; this is why I asked about the size.
Another way to do it, which by the way has the advantage of being more concise and in principle faster to execute, is to declare a cursor over the source table and insert the multiple records into the target table. You can do this in two ways: either code the 120 inserts by hand (as sketched below) or use a nested cursor over syscolumns (usually not a good idea, because you would be locking system tables).
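A rough sketch of the first variant, with the inserts coded out (only the first two question columns shown; the table and column names are taken from the earlier posts, so adjust to suit):
DECLARE @id int, @q1 char(1), @q2 char(1) -- ... and so on through @q120
DECLARE survey_cur CURSOR FAST_FORWARD FOR
    SELECT ID, Q1, Q2 -- ..., Q120
    FROM YourTable
OPEN survey_cur
FETCH NEXT FROM survey_cur INTO @id, @q1, @q2
WHILE @@FETCH_STATUS = 0
BEGIN
    -- one INSERT per question column, 120 in total
    INSERT INTO Table2 (ID, Q_id, Q) VALUES (@id, 1, @q1)
    INSERT INTO Table2 (ID, Q_id, Q) VALUES (@id, 2, @q2)
    -- ...
    FETCH NEXT FROM survey_cur INTO @id, @q1, @q2
END
CLOSE survey_cur
DEALLOCATE survey_cur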
hope this helps.
paulo.
December 19, 2005 at 9:12 pm
Ehlinger,
As strange as it sounds, if the flat file is VERY consistent, you can actually use BCP or BULK INSERT to transpose the input from the flat file directly into a table. You will, however, have to make a BCP format file to deal with the unusual layout... it would start out looking like this...
8.0
125
1      SQLCHAR      0      500      "\r\n"      1      ID         ""
2      SQLCHAR      0      500      "\r\n"      2      Name       ""
3      SQLCHAR      0      500      "\r\n"      3      City       ""
4      SQLCHAR      0      500      "\r\n"      4      State      ""
5      SQLCHAR      0      500      "\r\n"      5      Email      ""
6      SQLCHAR      0      500      "\r\n"      6      Q1         ""
7      SQLCHAR      0      500      "\r\n"      7      Q2         ""
8      SQLCHAR      0      500      "\r\n"      8      Q3         ""
9      SQLCHAR      0      500      "\r\n"      9      Q4         ""
10     SQLCHAR      0      500      "\r\n"      10     Q5         ""
11     SQLCHAR      0      500      "\r\n"      11     Q6         ""
...
125    SQLCHAR      0      500      "\r\n"      125    Q120       ""
Notice that the "column delimiters" are all CrLf's...
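Once you've saved the format file (say, as C:\survey.fmt), the load itself is a one-liner... something like this, with the table and file names changed to match yours:
BULK INSERT YourStagingTable
FROM 'C:\survey.txt'
WITH (FORMATFILE = 'C:\survey.fmt')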
--Jeff Moden
Change is inevitable... Change for the better is not.
December 20, 2005 at 3:33 am
If the transformation during import suggested by Jeff works with your files, it will be the best way to do this. If you have problems and need to import the data into a staging table first, here is what you could do with it using dynamic SQL.
Disclaimer: Please understand that most of the time it is best to avoid dynamic SQL and use standard, set-based solutions (like the one with UNION ALL). If you know that you will always be working with a table that has 120 "Q" columns, I would suggest taking the time to write the query using 119 UNIONs. However, if you know that the table will always have the same general structure but the number of "Q" columns may vary, it is handy to have SQL that can be quickly modified or parameterized to run against any number of columns.
I cannot guarantee that this SQL will perform correctly on your data - it is just an example of how it could be done. It does not take into account any previously inserted rows and assumes that all tables are freshly created and empty. I only used 4 Q columns to keep it simple, but you can use as many as you need (or rather, as many as SQL Server allows). Also, I understand that this action will be performed on a staging table with at most hundreds of rows; for larger amounts of data, I'm not sure what the performance would be.
--staging table to which you import data from flat file
CREATE TABLE #import (imp_ID int identity, cust_name varchar(50), City varchar(50), State varchar(30),
Email varchar(50), Q1 char(1), Q2 char(1), Q3 char(1), Q4 char(1))
--test values
INSERT INTO #import VALUES ('Bill','Houston','Texas','bill@company.com', 'Y','N','1','2')
--tables to hold results
CREATE TABLE #contacts (id_contact int, cust_name varchar(50), City varchar(50), State varchar(30),
Email varchar(50))
CREATE TABLE #questions (id_contact int, Q_id int, Q char(1))
--customers in survey and their contact data
INSERT INTO #contacts (id_contact, cust_name, city, state, email)
SELECT imp_id, cust_name, city, state, email
FROM #import
--bring all Q columns into one to normalize data
DECLARE @counter int, @sql nvarchar(1024)
SET @counter = 0
WHILE @counter < 4
BEGIN
SET @counter = @counter+1
SET @sql = 'INSERT INTO #questions (id_contact, Q_id, Q)
SELECT imp_ID,' + CAST(@counter as varchar(3))+ ',Q'+ CAST(@counter as varchar(3))
+ ' FROM #import'
EXEC sp_executesql @sql
END
drop table #import
drop table #questions
drop table #contacts
HTH, Vladan
EDIT: P.S. To determine the number of Q columns dynamically (assuming no other column begins with 'Q'!), you can use the query below. It can easily be modified to work with non-temporary tables:
SELECT count(*) from tempdb..syscolumns
where [id] = object_id('tempdb..#import')
and LEFT([name],1) = 'Q'
That means, if you declare a variable, fill it with the count of Q columns, and use it instead of the fixed value (4 in my example) in the WHILE condition, you can use the same SQL to transform a table with any number of Q columns, as in the sketch below.
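Something like this (same assumptions as above - only the question columns start with 'Q'):
DECLARE @q_count int
SELECT @q_count = count(*)
FROM tempdb..syscolumns
WHERE [id] = object_id('tempdb..#import')
  AND LEFT([name],1) = 'Q'
-- then in the loop from my example above, use:
-- WHILE @counter < @q_count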