December 14, 2013 at 10:35 pm
Hi there,
I have a source file that has many columns (25) and want to find the most efficient way to load it into a table with just two columns.
I.e I always want be first column to appear so I would want columns 1+2, then 1+3, then 1+4, then 1+5..... Up to 1+25
I was thinking of an insert command that pulls data from a staging table - however bit every row uses the full 25 columns- some are only 3 or 4 and I don't want rows in the final table that just have data in first column 1.
Any help would be appreciated!!
December 15, 2013 at 12:06 am
Given that I'm not good at SSIS, I did this in Access, because it's what I know, and it works. (So it is good for some things!)
In my source table "srcTable", the structure was like this:
CREATE TABLE srcTable (
ID INT IDENTITY(1,1) PRIMARY KEY,
FieldA VARCHAR(10),
FieldB VARCHAR(10),
FieldC VARCHAR(10),
FieldD VARCHAR(10)
);
Then my destination table was like this:
CREATE TABLE destTable (
ParentID INT NOT NULL,
TheValue VARCHAR(10) NOT NULL);
Then I created a module to do the dynamic SQL dirty work...
Option Compare Database
Option Explicit
Public Sub NormalizeData()
Dim tdf As DAO.TableDef
Dim i As Integer
Dim strSQL As String
DBEngine(0)(0).TableDefs("srcTable").Fields.Refresh
Set tdf = DBEngine(0)(0).TableDefs("srcTable")
For i = 1 To tdf.Fields.Count - 1
strSQL = "INSERT INTO destTable ( ParentID, TheValue ) SELECT srcTable.ID, srcTable.[" & tdf.Fields(i).Name & "] FROM srcTable WHERE srcTable.[" & tdf.Fields(i).Name & "] IS NOT NULL;"
Debug.Print strSQL
DBEngine(0)(0).Execute strSQL
Next i
Set tdf = Nothing
End Sub
Basically, it grabs the first column name (tdf.fields(0).Name) and the nth column (tdf.fields(i).Name) and inserts the non-null values into the destination table. then it processes the next column until there are no more columns to process.
For the SQL Smarties, I would love to know how to do this in T-SQL... I know about the sys.columns stuff, but not sure how to use it... so I resorted to something I know and that works. =)
December 15, 2013 at 9:16 am
Being a visual oriented individual I have no idea even where to start. It would help if you provided sample and the expected results based on the sample data. The sampe data should be representative of your data, not actual production data.
December 15, 2013 at 12:37 pm
elliottcoyne (12/14/2013)
Hi there,I have a source file that has many columns (25) and want to find the most efficient way to load it into a table with just two columns.
I.e I always want be first column to appear so I would want columns 1+2, then 1+3, then 1+4, then 1+5..... Up to 1+25
I was thinking of an insert command that pulls data from a staging table - however bit every row uses the full 25 columns- some are only 3 or 4 and I don't want rows in the final table that just have data in first column 1.
Any help would be appreciated!!
It's real easy. Just load the data into a staging table table (as you've identified) and then do qualified CROSS APPLYs to unpivot the data which will also allow you to ignore any rows where the 2+ column have no data.
If you'd like a coded example, please refer to the first link in my signature line below "Helpful Links" for the right was to post the example data.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 17, 2013 at 7:30 pm
Hi Jeff - thanks for the offer of help. I've followed your guide (as far as I was able to) and created the following to mimic my data source:
--truncate table tstStreamlineCode
--===== Create the test table with
CREATE TABLE tstStreamlineCodes
(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
NHScode varchar(7),
sl1 varchar(5),
sl2 varchar(5),
sl3 varchar(5),
sl4 varchar(5),
sl5 varchar(5),
sl6 varchar(5),
sl7 varchar(5),
sl8 varchar(5),
sl9 varchar(5)
)
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT tstStreamlineCodes ON
--===== Insert the test data into the test table
INSERT INTO tstStreamlineCodes
(ID, NHScode, sl1 ,sl2 ,sl3 ,sl4 ,sl5 ,sl6 ,sl7 ,sl8 ,sl9)
SELECT '1','10002K','4068','4065','4086','4069','4096','4120','4121','4097','4353' UNION ALL
SELECT '2','10006P','4068','4065','4086','4069','4096','4120','4121','4097','' UNION ALL
SELECT '3','1003T', '3632','','','','','','','','' UNION ALL
SELECT '4','1007B','3633','','','','','','','','' UNION ALL
SELECT '5','1024X','1589','2044','','','','','','','' UNION ALL
SELECT '6','1037N','1589','2044','','','','','','',''
--===== Set the identity insert back to normal
SET IDENTITY_INSERT tstStreamlineCodes OFF
December 18, 2013 at 11:54 am
elliottcoyne (12/17/2013)
Hi Jeff - thanks for the offer of help. I've followed your guide (as far as I was able to) and created the following to mimic my data source:
--truncate table tstStreamlineCode
--===== Create the test table with
CREATE TABLE tstStreamlineCodes
(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
NHScode varchar(7),
sl1 varchar(5),
sl2 varchar(5),
sl3 varchar(5),
sl4 varchar(5),
sl5 varchar(5),
sl6 varchar(5),
sl7 varchar(5),
sl8 varchar(5),
sl9 varchar(5)
)
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT tstStreamlineCodes ON
--===== Insert the test data into the test table
INSERT INTO tstStreamlineCodes
(ID, NHScode, sl1 ,sl2 ,sl3 ,sl4 ,sl5 ,sl6 ,sl7 ,sl8 ,sl9)
SELECT '1','10002K','4068','4065','4086','4069','4096','4120','4121','4097','4353' UNION ALL
SELECT '2','10006P','4068','4065','4086','4069','4096','4120','4121','4097','' UNION ALL
SELECT '3','1003T', '3632','','','','','','','','' UNION ALL
SELECT '4','1007B','3633','','','','','','','','' UNION ALL
SELECT '5','1024X','1589','2044','','','','','','','' UNION ALL
SELECT '6','1037N','1589','2044','','','','','','',''
--===== Set the identity insert back to normal
SET IDENTITY_INSERT tstStreamlineCodes OFF
The following should do it (assuming you don't want to include the ID column which could easily be added in):
SELECT ca.NHSCode,ca.SLCode
FROM dbo.tstStreamLineCodes
CROSS APPLY
(
SELECT NHSCode,sl1 UNION ALL
SELECT NHSCode,sl2 UNION ALL
SELECT NHSCode,sl3 UNION ALL
SELECT NHSCode,sl4 UNION ALL
SELECT NHSCode,sl5 UNION ALL
SELECT NHSCode,sl6 UNION ALL
SELECT NHSCode,sl7 UNION ALL
SELECT NHSCode,sl8 UNION ALL
SELECT NHSCode,sl9
)ca(NHSCode,SLCode)
WHERE ca.SLCode > ''
;
Results:
NHSCode SLCode
------- ------
10002K 4068
10002K 4065
10002K 4086
10002K 4069
10002K 4096
10002K 4120
10002K 4121
10002K 4097
10002K 4353
10006P 4068
10006P 4065
10006P 4086
10006P 4069
10006P 4096
10006P 4120
10006P 4121
10006P 4097
1003T 3632
1007B 3633
1024X 1589
1024X 2044
1037N 1589
1037N 2044
(23 row(s) affected)
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply