September 14, 2011 at 8:48 am
Hi guys,
I am struggling to get a data set into SQL Server (see links below). I have imported from various sources before, but I am at a bit of a loss with this one.
I am guessing that I need to create the tables and then do a batch insert somehow, but I am not sure how to get SQL Server to read the dataset correctly.
Below is the dataset in question and a description of the fields:
Data > http://www.aoml.noaa.gov/hrd/hurdat/tracks1851to2010_atl_reanal.html
Fields > http://www.aoml.noaa.gov/hrd/data_sub/hurdat.html
I am looking for someone to point me in the right direction as I would like to learn how to do this!
Many Thanks in Advance 😀
September 14, 2011 at 9:38 am
Wow, that is nasty! I can give you some general direction, but beyond that the details are going to be blood, sweat, and tears. I would approach this as a multi-step process.
First, create a table in SQL to hold the bulk data, maybe with two fields (Card # and Data as a big nvarchar).
Then parse the bulk data into three separate tables. Figure out what the layout for each record type needs to be, then insert into each of those tables based on the contents of the bulk table:
Insert Header from bulk where ???
Insert Daily from bulk where ???
Insert Trailer from bulk where ???
You will probably have to do this row by row to keep the referential integrity in place.
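A minimal sketch of this staging-then-split pattern. The names, column sizes, and WHERE tests are illustrative assumptions, not taken from the HURDAT spec; the header test just looks for a full date a few characters into the line, and the trailer test is left as a placeholder.
-- Sketch only: one staging table, then type-specific inserts.
CREATE TABLE dbo.BulkStage
(
    CardNumber INT IDENTITY(1, 1) PRIMARY KEY,  -- preserves file order
    RawLine    NVARCHAR(4000)     NOT NULL
);
CREATE TABLE dbo.HeaderStage  (CardNumber INT PRIMARY KEY, RawLine NVARCHAR(4000));
CREATE TABLE dbo.DailyStage   (CardNumber INT PRIMARY KEY, RawLine NVARCHAR(4000));
CREATE TABLE dbo.TrailerStage (CardNumber INT PRIMARY KEY, RawLine NVARCHAR(4000));
-- Header cards: one illustrative test is a full date a few characters in.
INSERT INTO dbo.HeaderStage (CardNumber, RawLine)
SELECT CardNumber, RawLine
FROM dbo.BulkStage
WHERE ISDATE(SUBSTRING(RawLine, 7, 10)) = 1;
-- Trailer cards: placeholder predicate, replace with whatever marks them.
INSERT INTO dbo.TrailerStage (CardNumber, RawLine)
SELECT CardNumber, RawLine
FROM dbo.BulkStage
WHERE 1 = 0;  -- ???
-- Daily cards: whatever is left over.
INSERT INTO dbo.DailyStage (CardNumber, RawLine)
SELECT CardNumber, RawLine
FROM dbo.BulkStage
WHERE CardNumber NOT IN (SELECT CardNumber FROM dbo.HeaderStage)
  AND CardNumber NOT IN (SELECT CardNumber FROM dbo.TrailerStage);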
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter: http://www.sqlservercentral.com/articles/Tally+Table/72993/
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 14, 2011 at 10:07 am
Ugh. Tim Mitchell did something similar in a presentation to skip headers when using SSIS, but you could easily redirect the headers to another table, as Sean suggested.
Here's kind of the reverse, but you might be able to adapt it: http://agilebi.com/jwelch/2008/02/08/adding-headers-and-footers-to-flat-files/
September 14, 2011 at 10:43 am
Sean,
Thanks for getting back to me on this! Row by row could take some time.
Although it contains less information, would it be simpler to use this dataset instead? > http://www.aoml.noaa.gov/hrd/hurdat/easyread-2011.html
Many Thanks
September 14, 2011 at 10:56 am
That certainly looks like it would be easier to parse; however, it doesn't look like it has the trailer info that the first link you posted does.
The reason I say you are going to have to separate this row by row is that you want to keep each daily line in relation to its header. This looks like a huge amount of historical data. Are you doing this import more than once? Row by row would be slow, but for a one-time data import it is something you just have to deal with sometimes. This is not such a huge amount of data that it would be too bad; the total is less than 40k rows. If this is a frequent import, you may need to devise a scheme for keeping the different types of records together.
September 14, 2011 at 11:13 am
All right, this was interesting for me, as I knew I could do it in T-SQL.
Here's a working solution for the header and daily data; I ignored the trailer information for now.
Copy and paste the data to a text file, and this script has got everything: schema definitions, data import, and then creating the parent-child foreign keys.
Getting the data out was actually easy, since the data is fixed width.
The only thing I don't like is that I used a WHILE loop at the bottom to lazily get the reference to the parent; I could not come up with a decent set-based solution to find the right STORMHEADER.CARDNUMBER (one possible set-based approach is sketched after the script).
--DROP TABLE STORMDAILYDATA
--DROP TABLE STORMHEADER
CREATE TABLE [dbo].[STORMHEADER] (
[CARDNUMBER] INT NOT NULL, --Sequential card number starting at 00005 in 1851
[STORMDATE] DATETIME NULL, --MM/DD/Year = Month, Day, and Year of storm
[DAYSOFDATA] INT NULL, --Number of days in which positions are available (note that this also means number of lines to follow of Daily Data and then the one line of the Trailer)
[STORMNUMBER] INT NULL, --Storm number for that particular year (including subtropical storms)
[RUNNINGTOTALSTORMS] INT NULL, --Total# = Storm number since the beginning of the record (since 1851)
[STORMNAME] VARCHAR(50) NULL, --Storms only given official names since 1950
[CAMEASHOREINUS] INT NULL, --'1' = Made landfall (i.e., the center of the cyclone crossed the coast) on the continental United States as a tropical storm or hurricane,
--'0' = did not make a U.S. landfall
[STORMCATEGORY] INT NULL, --'0' = Used to indicate U.S. tropical storm landfall, but this has not been utilized in recent years
--'1' to '5' = Highest Saffir-Simpson Hurricane Scale impact in the United States based upon estimated maximum sustained surface winds produced at the coast. See scale below.
CONSTRAINT [PK__STORMHEADER__0492E426] PRIMARY KEY CLUSTERED (CardNumber))
CREATE TABLE [dbo].[STORMDAILYDATA](
[CARDNUMBER] INT NOT NULL, --Card# = As above.
[STORMHEADERREF] INT REFERENCES [STORMHEADER]([CARDNUMBER]),
[EVENTDATE] VARCHAR(5) NULL, --MM/DD = Month and Day
[STORMSTAGE00Z] CHAR(1) NULL, --'*' (tropical cyclone stage), 'S' (subtropical stage), 'E' (extratropical stage), 'W' (wave stage, rarely used), 'L' (remnant low stage, rarely used)
[LATITUDE00Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[LONGITUDE00Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[INTENSITY00Z] INT NULL,
[PRESSURE00Z] INT NULL,
[STORMSTAGE06Z] CHAR(1) NULL,
[LATITUDE06Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[LONGITUDE06Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[INTENSITY06Z] INT NULL,
[PRESSURE06Z] INT NULL,
[STORMSTAGE12Z] CHAR(1) NULL,
[LATITUDE12Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[LONGITUDE12Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[INTENSITY12Z] INT NULL,
[PRESSURE12Z] INT NULL,
[STORMSTAGE18Z] CHAR(1) NULL,
[LATITUDE18Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[LONGITUDE18Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z
[INTENSITY18Z] INT NULL,
[PRESSURE18Z] INT NULL)
GO
CREATE TABLE BULKDATA(RAWDATA VARCHAR (8000))
BULK INSERT BULKDATA FROM 'c:\Data\StormData.txt'
WITH (
DATAFILETYPE = 'char',
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n ',
FIRSTROW = 1
)
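-- Note: the FIELDTERMINATOR '|' presumably never occurs in the file, so each
-- whole line lands in the single RAWDATA column; the fixed-width fields are
-- then cut out with SUBSTRING in the inserts below.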
--individual data: 14038 row(s) affected
--SELECT SUBSTRING(RAWDATA,7,10),* FROM BULKDATA
INSERT INTO [dbo].[STORMHEADER]
SELECT
SUBSTRING(RAWDATA,1,5) AS CARDNUMBER,
SUBSTRING(RAWDATA,7,10) AS STORMDATE,
SUBSTRING(RAWDATA,20,2) AS DAYSOFDATA,
SUBSTRING(RAWDATA,23,2) AS STORMNUMBER,
SUBSTRING(RAWDATA,32,3) AS RUNNINGTOTALSTORMS,
SUBSTRING(RAWDATA,36,12) AS STORMNAME,
SUBSTRING(RAWDATA,53,1) AS CAMEASHOREINUS,
SUBSTRING(RAWDATA,59,1) AS STORMCATEGORY
FROM BULKDATA
WHERE ISDATE(SUBSTRING(RAWDATA,7,10)) = 1
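-- header cards carry a full MM/DD/Year date at position 7, so ISDATE() returns 1
-- for them; daily cards have only MM/DD there, so the same test returns 0 in the
-- insert below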
INSERT INTO STORMDAILYDATA
SELECT
SUBSTRING(RAWDATA,1,5) AS CARDNUMBER,
NULL AS [STORMHEADERREF],
SUBSTRING(RAWDATA,7,5) AS EVENTDATE,
SUBSTRING(RAWDATA,12,1) AS STORMSTAGE00Z,
SUBSTRING(RAWDATA,13,3) AS LATITUDE00Z,
SUBSTRING(RAWDATA,17,3) AS LONGITUDE00Z,
SUBSTRING(RAWDATA,21,3) AS INTENSITY00Z,
SUBSTRING(RAWDATA,25,4) AS PRESSURE00Z,
SUBSTRING(RAWDATA,29,1) AS STORMSTAGE06Z,
SUBSTRING(RAWDATA,30,3) AS LATITUDE06Z,
SUBSTRING(RAWDATA,34,3) AS LONGITUDE06Z,
SUBSTRING(RAWDATA,38,3) AS INTENSITY06Z,
SUBSTRING(RAWDATA,42,4) AS PRESSURE06Z,
SUBSTRING(RAWDATA,46,1) AS STORMSTAGE12Z,
SUBSTRING(RAWDATA,47,3) AS LATITUDE12Z,
SUBSTRING(RAWDATA,51,3) AS LONGITUDE12Z,
SUBSTRING(RAWDATA,55,3) AS INTENSITY12Z,
SUBSTRING(RAWDATA,59,4) AS PRESSURE12Z,
SUBSTRING(RAWDATA,63,1) AS STORMSTAGE18Z,
SUBSTRING(RAWDATA,64,3) AS LATITUDE18Z,
SUBSTRING(RAWDATA,68,3) AS LONGITUDE18Z,
SUBSTRING(RAWDATA,72,3) AS INTENSITY18Z,
SUBSTRING(RAWDATA,76,4) AS PRESSURE18Z
FROM BULKDATA
WHERE ISDATE(SUBSTRING(RAWDATA,7,10)) = 0
AND DATALENGTH(RAWDATA) = 80
--FK time:
UPDATE STORMDAILYDATA SET STORMHEADERREF=NULL
UPDATE STORMDAILYDATA
SET STORMHEADERREF = STORMHEADER.CARDNUMBER
FROM STORMHEADER
INNER JOIN STORMDAILYDATA
ON STORMDAILYDATA.CARDNUMBER -5 = STORMHEADER.CARDNUMBER
DECLARE @found int
SELECT @found = 1
FROM STORMDAILYDATA
WHERE STORMHEADERREF IS NULL
WHILE ISNULL(@found,0)= 1
BEGIN
UPDATE MyAlias
SET MyAlias.STORMHEADERREF = MYSELF.STORMHEADERREF
FROM STORMDAILYDATA MyAlias
INNER JOIN STORMDAILYDATA MYSELF
ON MyAlias.CARDNUMBER = (MYSELF.CARDNUMBER + 5)
WHERE MYSELF.STORMHEADERREF IS NOT NULL
AND MyAlias.STORMHEADERREF IS NULL
IF NOT EXISTS (SELECT 1
FROM STORMDAILYDATA
WHERE STORMHEADERREF IS NULL)
SET @found = 0
END
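One possible set-based alternative to the WHILE loop above, offered only as a sketch: it assumes card numbers increase down the file and that every daily card is numbered above its own header card but below the next header card.
-- Sketch: for each daily row, pick the highest header card number below it.
UPDATE d
SET    d.STORMHEADERREF = h.HEADERCARD
FROM   STORMDAILYDATA AS d
CROSS APPLY (SELECT MAX(sh.CARDNUMBER) AS HEADERCARD
             FROM   STORMHEADER AS sh
             WHERE  sh.CARDNUMBER < d.CARDNUMBER) AS h
WHERE  d.STORMHEADERREF IS NULL;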
Lowell
September 15, 2011 at 9:28 am
Lowell,
I can't thank you enough for doing this for me, although I am having a slight issue. When I execute the script I am getting the error below:
Msg 4866, Level 16, State 1, Line 3
The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 3
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 3
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
I can only assume something has gone wrong with the length of the bulk data column in the design of the table. If you or anyone else could get back to me, that would be excellent!
Thanks Again
Dave
September 15, 2011 at 9:46 am
I copied and pasted the file into a text document from the web page, so my lines all ended in carriage return + line feed (CHAR(13) + CHAR(10)) because of the enhanced text editor I use (EditPlus).
You'll get that error if you grabbed the file a different way, for example with only a carriage return (\r) as the line terminator.
Without changing anything else, change the line that says
ROWTERMINATOR = '\n',
--CHANGE TO
ROWTERMINATOR = '\r',
I think that will fix your issue.
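If it still fails, a quick diagnostic sketch (assuming the BULKDATA staging table from the script above) is to look for stray carriage-return or line-feed characters in whatever did load; their presence usually means the ROWTERMINATOR does not match the file's actual line endings.
SELECT TOP (10)
       DATALENGTH(RAWDATA)          AS RowLength,
       CHARINDEX(CHAR(13), RAWDATA) AS FirstCR,  -- carriage return
       CHARINDEX(CHAR(10), RAWDATA) AS FirstLF   -- line feed
FROM BULKDATA;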
Lowell
September 15, 2011 at 10:08 am
Okay, that makes sense; I copied the data into WordPad!
Now receiving this error:
Msg 208, Level 16, State 82, Line 1
Invalid object name 'BULKDATA'.
Sorry to keep bugging you with this; it's very much appreciated!
September 15, 2011 at 10:49 am
dave_vicary (9/15/2011)
Okay, that makes sense; I copied the data into WordPad!
Now receiving this error:
Msg 208, Level 16, State 82, Line 1
Invalid object name 'BULKDATA'.
Sorry to keep bugging you with this; it's very much appreciated!
By "Line 1", you are obviously running this one statement at a time instead of as a batch.
The line above the BULK INSERT creates the table, and you must have skipped it:
CREATE TABLE BULKDATA(RAWDATA VARCHAR (8000))
Also, I noticed that in the command I pasted there is a space after the slash-n, which would also cause the error you got previously.
Drop the three tables:
DROP TABLE STORMDAILYDATA
DROP TABLE STORMHEADER
DROP TABLE BULKDATA
and run the script completely, instead of one command at a time.
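As a sketch, the drops can also be guarded so the whole script reruns cleanly even when some of the tables do not exist yet (child table first, because of the foreign key):
IF OBJECT_ID('dbo.STORMDAILYDATA', 'U') IS NOT NULL DROP TABLE dbo.STORMDAILYDATA;
IF OBJECT_ID('dbo.STORMHEADER',    'U') IS NOT NULL DROP TABLE dbo.STORMHEADER;
IF OBJECT_ID('dbo.BULKDATA',       'U') IS NOT NULL DROP TABLE dbo.BULKDATA;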
Lowell
September 16, 2011 at 6:24 am
I've done things similar to this (importing mainframe data into SQL) using SSIS (and Business Objects too, for that matter) and it was very straightforward and easy. Is there a requirement that you do it using SQL rather than a data loading tool?
September 16, 2011 at 7:15 am
ACinKC (9/16/2011)
I've done things similar to this (importing mainframe data into SQL) using SSIS (and Business Objects too, for that matter) and it was very straightforward and easy. Is there a requirement that you do it using SQL rather than a data loading tool?
Nope, it's just the tool I'm personally most familiar with, and it's also more conducive to forum help.
You can copy and paste a testable TSQL solution, but you cannot do the same for SSIS.
Lowell
September 16, 2011 at 7:29 am
Lowell that picture is awesome. It fits perfectly in another thread from a couple days ago. I will have to "steal" your pic and paste it in there.
September 16, 2011 at 7:34 am
Sean Lange (9/16/2011)
Lowell that picture is awesome. It fits perfectly in another thread from a couple days ago. I will have to "steal" your pic and paste it in there.
Kewl! Post the thread, I'll wanna follow it!
Lowell
September 16, 2011 at 7:39 am
The thread itself is not all that interesting but the picture fits 100%.
http://www.sqlservercentral.com/Forums/Topic1171708-338-1.aspx