How do I import this data?!

  • Hi guys,

    I am struggling to get a data set into SQL (See link below) I have imported from various sources before but I am at a slight loss with this one?!

    I am guessing that I need to create the tables and then do a batch insert some how but I am not sure how I can get SQL to read the dataset correctly

    Below is the dataset in question and description of the fields

    Data > http://www.aoml.noaa.gov/hrd/hurdat/tracks1851to2010_atl_reanal.html

    Fields > http://www.aoml.noaa.gov/hrd/data_sub/hurdat.html

    I am looking for someone to point me in the right direction as I would like to learn how to do this!

    Many Thanks in Advance 😀

  • Wow that is nasty!!! I can give you some general direction but beyond that the details are going to be blood sweat and tears. I would approach this as multi step process.

    First, create a table in sql to hold the bulk data. Maybe have two fields (Card #, and Data as big nvarchar).

    Then you should parse the bulk data into three separate tables. Figure out what the layout for each record type needs to be, then do inserts to each of those tables based on the contents of the bulk table.

    Insert Header from bulk where ???

    Insert Daily from bulk where ???

    insert Trailer from bulk where ???

    You will probably have to do this row by row to keep the RI in place.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Ugh. Tim Mitchell had something similar that he did in a presentation to skip headers when using SSIS, but you could easily redirect the headers to another table, as Sean suggested.

    Here's kind of the reverse, but you might be able to adapt it: http://agilebi.com/jwelch/2008/02/08/adding-headers-and-footers-to-flat-files/

  • Sean,

    Thanks for getting back to me on this! Row-by-row could take some time

    Although there is less information would it be simpler to use this dataset > http://www.aoml.noaa.gov/hrd/hurdat/easyread-2011.html

    Many Thanks

  • That certainly looks like it would be easier to determine how to parse it, however it doesn't look it has the trailer info like you posted from the first link.

    The reason I say you are going to have to separate this row by row is because you want to keep the daily line in relation to the header. This looks like a huge amount of historical data. Are you doing this import more than once? Row by row would be slow but for a one time data import it is something you just have to deal with sometimes. This is not such a huge amount of data that it would be too bad. Total is less than 40k rows. If this is a frequent import you may need to devise a scheme of keeping the different types of records together.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • all right, this was interesting for me, as I know i could do it in TSQL.

    here's a working solution for header and data...i ignored the footer information for now.

    copy and paste the data to a text file, and this has got everything: schema definitions, data import, and then creating the parent child foreign keys.

    getting the data was actually easy, since it was fixed width results in the data.

    the only thing i don't like is i used a WHILE loop at the bottom to lazily get the reference to the parent...i could not come up with a decent set based solution to find the right stormheader.cardnumber.

    --DROP TABLE STORMHEADER DROP TABLE STORMDAILYDATA

    CREATE TABLE [dbo].[STORMHEADER] (

    [CARDNUMBER] INT NOT NULL, --Sequential card number starting at 00005 in 1851

    [STORMDATE] DATETIME NULL, --MM/DD/Year = Month, Day, and Year of storm

    [DAYSOFDATA] INT NULL, --Number of days in which positions are available (note that this also means number of lines to follow of Daily Data and then the one line of the Trailer)

    [STORMNUMBER] INT NULL, --Storm number for that particular year (including subtropical storms)

    [RUNNINGTOTALSTORMS] INT NULL, --Total# = Storm number since the beginning of the record (since 1851)

    [STORMNAME] VARCHAR(50) NULL, --Storms only given official names since 1950

    [CAMEASHOREINUS] INT NULL, --'1' = Made landfall (i.e., the center of the cyclone crossed the coast) on the continental United States as a tropical storm or hurricane,

    --'0' = did not make a U.S. landfall

    [STORMCATEGORY] INT NULL, --'0' = Used to indicate U.S. tropical storm landfall, but this has not been utilized in recent years

    --'1' to '5' = Highest Saffir-Simpson Hurricane Scale impact in the United States based upon extimated maximum sustained surface winds produced at the coast. See scale below.

    CONSTRAINT [PK__STORMHEADER__0492E426] PRIMARY KEY CLUSTERED (CardNumber))

    CREATE TABLE [dbo].[STORMDAILYDATA](

    [CARDNUMBER] INT NOT NULL, --Card# = As above.

    [STORMHEADERREF] INT REFERENCES [STORMHEADER]([CARDNUMBER]),

    [EVENTDATE] VARCHAR(5) NULL, --MM/DD = Month and Day

    [STORMSTAGE00Z] CHAR(1) NULL,--'*' (tropical cyclone stage), 'S' (Subtropical stage) 'E' (extratropical stage) 'W' (wave stage - rarely used) 'L' (remanent Low stage - rarely used)

    [LATITUDE00Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [LONGITUDE00Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [INTENSITY00Z] INT NULL,

    [PRESSURE00Z] INT NULL,

    [STORMSTAGE06Z] CHAR(1) NULL,

    [LATITUDE06Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [LONGITUDE06Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [INTENSITY06Z] INT NULL,

    [PRESSURE06Z] INT NULL,

    [STORMSTAGE12Z] CHAR(1) NULL,

    [LATITUDE12Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [LONGITUDE12Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [INTENSITY12Z] INT NULL,

    [PRESSURE12Z] INT NULL,

    [STORMSTAGE18Z] CHAR(1) NULL,

    [LATITUDE18Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [LONGITUDE18Z] INT NULL, --Positions and intensities are at 00Z, 06Z, 12Z, 18Z

    [INTENSITY18Z] INT NULL,

    [PRESSURE18Z] INT NULL)

    GO

    CREATE TABLE BULKDATA(RAWDATA VARCHAR (8000))

    BULK INSERT BULKDATA FROM 'c:\Data\StormData.txt'

    WITH (

    DATAFILETYPE = 'char',

    FIELDTERMINATOR = '|',

    ROWTERMINATOR = '\n ',

    FIRSTROW = 1

    )

    --individual data. 14038 rows(s) affected

    --SELECT SUBSTRING(RAWDATA,7,10),* FROM BULKDATA

    INSERT INTO [dbo].[STORMHEADER]

    SELECT

    SUBSTRING(RAWDATA,1,5) AS CARDNUMBER,

    SUBSTRING(RAWDATA,7,10) AS STORMDATE,

    SUBSTRING(RAWDATA,20,2) AS DAYSOFDATA,

    SUBSTRING(RAWDATA,23,2) AS STORMNUMBER,

    SUBSTRING(RAWDATA,32,3) AS RUNNINGTOTALSTORMS,

    SUBSTRING(RAWDATA,36,12) AS STORMNAME,

    SUBSTRING(RAWDATA,53,1) AS CAMEASHOREINUS,

    SUBSTRING(RAWDATA,59,1) AS STORMCATEGORY

    FROM BULKDATA

    WHERE ISDATE(SUBSTRING(RAWDATA,7,10)) = 1

    INSERT INTO STORMDAILYDATA

    SELECT

    SUBSTRING(RAWDATA,1,5) AS CARDNUMBER,

    NULL AS [STORMHEADERREF],

    SUBSTRING(RAWDATA,7,5) AS EVENTDATE,

    SUBSTRING(RAWDATA,12,1) AS STORMSTAGE00Z,

    SUBSTRING(RAWDATA,13,3) AS LATITUDE00Z,

    SUBSTRING(RAWDATA,17,3) AS LONGITUDE00Z,

    SUBSTRING(RAWDATA,21,3) AS INTENSITY00Z,

    SUBSTRING(RAWDATA,25,4) AS PRESSURE00Z,

    SUBSTRING(RAWDATA,29,1) AS STORMSTAGE06Z,

    SUBSTRING(RAWDATA,30,3) AS LATITUDE06Z,

    SUBSTRING(RAWDATA,34,3) AS LONGITUDE06Z,

    SUBSTRING(RAWDATA,38,3) AS INTENSITY06Z,

    SUBSTRING(RAWDATA,42,4) AS PRESSURE06Z,

    SUBSTRING(RAWDATA,46,1) AS STORMSTAGE120Z,

    SUBSTRING(RAWDATA,47,3) AS LATITUDE12Z,

    SUBSTRING(RAWDATA,51,3) AS LONGITUDE12Z,

    SUBSTRING(RAWDATA,55,3) AS INTENSITY12Z,

    SUBSTRING(RAWDATA,59,4) AS PRESSURE12Z,

    SUBSTRING(RAWDATA,63,1) AS STORMSTAGE18Z,

    SUBSTRING(RAWDATA,64,3) AS LATITUDE18Z,

    SUBSTRING(RAWDATA,68,3) AS LONGITUDE18Z,

    SUBSTRING(RAWDATA,72,3) AS INTENSITY18Z,

    SUBSTRING(RAWDATA,76,4) AS PRESSURE18Z

    FROM BULKDATA

    WHERE ISDATE(SUBSTRING(RAWDATA,7,10)) = 0

    AND DATALENGTH(RAWDATA) = 80

    --FK time:

    UPDATE STORMDAILYDATA SET STORMHEADERREF=NULL

    UPDATE STORMDAILYDATA

    SET STORMHEADERREF = STORMHEADER.CARDNUMBER

    FROM STORMHEADER

    INNER JOIN STORMDAILYDATA

    ON STORMDAILYDATA.CARDNUMBER -5 = STORMHEADER.CARDNUMBER

    DECLARE @found int

    SELECT @found = 1

    FROM STORMDAILYDATA

    WHERE STORMHEADERREF IS NULL

    WHILE ISNULL(@found,0)= 1

    BEGIN

    UPDATE MyAlias

    SET MyAlias.STORMHEADERREF = MYSELF.STORMHEADERREF

    FROM STORMDAILYDATA MyAlias

    INNER JOIN STORMDAILYDATA MYSELF

    ON MyAlias.CARDNUMBER = (MYSELF.CARDNUMBER + 5)

    WHERE MYSELF.STORMHEADERREF IS NOT NULL

    AND MyAlias.STORMHEADERREF IS NULL

    IF NOT EXISTS (SELECT 1

    FROM STORMDAILYDATA

    WHERE STORMHEADERREF IS NULL)

    SET @found = 0

    END

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell,

    I cant thankyou enough for doing this for me, although I am having a slight issue. When I execute this procedure I am getting the below error:

    Msg 4866, Level 16, State 1, Line 3

    The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.

    Msg 7399, Level 16, State 1, Line 3

    The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.

    Msg 7330, Level 16, State 2, Line 3

    Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

    I can only assume this has something has gone wrong with the length of the bulk data column int he design of the table?! If you or anyone else could get back to me that would be excellent!

    Thanks Again

    Dave

  • i copy and pasted the file into a text document from the web page, so my lines all ended in Carriage Return Line feeds(CHAR(13) + CHAR(10) because of the enhanced text editor i use (EditPlus)

    you'll get that error if you are getting the file a different way, say UNIX style with only \r for CR? as the line terminator.

    without changing anything else, change the two lines that say

    ROWTERMINATOR = '\n',

    --CHANGE TO

    ROWTERMINATOR = '\r',

    i think that will fix your issue.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Okay that makes sense, I copied the data into wordpad!

    Now receiving this error:

    Msg 208, Level 16, State 82, Line 1

    Invalid object name 'BULKDATA'.

    Sorry to keep bugging you with this, its very much appreciated!

  • dave_vicary (9/15/2011)


    Okay that makes sense, I copied the data into wordpad!

    Now receiving this error:

    Msg 208, Level 16, State 82, Line 1Invalid object name 'BULKDATA'.

    Sorry to keep bugging you with this, its very much appreciated!

    by line 1, you are obviously running this one statement at a time instead of as a batch.

    the line above the BULK insert creates the table, and you must have skipped it:

    CREATE TABLE BULKDATA(RAWDATA VARCHAR (8000))

    also, i noticed that in the command i pasted, there is a space after the slash-n, which would also raise that error you got previously.

    drop the three tables

    DROP TABLE STORMHEADER

    DROP TABLE STORMDAILYDATA

    DROP TABLE BULKDATA

    and run then script completely, isntead of one command at a time.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I've done things similar to this (importing mainframe data into SQL) using SSIS (and Business Objects too, for that matter) and it was very straightforward and easy. Is there a requirement that you do it using SQL rather than a data loading tool?

  • ACinKC (9/16/2011)


    I've done things similar to this (importing mainframe data into SQL) using SSIS (and Business Objects too, for that matter) and it was very straightforward and easy. Is there a requirement that you do it using SQL rather than a data loading tool?

    Nope, it's just the tool I'm personally most familiar with,and also more conducive to forum help.

    You can copy and paste a testable TSQL solution, but you cannot do the same for SSIS.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell that picture is awesome. It fits perfectly in another thread from a couple days ago. I will have to "steal" your pic and paste it in there.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (9/16/2011)


    Lowell that picture is awesome. It fits perfectly in another thread from a couple days ago. I will have to "steal" your pic and paste it in there.

    kewl! post the thread, i'll wanna follow it!

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • The thread itself is not all that interesting but the picture fits 100%.

    http://www.sqlservercentral.com/Forums/Topic1171708-338-1.aspx

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply