Downloading files for Import?

  • jasona.work (9/12/2012)


    Have you considered using Powershell? I'm presuming you have it on the server, so you could craft a Powershell script to grab the file(s), then unzip them. Once they're unzipped, you could then import them into SQL using whatever method you prefer.

    Below is a series of functions for Powershell for FTPing files around:

    http://gallery.technet.microsoft.com/scriptcenter/PowerShell-FTP-Client-db6fe0cb

    And this has some code for zipping and unzipping files in Powershell:

    http://blogs.msdn.com/b/daiken/archive/2007/02/12/compress-files-with-windows-powershell-then-package-a-windows-vista-sidebar-gadget.aspx%5B/quote%5D

    Let's make your references easier for others, okay?

    http://gallery.technet.microsoft.com/scriptcenter/PowerShell-FTP-Client-db6fe0cb

    http://blogs.msdn.com/b/daiken/archive/2007/02/12/compress-files-with-windows-powershell-then-package-a-windows-vista-sidebar-gadget.aspx

  • jasona.work (9/12/2012)


    Have you considered using Powershell? I'm presuming you have it on the server, so you could craft a Powershell script to grab the file(s), then unzip them. Once they're unzipped, you could then import them into SQL using whatever method you prefer.

    Below is a series of functions for Powershell for FTPing files around:

    http://gallery.technet.microsoft.com/scriptcenter/PowerShell-FTP-Client-db6fe0cb

    And this has some code for zipping and unzipping files in Powershell:

    http://blogs.msdn.com/b/daiken/archive/2007/02/12/compress-files-with-windows-powershell-then-package-a-windows-vista-sidebar-gadget.aspx%5B/quote%5D

    Thanks jasona(?). I'll take a look at it.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • GSquared (9/12/2012)


    RBarryYoung (9/11/2012)


    GSquared (9/11/2012)


    SSIS has an FTP object that can be used to automatically download the files. Won't work if it's FTPS, but will for FTP. Will that work for you? It can also call the shell and run an unzip command as if it were the command-line.

    I have a process that outputs, encrypts (via command-line), and then FTP uploads a bunch of files every night. Do the whole thing in a pretty simple SSIS package. Not the same direction as what you're doing, and encryption instead of unzipping, but it's just the same actions in the oposite direction.

    Thanks, Gus. I really don't want to jump this into SSIS for this unless its dead easy and reliable. This site does not already use SSIS, so I would have to go through the whole deal of setting it up for them, along with standards for logging, reporting, etc. Plus, I already have the SQL import procedures and structures almost done.

    I have been looking around today and it does seem like there are some much better command-line tools for downloading today than there were 10 years ago. Some (like CoreFTP) are native Windows (and not a crummy Unix port) and will do both the download and the unzip at the same time, all from a single DOS command line (avoiding all of that awful FTP conversation-script dreck from years ago). Plus they have a free version, so that's the way I'm leaning at the moment.

    I use CoreFTP all the time. Great tool.

    Edit: What problems are you running into with it?

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/11/2012)


    GSquared (9/11/2012)


    SSIS has an FTP object that can be used to automatically download the files. Won't work if it's FTPS, but will for FTP. Will that work for you? It can also call the shell and run an unzip command as if it were the command-line.

    I have a process that outputs, encrypts (via command-line), and then FTP uploads a bunch of files every night. Do the whole thing in a pretty simple SSIS package. Not the same direction as what you're doing, and encryption instead of unzipping, but it's just the same actions in the oposite direction.

    Thanks, Gus. I really don't want to jump this into SSIS for this unless its dead easy and reliable. This site does not already use SSIS, so I would have to go through the whole deal of setting it up for them, along with standards for logging, reporting, etc. Plus, I already have the SQL import procedures and structures almost done.

    I have been looking around today and it does seem like there are some much better command-line tools for downloading today than there were 10 years ago. Some (like CoreFTP) are native Windows (and not a crummy Unix port) and will do both the download and the unzip at the same time, all from a single DOS command line (avoiding all of that awful FTP conversation-script dreck from years ago). Plus they have a free version, so that's the way I'm leaning at the moment.

    I use CoreFTP all the time. Great tool.

    Edit: What problems are you running into with it?

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    RBarry, your link is formatted wrong in your post.

    You've got it going to "http://www.sqlservercentral.com/Forums/www.irs.gov/pub/irs-soi" rather than http://www.irs.gov/pub/irs-soi/[/url] When I try going to it, it does eventually come up with a list of files, it just takes a LONG time.

  • Barry,

    I noticed you can do HTTP downloads from this site, but can you do FTP downloads from this site? I was able to FTP connect to http://ftp.irs.gov, but I don't have a user/password to log in.

    One other question, is the name of the file to download always the same or is it dynamic?

    "El" Jerry.

    "A watt of Ottawa" - Gerardo Galvan

    To better understand your help request, please follow these best practices.[/url]

  • RBarryYoung (9/12/2012)


    Brandie Tarvin (9/12/2012)


    At my employer, we have a couple of methods, depending on site in question. One method was written up by our developers and has a webservice call that goes out every day searching for the files. This method uses a table in SQL Server that we put in the file names, the host FTP server, and when we expect these files to be available.

    ...

    Hmmm, I like the sound of that. Any possibilty that I could get a copy of the code??? .... :-D:-D:-D

    I'll talk to the developer. It's his code, so I can't share without permission.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    That's what I tried to do. I just clicked the link in your post and got the 403 error. Wasn't trying to connect with an FTP client or anything. IE 9.0.8112.16421, if that matters.

    Edit: Occurs to me that you might have a cookie or something on your computer, supplying the login data that way. That would explain the difference.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The same 403 error when you browse to it in Comodo Dragon, so I suspect Gus is right and you have a cookie already on your machine.

    I was going to suggest CoreFTP, because like you I've yet to actually sit down and play with Powershell. Having had that suggestion already put there though, I suspect I'd go with Powershell just for the excuse/motivation to get started with it 😀


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • GSquared (9/13/2012)


    RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    That's what I tried to do. I just clicked the link in your post and got the 403 error.

    Gus, the link in his post was wrong initially. I couldn't even reach the website at all until I tried the "corrected" link later posted. Try again.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • GSquared (9/13/2012)


    RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    That's what I tried to do. I just clicked the link in your post and got the 403 error. Wasn't trying to connect with an FTP client or anything. IE 9.0.8112.16421, if that matters.

    Edit: Occurs to me that you might have a cookie or something on your computer, supplying the login data that way. That would explain the difference.

    I don't think that could be it. I've never logged into any IRS sites, plus it works in all of my browsers and they have different cookie caches. Are you using HTTP in the Browser? I am pretty sure that they do not allow FTP to work there. Also, note the previous post about my original link being bad (that happens to me a lot here, for some reason): the correct one is http://www.irs.gov/pub/irs-soi/ (type it in by hand if necessary).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • EL Jerry (9/12/2012)


    Barry,

    I noticed you can do HTTP downloads from this site, but can you do FTP downloads from this site? I was able to FTP connect to http://ftp.irs.gov, but I don't have a user/password to log in.

    I think that it's HTTP only, but that is what I am trying to do.

    One other question, is the name of the file to download always the same or is it dynamic?

    The files have fixed names, but I also need to get the "Last Modified" date/time from the directory listing so that I can tell if the IRS has updated any of them in the previous 24 hours.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (9/13/2012)


    GSquared (9/13/2012)


    RBarryYoung (9/12/2012)


    GSquared (9/12/2012)


    RBarryYoung (9/12/2012)


    ...

    Well, I may have over-reacted. I am trying to an IRS download directory (www.irs.gov/pub/irs-soi) but a I cannot get a connection. I don't think it has FTP there, but I cannot get CoreFTP to connect even with HTTP.

    I just tried your link, but it gave a 403 error and said a login is required.

    Right, but if you go there with a Browser, it works fine with no login or password. That's what I can't figure out???

    That's what I tried to do. I just clicked the link in your post and got the 403 error. Wasn't trying to connect with an FTP client or anything. IE 9.0.8112.16421, if that matters.

    Edit: Occurs to me that you might have a cookie or something on your computer, supplying the login data that way. That would explain the difference.

    I don't think that could be it. I've never logged into any IRS sites, plus it works in all of my browsers and they have different cookie caches. Are you using HTTP in the Browser? I am pretty sure that they do not allow FTP to work there. Also, note the previous post about my original link being bad (that happens to me a lot here, for some reason): the correct one is http://www.irs.gov/pub/irs-soi/ (type it in by hand if necessary).

    Yep, the broken link was the problem. Didn't notice you had corrected it is all.

    I don't think CoreFTP will connect to HTTP sites. I could be wrong, but I don't think that's what it's for.

    I think what I'd end up having to do in this case is use a .NET object to parse the HTML of the page, grab the table (easy enough if I'm not mistaken), dump that into a database table, compare for deltas (new rows or old rows with new dates), and then download via HTTPWebRequest/Response code.

    I'm not sure I could actually do that myself, since my .NET what little skill I used to have with .NET is rusty to the point of being a pile of red dust on the ground. More like a paint pigment FeO4, than anything else. 😛

    So, I know it's possible, but where to start or whether it's even a good idea, is beyond me. Sorry.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply