February 11, 2008 at 5:02 am
Hi, I have problem, how can I download content ie page like txt file. Page is : http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?
I need this page save like txt file on drive, e.g. c:/denni_kurz.txt
Thanks for help.
February 11, 2008 at 11:17 am
much easier to do that from an application.
from sql server 2000, it's difficult to do in TSQL because you must use an object with sp_oacreate to get the text of the web page, and you are limited to 4000 bytes at a time...so if the page is 25K, you need to get 7 "slices" of the web page, one at a time, to get it into sql server.
I did that for fun in a different thread here on SSC, but it required a separate program i wrote to download the web page selected.
In my case, I was building a "scraper", so it got a web page and put it in a table in SQL server.
In order to do this, you'll need to grab the dll from this project and put it in your path or .bin directory for SQL server: the source code, vb6 project, and compiled dll are all in this zip file:
http://www.stormrage.com/blogpix/WebReader.zip
[
this procedure does all the work:
CREATE FUNCTION dbo.GetWebPage (@webpage varchar(1000)='' )
RETURNS @webcontents table
(
rowid smallint IDENTITY(1,1), --Array index
pagetext varchar(4000) --Array element contents
)
AS
BEGIN
DECLARE @hr int,
@webreader int,
@Num4K int,
@i int,
@contents varchar(4000)
IF LTRIM(RTRIM(ISNULL(@webpage,''))) =''
SET @webpage = 'http://www.yahoo.com'
EXEC @hr = sp_OACreate 'WebReader.Reader', @webreader OUT
EXEC @hr = sp_OAMethod @webreader, 'GetWebPageContentsAsString',null,@webpage
EXEC @hr = sp_OAGetProperty @webreader, 'Num4KLengths', @Num4K OUT
SET @i = 1
WHILE @i <=@Num4K
BEGIN
EXEC @hr = sp_OAGetProperty @webreader, 'PageSlice', @contents OUT, @i
INSERT INTO @webcontents(pagetext) VALUES(@contents)
SET @i=@i + 1
End
EXEC @hr = sp_OADestroy @webreader
Return
End
select * from dbo.GetWebPage(default)
Lowell
February 12, 2008 at 3:25 pm
You can do it in an activex script in a DTS package like this:
Function Main()
Set socket = CreateObject("MSXML2.ServerXMLHTTP.4.0")
socket.setTimeouts 300000, 300000, 350000, 350000
socket.Open "GET", "http://www.cnb.cz/cs/financni_trhy/devizovy_trh/kurzy_devizoveho_trhu/denni_kurz.txt?", False
socket.send
Set fso = CreateObject("scripting.filesystemobject")
Set myfile = fso.createtextfile("c:\denni_kurz.txt", True)
myfile.write socket.responsetext
Main = DTSTaskExecResult_Success
End Function
February 13, 2008 at 1:31 am
Thanks to all. I use version of mrpolecat, and that's works fine.
February 13, 2008 at 3:29 am
Please I have one problem yet. I save txt file onto disk, but when I try read this txt file I will get error.
When I try DTS package and bulk insert with row terminator comma and column terminator TAB, i get error Unexpected end of file.
When I try stored procedure and bulk insert :
declare @sql varchar(8000)
SET @sql= 'BULK INSERT #WRK FROM ''c:\denni_kurz.txt' + ''' WITH (CODEPAGE = ''1250'' , FIELDTERMINATOR = ''|'',ROWTERMINATOR ='''') '
CREATE TABLE #WRK (TEXT_STRING text)
EXEC(@SQL)
SELECT * FROM #WRK
I get only first few rows, but not all.
When I try vb.net project with StreamReader, then I read all text.
Where is the problem ?
I don't see anything wrong in this text.
Thanks for help.
February 13, 2008 at 4:30 pm
Come on - we can't do all the work for you. You need to understand the format of your incoming data - first, the text file available currently at that URL is in UTF8 format, so you need to set your codepage to 65001. And I'm not sure why you would be trying a column seperator of TAB and rowterminator of comma when the data is pipe-delimited with standard CRLF row terminators...
Here's a really quick test using the text file available now at the URL - I've made some assumptions as to the column names as I can't read Czech:
-------------------------------------------------------
create table #Test
(Country nvarchar(256),
CurrencyName nvarchar(256),
Units int,
ISOCode char(3),
CurrentValue varchar(256))
bulk insert #Test
from 'c:\denni_kurz.txt'
with
(
FIELDTERMINATOR ='|',
ROWTERMINATOR ='',
CODEPAGE = '65001',
FIRSTROW = 3
)
select
*
from #Test
drop table #Test
-------------------------------------------------------
It imports fine for me. Note however I've used varchar for the CurrentValue (last column) as the amounts are presumably in European decimal style with commas as the decimal seperator. You'd need to do some further manipulation on that column to get it into decimal format - there may be a neat way of doing that with a proper format file.
Regards,
Jacob
June 22, 2008 at 8:05 pm
The url I am accessing requires a user name and the password. How can I pass the user name and the password to down load the file from a url??
June 23, 2008 at 7:08 am
July 24, 2021 at 8:59 pm
Hi Lowell.
Do you still have the file http://www.stormrage.com/blogpix/WebReader.zip available somewhere?
I did not find it on the stormrage website.
Thanks.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply