Trouble with importing a book from Gutenberg by URL in Cloud

Posted 10 years ago

I want to import the plain text of a book and assign it to a variable. I am doing this in the Wolfram Programming Cloud:

LesMiserables = Import["http://www.gutenberg.org/cache/epub/135/pg135.txt"];

When I press Shift+Enter, the output is as follows:

The request to URL http://www.gutenberg.org/cache/epub/135/pg135.txt was not successful. The server returned the HTTP status code 403 ("Forbidden")

I have tried importing text from other websites, but the error is always the same, so I assume the problem is not with the website but with the code (or the Cloud).

Could anyone help?

POSTED BY: Dmitry Kazakov

Hi,

I am not sure whether this will help at all, but if you try

lesMiserables=URLFetch["http://www.gutenberg.org/cache/epub/135/pg135.txt"]

you get

<!DOCTYPE HTML><html><head><title>Error 403</title></head><body>

Error 403

Maybe you have just a wrong url. Go to http://www.gutenberg.org/ebooks/ first to see if the error persists.

If you get the error again check that you:

  • Don't use anonymizers, open proxies, VPNs, or TOR to access Project Gutenberg. This includes the Google proxies that are used by Chrome.
  • Don't access Project Gutenberg from hosted servers.
  • Don't use automated software to download lots of books. We have a limit on how fast you can go while using this site. If you surpass this limit you get blocked for 24h.
  • We have a daily limit on how many books you can download. If you exceeded this limit you get locked for 24h.
  • If you use the RSS feed, set your update interval to 24 hours.

If you are sure that none of the above applies to you, and wish us to investigate the problem, we need to know your IP address. Go to this site, don't sign up, just copy the IP address (it looks like: 12.34.56.78 but your numbers will be different) and mail it to us. If that page also shows a proxy address, we need that one too.

</body></html>
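Note that in this case the variable ends up holding the HTML error page rather than the book. One way to make that explicit is to ask for the status code together with the content. This is only a minimal sketch, assuming that URLFetch accepts a list of elements such as "StatusCode" and "Content"; the helper names textOrFailure and fetchText are hypothetical and not part of the posts above.

```wolfram
(* Hypothetical helpers, for illustration only. *)

(* Keep the body only when the server answered 200 (OK). *)
textOrFailure[{200, body_String}] := body

(* Any other status code: return a Failure instead of the HTML error page. *)
textOrFailure[{status_Integer, _}] :=
  Failure["HTTPError", <|"StatusCode" -> status|>]

(* Anything else, e.g. $Failed when the request itself did not go through. *)
textOrFailure[_] := Failure["RequestFailed", <||>]

(* Assumption: URLFetch returns {statusCode, content} for this element list. *)
fetchText[url_String] :=
  textOrFailure[URLFetch[url, {"StatusCode", "Content"}]]

lesMiserables = fetchText["http://www.gutenberg.org/cache/epub/135/pg135.txt"]
```

From the Cloud this would presumably give a Failure carrying status code 403 rather than a string, so later code does not mistake the error page for the text of the book.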

Now if you are on the Cloud and try to find out your IP address

= my IP

(use the single equals sign, not the double one), you get something like 54.209.144.183
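The free-form query can probably also be written as an ordinary function call. This is a rough sketch, assuming the WolframAlpha function with the "Result" format returns the same answer as the free-form input; the variable name cloudIP is hypothetical.

```wolfram
(* Assumption: this mirrors the "= my IP" free-form input used above. *)
(* The form of the returned value may differ from the free-form output. *)
cloudIP = WolframAlpha["my IP", "Result"]
```

When evaluated in the Cloud, the address reported would be that of the cloud server making the request, not that of your own computer.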

If you run a whois on it you will see something like

Amazon Technologies Inc. AMAZON-2011L

That is a hosted server, so it might be that Gutenberg simply blocks it. That's just a guess.

Cheers, M.

POSTED BY: Marco Thiel