Script to download nested PDF books from a WEB Page

Posted 1 year ago

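This is a web scraping and web crawling function.

This short function downloads all the PDF book files linked from a main web page and saves them locally, one file at a time. The PDFs are not linked from the main page itself: each book has its own second-level page, so the function follows every link on the main page and keeps the PDF links it finds there: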

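getBooks[bookURL_String] :=
 URLDownload[
  (* keep only the links that point to PDF files *)
  Select[Flatten[
    (* second level: hyperlinks on each page linked from the main page *)
    Import[#, "Hyperlinks"] & /@ 
     Import[bookURL, "Hyperlinks"]], 
   StringContainsQ[#, ".pdf"] &], 
  "~/Downloads/Books", 
  CreateIntermediateDirectories -> True];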


getBooks["https://books.goalkicker.com/"]
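
The crawl described above can fail part-way if one of the first-level links cannot be imported, and the same PDF may be linked more than once. Here is a minimal sketch of a more defensive variant, not the original author's code. The names `pdfLinkQ`, `pageLinks` and `getBooksSafe` are hypothetical, and the sketch assumes that a failed `Import` returns something other than a list (typically `$Failed`).

```wolfram
(* Hypothetical helper: True only for strings ending in ".pdf", in any letter case. *)
pdfLinkQ[link_String] := StringEndsQ[link, ".pdf", IgnoreCase -> True]
pdfLinkQ[_] := False

(* Hypothetical helper: hyperlinks of one page, or {} if the import fails. *)
pageLinks[url_String] :=
 Replace[Quiet @ Import[url, "Hyperlinks"], Except[_List] -> {}]

(* Same two-level crawl as getBooks, with duplicates removed at both levels. *)
getBooksSafe[bookURL_String, dir_String : "~/Downloads/Books"] :=
 Module[{bookPages, pdfLinks},
  bookPages = DeleteDuplicates @ pageLinks[bookURL];
  pdfLinks = DeleteDuplicates @ Select[Flatten[pageLinks /@ bookPages], pdfLinkQ];
  URLDownload[pdfLinks, dir, CreateIntermediateDirectories -> True]
 ]
```

It is called the same way, for example `getBooksSafe["https://books.goalkicker.com/"]`. Matching on the end of the string is stricter than `StringContainsQ`: it skips pages that merely mention ".pdf" in the URL, but it would also miss PDF links that carry a query string after the extension.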
3 Replies

The 1st line should be:

getBooks[bookURL_String] :=

just a copy/paste error.
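
The mismatch matters because Wolfram Language symbol names are case-sensitive, so `bookUrl` and `bookURL` are two different symbols. A small sketch with hypothetical functions `broken` and `fixed` (no network access needed) shows the effect:

```wolfram
(* The parameter is bookUrl, but the body uses bookURL, an unrelated unbound symbol. *)
broken[bookUrl_] := {bookURL}

(* Parameter and body use the same name; _String also rejects non-string arguments. *)
fixed[bookURL_String] := {bookURL}

broken["https://books.goalkicker.com/"]  (* {bookURL}: the argument never reaches the body *)
fixed["https://books.goalkicker.com/"]   (* {"https://books.goalkicker.com/"} *)
fixed[42]                                (* stays unevaluated as fixed[42] *)
```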

It is funny because it is getBooks[bookUrl_]:= at edit mode... gets this way when published...

Better now!!
