How can I throttle downloads with URLSaveAsynchronous?

Posted 10 years ago
POSTED BY: Gregory Lypny
4 Replies
Posted 10 years ago
POSTED BY: Gregory Lypny

Something like this should work:

(* Set this to the number of asynchronous downloads you want running at a time *)
tasks = 10;

(* Set these as needed *)
user     = "anonymous";
password = "password";

(* This will be your list of URLs that you need to download. I just used this to test. *)
urls = ConstantArray[ "http://exampledata.wolfram.com/USConstitution.txt", 100 ];

(* Choose a download location *)
SetDirectory @ CreateDirectory[];

(* If you'd like some notification when a download is starting, use something like this:
   alert = Print["downloading: ", #] &; *)

alert = Null &;

elements = { "statuscode", "progress", "error", "headers", "cookies", "data" };
initialStatus = AssociationMap[ {} &, elements ];

store = Append[ #, "status" -> "waiting" ] & /@
  Association[ MapIndexed[
      First @ #2 -> Prepend[ initialStatus, "url" -> #1 ] &,
      urls
  ] ];

callback // ClearAll;
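(* "data" events: a non-empty payload is stored. An empty payload ({ {} }) is
   treated as the end of the download: the entry is marked "finished" and the
   next waiting URL is started, so the number of running downloads stays at
   `tasks`. Nested values are set with store[ key, field ] = ..., the same form
   startDownload uses. *)
callback[ async_, "data", data_ ] :=
  Module[ { key },
      key = "UserData" /. Options @ async;
      If[ data =!= { {} }
          ,
          store[ key, "data" ] = data
          ,
          store[ key, "status" ] = "finished";
          Module[ { nextKey },

              nextKey = SelectFirst[
                  Keys @ store,
                  store[ #, "status" ] === "waiting" &
              ];

              If[ nextKey =!= Missing @ "NotFound",
                  startDownload @ nextKey
              ]
          ]
      ]
  ];

(* All other events ("statuscode", "progress", "error", ...) are recorded
   under their tag for the corresponding URL. *)
callback[ async_, tag_, contents_ ] :=
  Module[ { key },
      key = "UserData" /. Options @ async;
      store[ key, tag ] = contents;
  ];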

startDownload // ClearAll;
startDownload[ i_ ] := (
    store[ i, "status" ] = "initialized";
    alert @ i;
    With[ { url = store[ i, "url" ] },

        URLSaveAsynchronous[ url,
                             ToString @ i <> "_" <> FileNameTake @ url,
                             callback,
                             "UserData" -> i,
                             "Username" -> user,
                             "Password" -> password
        ]
    ]
);

(* Start the downloads *)
startDownload /@ Range @ tasks;

(* View progress *)
Dynamic @ Counts @ store[[ All, "status" ]]
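
The throttling idea in the code above is that each finished download starts the next waiting one, so the number of running downloads never exceeds the initial batch. Here is a network-free toy version of just that bookkeeping. The names jobs, startJob and finishJob are made up for illustration and are not part of the code above:

```wolfram
(* Toy model: five jobs, at most two running at any time *)
jobs = AssociationMap[ "waiting" &, Range @ 5 ];

startJob[ i_ ] := ( jobs[ i ] = "running" );

(* Mark job i as finished, then start the first job still waiting, if any *)
finishJob[ i_ ] :=
  Module[ { next },
      jobs[ i ] = "finished";
      next = SelectFirst[ Keys @ jobs, jobs[ # ] === "waiting" & ];
      If[ next =!= Missing @ "NotFound", startJob @ next ]
  ];

startJob /@ Range @ 2;   (* initial batch of two *)
finishJob @ 1;           (* job 1 completes, so job 3 is started *)

Counts @ Values @ jobs
(* <|"finished" -> 1, "running" -> 2, "waiting" -> 2|> *)
```

In the real code the role of finishJob is played by the "data" callback, and startJob by startDownload.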
POSTED BY: Richard Hennigan

Have you tried just inserting Pause statements?

Typically, I check whether each download succeeded and, if it didn't, I call Pause for a short while before continuing. This is usually enough to prevent problems like this.
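
A minimal sketch of that check-then-pause loop follows. The helper retryWithPause is hypothetical, and the commented usage line assumes that URLSave returns $Failed when a download fails, which you should verify for your server and version:

```wolfram
(* Hypothetical helper: call fetch[] until it returns something other than
   $Failed, pausing `delay` seconds after each failure, for at most
   `maxTries` attempts. Returns the last result. *)
retryWithPause[ fetch_, maxTries_Integer: 3, delay_: 1 ] :=
  Module[ { result = $Failed, tries = 0 },
      While[ result === $Failed && tries < maxTries,
          result = fetch[];
          tries++;
          (* Only pause if another attempt will follow *)
          If[ result === $Failed && tries < maxTries, Pause @ delay ]
      ];
      result
  ];

(* Usage sketch (not run here):
   retryWithPause[
       Quiet @ URLSave[ "http://exampledata.wolfram.com/USConstitution.txt",
                        "USConstitution.txt" ] &,
       5, 2 ]
*)
```

Passing the download as a function keeps the retry logic separate from the particular URL function, so the same loop could wrap URLSave or any other call that signals failure with $Failed.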

POSTED BY: Sean Clarke
Posted 10 years ago

Hi Sean, thank you for responding. I'm not sure how to check whether a download succeeded. I could rig something with FileExistsQ, but I suspect there is a slicker way, such as checking the status of a URL function like URLSaveAsynchronous through its "Progress" option.
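
For reference, the FileExistsQ approach could look something like the sketch below. The name downloadSucceededQ is made up, and the test is crude: it only confirms that a non-empty file was written, so a saved server error page would still pass.

```wolfram
(* Hypothetical check: the file exists and is not empty.
   && short-circuits, so FileByteCount is never called on a missing file. *)
downloadSucceededQ[ file_String ] :=
  FileExistsQ @ file && FileByteCount @ file > 0;

(* Usage sketch: given a list `files` of expected file names (an assumption),
   pick out the ones that still need to be downloaded again:
   Select[ files, ! downloadSucceededQ[ # ] & ]
*)
```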

POSTED BY: Gregory Lypny