GROUPS: External Programs and Systems, Wolfram Language, Wolfram Cloud

Posted 4 years ago

I have been using RemoteBatchSubmit on AWS. Occasionally I have to kill a job using RemoteBatchJobAbort. Whenever I do this, I can't get AWS to accept a follow-up job; the new job just sits in the Runnable status. To overcome the problem I've had to delete the batch stack and generate a new one using the Wolfram template. Maybe I haven't waited long enough before trying a new job after aborting a calculation? Is there some way to manually reset things on AWS to prepare it for a new job, or do I just need to be more patient and wait longer before submitting my next project?

POSTED BY: John Snyder

Posted 4 years ago

Hi John, there shouldn't be any manual action needed between submitting jobs. How long have you tried waiting with a job in the Runnable status? I've observed that the AWS Batch scheduler (on Amazon's side) can be a bit unpredictable in its latency. I've seen jobs occasionally take up to 20-30 minutes to transition from `Runnable` to `Starting`. If you haven't tried waiting that long already, I suggest doing so once to see if that's the problem.

What are the vCPU counts of the two jobs? If the second's is greater than the first's, AWS Batch may need to launch a new instance for the second job instead of reusing the first job's instance. If the sum of the vCPU counts of the two instances is greater than either the stack's vCPU limit or your account quota, AWS Batch will have to terminate the first instance before the second can be launched. This can delay things, as AWS Batch tends to wait 10 minutes or so after a job ends before it terminates the host instance.

After terminating a job and submitting a new one, you could check the old job's status to confirm that it actually has transitioned to `Failed` - this has to happen before the new job can get scheduled to an instance. You can also use the AWS Batch console to view status and various properties of jobs and job queues.
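
To make the vCPU reasoning above concrete, here is a minimal Python sketch of the three cases. It is only an illustration, not how the AWS Batch scheduler is actually implemented: the function name, the labels it returns, and the single `vcpu_cap` parameter (standing in for whichever is smaller, the stack's vCPU limit or the account quota) are all hypothetical. It also assumes each instance is sized exactly to its job's vCPU count, which real instance types only approximate.

```python
def placement_for_second_job(first_vcpus: int, second_vcpus: int, vcpu_cap: int) -> str:
    """Classify how a follow-up job is likely to be placed.

    Hypothetical helper for illustration only.
    first_vcpus  -- vCPUs of the aborted job (assumed equal to its instance size)
    second_vcpus -- vCPUs requested by the follow-up job
    vcpu_cap     -- assumed effective cap: min(stack vCPU limit, account quota)
    """
    if second_vcpus > vcpu_cap:
        # Extra case not discussed in the post: the job can never fit under the cap.
        return "exceeds_limit"
    if second_vcpus <= first_vcpus:
        # The first job's instance is big enough to be reused once that job has ended.
        return "reuse_instance"
    if first_vcpus + second_vcpus <= vcpu_cap:
        # Both instances fit under the cap, so a second one can be launched right away.
        return "launch_new_instance"
    # The old instance must be terminated first, which may add a delay of 10 minutes or so.
    return "wait_for_termination"


if __name__ == "__main__":
    # Example: an 8-vCPU job followed by a 16-vCPU job under a 16-vCPU cap.
    print(placement_for_second_job(8, 16, 16))
```

If the result for your two jobs is the last case, a long stay in `Runnable` after an abort would be expected rather than a sign that the stack needs to be rebuilt.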

POSTED BY: Jesse Friedman
