Hi John, there shouldn't be any manual action needed between submitting jobs.
How long have you tried waiting with a job in the Runnable status? I've observed that the AWS Batch scheduler (on Amazon's side) can be a bit unpredictable in its latency. I've seen jobs occasionally take up to 20-30 minutes to transition from Runnable to Starting. If you haven't tried waiting that long already, I suggest doing so once to see if that's the problem.
What are the vCPU counts of the two jobs? If the second's is greater than the first's, AWS Batch may need to launch a new instance for the second job instead of reusing the first job's instance. If the sum of the vCPU counts of the two instances is greater than either the stack's vCPU limit or your account quota, AWS Batch will have to terminate the first instance before the second can be launched. This can delay things, as AWS Batch tends to wait 10 minutes or so after a job ends before it terminates the host instance.
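To make the vCPU arithmetic concrete, here is a rough sketch of that check. The function name, parameters, and numbers are all made up for illustration; the real values would come from your stack's configuration and your account's quota, and AWS Batch's actual scaling logic is more involved than this.

```python
def must_terminate_first(first_instance_vcpus, second_instance_vcpus,
                         stack_max_vcpus, account_quota_vcpus):
    """Return True if both instances cannot exist at the same time.

    Hypothetical helper: it only compares the combined vCPU count against
    the tighter of the two limits described above.
    """
    # Whichever limit is lower is the one that actually constrains you.
    effective_limit = min(stack_max_vcpus, account_quota_vcpus)
    return first_instance_vcpus + second_instance_vcpus > effective_limit


# Illustrative numbers only: a 16-vCPU instance followed by a 32-vCPU one,
# with a stack limit of 32 vCPUs and an account quota of 64 vCPUs.
print(must_terminate_first(16, 32, stack_max_vcpus=32, account_quota_vcpus=64))
```

If this returns True for your two jobs, the second job is likely waiting for the first instance to be terminated.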
After terminating a job and submitting a new one, you could check the old job's status to confirm that it actually has transitioned to Failed - this has to happen before the new job can get scheduled to an instance. You can also use the AWS Batch console to view status and various properties of jobs and job queues.
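If you'd rather check the old job's status from a script than from the console, something like the following should work. It's a minimal sketch, assuming you have boto3 installed and AWS credentials configured; `get_job_status`, `wait_until_terminal`, the polling interval, the timeout, and the job ID are my own placeholders, not part of any existing tool.

```python
import time

# AWS Batch reports these two states once a job has finished, one way or another.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def get_job_status(batch_client, job_id):
    """Return the current status string of one job, e.g. 'RUNNABLE' or 'FAILED'."""
    response = batch_client.describe_jobs(jobs=[job_id])
    jobs = response.get("jobs", [])
    if not jobs:
        raise LookupError(f"No job found with ID {job_id!r}")
    return jobs[0]["status"]


def wait_until_terminal(batch_client, job_id, poll_seconds=30,
                        timeout_seconds=1800, sleep=time.sleep):
    """Poll until the job is SUCCEEDED or FAILED, then return that status.

    Raises TimeoutError if the job is still not finished after roughly
    timeout_seconds of waiting (the default of 30 minutes is an assumption).
    """
    waited = 0
    while True:
        status = get_job_status(batch_client, job_id)
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Job {job_id!r} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Example usage (requires boto3 and credentials; the job ID is a placeholder):
#   import boto3
#   batch = boto3.client("batch")
#   print(wait_until_terminal(batch, "your-old-job-id"))
```

Once this returns FAILED for the terminated job, the new job should be free to get scheduled.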