
"GPU" issue during Out-of-Core Training?

Posted 4 years ago

I took the example "Out-of-Core Training on MNIST" from

https://reference.wolfram.com/language/tutorial/NeuralNetworksLargeDatasets.html

The only code element I have modified is

results = 
 NetTrain[lenet, trainingDataFiles, All, 
  ValidationSet -> testDataFiles, MaxTrainingRounds -> 3, 
  TargetDevice -> "GPU"]

So I only added TargetDevice -> "GPU", which normally causes no problems. In this example, however, I get the following message:

[Image: screenshot of the error message]

The Stack Trace for NetTrain::interr2

[Image: screenshot of the stack trace]

The issue does not occur with "CPU" as TargetDevice. How can I solve the problem?

POSTED BY: Jürgen Kanz
7 Replies

OK. In any case, the error for a weird GPU state after sleep mode should be: [Image: screenshot of the expected error message]

At least you don't have any error now.

If it happens again, check the value of Internal`$LastInternalFailure and report it here. Thanks!

No, the computer did not go into sleep mode during my activities.

POSTED BY: Jürgen Kanz

Clear should not have any effect on the GPU memory.

My idea when suggesting a reboot was the following: when a computer is put into sleep mode and then used again, the GPU might be in a weird state and WL may have problems connecting to it. I wanted to make sure that was not the case.
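A minimal sketch of how one might separate the two effects, assuming a fresh kernel is acceptable. Clear only removes symbol definitions, while Quit[] ends the kernel session, which should also release whatever GPU resources that kernel process held. The tiny training set below is made up for illustration and is not part of the tutorial.

```wolfram
(* Removes definitions in the Global` context; this does not reset the GPU *)
Clear["Global`*"]

(* Ends the kernel session; evaluate this on its own, then continue in the new kernel *)
Quit[]

(* In the fresh kernel: a tiny training run to check that the GPU can be reached *)
NetTrain[LinearLayer[], {1 -> 1.9, 2 -> 4.1, 3 -> 6.0, 4 -> 8.1},
 TargetDevice -> "GPU", MaxTrainingRounds -> 1]
```

If this small run already fails, the problem is more likely with the GPU connection itself than with the out-of-core example.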

After a restart of Mathematica, the problem no longer occurs.

My assumption is that Clear["Global`*"] at the top of the notebook does not clear the GPU memory, and that this was the reason for the failure.

What is the best command to start this kind of notebook (re-)evaluation with clean memory on all devices involved?

POSTED BY: Jürgen Kanz

Just run the command that fails. And after that, evaluate:

Internal`$LastInternalFailure

(as suggested in the error message)

Then copy and paste the output here.
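A minimal sketch of that sequence, assuming lenet, trainingDataFiles, and testDataFiles are defined as in the tutorial. The names result and failure are only illustrative, and wrapping the call in Check is an added convenience, not something the error message requires.

```wolfram
(* Run the failing call; Check returns $Failed if any message is issued *)
result = Check[
   NetTrain[lenet, trainingDataFiles, All,
    ValidationSet -> testDataFiles, MaxTrainingRounds -> 3,
    TargetDevice -> "GPU"],
   $Failed];

(* If the call failed, show the internal failure record so it can be posted *)
If[result === $Failed, failure = Internal`$LastInternalFailure]
```

Note that Check reacts to any message, including harmless warnings, so evaluating Internal`$LastInternalFailure by hand right after the failing NetTrain call works just as well.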

Yes, I am running V12 on Windows 10.

I do not know what the "content of Internal`$LastInternalFailure after the failure" is. How do I get it?

POSTED BY: Jürgen Kanz

What is the content of Internal`$LastInternalFailure after the failure?

Are you running 12.0?
