About problems with other GPUs:
Skimming the previous thread, all I can see about neural net functionality besides the 3090 problem is macOS support (NVIDIA/Apple's fault, not ours) and a complaint about a faulty 12.2 update, which we fixed a few days later with another update. I'm not going to comment on CUDALink because I'm not involved with it. I consider the GPU support on the ML side pretty solid: we've been successfully using NetTrain for our own internal projects on a variety of GPU models and machines (including AWS instances) for years. If you or any other user still have problems, please contact tech support.
About numerical vs image data:
There is absolutely no difference between them from the neural net perspective. Images are immediately turned into numerical data by NetEncoder["Image"] and fed to the network as such. I ran your own example on CPU vs GPU on my laptop (Dell XPS 15, GTX 1650M) and the GPU is actually showing an improvement:
t = AbsoluteTime[];
NetTrain[net, TrainingData, BatchSize -> 10000, TargetDevice -> "CPU"];
Print[AbsoluteTime[] - t];
24.876159
t = AbsoluteTime[];
NetTrain[net, TrainingData, BatchSize -> 10000, TargetDevice -> "GPU"];
Print[AbsoluteTime[] - t];
15.667683
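As a side note on the image point above, here is a minimal sketch of what the encoder does. The 32x32 size, the random test image and the names enc, img and arr are made up for illustration; they are not from your example.

```wolfram
(* Hypothetical encoder: 32x32 images, default RGB color space *)
enc = NetEncoder[{"Image", {32, 32}}];

(* Random RGB test image, only used to have something to encode *)
img = RandomImage[1, {32, 32}, ColorSpace -> "RGB"];

(* Applying the encoder gives a plain numeric array *)
arr = enc[img];

(* Expect a rank-3 array: channels x height x width *)
Dimensions[arr]
```

From that point on, the network only sees numbers, exactly as it would with a RandomReal array of the same shape.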
With a larger net, the improvement is massive, roughly 10x in the timings below (don't set a large BatchSize here or memory will blow up):
TrainingData = 
  RandomReal[1, {10000, 4}] -> RandomReal[1, {10000, 4}];
net = NetChain[{500, Ramp, 500, Ramp, 500, Ramp, 4}];
t = AbsoluteTime[];
NetTrain[net, TrainingData, MaxTrainingRounds -> 5, 
  TargetDevice -> "CPU"];
Print[AbsoluteTime[] - t];
7.083551
t = AbsoluteTime[];
NetTrain[net, TrainingData, MaxTrainingRounds -> 5, 
  TargetDevice -> "GPU"];
Print[AbsoluteTime[] - t];
0.654267
Do you get similar results for CPU vs GPU timings (especially with the second example)?