Some loss functions, e.g., the InfoNCE loss used in contrastive self-supervised learning, require access to the outputs of the entire batch. Is it possible to access the batch outputs inside the loss function, or can the loss only be applied to each input individually?
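To make concrete why the whole batch is needed, here is a minimal NumPy sketch of InfoNCE (just an illustration, not Wolfram Language): the loss for example `i` uses every other example in the batch as a negative, so it cannot be computed from one input's output alone.

```python
import numpy as np
from scipy.special import logsumexp

def info_nce(z1, z2, temperature=0.1):
    # z1, z2: (batch, dim) embeddings of two augmented views of the same examples
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # (batch, batch) similarity matrix
    # Row i's positive is column i; all other columns act as negatives,
    # so the loss for example i depends on every example in the batch.
    log_probs = logits - logsumexp(logits, axis=1, keepdims=True)
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
print(info_nce(z1, z2))
```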
I have learned that the batch data generator can access, e.g., the network as well as the batch data (the previous batch's data, I assume). Are these properties also available in the loss function, and how would one use them together with network layers?
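The pattern I have in mind is something like the following conceptual Python sketch (the `make_batch` name and its signature are hypothetical, purely to illustrate what a generator that sees the current network and the previous batch could do, e.g., hard-negative mining):

```python
import torch

def make_batch(net, pool, prev_batch, batch_size=32):
    """Hypothetical generator callback: scores a candidate pool with the
    current network and builds the next batch from the hardest examples.
    'prev_batch' stands in for the previous-batch property mentioned above;
    it is unused here but could be used to avoid repeating examples."""
    with torch.no_grad():
        scores = net(pool).squeeze(-1)  # score every candidate in the pool
    hardest = torch.topk(scores, batch_size).indices
    return pool[hardest]

net = torch.nn.Linear(16, 1)
pool = torch.randn(100, 16)
batch = make_batch(net, pool, prev_batch=None)
```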
BatchNormalizationLayer must have access to all of the batch data, mustn't it? If it has that access, how is it implemented?
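For reference, my understanding of the training-time computation is the sketch below (plain NumPy, assuming per-feature `gamma`/`beta` and ignoring the running averages that are used at inference time):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    # x: (batch, features). During training, the normalization statistics
    # are computed across the batch dimension, which is why the layer
    # needs the whole batch. At inference, stored running averages are
    # used instead, so single inputs work fine.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

x = np.random.default_rng(1).normal(size=(8, 4))
y = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.std(axis=0))  # ~0 mean, ~1 std per feature
```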