I found a possible workaround.
Since the error occurs with the network
net = NetChain[{ElementwiseLayer[#-#&], AggregationLayer[Times, 1]}]
but not with the network
net = NetChain[{ElementwiseLayer[#-#+1&], AggregationLayer[Times, 1]}]
I hypothesized that the error occurs when the function has a region where both its value and its gradient are zero, and the network has to compute the product of the variables there.
I therefore reshaped the initial problem as
$$ \begin{cases}
(x+1)(y+1) - (x+y) - 1 & \text{if } x, y > 0 \\
0 & \text{otherwise}
\end{cases} $$
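As a sanity check, expanding the first branch shows that it reduces to the plain product $xy$, so the function itself is unchanged and only the quantities being multiplied are different:

$$ (x+1)(y+1) - (x+y) - 1 = xy + x + y + 1 - x - y - 1 = xy $$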
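In this form the product is always taken between two nonzero quantities: in the network below, Ramp is applied before adding 1, so each factor is at least 1. (Without the ramp, the factors $x+1$ and $y+1$ would both vanish only at the single point $(-1,-1)$.)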
This can be written in Wolfram's "network language" as
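net = NetGraph[
  {
    "ramp" -> ElementwiseLayer[Ramp],          (* clamp negative inputs to 0 *)
    "x+1" -> ElementwiseLayer[# + 1 &],        (* shift so every factor is >= 1 *)
    "times" -> AggregationLayer[Times, 1],     (* product of the shifted values *)
    "sum" -> AggregationLayer[Total, 1],       (* sum of the ramped values *)
    "x-y-1" -> ThreadingLayer[#1 - #2 - 1 &]   (* product - sum - 1 *)
  },
  {
    "ramp" -> "x+1",
    "x+1" -> "times",
    "ramp" -> "sum",
    {"times", "sum"} -> "x-y-1"
  }
]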
and indeed
In[116]:= net[{4, 2}]
Out[116]= 8.
In[117]:= net[{-4, -2}]
Out[117]= 0.
In[118]:= net[{4, -2}, NetPortGradient["Input"]]
Out[118]= {0., 0.}
In[119]:= net[{-4, -2}, NetPortGradient["Input"]]
Out[119]= {0., 0.}
It works!