Hi Quentin,
this does not answer your question, but have you tried optimising this within Mathematica first, i.e. without compilation? Using the Table command is actually quite slow:
Table[0, {40000}, {40000}]; // AbsoluteTiming
takes 43.5 seconds on my computer.
ConstantArray[0, {40000, 40000}]; // AbsoluteTiming
and
Normal[SparseArray[{}, {40000, 40000}]]; // AbsoluteTiming
both run in about 7.95 seconds. If you can work with a SparseArray alone, it becomes really fast:
SparseArray[{}, {40000, 40000}]; // AbsoluteTiming
Further calculations on a SparseArray are also quite fast: 0.000203 seconds.
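To illustrate what staying with a SparseArray could look like, here is a minimal sketch. It assumes your matrix only ever holds a few non-zero entries; the positions and values below are made up for illustration and are not taken from your code.

```mathematica
n = 40000;

(* Hypothetical example data: two non-zero entries, all other entries default to 0 *)
s = SparseArray[{{1, 1} -> 5, {n, n} -> 7}, {n, n}];

(* Follow-up operations act on the sparse representation, no dense matrix is built *)
Total[s, 2] // AbsoluteTiming
s . ConstantArray[1, n]; // AbsoluteTiming

(* Memory used by the sparse array, in bytes *)
ByteCount[s]
```

Whether this helps depends on how many entries you eventually fill in. If most of the matrix ends up non-zero, the sparse form loses its advantage and ConstantArray is probably the better starting point.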
Cheers,
Marco