I would probably work with just plain images. You could write a function that creates the "cell" for each number, and then assemble those cells into an overlay. For example:
MakeTextCellOverlay[dims_, styles_, text_] :=
  ColorReplace[ImageCrop[RemoveBackground[Rasterize[Style[text, styles]]], dims], White]
You pass your desired styles and dimensions along with the text. You've already specified the styles you want. You can calculate the dimensions however you like, but I'll assume that each cell is one tenth of the bitmap's size in each dimension:
cellDimensions = ImageDimensions[bitmap]/10
Now we compute the individual cells:
overlayCells =
  MakeTextCellOverlay[cellDimensions, {16, RGBColor[1, 1, 0], Bold}, #] & /@ Range[10]
Use these to make a row and a column (the column here is just the row rotated by 90 degrees):
row = ImageAssemble[{overlayCells}];
column = ImageRotate[row, -Pi/2]
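Note that rotating the row also turns the digits on their side. If you would rather keep them upright, one possible variation is to assemble the column directly from the cells. This is only a sketch: it assumes overlayCells is defined as above, and uprightColumn is a name I made up for illustration.

```mathematica
(* Assumes overlayCells from above: a flat list of equally sized images. *)
(* Wrapping each cell in its own list makes each cell its own row of the grid, *)
(* so the result is a single column with the first cell at the top. *)
uprightColumn = ImageAssemble[List /@ overlayCells]
```

You could then use uprightColumn in place of column in the composition step. Whether the numbering should run top to bottom or the other way is up to you; wrapping overlayCells in Reverse flips the order.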
Now compose everything:
ImageCompose[ImageCompose[bitmap, column, Scaled[{.05, .5}]], row, Scaled[{.5, .05}]]
Now, I don't know what you want to do about the "collision" in the lower-left corner, where the row and the column overlap, but hopefully this gives you enough to play with. Tweak it until you get what you want.