A bug was found in BERT's pre-processing that shows up when the front-end language is not English (for instance, when it is Chinese).
We will update BERT to fix it.
People who have already downloaded the model will then have to re-download it, after first calling ResourceRemove@ResourceObject["BERT Trained on BookCorpus and English Wikipedia Data"].
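For reference, the re-download step could look roughly like the sketch below. It assumes the fixed model has already been published, and the variable name is only illustrative.

```wolfram
(* Assumption: run this only once the updated model is available *)
name = "BERT Trained on BookCorpus and English Wikipedia Data";

(* Remove the locally cached copy of the resource *)
ResourceRemove[ResourceObject[name]];

(* Requesting the model again should trigger a fresh download *)
net = NetModel[name]
```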
In the meantime, the workaround for this bug is:
net = NetModel[{"BERT Trained on BookCorpus and English Wikipedia Data", "InputType" -> "ListOfStrings"}]
net2 = NetReplacePart[net, {"Input", "Function"} ->
    ReplaceAll[NetExtract[net, {"Input", "Function"}],
      RemoveDiacritics -> Function @ RemoveDiacritics[#, Language -> "English"]]]
Then use net2 in place of net.