From the documentation:
The text string that is sent to the speech engine on your computer system is given by SpokenString[expr,Options[Speak],"PostProcess"->False].
Mathematica doesn't actually produce the speech itself. Instead, it sends that text string to the "speech engine" on your computer, which is whatever text-to-speech facility your OS provides for accessibility purposes.
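
To see this for yourself, you can compare the string with the spoken result. The following is a minimal sketch, assuming `Sin[x]^2` as an arbitrary test expression (it is not from the documentation). The second call is the form quoted above.

```mathematica
(* Arbitrary example expression, chosen only for illustration *)
expr = Sin[x]^2;

(* The text Mathematica generates for the expression; nothing is spoken *)
SpokenString[expr]

(* The form quoted from the documentation: the string passed to the speech engine *)
SpokenString[expr, Options[Speak], "PostProcess" -> False]

(* Hand the expression to the OS speech engine *)
Speak[expr]
```

If the string looks right but the audio sounds wrong or is silent, the problem is most likely in the OS speech settings and not in Mathematica.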