The description of what you want is vague, so I or the others may have misunderstood it. Are you looking to do text-to-speech? If so, you'll probably want to use a third-party API to generate the audio. Overlaying audio on top of a video is a separate problem, and it isn't something the exporters for AVI or MOV do. You can check the documentation for both formats: there are no options for adding audio to an exported animation.
You should begin by learning how to make the movies using standard video editing software, and then experiment with how much of the process you can move over to Mathematica. No doubt it can be fully automated, even if you have to call some third-party utilities. I just don't think it'll be straightforward; you'll have to implement a fair number of features yourself.
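
To give a sense of what the "call a third-party utility" step might look like, here is a minimal sketch in Python. It assumes ffmpeg is the utility (my choice, not something Mathematica ships or requires), that the animation has already been exported without sound, and that the narration exists as a separate audio file. The function names and file names are made up for illustration.

```python
import shutil
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Return an ffmpeg argument list that adds an audio track to a silent video."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", video_path,     # input 0: the silent exported animation
        "-i", audio_path,     # input 1: the narration audio
        "-map", "0:v:0",      # take the first video stream from input 0
        "-map", "1:a:0",      # take the first audio stream from input 1
        "-c:v", "copy",       # copy the video stream without re-encoding
        "-c:a", "aac",        # encode the audio as AAC
        "-shortest",          # stop when the shorter of the two streams ends
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    print(" ".join(build_mux_command("animation.mov", "narration.wav", "final.mov")))
```

The same argument list could presumably be handed to an external-process call from within Mathematica, such as RunProcess, instead of Python. Whether copying the video stream works unchanged depends on the codec the exporter used, so treat the "-c:v copy" choice as something to test rather than a given.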