If your goal is to have usage data in a particular (plaintext) format, and there are only 9 symbols that your function cannot parse, then just overload your function with ad hoc definitions for those symbols.
getUsg2["Begin"] = {"Begin[\"context`\"] resets the current context."}
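This works because the Wolfram Language tries more specific definitions before more general ones, so a definition for the literal argument "Begin" is used ahead of any pattern-based definition. Here is a minimal sketch of that behaviour, using a hypothetical stand-in function demoUsage rather than your actual getUsg2:

```wolfram
(* demoUsage is a hypothetical stand-in for getUsg2; the general rule
   below only imitates a parser and does not parse anything. *)
demoUsage[sym_String] := {"<parsed usage for " <> sym <> ">"}

(* Ad hoc override for one problem symbol. *)
demoUsage["Begin"] = {"Begin[\"context`\"] resets the current context."};

demoUsage["Begin"]  (* uses the ad hoc definition *)
demoUsage["Plus"]   (* falls through to the general rule *)
```

The order in which you enter the two definitions should not matter here, since the literal-argument definition is the more specific one.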
If you're worried about this approach, because your textual documentation will no longer be in sync for these 9 symbols, well, that's legitimate, but weigh the cost against the benefit.
If you want to make your getUsg2 function absolutely bulletproof against malformed box strings, then I just don't think that's feasible. You can't know all of the ways that such box strings will be malformed.
A compromise would be to handle the particular malformations that you already know about. So, instead of hard-coding getUsg2["Begin"], you could analyze that particular malformation, implement a fix for that pattern, rerun getUsg2 to see what "bad" symbols remain (it might have cleaned up some of the others as well), and then repeat this process until no bad cases remain.
For example:
cleanUsageText[str_String] :=
  StringReplace[
    str,
    {
      (* replace a quoted "...StyleBox[...]" fragment by its contents,
         with the leftover box escape characters removed *)
      RegularExpression["\"\\W+StyleBox\\[([^]]+)]\""] :>
        StringDelete["$1", ("\\" | "(" | "*")]
      (*, the ones you already had would go here*)
    }]

getUsageSentences[sym_] :=
  TextSentences[
    cleanUsageText[WolframLanguageData[sym, "PlaintextUsage"]]]
This cleans up the usage for Begin. It partially cleans up BeginPackage. You can either try to get this to work for both, or just move on to creating an ad hoc fix for BeginPackage. You may end up with the maximum of 9 special fixes for the 9 remaining problems, but oh well.
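The "rerun and see what remains" step could be automated roughly as follows. This is only a sketch: looksMalformedQ and candidates are hypothetical names, and the test for leftover box markup (the text StyleBox or the escape sequences \! and \() is an assumption about what your malformed strings look like, so adjust it to what you actually see.

```wolfram
(* Hypothetical check: does a sentence still contain box-language residue?
   The markers tested here are an assumption, not an exhaustive list. *)
looksMalformedQ[text_String] :=
  StringContainsQ[text, "StyleBox" | "\\!" | "\\("]

(* Hypothetical list; replace it with your 9 problem symbols. *)
candidates = {"Begin", "BeginPackage"};

(* Symbols whose cleaned usage still has at least one suspicious sentence. *)
remaining = Select[candidates,
  AnyTrue[getUsageSentences[#], looksMalformedQ] &]
```

After each new rule you add to cleanUsageText, re-evaluate remaining; once it comes back empty, you are done.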