Dear all,
I urgently need your help. I have many strings of the following kind:
string="(Vitis_vinifera_VvNAC09_GSVIVT01009651001:1.174911034,((Musa_\
acuminata_GSMUA_Achr7T26050_001:0.1846057474,Musa_acuminata_GSMUA_\
Achr4T07148_001:0.2741664889)_D_1_0_0_9260000000_:0.2005997307,(Oryza_\
sativa_Os10g21560_1:0.9733414222,Musa_acuminata_GSMUA_Achr2T06610_001:\
0.4881902598)0_1700000000:0.0623981619)_D_0_5__:0.13917144990000008);"
These are trees downloaded from the public database "TreeBase". I import them as strings, and I want to get rid of what are officially the node names in the Newick format.
To be precise, I want to delete everything that stands between a ")" symbol and THE NEXT ":" symbol.
For my example, I want to get the following:
"(Vitis_vinifera_VvNAC09_GSVIVT01009651001:1.174911034,((Musa_acuminata_GSMUA_Achr7T26050_001:0.1846057474,Musa_acuminata_GSMUA_Achr4T07148_001:0.2741664889):0.2005997307,(Oryza_sativa_Os10g21560_1:0.9733414222,Musa_acuminata_GSMUA_Achr2T06610_001:0.4881902598):0.0623981619):0.13917144990000008);"
However, if I do
StringReplace[string, ")" ~~ __ ~~ ":" -> ")"]
what I get is this:
"(Vitis_vinifera_VvNAC09_GSVIVT01009651001:1.174911034,((Musa_acuminata_GSMUA_Achr7T26050_001:0.1846057474,Musa_acuminata_GSMUA_Achr4T07148_001:0.2741664889)0.13917144990000008);"
I.e. Mathematica replaces everything between the first ")" symbol and the LAST ":" symbol. But that's not what I want.
Can anyone help with this please?
Thanks so much!