This is a question about the in-memory data structures and data types one uses to operate on data: lists, arrays, queues, trees of various types, and so on. The data within these structures is stored in various formats, ranging from strings (one-dimensional arrays of characters) to binary representations of decimal numbers such as the commonly used IEEE-754 floating-point standard. Every programming language faces this need to represent data according to the nature of the data itself and the purpose for which you intend to use it. The Mathematica programming language has functionality that makes it ideal for working with biological data represented this way (see Paul Wellin, Programming with Mathematica, p. 351ff).
Biological information such as DNA, RNA and proteins may be stored as strings of characters, each character representing a nucleotide or an amino acid. A great deal of other information, mostly text, is associated with each such entry and may be used for searching. A prominent data "type" is the taxonomic position of an organism. Imagine a tree-like data structure that represents, as a graph, the relationships between all living things on Earth; each node is an organism assigned a name which locates it in this structure. Not all scientists agree on where each organism belongs in this structure, i.e. how it fits into the "tree of life" and how it relates to other organisms, yet we must accommodate and work with this disagreement. I am asking how others may have dealt with this problem.
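To make the problem concrete, here is a minimal sketch of one way such a structure might be represented. It is written in Python purely for illustration and is not a known or recommended solution. It assumes each taxon may be given a different parent under different classification schemes. The class name `Taxonomy`, the scheme labels, and all taxon names are hypothetical placeholders, not real biological data.

```python
from collections import defaultdict


class Taxonomy:
    """Tree of taxa in which a node may have a different parent
    under each classification scheme (all names are hypothetical)."""

    def __init__(self):
        # taxon -> {scheme name -> parent taxon}
        self._parents = defaultdict(dict)

    def place(self, taxon, parent, scheme="consensus"):
        """Record that `scheme` places `taxon` directly under `parent`."""
        self._parents[taxon][scheme] = parent

    def parent(self, taxon, scheme):
        """Parent under `scheme`, falling back to the 'consensus' placement."""
        placements = self._parents.get(taxon, {})
        return placements.get(scheme, placements.get("consensus"))

    def lineage(self, taxon, scheme):
        """Path from `taxon` up to the root, as seen by one scheme."""
        path, seen = [taxon], {taxon}
        while (p := self.parent(path[-1], scheme)) is not None:
            if p in seen:
                raise ValueError(f"cycle detected at {p!r}")
            path.append(p)
            seen.add(p)
        return path

    def disputed(self):
        """Taxa that at least two schemes place under different parents."""
        return sorted(t for t, p in self._parents.items()
                      if len(set(p.values())) > 1)


# Hypothetical example: two schemes disagree about species_A only.
tree = Taxonomy()
tree.place("group_X", "root")
tree.place("group_Y", "root")
tree.place("species_A", "group_X", scheme="scheme_1")
tree.place("species_A", "group_Y", scheme="scheme_2")
tree.place("species_B", "group_X")

print(tree.lineage("species_A", "scheme_1"))
print(tree.lineage("species_A", "scheme_2"))
print(tree.disputed())
```

Under these assumptions the disagreement is stored as data rather than resolved: a query names the scheme it wants, and undisputed nodes are stored only once.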
Using a thesaurus, or several thesauri, is already common when searching this type of data, to accommodate ordinary language variations such as "hare" vs. "rabbit". Additional domain-specific thesauri may be used to handle differences in scientific terminology. I mention this because a solution to the taxonomy problem may exist that uses a similar strategy.
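The thesaurus strategy can be sketched as simple query expansion. Again this is Python for illustration only. The function names, the record layout, and the `old_name`/`new_name` pair (standing in for a domain-specific synonym) are assumptions made for the example; only "hare" vs. "rabbit" comes from the text above.

```python
def build_synonym_index(groups):
    """Map every term (lower-cased) to the full set of its synonyms."""
    index = {}
    for group in groups:
        terms = {t.lower() for t in group}
        for t in terms:
            index.setdefault(t, set()).update(terms)
    return index


def expand_query(term, index):
    """Return the term plus any synonyms known to the thesaurus."""
    key = term.lower()
    return index.get(key, {key})


def search(records, term, index):
    """Records whose keywords match the term or any of its synonyms."""
    wanted = expand_query(term, index)
    return [r for r in records
            if wanted & {k.lower() for k in r["keywords"]}]


# A general-language group and a hypothetical domain-specific group.
index = build_synonym_index([
    ["hare", "rabbit"],
    ["old_name", "new_name"],   # placeholder for competing scientific names
])

# Hypothetical records; only the keywords matter here.
records = [
    {"id": 1, "keywords": ["Rabbit", "habitat"]},
    {"id": 2, "keywords": ["new_name"]},
    {"id": 3, "keywords": ["fox"]},
]

print([r["id"] for r in search(records, "hare", index)])
print([r["id"] for r in search(records, "old_name", index)])
```

Whether competing taxonomic placements can be handled by the same kind of lookup, rather than by storing several trees, is exactly the open question.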
This is a higher-level design problem rather than a question about coding syntax.