Hello Wolfram Community, my question is somewhat trivial and might not be a real challenge for many people, but here I go. I have a long list with two columns of strings, for example:
String1          String2           Edit Distance   Different Character
OKJ530-WM        OKJS30-WM         1               5
LTBRT5100-244    LTBR-T5100-244    1               -
Computing the edit distance is pretty simple, and so is sorting my list from the smallest distances to the largest.
In the first example the user made a mistake: they entered "5" in our system instead of "S". Most probably they got confused while copying the numbers. In the second example they forgot the "-" between "R" and "T".
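For reference, this is roughly how I do that part. It is only a minimal sketch: the variable names `pairs`, `withDistance` and `sorted` are placeholders I made up for this post, and the two rows are just the examples from the table above.

```wolfram
(* Example rows from the table above; `pairs` is a placeholder name *)
pairs = {
   {"OKJ530-WM", "OKJS30-WM"},
   {"LTBRT5100-244", "LTBR-T5100-244"}
   };

(* Append the edit distance as a third column to every row *)
withDistance = Append[#, EditDistance @@ #] & /@ pairs;

(* Sort rows from the smallest distance to the largest *)
sorted = SortBy[withDistance, Last]
```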
I am trying to make a list of the most common mistakes to see if there is a way we can avoid them.
Right now I know that String1 and String2 differ in 1, 2, 3, or N characters, depending on the EditDistance value. However, I want to compute what the difference actually is, and this is where I am a little lost. That is my final goal: to compute the Different Character column.
I think a combination of StringCases and a RegularExpression might do the trick, but I am not very confident in my use of regular expressions. Or maybe there is a different way.
If anyone has an idea, it would be super appreciated.
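One different way I have started looking at is the built-in SequenceAlignment, which returns the shared parts of two strings as plain strings and the differing parts as two-element lists. Below is a rough sketch under that assumption. The helper name `diffChars` and the variable `pairs` are hypothetical, and I am not sure the default alignment always corresponds to the minimal set of edits for longer or messier strings.

```wolfram
(* Example rows; `pairs` is a placeholder name *)
pairs = {
   {"OKJ530-WM", "OKJS30-WM"},
   {"LTBRT5100-244", "LTBR-T5100-244"}
   };

(* Hypothetical helper: keep only the differing segments of the alignment.
   Each segment is {part from String1, part from String2};
   an empty string "" marks an insertion or deletion. *)
diffChars[{s1_String, s2_String}] :=
  Cases[SequenceAlignment[s1, s2], _List];

diffChars /@ pairs
(* {{{"5", "S"}}, {{"", "-"}}} *)

(* Count how often each substitution occurs, most frequent first *)
SortBy[Tally[Flatten[diffChars /@ pairs, 1]], -Last[#] &]
```

If this holds up on the full list, the tally would give the list of the most common mistakes directly.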