For the sake of completeness: I found a NIST description of the Jaro distance that cites
William E. Winkler and Yves Thibaudeau, An Application of the
Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial
Census, Statistical Research Report Series RR91/09, U.S. Bureau of the
Census, Washington, D.C., 1991. The abstract (HTML) and full paper (PDF).
On p. 12 the paper says:
Two characters are considered in common only if they are no further
apart than (m/2 - 1) where m = max(d,r). Characters in common from
two strings are assigned; remaining characters unassigned. Each
string has the same number of assigned characters.
That tells you to treat l3 (the list of assigned character pairs)
as a whole, and it suggests the following implementation:
Needs["Experimental`"]
Clear[check]
(* pair the single {code, position} entry in l1 with every entry of l2 that has the same character code *)
check[l1_List, l2_List] := Flatten[Outer[List, l1, Select[l2, #[[1]] == l1[[1, 1]] &], 1], 1]
Clear[jaro]
jaro[s1_String, s2_String, prec_: $MachinePrecision] :=
 Block[{r, l1 = StringLength[s1], l2 = StringLength[s2], l3, m, t},
   (* r: maximal allowed distance between two characters in common *)
   r = Floor[Max[l1, l2]/2] - 1;
   If[r >= 0,(* then *)
    (* l3: assigned pairs; each position of s1 and of s2 is used at most once *)
    l3 = DeleteDuplicates[
      Flatten[check[{Transpose[{ToCharacterCode[s1],
              Range[l1]}][[#]]},
          Transpose[{ToCharacterCode[s2], Range[l2]}][[
           Min[l2, Max[1, # - r]] ;; Min[l2, # + r]]]] & /@ Range[l1],
       1], ((First[#1] == First[#2]) || (Last[#1] == Last[#2])) &];
    m = Length[l3];
    If[m > 0,
     (* t: half the number of assigned characters that differ when both strings are read in order *)
     t = (m -
         Count[{1, -1} .
           MapAt[First,
            SortBy[#, Last] & /@ Transpose[l3], {{1, All}, {2, All}}],
          0])/2;
     N[(m/l1 + m/l2 + (m - t)/m)/3, prec], (* else *)
     0
     ], (* else *)
    0
    ]
   ] /; StringLength[s1] > 0 && StringLength[s2] > 0 && prec > 1
This still has t off by 1/2 in some cases when compared with textdistance.jaro. The results of JaroDistance[]
are unreliable (this was run with Mathematica 10.3). Taking the trouble to build a comparison table
Grid[{{Item["s1"], Item["s2"], Item["td.jaro"], Item["jaro"],
Item["Defect"], Item["JaroDistance"]},
{Item["Miss Australia"], Item["Miss Brasilia"],
Item[0.8166833166833167],
Item[jaro["Miss Australia", "Miss Brasilia"], Background -> Pink],
Item["t+1/2"],
Item[JaroDistance["Miss Australia", "Miss Brasilia",
IgnoreCase -> False], Background -> Pink]
},
{Item["DWAYNE"], Item["DUANE"],
Item[0.8222222222222223],
Item[jaro["DWAYNE", "DUANE"]],
Item["-"],
Item[JaroDistance["DWAYNE", "DUANE", IgnoreCase -> False],
Background -> Pink]
},
{Item["MARTHA"], Item["MARHTA"],
Item[0.9444444444444445],
Item[jaro["MARTHA", "MARHTA"]],
Item["-"],
Item[JaroDistance["MARTHA", "MARHTA", IgnoreCase -> False],
Background -> Pink]
},
{Item["DIXON"], Item["DICKSONX"],
Item[0.7666666666666666],
Item[jaro["DIXON", "DICKSONX"]],
Item["-"],
Item[JaroDistance["DIXON", "DICKSONX", IgnoreCase -> False],
Background -> Pink]
},
{Item["JELLYFISH"], Item["SMELLYFISH"],
Item[0.8962962962962964],
Item[jaro["JELLYFISH", "SMELLYFISH"]],
Item["-"],
Item[JaroDistance["JELLYFISH", "SMELLYFISH", IgnoreCase -> False],
Background -> Pink]
},
{Item["Miss Mexiko"], Item["Miss Belize"],
Item[0.7575757575757575],
Item[jaro["Miss Mexiko", "Miss Belize"]],
Item["-"],
Item[JaroDistance["Miss Mexiko", "Miss Belize",
IgnoreCase -> False], Background -> Pink]
},
{Item["0100010100101001001001001010010"],
Item["10000100100111101010101010101010"],
Item[0.8308371735791091],
Item[jaro["0100010100101001001001001010010",
"10000100100111101010101010101010"]],
Item["-"],
Item[JaroDistance["0100010100101001001001001010010",
"10000100100111101010101010101010", IgnoreCase -> False]]
},
{Item["aasdjkdashdahsgdashdgasj"], Item["asdjkdashdahsgdashdgasj"],
Item[0.841183574879227],
Item[jaro["aasdjkdashdahsgdashdgasj", "asdjkdashdahsgdashdgasj"]],
Item["-"],
Item[JaroDistance["aasdjkdashdahsgdashdgasj",
"asdjkdashdahsgdashdgasj", IgnoreCase -> False],
Background -> Pink]
},
{Item["abdegopq"], Item["cfhijklmnrstuvwxyz"],
Item[0.0],
Item[jaro["abdegopq", "cfhijklmnrstuvwxyz"]],
Item["-"],
Item[JaroDistance["abdegopq", "cfhijklmnrstuvwxyz",
IgnoreCase -> False]]
},
{Item["Mary has a little lamb"],
Item["and Meghan has the redhead Harry"],
Item[0.5631555944055945],
Item[jaro["Mary has a little lamb",
"and Meghan has the redhead Harry"], Background -> Pink],
Item["t+1/2"],
Item[JaroDistance["Mary has a little lamb",
"and Meghan has the redhead Harry"]]
},
{Item["Take[list,-n] gives the last n elements of list."],
Item["Take[list,{m,n}] gives elements m through n of list."],
Item[0.791056166056166],
Item[jaro["Take[list,-n] gives the last n elements of list.",
"Take[list,{m,n}] gives elements m through n of list."],
Background -> Pink],
Item["t+1/2"],
Item[JaroDistance[
"Take[list,-n] gives the last n elements of list.",
"Take[list,{m,n}] gives elements m through n of list."]]
}
}, Background -> {None, {Cyan}}, Frame -> All]
one gets

and I have to confess, in all humility, that I am unable to get jaro[]
right from the given descriptions of what it is meant to be. Please read the cited PDF and tell me.
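
One thing that may be worth checking is how t is rounded. Below is a minimal Python sketch (Python only because textdistance is a Python library) of the usual greedy reading of the quoted passage. The function name jaro and the flag floor_half are my own, hypothetical names. The sketch assumes, without my having checked the textdistance source, that a reference implementation counts the assigned characters that are out of order and then halves that count with integer division. With floor_half=False the half-count is kept exact, as in the Mathematica jaro[] above, so the two variants differ exactly when the number of out-of-order characters is odd.

```python
def jaro(s1: str, s2: str, floor_half: bool = True) -> float:
    """Jaro similarity with greedy assignment of common characters.

    floor_half is a hypothetical switch: True halves the out-of-order
    count with integer division, False keeps the exact half.
    """
    n1, n2 = len(s1), len(s2)
    if n1 == 0 or n2 == 0:
        return 0.0

    # Maximal distance between two characters "in common";
    # clamping at 0 for very short strings is an assumption of this sketch.
    r = max(max(n1, n2) // 2 - 1, 0)

    used1 = [False] * n1
    used2 = [False] * n2
    m = 0
    for i, c in enumerate(s1):
        # Scan the window of s2 around position i, left to right,
        # and assign the first still-unassigned equal character.
        for j in range(max(0, i - r), min(n2, i + r + 1)):
            if not used2[j] and s2[j] == c:
                used1[i] = used2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0

    # Assigned characters of each string, in their original order.
    a = [c for c, u in zip(s1, used1) if u]
    b = [c for c, u in zip(s2, used2) if u]
    out_of_order = sum(x != y for x, y in zip(a, b))

    t = out_of_order // 2 if floor_half else out_of_order / 2
    return (m / n1 + m / n2 + (m - t) / m) / 3


if __name__ == "__main__":
    # Three assigned characters are out of order here, so the two variants differ.
    print(jaro("abcxyz", "bcaxyz", floor_half=True))   # about 0.9444
    print(jaro("abcxyz", "bcaxyz", floor_half=False))  # about 0.9167
```

If the rows marked "t+1/2" in the table turn out to be exactly the rows with an odd out-of-order count, then the rounding of t, rather than the assignment step, would be the place to look.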