I tend to use a head other than NonCommutativeMultiply, and to attach down values of interest. This code is adapted from a Mathematica StackOverflow thread (I've used it elsewhere as well).
First define rules to enforce associativity and to distribute our product over Plus. Als we have some, not needed here, to pull out "scalars".
ncTimes[] := 1
ncTimes[a_] := a
ncTimes[a___, ncTimes[b_, c__], d___] := ncTimes[a, b, c, d]
ncTimes[a___, x_ + y_, b___] := ncTimes[a, x, b] + ncTimes[a, y, b]
ncTimes[a___, i_?NumericQ*c_, b___] := i*ncTimes[a, c, b]
ncTimes[a___, i_?NumericQ, b___] := i*ncTimes[a, b]
commutator[x_, y_] := ncTimes[x, y] - ncTimes[y, x]
anticommutator[x_, y_] := ncTimes[x, y] + ncTimes[y, x]
Here is the example.
lhs = commutator[ncTimes[a, b], ncTimes[c, d]]
(* Out[153]= ncTimes[a, b, c, d] - ncTimes[c, d, a, b] *)
rhs =
ncTimes[a, anticommutator[c, b], d] -
ncTimes[a, c, anticommutator[d, b]] +
ncTimes[anticommutator[c, a], d, b] -
ncTimes[c, anticommutator[d, a], b]
(* Out[154]= ncTimes[a, b, c, d] - ncTimes[c, d, a, b] *)
So the two canonicalize to the same expression (hence are equivalent).