My project objective was a simple one: to extend the pre-existing backend of Mathematica's `FullSimplify`

function, in an effort to construct a system for automatically generating and visualizing proofs of arbitrary theorems in first-order equational logic (equational logic is a simplification of first-order logic that uses the equals sign as its only predicate symbol).

Let me begin my demonstration of this new functionality by proving an elementary theorem from a single axiom, using the command `proveTheorem`

:

addAxiom[ForAll[{x,y},f[x,y]==f[y,x]]]
proveTheorem[ForAll[{a,b,c,d},f[f[a,b],f[c,d]]==f[f[d,c],f[b,a]]]]

Within my framework, all proofs are represented internally as abstract proof networks, based on the same essential structure: starting from the axioms (coloured in red), one applies a sequence of syntactic transformations (coloured in yellow), justified either by the axioms or by intermediate lemmas (with the justifications represented by dashed lines), in order to ultimately deduce that the theorem (coloured in green) is true.

Internally, the theorem-proving system is based on the Knuth-Bendix *unfailing completion* algorithm for equational logic, which begins by considering the abstract proof tree of all possible axiomatic transformations (such that, if a theorem is true, then its proof will correspond to at least one branch of the tree). The algorithm then uses a combination of rule orientation and critical pairing to construct a confluent term-rewriting system, and determine which branches of the tree to actually traverse. With the `addAxiom`

command, I was able to generalize the Knuth-Bendix algorithm somewhat, thereby constructing a more sophisticated algorithm that would be capable of auto-generating proof networks of arbitrary theorems, not only in first-order equational logic, but in *any* finite axiom system based on first-order equational logic. To demonstrate this, let's now prove some slightly more nontrivial theorems (taken from the NKS book) regarding various algebraic properties of the NAND operator in Boolean algebra:

clearAxioms[]
addAxiom[ForAll[{a,b},f[f[a,a],f[a,b]]==a]]
addAxiom[ForAll[{c,d},f[c,f[c,d]]==f[c,f[d,d]]]]
addAxiom[ForAll[{x,y,z},f[x,f[x,f[y,z]]]==f[y,f,y,f[x,z]]]]]
proveTheorem[ForAll[{p,q},f[p,f[p,p]]==f[q,f[q,q]]]]

proveTheorem[ForAll[{p,q},f[p,q]==f[q,p]]]

proveTheorem[ForAll[{p,q},p==f[f[p,q],f[p,f[q,q]]]]]

In the case of these more nontrivial proofs, abstract proof networks are not particularly easy for humans to read or understand, so I began experimenting with alternative methods for presenting such proofs in a more human-readable way. One of the fundamental challenges in making machine-generated proofs human-comprehensible is that they lack the kind of *conceptual waypoints* that guide the reader through the logic of a conventional mathematical proof (published proofs tend to "tell a story" of how a particular result was derived, rather than simply presenting an abstract sequence of logical transformations).

As a preliminary heuristic for automatically finding conceptual waypoints for a given proof, I tried searching for nodes in the proof network with higher-than-average connection density (since the associated lemmas will, by definition, represent the "load-bearing pieces" in the logical structure of the proof). When presenting the proof to a human reader, such lemmas are introduced first (in order to give the reader an intuitive picture of the large-scale formal structure of the argument), and then the remaining details are filled in, in a recursive/hierarchical fashion.

clearAxioms[]
addAxiom[ForAll[{x,y},f[x,y]==f[y,x]]]
presentProof[ForAll[{a,b,c,d},f[f[a,b],f[c,d]]==f[f[d,c],f[b,a]]]]

The `presentProof`

function outputs a TeX file, which (when compiled) produces an auto-generated mathematics paper, presenting the proof in a style which I think is reasonably similar to how a human mathematician might write it.

Since all proofs are represented internally as sequences of syntactic transformations, this makes them immediately amenable to the techniques of automated proof-checking. With this in mind, I added the function `checkProof`

, which takes a given proof and automatically converts it to a blob of Mathematica code. The code represents the theorem and the axioms as symbolic expressions, and then uses `MapAt`

, `Replace`

and `ReplaceAll`

to apply the requisite syntactic transformations in the appropriate places, and deduce that the theorem is, indeed, true.

checkProof[ForAll[{a,b,c,d},f[f[a,b],f[c,d]]==f[f[d,c],f[b,a]]]]

Thus, since every auto-generated proof can be programmatically represented as a piece of symbolic Mathematica code in this way, it becomes possible to do something which cannot conventionally be done with a human-written proof: in addition to simply reading it, one can also actually *run* it.

With the framework for automatic proof-generation, proof-presentation and proof-verification already implemented, the obvious next step would be to add functionality for exporting proofs in such a format that they could be uploaded directly to the Wolfram Data Repository. In addition, with further backend development, these tools could be used to gain deeper insight into certain longstanding results, such as the proof of correctness of the Wolfram axiom for Boolean algebra.

Further information about the underlying theory:

http://mathworld.wolfram.com/Knuth-BendixCompletionAlgorithm.html

http://www.cs.miami.edu/home/geoff/Courses/TPTPSYS/FirstOrder/Paramodulation.shtml

http://mathworld.wolfram.com/CriticalPair.html

http://mathworld.wolfram.com/Confluence.html

My GitHub repository:

https://github.com/GetJonWithIt/starthere

**Attachments:**