headshot of Ryan Ryan Marcus, assistant professor at the University of Pennsylvania (Fall '23). Using machine learning to build the next generation of data systems.
    ____                       __  ___                          
   / __ \__  ______ _____     /  |/  /___ _____________  _______
  / /_/ / / / / __ `/ __ \   / /|_/ / __ `/ ___/ ___/ / / / ___/
 / _, _/ /_/ / /_/ / / / /  / /  / / /_/ / /  / /__/ /_/ (__  ) 
/_/ |_|\__, /\__,_/_/ /_/  /_/  /_/\__,_/_/   \___/\__,_/____/  
   ___                   __  ___                    
  / _ \__ _____ ____    /  |/  /__ ___________ _____
 / , _/ // / _ `/ _ \  / /|_/ / _ `/ __/ __/ // (_-<
/_/|_|\_, /\_,_/_//_/ /_/  /_/\_,_/_/  \__/\_,_/___/
   ___  __  ___                    
  / _ \/  |/  /__ ___________ _____
 / , _/ /|_/ / _ `/ __/ __/ // (_-<
/_/|_/_/  /_/\_,_/_/  \__/\_,_/___/                                   

Vulcan: A JavaScript Automated Proof System

Mathematicians have been trying to figure out how to get computers to write proofs for a long time. One of the earliest (dating back to the 1960s) attempts to do so was a logical rule called resolution.

I created Vulcan, an NPM package that implements a resolution-based automated proof system. Below is a in-browser demo I created with AngularJS and Browserify. You can use symbols A-Z as variables that can be either true or false, and any of the following operations: -> (implication), <-> (equivalence), ! (negation), & (conjunction), and | (disjunction).

Enter some information into the knowledge base, enter a query, and click “prove” to see a proof (if one exists).

Knowledge Base

{{ $index + 1 }} {{ r }}
{{ rules.length + 1 }}



Step Sentence Action
{{ ax.idx }}{{ ax.tree }}{{ ax.label }}
By assuming the negation of the query, we are able to derive a contradiction. Therefore, the knowledge base combined with the negation of the query is unsatisfiable, so the query follows from the axioms of the knowledge base.
{{ qedNumber }}{{ query }}by resolutional inference
No application of resolution could find a contradiction with the query. Because resolution is sound and complete, we know that the query is not a theorem of the knowledge base.

This certainly isn’t the first JavaScript implementation of a theorem prover, nor is it even the most powerful. It does, however, demonstrate the ability of a very simple system to come up with (in exponential time), a proof of any provable statement in propositional calculus.

But how does it work? Once again, you can find an excellent introduction in Stuart Russell and Peter Norvig’s book. But understanding the system really only requires two fundamental ideas about symbolic logic:

  1. That any expression in propositional logic can be written in conjunctive normal form
  2. That logical resolution is sound and complete.

Conjunctive normal form

The first point is the simplest. We say that a sentence of propositional logic is in conjunctive normal form if it is a series of other sentences combined via conjunction, such that each of the combined sentences is a disjunction of sentence symbols or negations of sentence symbols. In other words, a sentence is in conjunctive normal form if it is an AND of ORs. For example,

… is in conjunctive normal form, as is:

However, this sentence is not in conjunctive normal form:

Nor is this one:

The fact that is not in CNF (conjunctive normal form) seems problematic. Surely, any sufficiently powerful proof system should be able to handle implications. However, we can transform this expression into one that is in CNF using a simple trick: we replace the implication with .

Our new expression, , is in CNF. In fact, there is a sequence of logical rules that can be applied to any expression to convert it into CNF. Norvig and Russell give an excellent description that can be found in other places on the Internet, and I’ll produce my own explanation here.

  1. Remove bijections: .
  2. Replace any implications:
  3. Move nots inwards via DeMorgan’s rule:
    • Not over and:
    • Not over or:
  4. Eliminate double negation:
  5. Distribute ors over ands:

If you don’t believe that this simple algorithm will convert any sentence into CNF, try out a few examples. But this algorithm is not the easiest way to understand why any sentence in propositional logic can be converted into CNF. It’s helpful to remember that two sentences are equivalent if and only if they agree in every model. In other words, imagine you have two sentences, and , composed of the symbols A, B, and C. You can say that if you can plug in any value for A, B, and C and get the same final result for both and .

Let’s take , and and test to see if they are equivalent by building a truth table.


Quite an exhausting process, but it works. We can see that . The important consequence here is this: if we can construct a sentence with the same truth table as , we can construct a sentence that is equivalent to .

So let’s think about how to construct a new sentence that will be equivalent to but also in CNF. Think of as a bunch of clauses linked together by conjunctions. So, whenever is false, we need to make sure that at least one of the clauses in is false – that’ll make sure that all of is false.

For every row in the truth table that ends in false, add a clause to that is a conjunction of each sentence symbol, but negate the sentence symbol if that symbol is “false” in the table. For , we have only one row in the truth table where the result is false. So we’ll have only one clause in . That clause will be a disjunction of:

So finally, we get , which is equivalent to and in CNF. If you don’t believe me, try a truth table or try WolframAlpha. Now we have an algorithm for taking a truth table to a new sentence that will be in CNF.

Let’s try another example. Let . We’ll write out a truth table for and then convert it to CNF.


Since is false in two rows, we’ll have to build clauses from both of them. For the seoncd row, we get the clause . From the third row, we get . Putting the clauses together gives us . That’s equivalent to , which you can verify with a truth table or with WolframAlpha.

Hopefully you’re now significantly convinced that, given any sentence in propositional logic, there’s an equivalent sentence in CNF. The next critical component is logical resolution.


Resolution is an interesting trick with some very useful properties. It can be stated as follows:

If you’ve never seen this notation before, it just means that if you are given the two sentences on top as true, then you can deduce that the sentence on the bottom is true as well. For resolution specifically, if you have two disjunctions with a complementary symbol (a symbol that is negated in one disjunction but not in the other), you can remove that symbol from both sentences and combine the two sentences with another disjunction.

It possible to prove that resolution is both sound (meaning that a deduction made by resolution will always be correct) and complete (meaning that any sentence that can be deduced can be deduced by resolution). The second property – completeness – is rather amazing. You might find it interesting to read through a proof that explains why.

So how do we take advantage of resolution to create a proof system? Notice that the inputs and the output of resolution are conjunctions – meaning a string of (possibly negated) symbols linked together by ORs. Since we can convert every statement in our knowledge base to CNF, we can separate out each of the disjunctions into different statements and combine them in different ways with resolution. Since resolution is sound, we know that any combination of two clauses in our knowledge base will be valid. Since resolution is complete, we know that any fact that can be inferred from our knowledge base can be inferred via resolution.

Now, given a knowledge base and a query , how do we find out if is true, given ? To be more specific, we want to know if is semantically entailed by . Written formally: . By the completeness theorem, we know that if and only if can be syntactically generated from using some sound method : .

Let’s call our resolution method . Since resolution is complete, we know that any semantically true sentence entailed by can be syntactically derived via . In other words, we know that:

And, since is sound, we know that any sentence derived via from will also be entailed by . In other words, we have:

Combining these two gives us:

At this point, you might think an acceptable algorithm would be to take your knowledge base and apply resolution over and over again until you either find all possible sentences or you find . The problem here is that all sentences of propositional logic can be stated in infinitely many finite ways (there are infinitely many ways to express any sentence using only finitely many symbols). You might think you could solve this problem by simply converting each step into CNF. The problem with that is that CNF representations are not unique. Example:

Even if you were to generate unique CNF statements by deriving the CNF from the truth table at each step, such an approach would require the proof system to build larger and larger clauses (until reaching ). Ideally, we want to make things smaller and smaller. So instead of searching for , we’ll add to the knowledge base and then search for false. If you think about it, this is equivalent to reductio ad absurdum, or proof by contradiction. If by assuming that our query is false we can produce a contradiction, then it must be the case that our query is true.

Let’s formalize that a little bit. Essentially, our statement is:

This is equivalent to the deduction theorem. A nice way to think of this is:

A statement can be proven from a knowledge base if and only if the negation of that statement combined with the knowledge base produces a contradiction. In other words, a statement is provable from a knowledge base only if the union of the knowledge base and the negation of the statement is unsatisfiable.

So, if we show that is unsatisfiable, we’ve shown that . If you aren’t convinced, here’s a proof. This gives us the following algorithm:

  1. Convert into CNF, and split up each sentence into clauses
  2. Assume
  3. Apply resolution to every pair of clauses until either…
    • (a) no more clauses can be derived, meaning that there is no proof of from . If there were a proof, that would imply that there existed some such that , but since we know resolution is complete and sound, it must be the case that such an is not sound.
    • (b) we derive an “empty clause”, or false. In other words, we find a contradiction. The existence of a contradiction is enough to prove that is unsatisfiable, since it proves that you’ll always get false no matter what model you use. You’ve proven by contradiction.

The correctness of this algorithm has some interesting consequences. For example, try a knowledge base of and . That’s a contradiction. Then, ask the system to prove , a symbol we know absolutely nothing about. The system will resolve and to false, and suggest that a contradiction has been reached from the knowledge base . So have we proven from ?

Well, it turns out that we have! In fact, any conclusion follows from a contradiction. This is called the principle of explosion, which can be stated as:

Think of it this way. Consider a true statement like “the sky is blue.” We’ll call that . Consider another statement, “the Easter Bunny is real.” We’ll call that . We know that the statement is true because is true. However, let’s say for some reason we knew that the sky was blue and not blue, in other words, we know that was somehow true. Since we know is true, and we know is true, we can use resolution to deduce :

So we’ve shown that . Since resolution () is sound, we know that . This isn’t so ridiculous when we say it out loud:

If the sky isn’t blue, the Easter Bunny is real.

So it’s a good thing that our system finds that is entailed by . If it didn’t, it wouldn’t be complete!

You might’ve noticed that the line numbers of each statement in the generated proofs aren’t sequential. That’s because the proof is generated via resolution, and only the relevant clauses are displayed at the end. Since what we’re trying to prove is that is unsatisfiable, we’re essentially solving a limited case of the boolean satisfiability problem, which is NP-complete. That means there could be quite a few steps! If you put a lot of data into the knowledge base and ask a tough question, it might take your browser quite a while to come up with an answer!

One more interesting tidbit: the initial state of the demo shows how modus ponens is really just a special case of resolution. Since we know that resolution is sound and complete, all sound logical inference rules are a special case of resolution. We also know that resolution is a subset of any complete inference rule, and we know that resolution is equivalent to any sound and complete inference rule. It’s reassuring to know that truth works the same way no matter what.