The personal website and blog of Ryan Marcus, a graduate student at Brandeis University. Previously worked at HPE Vertica and Los Alamos National Laboratory.
      
    ____                       __  ___                          
   / __ \__  ______ _____     /  |/  /___ _____________  _______
  / /_/ / / / / __ `/ __ \   / /|_/ / __ `/ ___/ ___/ / / / ___/
 / _, _/ /_/ / /_/ / / / /  / /  / / /_/ / /  / /__/ /_/ (__  ) 
/_/ |_|\__, /\__,_/_/ /_/  /_/  /_/\__,_/_/   \___/\__,_/____/  
      /____/                                                    
        

The Weierstrass Transform

Statistics and machine learning techniques for data smoothing have been around for a long time. The idea is to take a noisy function or noisy data and build a function that approximates the important patterns of the data, but omits the noise. Generally, the smoothed function that results is easier to analyze, sometimes because it is differentiable or integrable.

Below is a demo of one such data smoothing technique, the Weierstrass transform. Click to add or remove points, and move the slider to adjust the energy of the fit.


The Weierstrass transform is fairly simple, although its notation does not make this obvious:

F(x) = 1/√(4πt) ∫ f(y) e^(-(x - y)² / 4t) dy

(where the integral is taken over all real y). Here, F is the smoothed function, f is the original function, and t is a variable controlling the closeness of the smoothed function.

Intuitively, each point of the smoothed function is a weighted average of all the points in the original function. The weight is determined by a Gaussian distribution. When evaluating the smoothed function at point x, points in the original function that are close to x get a higher weight, and points that are farther away from x get lower weights.
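
To make the weighted-average intuition concrete, here is a minimal sketch of the discrete version in JavaScript. The function name and signature are my own, for illustration; this is not the API of the weierstrass package.

// Discrete Weierstrass transform: smooth noisy samples (xs[i], ys[i]).
// Each output point is a Gaussian-weighted average of every input point.
function weierstrassSmooth(xs, ys, t) {
    return xs.map((x) => {
        let total = 0;
        let weightSum = 0;
        for (let i = 0; i < xs.length; i++) {
            // Gaussian weight e^(-(x - y)^2 / 4t): nearby points dominate.
            const w = Math.exp(-Math.pow(x - xs[i], 2) / (4 * t));
            total += w * ys[i];
            weightSum += w;
        }
        return total / weightSum; // normalize so the weights sum to one
    });
}

// Small t hugs the data; large t smooths aggressively.
console.log(weierstrassSmooth([0, 1, 2, 3, 4], [0, 2, 1, 3, 2], 0.5));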

One of the most interesting things about this transform is the large number of names it has been given in different fields:

  • In image processing, it is called a Gaussian blur, normally described using the phrase “convolution with a Gaussian.”
  • In signal processing, it is thought of as a simple kind of low-pass filter. In this context, the variable t can be interpreted as controlling the size of the passband.
  • In physics, it is viewed as a solution to the heat equation, where the constant t is used to represent the time that has elapsed since heat diffusion began.
  • In statistics, it can be thought of (in a somewhat round-about way) as kernel density estimation.

My implementation is written in JavaScript, and can be found on GitHub. I’m not going to bother publishing such a trivial package to NPM, but if you’d like to use it in your own code you can still install it:

npm install --save https://github.com/RyanMarcus/weierstrass.git

Happy smoothing!

Vulcan: A JavaScript Automated Proof System

Mathematicians have been trying to figure out how to get computers to write proofs for a long time. One of the earliest approaches (dating back to the 1960s) was based on a logical rule called resolution.

I created Vulcan, an NPM package that implements a resolution-based automated proof system. Below is an in-browser demo I created with AngularJS and Browserify. You can use symbols A-Z as variables that can be either true or false, and any of the following operations: -> (implication), <-> (equivalence), ! (negation), & (conjunction), and | (disjunction).

Enter some information into the knowledge base, enter a query, and click “prove” to see a proof (if one exists).

The demo shows a knowledge base (a numbered list of facts), a query field, and a generated proof table listing a step number, sentence, and action for each line. When a proof is found, it ends with the query itself, derived by resolutional inference, along with this explanation: by assuming the negation of the query, we are able to derive a contradiction. Therefore, the knowledge base combined with the negation of the query is unsatisfiable, so the query follows from the axioms of the knowledge base. When no proof is found, the demo reports that no application of resolution could find a contradiction with the query; because resolution is sound and complete, we know that the query is not a theorem of the knowledge base.

This certainly isn’t the first JavaScript implementation of a theorem prover, nor is it even the most powerful. It does, however, demonstrate the ability of a very simple system to come up with (in exponential time) a proof of any provable statement in propositional calculus.

But how does it work? Once again, you can find an excellent introduction in Stuart Russell and Peter Norvig’s book. But understanding the system really only requires two fundamental ideas about symbolic logic:

  1. That any expression in propositional logic can be written in conjunctive normal form
  2. That logical resolution is sound and complete.

Conjunctive normal form

The first point is the simplest. We say that a sentence of propositional logic is in conjunctive normal form if it is a series of other sentences combined via conjunction, such that each of the combined sentences is a disjunction of sentence symbols or negations of sentence symbols. In other words, a sentence is in conjunctive normal form if it is an AND of ORs. For example,

(A | B) & (!B | C)

… is in conjunctive normal form, as is:

(A | !B | C) & D

However, this sentence is not in conjunctive normal form:

(A & B) | C

Nor is this one:

A -> B

The fact that A -> B is not in CNF (conjunctive normal form) seems problematic. Surely, any sufficiently powerful proof system should be able to handle implications. However, we can transform this expression into one that is in CNF using a simple trick: we replace the implication with !A | B.

Our new expression, !A | B, is in CNF. In fact, there is a sequence of logical rules that can be applied to any expression to convert it into CNF. Norvig and Russell give an excellent description that can be found in other places on the Internet, and I’ll produce my own explanation here.

  1. Remove bijections: A <-> B becomes (A -> B) & (B -> A).
  2. Replace any implications: A -> B becomes !A | B.
  3. Move nots inwards via De Morgan’s laws:
    • Not over and: !(A & B) becomes !A | !B.
    • Not over or: !(A | B) becomes !A & !B.
  4. Eliminate double negation: !!A becomes A.
  5. Distribute ors over ands: A | (B & C) becomes (A | B) & (A | C).
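
For instance, here is the algorithm run on A <-> (B & C), an example of my own choosing (step 4 never fires here because no double negations appear):

A <-> (B & C)
(A -> (B & C)) & ((B & C) -> A)        [step 1: remove the bijection]
(!A | (B & C)) & (!(B & C) | A)        [step 2: replace the implications]
(!A | (B & C)) & (!B | !C | A)         [step 3: De Morgan]
(!A | B) & (!A | C) & (!B | !C | A)    [step 5: distribute ors over ands]

The final line is an AND of ORs – conjunctive normal form.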

If you don’t believe that this simple algorithm will convert any sentence into CNF, try out a few examples. But this algorithm is not the easiest way to understand why any sentence in propositional logic can be converted into CNF. It’s helpful to remember that two sentences are equivalent if and only if they agree in every model. In other words, imagine you have two sentences, S1 and S2, composed of the symbols A, B, and C. You can say that S1 ≡ S2 if you can plug in any value for A, B, and C and get the same final result for both S1 and S2.

Let’s take S1 = A -> (B | C) and S2 = !A | B | C, and test to see if they are equivalent by building a truth table.

A  B  C  S1  S2
T  T  T  T   T
T  T  F  T   T
T  F  T  T   T
T  F  F  F   F
F  T  T  T   T
F  T  F  T   T
F  F  T  T   T
F  F  F  T   T

Quite an exhausting process, but it works. We can see that S1 ≡ S2. The important consequence here is this: if we can construct a sentence with the same truth table as S1, we can construct a sentence that is equivalent to S1.

So let’s think about how to construct a new sentence S' that will be equivalent to S1 but also in CNF. Think of S' as a bunch of clauses linked together by conjunctions. So, whenever S1 is false, we need to make sure that at least one of the clauses in S' is false – that’ll make sure that all of S' is false.

For every row in the truth table that ends in false, add a clause to S' that is a disjunction of each sentence symbol, negating the sentence symbol if that symbol is “true” in the table – that way, the clause is false exactly on that row. For S1, we have only one row in the truth table where the result is false. So we’ll have only one clause in S'. That clause will be a disjunction of:

  • the negation of A, because A is true in that row of the truth table
  • B, because B is false in that row
  • C, because C is false in that row.

So finally, we get !A | B | C, which is equivalent to S1 and in CNF. If you don’t believe me, try a truth table or try WolframAlpha. Now we have an algorithm for taking a truth table to a new sentence that will be in CNF.
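
Here is a minimal sketch of that algorithm in JavaScript, assuming the sentence is supplied as a plain function over its symbols (my own representation, not Vulcan’s internals):

// Build a CNF sentence equivalent to `fn` by enumerating its truth table.
// `fn` takes an array of booleans (one per symbol) and returns a boolean.
function truthTableToCnf(symbols, fn) {
    const clauses = [];
    for (let row = 0; row < (1 << symbols.length); row++) {
        // One row of the truth table: bit i of `row` is the value of symbol i.
        const values = symbols.map((_, i) => Boolean(row & (1 << i)));
        if (!fn(values)) {
            // For each false row, emit a clause that is false exactly on that
            // row: negate the symbols that are true, keep the false ones as-is.
            const clause = symbols
                .map((s, i) => (values[i] ? "!" + s : s))
                .join(" | ");
            clauses.push("(" + clause + ")");
        }
    }
    return clauses.join(" & ");
}

// A -> B is false only when A is true and B is false, giving (!A | B).
console.log(truthTableToCnf(["A", "B"], ([a, b]) => !a || b));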

Let’s try another example. Let S = A <-> B. We’ll write out a truth table for S and then convert it to CNF.

A  B  S
T  T  T
T  F  F
F  T  F
F  F  T

Since S is false in two rows, we’ll have to build clauses from both of them. For the second row, we get the clause !A | B. From the third row, we get A | !B. Putting the clauses together gives us (!A | B) & (A | !B). That’s equivalent to A <-> B, which you can verify with a truth table or with WolframAlpha.

Hopefully you’re now convinced that, given any sentence in propositional logic, there’s an equivalent sentence in CNF. The next critical component is logical resolution.

Resolution

Resolution is an interesting trick with some very useful properties. It can be stated as follows:

A | B    !B | C
---------------
     A | C

If you’ve never seen this notation before, it just means that if you are given the two sentences on top as true, then you can deduce that the sentence on the bottom is true as well. For resolution specifically, if you have two disjunctions with a complementary symbol (a symbol that is negated in one disjunction but not in the other), you can remove that symbol from both sentences and combine the two sentences with another disjunction.

It is possible to prove that resolution is both sound (meaning that a deduction made by resolution will always be correct) and complete (meaning that any sentence that can be deduced can be deduced by resolution). The second property – completeness – is rather amazing. You might find it interesting to read through a proof that explains why.

So how do we take advantage of resolution to create a proof system? Notice that the inputs and the output of resolution are disjunctions – meaning a string of (possibly negated) symbols linked together by ORs. Since we can convert every statement in our knowledge base to CNF, we can separate out each of the disjunctions into different statements and combine them in different ways with resolution. Since resolution is sound, we know that any clause derived from two clauses in our knowledge base will be entailed by the knowledge base. Since resolution is complete, we know that any fact that can be inferred from our knowledge base can be inferred via resolution.
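
Here is a minimal sketch of a single resolution step in JavaScript, with clauses represented as arrays of literal strings like "A" or "!A" (again my own representation, not Vulcan’s):

// Resolve two clauses. Returns the resolvent clause, or null if the
// clauses contain no complementary pair of literals.
function resolve(c1, c2) {
    const negate = (lit) => (lit.startsWith("!") ? lit.slice(1) : "!" + lit);
    for (const lit of c1) {
        if (c2.includes(negate(lit))) {
            // Drop the complementary pair and union the remaining literals.
            const rest = c1.filter((l) => l !== lit)
                .concat(c2.filter((l) => l !== negate(lit)));
            return Array.from(new Set(rest));
        }
    }
    return null;
}

// (A | B) and (!B | C) resolve to (A | C).
console.log(resolve(["A", "B"], ["!B", "C"])); // [ 'A', 'C' ]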

Now, given a knowledge base KB and a query q, how do we find out if q is true, given KB? To be more specific, we want to know if q is semantically entailed by KB. Written formally: KB ⊨ q. By the completeness theorem, we know that KB ⊨ q if and only if q can be syntactically generated from KB using some sound method m: KB ⊢_m q.

Let’s call our resolution method r. Since resolution is complete, we know that any semantically true sentence entailed by KB can be syntactically derived via r. In other words, we know that:

KB ⊨ q implies KB ⊢_r q

And, since r is sound, we know that any sentence derived via r from KB will also be entailed by KB. In other words, we have:

KB ⊢_r q implies KB ⊨ q

Combining these two gives us:

KB ⊨ q if and only if KB ⊢_r q
At this point, you might think an acceptable algorithm would be to take your knowledge base and apply resolution over and over again until you either find all possible sentences or you find q. The problem here is that all sentences of propositional logic can be stated in infinitely many finite ways (there are infinitely many ways to express any sentence using only finitely many symbols). You might think you could solve this problem by simply converting each step into CNF. The problem with that is that CNF representations are not unique. Example: A and A & (A | B) are both in CNF, and both represent the same sentence.

Even if you were to generate unique CNF statements by deriving the CNF from the truth table at each step, such an approach would require the proof system to build larger and larger clauses (until reaching q). Ideally, we want to make things smaller and smaller. So instead of searching for q, we’ll add !q to the knowledge base and then search for false. If you think about it, this is equivalent to reductio ad absurdum, or proof by contradiction. If by assuming that our query is false we can produce a contradiction, then it must be the case that our query is true.

Let’s formalize that a little bit. Essentially, our statement is:

KB ⊨ q if and only if KB ∪ {!q} is unsatisfiable

This is equivalent to the deduction theorem. A nice way to think of this is:

A statement can be proven from a knowledge base if and only if the negation of that statement combined with the knowledge base produces a contradiction. In other words, a statement is provable from a knowledge base only if the union of the knowledge base and the negation of the statement is unsatisfiable.

So, if we show that KB ∪ {!q} is unsatisfiable, we’ve shown that KB ⊨ q. If you aren’t convinced, here’s a proof. This gives us the following algorithm:

  1. Convert KB into CNF, and split up each sentence into clauses
  2. Assume !q
  3. Apply resolution to every pair of clauses until either…
    • (a) no more clauses can be derived, meaning that there is no proof of q from KB. If there were a proof, that would imply that there existed some method m such that KB ⊢_m q, but since we know resolution is complete and sound, it must be the case that such an m is not sound.
    • (b) we derive an “empty clause”, or false. In other words, we find a contradiction. The existence of a contradiction is enough to prove that KB ∪ {!q} is unsatisfiable, since it proves that you’ll always get false no matter what model you use. You’ve proven q by contradiction.
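
Sketched in JavaScript on top of the resolve function from the earlier sketch (a simplified saturation loop under my own clause representation, not Vulcan’s actual implementation):

// Returns true if the KB clauses, together with the clauses of !q,
// derive the empty clause – i.e., if KB ∪ {!q} is unsatisfiable.
function proves(kbClauses, negatedQueryClauses) {
    const worklist = kbClauses.concat(negatedQueryClauses);
    const seen = new Set(worklist.map((c) => JSON.stringify(c.slice().sort())));

    for (let i = 0; i < worklist.length; i++) {
        for (let j = 0; j < i; j++) {
            const r = resolve(worklist[i], worklist[j]);
            if (r === null) continue;
            if (r.length === 0) return true; // empty clause: contradiction found
            const key = JSON.stringify(r.slice().sort());
            if (!seen.has(key)) {
                seen.add(key);
                worklist.push(r); // new clause: keep resolving against it
            }
        }
    }
    return false; // fixpoint reached without a contradiction: no proof
}

// KB = { A, A -> B } as clauses; query B, so we assume !B.
console.log(proves([["A"], ["!A", "B"]], [["!B"]])); // true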

The correctness of this algorithm has some interesting consequences. For example, try a knowledge base of A and !A. That’s a contradiction. Then, ask the system to prove B, a symbol we know absolutely nothing about. The system will resolve A and !A to false, and suggest that a contradiction has been reached from the knowledge base A & !A. So have we proven B from A & !A?

Well, it turns out that we have! In fact, any conclusion follows from a contradiction. This is called the principle of explosion, which can be stated as:

(A & !A) ⊨ B, for any sentences A and B

Think of it this way. Consider a true statement like “the sky is blue.” We’ll call that A. Consider another statement, “the Easter Bunny is real.” We’ll call that B. We know that the statement A | B is true because A is true. However, let’s say for some reason we knew that the sky was blue and not blue, in other words, we knew that A & !A was somehow true. Since we know A | B is true, and we know !A is true, we can use resolution to deduce B:

A | B    !A
-----------
     B

So we’ve shown that A & !A ⊢_r B. Since resolution (r) is sound, we know that A & !A ⊨ B. This isn’t so ridiculous when we say it out loud:

If the sky isn’t blue, the Easter Bunny is real.

So it’s a good thing that our system finds that B is entailed by A & !A. If it didn’t, it wouldn’t be complete!

You might’ve noticed that the line numbers of each statement in the generated proofs aren’t sequential. That’s because the proof is generated via resolution, and only the relevant clauses are displayed at the end. Since what we’re trying to prove is that KB ∪ {!q} is unsatisfiable, we’re essentially solving a limited case of the boolean satisfiability problem, which is NP-complete. That means there could be quite a few steps! If you put a lot of data into the knowledge base and ask a tough question, it might take your browser quite a while to come up with an answer!

One more interesting tidbit: the initial state of the demo shows how modus ponens is really just a special case of resolution. Since we know that resolution is sound and complete, all sound logical inference rules are a special case of resolution. We also know that resolution is a subset of any complete inference rule, and we know that resolution is equivalent to any sound and complete inference rule. It’s reassuring to know that truth works the same way no matter what.
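
To spell that out: written in CNF, A -> B becomes !A | B, and modus ponens (from A and A -> B, conclude B) is a single resolution step:

A    !A | B
-----------
     B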

A JavaScript Connect Four AI

Might I interest you in a rousing game of Connect Four? You and the computer take turns dropping discs into one of the seven columns below. The goal is to get four pieces of your color in a row (vertically, diagonally, or horizontally).


That’s an AngularJS-powered Connect Four AI that was originally written in C and then compiled to JavaScript using Emscripten. It runs the actual AI in a web worker to prevent your browser from locking up.

It was implemented using techniques from Stuart Russell and Peter Norvig’s excellent book, Artificial Intelligence: A Modern Approach (3rd Edition). These techniques might not create the best Connect Four AI possible (in fact, Victor Allis’ thesis shows how to create a provably-unbeatable AI), but it can still do pretty well! The code (in C, Python, and JavaScript) is available on GitHub.

The main idea behind the AI is adversarial search. Think of all the possible games of Connect Four represented as a tree. The root of the tree is the empty board. The second tier of the tree represents all the possible moves that can be made by the first player. The third tier of the tree represents all the possible moves that can be made by the second player.

A small subset of the tree (for a small game board) might look like this:

[Figure: a small subset of the game tree]

Imagine expanding the tree all the way down to the bottom, where the leaf nodes are “terminal” states – states where one player has won the game, or there is a draw. In order to make moves, the AI applies a minimax strategy. In other words, the AI assumes that its opponent will always play optimally, and thus tries to minimize the maximum gain of its opponent.

This tree is really big. The branching factor is approximately seven, and the depth of the tree could be as high as 42. So, while the perfect tree formula would tell us that there are 7^42 boards, some smart mathematicians have figured out that there are “only” around 4 trillion different boards. That’s still way too many to examine.

Thankfully, we can use a technique called alpha-beta pruning. The idea is pretty simple: do not bother searching branches that would either:

  • require our opponent to make a detrimental move (to get an outcome less than the outcome they could get if they played optimally)
  • or, require us to make a move that was less advantageous than another move we could make

Now, this is a hand-wavy oversimplification of how the process actually works. I suggest you read the chapters in Russell and Norvig, or fuss through the Wikipedia page, to get a better understanding.

In the average case, alpha-beta pruning reduces the number of nodes we need to evaluate from O(b^d) to roughly O(b^(3d/4)). This brings us from 4 trillion boards to around 2.7 billion boards. A pretty good drop – but still not enough for us to enumerate.

Next, we employ a heuristic function. Since we can’t reach the bottom of the tree, we’ll go down a certain depth and then make some kind of guess as to whether or not the board we are evaluating is a winning or losing board. I used a very simple heuristic: the number of possible four-in-a-rows. The computer would calculate how many ways it could win using alignments present in the current board, and then subtract the number of ways the computer’s opponent could win.

Now, the computer will exhaustively evaluate the tree to a certain depth (the number of moves to “think ahead”) and then use a heuristic to evaluate that board. Eventually, as the end of the game draws near, the computer finds winning or losing states within the prescribed depth, and plays perfectly. The hope is that the heuristic function can guide the AI to a winnable position. It seems to work pretty well – if you can beat it, try increasing the number of moves to think ahead. That will slow the AI down, but it’ll also make it exponentially more difficult to beat.

Oh – there’s one more complication. A lot of the boards that you encounter when going down the tree are duplicates (there is more than one way to get to a state). To avoid recomputing the desirability of these boards, we store them in a hash table.
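
Putting the pieces together, here is a minimal sketch of the search in JavaScript. The board helpers (legalMoves, drop, gameOver, heuristic, serialize) are hypothetical names standing in for the real C implementation on GitHub.

// Depth-limited minimax with alpha-beta pruning and a transposition table.
const cache = new Map(); // transposition table: board key -> computed score

function alphaBeta(board, depth, alpha, beta, maximizing) {
    const key = board.serialize() + "|" + depth + "|" + maximizing;
    if (cache.has(key)) return cache.get(key); // seen this position before

    if (depth === 0 || board.gameOver()) {
        // Heuristic at the cutoff: my possible four-in-a-rows minus yours.
        return board.heuristic();
    }

    let best = maximizing ? -Infinity : Infinity;
    for (const col of board.legalMoves()) {
        const score = alphaBeta(board.drop(col), depth - 1, alpha, beta, !maximizing);
        if (maximizing) {
            best = Math.max(best, score);
            alpha = Math.max(alpha, best);
        } else {
            best = Math.min(best, score);
            beta = Math.min(beta, best);
        }
        if (beta <= alpha) break; // prune: the opponent would never allow this line
    }
    cache.set(key, best);
    return best;
}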

It also looks like writing this type of AI is homework for AI classes at Cornell, Northeastern, and Purdue. I could only find one other JavaScript Connect Four AI, which moves substantially faster than mine, but is easily defeated (which is exactly the behavior one would expect).