 Ryan Marcus, future assistant professor at the University of Pennsylvania (Fall '23). Using machine learning to build the next generation of data systems.
```
____                       __  ___
/ __ \__  ______ _____     /  |/  /___ _____________  _______
/ /_/ / / / / __ `/ __ \   / /|_/ / __ `/ ___/ ___/ / / / ___/
/ _, _/ /_/ / /_/ / / / /  / /  / / /_/ / /  / /__/ /_/ (__  )
/_/ |_|\__, /\__,_/_/ /_/  /_/  /_/\__,_/_/   \___/\__,_/____/
/____/
```
```   ___                   __  ___
/ _ \__ _____ ____    /  |/  /__ ___________ _____
/ , _/ // / _ `/ _ \  / /|_/ / _ `/ __/ __/ // (_-<
/_/|_|\_, /\_,_/_//_/ /_/  /_/\_,_/_/  \__/\_,_/___/
/___/
```
```   ___  __  ___
/ _ \/  |/  /__ ___________ _____
/ , _/ /|_/ / _ `/ __/ __/ // (_-<
/_/|_/_/  /_/\_,_/_/  \__/\_,_/___/
```

# The Weierstrass Transform

Statistics and machine learning techniques for data smoothing have been around for a long time. The idea is to take a noisy function or noisy data and build a function that approximates the important patterns of the data, but omits the noise. Generally, the smoothed function that results is easier to analyze, sometimes because it is differentiable or integrable.

Below is a demo of one such data smoothing technique, the Weierstrass transform. Click to add or remove points, and move the slider to adjust the energy of the fit.

t =

The Weierstrass transform is fairly simple, although its notation does not make this obvious:

Here, is the smoothed function, is the original function, and is a variable controlling the closeness of the smoothed function.

Intuitively, each point of the smoothed function is a weighted average of all the points in the original function. The weight is determined by a Gaussian distribution. When evaluating the smoothed function at point , points in the original function that are close to get a higher weight, and points that are farther away from get lower weights.

One of the most interesting things about this transform is the large number of names it has been given in different fields:

• In image processing, it is called a Guassian blur, normally described using the phrase “convolution with a Gaussian.”
• In signal processing, it is thought of a simple kind of low-pass filter. In this context, the variable can be interpreted as the size of the pass.
• In physics, it is viewed as a solution to the heat equation, where the constant is used to represent the time that has elapsed since heat diffusion began.
• In statistics, it can be thought of (in a somewhat round-about way) as kernel density estimation.

My implementation is written in JavaScript, and can be found on GitHub. I’m not going to bother publishing such a trivial package to NPM, but if you’d like to use it in your own code you can still install it:

``````npm install --save https://github.com/RyanMarcus/weierstrass.git
``````

Happy smoothing!