The interactive illustrations that follow can be embedded in an ebook via the new EPUB3 format.
1. A Matter of Perspective
In the figure below, the blue dots and red dots represent members of two different classes, plotted
in two dimensions according to two features.
The dots could signify medical patients in a disease study, where the red dots represent diseased
patients and blue dots denote healthy patients, plotted according to the features of white blood
cell count (horizontal axis) and temperature (vertical axis).
Alternatively, the dots could represent hit songs in two different genres (blue and red), plotted
against the features of tempo (horizontal) and mean volume (vertical).
Let's say you want to create a model to predict which class a new (unknown) data-dot should belong to.
This act of classification will in effect create a "dividing line" between the two sets of data.
Since the dots are a bit mixed together, any boundary you draw to separate them is likely to
have some error, i.e., some red dots will get trapped on the "blue side" of the boundary, and
a few blue dots might get stuck on the "red side."
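As a rough sketch of this kind of error, the snippet below counts the dots that land on the wrong side of a straight boundary. The two overlapping Gaussian clouds, the sample size, and the choice of the vertical line x = 0 are all assumptions made for illustration, not the data behind the figure.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

# Hypothetical data: two overlapping clouds of dots, one per class.
n = 200
blue = rng.normal(loc=[-1.0, 0.0], scale=1.0, size=(n, 2))
red = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(n, 2))

# Candidate boundary: the vertical line x = 0.
# Everything to its left is the "blue side", everything to its right the "red side".
red_on_blue_side = int(np.sum(red[:, 0] <= 0.0))
blue_on_red_side = int(np.sum(blue[:, 0] > 0.0))
total_errors = red_on_blue_side + blue_on_red_side

print(f"red dots trapped on the blue side: {red_on_blue_side}")
print(f"blue dots stuck on the red side:   {blue_on_red_side}")
print(f"error rate: {total_errors / (2 * n):.1%}")
```

Moving or tilting the line trades one kind of mistake for the other, but with clouds that overlap this much, no straight line removes both.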
So far we've been looking at a two-dimensional representation using two features of the dataset.
What if there were a third feature -- a third dimension? With your mouse, click and drag in the above image...
Thus we see that in this case, using an additional dimension, some other feature, allows for
these two groups to be clearly separated.
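The same idea can be sketched numerically. In the snippet below, the first two features overlap between the classes, while a hypothetical third feature (invented here, and deliberately constructed so that it differs between the classes) lets a flat plane split them. All the numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the sketch is repeatable
n = 200

# First two features: overlapping clouds, as in the two-dimensional view.
blue_xy = rng.normal(loc=[-1.0, 0.0], scale=1.0, size=(n, 2))
red_xy = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(n, 2))

# Hypothetical third feature that happens to differ between the classes.
blue_z = rng.uniform(-3.0, -1.0, size=n)
red_z = rng.uniform(1.0, 3.0, size=n)

points = np.vstack([np.column_stack([blue_xy, blue_z]),
                    np.column_stack([red_xy, red_z])])
labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])  # 0 = blue, 1 = red

# Boundary using only the first feature (the line x = 0 in the 2D view).
errors_2d = int(np.sum((points[:, 0] > 0.0).astype(int) != labels))

# Boundary using the third feature (the plane z = 0 in the 3D view).
errors_3d = int(np.sum((points[:, 2] > 0.0).astype(int) != labels))

print(f"misclassified with the line x = 0:  {errors_2d}")
print(f"misclassified with the plane z = 0: {errors_3d}")
```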
2. Stretching Your Perception
We may not always have an extra feature to add as a dimension, but sometimes we can create one
by warping the data according to some function of the features already present. In the following
image, you cannot use a straight line (or plane) to divide the classes no matter how you rotate
the image. Go ahead and try!
Instead, you can create a third dimension by warping the data according to each dot's radius
from the center. The slider at the bottom of the image shows a time animation of this warping.
When you do this (and rotate the image), the blue and red dots can be
clearly separated with a straight line (or plane).
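A minimal sketch of this radius warp, assuming the blue dots sit in an inner disk and the red dots in a surrounding ring (the specific radii, sample size, and plane height are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: blue dots in an inner disk, red dots in a surrounding ring.
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radius = np.concatenate([rng.uniform(0.0, 1.0, n),    # blue: radius below 1
                         rng.uniform(1.5, 2.5, n)])   # red: radius between 1.5 and 2.5
x = radius * np.cos(theta)
y = radius * np.sin(theta)
labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])  # 0 = blue, 1 = red

# The "warp": lift each dot into a third dimension equal to its distance from the center.
z = np.sqrt(x**2 + y**2)

# In (x, y, z) space, a horizontal plane between the two groups divides the classes.
plane_height = 1.25
predicted = (z > plane_height).astype(int)
accuracy = float(np.mean(predicted == labels))
print(f"accuracy using the plane z = {plane_height}: {accuracy:.2f}")
```

No straight line in the original (x, y) plane can do this, because the red ring surrounds the blue disk on every side; the new coordinate z is what makes a flat boundary possible.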
3. Adaptive Feature Re-mapping: Neural Networks
In the preceding example, we engineered a new feature which was made out of a combination of
existing features, namely the horizontal and vertical coordinates.
We used this new feature, the radius, to help distinguish the two classes.
In general, even a two-dimensional grouping of data can take on a lot of shapes, such as those
shown below:
[TODO: make a figure showing various data configurations or shapes]
While we may be able to figure out the right function of features to use to reshape the data in order
to draw a boundary, this may also be difficult to do -- especially if there are more than just
two features. Humans can only visualize three dimensions, so what happens when your dataset has
one thousand or one million features? How do you construct the right set of transformations needed
to clearly separate the data into classes?
Typically you abandon trying to specify the correct transformation "by hand," and instead use
an adaptive "machine learning" (ML) method, which can learn from the data. One powerful ML method
seeing increasing use is that of neural networks. For now, we'll forgo any formalisms associated
with neural networks,
and just note that they are helpful because they can find a set of (nonlinear) combinations of
the input features, and transformations applied to these combinations, in order to render the
data points as clearly distinct.
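Without going into the formalism, here is a hedged sketch of what "learning the transformation" can look like: a tiny network with one hidden layer, written in plain NumPy and trained by gradient descent on hypothetical disk-and-ring data. The layer size, learning rate, step count, and the data itself are all assumptions chosen for illustration; this is not the simulation shown in the figures.

```python
import numpy as np

rng = np.random.default_rng(3)  # fixed seed so the sketch is repeatable

# Hypothetical data: blue dots (label 0) in a disk, red dots (label 1) in a ring around it.
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radius = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(1.5, 2.5, n)])
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
y = np.concatenate([np.zeros(n), np.ones(n)]).reshape(-1, 1)

# One hidden layer: 16 nonlinear combinations of the 2 input features.
hidden = 16
W1 = rng.normal(0.0, 1.0, size=(2, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
b2 = np.zeros(1)

def forward(inputs):
    """Return the hidden features and the predicted probability of 'red'."""
    h = np.tanh(inputs @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, p

def cross_entropy(p, targets):
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    return float(-np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))

_, p = forward(X)
initial_loss = cross_entropy(p, y)

# Full-batch gradient descent on the cross-entropy loss.
learning_rate = 0.5
for step in range(3000):
    h, p = forward(X)
    dz2 = (p - y) / len(X)          # gradient at the output pre-activation
    dh = dz2 @ W2.T                 # gradient flowing back into the hidden features
    dz1 = dh * (1.0 - h**2)         # derivative of tanh
    W2 -= learning_rate * (h.T @ dz2)
    b2 -= learning_rate * dz2.sum(axis=0)
    W1 -= learning_rate * (X.T @ dz1)
    b1 -= learning_rate * dz1.sum(axis=0)

_, p = forward(X)
final_loss = cross_entropy(p, y)
accuracy = float(np.mean((p > 0.5) == (y > 0.5)))
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}, training accuracy: {accuracy:.2f}")
```

The hidden layer plays the role that the hand-built radius feature played earlier: the network is never told about the radius, and instead adjusts its own combinations of the inputs until the final, linear step can tell the classes apart.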
An early CL paper (with Yann LeCun as a co-author) likened CL to a set of springs
that pull like dots together and push dots of dissimilar classes apart -- unless the
dissimilar ones are already farther apart than some margin.
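The spring picture can be written down as a per-pair penalty. The sketch below assumes that "CL" refers to a margin-based contrastive loss (the text does not expand the abbreviation, so this is an assumption), and the function name `contrastive_pair_loss` and the default margin are hypothetical choices for illustration.

```python
import numpy as np

def contrastive_pair_loss(a, b, same_class, margin=1.0):
    """Spring-like penalty for one pair of points (hypothetical helper).

    Like pairs are pulled together: the penalty grows with their distance.
    Unlike pairs are pushed apart, but only while they are closer than `margin`.
    """
    distance = float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))
    if same_class:
        return 0.5 * distance**2
    return 0.5 * max(0.0, margin - distance) ** 2

# Two like dots that are far apart: the attracting "spring" is stretched.
pull = contrastive_pair_loss([0.0, 0.0], [2.0, 0.0], same_class=True)

# Two unlike dots that are too close: the repelling "spring" is compressed.
push = contrastive_pair_loss([0.0, 0.0], [0.5, 0.0], same_class=False)

# Two unlike dots beyond the margin: no force at all.
relaxed = contrastive_pair_loss([0.0, 0.0], [3.0, 0.0], same_class=False)

print(pull, push, relaxed)
```

The margin is what keeps the repulsion local: once unlike dots are far enough apart, they stop contributing to the loss, so the model spends its effort on the pairs that are still confusable.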
Caution: This is only a cartoon simulation, not a full neural network.