Recently, I discovered this neat little algorithm called “self-organizing maps” that can be used to create a low-dimensional “map” (as in cartography) of high-dimensional data.
The algorithm is very simple. Say you have a set of high-dimensional vectors and you want to represent them in an image, such that each vector is associated with a pixel of that image, and similar vectors end up at nearby pixels.
As a first step, we associate a random vector with each pixel in our map. Then we go through the input vectors one at a time and, for each input, find the “best matching unit” (BMU), which is the vector in our map that is closest to that input.
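To make these two steps concrete, here is a minimal sketch in plain Python. The helper names (make_map, squared_distance, find_bmu) and the choice of squared Euclidean distance as the measure of "closest" are my own assumptions for illustration, not part of any library:

```python
import random

def make_map(width, height, dim, rng):
    """Build a height x width grid holding one random vector per pixel.

    Each vector has `dim` components drawn uniformly from [0, 1).
    """
    return [[[rng.random() for _ in range(dim)] for _ in range(width)]
            for _ in range(height)]

def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric; others would work too)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_bmu(som, vector):
    """Return the (row, col) of the map vector closest to `vector`."""
    best_pos, best_dist = None, float("inf")
    for row, line in enumerate(som):
        for col, unit in enumerate(line):
            dist = squared_distance(unit, vector)
            if dist < best_dist:
                best_pos, best_dist = (row, col), dist
    return best_pos

rng = random.Random(0)  # fixed seed so runs are repeatable
som = make_map(width=8, height=6, dim=3, rng=rng)
print(find_bmu(som, [1.0, 0.0, 0.0]))  # grid position of the unit closest to pure red
```

This scans every pixel for every input, which is slow but keeps the idea easy to see.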
Now we update the BMU, and the vectors that are close to it on the map, to be more like the input vector. How strongly we shift each vector can depend on its distance from the BMU on the map, and with each new input, the sphere of influence should shrink.
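One possible way to write this update is sketched below. The Gaussian falloff, the learning_rate parameter, and the exponential radius schedule are assumptions I picked for illustration; the description above only requires that the shift gets weaker with distance and that the radius shrinks over time. The function names update_map and decayed_radius are hypothetical as well:

```python
import math

def update_map(som, vector, bmu, radius, learning_rate):
    """Pull every map vector toward `vector`, in place.

    The pull is weighted by the grid distance between the pixel and the BMU.
    """
    bmu_row, bmu_col = bmu
    for row, line in enumerate(som):
        for col, unit in enumerate(line):
            grid_dist_sq = (row - bmu_row) ** 2 + (col - bmu_col) ** 2
            # Gaussian falloff (assumed): 1 at the BMU, approaching 0 far away.
            influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
            step = learning_rate * influence
            for i, target in enumerate(vector):
                unit[i] += step * (target - unit[i])

def decayed_radius(initial_radius, step, total_steps):
    """Shrink the radius exponentially from `initial_radius` down to 1.

    Assumed schedule; requires initial_radius > 0.
    """
    return initial_radius * math.exp(-step * math.log(initial_radius) / total_steps)

# A 5x5 map of black pixels, updated once with an orange-ish input.
som = [[[0.0, 0.0, 0.0] for _ in range(5)] for _ in range(5)]
update_map(som, [1.0, 0.5, 0.0], bmu=(2, 2), radius=1.5, learning_rate=0.5)
print(som[2][2])  # the BMU itself: moved halfway toward the input
print(som[0][0])  # a corner pixel: moved much less
```

In a full training loop you would call decayed_radius once per input and pass the result as radius, so that early inputs reshape large regions of the map and later ones only fine-tune small neighborhoods.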
For example, we can take the RGB vectors of 1000 random colors as inputs, and create a 2D color map:
Here is the simple (non-optimized) Python source code that generated the image above: