7.4 kmeans demo continued

20211016 The demo continues by running the k-means algorithm and creating a movie that shows the centroids moving at each step of the algorithm, until the “best” fit is obtained.

=============================
K-Means Algorithm - The Movie
=============================

The algorithm iteratively applies the 2 steps:

 1. Assign each data point to its nearest centroid;

 2. Update each centroid's position to the mean of the points in its cluster.

The algorithm stops when repeating the two steps above would no
longer improve the fit, that is, when the centroids stop moving.

The algorithm moves the centroids at each step according to a measure
that aims to minimise the within-cluster sum of squares (squared
distances) and, equivalently, maximise the between-cluster sum of squares.

After moving the centroids the points are recoloured according to their
nearest centroid. The process stops when no improvement can be made to
the measure.

A movie is generated to show the change to the centroids at each step.
When no further improvement can be made the centroids stop moving.

We will now generate and then display the movie.

Close the graphic window using Ctrl-W. 

Press Enter to continue: 
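
The two steps described in the demo output above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code the demo itself runs. The function names `nearest` and `kmeans`, the `seed` and `max_iter` parameters, and the synthetic three-blob data are all assumptions made for the example. The `history` list records the centroid positions after each step, which is roughly the information the movie displays frame by frame.

```python
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists))


def kmeans(points, k, seed=0, max_iter=100):
    """Hypothetical helper: iterate the assign/update steps until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # random starting centroids drawn from the data
    history = [list(centroids)]            # centroid positions, one entry per step
    for _ in range(max_iter):
        # Step 1: assign each data point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Step 2: move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster: leave centroid in place
        if new_centroids == centroids:     # nothing moved, so no further improvement
            break
        centroids = new_centroids
        history.append(list(centroids))
    labels = [nearest(p, centroids) for p in points]  # labels for the final centroids
    return centroids, labels, history


# Assumed example data: 30 random points around each of three centres.
data_rng = random.Random(42)
centres = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
          for cx, cy in centres for _ in range(30)]

centroids, labels, history = kmeans(points, k=3)
print("steps recorded:", len(history))
print("final centroids:", centroids)
```

Because the starting centroids are chosen at random, a different `seed` can give a different sequence of steps, and the final state is only guaranteed to be a point where the centroids no longer move.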

Here are three stills from the movie, showing the random starting point, an intermediate step, and the final state. An X marks each of the three centroids that define the three clusters.

Notice that the random starting point differs from the graphic shown on the previous page of this chapter. The two were captured from different runs of the demonstration, to emphasise that the data is randomly selected for each run of the demo.

A second random dataset with points more distinctly separated into three clusters is then demonstrated.
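
The measure mentioned in the demo output, the within-cluster and between-cluster sums of squares, can be illustrated with a small calculation. The sketch below is an assumption-laden example rather than the demo's own code: the helper names (`mean`, `sq_dist`, `sums_of_squares`) and the six hand-picked points are hypothetical. It shows that the total sum of squares splits into a within-cluster part and a between-cluster part, so for a fixed dataset reducing one increases the other, and that a labelling matching well-separated clusters gives a much smaller within-cluster value than a poor labelling.

```python
def mean(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sums_of_squares(points, labels):
    """Return (total, within, between) sums of squares for a given labelling."""
    grand = mean(points)
    total = sum(sq_dist(p, grand) for p in points)
    within = 0.0
    between = 0.0
    for j in set(labels):
        members = [p for p, label in zip(points, labels) if label == j]
        centre = mean(members)
        within += sum(sq_dist(p, centre) for p in members)
        between += len(members) * sq_dist(centre, grand)
    return total, within, between


# Assumed example data: two tight, well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]

good_labels = [0, 0, 0, 1, 1, 1]   # matches the two groups
poor_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

total, within_good, between_good = sums_of_squares(points, good_labels)
_, within_poor, between_poor = sums_of_squares(points, poor_labels)

print("total:", total)
print("good labelling: within", within_good, "between", between_good)
print("poor labelling: within", within_poor, "between", between_poor)
```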



Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0