Clojure for Data Science

Finally, having tokenized, stemmed, and vectorized our input documents—and with a selection of distance measures to choose from—we're in a position to run clustering on our data. The first clustering algorithm we'll look at is called k-means clustering.
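k-means is an iterative algorithm that proceeds as follows:
1. Choose k initial cluster centroids, for example at random.
2. Assign each point to the cluster of its nearest centroid.
3. Move each centroid to the mean position of the points assigned to it.
4. Repeat steps 2 and 3 until the assignments stop changing or a maximum number of iterations is reached.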
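The usual k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its assigned points) can be sketched in plain Clojure. This is a minimal illustration and not the book's implementation. The function names are hypothetical, and the sketch assumes that points are numeric vectors, that the caller supplies the initial centroids, and that squared Euclidean distance stands in for whichever distance measure is chosen.

```clojure
(defn squared-distance
  "Squared Euclidean distance between two equal-length numeric vectors.
  Assumption: any other distance measure could be substituted here."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

(defn nearest-centroid
  "Returns the centroid that is closest to point."
  [centroids point]
  (apply min-key #(squared-distance % point) centroids))

(defn mean-point
  "Component-wise mean of a non-empty collection of vectors."
  [points]
  (let [n (count points)]
    (mapv #(/ % n) (apply map + points))))

(defn k-means-step
  "One iteration: assign every point to its nearest centroid, then
  replace each centroid with the mean of its assigned points.
  A centroid that attracts no points is left where it is."
  [centroids points]
  (let [clusters (group-by #(nearest-centroid centroids %) points)]
    (mapv (fn [c]
            (if-let [members (get clusters c)]
              (mean-point members)
              c))
          centroids)))

(defn k-means
  "Repeats k-means-step until the centroids stop moving or
  max-iter iterations have run. Returns the final centroids."
  [initial-centroids points max-iter]
  (loop [centroids (vec initial-centroids)
         i 0]
    (let [updated (k-means-step centroids points)]
      (if (or (= updated centroids) (>= (inc i) max-iter))
        updated
        (recur updated (inc i))))))

;; Usage: two obvious groups of points, k=2
(k-means [[1 1] [9 1]] [[0 0] [0 2] [10 0] [10 2]] 10)
;; => [[0 1] [10 1]]
```

Keeping a centroid in place when no points are assigned to it is one simple way to handle empty clusters. Other strategies, such as re-seeding the centroid, are equally valid.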
The process is visualized in the following diagram for k=3 clusters:
In the preceding figure, we can see that the initial cluster centroids at iteration 1 don't represent the structure of the data well. Although the points are clearly arranged in three groups, the initial centroids (represented by crosses) are all distributed around the top area of the graph. The points are colored...