Method

The node allows you to perform HCA using the following methods:

The first four belong to the Agglomerative class of methods; the last one is a Divisive method.

Agglomerative methods work as follows:

  1. Start with $ N$ clusters, and a distance matrix $ \mathbf{D} \in \mathbb{R}^{N \times N}$.
  2. Search the matrix for the most similar pair of clusters, i.e. the smallest entry $ d_{ik}$, $ i \neq k$. Let the distance between the ``most similar'' clusters, say $ U$ and $ V$, be $ d_{UV}$.
  3. Merge clusters $ U$ and $ V$ and label the newly formed cluster $ (UV)$. Update $ \mathbf{D}$ by deleting the rows and columns corresponding to clusters $ U$ and $ V$ and adding a row and column giving the distances between $ (UV)$ and the remaining clusters.
  4. Repeat steps 2 and 3 until all objects are merged into a single cluster [1]. A sketch of this procedure is given below.
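
The following is a minimal Python sketch of these steps, assuming single linkage (the distance between two clusters is the smallest pairwise distance between their members); the other linkage rules only change how the merged distances are updated. The function name and the example matrix are illustrative and not part of the node.

import numpy as np

def agglomerative(D):
    """D: symmetric N x N distance matrix; returns the merge history."""
    D = D.astype(float).copy()
    np.fill_diagonal(D, np.inf)               # ignore self-distances
    clusters = [[i] for i in range(len(D))]   # step 1: N singleton clusters
    merges = []
    while len(clusters) > 1:
        # Step 2: find the most similar pair of clusters (smallest d_ik).
        i, k = np.unravel_index(np.argmin(D), D.shape)
        if i > k:
            i, k = k, i
        # Step 3: merge U and V into (UV) and update the distance matrix.
        merges.append((clusters[i], clusters[k], D[i, k]))
        clusters[i] = clusters[i] + clusters[k]
        D[i, :] = np.minimum(D[i, :], D[k, :])    # single-linkage update rule
        D[:, i] = D[i, :]
        D[i, i] = np.inf
        D = np.delete(np.delete(D, k, axis=0), k, axis=1)
        del clusters[k]
    return merges                              # step 4: repeated until one cluster remains

# Example with four observations:
D = np.array([[0.0,  2.0, 6.0, 10.0],
              [2.0,  0.0, 5.0,  9.0],
              [6.0,  5.0, 0.0,  4.0],
              [10.0, 9.0, 4.0,  0.0]])
for u, v, d in agglomerative(D):
    print(f"merge {u} and {v} at distance {d}")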

Divisive methods work as follows: start with one large cluster containing all $ N$ observations. Clusters are then divided until each cluster contains only a single observation. At each stage, the cluster with the largest dissimilarity between any two of its observations is selected for splitting. To divide the selected cluster, the algorithm first looks for its most disparate observation, i.e. the observation with the largest average dissimilarity to the other observations of the selected cluster. This observation initiates the ``splinter group''. In subsequent steps, the algorithm reassigns observations that are closer to the ``splinter group'' than to the ``old party''. The result is a division of the selected cluster into two new clusters, as sketched below.
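
Below is a minimal Python sketch of this ``splinter group'' split for a single cluster, using average dissimilarities as described; the function name and the example matrix are illustrative and not part of the node.

import numpy as np

def divisive_split(D, members):
    """Split one cluster (index list `members`) into a splinter group and the old party."""
    sub = D[np.ix_(members, members)]
    n = len(members)
    # Most disparate observation: largest average dissimilarity to the others.
    avg = sub.sum(axis=1) / (n - 1)
    splinter = [int(np.argmax(avg))]
    old_party = [j for j in range(n) if j not in splinter]
    moved = True
    while moved and len(old_party) > 1:
        moved = False
        for j in list(old_party):
            if len(old_party) == 1:
                break
            d_old = sub[j, [k for k in old_party if k != j]].mean()
            d_new = sub[j, splinter].mean()
            if d_new < d_old:                 # closer to the splinter group
                old_party.remove(j)
                splinter.append(j)
                moved = True
    return [members[j] for j in splinter], [members[j] for j in old_party]

# Example: split the full set of four observations once.
D = np.array([[0.0,  2.0, 6.0, 10.0],
              [2.0,  0.0, 5.0,  9.0],
              [6.0,  5.0, 0.0,  4.0],
              [10.0, 9.0, 4.0,  0.0]])
print(divisive_split(D, list(range(4))))

Repeating this split on whichever remaining cluster has the largest internal dissimilarity, until every cluster holds a single observation, produces the full divisive hierarchy.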


