Unsupervised Learning (Part-5)

 

Hierarchical Clustering

  • Creates a nested series of partitions, represented in the form of a dendrogram 
  • Shows how objects are grouped together step by step 
  • Typical stopping criteria (both appear in the sketch below): 
      • A fixed number of clusters 
      • The minimum distance between clusters exceeding a user-defined threshold
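As a rough sketch of both ideas, SciPy's hierarchical-clustering utilities can build the merge history, draw the dendrogram, and cut it by either stopping criterion (the toy data values here are invented purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Toy 2-D data; the values are made up for illustration.
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0],
              [5.2, 4.8], [9.0, 1.0], [9.1, 1.2]])

# The linkage matrix Z encodes the nested series of partitions (merge history).
Z = linkage(X, method='single')

# Stopping criterion 1: a fixed number of clusters.
labels_by_k = fcluster(Z, t=3, criterion='maxclust')

# Stopping criterion 2: a user-defined distance threshold between clusters.
labels_by_dist = fcluster(Z, t=1.5, criterion='distance')

dendrogram(Z)   # shows how objects are grouped together step by step
plt.show()
```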

Types of Hierarchical Clustering

Agglomerative 

  • Starts by assuming that each object is a separate cluster 
  • Successively merges the two closest clusters until some stopping criterion is met, for example until only one cluster remains (a minimal loop is sketched below) 
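A minimal sketch of that loop, assuming single-link distances and Euclidean data (the function name and structure are illustrative, not a fixed recipe):

```python
import numpy as np

def agglomerative(X, stop_at=1):
    """Toy agglomerative clustering with single-link merges."""
    clusters = [[i] for i in range(len(X))]   # each object starts as its own cluster
    while len(clusters) > stop_at:            # stopping criterion: cluster count
        best = (float('inf'), 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single-link: minimum distance over all cross-cluster pairs
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))   # merge the two closest clusters
    return clusters
```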

 

Divisive

  • Starts by assuming the existence of only one cluster 
  • Successively splits a cluster into two or more sub-clusters (one common splitting strategy is sketched below)
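One common way to realize this is bisecting k-means: repeatedly split a cluster in two. The sketch below assumes Euclidean data and uses scikit-learn's KMeans; the choice of which cluster to split (here, the largest) is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive(X, k=3):
    """Toy divisive clustering via repeated bisection (bisecting k-means)."""
    clusters = [np.arange(len(X))]        # start with one cluster holding everything
    while len(clusters) < k:              # stopping criterion: cluster count
        largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        idx = clusters.pop(largest)
        halves = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
        clusters.append(idx[halves == 0])  # split into two sub-clusters
        clusters.append(idx[halves == 1])
    return clusters
```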

Single-Link

  • Distance between two clusters is defined as the minimum distance between pairs of objects, one from each cluster (see the helper below) 
  • Can result in a chaining effect: clusters may be merged through a chain of close intermediate objects even when most of their objects are far apart 
  • Equivalent to building a minimum spanning tree, where an edge only exists between the “closest” nodes 
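The definition translates directly into code; a small helper, assuming each cluster is a NumPy array of points:

```python
import numpy as np

def single_link(A, B):
    """Single-link distance: minimum over all cross-cluster pairs of points."""
    return min(np.linalg.norm(a - b) for a in A for b in B)
```

SciPy implements the same criterion as `linkage(X, method='single')`.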

Complete-Link 

  • Distance between two clusters is the maximum distance between pairs of objects, one from each cluster (see the helper below) 
  • Merging the two closest clusters under this criterion gives the minimum increase in cluster diameter 
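The corresponding helper, under the same assumption that clusters are NumPy arrays of points:

```python
import numpy as np

def complete_link(A, B):
    """Complete-link distance: maximum over all cross-cluster pairs of points."""
    return max(np.linalg.norm(a - b) for a in A for b in B)
```

This is `method='complete'` in SciPy's `linkage`.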


Prototype (Centroid/Medoid) Distance 

  • Calculate a centroid or medoid (the prototype) for each cluster 
  • Merge the two clusters whose prototypes are closest (a centroid-distance helper is sketched below) 
  • Alternative (Ward’s method): merge the two clusters whose merger results in the minimum increase in SSE 
  • Has the same disadvantages as prototype-based partitional algorithms 
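A centroid-distance helper, again assuming clusters are NumPy arrays of points (a medoid variant would instead pick the cluster member minimizing total distance to the rest, rather than averaging):

```python
import numpy as np

def centroid_link(A, B):
    """Prototype distance: Euclidean distance between the cluster centroids."""
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
```

SciPy's `linkage` offers these variants directly via `method='centroid'` and `method='ward'`.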

Average-Link 

  • Distance between two clusters is the average distance across all pairs of objects, one drawn from each cluster (see the helper below) 
  • Rationale: sub-clusters belonging to the same cluster will tend to have high connectivity
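And the matching helper for completeness, with the same cluster-as-array assumption (this is `method='average'` in SciPy's `linkage`):

```python
import numpy as np

def average_link(A, B):
    """Average-link distance: mean over all cross-cluster pairs of points."""
    return float(np.mean([np.linalg.norm(a - b) for a in A for b in B]))
```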

