Hierarchical clustering with prototypes


Pyprotoclust is an implementation of representative hierarchical clustering using minimax linkage. The original algorithm is from "Hierarchical Clustering With Prototypes via Minimax Linkage" (DOI: 10.1198/jasa.2011.tm10183) by J. Bien and R. Tibshirani. Pyprotoclust takes a distance matrix as input and returns a linkage matrix encoding the hierarchical clustering, as well as an additional list labelling the prototypes associated with each clustering.
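To make the idea of minimax linkage concrete, here is a minimal NumPy sketch. It is not Pyprotoclust's implementation or API: the function name `minimax_linkage` is hypothetical, the pair search is deliberately naive, and the SciPy-style row layout of the linkage matrix is an assumption made for illustration. At each step it merges the two clusters whose union has the smallest radius around its best center, and records that center as the prototype.

```python
import numpy as np


def minimax_linkage(D):
    """Naive minimax-linkage clustering (illustrative sketch only).

    D is a square, symmetric distance matrix. Returns (Z, prototypes):
    Z has rows [cluster_a, cluster_b, height, size] in the SciPy style,
    and prototypes[i] is the point index chosen for the i-th merge.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    Z = np.zeros((n - 1, 4))
    prototypes = []
    for step in range(n - 1):
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                members = clusters[a] + clusters[b]
                # Radius of the merged cluster around each candidate center:
                # the largest distance from that member to any other member.
                radii = D[np.ix_(members, members)].max(axis=1)
                k = int(radii.argmin())
                if best is None or radii[k] < best[0]:
                    best = (radii[k], a, b, members[k], members)
        height, a, b, proto, members = best
        del clusters[a], clusters[b]
        clusters[n + step] = members  # new clusters get ids n, n+1, ...
        Z[step] = [a, b, height, len(members)]
        prototypes.append(proto)
    return Z, prototypes


# Four points on a line, with absolute difference as the distance.
x = np.array([0.0, 1.0, 3.0, 10.0])
D = np.abs(x[:, None] - x[None, :])
Z, prototypes = minimax_linkage(D)
```

In this toy example the final merge has height 7 with the point at 3.0 as its prototype: every other point lies within distance 7 of it, and no other point gives a smaller radius.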

I coded up a fun example inspired by the original paper where I apply the algorithm to determine representative pictures for the Olivetti Faces dataset. It can be found in the Pyprotoclust documentation.

Figure: (Left) A dendrogram of the hierarchical clustering example with a dashed line at the example cut height. (Right) A scatter plot of the example with circles centered at prototypes drawn with radii equal to the top-level linkage heights of each cluster.

Andy J. Goldschmidt
Ph.D. student in Physics
