Hierarchical clustering is an unsupervised learning method that builds a hierarchy of clusters either in a bottom-up (agglomerative) or top-down (divisive) fashion. In agglomerative clustering, each data point starts as its own cluster, and pairs of clusters are merged iteratively based on a distance metric (e.g., Euclidean, cosine) and a linkage criterion (e.g., single, complete, average).
The result is a tree-like structure called a dendrogram, which shows how clusters are merged at each step. This allows the user to choose the number of clusters by cutting the dendrogram at a desired level. Unlike k-means, hierarchical clustering doesn't require specifying the number of clusters in advance.
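The merge-and-cut workflow above can be sketched with SciPy's hierarchical clustering tools. The data below is a made-up toy example, not from the text; the linkage method, metric, and cut threshold are illustrative choices:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D points forming two loose groups (illustrative assumption).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]])

# Build the merge tree: Euclidean distances + average linkage
# decide which pair of clusters is fused at each step.
Z = linkage(X, method="average", metric="euclidean")

# "Cut" the dendrogram at distance 2.0 to obtain flat cluster labels;
# points merged below that height end up in the same cluster.
labels = fcluster(Z, t=2.0, criterion="distance")
print(labels)
```

Raising or lowering the `t` threshold changes how many clusters survive the cut, which is exactly the choice the dendrogram makes visual; `scipy.cluster.hierarchy.dendrogram(Z)` plots the tree itself.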
It's often used in bioinformatics, document clustering, and customer segmentation. However, it scales poorly to large datasets (the pairwise-distance matrix alone requires O(n²) memory), and the resulting clusters are sensitive to the choice of distance metric and linkage criterion.
Hierarchical clustering provides interpretable results and is well-suited for exploring nested group structures in data.