Constrained clustering is a semi-supervised clustering approach that incorporates user-provided pairwise constraints, typically “must-link” and “cannot-link” conditions between specific data points, into the clustering process. Instead of discovering clusters from the data alone, it respects background knowledge: a must-link constraint requires two instances to be placed in the same cluster, and a cannot-link constraint requires them to be placed in different clusters. Algorithms such as COP-KMeans and constrained agglomerative clustering modify the standard procedures so that these constraints are satisfied. The result is a clustering that fits the data’s similarity structure and also aligns with domain expertise or desired outcomes (e.g., keeping certain items grouped or separated).
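To make the idea concrete, here is a minimal sketch of a COP-KMeans-style assignment loop in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function names (`sq_dist`, `violates`, `cop_kmeans`), the random initialization, and the choice to raise an error when no feasible cluster exists are all assumptions made for this example. Each point is assigned to the nearest center whose cluster does not break a must-link or cannot-link constraint with points already assigned in the current pass.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def violates(i, cluster, labels, must_link, cannot_link):
    """True if placing point i in `cluster` breaks a constraint
    with a point that already has a label in this pass."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] is not None and labels[other] != cluster:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] == cluster:
                return True
    return False


def cop_kmeans(points, k, must_link=(), cannot_link=(), n_iter=20, seed=0):
    """Hypothetical COP-KMeans-style sketch; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # assumed init
    labels = [None] * len(points)
    for _ in range(n_iter):
        new_labels = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest center.
            order = sorted(range(k), key=lambda c: sq_dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_labels, must_link, cannot_link):
                    new_labels[i] = c
                    break
            else:
                # Assumption: fail loudly when no cluster is feasible.
                raise ValueError(f"no feasible cluster for point {i}")
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [points[i] for i, l in enumerate(new_labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


# Example: points 0 and 1 must share a cluster; 1 and 2 must not.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = cop_kmeans(points, k=2, must_link=[(0, 1)], cannot_link=[(1, 2)])
```

Because assignment is greedy and order-dependent, a sketch like this can fail on constraint sets that are actually satisfiable; treating that case as an error keeps the returned clustering free of constraint violations.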