Semi-supervised learning is a class of machine learning techniques that train models on a mix of labeled and unlabeled data. Typically, a small amount of labeled data is combined with a large amount of unlabeled data during training. The algorithm uses structure in the unlabeled data (for example, cluster or manifold structure) to learn a better decision boundary or predictor than it could from the labeled data alone. Semi-supervised learning sits between supervised learning (all data labeled) and unsupervised learning (no labels).

A common approach is to first learn representations or clusters from the unlabeled data, and then use the labeled data to classify those representations or to propagate labels to similar unlabeled examples. Another approach is self-training, also called pseudo-labeling: a model trained on the labeled data predicts labels for the unlabeled examples, its most confident predictions are added to the training set as if they were true labels, and the model is retrained. The cycle repeats until no confident predictions remain or a round limit is reached.

The rationale is that unlabeled data, combined with a small amount of labeled data, can substantially improve accuracy without the cost of labeling everything, provided the unlabeled data really does reflect the class structure. This is valuable when obtaining a few labels is feasible but labeling the full dataset is too expensive or time-consuming.
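The self-training loop described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: the base model is a nearest-centroid classifier on 2D points, the confidence score is a simple distance margin, and the function names (`centroids`, `predict_with_confidence`, `self_train`) and the `threshold` and `max_rounds` parameters are hypothetical choices made for this example.

```python
import math

def centroids(points, labels):
    """Fit the toy model: one mean point per class."""
    sums, counts = {}, {}
    for (x, y), lab in zip(points, labels):
        sx, sy = sums.get(lab, (0.0, 0.0))
        sums[lab] = (sx + x, sy + y)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}

def predict_with_confidence(model, point):
    """Return (label, confidence) for one point.

    Confidence is an assumed margin score in [0, 1]:
    1 - (distance to nearest centroid / distance to second nearest).
    Requires at least two classes in the model.
    """
    dists = sorted((math.dist(point, c), lab) for lab, c in model.items())
    (d1, nearest), (d2, _) = dists[0], dists[1]
    conf = 1.0 - d1 / d2 if d2 > 0 else 0.0
    return nearest, conf

def self_train(labeled, labels, unlabeled, threshold=0.5, max_rounds=10):
    """Pseudo-labeling: repeatedly absorb confident predictions and refit."""
    points, labs = list(labeled), list(labels)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = centroids(points, labs)      # retrain on the current labeled set
        still_unlabeled, added = [], 0
        for p in pool:
            lab, conf = predict_with_confidence(model, p)
            if conf >= threshold:            # confident: treat as a pseudo-label
                points.append(p)
                labs.append(lab)
                added += 1
            else:                            # uncertain: leave for a later round
                still_unlabeled.append(p)
        pool = still_unlabeled
        if added == 0:                       # nothing confident left, so stop
            break
    return centroids(points, labs), pool

# Two labeled points, five unlabeled ones (made-up toy data).
labeled = [(0, 0), (10, 10)]
labels = [0, 1]
unlabeled = [(1, 1), (0, 2), (9, 9), (10, 8), (5, 5.5)]

model, leftover = self_train(labeled, labels, unlabeled)
print(model)     # class centroids after absorbing the pseudo-labeled points
print(leftover)  # points the model never became confident about
```

In this toy run the four points near the labeled examples are pseudo-labeled in the first round, while the ambiguous point near the midpoint stays unlabeled. The threshold is the key design choice: set too low, early mistakes are fed back into training and reinforced; set too high, the unlabeled data is barely used.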