Label errors are mistakes or inaccuracies in the ground-truth annotations of a dataset. They can include incorrect class labels, bounding boxes that don't tightly fit the object, segmentation masks that miss some regions, or other inconsistencies between the annotation and the actual content. Label errors can arise from human annotator mistakes, ambiguity in the data, or flaws in automatic labeling pipelines. These errors degrade model training because the model is forced to learn a mapping that is partially wrong. They can also distort evaluation: a model may be penalized for being "wrong" when in fact the label itself was wrong. Techniques exist to identify likely label errors, such as training a model and flagging examples where the model disagrees with the given label with high confidence; these become candidates for relabeling. Datasets often undergo cleaning to correct label errors once discovered. In some cases, noise-robust learning algorithms or loss functions can mitigate the impact by avoiding overfitting to noisy labels.
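The confident-disagreement idea above can be sketched in a few lines. This is a minimal illustration, not a full implementation: the function name and confidence threshold are made up for this example, and in practice the predicted probabilities should come from out-of-fold (cross-validated) predictions so the model has not already memorized each sample's possibly-wrong label.

```python
import numpy as np

def find_likely_label_errors(pred_probs, labels, confidence_threshold=0.9):
    """Flag samples where the model confidently disagrees with the given label.

    pred_probs: (n_samples, n_classes) predicted class probabilities,
        ideally out-of-fold predictions from cross-validation.
    labels: (n_samples,) integer class labels from the dataset.
    Returns candidate indices, most confident disagreements first.
    """
    pred_probs = np.asarray(pred_probs)
    labels = np.asarray(labels)
    predicted = pred_probs.argmax(axis=1)   # model's preferred class
    confidence = pred_probs.max(axis=1)     # how sure the model is
    disagrees = predicted != labels
    candidates = np.where(disagrees & (confidence >= confidence_threshold))[0]
    # Sort so the strongest relabeling candidates come first.
    return candidates[np.argsort(-confidence[candidates])]

# Toy example: sample 2 is labeled class 0, but the model is 95% sure it is class 1.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.05, 0.95], [0.6, 0.4]])
labels = np.array([0, 1, 0, 0])
print(find_likely_label_errors(probs, labels))  # -> [2]
```

Samples flagged this way are only candidates: they should be reviewed by a human before relabeling, since a confident disagreement can also mean the model itself is wrong on a genuinely hard example.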