Data augmentation is the practice of generating new training data from existing data by applying label-preserving transformations. In computer vision, this often means transforming images by rotating, flipping, scaling, cropping, adjusting brightness/contrast, or adding noise.
For example, a horizontally flipped image of a cat is still a cat, so the flipped copy serves as a new training example. In NLP, augmentation can include synonym replacement or back-translation of sentences. Augmentation increases the diversity of the training set, which helps combat overfitting, especially when the original data is limited. It effectively teaches the model invariances (for example, that a rotated object is still the same object) and can significantly improve generalization performance without needing new labeled data.
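The idea can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: images are represented as nested lists of grayscale pixel values, and the function names (`horizontal_flip`, `adjust_brightness`, `augment`, `synonym_replace`) and the toy synonym table are hypothetical, not part of any particular library. In practice an image or text augmentation library would normally handle these steps.

```python
import random


def horizontal_flip(image):
    """Mirror an image left to right. `image` is a list of rows of pixel values."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta, lo=0, hi=255):
    """Add `delta` to every pixel, clamping the result to the range [lo, hi]."""
    return [[max(lo, min(hi, pixel + delta)) for pixel in row] for row in image]


def augment(image, label, rng):
    """Return one randomly transformed copy of `image`.

    The label is returned unchanged because both transformations
    are assumed to be label-preserving.
    """
    out = image
    if rng.random() < 0.5:  # flip roughly half of the time
        out = horizontal_flip(out)
    out = adjust_brightness(out, rng.randint(-20, 20))
    return out, label


def synonym_replace(sentence, synonyms, rng, p=0.5):
    """Replace each word that has an entry in `synonyms` with probability `p`."""
    words = sentence.split()
    replaced = [
        rng.choice(synonyms[word]) if word in synonyms and rng.random() < p else word
        for word in words
    ]
    return " ".join(replaced)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    cat_image = [[10, 20, 30], [40, 50, 60]]  # toy 2x3 grayscale "image"
    new_image, new_label = augment(cat_image, "cat", rng)
    print(new_image, new_label)

    toy_synonyms = {"quick": ["fast", "speedy"]}  # hypothetical synonym table
    print(synonym_replace("the quick cat", toy_synonyms, rng))
```

Each call to `augment` yields a slightly different image with the same label, so a small dataset can be expanded on the fly during training rather than stored on disk. Whether a given transformation is truly label-preserving depends on the task: a horizontal flip is harmless for cats but would change the meaning of a handwritten character such as "b".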