Pool-based sampling is an active learning strategy in which a model is given access to a large pool of unlabeled data and must choose the most informative samples from that pool to be labeled next. The model (or an acquisition function) typically selects examples about which it is uncertain, or which maximize some information criterion, from the pool. These selected samples are then annotated by an oracle (e.g., a human), added to the training set, and the model is retrained. This iterative process focuses labeling effort on the data that will most improve the model, reducing the total number of annotations needed. Common pool-based methods include uncertainty sampling, query-by-committee, and density-weighted sampling.
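One round of the loop above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `predict_proba` is a hypothetical stand-in for a trained model's class-probability output, and the acquisition function used is entropy-based uncertainty sampling.

```python
import math
import random

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool, predict_proba, k):
    """Pick the k pool items whose predictions have the highest entropy."""
    ranked = sorted(pool, key=lambda x: entropy(predict_proba(x)), reverse=True)
    return ranked[:k]

# Toy demo: a two-class "model" whose confidence depends on the input value.
random.seed(0)
pool = [random.uniform(-3, 3) for _ in range(100)]

def predict_proba(x):
    # Hypothetical logistic score: the model is least certain near x = 0.
    p = 1 / (1 + math.exp(-x))
    return [p, 1 - p]

queried = select_most_uncertain(pool, predict_proba, k=5)
# The selected inputs cluster near the decision boundary (x close to 0),
# where prediction entropy is highest; these would be sent to the oracle.
print(queried)
```

In a full active learning loop, the queried items would be labeled by the oracle, moved from the pool into the training set, and the model retrained before the next selection round.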