Pooling (in the context of CNNs) is an operation that reduces the spatial size of feature maps and aggregates information. The most common type is max pooling, which divides the input (e.g., a feature map) into non-overlapping (or sometimes overlapping) regions and outputs the maximum value from each region. For example, a 2x2 max pool takes 2x2 blocks of pixels and outputs one value per block, the max of those 4, effectively downsampling by 2 in each dimension. Another type is average pooling, which outputs the average of each region. Pooling provides a degree of translation invariance (small shifts in the input don't significantly change the pooled output) and reduces computation for subsequent layers. It can also act as a mild form of regularization. Some modern architectures have moved away from explicit pooling layers, instead using strided convolutions or other mechanisms to reduce feature map size. Nonetheless, pooling has been fundamental in classic architectures like LeNet, AlexNet, and VGG. There's also global pooling (e.g., global average pooling, which collapses an entire feature map to a single number by averaging, often used to connect convolutional features to fully connected layers for classification).
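As a minimal sketch of the 2x2 max pooling described above, the following NumPy function (a hypothetical helper, not from any particular framework) reshapes a single-channel feature map into 2x2 blocks and takes the max of each, assuming even height and width:

```python
import numpy as np

def max_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2 on a single-channel feature map.

    Assumes the height and width are both even.
    """
    h, w = feature_map.shape
    # View the map as a grid of 2x2 blocks, then reduce each block to its max.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 8, 0, 1],
    [6, 5, 3, 2],
], dtype=float)

print(max_pool_2x2(fm))
# 4x4 input -> 2x2 output; each 2x2 block is replaced by its maximum:
# [[4. 5.]
#  [8. 3.]]
```

Swapping `blocks.max(axis=(1, 3))` for `blocks.mean(axis=(1, 3))` turns this into 2x2 average pooling, and `feature_map.mean()` over the whole map would be global average pooling.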