Sliding Window Attention is an attention mechanism for transformer models that restricts each token's attention scope to a fixed-size window of neighboring tokens. It was introduced in architectures for long sequences, such as Longformer, to address the quadratic complexity of standard self-attention. Instead of attending to all tokens in the sequence, a token attends only to those within a window of size w around it (for example, the closest w/2 tokens before and after it).

This local attention window reduces computation from O(n²) to O(n · w), where n is the sequence length. Stacking multiple transformer layers with sliding window attention still provides a degree of global context: because windows overlap, upper layers can indirectly attend to tokens farther away, so the receptive field grows roughly linearly with depth. In practice, sliding window attention enables transformers to handle very long inputs efficiently by focusing on local context, and it can be combined with a few global attention tokens to preserve some long-range information.
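To make the mechanism concrete, here is a minimal NumPy sketch of single-head sliding window attention. It illustrates the masking semantics only: for simplicity it builds a dense n × n score matrix and masks out entries outside the band, whereas an efficient implementation would gather only the w neighbors per token to actually achieve O(n · w) cost. The function name and shapes are illustrative, not from any particular library.

```python
import numpy as np

def sliding_window_attention(q, k, v, w):
    """Single-head attention where each token attends only to tokens
    within w // 2 positions on either side of itself.

    Note: this dense-mask version shows the *semantics*; a real
    implementation gathers only the w neighbors to get O(n * w) cost.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # (n, n) raw attention scores
    # Band mask: position j is visible from position i iff |i - j| <= w // 2
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) <= w // 2
    scores = np.where(mask, scores, -np.inf)
    # Row-wise softmax over the visible window only
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d, w = 8, 4, 2
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = sliding_window_attention(q, k, v, w)
print(out.shape)  # (8, 4)
```

With w = 2, each token mixes information only from itself and its immediate neighbors; stacking several such layers lets information propagate farther, which is how deeper models recover longer-range context despite the local window.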