In computer vision, embedding spaces are vector representations where images or image regions are mapped into a continuous space that captures visual similarity and semantic meaning. Models learn to project images into these spaces such that visually or conceptually similar images are close together, while dissimilar ones are far apart.
These embeddings are typically produced by convolutional neural networks (CNNs) or vision transformers trained using classification, contrastive, or self-supervised objectives. Common applications include image retrieval, clustering, active learning, anomaly detection, and similarity-based search.
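Similarity-based search over such embeddings usually reduces to comparing vectors with cosine similarity. A minimal sketch, using toy hand-written vectors in place of real CNN/ViT outputs (the `retrieve` helper and the 4-dimensional embeddings are illustrative, not from any particular library):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query, gallery, k=2):
    """Return indices of the k gallery embeddings most similar to the query."""
    sims = np.array([cosine_similarity(query, g) for g in gallery])
    return np.argsort(-sims)[:k]

# Toy 4-dimensional embeddings standing in for model outputs.
gallery = np.array([
    [1.0, 0.0, 0.0, 0.0],   # image A
    [0.9, 0.1, 0.0, 0.0],   # image B (visually similar to A)
    [0.0, 0.0, 1.0, 0.0],   # image C (dissimilar)
])
query = np.array([0.95, 0.05, 0.0, 0.0])
print(retrieve(query, gallery))  # nearest neighbors: images A and B
```

In a well-trained embedding space, this simple nearest-neighbor lookup is what powers image retrieval and duplicate detection at scale (typically accelerated with an approximate-nearest-neighbor index rather than a linear scan).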
For example, in contrastive learning (e.g., SimCLR, DINO), embeddings of augmented views of the same image are pulled together, while those from different images are pushed apart. Embedding spaces also support zero-shot transfer by aligning images with text or labels (e.g., CLIP).
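The pull-together/push-apart mechanism can be made concrete with a simplified InfoNCE-style loss. This is a sketch of the general idea, not SimCLR's exact batched implementation; the single positive pair and the toy 2-D vectors are assumptions for illustration:

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Simplified InfoNCE loss: low when the anchor is close to its
    positive and far from the negatives (vectors are L2-normalized)."""
    def norm(v):
        return v / np.linalg.norm(v)
    anchor, positive = norm(anchor), norm(positive)
    negatives = [norm(n) for n in negatives]
    # Similarity of the anchor to the positive and to each negative.
    logits = np.array([np.dot(anchor, positive)] +
                      [np.dot(anchor, n) for n in negatives]) / temperature
    # Cross-entropy with the positive treated as the correct "class".
    return -logits[0] + np.log(np.sum(np.exp(logits)))

# Embeddings of two augmented views of the same image give a low loss...
same = info_nce(np.array([1.0, 0.0]), np.array([0.98, 0.02]),
                [np.array([0.0, 1.0])])
# ...while views of different images give a high one.
diff = info_nce(np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                [np.array([0.98, 0.02])])
```

Minimizing this loss during training is what shapes the geometry of the space: gradients pull positive pairs together and push negatives apart.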
Embedding quality depends on the training objective, data diversity, and model architecture. A well-structured embedding space supports efficient downstream tasks, such as retrieval or classification, without retraining the full model.
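One common way to use frozen embeddings downstream is nearest-neighbor classification: labels are assigned by majority vote among the closest stored embeddings, with no gradient updates to the backbone. A minimal sketch with hypothetical toy embeddings and labels:

```python
import numpy as np

def knn_predict(query, embeddings, labels, k=3):
    """Classify a query by majority vote among its k cosine-nearest
    frozen embeddings -- no retraining of the encoder required."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = normed @ q                      # cosine similarity to every stored embedding
    nearest = np.argsort(-sims)[:k]        # indices of the k most similar
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy frozen embeddings for two classes.
embeddings = np.array([
    [1.0, 0.1], [0.9, 0.2], [0.95, 0.0],   # cat-like region of the space
    [0.1, 1.0], [0.2, 0.9], [0.0, 0.95],   # dog-like region of the space
])
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
print(knn_predict(np.array([0.8, 0.15]), embeddings, labels))
```

The same frozen-embedding idea underlies linear probing, clustering, and anomaly detection: only a lightweight head or index is fit on top of the fixed representation.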