A-Z of Machine Learning and Computer Vision Terms

T

Training Data

Training data is the collection of examples used to fit a machine learning model’s parameters. It consists of input features (and, in supervised learning, corresponding labels or target values) that the learning algorithm uses to learn patterns and make predictions. For instance, to train a face detection model, the training dataset might include thousands of images in which faces are annotated (labeled) with bounding boxes. The quality and representativeness of training data are critical: a model’s performance is bounded by what it sees during training. Typically, the available data is partitioned into training, validation, and test sets; the model is optimized on the training set, while the validation set guides tuning and the test set is held out for final evaluation. Good training data should be large, diverse, and accurately labeled so that the model generalizes well. Noise or bias in training data can lead to poor model behavior, so data curation and augmentation are important steps in the ML pipeline.
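The train/validation/test partition described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed method: the function name `split_dataset`, the 70/15/15 fractions, the fixed seed, and the example file names and bounding boxes are all assumptions made for the example.

```python
import random


def split_dataset(examples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle examples and partition them into train/validation/test lists.

    The fractions are illustrative defaults; whatever remains after the
    train and validation splits becomes the test split.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test split")

    shuffled = list(examples)              # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical labeled examples: (image file name, list of face bounding boxes).
examples = [(f"img_{i:03d}.jpg", [(10, 20, 50, 60)]) for i in range(100)]

train, val, test = split_dataset(examples)
print(len(train), len(val), len(test))
```

Shuffling before splitting avoids an ordering bias (for example, all images from one camera landing in the test set). For class-imbalanced data, a stratified split that preserves label proportions in each subset is often a better choice than the plain random split shown here.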
