PCA is a linear dimensionality reduction technique used to reduce the number of features in a dataset while preserving as much variance as possible. It does so by finding new axes—called principal components—that are linear combinations of the original features and are ordered by the amount of variance they capture.
The first principal component captures the direction of maximum variance in the data, the second captures the next highest variance orthogonal to the first, and so on. By projecting data onto the top k components, PCA reduces dimensionality while retaining the most informative structure.
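The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data (the matrix `X` and the choice `k = 2` are hypothetical): center the data, take the SVD, and project onto the top components. The rows of `Vt` are the principal components, already ordered by the variance they capture.

```python
import numpy as np

# Toy data: 100 samples, 5 correlated features (hypothetical example).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))

# Center the data: PCA operates on deviations from the mean.
X_centered = X - X.mean(axis=0)

# SVD of the centered data. Rows of Vt are the principal components,
# ordered by decreasing singular value (i.e. decreasing variance captured).
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the top k components to reduce dimensionality.
k = 2
X_reduced = X_centered @ Vt[:k].T
print(X_reduced.shape)  # (100, 2)

# Fraction of total variance each component explains.
explained = S**2 / (X.shape[0] - 1)
print(explained / explained.sum())
```

The same result can be obtained from the eigendecomposition of the covariance matrix; the SVD route is the numerically preferred one and is what most library implementations use internally.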
PCA is commonly used for data compression, noise reduction, and visualization of high-dimensional data. It assumes linear relationships among features and is sensitive to feature scale, so preprocessing steps like standardization (rescaling each feature to zero mean and unit variance) are important.
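The scale sensitivity is easy to demonstrate. In the hypothetical example below, two features carry the same underlying signal, but one is measured on a scale a thousand times larger. Without standardization, the first principal component is dominated by the large-scale feature; after standardization, both features contribute roughly equally.

```python
import numpy as np

rng = np.random.default_rng(1)
t = rng.normal(size=200)  # shared underlying signal (hypothetical data)
# Feature 2 carries the same signal as feature 1, just on a 1000x larger scale.
X = np.column_stack([t + 0.1 * rng.normal(size=200),
                     1000 * t + 100 * rng.normal(size=200)])

def top_component(data):
    """First principal component of the centered data."""
    centered = data - data.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[0]

# Unscaled: the component points almost entirely along the large feature.
print(top_component(X))

# Standardized to zero mean, unit variance: both features contribute equally.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(top_component(X_std))
```

The sign of a principal component is arbitrary (SVD determines each direction only up to a flip), so it is the relative magnitudes of the entries that matter here.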
Although simple and fast, PCA doesn’t capture nonlinear structures and may not perform well when important information lies in subtle, nonlinear patterns.