Overfitting occurs when a machine learning model learns the training data too well, including its noise and idiosyncrasies, to the point that it performs poorly on new, unseen data. In essence, the model has memorized the training data rather than capturing the underlying general patterns. Overfitting is characterized by the training error being much lower than the test/validation error, and it is often the result of a model being too complex relative to the amount of training data (e.g., too many parameters or too little regularization). Symptoms include erratic fluctuations in the learned function between training points and high variance in predictions.

To combat overfitting, one can simplify the model (reduce its complexity or number of parameters), use more training data, apply regularization (L1/L2, dropout, early stopping, etc.), or use cross-validation to detect it early. A classic example is fitting a high-degree polynomial through a small set of points: the curve passes through every training point but oscillates wildly between them. In deep learning, models often have enough capacity to overfit, but techniques such as dropout, data augmentation, and large datasets keep it in check.
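The polynomial example is easy to reproduce. The sketch below (an illustrative assumption, not from the original text) fits a low-degree and a high-degree polynomial to a small, noisy sample drawn from a sine curve using numpy and scikit-learn; the high-degree fit drives training error toward zero while test error grows, which is exactly the train/test gap described above.

```python
# Minimal sketch of overfitting with polynomial regression.
# Assumes numpy and scikit-learn are installed; the degrees (3 vs. 15),
# sample sizes, and noise level are illustrative choices.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def make_data(n):
    # Ground truth: a smooth sine curve plus Gaussian noise.
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)
    return x.reshape(-1, 1), y

X_train, y_train = make_data(15)    # small training set -> easy to memorize
X_test, y_test = make_data(200)     # held-out data to expose the gap

for degree in (3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    # The degree-15 model typically achieves near-zero training MSE but a
    # much larger test MSE: the signature of overfitting.
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

Lowering the degree, adding L2 regularization (e.g., swapping in Ridge regression), or enlarging the training set all shrink the gap between the two errors.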