Data drift is a change in the distribution of data over time, which can degrade a model's performance if the model is not updated. For instance, if a credit scoring model was trained on last year’s data but economic conditions have since shifted credit behavior, today's inputs may no longer follow the patterns the model learned, and its predictions become less accurate. Drift is commonly divided into covariate drift (the distribution of the predictors changes), prior probability drift (the base rates of the classes change), and concept drift (the relationship between the predictors and the target changes). It is detected by applying statistical tests to incoming data or by monitoring model outputs over time. Handling data drift usually involves retraining the model on recent data or using online learning methods that adapt continuously to incoming data.
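As one illustration of detecting covariate drift with a statistical test, the sketch below compares a single numeric feature between a reference (training-time) sample and a recent sample using the two-sample Kolmogorov-Smirnov statistic. It is a minimal example, not a production monitor: the function names `ks_statistic` and `drift_detected`, the feature values, and the 0.05 significance level are all assumptions made for illustration, and the decision threshold uses the standard large-sample approximation.

```python
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Advance both pointers past every value <= x so ties are handled together.
        while i < n and ref[i] <= x:
            i += 1
        while j < m and cur[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Hypothetical feature values (synthetic, for illustration only).
rng = random.Random(0)
reference_scores = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # training-time sample
shifted_scores = [rng.gauss(1.0, 1.0) for _ in range(1000)]    # recent sample, mean shifted

print("KS statistic:", ks_statistic(reference_scores, shifted_scores))
print("Drift detected:", drift_detected(reference_scores, shifted_scores))
```

A check like this covers one feature at a time and only the input distribution; prior probability drift and concept drift need label information, so they are typically tracked through class frequencies and model error once ground truth becomes available.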