In statistics and machine learning, a class boundary is a value or decision threshold that separates different classes or categories. In a classification task, class boundaries define the dividing lines (or surfaces, in higher dimensions) between regions of the feature space where the model assigns different class labels.
In Statistics:
- Class boundaries often refer to the cutoff points between class intervals in frequency distributions or histograms.
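- For example, if a variable is grouped into bins (e.g., 0–10, 10–20, 20–30), the class boundaries are 0, 10, 20, and 30. Because adjacent bins share a boundary, a convention is needed to decide which bin a value such as 10 falls into (for instance, each bin includes its lower boundary but not its upper one).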
- These are used to interpret and visualize continuous data as grouped categories.
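As a minimal sketch of how class boundaries turn continuous values into grouped counts, the Python snippet below bins a small sample using the boundaries 0, 10, 20, and 30 from the example. The function names, the sample values, and the half-open convention (a value equal to a boundary goes into the upper bin) are assumptions made for illustration, not something the text above prescribes.

```python
from bisect import bisect_right

# Class boundaries for the bins 0-10, 10-20, 20-30 from the example above.
boundaries = [0, 10, 20, 30]

def bin_index(value, edges):
    """Return the index of the half-open bin [edges[i], edges[i+1]) that
    contains value, or None if value lies outside all bins.

    Assumption: a value exactly on a boundary belongs to the upper bin.
    """
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1

def frequency_table(values, edges):
    """Count how many values fall into each bin defined by edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bin_index(v, edges)
        if i is not None:
            counts[i] += 1
    return counts

# Hypothetical sample data.
sample = [3, 10, 12.5, 19.9, 20, 29]
counts = frequency_table(sample, boundaries)
print(counts)  # -> [1, 3, 2]
```

Choosing the opposite convention (upper boundary included) would move the values 10 and 20 into the lower bins, which is why the convention should be stated alongside any grouped frequency table.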
In Machine Learning:
- Class boundaries are the learned decision surfaces that separate the feature space into distinct regions for each class.
- In binary classification, the class boundary is the set of points where the predicted probability equals the decision threshold, most commonly 0.5.
- In multiclass problems, the class boundaries separate the regions where each class has the highest predicted probability.
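The two decision rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the probabilities are assumed to come from some already-trained model, and the tie-breaking choices (a probability exactly at the threshold goes to class 1; the lowest class index wins a multiclass tie) are assumptions.

```python
def predict_binary(p_positive, threshold=0.5):
    """Assign class 1 when the positive-class probability reaches the
    threshold, else class 0. Points with p_positive == threshold lie
    exactly on the class boundary (assigned to class 1 here by assumption)."""
    return 1 if p_positive >= threshold else 0

def predict_multiclass(probs):
    """Assign the class with the highest predicted probability.
    On a tie, the lowest class index wins (an assumption of this sketch)."""
    return max(range(len(probs)), key=lambda k: probs[k])

def on_binary_boundary(p_positive, threshold=0.5, tol=1e-9):
    """Check whether a point sits on the binary class boundary, up to tol."""
    return abs(p_positive - threshold) <= tol

print(predict_binary(0.7))                 # -> 1
print(predict_binary(0.3))                 # -> 0
print(predict_multiclass([0.2, 0.5, 0.3])) # -> 1
print(on_binary_boundary(0.5))             # -> True
```

Moving the threshold away from 0.5 shifts the binary boundary in feature space without retraining the model, which is a common way to trade off false positives against false negatives.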
For example:
- In a logistic regression model, the boundary is a hyperplane: the set of points where a linear function of the features equals zero.
- Decision trees produce piecewise, axis-aligned boundaries, while non-linear models such as SVMs with kernels or neural networks can produce highly curved and complex boundaries.
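To make the contrast concrete, the sketch below compares a linear boundary with a curved one in two dimensions. The weights, bias, and radius are hypothetical values chosen for illustration rather than fitted parameters, and the curved model is simply a logistic function applied to a quadratic score, used here as a stand-in for a non-linear classifier.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logistic regression parameters (not fitted to any data).
W = (2.0, -1.0)
B = 1.0

def linear_prob(x1, x2):
    """Positive-class probability under the linear model."""
    return sigmoid(W[0] * x1 + W[1] * x2 + B)

def boundary_x2(x1):
    """Solve W[0]*x1 + W[1]*x2 + B = 0 for x2: the points where the
    predicted probability is exactly 0.5, which form a straight line."""
    return -(W[0] * x1 + B) / W[1]

# Hypothetical non-linear score: the boundary is a circle of this radius.
RADIUS = 2.0

def circular_prob(x1, x2):
    """Positive-class probability when the score is quadratic in the
    features; probability 0.5 occurs where x1^2 + x2^2 = RADIUS^2."""
    return sigmoid(x1 ** 2 + x2 ** 2 - RADIUS ** 2)

print(boundary_x2(3.0))         # -> 7.0
print(linear_prob(3.0, 7.0))    # -> 0.5 (on the linear boundary)
print(circular_prob(2.0, 0.0))  # -> 0.5 (on the circular boundary)
```

In both cases the boundary is the set of points where the score is zero; only the shape of the score function differs, which is what makes one boundary flat and the other curved.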