Boosting: Concept and Mathematical Intuition
Boosting combines multiple weak learners in sequence, with each learner focusing on correcting the errors of its predecessors. This process creates a strong predictive model from simple models that individually perform only slightly better than random. In contrast, bagging builds all learners independently and in parallel, then averages their predictions. Boosting emphasizes difficult cases through sequential learning, while bagging reduces variance by parallel averaging.
Boosting is a sequential ensemble learning method that constructs a strong predictive model by combining multiple weak learners. Each new learner in the sequence is trained to focus on correcting the errors made by the previous learners, allowing the overall ensemble to adaptively improve its predictions on challenging data points.
Mathematical Intuition: Weighted Error and Model Combination
Suppose you have a set of weak learners $h_1(x), h_2(x), \ldots, h_M(x)$, each trained sequentially. Boosting assigns a weight $\alpha_m$ to each learner based on its accuracy. The overall prediction $F(x)$ is a weighted sum of these learners:
$$F(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$$

After each round, boosting updates the weights on the training data. The weighted error $\epsilon_m$ for learner $m$ is calculated as:
$$\epsilon_m = \frac{\sum_{i=1}^{N} w_i \cdot I\left(y_i \neq h_m(x_i)\right)}{\sum_{i=1}^{N} w_i}$$

where:
- $w_i$ is the weight of sample $i$;
- $y_i$ is the true label for sample $i$;
- $h_m(x_i)$ is the prediction of the $m$-th learner for sample $i$;
- $I(\cdot)$ is the indicator function, returning 1 if the argument is true and 0 otherwise.
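To make the two formulas above concrete, here is a minimal NumPy sketch using made-up labels, predictions, and learner weights (every value is illustrative, not drawn from any real dataset). It computes the weighted error $\epsilon_m$ for a single weak learner and then forms $F(x)$ as a weighted vote of three learners:

```python
import numpy as np

# Toy setup (hypothetical values, purely to illustrate the formulas):
y   = np.array([ 1, -1,  1,  1, -1])   # true labels y_i
h_m = np.array([ 1,  1,  1, -1, -1])   # predictions h_m(x_i); wrong on samples 2 and 4
w   = np.full(5, 1 / 5)                # sample weights w_i (uniform to start)

# Weighted error: total weight of misclassified samples over total weight
eps_m = np.sum(w * (y != h_m)) / np.sum(w)
print(eps_m)  # 0.4 here, since 2 of 5 equally weighted samples are misclassified

# Final prediction F(x) as a weighted vote of several weak learners
alphas = np.array([0.8, 0.5, 0.3])           # learner weights alpha_m
preds  = np.array([[ 1, -1,  1,  1, -1],     # h_1(x_i)
                   [ 1,  1,  1, -1, -1],     # h_2(x_i)
                   [-1, -1,  1,  1,  1]])    # h_3(x_i)
F = preds.T @ alphas        # weighted sum over learners for each sample
print(F, np.sign(F))        # for +/-1 labels, the sign of F(x) is the class decision
```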
Learners with lower error receive higher weights $\alpha_m$, so their predictions contribute more to the final model. This process continues, with each learner focusing more on the samples that previous learners found difficult.
In boosting, after each weak learner is trained, the algorithm increases the weights of misclassified samples and decreases the weights of correctly predicted ones. This adjustment ensures that subsequent learners pay more attention to the difficult cases that previous learners struggled with. By focusing learning on the hardest-to-predict data points, boosting steadily drives down the ensemble's error on the training data with each iteration, which usually translates into better overall accuracy.
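The update described above can be written out concretely. The sketch below shows a single round of the classic binary AdaBoost rule for ±1 labels; note that the formula for $\alpha_m$ and the exponential re-weighting come from standard AdaBoost rather than from this section, and the numeric values are again made up for illustration:

```python
import numpy as np

# One AdaBoost-style round on toy +/-1 labels (illustrative values only)
y   = np.array([ 1, -1,  1,  1, -1])   # true labels y_i
h_m = np.array([ 1,  1,  1, -1, -1])   # weak learner predictions h_m(x_i)
w   = np.full(5, 1 / 5)                # current sample weights w_i

# Weighted error of this learner
eps_m = np.sum(w * (y != h_m)) / np.sum(w)

# Learner weight: lower error -> larger alpha_m (classic AdaBoost formula)
alpha_m = 0.5 * np.log((1 - eps_m) / eps_m)

# Re-weight samples: misclassified ones (y_i * h_m(x_i) = -1) are multiplied by
# exp(+alpha_m), correct ones by exp(-alpha_m), then weights are renormalized
w = w * np.exp(-alpha_m * y * h_m)
w = w / np.sum(w)

print(alpha_m)   # about 0.20 for eps_m = 0.4
print(w)         # misclassified samples now carry more weight than correct ones
```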
```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Define a weak learner: decision stump (tree of depth 1)
weak_learner = DecisionTreeClassifier(max_depth=1, random_state=42)

# Initialize AdaBoost with 5 weak learners
ada = AdaBoostClassifier(estimator=weak_learner, n_estimators=5, random_state=42)
ada.fit(X_train, y_train)

# Predict on the test set
y_pred = ada.predict(X_test)

# Print accuracy
print("Test set accuracy:", accuracy_score(y_test, y_pred))

# Show how each stage improves accuracy
staged_scores = list(ada.staged_score(X_test, y_test))
for i, score in enumerate(staged_scores, start=1):
    print(f"After {i} weak learners: accuracy = {score:.2f}")
```
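Running the script prints the overall test accuracy followed by the accuracy reached after each of the five boosting stages; the staged scores generally improve as weak learners are added, though the increase is not guaranteed to be strictly monotonic.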