Gradient Boosting: Theory and Implementation

Gradient Boosting is an ensemble method that trains weak learners (usually decision trees) sequentially, where each new model is fitted to the negative gradient of the loss function with respect to the current ensemble's predictions.

Unlike AdaBoost, which reweights training samples, Gradient Boosting fits each weak learner directly to the residuals of the current ensemble, minimizing a chosen loss function step by step.

Mathematical Intuition

Each model learns from the pseudo-residuals of the previous stage, defined as the negative gradient of the loss with respect to the current prediction:

$$r_i^{(t)} = -\frac{\partial L(y_i, F_t(x_i))}{\partial F_t(x_i)}$$

For squared-error loss, this negative gradient is simply the ordinary residual $y_i - F_t(x_i)$, which is why the method is often described as fitting residuals.

Then, a new model $h_t(x)$ is fitted to these residuals, and the ensemble is updated as:

$$F_{t+1}(x) = F_t(x) + \eta \, h_t(x)$$

where:

  • $L$ — loss function (e.g., MSE or log-loss),
  • $\eta$ — learning rate controlling the contribution of each tree.
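
To make the update rule concrete, here is a minimal from-scratch sketch for regression with squared-error loss, where the negative gradient reduces to the plain residual $y_i - F_t(x_i)$. The tree depth, learning rate, and number of rounds are illustrative choices, not prescribed values:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (illustrative)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

eta = 0.1        # learning rate (illustrative)
n_rounds = 50    # number of boosting rounds (illustrative)

# F_0(x): start from the mean of the targets
F = np.full(len(y), y.mean())
trees = []

for t in range(n_rounds):
    # Negative gradient of squared-error loss: r = y - F(x)
    residuals = y - F
    # Fit a weak learner h_t to the pseudo-residuals
    tree = DecisionTreeRegressor(max_depth=3, random_state=42)
    tree.fit(X, residuals)
    # Update the ensemble: F_{t+1}(x) = F_t(x) + eta * h_t(x)
    F += eta * tree.predict(X)
    trees.append(tree)

print("Final training MSE:", np.mean((y - F) ** 2))

Each iteration moves the ensemble's predictions a small step (scaled by $\eta$) in the direction that most reduces the loss, which is exactly a gradient-descent step in function space.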
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize and train the model
gb = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    random_state=42
)
gb.fit(X_train, y_train)

# Evaluate
y_pred = gb.predict(X_test)

# Compute staged accuracy manually (works in all sklearn versions)
test_accuracy = []
for y_stage_pred in gb.staged_predict(X_test):
    acc = accuracy_score(y_test, y_stage_pred)
    test_accuracy.append(acc)

# Plot staged accuracy
plt.plot(range(1, len(test_accuracy) + 1), test_accuracy)
plt.xlabel("Number of Trees")
plt.ylabel("Test Accuracy")
plt.title(f"Gradient Boosting Learning Progression (Accuracy: {accuracy_score(y_test, y_pred):.3f})")
plt.grid(True)
plt.show()
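
The staged-accuracy curve above suggests a natural refinement: stop adding trees once a held-out validation score stops improving. scikit-learn supports this through the n_iter_no_change and validation_fraction parameters; the values below are illustrative, not tuned:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Early stopping: hold out 10% of the training data internally and stop
# once the validation score fails to improve for 10 consecutive rounds
gb_es = GradientBoostingClassifier(
    n_estimators=500,         # upper bound; training may stop earlier
    learning_rate=0.1,
    max_depth=3,
    n_iter_no_change=10,      # patience (illustrative value)
    validation_fraction=0.1,  # share of training data used for validation
    random_state=42,
)
gb_es.fit(X_train, y_train)

print("Trees actually fitted:", gb_es.n_estimators_)
print("Test accuracy:", gb_es.score(X_test, y_test))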

Which statement best describes the role of the gradient in Gradient Boosting?

