Boosting: Concept and Mathematical Intuition
Boosting combines multiple weak learners in sequence, with each learner focusing on correcting the errors of its predecessors. This process creates a strong predictive model from simple models that individually perform only slightly better than random. In contrast, bagging builds all learners independently and in parallel, then averages their predictions. Boosting emphasizes difficult cases through sequential learning, while bagging reduces variance by parallel averaging.
Boosting is a sequential ensemble learning method that constructs a strong predictive model by combining multiple weak learners. Each new learner in the sequence is trained to focus on correcting the errors made by the previous learners, allowing the overall ensemble to adaptively improve its predictions on challenging data points.
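To see the contrast in code, the sketch below trains a boosted ensemble and a bagged ensemble of identical decision stumps on the same data and compares their test accuracy. It is a minimal illustration (the dataset, number of estimators, and split are arbitrary choices, and it assumes a recent scikit-learn where both classes accept the `estimator` parameter), not a benchmark:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Same data and the same weak learner for both ensembles
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
stump = DecisionTreeClassifier(max_depth=1, random_state=42)

# Boosting: stumps trained sequentially, each round reweighting the data
boosting = AdaBoostClassifier(estimator=stump, n_estimators=50, random_state=42)
# Bagging: stumps trained independently on bootstrap samples, then averaged
bagging = BaggingClassifier(estimator=stump, n_estimators=50, random_state=42)

for name, model in [("Boosting (AdaBoost)", boosting), ("Bagging", bagging)]:
    model.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

Because boosting reweights the data between rounds while bagging only resamples it, the two ensembles can behave quite differently even though they combine the same kind of weak learner.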
Mathematical Intuition: Weighted Error and Model Combination
Suppose you have a set of weak learners $h_1(x), h_2(x), \dots, h_M(x)$, each trained sequentially. Boosting assigns a weight $\alpha_m$ to each learner based on its accuracy. The overall prediction $F(x)$ is a weighted sum of these learners:
$$F(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$$

After each round, boosting updates the weights on the training data. The weighted error $\epsilon_m$ for learner $m$ is calculated as:
$$\epsilon_m = \frac{\sum_{i=1}^{N} w_i \cdot I\left(y_i \neq h_m(x_i)\right)}{\sum_{i=1}^{N} w_i}$$

where (a short numeric check of this formula follows the definitions below):
- $w_i$ is the weight of sample $i$;
- $y_i$ is the true label for sample $i$;
- $h_m(x_i)$ is the prediction of the $m$-th learner for sample $i$;
- $I(\cdot)$ is the indicator function, returning 1 if the argument is true, 0 otherwise.
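As a quick numeric check of the weighted error formula, the short NumPy sketch below plugs in a handful of made-up labels, predictions, and sample weights (all values are illustrative, not taken from a trained model):

```python
import numpy as np

# Toy labels, one weak learner's predictions, and the current sample weights
y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, 1, 1])      # the learner is wrong on the 2nd and 4th samples
w = np.array([0.2, 0.2, 0.2, 0.2, 0.2])   # uniform weights on the first round

# Weighted error: weight mass on misclassified samples divided by total weight
misclassified = (y_true != y_pred).astype(float)
epsilon = np.sum(w * misclassified) / np.sum(w)
print("weighted error:", epsilon)  # 0.4 here, since two of five equally weighted samples are wrong
```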
Learners with lower error receive higher weights ($\alpha_m$), so their predictions contribute more to the final model. This process continues, with each learner focusing more on the samples that previous learners found difficult.
In boosting, after each weak learner is trained, the algorithm increases the weights of misclassified samples and decreases the weights of correctly predicted ones. This adjustment ensures that subsequent learners pay more attention to the difficult cases that previous learners struggled with. By focusing learning on the hardest-to-predict data points, boosting systematically improves overall model accuracy with each iteration.
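The sketch below makes this loop explicit with a simplified, educational version of the AdaBoost update for labels in $\{-1, +1\}$. It assumes the classic learner weight $\alpha_m = \tfrac{1}{2}\ln\left(\frac{1 - \epsilon_m}{\epsilon_m}\right)$ and the exponential reweighting rule, and it skips edge cases (such as a learner with zero weighted error) that a production implementation must handle:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
y = np.where(y == 0, -1, 1)               # recode labels to {-1, +1}

n_samples = X.shape[0]
w = np.full(n_samples, 1.0 / n_samples)   # start with uniform sample weights

learners, alphas = [], []
for m in range(5):                        # 5 boosting rounds
    stump = DecisionTreeClassifier(max_depth=1, random_state=m)
    stump.fit(X, y, sample_weight=w)      # train on the current weighting
    pred = stump.predict(X)

    # Weighted error and learner weight alpha_m
    miss = (pred != y).astype(float)
    eps = np.sum(w * miss) / np.sum(w)
    alpha = 0.5 * np.log((1 - eps) / eps)

    # Increase weights of misclassified samples, decrease correct ones, then renormalize
    w = w * np.exp(-alpha * y * pred)
    w = w / np.sum(w)

    learners.append(stump)
    alphas.append(alpha)
    print(f"round {m + 1}: weighted error = {eps:.3f}, alpha = {alpha:.3f}")

# Final prediction: sign of the weighted vote F(x) = sum_m alpha_m * h_m(x)
F = sum(a * h.predict(X) for a, h in zip(alphas, learners))
train_accuracy = np.mean(np.sign(F) == y)
print("training accuracy of the combined model:", round(train_accuracy, 3))
```

In practice you rarely write this loop yourself. The scikit-learn example below does the same job with AdaBoostClassifier and also reports how test accuracy evolves as weak learners are added.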
```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define a weak learner: decision stump (tree of depth 1)
weak_learner = DecisionTreeClassifier(max_depth=1, random_state=42)

# Initialize AdaBoost with 5 weak learners
ada = AdaBoostClassifier(estimator=weak_learner, n_estimators=5, random_state=42)
ada.fit(X_train, y_train)

# Predict on the test set
y_pred = ada.predict(X_test)

# Print accuracy
print("Test set accuracy:", accuracy_score(y_test, y_pred))

# Show how each stage improves accuracy
staged_scores = list(ada.staged_score(X_test, y_test))
for i, score in enumerate(staged_scores, start=1):
    print(f"After {i} weak learners: accuracy = {score:.2f}")
```