Voting: Hard and Soft
Voting ensembles combine predictions from multiple base models to make a final decision. There are two main types: hard voting and soft voting.
- Hard voting takes the majority class label predicted by the base models;
- Soft voting averages the predicted probabilities and selects the class with the highest average probability.
Hard voting only needs base models that output class labels, while soft voting requires base models that can output class probabilities (e.g. via `predict_proba`). Because it averages probabilities rather than counting votes, soft voting leverages the confidence of each model's prediction, not just its top choice.

Comparing ensemble methods: voting combines the predictions of independently trained models in parallel, bagging averages predictions from copies of the same model trained on different bootstrap samples of the data, and boosting trains models sequentially so that each one corrects the errors of its predecessors.
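To make the difference concrete, here is a minimal NumPy sketch with made-up probabilities in which hard and soft voting disagree: two models weakly prefer class 1, but one model is very confident in class 0, so the averaged probabilities favor class 0.

```python
import numpy as np

# Hypothetical predicted probabilities for one sample from three
# base models (rows: models, columns: classes 0 and 1).
probas = np.array([
    [0.90, 0.10],  # model 1: very confident in class 0
    [0.40, 0.60],  # model 2: weakly prefers class 1
    [0.45, 0.55],  # model 3: weakly prefers class 1
])

# Hard voting: each model votes for its most likely class; majority wins.
votes = probas.argmax(axis=1)                 # [0, 1, 1]
hard_vote = int(np.bincount(votes).argmax())  # class 1 (two votes to one)

# Soft voting: average the probabilities, then take the argmax.
avg = probas.mean(axis=0)                     # [0.5833..., 0.4166...]
soft_vote = int(avg.argmax())                 # class 0 (confidence wins)

print("Hard vote:", hard_vote)  # 1
print("Soft vote:", soft_vote)  # 0
```

Neither answer is universally better: soft voting helps when the probability estimates are well calibrated, but a single overconfident, miscalibrated model can also drag the average in the wrong direction.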
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, stratify=y
)

# Standardize features for SVC and Logistic Regression
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Define base models
clf1 = LogisticRegression(random_state=42, max_iter=200)
clf2 = DecisionTreeClassifier(random_state=42)
clf3 = SVC(probability=True, random_state=42)

# Hard voting ensemble (majority vote)
hard_voting = VotingClassifier(
    estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)],
    voting='hard'
)
hard_voting.fit(X_train_scaled, y_train)
y_pred_hard = hard_voting.predict(X_test_scaled)
print("Hard Voting Accuracy:", accuracy_score(y_test, y_pred_hard))

# Soft voting ensemble (average probabilities)
soft_voting = VotingClassifier(
    estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)],
    voting='soft'
)
soft_voting.fit(X_train_scaled, y_train)
y_pred_soft = soft_voting.predict(X_test_scaled)
print("Soft Voting Accuracy:", accuracy_score(y_test, y_pred_soft))
```