Voting: Hard and Soft | Stacking and Voting Ensembles
Ensemble Learning Techniques with Python

Voting: Hard and Soft

Voting ensembles combine predictions from multiple base models to make a final decision. There are two main types: hard voting and soft voting.

  • Hard voting takes the majority class label predicted by the base models;
  • Soft voting averages the predicted probabilities and selects the class with the highest average probability.

Hard voting only requires base models to output class labels, while soft voting requires that every base model can output class probabilities (e.g., via predict_proba in scikit-learn). Because it works with probabilities rather than labels, soft voting can leverage the confidence of each model's prediction.
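The two rules can disagree on the same inputs. Here is a minimal numeric sketch (with made-up probabilities, not from the lesson's dataset) where two models narrowly favor one class, a third strongly favors the other, and hard and soft voting reach different decisions:

```python
import numpy as np

# Hypothetical predicted probabilities for one sample (classes 0 and 1)
# from three base models.
probas = np.array([
    [0.45, 0.55],  # model 1 narrowly favors class 1
    [0.40, 0.60],  # model 2 narrowly favors class 1
    [0.95, 0.05],  # model 3 strongly favors class 0
])

# Hard voting: each model casts one vote for its most likely class.
votes = probas.argmax(axis=1)            # [1, 1, 0]
hard_pred = np.bincount(votes).argmax()  # majority vote -> class 1

# Soft voting: average the probabilities, then take the most likely class.
avg = probas.mean(axis=0)                # [0.60, 0.40]
soft_pred = avg.argmax()                 # class 0

print("Hard voting picks class:", hard_pred)
print("Soft voting picks class:", soft_pred)
```

Hard voting counts two narrow wins for class 1 as two full votes, while soft voting lets model 3's high confidence in class 0 outweigh them.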

Note

Comparing ensemble methods: voting combines predictions in parallel, bagging averages predictions from models trained on different data samples, and boosting combines models sequentially to correct errors.
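For contrast with the voting example below, here is a brief sketch of the other two families using scikit-learn's BaggingClassifier and AdaBoostClassifier (this snippet is illustrative and not part of the lesson's code; estimator counts and the dataset are arbitrary choices):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Bagging: copies of the same model trained in parallel on different
# bootstrap samples, with predictions combined by majority vote.
bagging = BaggingClassifier(DecisionTreeClassifier(random_state=42),
                            n_estimators=50, random_state=42)

# Boosting: models trained sequentially, each focusing on the examples
# its predecessors got wrong.
boosting = AdaBoostClassifier(n_estimators=50, random_state=42)

results = {}
for name, model in [("bagging", bagging), ("boosting", boosting)]:
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(name, "mean CV accuracy:", round(results[name], 3))
```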

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, stratify=y
)

# Standardize features for SVC and Logistic Regression
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Define base models
clf1 = LogisticRegression(random_state=42, max_iter=200)
clf2 = DecisionTreeClassifier(random_state=42)
clf3 = SVC(probability=True, random_state=42)

# Hard voting ensemble (majority vote)
hard_voting = VotingClassifier(
    estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)],
    voting='hard'
)
hard_voting.fit(X_train_scaled, y_train)
y_pred_hard = hard_voting.predict(X_test_scaled)
print("Hard Voting Accuracy:", accuracy_score(y_test, y_pred_hard))

# Soft voting ensemble (average probabilities)
soft_voting = VotingClassifier(
    estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)],
    voting='soft'
)
soft_voting.fit(X_train_scaled, y_train)
y_pred_soft = soft_voting.predict(X_test_scaled)
print("Soft Voting Accuracy:", accuracy_score(y_test, y_pred_soft))
```
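VotingClassifier also accepts a weights parameter, and with voting='soft' the fitted ensemble exposes predict_proba itself. A small sketch (the weights here are arbitrary, chosen only to illustrate the parameter):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, stratify=y
)

# Weighted soft voting: hypothetical weights giving logistic regression
# twice the influence of the decision tree in the probability average.
weighted = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression(max_iter=200, random_state=42)),
        ('dt', DecisionTreeClassifier(random_state=42)),
    ],
    voting='soft',
    weights=[2, 1],
)
weighted.fit(X_train, y_train)

# With voting='soft', the ensemble itself outputs class probabilities.
proba = weighted.predict_proba(X_test[:3])
print(proba.round(3))
```

Weighting is useful when one base model is known (e.g., from validation scores) to be more reliable than the others.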

Which statement accurately describes the main difference between hard voting and soft voting?


Section 4. Chapter 1

