
Challenge: Implementing a Random Forest

In scikit-learn (sklearn), the classification version of Random Forest is provided by the RandomForestClassifier class.
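A minimal sketch of creating and fitting the classifier is shown here. The data is synthetic (generated with make_classification), and the names X, y, and model are illustrative assumptions rather than the dataset used in the task.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): 200 samples, 5 features, 2 classes
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# random_state makes the randomized tree building reproducible
model = RandomForestClassifier(random_state=42)
model.fit(X, y)

# Predict class labels for the first five samples
predictions = model.predict(X[:5])
print(predictions)
```

Fixing random_state matters because each tree is built from a random bootstrap sample and random feature subsets, so results can differ between runs without it.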

You will also calculate the cross-validation accuracy using the cross_val_score() function.
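As a rough sketch of the call, again on synthetic stand-in data (the names X, y, model, and cv_scores are assumptions for illustration), cross-validation with 10 folds might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption), not the course dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = RandomForestClassifier(random_state=42)

# cv=10 requests 10 folds; for a classifier the default scoring is accuracy
cv_scores = cross_val_score(model, X, y, cv=10)

# One accuracy value per fold; the mean summarizes overall performance
print(cv_scores)
print(cv_scores.mean())
```

cross_val_score() fits a fresh copy of the estimator on each training split, so the result does not depend on whether the model was fitted beforehand.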

Finally, you will print the importance of each feature. The feature_importances_ attribute returns an array of importance scores, one per feature. Each score reflects how much that feature contributed to reducing impurity (Gini impurity by default) across all the decision nodes where it was used, averaged over the trees in the forest. In other words, the more a feature helps split the data in a useful way, the higher its importance.

However, the attribute only gives the scores without feature names. To display both, you can pair them using Python’s zip() function:

for feature, importance in zip(X.columns, model.feature_importances_):
    print(feature, importance)

This prints each feature name along with its importance score, making it easier to understand which features the model relied on most.
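The pairing above can be extended to rank features from most to least important. The following is a sketch under stated assumptions: the data is synthetic, and the column names f0 to f3 are hypothetical placeholders.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with hypothetical column names (assumption)
features, y = make_classification(n_samples=200, n_features=4, random_state=42)
X = pd.DataFrame(features, columns=["f0", "f1", "f2", "f3"])

model = RandomForestClassifier(random_state=42).fit(X, y)

# Pair each name with its score, then sort from most to least important
ranked = sorted(
    zip(X.columns, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for feature, importance in ranked:
    print(f"{feature}: {importance:.3f}")
```

The scores are normalized to sum to 1, so each value can be read as a share of the total impurity reduction.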

Task

You are given a Titanic dataset stored as a DataFrame in the df variable.

  • Initialize the Random Forest model, set random_state=42, train it, and store the fitted model in the random_forest variable.
  • Calculate the cross-validation scores for the trained model using 10 folds, and store the resulting scores in the cv_scores variable.

