ML Introduction with scikit-learn
Scikit-learn Concepts
The scikit-learn (imported as sklearn) library offers various functions and classes for preprocessing data and modeling. The main sklearn objects are estimator, transformer, predictor, and model.
Estimator
Any sklearn class with a .fit() method is considered an estimator. The .fit() method allows an object to learn from the data.
In other words, the .fit() method is for training an object. It takes X and y parameters (y is optional for unsupervised learning tasks).
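As a minimal sketch of what calling .fit() looks like, the snippet below trains two estimators on a tiny made-up dataset. The choice of StandardScaler and LogisticRegression, and the toy values in X and y, are assumptions made for illustration only; any other estimator would follow the same pattern.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy data invented for this illustration: one feature, four samples.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([0, 0, 1, 1])

# An estimator that learns from X alone (no y needed).
scaler = StandardScaler()
scaler.fit(X)
print(scaler.mean_)  # per-column mean learned during .fit()

# A supervised estimator: .fit() needs both X and y.
clf = LogisticRegression()
clf.fit(X, y)
print(clf.classes_)  # class labels seen during .fit()
```

By scikit-learn convention, whatever an estimator learns during .fit() is stored in attributes whose names end with an underscore, such as mean_ and classes_ here.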
On its own, an object that only learns from data without doing anything with it is not very helpful. The two kinds of estimators that build on it, the transformer and the predictor, are much more useful.
Transformer
A transformer has the .fit() method and the .transform() method, which transforms the data in some way.
Usually, transformers need to learn something from the data before transforming it, so you need to apply .fit() and then .transform(). To avoid making two separate calls, transformers also have the .fit_transform() method.
.fit_transform() leads to the same result as applying .fit() and .transform() sequentially, but is sometimes faster, so it is preferable to .fit().transform().
The nan values shown in the training set in the picture indicate missing data in Python.
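The equivalence between the two-step and one-step calls can be sketched with a transformer that fills in nan values. SimpleImputer with the mean strategy and the small X_train array are assumptions chosen for illustration, not necessarily the transformer shown in the course picture.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up training set with missing values (np.nan).
X_train = np.array([
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, np.nan],
])

# Two-step version: learn the column means, then fill the gaps.
imputer_a = SimpleImputer(strategy="mean")
imputer_a.fit(X_train)
X_two_step = imputer_a.transform(X_train)

# One-step version: learn and transform in a single call.
imputer_b = SimpleImputer(strategy="mean")
X_one_step = imputer_b.fit_transform(X_train)

print(X_one_step)
print(np.array_equal(X_two_step, X_one_step))  # True
```

Each nan is replaced by the mean of the non-missing values in its column (2.0 for the first column, 15.0 for the second), and both approaches produce the same array.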
Predictor
A predictor is an estimator (has the .fit() method) that has the .predict() method. The .predict() method is used for making predictions.
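A short sketch of the fit-then-predict pattern is shown below. KNeighborsClassifier with a single neighbor and the toy data are assumptions picked so the result is easy to verify by hand; other predictors are used the same way.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up training data: one feature, two classes.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([0, 0, 1, 1])

# Train the predictor, then ask it to label unseen samples.
predictor = KNeighborsClassifier(n_neighbors=1)
predictor.fit(X_train, y_train)

X_new = np.array([[1.2], [3.8]])
preds = predictor.predict(X_new)
print(preds)  # [0 1]
```

With one neighbor, each new sample takes the label of its closest training point: 1.2 is nearest to 1.0 (class 0) and 3.8 is nearest to 4.0 (class 1).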
Model
A model is a type of predictor that also includes the .score() method. This method calculates a score (metric) to measure the predictor's performance.
As mentioned in the previous chapter, accuracy is a metric representing the percentage of correct predictions.
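To make the .score() method concrete, here is a minimal sketch that evaluates a classifier on a small made-up test set. The classifier choice (KNeighborsClassifier) and all data values are illustrative assumptions. For scikit-learn classifiers, .score() returns the mean accuracy, which the snippet cross-checks against accuracy_score.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Made-up training and test sets.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[1.1], [2.2], [2.9], [3.9]])
y_test = np.array([0, 0, 1, 0])  # the last label disagrees with the model

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)

# .score() predicts on X_test and compares the result with y_test.
acc = model.score(X_test, y_test)
print(acc)  # 0.75

# The same number computed explicitly from the predictions.
print(accuracy_score(y_test, model.predict(X_test)))  # 0.75
```

Three of the four test samples are predicted correctly, so the accuracy is 0.75. Regressors in scikit-learn also have .score(), but it reports the R² coefficient instead of accuracy.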
The preprocessing stage involves working with transformers, and we work with predictors (more specifically with models) at the modeling stage.