Baseline Model and Manual Tuning
When you begin an AutoML project, the first step is to establish a baseline model. This baseline provides a reference point for evaluating the performance of more complex models and automated pipelines. By creating a simple model using default settings, you can measure how much improvement is gained by applying advanced techniques or automation. The baseline acts as a benchmark, helping you understand whether your AutoML solution truly adds value.
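Before fitting any real model, it can help to measure the weakest sensible benchmark: a classifier that always predicts the majority class. The sketch below uses scikit-learn's DummyClassifier on a small synthetic dataset (the dataset parameters here are illustrative, not from the lesson); any tuned model should comfortably beat this number.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Illustrative toy dataset (values chosen arbitrarily for this sketch)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A majority-class "model": ignores the features entirely
dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(X_train, y_train)
dummy_acc = accuracy_score(y_test, dummy.predict(X_test))
print(f"Majority-class accuracy: {dummy_acc:.3f}")
```

With roughly balanced classes, this floor sits near 0.5; if an AutoML pipeline only barely clears it, the extra complexity is not paying off.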
Once you have a baseline, you can try to improve its performance through manual hyperparameter tuning. This process involves adjusting the model's settings—called hyperparameters—to see how changes affect results. Manual tuning is a hands-on way to learn which parameters matter most, and it sets the stage for understanding automated hyperparameter optimization later in the AutoML workflow.
```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Create a more complex, noisy dataset
X, y = make_classification(
    n_samples=1200,
    n_features=20,
    n_informative=6,
    n_redundant=4,
    n_clusters_per_class=3,
    class_sep=0.7,   # lower separation -> harder
    flip_y=0.05,     # 5% label noise
    random_state=42
)

# Split data into train/test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# --- Baseline model ---
baseline_model = DecisionTreeClassifier(random_state=42)
baseline_model.fit(X_train, y_train)
baseline_pred = baseline_model.predict(X_test)
baseline_acc = accuracy_score(y_test, baseline_pred)
print(f"Baseline accuracy: {baseline_acc:.3f}")

# --- Tuned model ---
tuned_model = DecisionTreeClassifier(
    max_depth=20,
    min_samples_split=5,
    min_samples_leaf=2,
    random_state=42
)
tuned_model.fit(X_train, y_train)
tuned_pred = tuned_model.predict(X_test)
tuned_acc = accuracy_score(y_test, tuned_pred)
print(f"Tuned accuracy: {tuned_acc:.3f}")
```
By fitting a DecisionTreeClassifier with default settings and then manually adjusting hyperparameters such as max_depth, min_samples_split, and min_samples_leaf, you can see how performance changes. This hands-on approach shows the impact of parameter choices and provides insight into why hyperparameter optimization matters in AutoML.
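One simple way to do this by hand is to sweep a single hyperparameter and record the score for each value. The sketch below varies max_depth on the same kind of noisy dataset used above (the candidate values are illustrative); in real work you would score on a separate validation set or use cross-validation rather than the test set, to avoid tuning to the test data.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Same kind of noisy dataset as in the example above
X, y = make_classification(
    n_samples=1200, n_features=20, n_informative=6, n_redundant=4,
    n_clusters_per_class=3, class_sep=0.7, flip_y=0.05, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Manually sweep one hyperparameter and record accuracy for each value
# (candidate depths chosen arbitrarily for this sketch)
results = {}
for depth in [3, 5, 10, 20, None]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    model.fit(X_train, y_train)
    results[depth] = accuracy_score(y_test, model.predict(X_test))
    print(f"max_depth={depth}: accuracy={results[depth]:.3f}")

best_depth = max(results, key=results.get)
print(f"Best max_depth tried: {best_depth}")
```

This loop is, in miniature, what automated hyperparameter optimization does: it searches a space of settings and keeps the best-scoring configuration, only over many parameters and with smarter search strategies than an exhaustive sweep.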