Classification with Python
In machine learning, classification is a supervised learning task: data instances are assigned to predefined classes (labels) based on their features.
The goal is to build a model from labeled examples that accurately assigns new, unseen data points to the correct class.
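To make "build a model, then assign new points" concrete, here is a minimal sketch written from scratch with NumPy. It uses a nearest-centroid rule, which is just one simple way to classify and is chosen here only for illustration. The data and the helper names `fit_centroids` and `predict` are hypothetical.

```python
import numpy as np

# Hypothetical labeled data: two features per instance, two classes (0 and 1)
X_train = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
                    [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])


def fit_centroids(X, y):
    """Learn one centroid (mean feature vector) per class."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X_new, classes, centroids):
    """Assign each new point to the class whose centroid is closest."""
    # distances has shape (n_points, n_classes)
    distances = np.linalg.norm(X_new[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]


# "Training" here means computing the class centroids from labeled examples
classes, centroids = fit_centroids(X_train, y_train)

# New, unseen points get the label of the nearest centroid
X_new = np.array([[1.0, 1.0], [3.0, 3.0]])
print(predict(X_new, classes, centroids))
```

The "model" in this sketch is nothing more than the two stored centroids. Real classifiers learn richer decision rules, but the workflow of fitting on labeled data and predicting on new data is the same.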
How can we use classification in real life?
- Email Spam Detection: Classification is used to determine whether an incoming email is spam or not spam (ham). Features extracted from the email's content and metadata are used to make this classification;
- Image Recognition: Classification algorithms can identify objects, people, animals, and scenes in images. This technology powers applications like self-driving cars, security cameras, and medical image analysis;
- Medical Diagnosis: Classification helps diagnose diseases based on medical test results, patient history, and symptoms;
- Text Categorization: News articles, legal documents, and social media posts can be classified into categories for information retrieval, content organization, and recommendation systems.
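To make the spam-detection case concrete, here is a minimal sketch, not a production filter. The six messages are invented, and the combination of bag-of-words counts (`CountVectorizer`) with a multinomial naive Bayes model (`MultinomialNB`) is an assumption made for illustration; the list above does not prescribe a particular algorithm.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy messages with their labels
emails = [
    "win a free prize now",
    "free money click now",
    "claim your free reward",
    "meeting at noon tomorrow",
    "project report attached",
    "lunch with the team tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Step 1 turns each email into word-count features,
# step 2 learns which words are typical of each class
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(emails, labels)

# Classify two messages the model has not seen before
new_emails = ["free prize click now", "team meeting tomorrow"]
predicted = spam_filter.predict(new_emails)
for text, label in zip(new_emails, predicted):
    print(f"{text!r} -> {label}")
```

With only six training messages, the result says nothing about real-world performance; the point is the pipeline shape: extract features from text, then fit a classifier on them.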
Example
Let's consider a simple classification task: classifying whether a fruit is an apple or an orange based on weight and size.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Small hand-made dataset
    # Features: weight in grams and size in centimeters
    X = np.array([[120, 6], [150, 7], [100, 5], [130, 6.5], [170, 7.5],
                  [130, 6], [180, 8], [90, 4.5], [110, 5.5], [160, 7],
                  [145, 6.5], [155, 7], [140, 6.5]])
    # Labels: 0 for apple, 1 for orange
    y = np.array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

    # Split the data into training and testing sets
    # (with 13 samples, test_size=0.2 leaves only 3 samples for testing)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # Create a decision tree classifier (fixed seed for reproducible results)
    classifier = DecisionTreeClassifier(random_state=0)

    # Train the classifier on the training data
    classifier.fit(X_train, y_train)

    # Predict labels on the test data
    y_pred = classifier.predict(X_test)

    # Evaluate the classifier
    accuracy = accuracy_score(y_test, y_pred)
    print(f'Accuracy: {accuracy:.2f}')

    # Visualize the data, one scatter call per class so the legend is correct
    plt.figure(figsize=(8, 6))
    plt.scatter(X[y == 0, 0], X[y == 0, 1], label='Apple')
    plt.scatter(X[y == 1, 0], X[y == 1, 1], label='Orange')
    plt.xlabel('Weight (grams)')
    plt.ylabel('Size (cm)')
    plt.title('Apple vs Orange Classification')
    plt.xlim(80, 200)
    plt.ylim(4, 9)
    plt.legend()
    plt.show()

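Once a classifier like the one in this example is trained, it can label fruits it has never seen, and the rules a decision tree has learned can be printed as text. The following is a minimal sketch under stated assumptions: it reuses the same hand-made dataset, trains on all 13 samples instead of a train split, and the two new fruits and the feature names `weight_g` and `size_cm` are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Same hand-made dataset: weight in grams, size in centimeters
X = np.array([[120, 6], [150, 7], [100, 5], [130, 6.5], [170, 7.5],
              [130, 6], [180, 8], [90, 4.5], [110, 5.5], [160, 7],
              [145, 6.5], [155, 7], [140, 6.5]])
# Labels: 0 for apple, 1 for orange
y = np.array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

# Train on every sample (an assumption of this sketch, to keep it short)
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X, y)

# Two hypothetical fruits the model has not seen: one small, one large
new_fruits = np.array([[95, 4.8], [175, 7.8]])
predictions = classifier.predict(new_fruits)

names = {0: "apple", 1: "orange"}
for fruit, label in zip(new_fruits, predictions):
    print(f"weight={fruit[0]} g, size={fruit[1]} cm -> {names[label]}")

# Print the learned if/else rules in a readable text form
rules = export_text(classifier, feature_names=["weight_g", "size_cm"])
print(rules)
```

Reading the printed rules is a quick way to check whether the tree has learned something sensible, such as a weight threshold, or has simply memorized individual training points.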