Modifying Data Structure


import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade":[85, 62, 90, 78]
})

students["Passed"] = students["Grade"] > 70
print(students)

Note

Pandas lets you transform entire columns at once. This includes arithmetic, text adjustments, logical expressions, and more. You can combine columns using element-wise operations or apply a single value to all rows (for example, adding a constant).
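As a rough illustration of the operations mentioned in the note, the sketch below applies each kind to the same students table. The new column names (AgeNextYear, NameUpper, GradePerYear, YoungAndPassed) and the thresholds are made up for this example.

```python
import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade": [85, 62, 90, 78]
})

# Arithmetic with a constant: the value is applied to every row
students["AgeNextYear"] = students["Age"] + 1
# Text adjustment through the .str accessor
students["NameUpper"] = students["Name"].str.upper()
# Element-wise combination of two columns
students["GradePerYear"] = students["Grade"] / students["Age"]
# Logical expression: both conditions are checked row by row
students["YoungAndPassed"] = (students["Age"] < 25) & (students["Grade"] > 70)

print(students)
```

Each condition is wrapped in parentheses because & binds more tightly than the comparison operators.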

Adding and Removing Columns

Adding a new column is as simple as assigning values to a new column name. To remove a column, use the .drop() method.


import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade":[85, 62, 90, 78]
})

# Add a column
students["GradePlus5"] = students["Grade"] + 5
# Remove a column
students = students.drop("Grade", axis=1)

print(students)

The axis=1 argument specifies that you want to remove a column; the default, axis=0, removes rows by index label.
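A possible alternative, shown here only as a sketch, is the columns= keyword of .drop(), which makes the intent explicit and also accepts a list of labels. The variable names without_grade and names_only are illustrative.

```python
import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade": [85, 62, 90, 78]
})

# columns= has the same effect as passing the label with axis=1
without_grade = students.drop(columns="Grade")
# A list of labels removes several columns in one call
names_only = students.drop(columns=["Age", "Grade"])

print(list(without_grade.columns))
print(list(names_only.columns))
# drop() returns a new DataFrame, so the original is unchanged
print(list(students.columns))
```

Because .drop() does not modify the DataFrame in place by default, the result has to be assigned to a variable to take effect.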

Adding and Removing Rows

There are two ways to add a new row:

Use .loc[] with a new index label and values;
Use pd.concat() to join two DataFrames.

To remove a row, pass its index label to the .drop() method.


import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade":[85, 62, 90, 78]
})

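# Add a row with loc: the label must not exist yet (here len(students) is 4),
# otherwise the existing row with that label is overwritten
students.loc[len(students)] = ["Eve", 28, 88]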
# Add a row with concat
new_row = pd.DataFrame({"Name": ["Frank"], "Age": [25], "Grade": [95]})
students = pd.concat([students, new_row], ignore_index=True)
# Remove a row
students = students.drop(1)

print(students)

With ignore_index=True, pd.concat() discards the original index values and generates a new continuous index (0, 1, 2, ...).
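To make the difference visible, the following sketch concatenates the same two DataFrames with and without ignore_index=True. It uses a shortened two-column table, and the names kept and renumbered are illustrative.

```python
import pandas as pd

students = pd.DataFrame({"Name": ["Alice", "Bob"], "Grade": [85, 62]})
new_row = pd.DataFrame({"Name": ["Frank"], "Grade": [95]})

# Default behaviour: each DataFrame keeps its own labels, so 0 appears twice
kept = pd.concat([students, new_row])
# ignore_index=True: the labels are replaced with a fresh 0..n-1 range
renumbered = pd.concat([students, new_row], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0]
print(renumbered.index.tolist())  # [0, 1, 2]
```

Duplicate labels such as the repeated 0 above can make later lookups with .loc[] return more than one row.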

Working with Indexes

Whenever you add or remove rows, you might notice that the row labels (the index) no longer form a clean sequence. To fix this, you can reset the index to restore simple row numbering or set one of the columns as the new index if you want labels that are more meaningful:


import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade":[85, 62, 90, 78]
})

# Set "Name" as the index
students = students.set_index("Name")
print(students)
# Reset to default numeric index
students = students.reset_index()
print(students)

If you want to keep a simple numeric index, it is a good idea to call .reset_index(drop=True) after adding or removing rows. The drop=True argument discards the old index instead of keeping it as a new column.
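A short sketch of that cleanup step, assuming the same students table as in the earlier examples:

```python
import pandas as pd

students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age":  [24, 30, 22, 21],
    "Grade": [85, 62, 90, 78]
})

# Removing the row labelled 1 leaves a gap in the index
students = students.drop(1)
print(students.index.tolist())  # [0, 2, 3]

# drop=True renumbers the rows without adding an "index" column
students = students.reset_index(drop=True)
print(students.index.tolist())  # [0, 1, 2]
print(list(students.columns))   # ['Name', 'Age', 'Grade']
```

Like .drop(), .reset_index() returns a new DataFrame, so the result is assigned back to students.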