How to delete DataFrame rows in Pandas based on a column value?

To delete rows in a Pandas DataFrame based on a column value, you can use boolean indexing, drop(), query(), isin(), or dropna(). Each method is shown below with an example:

1. Using Boolean Indexing

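Build a boolean mask for the rows you want to keep and index the DataFrame with it. Rows where the mask is False are left out, and the original df is not modified.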

Example DataFrame:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 17, 22],
    'Gender': ['F', 'M', 'M', 'M']
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age Gender
0    Alice   25      F
1      Bob   30      M
2  Charlie   17      M
3    David   22      M

Delete Rows Where Age < 18

df_filtered = df[df['Age'] >= 18]  # Keep rows where Age >= 18
print(df_filtered)

Output:

    Name  Age Gender
0  Alice   25      F
1    Bob   30      M
3  David   22      M
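
Note that the filtered result keeps the original index labels (0, 1, 3). If a contiguous index is preferred, one option is to chain reset_index(drop=True). The sketch below rebuilds the sample data so it runs on its own; the variable name adults is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 17, 22],
    'Gender': ['F', 'M', 'M', 'M'],
})

# Keep rows where Age >= 18, then renumber the rows and discard the old labels
adults = df[df['Age'] >= 18].reset_index(drop=True)
print(adults.index.tolist())  # [0, 1, 2]
```

Without drop=True, the old labels would be kept as a new column named index.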

2. Using `drop()` with Index

Collect the index labels of the rows that meet the condition, then pass them to drop(). Note that drop() works on index labels, not on integer positions.

Example:

# Get indices of rows to delete
indices_to_drop = df[df['Gender'] == 'M'].index

# Drop rows by index
df_filtered = df.drop(indices_to_drop)
print(df_filtered)

Output:

    Name  Age Gender
0  Alice   25      F
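
With the default integer index, labels and positions happen to coincide, which can hide the fact that drop() matches labels. The following sketch uses a hypothetical string index (not part of the example above) to make the distinction visible.

```python
import pandas as pd

df = pd.DataFrame(
    {'Age': [25, 30, 17, 22], 'Gender': ['F', 'M', 'M', 'M']},
    index=['alice', 'bob', 'charlie', 'david'],  # hypothetical string labels
)

# .index on the filtered frame returns labels, not positions
labels_to_drop = df[df['Age'] < 18].index
print(labels_to_drop.tolist())  # ['charlie']

# drop() removes every row whose label is in the list
remaining = df.drop(labels_to_drop)
print(remaining.index.tolist())  # ['alice', 'bob', 'david']
```

If the index contains duplicate labels, drop() removes all rows that share a dropped label, so boolean indexing is the safer choice in that case.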

3. Using `query()` Method

query() keeps the rows that satisfy an expression string, so write the condition for the rows you want to keep. It is useful for complex conditions.

Example:

df_filtered = df.query("Age >= 18 and Gender == 'M'")
print(df_filtered)

Output:

    Name  Age Gender
1    Bob   30      M
3  David   22      M
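
Because query() keeps matching rows, a deletion condition can be expressed by negating it with not. The sketch below also references a local Python variable with the @ prefix; min_age is a hypothetical name chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 17, 22],
    'Gender': ['F', 'M', 'M', 'M'],
})

min_age = 18  # hypothetical threshold, referenced in the query as @min_age

# Delete rows where Age is below the threshold AND Gender is 'M'
df_filtered = df.query("not (Age < @min_age and Gender == 'M')")
print(df_filtered['Name'].tolist())  # ['Alice', 'Bob', 'David']
```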

4. Delete Rows with Specific Values

Use isin() to match several values at once, then negate the mask with ~ to drop the matching rows.

Example:

# Delete rows where Name is 'Bob' or 'David'
df_filtered = df[~df['Name'].isin(['Bob', 'David'])]
print(df_filtered)

Output:

      Name  Age Gender
0    Alice   25      F
2  Charlie   17      M

5. Handle Missing Values

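Delete rows where a column has NaN values. Adding a NaN converts the integer Age column to float, which is why the output below shows 25.0 rather than 25.

Example:

import numpy as np

# Work on a copy so the original df keeps its four rows and integer Age column
df_with_nan = df.copy()
df_with_nan.loc[4] = ['Eva', np.nan, 'F']  # add a row with a missing Age

# Drop rows where 'Age' is NaN
df_filtered = df_with_nan.dropna(subset=['Age'])
print(df_filtered)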

Output:

      Name   Age Gender
0    Alice  25.0      F
1      Bob  30.0      M
2  Charlie  17.0      M
3    David  22.0      M
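
The same result can be obtained with a boolean mask built from notna(). A minimal sketch on a small two-row frame (the frame and variable names are illustrative) shows the equivalence and why comparing against np.nan does not work.

```python
import numpy as np
import pandas as pd

df_with_nan = pd.DataFrame({
    'Name': ['Alice', 'Eva'],
    'Age': [25, np.nan],
})

via_dropna = df_with_nan.dropna(subset=['Age'])
via_mask = df_with_nan[df_with_nan['Age'].notna()]
print(via_dropna.equals(via_mask))  # True

# NaN never compares equal to anything, so this mask matches no rows
print((df_with_nan['Age'] == np.nan).any())  # False
```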

6. Invert Conditions with `~`

Use the tilde operator (~) to negate a condition.

Example:

# Delete rows where Age is even
df_filtered = df[~(df['Age'] % 2 == 0)]
print(df_filtered)

Output:

      Name  Age Gender
0    Alice   25      F
2  Charlie   17      M

7. Modify DataFrame In-Place

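Pass inplace=True to drop() to modify the original DataFrame instead of returning a new one. The removed rows can no longer be recovered from df, so keep a copy if you still need them.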

df.drop(df[df['Age'] < 18].index, inplace=True)
print(df)

Output:

    Name  Age Gender
0  Alice   25      F
1    Bob   30      M
3  David   22      M
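
One detail that is easy to miss: with inplace=True, drop() returns None, so its result should not be assigned back to a variable. The sketch below illustrates this and keeps a copy for comparison; backup and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 17, 22],
})

backup = df.copy()  # independent copy taken before the in-place change

# inplace=True mutates df and returns None
result = df.drop(df[df['Age'] < 18].index, inplace=True)
print(result)       # None
print(len(df))      # 3
print(len(backup))  # 4
```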

Key Takeaways

| Method | Use Case | Example |
|---|---|---|
| Boolean Indexing | Simple row exclusion. | `df[df['Age'] >= 18]` |
| `drop()` | Delete by index label. | `df.drop(indices_to_drop)` |
| `query()` | Complex conditions (SQL-like syntax). | `df.query("Age > 20")` |
| `isin()` | Filter multiple values. | `df[~df['Name'].isin(['Bob'])]` |
| `dropna()` | Remove rows with missing values. | `df.dropna(subset=['Age'])` |

Common Pitfalls

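Chained Indexing: Avoid assigning through df[df['Age'] > 18]['Name'], which may modify a temporary copy; use df.loc[df['Age'] > 18, 'Name'] instead.
In-Place Modification: inplace=True overwrites the original DataFrame and returns None, so do not assign its result back to df.
Operator Precedence: & and | bind more tightly than comparison operators, so wrap each condition in parentheses: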

   df[(df['Age'] > 18) & (df['Gender'] == 'M')]
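
As a rough illustration of the first and third pitfalls together, the sketch below builds a parenthesized compound mask and uses it with .loc for a single-step assignment. Upper-casing the names is just a hypothetical change chosen to make the effect visible.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 17, 22],
    'Gender': ['F', 'M', 'M', 'M'],
})

# Each comparison is wrapped in parentheses before combining with &
mask = (df['Age'] > 18) & (df['Gender'] == 'M')

# .loc selects rows by the mask and the column by name in one step,
# so the assignment is applied to df itself rather than to a temporary copy
df.loc[mask, 'Name'] = df.loc[mask, 'Name'].str.upper()
print(df['Name'].tolist())  # ['Alice', 'BOB', 'Charlie', 'DAVID']
```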

By using these methods, you can efficiently filter or delete rows in a Pandas DataFrame based on column values.
