To set the value of a specific cell in a pandas DataFrame by index label, use the .loc or .at accessor. The sections below explain each with examples:
1. Using loc (Label-Based Indexing)
The .loc accessor reads or modifies values by row index label and column name.
Syntax
df.loc[row_index_label, column_name] = new_value
Example
import pandas as pd
# Create a DataFrame with a custom index
data = {
'A': [10, 20, 30],
'B': [40, 50, 60],
'C': [70, 80, 90]
}
df = pd.DataFrame(data, index=['X', 'Y', 'Z'])
print("Original DataFrame:")
print(df)
Output:
A B C
X 10 40 70
Y 20 50 80
Z 30 60 90
Set Value for Index ‘Y’ and Column ‘B’
df.loc['Y', 'B'] = 99
print("\nDataFrame after modification:")
print(df)
Output:
A B C
X 10 40 70
Y 20 99 80
Z 30 60 90
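Because .loc also accepts lists of labels, the same syntax can update several cells in one statement. The following is a minimal sketch that rebuilds the DataFrame from the example above; the replacement values are arbitrary and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [10, 20, 30], 'B': [40, 50, 60], 'C': [70, 80, 90]},
    index=['X', 'Y', 'Z'],
)

# One row, two columns: the values are assigned left to right
df.loc['Y', ['A', 'B']] = [21, 51]

# Two rows, one column: the scalar is broadcast to both cells
df.loc[['X', 'Z'], 'C'] = 0

print(df)
```

When a list of values is assigned, its length has to match the number of selected cells; a scalar is simply repeated.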
2. Using at (Faster Scalar Access)
The .at accessor is optimized for scalar (single-cell) access and is faster than .loc for this purpose. It accepts exactly one row label and one column label.
Syntax
df.at[row_index_label, column_name] = new_value
Example
# Set value for index 'Z' and column 'A'
df.at['Z', 'A'] = 55
print(df)
Output:
A B C
X 10 40 70
Y 20 99 80
Z 55 60 90
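.at is most useful when cells are updated one at a time, for example inside a loop. The sketch below applies a small dictionary of updates to a fresh copy of the example data; the updates mapping and its values are hypothetical and exist only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [10, 20, 30], 'B': [40, 50, 60], 'C': [70, 80, 90]},
    index=['X', 'Y', 'Z'],
)

# Hypothetical updates keyed by (row label, column name)
updates = {('X', 'A'): 11, ('Z', 'C'): 91}

for (row, col), value in updates.items():
    df.at[row, col] = value

# .at also reads a single cell, so in-place arithmetic works
df.at['Y', 'B'] += 1

print(df)
```

If many cells share the same new value or follow a pattern, a single vectorized .loc assignment is usually a better fit than a loop over .at.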
3. Handling Integer-Based Indexes
If your DataFrame uses the default integer index (e.g., 0, 1, 2), you can still use .loc or .at. The integers are treated as labels, not as positions:
# Create a DataFrame with a default integer index
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6]
})
# Set value for row index 1 and column 'B'
df.loc[1, 'B'] = 99
print(df)
Output:
A B
0 1 4
1 2 99
2 3 6
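The label-versus-position distinction becomes visible once an integer index no longer starts at 0. The sketch below assumes an index of 10, 20, 30 (chosen for illustration) and contrasts .loc with .iloc:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=[10, 20, 30])

# Label-based: targets the row whose index label is 20 (the second row)
df.loc[20, 'B'] = 99

# Position-based: targets the first row, whatever its label is
df.iloc[0, 1] = 77

print(df)
```

This situation is common after filtering or sorting, when the surviving integer labels no longer match row positions.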
4. Set Values Conditionally Based on Index
You can also use conditions on the index to modify cells.
Example: Set values for rows where index > 1
# Create a DataFrame with a numeric index
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6]
}, index=[1, 2, 3])
# Set column 'B' to 0 for rows where index > 1
df.loc[df.index > 1, 'B'] = 0
print(df)
Output:
A B
1 1 4
2 2 0
3 3 0
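The same boolean-mask pattern works for membership tests on the index. As a sketch, Index.isin selects rows by a set of labels; the labels and the replacement value here are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['X', 'Y', 'Z'])

# True for rows whose index label is in the given list
mask = df.index.isin(['X', 'Z'])

# Set column 'B' to 0 only for the matching rows
df.loc[mask, 'B'] = 0

print(df)
```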
5. Handling MultiIndex DataFrames
For DataFrames with a MultiIndex (hierarchical index), use tuples to specify the index labels.
Example
# Create a MultiIndex DataFrame
index = pd.MultiIndex.from_tuples(
[('Group1', 'Item1'), ('Group1', 'Item2'), ('Group2', 'Item1')],
names=['Group', 'Item']
)
df = pd.DataFrame({'Price': [100, 200, 300]}, index=index)
print("Original DataFrame:")
print(df)
Output:
              Price
Group  Item
Group1 Item1    100
       Item2    200
Group2 Item1    300
Modify Value for MultiIndex (‘Group1’, ‘Item2’)
df.loc[('Group1', 'Item2'), 'Price'] = 250
print("\nDataFrame after modification:")
print(df)
Output:
              Price
Group  Item
Group1 Item1    100
       Item2    250
Group2 Item1    300
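To update every row in one group rather than a single (Group, Item) pair, one option is a boolean mask built from a single index level. The sketch below uses Index.get_level_values on the same data; the new price of 0 is arbitrary.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('Group1', 'Item1'), ('Group1', 'Item2'), ('Group2', 'Item1')],
    names=['Group', 'Item'],
)
df = pd.DataFrame({'Price': [100, 200, 300]}, index=index)

# True for every row whose 'Group' level equals 'Group1'
in_group1 = df.index.get_level_values('Group') == 'Group1'

df.loc[in_group1, 'Price'] = 0

print(df)
```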
6. Handling Duplicate Index Labels
If the DataFrame has duplicate index labels, .loc modifies every row that carries that label. Be cautious!
Example
# Create a DataFrame with duplicate index labels
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6]
}, index=['X', 'X', 'Y'])
# Set column 'A' to 99 for all rows with index 'X'
df.loc['X', 'A'] = 99
print(df)
Output:
A B
X 99 4
X 99 5
Y 3 6
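One way to guard against unintended multi-row updates is to check the index for duplicates first, and to narrow the selection when only one of the duplicated rows should change. The sketch below uses Index.is_unique and Index.duplicated; updating only the first 'X' row is an assumed goal for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['X', 'X', 'Y'])

# Detect duplicate labels before assigning by label
print(df.index.is_unique)     # False
print(df.index.duplicated())  # marks the second and later occurrences

# Select only the first row labelled 'X': label matches AND is not a repeat
first_x = (df.index == 'X') & ~df.index.duplicated(keep='first')
df.loc[first_x, 'A'] = 99

print(df)
```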
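7. Using iloc (Position-Based Indexing)
To address a cell by integer position rather than by index label, use .iloc. This is useful when you know where a row sits but not its label, for example when the index holds strings or contains duplicates. The example continues with the DataFrame from section 6.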
Example
# Set value for the 2nd row (position 1) and 1st column (position 0)
df.iloc[1, 0] = 999
print(df)
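Labels and positions can also be mixed by converting a label to its position first. The sketch below uses Index.get_loc for the conversion and .iat, the positional counterpart of .at, for a single cell; the data and values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['X', 'Y', 'Z'])

# Translate labels to integer positions
row_pos = df.index.get_loc('Y')    # 1
col_pos = df.columns.get_loc('B')  # 1

df.iloc[row_pos, col_pos] = 99

# .iat sets a single cell by position
df.iat[0, 0] = 7

print(df)
```

get_loc returns a plain integer only when the label is unique; with duplicate labels it returns a slice or a boolean mask instead.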
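Common Pitfalls
- KeyError: Raised when you read a row or column label that does not exist. Assigning to a missing label with .loc or .at does not raise an error. It adds a new row or column instead, so a typo in a label can silently enlarge the DataFrame.
- SettingWithCopyWarning: Triggered when you assign into a DataFrame that was derived from another one by slicing or filtering, or when you use chained indexing such as df[mask]['B'] = 0. Do the assignment in a single .loc call on the original, or call .copy() on the subset if you want an independent DataFrame.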
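Both pitfalls can be reproduced in a few lines. The sketch below uses made-up data and a hypothetical missing label 'W'; it shows a read that raises KeyError, an assignment that adds a row, and an explicit .copy() before a filtered subset is modified.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]}, index=['X', 'Y', 'Z'])

# Reading a missing label raises KeyError
try:
    df.loc['W', 'A']
except KeyError:
    print("'W' is not in the index")

# Assigning to a missing label adds a new row instead of raising;
# the cells that were not assigned are filled with NaN
df.loc['W', 'A'] = 1
print(df.index.tolist())  # ['X', 'Y', 'Z', 'W']

# Take an explicit copy before modifying a filtered subset
subset = df[df['A'] >= 20].copy()
subset.loc['Y', 'B'] = 0

# The original DataFrame is unaffected by the change to the copy
print(df.loc['Y', 'B'])
```

Checking `label in df.index` before assigning is a simple way to avoid accidental enlargement.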
Summary Table
Method | Use Case | Syntax | Performance |
---|---|---|---|
loc | Label-based indexing for single or multiple cells | df.loc[row, col] = value | Slower for scalars |
at | Optimized for scalar (single-cell) access | df.at[row, col] = value | Faster for scalars |
iloc | Position-based indexing | df.iloc[row_pos, col_pos] = value | Position-dependent |
Examples Recap
- Basic .loc/.at for single cells.
- Conditional updates based on index.
- MultiIndex DataFrames.
- Integer-based indexing with .iloc.
By understanding these methods, you can efficiently modify values in a DataFrame using index labels!