To add a new column to an existing pandas DataFrame, you can use several methods depending on your specific use case. Below are the most common techniques with detailed explanations and examples.
1. Direct Assignment (Simplest Method)
Add a new column by assigning values directly to a new column name.
Example 1: Add a column with a constant value
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35]
})
# Add a new column with a constant value
df['Country'] = 'USA'
print(df)
Output:
Name Age Country
0 Alice 25 USA
1 Bob 30 USA
2 Charlie 35 USA
Example 2: Add a column with calculated values
# Add a column based on existing columns
df['Birth Year'] = 2023 - df['Age']
print(df)
Output:
Name Age Country Birth Year
0 Alice 25 USA 1998
1 Bob 30 USA 1993
2 Charlie 35 USA 1988
2. Using assign() (Method Chaining)
The assign() method returns a new DataFrame with the added column (does not modify the original DataFrame unless you overwrite it).
Example:
# Create a new column without modifying the original DataFrame
df_new = df.assign(Salary = [70000, 80000, 90000])
print(df_new)
Output:
Name Age Country Birth Year Salary
0 Alice 25 USA 1998 70000
1 Bob 30 USA 1993 80000
2 Charlie 35 USA 1988 90000
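As a minimal sketch of the chaining use case, assign() also accepts a callable, which receives the DataFrame as it stands at that point in the chain. The Monthly column and the small DataFrame here are illustrative assumptions, not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Each keyword becomes a column. The lambda is called with the
# intermediate DataFrame, so it can use the Salary column added
# one step earlier in the chain.
result = (
    df
    .assign(Salary=[70000, 80000, 90000])
    .assign(Monthly=lambda d: d['Salary'] / 12)  # hypothetical derived column
)

print(result.columns.tolist())  # ['Name', 'Age', 'Salary', 'Monthly']
print(df.columns.tolist())      # ['Name', 'Age'] (original is unchanged)
```

Because nothing is modified in place, the result has to be bound to a name (or back to df) to be kept.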
3. Insert a Column at a Specific Position
Use df.insert(loc, column_name, values) to add a column at a specific index position.
Example:
# Insert 'Gender' as the second column (index=1)
df.insert(1, 'Gender', ['F', 'M', 'M'])
print(df)
Output:
Name Gender Age Country Birth Year
0 Alice F 25 USA 1998
1 Bob M 30 USA 1993
2 Charlie M 35 USA 1988
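Two behaviors of insert() are easy to miss: it modifies the DataFrame in place and returns None, and by default it refuses to insert a column label that already exists. A short sketch, using a two-row DataFrame made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# insert() works in place and returns None, so do not reassign its result.
returned = df.insert(1, 'Gender', ['F', 'M'])

# Inserting the same label a second time raises ValueError
# unless allow_duplicates=True is passed.
try:
    df.insert(0, 'Gender', ['F', 'M'])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(returned)             # None
print(df.columns.tolist())  # ['Name', 'Gender', 'Age']
print(duplicate_rejected)   # True
```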
4. Add a Column Conditionally
Use np.where() or boolean logic to create conditional columns.
Example:
import numpy as np
# Add a column based on a condition
df['Is Senior'] = np.where(df['Age'] > 30, 'Yes', 'No')
print(df)
Output:
Name Gender Age Country Birth Year Is Senior
0 Alice F 25 USA 1998 No
1 Bob M 30 USA 1993 No
2 Charlie M 35 USA 1988 Yes
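The np.where() example handles a two-way split. As a sketch of the other options mentioned, a plain comparison already yields a boolean column, and np.select() handles more than two outcomes. The column names and age bands below are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Boolean logic: a comparison returns a boolean Series that can be
# stored directly as a column.
df['Over 30'] = df['Age'] > 30

# np.select: conditions are checked in order and the first match wins;
# rows matching none of them get the default.
conditions = [df['Age'] < 30, df['Age'] < 35]
labels = ['Under 30', '30 to 34']
df['Age Group'] = np.select(conditions, labels, default='35+')

print(df['Over 30'].tolist())    # [False, False, True]
print(df['Age Group'].tolist())  # ['Under 30', '30 to 34', '35+']
```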
5. Add a Column from a List/Array
Ensure the list/array length matches the DataFrame’s row count.
Example:
# Add a column from a list
df['Department'] = ['HR', 'Engineering', 'Finance']
print(df)
Output:
Name Gender Age Country Birth Year Is Senior Department
0 Alice F 25 USA 1998 No HR
1 Bob M 30 USA 1993 No Engineering
2 Charlie M 35 USA 1988 Yes Finance
6. Add a Column Using apply()
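Use a function to compute column values. Series.apply() calls the function once per element of a column; DataFrame.apply() with axis=1 calls it once per row.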
Example:
# Add a column using a custom function
df['Name Length'] = df['Name'].apply(lambda x: len(x))
print(df)
Output:
Name Gender Age ... Department Name Length
0 Alice F 25 ... HR 5
1 Bob M 30 ... Engineering 3
2 Charlie M 35 ... Finance 7
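The example above applies a function to a single column, element by element. When the function needs several columns at once, one option is DataFrame.apply() with axis=1, which passes each row as a Series. A minimal sketch, where describe is a hypothetical helper, alongside a vectorized equivalent for comparison:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

def describe(row):
    # row is a Series holding one row; columns are accessed by label
    return f"{row['Name']} ({row['Age']})"

# axis=1 calls describe once per row
df['Label'] = df.apply(describe, axis=1)

# The same result without a Python-level call per row
df['Label Vectorized'] = df['Name'] + ' (' + df['Age'].astype(str) + ')'

print(df['Label'].tolist())  # ['Alice (25)', 'Bob (30)', 'Charlie (35)']
```

Row-wise apply() is convenient but runs a Python function for every row, so the vectorized form is usually the better choice on large DataFrames when one exists.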
Key Notes
- Overwriting Columns:
If the column name already exists, it will be overwritten.
df['Age'] = df['Age'] + 1 # Increments all values in the 'Age' column
- Alignment by Index:
When adding a Series, values align by index:
bonus = pd.Series([1000, 2000], index=[0, 2])
df['Bonus'] = bonus # Row 1 (index=1) gets NaN
- Performance:
Use vectorized operations (e.g., df['col1'] + df['col2']) instead of loops for efficiency.
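Index alignment is the note most likely to cause surprises, so here is a short sketch of it. It assumes a DataFrame whose index is not the default 0, 1, 2, which is common after filtering or sorting; the column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[10, 20, 30])
bonus = pd.Series([1000, 2000, 3000])  # default index: 0, 1, 2

# Assigning a Series matches on index labels. None of 0, 1, 2
# appear in df's index, so every row gets NaN.
df['Aligned'] = bonus

# Converting to a NumPy array drops the index, so values are
# assigned by position instead.
df['By Position'] = bonus.to_numpy()

print(df['Aligned'].isna().all())  # True
print(df['By Position'].tolist())  # [1000, 2000, 3000]
```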
Best Practices
- Use direct assignment (df['new_col'] = ...) for simple column additions.
- Use assign() when method chaining is preferred (e.g., in a pipeline).
- Use insert() to control the column position.
Common Errors
ValueError: Length Mismatch
Ensure the new column has the same number of rows as the DataFrame.
# This will fail if len(ages) != len(df)
df['Age'] = ages
KeyError
Check for typos in column names when referencing existing columns.
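Both errors can be reproduced in a few lines. The sketch below triggers each one deliberately and records the exception type; the Score column and the lowercase 'age' typo are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

errors = []

# Two values for three rows: pandas raises ValueError
try:
    df['Score'] = [90, 85]
except ValueError as exc:
    errors.append(type(exc).__name__)

# Column labels are case-sensitive: 'age' is not 'Age'
try:
    df['Birth Year'] = 2023 - df['age']
except KeyError as exc:
    errors.append(type(exc).__name__)

print(errors)               # ['ValueError', 'KeyError']
print(df.columns.tolist())  # ['Name', 'Age'] (neither column was added)
```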
Complete Example
import pandas as pd
# Create DataFrame
df = pd.DataFrame({
'Product': ['Laptop', 'Phone', 'Tablet'],
'Price': [1200, 800, 300]
})
# Add columns using different methods
df['Category'] = 'Electronics' # Direct assignment
df = df.assign(
Discount = lambda x: x['Price'] * 0.1 # 10% discount
)
df.insert(2, 'In Stock', [True, False, True]) # Insert at position 2
print(df)
Output:
Product Price In Stock Category Discount
0 Laptop 1200 True Electronics 120.0
1 Phone 800 False Electronics 80.0
2 Tablet 300 True Electronics 30.0
This covers all common scenarios for adding columns to a DataFrame! Let me know if you need further clarification.