To pivot a DataFrame in pandas, you can use the pivot()
, pivot_table()
, stack()
, unstack()
, or melt()
functions, depending on your goal. Here’s a breakdown of common methods:
1. pivot()
Method
Reshape data by assigning columns to index, columns, and values (fails if duplicates exist).
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({
'A': ['x', 'x', 'y', 'y'],
'B': ['a', 'b', 'a', 'b'],
'C': [1, 2, 3, 4]
})
# Pivot the DataFrame
pivoted = df.pivot(index='A', columns='B', values='C')
Output:
B a b
A
x 1 2
y 3 4
2. pivot_table()
Method
Handles duplicates by aggregating values (e.g., using mean
, sum
, etc.).
# Sample DataFrame with duplicates
df = pd.DataFrame({
'A': ['x', 'x', 'x', 'y'],
'B': ['a', 'a', 'b', 'b'],
'C': [1, 2, 3, 4]
})
# Create a pivot table with aggregation (default: mean)
pivot_tbl = df.pivot_table(index='A', columns='B', values='C', aggfunc='sum')
Output:
B a b
A
x 3.0 3.0
y NaN 4.0
3. stack()
and unstack()
stack()
: Convert column labels to row indices (longer format).unstack()
: Convert row indices to column labels (wider format).
Example:
# Create a MultiIndex DataFrame
index = pd.MultiIndex.from_tuples([('x', 'a'), ('x', 'b'), ('y', 'a')])
df = pd.DataFrame({'data': [1, 2, 3]}, index=index)
# Unstack the inner index level
unstacked = df.unstack()
Output:
data
a b
x 1.0 2.0
y 3.0 NaN
4. melt()
Method
Convert wide DataFrames to long format (unpivot).
# Wide DataFrame
df = pd.DataFrame({
'id': [1, 2],
'2020': [100, 200],
'2021': [300, 400]
})
# Melt columns into rows
melted = df.melt(id_vars='id', var_name='year', value_name='sales')
Output:
id year sales
0 1 2020 100
1 2 2020 200
2 1 2021 300
3 2 2021 400
Key Notes:
- Use
pivot()
for simple reshaping with unique index/column pairs. - Use
pivot_table()
for data with duplicates (requires aggregation). - Use
stack()/unstack()
for MultiIndex DataFrames. - Use
melt()
to reverse a pivot (wide-to-long transformation).
For more details, check the pandas documentation.