To write a pandas DataFrame to a CSV file, use the to_csv() method. This method provides extensive customization for formatting, handling missing values, encoding, and more. Below are detailed explanations and examples:
Basic Syntax
df.to_csv(
path_or_buf, # File path or buffer (e.g., 'data.csv')
sep=',', # Delimiter (default: comma)
index=False, # Exclude index column (default: True)
header=True, # Include column names (default: True)
mode='w', # Write mode ('w' for write, 'a' for append)
encoding='utf-8', # File encoding (e.g., 'utf-8', 'latin-1')
na_rep='', # Representation for missing values (default: '')
columns=None, # Columns to write (default: all)
date_format=None, # Format for datetime columns (e.g., '%Y-%m-%d')
compression=None # Compression (e.g., 'gzip', 'zip')
)
Examples
1. Basic Save to CSV
import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35]}
df = pd.DataFrame(data)
# Save to CSV (index included by default)
df.to_csv('basic_data.csv')
2. Exclude Index Column
df.to_csv('no_index_data.csv', index=False)
3. Custom Delimiter (e.g., Tab-Separated)
df.to_csv('tab_data.tsv', sep='\t', index=False)
4. Handle Missing Values
Replace NaN/None with a custom string (e.g., 'NA'):
import numpy as np
df_with_nulls = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, 6]})
df_with_nulls.to_csv('nulls_data.csv', na_rep='NA', index=False)
5. Write Specific Columns
df.to_csv('selected_columns.csv', columns=['Name'], index=False)
6. Append to Existing CSV
# Append new data to an existing file (no headers)
new_data = pd.DataFrame({'Name': ['Dave'], 'Age': [40]})
new_data.to_csv('basic_data.csv', mode='a', header=False, index=False)
7. Format Dates in CSV
from datetime import datetime
# DataFrame with dates
dates = pd.DataFrame({'Date': [datetime(2023, 1, 1), datetime(2023, 1, 2)]})
dates.to_csv('dates.csv', index=False, date_format='%Y-%m-%d')
8. Compress CSV (e.g., GZIP)
df.to_csv('compressed_data.csv.gz', index=False, compression='gzip')
9. Custom Encoding (e.g., Latin-1)
df.to_csv('latin_data.csv', encoding='latin-1', index=False)
Key Parameters Explained
| Parameter | Purpose |
|---|---|
path_or_buf | File path (e.g., 'data.csv') or a file-like object (e.g., StringIO). |
sep | Delimiter (default ','). Use '\t' for TSV files. |
index | Omit DataFrame index if False (default True). |
header | Include column names if True (default). Set False to exclude. |
mode | 'w' (overwrite) or 'a' (append). |
encoding | Specify encoding (e.g., 'utf-8', 'latin-1'). |
na_rep | Replace missing values with a string (e.g., 'NA', 'Missing'). |
columns | List of columns to export (e.g., ['Name', 'Age']). |
date_format | Format datetime columns (e.g., '%Y-%m-%d %H:%M:%S'). |
compression | Compress output (e.g., 'gzip', 'zip'). |
Common Pitfalls & Solutions
- Unwanted Index Column
Always useindex=Falseunless you need the index. - Encoding Errors
Useencoding='utf-8'(default) orencoding='latin-1'for compatibility. - Overwriting Files
Usemode='a'to append data instead of overwriting. - Missing Data Ambiguity
Explicitly definena_repto avoid confusion with empty strings.
Advanced Example: Export with Quoting
Handle data containing delimiters (e.g., commas in strings) using quoting:
import csv
data = {'Text': ['Hello, World', 'Goodbye, Earth']}
df = pd.DataFrame(data)
# Quote all non-numeric fields to escape commas
df.to_csv('quoted_data.csv', index=False, quoting=csv.QUOTE_NONNUMERIC)
Summary
- Use
df.to_csv('filename.csv', index=False)for most cases. - Customize separators, encoding, missing values, and compression as needed.
- Avoid including the index unless necessary for your use case.