Python offers three common ways to extract the file extension from a filename. The two standard-library approaches return the extension with its leading dot (.), which you can strip if you want the bare extension:
1. Using os.path.splitext (Classic Approach)
The os.path module provides a reliable way to split the filename into its base and extension:
import os
filename = "document.txt"
root, extension = os.path.splitext(filename)
print(extension) # Output: .txt
- Handles edge cases:
  - Filenames with no extension: "file" → ("file", "").
  - Hidden files (Unix-like): ".bashrc" → (".bashrc", "").
  - Multiple dots: "image.tar.gz" → ("image.tar", ".gz").
2. Using pathlib.Path (Modern Approach, Python 3.4+)
The pathlib module offers an object-oriented approach:
from pathlib import Path
file_path = Path("photo.jpg")
extension = file_path.suffix # Includes the leading dot
print(extension) # Output: .jpg
# To get the extension without the dot:
extension_without_dot = file_path.suffix.lstrip(".")
print(extension_without_dot) # Output: jpg
- Handles multiple extensions:
path = Path("archive.tar.gz")
print(path.suffixes) # Output: ['.tar', '.gz']
print(path.suffix) # Output: .gz (last extension only)
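If the whole compound extension is needed (for example ".tar.gz"), one option is to join the entries of `suffixes`. This is a minimal sketch; `full_extension` is a hypothetical helper name, and it assumes every dot-separated part after the first belongs to the extension, which is not true for names such as "report.v1.2.pdf".

```python
from pathlib import Path

def full_extension(filename):
    """Join every suffix pathlib reports into one string, e.g. '.tar.gz'."""
    return "".join(Path(filename).suffixes)

print(full_extension("archive.tar.gz"))   # .tar.gz
print(full_extension("photo.jpg"))        # .jpg
print(full_extension("report.v1.2.pdf"))  # .v1.2.pdf (version parts are included too)
```

Because of the last case, this approach is probably best reserved for filenames whose naming scheme is known in advance.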
3. Using String Manipulation (Manual Method)
For simple cases (not recommended for complex paths):
filename = "data.csv"
if "." in filename:
    extension = filename.split(".")[-1]  # "csv"
else:
    extension = ""
Limitations:
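- Fails for hidden files like ".gitignore" (returns "gitignore" instead of an empty string) and for paths whose directory names contain a dot while the filename has none (e.g., "my.dir/README" → "dir/README").
- Returns only the last component for names with multiple dots (e.g., "file.tar.gz" → "gz"), with nothing comparable to pathlib's suffixes.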
Key Notes
- Leading Dot: Use .lstrip(".") to remove it (e.g., .txt → txt).
- No Extension: Check if the result is empty (e.g., "").
- Edge Cases: Prefer os.path or pathlib for cross-platform compatibility.
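These notes can be folded into one small function. The following is a sketch only; `get_extension` and its `with_dot` parameter are hypothetical names, and slicing off the first character is used here as an alternative to `.lstrip(".")` since `suffix` starts with exactly one dot when it is non-empty.

```python
from pathlib import Path

def get_extension(filename, with_dot=True):
    """Return the last extension of filename, or "" if it has none."""
    suffix = Path(filename).suffix
    if not suffix:
        return ""  # no extension (also covers hidden files like ".bashrc")
    return suffix if with_dot else suffix[1:]

print(get_extension("report.docx"))                  # .docx
print(get_extension("report.docx", with_dot=False))  # docx
print(repr(get_extension("README")))                 # ''
```

Returning an empty string rather than `None` keeps the "no extension" check a simple truthiness test for callers.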
Examples
| Filename | os.path.splitext(...)[1] | pathlib.Path(...).suffix |
|---|---|---|
| "report.docx" | ".docx" | ".docx" |
| "image.png" | ".png" | ".png" |
| "README" | "" | "" |
| "archive.tar.gz" | ".gz" | ".gz" |
| ".bashrc" | "" | "" |
Best Practice
Use pathlib (Python 3.4+) for clean, modern code, or os.path.splitext for compatibility with older Python versions. Avoid manual string splitting for complex paths.