To find all files with the .txt extension in a directory (including subdirectories) using Python, you can use several methods. Below are the most common approaches with examples:
1. Using glob (Simple and Recursive)
The glob module matches paths against shell-style wildcard patterns. With recursive=True, the ** pattern also matches nested subdirectories.
import glob
# Non-recursive (top level of the given directory only)
txt_files = glob.glob("path/to/directory/*.txt")
# Recursive (all subdirectories)
txt_files_recursive = glob.glob("path/to/directory/**/*.txt", recursive=True)
print("Non-recursive:", txt_files)
print("Recursive:", txt_files_recursive)
Output:
Non-recursive: ['path/to/directory/file1.txt', ...]
Recursive: ['path/to/directory/file1.txt', 'path/to/directory/subdir/file2.txt', ...]
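To see the two patterns side by side on an actual tree, the sketch below builds a throwaway directory with tempfile and runs both searches. The helper name find_txt and the file names are illustrative assumptions, not part of the original example.

```python
import glob
import os
import tempfile


def find_txt(directory, recursive=False):
    """Return sorted .txt paths under directory (hypothetical helper)."""
    # glob.escape protects any wildcard characters in the directory name itself.
    base = glob.escape(directory)
    if recursive:
        # "**" only matches nested directories when recursive=True is passed.
        pattern = os.path.join(base, "**", "*.txt")
    else:
        pattern = os.path.join(base, "*.txt")
    return sorted(glob.glob(pattern, recursive=recursive))


# Build a small throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "subdir"))
    for rel in ("file1.txt", "notes.md", os.path.join("subdir", "file2.txt")):
        open(os.path.join(tmp, rel), "w").close()

    top_level = [os.path.relpath(p, tmp) for p in find_txt(tmp)]
    everything = [os.path.relpath(p, tmp) for p in find_txt(tmp, recursive=True)]

print("Non-recursive:", top_level)
print("Recursive:", everything)
```

The recursive pattern also returns files at the top level, because ** matches zero or more directories.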
2. Using os.walk (Manual Traversal)
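The os module's os.walk function traverses a directory tree and yields, for each directory it visits, that directory's path, its subdirectory names, and its file names.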
import os
txt_files = []
for root, dirs, files in os.walk("path/to/directory"):
    for file in files:
        if file.endswith(".txt"):
            txt_files.append(os.path.join(root, file))
print("Files found:", txt_files)
Output:
Files found: ['path/to/directory/file1.txt', 'path/to/directory/subdir/file2.txt', ...]
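Because os.walk hands back a mutable list of subdirectory names for each directory, the loop above can be extended to control how far the traversal goes. The following is a minimal sketch of a depth limit; the helper walk_txt, its max_depth parameter, and the sample tree are assumptions made for illustration.

```python
import os
import tempfile


def walk_txt(top, max_depth=None):
    """Yield .txt paths under top, descending at most max_depth levels.

    Hypothetical helper: max_depth=0 searches only top itself,
    max_depth=None searches the whole tree.
    """
    for root, dirs, files in os.walk(top):
        rel = os.path.relpath(root, top)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if max_depth is not None and depth >= max_depth:
            # Clearing dirs in place tells os.walk not to descend any further.
            dirs[:] = []
        for name in files:
            if name.endswith(".txt"):
                yield os.path.join(root, name)


# Throwaway tree: a.txt, sub/b.txt, sub/deep/c.txt
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub", "deep"))
    for rel in ("a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "deep", "c.txt")):
        open(os.path.join(tmp, rel), "w").close()

    def relative(paths):
        # Normalize to forward slashes so the output looks the same everywhere.
        return sorted(os.path.relpath(p, tmp).replace(os.sep, "/") for p in paths)

    top_only = relative(walk_txt(tmp, max_depth=0))
    one_level = relative(walk_txt(tmp, max_depth=1))
    full_tree = relative(walk_txt(tmp))

print(top_only)
print(one_level)
print(full_tree)
```

The in-place assignment dirs[:] = [] matters here: rebinding the name (dirs = []) would leave the list that os.walk is iterating over unchanged.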
3. Using pathlib (Modern and Object-Oriented)
The pathlib module (Python 3.4+) offers an object-oriented approach.
from pathlib import Path
directory = Path("path/to/directory")
# Non-recursive (top level of the given directory only)
txt_files = list(directory.glob("*.txt"))
# Recursive (all subdirectories)
txt_files_recursive = list(directory.rglob("*.txt"))
print("Non-recursive:", txt_files)
print("Recursive:", txt_files_recursive)
Output:
Non-recursive: [PosixPath('path/to/directory/file1.txt'), ...]
Recursive: [PosixPath('path/to/directory/file1.txt'), PosixPath('path/to/directory/subdir/file2.txt'), ...]
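The practical benefit of getting Path objects back, rather than strings, is that each result carries its own methods and attributes. The sketch below collects a few of them for every match; the sample tree and the report structure are assumptions chosen for illustration.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    (directory / "subdir").mkdir()
    (directory / "file1.txt").write_text("hello", encoding="utf-8")
    (directory / "subdir" / "file2.txt").write_text("nested file", encoding="utf-8")

    report = []
    for path in sorted(directory.rglob("*.txt")):
        report.append({
            # Path relative to the search root, with forward slashes
            "relative": path.relative_to(directory).as_posix(),
            "stem": path.stem,              # file name without the extension
            "suffix": path.suffix,          # extension, including the dot
            "size": path.stat().st_size,    # size in bytes
            "text": path.read_text(encoding="utf-8"),
        })

for entry in report:
    print(entry)
```

With glob or os.walk, the same information would need separate os.path and open calls on each string path.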
Key Differences
Method | Pros | Cons |
---|---|---|
glob | Concise, supports wildcards. | Requires recursive=True for subdirs. |
os.walk | Full control over traversal. | More verbose. |
pathlib | Object-oriented, clean syntax. | Requires Python 3.4+. |
Advanced Scenarios
Case-Insensitive Search:
import os
import re
# Use regex to match .txt, .TXT, etc.
txt_files = [
    os.path.join(root, file)
    for root, _, files in os.walk("path/to/directory")
    for file in files
    if re.search(r"\.txt$", file, re.IGNORECASE)
]
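The same case-insensitive match can also be written with pathlib by lowercasing each file's suffix instead of using a regex. This is a sketch of one alternative, not a replacement for the version above; the helper name find_txt_any_case and the sample file names are assumptions.

```python
from pathlib import Path
import tempfile


def find_txt_any_case(directory):
    """Return files under directory whose suffix is .txt in any letter case.

    Hypothetical helper for illustration.
    """
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() == ".txt"
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    for rel in ("a.txt", "B.TXT", "d.md", "nested/c.Txt"):
        (root / rel).touch()

    found_names = {p.name for p in find_txt_any_case(root)}

print(found_names)
```

One difference from the regex version: pathlib treats a file named exactly .txt as having no suffix, so this helper does not return it.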
Exclude Hidden Files:
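import os
# Skip files and directories whose names start with `.`
txt_files = []
for root, dirs, files in os.walk("path/to/directory"):
    # Prune hidden directories in place so os.walk does not descend into them
    dirs[:] = [d for d in dirs if not d.startswith(".")]
    for file in files:
        if file.endswith(".txt") and not file.startswith("."):
            txt_files.append(os.path.join(root, file))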
Summary
- For simplicity: Use glob.glob or pathlib.Path.rglob.
- For control: Use os.walk.
- For modern code: Prefer pathlib.
These methods allow you to efficiently locate .txt files in any directory structure.