To list all files in a directory using Python, you can use several methods from the os, pathlib, and glob modules. Below is a detailed explanation with examples for each approach:
1. Using os.listdir()
Lists all entries (files and directories) in a directory. Requires filtering to isolate files.
import os
directory = "/path/to/directory"
all_entries = os.listdir(directory) # Returns filenames (not full paths)
files = [
    entry for entry in all_entries
    if os.path.isfile(os.path.join(directory, entry))
]
print(files)
Notes:
- Returns filenames without full paths.
- Use os.path.join(directory, entry) to get full paths.
- Includes hidden files (e.g., .bashrc on Unix).
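As a minimal sketch of the full-path note above, the filtering can be wrapped in a small helper. The function name list_files_full_paths is an assumption made for illustration, not something from the os module.

```python
import os


def list_files_full_paths(directory):
    """Return sorted full paths of regular files directly inside `directory`.

    Hypothetical helper: the name is illustrative only.
    """
    paths = []
    for name in os.listdir(directory):
        # os.listdir() yields bare names, so join each one with the directory
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            paths.append(full_path)
    return sorted(paths)
```

Calling list_files_full_paths("/path/to/directory") would return the files (hidden ones included) and skip subdirectories.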
2. Using os.scandir() (Efficient for Large Directories)
Returns an iterator of DirEntry objects that expose file-type and attribute information gathered during the scan, which can avoid extra system calls when filtering.
import os
directory = "/path/to/directory"
files = []
with os.scandir(directory) as entries:
    for entry in entries:
        if entry.is_file():
            files.append(entry.name)  # Filename only
            # files.append(entry.path)  # Full path
print(files)
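Advantages:
- Usually faster than os.listdir() on large directories when you also need file-type or attribute information, because fewer system calls are required.
- Direct access to file attributes (e.g., entry.is_file(), entry.stat()).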
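To illustrate the metadata access mentioned above, here is a rough sketch that collects file sizes through DirEntry.stat(). The helper name file_sizes is an assumption, and skipping symlinks is a design choice made for this example only.

```python
import os


def file_sizes(directory):
    """Map each regular file's name to its size in bytes.

    Hypothetical helper: the name and the symlink policy are illustrative.
    """
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False means symlinks are not treated as files
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes
```

Pass follow_symlinks=True (the default) instead if symlinked files should be counted.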
3. Using pathlib (Modern and Pythonic)
Object-oriented approach with Path objects (available since Python 3.4).
from pathlib import Path
directory = Path("/path/to/directory")
files = [entry.name for entry in directory.iterdir() if entry.is_file()]
# Or full paths:
files_with_paths = [str(entry) for entry in directory.iterdir() if entry.is_file()]
print(files)
Features:
- Platform-agnostic path handling.
- Clean syntax with methods like .is_file(), .glob(), etc.
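Path objects also expose parts of a filename as attributes, which can be handy when listing files. The following sketch assumes a helper named describe_files (a made-up name) that returns the name, stem, and suffix of each file.

```python
from pathlib import Path


def describe_files(directory):
    """Return sorted (name, stem, suffix) tuples for files in `directory`.

    Hypothetical helper: the name is illustrative only.
    """
    return sorted(
        (p.name, p.stem, p.suffix)
        for p in Path(directory).iterdir()
        if p.is_file()
    )
```

Note that suffix only covers the last extension, so a file named archive.tar.gz has the stem archive.tar and the suffix .gz.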
4. Recursive Listing (All Subdirectories)
Using os.walk()
import os
directory = "/path/to/directory"
all_files = []
for root, dirs, files in os.walk(directory):
    for file in files:
        all_files.append(os.path.join(root, file))
print(all_files)
Using pathlib (Recursive Glob)
from pathlib import Path
directory = Path("/path/to/directory")
all_files = [p for p in directory.rglob("*") if p.is_file()]  # All files recursively (rglob("*") also yields directories)
txt_files = list(directory.rglob("*.txt")) # Only .txt files
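When walking a tree, paths relative to the starting directory are often more useful than absolute ones. A possible sketch, assuming a helper named relative_file_paths (an illustrative name), combines rglob() with relative_to():

```python
from pathlib import Path


def relative_file_paths(directory):
    """Return sorted POSIX-style paths of all files under `directory`,
    relative to `directory` itself.

    Hypothetical helper: the name is illustrative only.
    """
    root = Path(directory)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()  # rglob("*") yields directories too, so filter them out
    )
```

The as_posix() call is used here so that results use forward slashes on every platform; drop it if native separators are preferred.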
5. Filtering Files
By Extension
from pathlib import Path
directory = Path("/path/to/directory")
txt_files = [file.name for file in directory.glob("*.txt")]
Exclude Hidden Files
files = [
    entry for entry in directory.iterdir()
    if entry.is_file() and not entry.name.startswith(".")
]
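Filtering by several extensions at once follows the same pattern. This is a sketch under the assumption that extensions should be compared case-insensitively; the helper name files_with_extensions is made up for the example.

```python
from pathlib import Path


def files_with_extensions(directory, extensions):
    """Return sorted names of files whose suffix is in `extensions`.

    Hypothetical helper: `extensions` is an iterable such as {".txt", ".md"}.
    Matching ignores case, which is an assumption of this sketch.
    """
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )
```

For example, files_with_extensions("/path/to/directory", {".txt", ".md"}) would match both notes.txt and README.MD.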
6. Using the glob Module
Pattern-based file searching (supports wildcards like * and ?).
import glob
# All non-hidden entries in the directory (includes subdirectories; "*" skips names starting with ".")
files = glob.glob("/path/to/directory/*")
# Only .txt files
txt_files = glob.glob("/path/to/directory/*.txt")
# Recursive .txt files
recursive_txt = glob.glob("/path/to/directory/**/*.txt", recursive=True)
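Because glob.glob() returns every matching entry, directories included, a files-only result needs one more filter. A minimal sketch, assuming a helper named glob_files (an illustrative name):

```python
import glob
import os


def glob_files(directory, pattern="*"):
    """Return sorted paths of regular files in `directory` matching `pattern`.

    Hypothetical helper: the name and default pattern are illustrative.
    """
    # glob.escape() protects special characters ([, ], *, ?) in the directory name
    matches = glob.glob(os.path.join(glob.escape(directory), pattern))
    # Keep regular files only; glob patterns can match directories as well
    return sorted(m for m in matches if os.path.isfile(m))
```

With the default pattern, names beginning with a dot are not matched, so hidden files are left out of the result.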
7. Sorting Files
Sort alphabetically or by modification time.
from pathlib import Path
directory = Path("/path/to/directory")
files = sorted(directory.iterdir(), key=lambda f: f.name) # Sort by name
files_by_mtime = sorted(
    directory.iterdir(),
    key=lambda f: f.stat().st_mtime  # Sort by modification time
)
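A common variation is listing only files, most recently modified first. The sketch below assumes a helper named newest_first (a made-up name) and uses sorted() with reverse=True:

```python
from pathlib import Path


def newest_first(directory):
    """Return Path objects for files in `directory`, newest modification first.

    Hypothetical helper: the name is illustrative only.
    """
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    # st_mtime is the last-modification timestamp in seconds since the epoch
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

Swapping st_mtime for st_size in the key function would order the files by size instead.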
Summary Table
Method | Pros | Cons | Use Case |
---|---|---|---|
os.listdir() | Simple, no dependencies | Requires manual filtering | Basic scripts |
os.scandir() | Fast, metadata access | Slightly more verbose | Large directories |
pathlib | Modern, OOP, cross-platform | Requires Python 3.4+ | New projects |
os.walk() | Recursive listing | More code for recursion | Directory trees |
glob / pathlib.glob() | Pattern matching (e.g., *.txt) | Limited to pattern-based filtering | Filtering by extension/pattern |
Best Practices
- Use pathlib for readability and cross-platform compatibility.
- Use os.scandir() for performance-critical applications.
- Combine with sorted() to order files alphabetically or by metadata.
- Handle exceptions (e.g., PermissionError) if accessing restricted directories.
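The exception-handling advice above could look roughly like this. It is a sketch, not a definitive pattern: the helper name safe_list_files and the choice to return an empty list on failure are assumptions made for illustration.

```python
import os


def safe_list_files(directory):
    """Return sorted file names in `directory`, or [] if it cannot be read.

    Hypothetical helper: returning [] on error is a choice made for this sketch.
    """
    try:
        with os.scandir(directory) as entries:
            return sorted(entry.name for entry in entries if entry.is_file())
    except (FileNotFoundError, NotADirectoryError, PermissionError) as exc:
        # Report the problem instead of letting the script crash
        print(f"Cannot list {directory!r}: {exc}")
        return []
```

In a larger program, logging the error or re-raising it may be more appropriate than printing.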
With these methods, you can list and filter files in Python for most common use cases.