To iterate over files in a directory in Python, you can use the os module, the glob module, or the modern pathlib module. Below is a detailed guide with examples for each approach.
1. Using the os Module
The os module provides low-level directory and file operations.
Method 1: os.listdir()
Lists all entries (files and directories) in a specified directory.
Example:
import os
directory = "/path/to/directory"
# List all entries in the directory
entries = os.listdir(directory)
# Iterate and filter files
for entry in entries:
    full_path = os.path.join(directory, entry)  # Create full path
    if os.path.isfile(full_path):  # Check if it's a file
        print(f"File: {entry}")
Method 2: os.scandir() (Python 3.5+)
Usually more efficient than os.listdir() when you also need file type or attribute information, because it yields DirEntry objects that carry this metadata.
Example:
import os
directory = "/path/to/directory"
with os.scandir(directory) as entries:
    for entry in entries:
        if entry.is_file():  # Directly check if it's a file
            print(f"File: {entry.name}")
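To illustrate the metadata that DirEntry objects carry, here is a minimal sketch that collects each file's name and size through entry.stat(). The function name list_file_sizes is a hypothetical helper chosen for illustration; it is not part of the os module.

```python
import os


def list_file_sizes(directory):
    """Return (name, size_in_bytes) pairs for regular files in directory.

    list_file_sizes is a hypothetical helper name used for illustration.
    """
    results = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() and stat() are methods of os.DirEntry
            if entry.is_file():
                results.append((entry.name, entry.stat().st_size))
    return results
```

Calling list_file_sizes("/path/to/directory") would return a list of tuples; the order of entries is whatever the operating system reports, so sort the result if you need a stable order.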
Method 3: os.walk() (Recursive)
Iterates through all files in a directory and its subdirectories.
Example:
import os
root_dir = "/path/to/directory"
# Walk through all subdirectories and files
for root, dirs, files in os.walk(root_dir):
    for file in files:
        full_path = os.path.join(root, file)
        print(f"File: {full_path}")
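In its default top-down order, os.walk() yields the dirs list before descending, so editing that list in place controls which subdirectories are visited. The sketch below uses this to skip hidden directories while collecting files by extension. The helper name find_files_by_extension and the choice to skip dot-directories are assumptions made for illustration.

```python
import os


def find_files_by_extension(root_dir, extension):
    """Recursively collect paths under root_dir whose names end in extension.

    Hypothetical helper for illustration; hidden directories are skipped.
    """
    matches = []
    for root, dirs, files in os.walk(root_dir):
        # Editing dirs in place stops os.walk from descending into them
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if name.endswith(extension):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

For example, find_files_by_extension("/path/to/directory", ".txt") would return a sorted list of matching paths from every visited subdirectory.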
2. Using the glob Module
The glob module supports Unix-style path pattern matching (e.g., *.txt).
Example 1: Non-Recursive Search
import glob
directory = "/path/to/directory"
# Find all .txt files in the directory
txt_files = glob.glob(f"{directory}/*.txt")
for file_path in txt_files:
    print(f"Text File: {file_path}")
Example 2: Recursive Search
import glob
directory = "/path/to/directory"
# Search for .txt files in all subdirectories
# (** matches nested directories only when recursive=True)
all_txt_files = glob.glob(f"{directory}/**/*.txt", recursive=True)
for file_path in all_txt_files:
    print(f"Text File: {file_path}")
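glob.glob() builds the complete list of matches before returning. For large trees, glob.iglob() accepts the same pattern and recursive arguments but returns an iterator instead. The following is a minimal sketch of that variant; iter_text_files is a hypothetical helper name, and os.path.join is used here only to build the pattern portably.

```python
import glob
import os


def iter_text_files(directory, recursive=False):
    """Return an iterator over .txt paths (hypothetical helper for illustration)."""
    if recursive:
        # "**" matches zero or more nested directories when recursive=True
        pattern = os.path.join(directory, "**", "*.txt")
    else:
        pattern = os.path.join(directory, "*.txt")
    return glob.iglob(pattern, recursive=recursive)
```

Because the result is an iterator, it can be consumed in a for loop without holding every path in memory at once.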
3. Using pathlib (Python 3.4+)
The pathlib module provides an object-oriented approach to path manipulation.
Example 1: List Files in a Directory
from pathlib import Path
directory = Path("/path/to/directory")
# Iterate over files
for file in directory.iterdir():
    if file.is_file():
        print(f"File: {file.name}")
Example 2: Glob-Style Search
from pathlib import Path
directory = Path("/path/to/directory")
# Find all .csv files (non-recursive)
csv_files = directory.glob("*.csv")
for file in csv_files:
    print(f"CSV File: {file.name}")
# Recursive search for .csv files
all_csv_files = directory.rglob("*.csv")
for file in all_csv_files:
    print(f"CSV File: {file}")
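Because rglob() yields Path objects, attributes such as suffix are available without any string parsing. As a small sketch of that object-oriented style, the hypothetical helper count_by_suffix below tallies files by extension; the lowercase normalization is an assumption made for illustration.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(directory):
    """Count regular files under directory by extension (hypothetical helper)."""
    counts = Counter()
    for path in Path(directory).rglob("*"):
        # rglob("*") also yields directories, so keep regular files only
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

The returned Counter maps each extension, including the leading dot, to the number of files found with it.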
Key Considerations
- Absolute vs. Relative Paths:
  - Use os.path.abspath() or Path.resolve() to get absolute paths.
- Filtering Hidden Files:
  - Skip hidden files (e.g., names starting with . on Unix):
      for entry in os.scandir(directory):
          if entry.is_file() and not entry.name.startswith('.'):
              print(entry.name)
- Sorting Files:
  - Sort entries alphabetically:
      sorted_files = sorted(os.listdir(directory))
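- Performance:
  - os.scandir() is generally faster than os.listdir() on large directories when you also need file type or stat information, because DirEntry objects can supply it without extra system calls. pathlib offers a more convenient interface but is not necessarily faster.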
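The considerations above can be combined in one short sketch: resolve the directory to an absolute path, skip hidden files, and return the rest sorted by name. The helper name visible_files_sorted is hypothetical, and "hidden" here means only the Unix leading-dot convention.

```python
from pathlib import Path


def visible_files_sorted(directory):
    """Return absolute paths of non-hidden files, sorted by name.

    Hypothetical helper for illustration; "hidden" means a leading dot only.
    """
    base = Path(directory).resolve()  # Absolute path with symlinks resolved
    files = [
        p for p in base.iterdir()
        if p.is_file() and not p.name.startswith(".")
    ]
    return sorted(files, key=lambda p: p.name)
```

Sorting by name is case-sensitive in this sketch; pass a different key function if you need case-insensitive or natural ordering.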
Full Example: Process Files with Metadata
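import os
from pathlib import Path
def process_directory(directory):
    # Use a context manager so the scandir iterator is closed promptly
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                file_path = Path(entry.path)
                stats = file_path.stat()  # One stat call, reused for both fields
                print(f"File Name: {file_path.name}")
                print(f"Size: {stats.st_size} bytes")
                print(f"Modified: {stats.st_mtime}")  # Seconds since the epoch
process_directory("/path/to/directory")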
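The Modified value printed above is st_mtime, a raw timestamp in seconds since the epoch. If a readable date is preferred, one option is to convert it with datetime.fromtimestamp(), as in this minimal sketch. The helper name describe_file and the dictionary layout are assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path


def describe_file(path):
    """Return name, size, and modification time for a file (hypothetical helper)."""
    path = Path(path)
    info = path.stat()  # Single stat call reused for all fields
    return {
        "name": path.name,
        "size": info.st_size,
        # fromtimestamp() converts the epoch value to a local-time datetime
        "modified": datetime.fromtimestamp(info.st_mtime),
    }
```

The "modified" value is a datetime object, so it can be formatted with strftime() or isoformat() as needed.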
Summary
- Use os.scandir() for efficient iteration (Python 3.5+).
- Use glob for pattern-based searches (e.g., *.txt).
- Use pathlib for an object-oriented approach (Python 3.4+).
- Use os.walk() for recursive traversal of directories.
Choose the method that best fits your use case and Python version!