How do I list all files of a directory in Python?

To list all files in a directory using Python, you can use several methods from the os, pathlib, and glob modules. Below is a detailed explanation with examples for each approach:

1. Using os.listdir()

Lists all entries (files and directories) in a directory. Requires filtering to isolate files.

import os

directory = "/path/to/directory"
all_entries = os.listdir(directory)  # Returns filenames (not full paths)
files = [
    entry for entry in all_entries
    if os.path.isfile(os.path.join(directory, entry))
]
print(files)

Notes:

  • Returns filenames without full paths.
  • Use os.path.join(directory, entry) to get full paths.
  • Includes hidden files (e.g., .bashrc on Unix).
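
As a minimal sketch of the second note, the helper below returns full paths instead of bare names. The function name list_file_paths and the temporary demo directory are illustrative assumptions, not part of the original example.

```python
import os
import tempfile


def list_file_paths(directory):
    """Return sorted full paths of the regular files directly inside directory."""
    paths = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):  # skips subdirectories
            paths.append(full_path)
    return sorted(paths)


# Demo on a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    open(os.path.join(tmp, ".hidden"), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))
    names = [os.path.basename(p) for p in list_file_paths(tmp)]
    print(names)  # ['.hidden', 'a.txt']
```

The hidden file is included and the subdirectory is skipped, matching the notes above.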

2. Using os.scandir() (Efficient for Large Directories)

Returns an iterator of DirEntry objects that carry cached file-type information, so checks like is_file() usually need no extra system call.

import os

directory = "/path/to/directory"
files = []
with os.scandir(directory) as entries:
    for entry in entries:
        if entry.is_file():
            files.append(entry.name)  # Filename only
            # files.append(entry.path)  # Full path
print(files)

Advantages:

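  • Usually faster than os.listdir() combined with os.path.isfile() on large directories, because the file type typically comes from the directory scan itself rather than from a separate stat call per entry.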
  • Direct access to file attributes (e.g., entry.is_file()).
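
To illustrate the attribute access mentioned above, here is a small sketch that reads each file's size through DirEntry.stat(). The helper name file_sizes and the sample files are assumptions made for the demo.

```python
import os
import tempfile


def file_sizes(directory):
    """Map each regular file's name to its size in bytes."""
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False treats a symlink as a link, not as its target
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data.bin"), "wb") as fh:
        fh.write(b"12345")
    os.mkdir(os.path.join(tmp, "subdir"))
    sizes = file_sizes(tmp)
    print(sizes)  # {'data.bin': 5}
```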

3. Using pathlib (Modern and Pythonic)

Object-oriented approach with Path objects (Python 3.4+).

from pathlib import Path

directory = Path("/path/to/directory")
files = [entry.name for entry in directory.iterdir() if entry.is_file()]
# Or full paths:
files_with_paths = [str(entry) for entry in directory.iterdir() if entry.is_file()]
print(files)

Features:

  • Platform-agnostic path handling.
  • Clean syntax with methods like .is_file(), .glob(), etc.

4. Recursive Listing (All Subdirectories)

Using os.walk()

import os

directory = "/path/to/directory"
all_files = []
for root, dirs, files in os.walk(directory):
    for file in files:
        all_files.append(os.path.join(root, file))
print(all_files)
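
If some subtrees should be skipped, os.walk() lets you prune them by editing the dirs list in place during the default top-down traversal. The sketch below skips hidden directories and files and returns paths relative to the starting directory. The name walk_visible_files and the demo layout are hypothetical.

```python
import os
import tempfile


def walk_visible_files(directory):
    """Recursively list non-hidden files, as paths relative to directory."""
    found = []
    for root, dirs, files in os.walk(directory):
        # Assigning to dirs[:] tells os.walk not to descend into removed entries.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if not name.startswith("."):
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, directory))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src"))
    os.makedirs(os.path.join(tmp, ".git"))
    for rel in ("main.py", os.path.join("src", "util.py"), os.path.join(".git", "config")):
        open(os.path.join(tmp, rel), "w").close()
    visible = walk_visible_files(tmp)
    print(visible)  # main.py and src/util.py; nothing from .git
```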

Using pathlib (Recursive Glob)

from pathlib import Path

directory = Path("/path/to/directory")
all_files = [p for p in directory.rglob("*") if p.is_file()]  # Files only; rglob("*") also yields directories
txt_files = list(directory.rglob("*.txt"))  # Only .txt files

5. Filtering Files

By Extension

from pathlib import Path

directory = Path("/path/to/directory")
txt_files = [file.name for file in directory.glob("*.txt")]

Exclude Hidden Files

from pathlib import Path

directory = Path("/path/to/directory")
files = [
    entry for entry in directory.iterdir()
    if entry.is_file() and not entry.name.startswith(".")
]
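
The two filters can be combined. As a rough sketch, the helper below keeps non-hidden files whose extension is in a given set, compared case-insensitively. The name files_with_suffixes and the sample file names are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def files_with_suffixes(directory, suffixes):
    """Return sorted names of non-hidden files whose extension is in suffixes."""
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file()
        and not p.name.startswith(".")
        and p.suffix.lower() in wanted
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ("notes.txt", "README.MD", "photo.jpg", ".secret.txt"):
        (Path(tmp) / name).touch()
    matches = files_with_suffixes(tmp, {".txt", ".md"})
    print(matches)  # ['README.MD', 'notes.txt']
```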

6. Using glob Module

Pattern-based file searching (supports wildcards like *, ?).

import glob

# All non-hidden entries in the directory (files and subdirectories; "*" skips dotfiles)
files = glob.glob("/path/to/directory/*")
# Only .txt files
txt_files = glob.glob("/path/to/directory/*.txt")
# Recursive .txt files
recursive_txt = glob.glob("/path/to/directory/**/*.txt", recursive=True)
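
Two details of glob are easy to miss: by default the * wildcard does not match names starting with a dot, and the results can include directories. The sketch below shows both on a temporary directory. glob.escape() is used so that special characters in the directory path itself are not treated as wildcards. The file names are made up for the demo.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "visible.txt"), "w").close()
    open(os.path.join(tmp, ".hidden.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))

    # Escape the directory part; only the trailing "*" acts as a wildcard.
    matches = glob.glob(os.path.join(glob.escape(tmp), "*"))
    matched_names = sorted(os.path.basename(p) for p in matches)
    file_names = sorted(os.path.basename(p) for p in matches if os.path.isfile(p))
    print(matched_names)  # ['subdir', 'visible.txt']
    print(file_names)     # ['visible.txt']
```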

7. Sorting Files

Sort alphabetically or by modification time.

from pathlib import Path

directory = Path("/path/to/directory")
files = sorted(
    (p for p in directory.iterdir() if p.is_file()),
    key=lambda p: p.name,  # Sort by name
)
files_by_mtime = sorted(
    (p for p in directory.iterdir() if p.is_file()),
    key=lambda p: p.stat().st_mtime,  # Sort by modification time, oldest first
)
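
A common variant is listing the most recently modified files first. This sketch passes reverse=True and sets modification times explicitly with os.utime() so the result is predictable. The helper name files_newest_first and the timestamps are illustrative assumptions.

```python
import os
import tempfile
from pathlib import Path


def files_newest_first(directory):
    """Return the files in directory, most recently modified first."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


with tempfile.TemporaryDirectory() as tmp:
    for age, name in enumerate(("new.log", "mid.log", "old.log")):
        path = Path(tmp) / name
        path.touch()
        stamp = 1_700_000_000 - age * 60  # each file is one minute older
        os.utime(path, (stamp, stamp))
    ordered = [p.name for p in files_newest_first(tmp)]
    print(ordered)  # ['new.log', 'mid.log', 'old.log']
```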

Summary Table

Method            | Pros                           | Cons                               | Use Case
os.listdir()      | Simple, no dependencies        | Requires manual filtering          | Basic scripts
os.scandir()      | Fast, metadata access          | Slightly more verbose              | Large directories
pathlib           | Modern, OOP, cross-platform    | Python 3.4+ only                   | New projects
os.walk()         | Recursive listing              | More code for recursion            | Directory trees
glob/pathlib.glob | Pattern matching (e.g., *.txt) | Limited to pattern-based filtering | Filtering by extension/pattern

Best Practices

  • Use pathlib for readability and cross-platform compatibility.
  • Use os.scandir() for performance-critical applications.
  • Combine with sorted() to order files alphabetically or by metadata.
  • Handle exceptions (e.g., PermissionError) if accessing restricted directories.
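
As one possible way to follow the last point, the sketch below wraps the listing in a try/except and returns an empty list when the directory is missing, is not a directory, or cannot be read. The name safe_list_files and the choice to return an empty list instead of re-raising are assumptions; other error policies may suit your application better.

```python
import os
import tempfile


def safe_list_files(directory):
    """Return sorted file names in directory, or [] if it cannot be read."""
    try:
        with os.scandir(directory) as entries:
            return sorted(entry.name for entry in entries if entry.is_file())
    except (FileNotFoundError, NotADirectoryError, PermissionError) as exc:
        print(f"Cannot list {directory!r}: {exc}")
        return []


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "report.txt"), "w").close()
    found = safe_list_files(tmp)
    missing = safe_list_files(os.path.join(tmp, "no_such_dir"))
    print(found)    # ['report.txt']
    print(missing)  # []
```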

By leveraging these methods, you can efficiently list and filter files in Python for any use case.
