How to find all files in a directory with extension .txt in Python?

To find all files with the .txt extension in a directory, optionally including its subdirectories, Python's standard library offers three common approaches: glob, os.walk, and pathlib. Each is shown below with an example.

1. Using glob (Simple and Recursive)

The glob module matches paths against shell-style wildcard patterns. With recursive=True, the ** pattern matches any number of nested directories, including none.

import glob

# Non-recursive (top level of the given directory only)
txt_files = glob.glob("path/to/directory/*.txt")

# Recursive (all subdirectories)
txt_files_recursive = glob.glob("path/to/directory/**/*.txt", recursive=True)

print("Non-recursive:", txt_files)
print("Recursive:", txt_files_recursive)

Output:

Non-recursive: ['path/to/directory/file1.txt', ...]
Recursive: ['path/to/directory/file1.txt', 'path/to/directory/subdir/file2.txt', ...]
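
For large directory trees it can help to iterate over matches lazily instead of building the full list first. The following is a minimal sketch of that idea using glob.iglob; the helper name iter_txt_files and the throwaway directory layout are assumptions made for illustration only.

```python
import glob
import os
import tempfile


def iter_txt_files(directory):
    """Yield .txt paths under `directory` one at a time (hypothetical helper)."""
    # glob.escape protects directory names that contain characters like [ or *
    pattern = os.path.join(glob.escape(directory), "**", "*.txt")
    # iglob returns an iterator instead of building the whole list up front
    yield from glob.iglob(pattern, recursive=True)


# Small demonstration on a temporary directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "subdir"))
    for rel in ("file1.txt", "notes.md", os.path.join("subdir", "file2.txt")):
        open(os.path.join(tmp, rel), "w").close()

    found = sorted(os.path.relpath(p, tmp) for p in iter_txt_files(tmp))

print(found)
```

The glob.escape call only matters when the directory name itself contains wildcard characters, but it is cheap insurance when the path comes from user input.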

2. Using os.walk (Manual Traversal)

os.walk visits every directory under a starting path and yields a (root, dirs, files) tuple for each one, so you filter the file names yourself.

import os

txt_files = []
for root, dirs, files in os.walk("path/to/directory"):
    for file in files:
        if file.endswith(".txt"):
            txt_files.append(os.path.join(root, file))

print("Files found:", txt_files)

Output:

Files found: ['path/to/directory/file1.txt', 'path/to/directory/subdir/file2.txt', ...]
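
Because os.walk hands you the list of subdirectories before descending into them, you can steer the traversal. As one possible illustration, the sketch below limits how deep the search goes; the function name find_txt_files and its max_depth parameter are hypothetical and not part of the os module.

```python
import os
import tempfile


def find_txt_files(directory, max_depth=None):
    """Yield .txt paths under `directory`, descending at most `max_depth` levels.

    `max_depth` is a hypothetical parameter for this sketch: 0 means the
    top-level directory only, None means no limit.
    """
    directory = os.path.normpath(directory)
    base_depth = directory.count(os.sep)
    for root, dirs, files in os.walk(directory):
        depth = root.count(os.sep) - base_depth
        if max_depth is not None and depth >= max_depth:
            # Emptying dirs in place tells os.walk not to descend any further
            dirs[:] = []
        for name in files:
            if name.endswith(".txt"):
                yield os.path.join(root, name)


# Small demonstration on a temporary directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub", "deep"))
    for rel in ("a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "deep", "c.txt")):
        open(os.path.join(tmp, rel), "w").close()

    top_only = sorted(os.path.relpath(p, tmp) for p in find_txt_files(tmp, max_depth=0))
    one_level = sorted(os.path.relpath(p, tmp) for p in find_txt_files(tmp, max_depth=1))
    everything = sorted(os.path.relpath(p, tmp) for p in find_txt_files(tmp))

print(top_only)
print(one_level)
print(everything)
```

The in-place assignment to dirs[:] is what makes this work: rebinding dirs to a new list would leave the list that os.walk is iterating over unchanged.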

3. Using pathlib (Modern and Object-Oriented)

The pathlib module (Python 3.4+) offers an object-oriented approach. Its glob and rglob methods yield Path objects rather than plain strings.

from pathlib import Path

# Non-recursive (top level of the given directory only)
directory = Path("path/to/directory")
txt_files = list(directory.glob("*.txt"))

# Recursive (all subdirectories)
txt_files_recursive = list(directory.rglob("*.txt"))

print("Non-recursive:", txt_files)
print("Recursive:", txt_files_recursive)

Output:

Non-recursive: [PosixPath('path/to/directory/file1.txt'), ...]
Recursive: [PosixPath('path/to/directory/file1.txt'), PosixPath('path/to/directory/subdir/file2.txt'), ...]
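
A pattern such as *.txt matches on the name alone, so a directory that happens to be called archive.txt would also be returned. If only regular files are wanted, one option is to filter with is_file(), as in this sketch; the helper name txt_files_only and the sample tree are assumptions for illustration.

```python
from pathlib import Path
import tempfile


def txt_files_only(directory):
    """Return a sorted list of regular files ending in .txt under `directory`."""
    return sorted(p for p in Path(directory).rglob("*.txt") if p.is_file())


# Small demonstration on a temporary directory tree
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "subdir").mkdir()
    (base / "archive.txt").mkdir()  # a directory whose name ends in .txt
    (base / "file1.txt").write_text("hello")
    (base / "subdir" / "file2.txt").write_text("world")

    names = [p.relative_to(base).as_posix() for p in txt_files_only(base)]

print(names)
```

Since the results are Path objects, attributes such as p.name, p.stem, and p.parent are available without further string handling.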

Key Differences

Method    Pros                             Cons
glob      Concise, supports wildcards.     Requires recursive=True for subdirs.
os.walk   Full control over traversal.     More verbose.
pathlib   Object-oriented, clean syntax.   Requires Python 3.4+.
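
One further difference worth knowing about is how the three methods treat names that start with a dot. glob.glob follows shell conventions and does not match such names with a pattern like *.txt, whereas os.walk and pathlib report them. The sketch below demonstrates this on a temporary directory; the file names are made up for the example.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    for name in ("visible.txt", ".hidden.txt"):
        open(os.path.join(tmp, name), "w").close()

    # glob.glob: a leading dot is only matched by a pattern that starts with a dot
    via_glob = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(glob.escape(tmp), "*.txt"))
    )

    # os.walk: every file name is reported, so the filter alone decides
    via_walk = sorted(
        name
        for _root, _dirs, files in os.walk(tmp)
        for name in files
        if name.endswith(".txt")
    )

    # pathlib: Path.glob does not treat dot-prefixed names specially
    via_pathlib = sorted(p.name for p in Path(tmp).glob("*.txt"))

print("glob:   ", via_glob)
print("os.walk:", via_walk)
print("pathlib:", via_pathlib)
```

This means that switching from one method to another can change the result on the same directory.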

Advanced Scenarios

Case-Insensitive Search:

import os
import re

# Use regex to match .txt, .TXT, etc.
txt_files = [
    os.path.join(root, file)
    for root, _, files in os.walk("path/to/directory")
    for file in files
    if re.search(r"\.txt$", file, re.IGNORECASE)
]
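
The same case-insensitive check can be written without a regular expression by comparing a lowercased suffix. The following pathlib-based sketch is one way to do that; the helper name find_txt_any_case and the sample file names are assumptions for illustration.

```python
from pathlib import Path
import tempfile


def find_txt_any_case(directory):
    """Return files under `directory` whose suffix is .txt in any letter case."""
    return [
        p
        for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() == ".txt"
    ]


# Small demonstration on a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ("a.txt", "B.TXT", "c.Txt", "d.md"):
        (base / name).write_text("")

    matched = {p.name for p in find_txt_any_case(base)}

print(sorted(matched))
```

Matching on "*" and filtering afterwards visits every entry in the tree, which is simple but does more work than a targeted pattern.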

Exclude Hidden Files:

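import os

# Skip files and directories whose names start with `.`
txt_files = []
for root, dirs, files in os.walk("path/to/directory"):
    # Prune hidden directories in place so os.walk does not descend into them
    dirs[:] = [d for d in dirs if not d.startswith(".")]
    for file in files:
        if file.endswith(".txt") and not file.startswith("."):
            txt_files.append(os.path.join(root, file))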

Summary

  • For simplicity: Use glob.glob or pathlib.Path.rglob.
  • For control: Use os.walk.
  • For modern code: Prefer pathlib.

All three methods are part of the standard library, so they locate .txt files in any directory structure without extra dependencies.
