How to unzip files in Python ?

To unzip files in Python, you can use the built-in zipfile module or the shutil module for a higher-level interface. Below is a detailed guide with examples:

1. Using the zipfile Module

The zipfile module provides direct control over ZIP files. It allows you to read, write, and extract ZIP archives.

Example 1: Extract All Files from a ZIP Archive

import zipfile

# Open the ZIP file in read mode
with zipfile.ZipFile("example.zip", "r") as zip_ref:
    # Extract all files to a directory (default: current directory)
    zip_ref.extractall("extracted_files")

print("All files extracted successfully!")
  • extractall([path]): Extracts all files to the specified directory (defaults to the current directory).
  • The with statement ensures the ZIP file is properly closed after extraction.

Example 2: Extract a Specific File

import zipfile

with zipfile.ZipFile("example.zip", "r") as zip_ref:
    # List all files in the ZIP
    print("Files in the ZIP:", zip_ref.namelist())

    # Extract a single file (e.g., "document.txt")
    zip_ref.extract("document.txt", "extracted_files")

print("Specific file extracted!")
  • namelist(): Returns a list of filenames in the ZIP archive.
  • extract(member, [path]): Extracts a specific file (member) to the specified directory.

2. Handling Password-Protected ZIP Files

Use the pwd parameter to provide a password (as a bytes object).

Example 3: Extract Password-Protected ZIP

import zipfile

password = "secret".encode("utf-8")  # Convert to bytes

with zipfile.ZipFile("encrypted.zip", "r") as zip_ref:
    # Extract all files with a password
    zip_ref.extractall("extracted_files", pwd=password)

print("Password-protected files extracted!")

3. Using shutil for Simplicity

The shutil module provides a high-level unpack_archive function that works for ZIP, TAR, and other formats.

Example 4: Extract with shutil

import shutil

# Extract to a directory (format inferred from the filename)
shutil.unpack_archive("example.zip", "extracted_files")

print("ZIP extracted using shutil!")
  • unpack_archive(filename, extract_dir): Automatically detects the archive format (ZIP, TAR, etc.).

4. Error Handling

Always handle exceptions for missing files, incorrect passwords, or corrupted archives.

Example 5: Safe Extraction with Error Handling

import zipfile

try:
    with zipfile.ZipFile("example.zip", "r") as zip_ref:
        zip_ref.extractall("extracted_files")
except zipfile.BadZipFile:
    print("Error: The file is not a valid ZIP archive.")
except FileNotFoundError:
    print("Error: The ZIP file does not exist.")
except RuntimeError as e:
    if "password" in str(e).lower():
        print("Error: Incorrect password.")
    else:
        print("Error:", e)

Summary

  • zipfile: Best for granular control (e.g., extracting specific files, handling passwords).
  • shutil: Simplest method for basic extraction (supports multiple formats).
  • Always use a with statement to ensure resources are properly released.
  • Handle exceptions to avoid crashes on invalid files or passwords.

When to Use Which?

  • Use zipfile for password-protected files or selective extraction.
  • Use shutil for quick extraction without needing low-level control.

Leave a Reply

Your email address will not be published. Required fields are marked *