To download a file over HTTP in Python, you can use built-in libraries like urllib or third-party libraries like requests. Below are detailed methods with examples:
Method 1: Using urllib (Standard Library)
The urllib module is part of Python's standard library and requires no additional installation. (The Python documentation lists urlretrieve under its legacy interface, so it may be deprecated in the future, but it remains the quickest one-liner.)
Example 1: Simple Download with urllib.request.urlretrieve
from urllib.request import urlretrieve
url = "https://example.com/file.zip"
save_path = "downloaded_file.zip"
# Download the file
urlretrieve(url, save_path)
print(f"File saved to {save_path}")Example 2: Download with Progress Tracking
For large files, use chunked downloads to avoid memory issues:
from urllib.request import urlopen
url = "https://example.com/large_file.zip"
save_path = "large_file.zip"
# Open the URL and create a local file
with urlopen(url) as response:
    with open(save_path, "wb") as f:
        while True:
            chunk = response.read(8192)  # Read 8KB at a time
            if not chunk:
                break
            f.write(chunk)
print("Download complete.")Method 2: Using requests (Third-Party Library)
The requests library simplifies HTTP interactions. Install it first:
pip install requests
Example 1: Basic Download
import requests
url = "https://example.com/image.jpg"
save_path = "image.jpg"
response = requests.get(url)  # loads the entire response body into memory; fine for small files
if response.status_code == 200:
    with open(save_path, "wb") as f:
        f.write(response.content)
    print("File downloaded successfully.")
else:
    print(f"Failed to download: Status code {response.status_code}")Example 2: Download Large Files in Chunks
import requests
url = "https://example.com/large_video.mp4"
save_path = "video.mp4"
response = requests.get(url, stream=True)  # stream=True defers downloading the body until it is iterated
if response.status_code == 200:
    with open(save_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("Large file downloaded.")
else:
    print(f"Error: {response.status_code}")Example 3: Download with Progress Bar
Use tqdm to show download progress (install with pip install tqdm):
import requests
from tqdm import tqdm
url = "https://example.com/huge_file.iso"
save_path = "huge_file.iso"
response = requests.get(url, stream=True)
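# Content-Length header gives the expected size (0 if the server omits it)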
total_size = int(response.headers.get("content-length", 0))
with open(save_path, "wb") as f, tqdm(
    desc=save_path,
    total=total_size,
    unit="B",
    unit_scale=True,
    unit_divisor=1024,
) as progress_bar:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)
        progress_bar.update(len(chunk))
print("Download with progress complete.")Method 3: Handle Errors and Headers
Add error handling and custom headers (e.g., user agents):
import requests
url = "https://example.com/protected_file.pdf"
save_path = "file.pdf"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}
try:
    response = requests.get(url, headers=headers, stream=True, timeout=10)
    response.raise_for_status()  # Raise error for bad status codes (4xx/5xx)
    with open(save_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("File downloaded with headers.")
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e}")
except requests.exceptions.ConnectionError:
    print("Connection failed.")
except requests.exceptions.Timeout:
    print("Request timed out.")Key Considerations
- Memory Efficiency: Use stream=True in requests or chunked downloads in urllib for large files.
- Error Handling: Check HTTP status codes and handle exceptions.
- User-Agent: Some servers block default Python user agents. Mimic a browser with headers, as shown in Method 3.
- Progress Tracking: Use tqdm for visual feedback on large downloads.
- SSL Verification: Disable with verify=False in requests for problematic HTTPS sites (not recommended for security); a sketch follows this list.
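For that last point, a minimal sketch of disabling verification looks like this (the URL is a placeholder); prefer passing a CA bundle path via verify instead, since verify=False exposes the transfer to man-in-the-middle attacks:
import requests
import urllib3

# Silence the InsecureRequestWarning that requests emits when verify=False
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

url = "https://self-signed.example.com/file.bin"  # placeholder URL
response = requests.get(url, verify=False, timeout=10)  # skips TLS certificate checks
response.raise_for_status()
with open("file.bin", "wb") as f:
    f.write(response.content)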
Summary
- For simplicity: Use urllib.request.urlretrieve.
- For advanced needs: Use requests with chunking and error handling.
- For large files: Always stream downloads to avoid memory overload.
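Putting these points together, here is a minimal reusable helper sketch (the function name, defaults, and example URL are illustrative, not an established API):
import requests

def download(url: str, save_path: str, chunk_size: int = 8192, timeout: int = 10) -> None:
    """Stream url to save_path, raising on HTTP errors."""
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(save_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)

download("https://example.com/file.zip", "file.zip")
Using the response as a context manager releases the connection even if an exception interrupts the download.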