Here’s a comprehensive guide to profiling Python scripts, including various methods and tools to identify performance bottlenecks:
1. Built-in Profilers
Python includes two built-in profiling modules in its standard library.
a. cProfile (Recommended)
A C-extension profiler with reasonable overhead, suitable for most use cases.
Basic Usage:
python -m cProfile -s time your_script.py
Key Flags:
- `-s time`: Sort results by internal time (other sort keys: `cumulative`, `calls`)
- `-o profile.stats`: Save results to a file for later analysis

Note that cProfile itself has no line-level flag; line-by-line timings require `line_profiler` (covered below).
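For example, a stats file saved with `-o` can be inspected later with the standard-library `pstats` module (the filename `profile.stats` simply matches the flag example above):

```python
# A minimal sketch: load a saved stats file and print the 10 entries
# with the highest cumulative time.
import pstats

stats = pstats.Stats("profile.stats")
stats.sort_stats("cumulative").print_stats(10)
```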
Example Output:
1234567 function calls (1234564 primitive calls) in 4.567 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
100000 2.345 0.000 2.345 0.000 calculations.py:42(heavy_function)
50000 1.234 0.000 3.579 0.000 utils.py:15(helper)
...

Here `tottime` is time spent in the function itself, while `cumtime` includes time spent in all the functions it calls.
Programmatic Usage:
```python
import cProfile

def main():
    ...  # Your code here

if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    profiler.print_stats(sort='cumtime')
```
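On Python 3.8 and later, `Profile` can also be used as a context manager, which guarantees the profiler is disabled even if the profiled code raises:

```python
# A minimal sketch using Profile as a context manager (Python 3.8+);
# assumes main() is defined as in the snippet above.
import cProfile
import pstats

with cProfile.Profile() as profiler:
    main()

pstats.Stats(profiler).sort_stats("cumtime").print_stats(20)
```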
b. profile (Pure Python)
A pure-Python profiler: significantly slower than cProfile, but easier to extend if you need to customize profiling behavior. Usage is similar to cProfile.
2. Time-Specific Code Sections
a. timeit Module
For microbenchmarks of small code snippets:
```python
import timeit

result = timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
print(f"Execution time: {result:.4f} seconds")
```
b. Manual Timing
```python
import time

start_time = time.perf_counter()  # High-resolution timer

# Code to profile

end_time = time.perf_counter()
print(f"Duration: {end_time - start_time:.6f} seconds")
```
3. Advanced Profiling Tools
a. SnakeViz (Visualization)
- Install: `pip install snakeviz`
- Generate a stats file: `python -m cProfile -o profile.stats your_script.py`
- Visualize: `snakeviz profile.stats`
Creates an interactive sunburst chart in your browser.
b. line_profiler (Line-by-Line)
- Install: `pip install line_profiler`
- Decorate the functions you want to analyze with `@profile`:

```python
@profile
def slow_function():
    ...  # Code to analyze
```

- Run: `kernprof -l -v your_script.py`
Sample Output:
Line # Hits Time Per Hit % Time Line Contents
======================================================
3 1000 25345 25.3 95.2 x = [i**2 for i in range(10000)]
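Putting it together, a script prepared for `kernprof` might look like the sketch below; the no-op fallback keeps it runnable under plain `python` as well, since `kernprof -l` only injects the `profile` decorator at runtime:

```python
# A self-contained sketch for `kernprof -l -v script.py`.
try:
    profile  # injected into builtins by kernprof -l
except NameError:
    def profile(func):  # no-op fallback for ordinary runs
        return func

@profile
def slow_function():
    return [i ** 2 for i in range(10_000)]

if __name__ == "__main__":
    slow_function()
```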
4. Memory Profiling
a. memory_profiler
- Install: `pip install memory_profiler`
- Decorate the function you want to measure:

```python
from memory_profiler import profile

@profile
def memory_intensive_function():
    ...  # Code to analyze
```

- Run: `python -m memory_profiler your_script.py`
b. tracemalloc (Built-in)
```python
import tracemalloc

tracemalloc.start()

# ... run your code ...

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

print("[ Top 10 memory users ]")
for stat in top_stats[:10]:
    print(stat)
```
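Two snapshots can also be compared to see where memory grew between two points in a program; `Snapshot.compare_to` is part of the tracemalloc API:

```python
# A minimal sketch: attribute memory growth between two snapshots
# to the source lines that allocated it.
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

data = [str(i) for i in range(100_000)]  # allocation we want to attribute

after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, 'lineno')[:5]:
    print(stat)
```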
5. Statistical Profilers (Low Overhead)
a. py-spy (Sampling Profiler)
- Install: `pip install py-spy`
- Profile a running process: `py-spy top --pid 12345`
- Generate a flame graph: `py-spy record -o profile.svg -- python your_script.py`
6. Best Practices
- Profile Before Optimizing: Focus on actual bottlenecks
- Use Multiple Tools: Combine time and memory profiling
- Test Realistic Workloads: Profile with production-like data
- Profile in Production: Use `py-spy` on live systems
- Check Algorithm Complexity: Sometimes the biggest gains come from algorithmic improvements
7. Common Performance Issues to Look For
| Pattern | Potential Fix |
|---|---|
| Repeated O(n²) ops | Use sets/dicts for lookups |
| Excessive I/O | Batch operations |
| Unnecessary copies | Use generators/views |
| Globals accessed in hot loops | Bind them to local variables |
| Type conversions | Precompute/vectorize |
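As a toy illustration of the first row, membership tests against a list scan every element, while a set lookup is constant time on average:

```python
# A quick demonstration of list vs. set membership tests using timeit.
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)

print("list:", timeit.timeit(lambda: 99_999 in haystack_list, number=1_000))
print("set: ", timeit.timeit(lambda: 99_999 in haystack_set, number=1_000))
```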
Example Workflow
- Run `cProfile` to identify slow functions
- Use `line_profiler` on critical functions
- Check memory usage with `memory_profiler`
- Visualize with `snakeviz` or `py-spy`
- Optimize and re-profile
This combination of tools will help you systematically identify and address performance issues in your Python code.