Performance Tips

Python is not the fastest language, but it is fast enough for most tasks when you write it well. This guide covers practical techniques to speed up your Python code, and just as importantly, teaches you when not to optimize.

Generator Expressions vs List Comprehensions

List comprehensions create the entire list in memory at once. Generator expressions produce items one at a time, using almost no memory regardless of size.

# List comprehension: creates entire list in memory
squares = [x ** 2 for x in range(1_000_000)]  # ~8 MB in memory

# Generator expression: produces one item at a time
squares = (x ** 2 for x in range(1_000_000))  # ~120 bytes!

Use generators when you only need to iterate through results and do not need to keep them all in memory, index into them, or check their length.

# Perfect use case: sum, any, all, min, max
total = sum(x ** 2 for x in range(1_000_000))  # No brackets needed!

# Check if any line contains an error
has_error = any("ERROR" in line for line in open("log.txt"))

Python Playground
Output
Click "Run" to execute your code

__slots__ for Memory-Efficient Classes

By default, Python objects store their attributes in a dictionary (__dict__), which has significant memory overhead. Using __slots__ replaces the dictionary with a fixed set of attribute slots.

class PointRegular:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

This saves 40-60% memory per instance, which matters when you create millions of objects.

__slots__ and Inheritance

When using __slots__ with inheritance, every class in the hierarchy should define its own __slots__. If a parent class does not use __slots__, the subclass will still have a __dict__ and the memory savings are lost.

class Base:
    __slots__ = ('x',)

class Child(Base):
    __slots__ = ('y',)  # Only add NEW attributes here
    # 'x' is inherited from Base's __slots__

class BrokenChild(Base):
    # No __slots__ defined — gets a __dict__, losing the memory benefit
    pass

Tradeoffs of __slots__:

  • Cannot add arbitrary attributes at runtime
  • Cannot use multiple inheritance with two parents that both have non-empty __slots__ (unless carefully designed)
  • Some libraries that inspect __dict__ may not work correctly

Python Playground
Output
Click "Run" to execute your code

Local Variable Lookups Are Faster Than Global

Python looks up local variables much faster than global or built-in variables. In tight loops, this can make a measurable difference.

# Slow: global lookup on every iteration
import math

def slow_sqrt_sum(numbers):
    total = 0
    for n in numbers:
        total += math.sqrt(n)  # Global lookup for math, then attribute lookup for sqrt
    return total

# Fast: local alias
def fast_sqrt_sum(numbers):
    sqrt = math.sqrt  # Bind to local variable once
    total = 0
    for n in numbers:
        total += sqrt(n)  # Local lookup is faster
    return total

This technique matters in hot loops. For most code, readability is more important.

Python Playground
Output
Click "Run" to execute your code

String Concatenation: join() vs +

Building strings with + in a loop creates a new string object on each iteration, because strings are immutable. Use str.join() instead.

# Slow: O(n^2) — creates a new string every iteration
result = ""
for word in words:
    result += word + " "

# Fast: O(n) — single allocation
result = " ".join(words)

Python Playground
Output
Click "Run" to execute your code

The collections Module

Python's collections module provides specialized container types that are faster or more convenient than the built-in types for specific use cases.

defaultdict: Auto-Initializing Dictionary

from collections import defaultdict

# Instead of checking if key exists
word_count = {}
for word in words:
    if word not in word_count:
        word_count[word] = 0
    word_count[word] += 1

# Use defaultdict
word_count = defaultdict(int)
for word in words:
    word_count[word] += 1  # Auto-initializes to 0

Counter: Counting Made Easy

from collections import Counter

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = Counter(words)
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]

deque: Fast Appends and Pops from Both Ends

Lists are slow when inserting or removing from the front (O(n)). deque is O(1) for both ends.

from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)  # O(1) — list.insert(0, x) is O(n)!
d.popleft()       # O(1) — list.pop(0) is O(n)!

namedtuple: Lightweight Immutable Objects

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)
print(p.x, p.y)  # Access by name, but memory-efficient like a tuple

Python Playground
Output
Click "Run" to execute your code

Using Sets for Membership Testing

Checking if an item is in a list is O(n). Checking if it is in a set is O(1). For repeated membership tests, convert to a set first.

# Slow: O(n) lookup for each check
valid_ids = [1, 2, 3, ..., 10000]
if user_id in valid_ids:  # Scans entire list
    ...

# Fast: O(1) lookup
valid_ids = {1, 2, 3, ..., 10000}
if user_id in valid_ids:  # Hash table lookup
    ...

Python Playground
Output
Click "Run" to execute your code

Avoiding Unnecessary Copies

Python's assignment creates references, not copies. Understand when copies happen and when they are needed.

# Reference (no copy) — fast, shared mutations
a = [1, 2, 3]
b = a          # b points to the same list
b.append(4)    # a is also [1, 2, 3, 4]!

# Shallow copy — copies the container but not the items
b = a[:]       # or a.copy() or list(a)

# Deep copy — copies everything recursively
import copy
b = copy.deepcopy(a)

Avoid unnecessary copies in function calls:

# Wasteful: creates a copy every time
def process(items):
    sorted_items = sorted(items)  # Creates new list
    ...

# Better: sort in place if you own the data
def process(items):
    items.sort()  # Sorts in place, no copy
    ...

Python Playground
Output
Click "Run" to execute your code

The Cost of Polymorphism

Python's dynamic dispatch means every method call involves looking up the method in the object's class hierarchy at runtime. This is slower than a direct function call. For performance-critical code, understand the trade-offs.

Virtual Method Dispatch in Python

Every method call on an object (obj.method()) requires:

  1. Looking up method in obj.__class__.__dict__
  2. If not found, walking up the MRO (Method Resolution Order)
  3. Creating a bound method object
  4. Calling the bound method

This overhead is small for a single call but adds up in tight loops over millions of objects.

# Slower: polymorphic dispatch through class hierarchy
class Shape:
    def area(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r ** 2

shapes = [Circle(i) for i in range(100_000)]
total = sum(s.area() for s in shapes)  # Method lookup on each iteration

# Faster: simple function
def circle_area(r):
    return 3.14159 * r ** 2

radii = list(range(100_000))
total = sum(circle_area(r) for r in radii)  # No method dispatch overhead

When to Prefer Functions Over Class Hierarchies

  • Hot loops processing large datasets: plain functions can be 20-50% faster
  • Simple transformations that do not need state: a function is simpler and faster
  • Numerical computations: consider using NumPy or other optimized libraries

However, do not sacrifice code organization for micro-optimization. Method dispatch overhead rarely matters in real applications. Profile first.

Python Playground
Output
Click "Run" to execute your code

Profiling with cProfile and timeit

Never optimize without measuring first. Python provides excellent profiling tools.

timeit for Micro-Benchmarks

import timeit

# Time a single expression
timeit.timeit('sum(range(1000))', number=10000)

# Compare approaches
time_list = timeit.timeit('[x**2 for x in range(100)]', number=10000)
time_gen = timeit.timeit('list(x**2 for x in range(100))', number=10000)

cProfile for Function-Level Profiling

import cProfile

def my_program():
    data = [i ** 2 for i in range(10000)]
    total = sum(data)
    filtered = [x for x in data if x > 1000]
    return len(filtered)

cProfile.run('my_program()')

The output shows how many times each function was called and how much time was spent in each.

Python Playground
Output
Click "Run" to execute your code

When to Optimize

The most important performance tip is knowing when not to optimize. Follow these rules:

  1. Make it work first. Write clear, correct code.
  2. Measure. Use profiling to find actual bottlenecks. Most code is not performance-critical.
  3. Optimize the bottleneck. Focus on the 10% of code that accounts for 90% of execution time.
  4. Consider algorithmic changes first. Switching from O(n^2) to O(n log n) beats any micro-optimization.
  5. Only then micro-optimize. Local variable caching, __slots__, etc.

"Premature optimization is the root of all evil." — Donald Knuth

Common algorithmic improvements:

Instead of... Use... Improvement
Nested loops for search Dictionary/set lookup O(n^2) to O(n)
Repeated string concatenation str.join() O(n^2) to O(n)
Sorting to find min/max min()/max() O(n log n) to O(n)
List for queue operations collections.deque O(n) pops to O(1)
Re-computing in loops Cache/memoize results Depends on problem

Python Playground
Output
Click "Run" to execute your code

Summary

Technique When to Use Benefit
Generator expressions Iterating without keeping all items Memory savings
__slots__ Many instances of the same class 40-60% less memory per instance
Local variable binding Tight loops with global lookups Faster variable access
str.join() Building strings in a loop O(n) instead of O(n^2)
collections types Counting, grouping, queues, records Built-in optimized containers
Sets for membership Repeated in checks O(1) instead of O(n)
Functions over methods Hot loops with simple operations Avoid dispatch overhead
Profile first Always Optimize the right thing

What's Next?

Learn how to keep your code clean and catch problems early with IDE Warnings & Linting.