Docstrings & Comments
Good documentation helps others (and future you) understand your code. Python has built-in support for documentation through docstrings, and PEP 257 provides conventions for writing them well.
Docstrings vs Comments
They serve different purposes:
- Docstrings describe what something does — they're for users of your code
- Comments explain why something is done a certain way — they're for maintainers of your code
def calculate_tax(income, rate=0.25):
    """Calculate tax owed based on income and tax rate."""
    # Use max() to ensure we never return a negative tax
    # (edge case when income adjustments push it below zero)
    return max(0, income * rate)
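One practical consequence of this split: a docstring is stored on the object and can be read at runtime, while a comment is discarded when the file is parsed. The sketch below reuses `calculate_tax` from above to illustrate this; the printed checks are only for demonstration.

```python
def calculate_tax(income, rate=0.25):
    """Calculate tax owed based on income and tax rate."""
    # Clamp at zero so an adjusted income below zero never yields negative tax
    return max(0, income * rate)


# The docstring lives on the function object as __doc__.
print(calculate_tax.__doc__)

# The comment is not part of the docstring and is not available at runtime.
print("Clamp at zero" in calculate_tax.__doc__)
```

This is why tools such as `help()` can show docstrings but never comments.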
Single-Line Docstrings
For simple functions, a one-line docstring is enough. It should be a short phrase, ending with a period, that describes what the function does, not how it does it:
def square(n):
    """Return the square of a number."""
    return n ** 2

def is_even(n):
    """Check whether a number is even."""
    return n % 2 == 0

class Stack:
    """A simple last-in, first-out (LIFO) stack."""
    pass
PEP 257 rules for single-line docstrings:
- Use triple double quotes ("""..."""), even for one line
- Put the closing """ on the same line
- No blank line before or after
- Write as a phrase ending with a period
Multi-Line Docstrings
For anything beyond a trivial function, use a multi-line docstring. The structure is:
- A summary line (same as single-line)
- A blank line
- Extended description, parameters, return values, examples, etc.
def connect(host, port, timeout=30):
    """Establish a connection to the remote server.

    Attempts to connect to the specified host and port. If the
    connection cannot be established within the timeout period,
    a ConnectionError is raised.

    Args:
        host: The hostname or IP address to connect to.
        port: The port number (1-65535).
        timeout: Maximum seconds to wait. Defaults to 30.

    Returns:
        A Connection object representing the active connection.

    Raises:
        ConnectionError: If the connection cannot be established.
        ValueError: If port is out of valid range.
    """
    pass
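The summary line and the blank line after it matter to tools, because many of them treat the first line as a short description. As a small sketch, the standard library's `inspect.getdoc()` returns the docstring with indentation normalized, and the summary can then be taken as the first line. The shortened `connect` below is only a stand-in for the function above.

```python
import inspect


def connect(host, port, timeout=30):
    """Establish a connection to the remote server.

    Attempts to connect to the specified host and port.
    """


# getdoc() strips the common indentation from the docstring body.
clean = inspect.getdoc(connect)
lines = clean.splitlines()

summary = lines[0]    # the one-line summary
separator = lines[1]  # the blank line that follows it

print(summary)
print(repr(separator))
```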
Docstring Styles
There are three widely used conventions for formatting multi-line docstrings. Pick one and use it consistently throughout your project.
Google Style
Widely used and easy to read in source code. Uses indented sections with simple headers:
def fetch_users(status, limit=100):
    """Fetch users from the database filtered by status.

    Retrieves a list of users matching the given status,
    ordered by creation date (newest first).

    Args:
        status: The account status to filter by. One of
            'active', 'inactive', or 'pending'.
        limit: Maximum number of users to return.
            Defaults to 100.

    Returns:
        A list of User objects matching the criteria.
        Returns an empty list if no users match.

    Raises:
        DatabaseError: If the query fails.
        ValueError: If status is not a valid value.

    Example:
        >>> users = fetch_users('active', limit=10)
        >>> len(users)
        10
    """
    pass
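The `>>>` lines in an Example section are more than decoration: the standard library's `doctest` module can execute them and compare the output. The sketch below uses a hypothetical `clamp` function (not part of the examples above) so that it runs without a database, and drives `doctest` explicitly through `DocTestParser` and `DocTestRunner`.

```python
import doctest


def clamp(value, low, high):
    """Restrict value to the range [low, high].

    Example:
        >>> clamp(15, 0, 10)
        10
        >>> clamp(-3, 0, 10)
        0
    """
    return max(low, min(value, high))


# Extract the examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(clamp.__doc__, {"clamp": clamp}, "clamp", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.attempted, "examples run,", results.failed, "failed")
```

In a real project, running `python -m doctest your_module.py` is usually enough; the explicit parser and runner are used here only to keep the sketch self-contained.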
NumPy/SciPy Style
Popular in the scientific Python ecosystem. Uses underlined section headers:
def calculate_statistics(data):
    """Calculate descriptive statistics for a dataset.

    Parameters
    ----------
    data : list of float
        The input dataset. Must contain at least one element.

    Returns
    -------
    dict
        A dictionary containing:
        - 'mean' : float, the arithmetic mean
        - 'median' : float, the median value
        - 'std' : float, the standard deviation

    Raises
    ------
    ValueError
        If data is empty.

    Examples
    --------
    >>> calculate_statistics([1, 2, 3, 4, 5])
    {'mean': 3.0, 'median': 3.0, 'std': 1.41}
    """
    pass
Sphinx/reStructuredText Style
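Used natively by the Sphinx documentation tool. Describes each parameter, return value, and exception with reStructuredText fields such as :param: and :returns: instead of section headers: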
def send_email(to, subject, body, cc=None):
    """Send an email message.

    :param to: Recipient email address.
    :type to: str
    :param subject: The email subject line.
    :type subject: str
    :param body: The email body content.
    :type body: str
    :param cc: Optional CC recipients.
    :type cc: list[str] or None
    :returns: True if the email was sent successfully.
    :rtype: bool
    :raises SMTPError: If the email server is unreachable.
    """
    pass
Comparison
| Style | Pros | Cons |
|---|---|---|
| Google | Most readable, clean | Requires Napoleon extension for Sphinx |
| NumPy | Great for scientific code, very detailed | Verbose, takes up many lines |
| Sphinx/reST | Native Sphinx support | Harder to read in source code |
Recommendation: Use Google style for most projects. It's the most readable in source code and well-supported by documentation tools.
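If you build documentation with Sphinx, Google-style and NumPy-style docstrings are parsed by the Napoleon extension, which ships with Sphinx as `sphinx.ext.napoleon`. A minimal sketch of the relevant part of a project's `conf.py`, assuming `autodoc` is used to pull docstrings from the code:

```python
# conf.py (Sphinx configuration file), relevant excerpt only
extensions = [
    "sphinx.ext.autodoc",   # import modules and collect their docstrings
    "sphinx.ext.napoleon",  # understand Google- and NumPy-style sections
]
```

The rest of `conf.py` (project name, theme, paths) depends on the project and is omitted here.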
Module-Level Docstrings
Every module (.py file) should have a docstring at the top describing its purpose:
"""User authentication and session management.

This module provides functions for authenticating users,
managing sessions, and handling login/logout flows. It
supports both password-based and token-based authentication.

Typical usage:
    from auth import authenticate, create_session
    user = authenticate(username, password)
    session = create_session(user)
"""
import hashlib
from datetime import datetime
# ... rest of the module
The module docstring must be the first statement in the file, before any imports. Only comments, such as a shebang or encoding line, may come before it.
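The position matters because Python only treats a string literal as the module docstring when it is the first statement. A small sketch using the standard library's `ast.get_docstring()` on two hypothetical source snippets shows the difference:

```python
import ast

# Docstring first, then imports: recognized as the module docstring.
docstring_first = '"""Utilities for parsing dates."""\nimport datetime\n'

# Import first, then the string: just an unused expression, not a docstring.
docstring_late = 'import datetime\n"""Utilities for parsing dates."""\n'

found_first = ast.get_docstring(ast.parse(docstring_first))
found_late = ast.get_docstring(ast.parse(docstring_late))

print(found_first)
print(found_late)
```

In the second case the module's `__doc__` would be `None`, so `help()` and documentation generators would have nothing to show.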
Class Docstrings
Class docstrings describe the class's purpose, its constructor parameters, and its key attributes. When documenting class hierarchies, mention inheritance relationships and polymorphic behavior:
import math

class Shape:
    """Base class for geometric shapes.

    All shapes must implement the `area()` and `perimeter()` methods.
    Subclasses include Circle, Rectangle, and Triangle.

    This class is not meant to be used directly; instantiate a subclass instead.
    """

    def area(self):
        raise NotImplementedError

    def perimeter(self):
        raise NotImplementedError

class Circle(Shape):
    """A circle defined by its radius.

    Inherits from Shape and implements area() and perimeter()
    using standard circle formulas.

    Args:
        radius: The radius of the circle. Must be positive.

    Attributes:
        radius: The circle's radius.

    Example:
        >>> c = Circle(5)
        >>> c.area()
        78.53981633974483
    """

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        """Return the area of the circle (pi * r^2)."""
        # math.pi keeps the result consistent with the docstring example
        return math.pi * self.radius ** 2

    def perimeter(self):
        """Return the circumference (2 * pi * r)."""
        return 2 * math.pi * self.radius
Documenting Inherited Methods
When a subclass overrides a method, document how its behavior differs from the parent:
class JsonProcessor(BaseProcessor):
    """Process JSON data files.

    Overrides BaseProcessor.process() to handle JSON-specific
    parsing and validation.
    """

    def process(self, data):
        """Parse and validate JSON data.

        Unlike the base implementation, this method performs
        schema validation before processing.

        Args:
            data: Raw JSON string to process.

        Returns:
            Parsed and validated dictionary.
        """
        pass
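If an override does not change the documented behavior, it can be left without a docstring: `inspect.getdoc()` (and therefore `help()`) falls back to the parent's docstring. The sketch below uses a hypothetical `PassthroughProcessor` and a simplified `BaseProcessor` to show the fallback.

```python
import inspect


class BaseProcessor:
    def process(self, data):
        """Process raw data and return the result."""
        return data


class PassthroughProcessor(BaseProcessor):
    def process(self, data):  # override without its own docstring
        return data


processor = PassthroughProcessor()

# The override itself carries no docstring...
own_doc = processor.process.__doc__

# ...but getdoc() walks the class hierarchy and finds the parent's.
inherited_doc = inspect.getdoc(processor.process)

print(own_doc)
print(inherited_doc)
```

When the behavior does differ, as with `JsonProcessor.process()` above, writing a new docstring is the better choice, because the inherited text would be misleading.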
Documenting Polymorphic Behavior
When a class participates in polymorphism, document the contract that subclasses must follow:
class PaymentGateway:
    """Abstract interface for payment processing.

    Subclasses must implement charge() and refund(). The checkout
    system calls these methods polymorphically: it doesn't know
    or care which gateway is being used.

    Implementations:
        - StripeGateway: Credit card payments via Stripe
        - PayPalGateway: PayPal payments
        - MockGateway: Testing only, no real charges
    """

    def charge(self, amount, currency="USD"):
        """Charge the customer.

        Args:
            amount: The amount to charge in cents.
            currency: ISO 4217 currency code.

        Returns:
            A TransactionResult with status and ID.
        """
        raise NotImplementedError
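A documented contract can also be enforced. One option, sketched below, is the standard library's `abc` module: methods marked with `@abstractmethod` must be overridden before a subclass can be instantiated. This is an alternative to raising `NotImplementedError`, not a description of how the class above is implemented; the `refund()` signature and the dictionary return values are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class PaymentGateway(ABC):
    """Abstract interface for payment processing."""

    @abstractmethod
    def charge(self, amount, currency="USD"):
        """Charge the customer `amount` cents in `currency`."""

    @abstractmethod
    def refund(self, transaction_id):
        """Refund a previous transaction (hypothetical signature)."""


class MockGateway(PaymentGateway):
    """Testing only, no real charges."""

    def charge(self, amount, currency="USD"):
        return {"status": "ok", "amount": amount, "currency": currency}

    def refund(self, transaction_id):
        return {"status": "refunded", "id": transaction_id}


# The abstract base cannot be instantiated directly.
try:
    PaymentGateway()
    instantiated = True
except TypeError:
    instantiated = False

print(instantiated)
print(MockGateway().charge(500))
```

The docstrings on the abstract methods still serve as the contract that every implementation is expected to follow.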
When to Comment (and When Not To)
Good Comments
# Retry up to 3 times because the API occasionally returns
# 503 during deployments (usually resolves within 2 seconds)
for attempt in range(3):
    response = api.fetch(url)
    if response.status != 503:
        break
    time.sleep(2)

# bisect.insort finds the insertion point in O(log n) and keeps the
# list sorted (the insert itself is O(n)), which avoids re-sorting
# with list.append() + list.sort() on every new score
bisect.insort(sorted_scores, new_score)

# TODO: Replace with proper caching once Redis is set up
cached_result = {}
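For readers unfamiliar with `bisect`, here is a small runnable sketch of the insertion the comment above refers to. The score values are made up for illustration.

```python
import bisect

sorted_scores = [10, 20, 40, 50]
new_score = 30

# insort() locates the position with a binary search and inserts there,
# so the list stays sorted without calling sort() again.
bisect.insort(sorted_scores, new_score)
print(sorted_scores)

# bisect_left() returns the index where a value would be inserted.
position = bisect.bisect_left(sorted_scores, 40)
print(position)
```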
Bad Comments
# Bad: states the obvious
x = x + 1 # Increment x by 1
# Bad: restates the code
user_list = get_users() # Get the list of users
# Bad: commented-out code (use version control instead)
# old_value = calculate_old_way(data)
# if old_value > threshold:
#     process_old(old_value)
# Bad: misleading comment (the code does something different)
# Sort users by name
users.sort(key=lambda u: u.created_at) # Actually sorts by date!
The Best Comment Is No Comment
If you can make the code self-explanatory, do that instead of adding a comment:
# Instead of this:
# Check if the user is old enough to vote
if user.age >= 18:
    pass

# Write this (the function name IS the documentation):
def is_eligible_to_vote(user):
    return user.age >= 18

if is_eligible_to_vote(user):
    pass
Type Hints as Self-Documenting Code
Type hints reduce the need for docstring parameter descriptions because the types are right there in the code:
# Without type hints: need docstring to explain types
def create_user(name, age, emails):
    """Create a new user.

    Args:
        name: The user's full name (str).
        age: The user's age in years (int).
        emails: List of email addresses (list of str).
    """
    pass

# With type hints: types are self-documenting
def create_user(
    name: str,
    age: int,
    emails: list[str],
) -> User:
    """Create a new user."""
    pass
With type hints, the docstring can focus on behavior rather than types.
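Type hints are also available at runtime, which is how editors and documentation tools can show them next to the docstring. The sketch below is a runnable variant of `create_user`; the `User` dataclass is a hypothetical stand-in, since the original does not define it, and `list[str]` requires Python 3.9 or later.

```python
import inspect
from dataclasses import dataclass


@dataclass
class User:  # hypothetical stand-in for the real User type
    name: str
    age: int
    emails: list[str]


def create_user(name: str, age: int, emails: list[str]) -> User:
    """Create a new user."""
    return User(name, age, emails)


# The hints are stored on the function, alongside the docstring.
hints = create_user.__annotations__
print(hints["age"], hints["return"])

# inspect.signature() combines parameter names, defaults, and hints.
parameter_names = list(inspect.signature(create_user).parameters)
print(parameter_names)
```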
PEP 257 Summary
The key rules from PEP 257:
- Use """triple double quotes""" for all docstrings
- Put the summary on the first line
- The summary should be a phrase ending with a period
- For multi-line docstrings, leave a blank line after the summary
- The closing """ goes on its own line for multi-line docstrings
- Don't leave a blank line after the opening """
- Module docstrings go at the very top of the file, before imports
- Class docstrings go right after the class line
- Function/method docstrings go right after the def line
# Single-line: closing quotes on same line
def add(a, b):
    """Return the sum of a and b."""
    return a + b

# Multi-line: closing quotes on their own line
def complex_operation(data, mode="fast"):
    """Perform a complex data transformation.

    Processes the input data using the specified mode.
    Valid modes are 'fast' (approximate) and 'precise'
    (exact but slower).

    Args:
        data: The input dataset to process.
        mode: Processing mode. Defaults to 'fast'.

    Returns:
        Transformed dataset as a dictionary.
    """
    pass
Docstring Tools
Several tools work with docstrings:
- help(): built-in function that displays docstrings in the REPL
- Sphinx: generates HTML/PDF documentation from docstrings
- pydoc: command-line tool for viewing docstrings
- pdoc: simpler alternative to Sphinx for auto-generated docs
- interrogate: checks that all functions/classes have docstrings
# In the Python REPL:
help(str.split)
help(list.sort)
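To give a feel for what a docstring coverage check involves, here is a minimal sketch built on the standard library's `ast` module. It is not how interrogate is implemented; the function name `find_missing_docstrings` and the sample source are made up for illustration.

```python
import ast

# Hypothetical module source to check.
source = '''
def documented():
    """Return nothing useful."""

def undocumented():
    pass

class Widget:
    pass
'''


def find_missing_docstrings(source_code):
    """Return the names of functions and classes that lack a docstring."""
    tree = ast.parse(source_code)
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, definitions) and ast.get_docstring(node) is None:
            missing.append(node.name)
    return missing


missing = find_missing_docstrings(source)
print(sorted(missing))
```

A dedicated tool adds configuration, reporting, and exclusion rules on top of this basic idea, so it is usually the better choice for real projects.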