Project Structure

A well-organized project is easier to navigate, test, and maintain. Python gives you flexible tools for structuring code — from single scripts to large packages with dozens of modules. This guide covers the patterns that work best at every scale.

Single Script vs Module vs Package

Python code can be organized at three levels of complexity:

Single Script

A standalone .py file. Perfect for small utilities, automation tasks, and scripts under a few hundred lines:

backup_files.py

Module

A single .py file that's designed to be imported by other code. The difference from a script is intent — a module exposes functions and classes for reuse:

utils.py          # import utils
validators.py     # import validators
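
As a sketch, a hypothetical utils.py might expose a couple of small helpers for other code to import (the function names here are illustrative):

```python
# utils.py — a minimal module (hypothetical helpers for illustration)

def slugify(text: str) -> str:
    """Lowercase text and join words with hyphens for use in URLs."""
    return "-".join(text.strip().lower().split())

def truncate(text: str, limit: int = 40) -> str:
    """Shorten text to at most `limit` characters, ending with '...'."""
    return text if len(text) <= limit else text[: limit - 3] + "..."
```

Other files then simply `import utils` and call `utils.slugify("Hello World")`.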

Package

A directory containing an __init__.py file and one or more modules. Packages let you group related functionality and create namespaces:

mypackage/
    __init__.py
    models.py
    views.py
    utils.py

__init__.py and What It Does

The __init__.py file serves several purposes:

  1. Marks a directory as a Python package — without it (in older Python), the directory isn't importable
  2. Runs initialization code when the package is first imported
  3. Controls the public API by defining what gets exported

# mypackage/__init__.py

# Import key items so users can do: from mypackage import User
from mypackage.models import User, Admin
from mypackage.utils import validate_email

# Define what "from mypackage import *" exports
__all__ = ["User", "Admin", "validate_email"]

# Package-level constants
__version__ = "1.0.0"

Empty __init__.py

For simple packages, an empty __init__.py is fine. It just marks the directory as a package:

# mypackage/__init__.py
# (empty file)

Users then import directly from submodules:

from mypackage.models import User
from mypackage.utils import validate_email

Note: Since Python 3.3, "namespace packages" allow packages without __init__.py. But explicit __init__.py is still recommended for clarity.

The src/ Layout

The src/ layout is the recommended structure for installable Python packages. It prevents a common bug where tests accidentally import the local source instead of the installed package:

my-project/
    src/
        mypackage/
            __init__.py
            models.py
            services.py
            utils.py
    tests/
        test_models.py
        test_services.py
        test_utils.py
    pyproject.toml
    README.md

Why src/?

Without src/, when you run tests from the project root, Python might import the local mypackage/ directory instead of the installed version. The src/ directory prevents this because src/ isn't automatically on sys.path.
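
The shadowing is easy to reproduce. Here is a small self-contained sketch, using a throwaway temp directory to stand in for a project root (`mypkg` is a hypothetical package name):

```python
import os
import sys
import tempfile

# Create a fake "project root" containing a package named mypkg
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mypkg")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write('SOURCE = "local checkout"\n')

# Running tests from the project root effectively does this:
sys.path.insert(0, root)

import mypkg  # resolves to the local directory, not any installed copy
print(mypkg.SOURCE)  # "local checkout"
```

With a src/ layout, the package directory lives under src/, which is not on sys.path, so the import can only find the installed version.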

Flat Layout (Alternative)

Some projects skip src/ and put the package at the root. This is simpler but has the import pitfall described above:

my-project/
    mypackage/
        __init__.py
        models.py
    tests/
        test_models.py
    pyproject.toml

Recommendation: Use src/ for libraries and packages you'll distribute. The flat layout is fine for applications that won't be installed as packages.

Tests Folder Structure

Mirror your source structure in your tests directory. This makes it easy to find the test for any module:

src/
    mypackage/
        __init__.py
        models/
            __init__.py
            user.py
            product.py
        services/
            __init__.py
            auth.py
            payment.py
        utils.py
tests/
    __init__.py
    models/
        __init__.py
        test_user.py
        test_product.py
    services/
        __init__.py
        test_auth.py
        test_payment.py
    test_utils.py
    conftest.py              # Shared pytest fixtures

Key conventions:

  • Test files are named test_<module>.py (pytest discovers them automatically)
  • Test classes are named Test<ClassName>
  • Test functions are named test_<behavior>
  • conftest.py holds shared fixtures and test configuration

# tests/models/test_user.py
from mypackage.models.user import User

class TestUser:
    def test_create_user(self):
        user = User("Alice", "alice@example.com")
        assert user.name == "Alice"

    def test_user_display_name(self):
        user = User("alice bob", "alice@example.com")
        assert user.display_name() == "Alice Bob"

pyproject.toml Basics

pyproject.toml is the modern standard for Python project configuration. It replaces setup.py, setup.cfg, and various tool-specific config files:

[project]
name = "mypackage"
version = "1.0.0"
description = "A short description of my package"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT"}
authors = [
    {name = "Your Name", email = "you@example.com"},
]
dependencies = [
    "requests>=2.28",
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0",
    "mypy>=1.0",
    "black>=23.0",
]

[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"

[tool.setuptools.packages.find]
where = ["src"]

# Tool configuration
[tool.black]
line-length = 88

[tool.mypy]
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]

Key Sections

Section                           Purpose
[project]                         Package metadata, dependencies
[build-system]                    How to build the package
[tool.*]                          Configuration for tools (Black, mypy, pytest, etc.)
[project.scripts]                 CLI entry points
[project.optional-dependencies]   Extra dependency groups

Common Project Layouts

CLI Application

weather-cli/
    src/
        weather/
            __init__.py
            cli.py           # Click/argparse entry point
            api.py           # External API client
            models.py        # Data models
            formatters.py    # Output formatting
            config.py        # Configuration handling
    tests/
        test_cli.py
        test_api.py
        test_formatters.py
    pyproject.toml
    README.md

# In pyproject.toml
[project.scripts]
weather = "weather.cli:main"
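
The weather.cli:main callable referenced by the entry point could be sketched with argparse — the function name, flags, and behavior here are illustrative, not a real API:

```python
# src/weather/cli.py — hypothetical entry point
import argparse

def main(argv=None):
    """Called by the `weather` console script; returns an exit code."""
    parser = argparse.ArgumentParser(prog="weather", description="Show a forecast")
    parser.add_argument("city", help="city to look up")
    parser.add_argument("--units", choices=["metric", "imperial"], default="metric")
    args = parser.parse_args(argv)
    print(f"Fetching {args.units} forecast for {args.city}...")
    return 0

if __name__ == "__main__":
    raise SystemExit(main())
```

After `pip install -e .`, running `weather London` invokes main() directly.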

Library / Reusable Package

python-slugify/
    src/
        slugify/
            __init__.py      # Public API exports
            core.py          # Main slugification logic
            special.py       # Language-specific rules
            unicode_data.py  # Unicode lookup tables
    tests/
        test_core.py
        test_special.py
    docs/
        index.md
        api.md
    pyproject.toml
    README.md
    LICENSE

Web Application (Flask/FastAPI)

mywebapp/
    src/
        app/
            __init__.py      # App factory
            models/
                __init__.py
                user.py
                post.py
            routes/
                __init__.py
                auth.py
                api.py
            services/
                __init__.py
                email.py
                storage.py
            templates/
                base.html
                index.html
            static/
                style.css
            config.py
    tests/
        conftest.py
        test_auth.py
        test_api.py
    migrations/
        001_initial.sql
    pyproject.toml
    README.md

Organizing Class Hierarchies Across Files

When a project uses inheritance and polymorphism heavily, how you organize classes across files has a significant impact on readability and maintainability.

Where to Put Base Classes

Base classes and abstract classes should live in their own module, separate from their implementations. This prevents circular imports and makes the hierarchy clear:

src/
    mypackage/
        base.py              # Abstract base classes
        mixins.py            # Reusable mixin classes
        implementations/
            __init__.py
            csv_processor.py
            json_processor.py
            xml_processor.py

# base.py
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    """Abstract base for all data processors."""

    @abstractmethod
    def parse(self, raw_data: str) -> dict:
        pass

    @abstractmethod
    def validate(self, data: dict) -> bool:
        pass

    def process(self, raw_data: str) -> dict:
        """Template method: parse, validate, then return."""
        data = self.parse(raw_data)
        if not self.validate(data):
            raise ValueError("Validation failed")
        return data

# implementations/json_processor.py
import json
from mypackage.base import BaseProcessor

class JsonProcessor(BaseProcessor):
    """Process JSON data files."""

    def parse(self, raw_data: str) -> dict:
        return json.loads(raw_data)

    def validate(self, data: dict) -> bool:
        return "id" in data and "name" in data
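
Putting the two files together, callers depend only on the abstract interface. A condensed, self-contained sketch (the `load` helper is illustrative):

```python
import json
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    @abstractmethod
    def parse(self, raw_data: str) -> dict: ...

    @abstractmethod
    def validate(self, data: dict) -> bool: ...

    def process(self, raw_data: str) -> dict:
        # Template method: parse, validate, then return
        data = self.parse(raw_data)
        if not self.validate(data):
            raise ValueError("Validation failed")
        return data

class JsonProcessor(BaseProcessor):
    def parse(self, raw_data: str) -> dict:
        return json.loads(raw_data)

    def validate(self, data: dict) -> bool:
        return "id" in data and "name" in data

def load(processor: BaseProcessor, raw: str) -> dict:
    # Works with any concrete processor — CSV, XML, JSON, ...
    return processor.process(raw)

print(load(JsonProcessor(), '{"id": 1, "name": "widget"}'))
```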

Where to Put Mixins

Mixins — small classes designed for multiple inheritance — belong in a dedicated mixins.py module or a mixins/ package:

# mixins.py
import logging

class LoggingMixin:
    """Add structured logging to any class."""

    def get_logger(self):
        return logging.getLogger(self.__class__.__name__)

    def log_info(self, message):
        self.get_logger().info(message)

class CachingMixin:
    """Add simple in-memory caching."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._cache = {}

    def cache_get(self, key):
        return self._cache.get(key)

    def cache_set(self, key, value):
        self._cache[key] = value

Then combine them in implementation classes:

# implementations/cached_json_processor.py
import json

from mypackage.base import BaseProcessor
from mypackage.mixins import LoggingMixin, CachingMixin

class CachedJsonProcessor(LoggingMixin, CachingMixin, BaseProcessor):
    """JSON processor with logging and caching.

    MRO: CachedJsonProcessor -> LoggingMixin -> CachingMixin
         -> BaseProcessor -> ABC -> object
    """

    def parse(self, raw_data):
        self.log_info(f"Parsing {len(raw_data)} bytes")
        cached = self.cache_get(raw_data)
        if cached is not None:
            self.log_info("Cache hit")
            return cached
        result = json.loads(raw_data)
        self.cache_set(raw_data, result)
        return result

Deep Inheritance vs Composition

A common question is whether to use inheritance or composition. Both have trade-offs:

Deep inheritance (A -> B -> C -> D) creates tight coupling. Changes to A can break B, C, and D. The MRO becomes harder to reason about with each level, especially with multiple inheritance.

Composition (A has a B, C, D) is more flexible. Each component can be swapped independently.

# Deep inheritance — tightly coupled
class Animal:
    pass

class Mammal(Animal):
    pass

class DomesticMammal(Mammal):
    pass

class Dog(DomesticMammal):  # 4 levels deep — hard to change
    pass

# Composition — loosely coupled
class Dog:
    def __init__(self):
        self.movement = WalkingMovement()
        self.sound = BarkingSound()
        self.diet = OmnivoreDiet()

General guidance:

  • Prefer composition when objects have a "has-a" relationship
  • Use inheritance when objects have a clear "is-a" relationship
  • Keep inheritance hierarchies shallow (2-3 levels max)
  • Use mixins for cross-cutting concerns (logging, caching, serialization)
  • Use ABCs to define interfaces that multiple unrelated classes must implement
  • Consider Protocols (structural subtyping) when you want duck typing with type safety
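
The last point deserves a sketch: a typing.Protocol describes a required shape without any inheritance, so unrelated classes satisfy it just by matching (the class names here are hypothetical):

```python
from typing import Protocol

class Serializer(Protocol):
    """Anything with a matching serialize() method satisfies this Protocol."""
    def serialize(self) -> str: ...

class JsonReport:            # no base class needed
    def serialize(self) -> str:
        return '{"status": "ok"}'

class CsvReport:             # also no base class
    def serialize(self) -> str:
        return "status\nok"

def export(item: Serializer) -> str:
    # Type checkers verify this structurally: the argument just needs serialize()
    return item.serialize()

print(export(JsonReport()))
print(export(CsvReport()))
```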

Virtual Environments

Always use a virtual environment to isolate your project's dependencies from the system Python and from other projects:

# Create a virtual environment
python -m venv .venv

# Activate it
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
# Or with pyproject.toml:
pip install -e ".[dev]"

# Deactivate when done
deactivate
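
To check from inside Python whether a virtual environment is active, you can compare sys.prefix with sys.base_prefix; they differ inside a venv on Python 3.3+:

```python
import sys

def in_virtualenv() -> bool:
    """True when running inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix

print("virtual environment active:", in_virtualenv())
```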

Your project's .gitignore should exclude the virtual environment directory:

# .gitignore
.venv/
__pycache__/
*.pyc
dist/
*.egg-info/

Tip: Modern tools like uv, pdm, and poetry manage virtual environments automatically. They create and activate environments as part of their workflow, so you don't have to manage them manually.

The if __name__ == "__main__" Pattern

This pattern lets a file work both as an importable module and a runnable script:

# my_module.py

def greet(name):
    return f"Hello, {name}!"

def main():
    """Entry point when run as a script."""
    print(greet("World"))

if __name__ == "__main__":
    main()

When you run python my_module.py, Python sets __name__ to "__main__", so main() executes. When you import my_module, __name__ is "my_module", so main() doesn't run.

Putting It All Together

Here's a complete, well-structured project skeleton:

my-awesome-project/
    src/
        mypackage/
            __init__.py          # Package API, version
            base.py              # Abstract base classes
            mixins.py            # Reusable mixins
            models/
                __init__.py
                user.py
                product.py
            services/
                __init__.py
                auth.py
                payment.py
            utils/
                __init__.py
                validation.py
                formatting.py
            config.py            # Settings and configuration
    tests/
        __init__.py
        conftest.py              # Shared fixtures
        models/
            test_user.py
            test_product.py
        services/
            test_auth.py
            test_payment.py
        utils/
            test_validation.py
    docs/
        index.md
    pyproject.toml
    README.md
    LICENSE
    .gitignore

Start simple — a single file is fine for small projects. As your code grows, split it into modules. When you have enough modules, organize them into packages. The structure should serve the code, not the other way around.