Home

03 Testing Fundamentals: Basics

November 2025 (4724 Words, 27 Minutes)

lecture testing pytest unit-testing test-pyramid aaa-pattern clean-code test-frameworks

1. Introduction: Your CI is Green, But Does Your Code Work?

After Chapter 02 (Refactoring), your Road Profile Viewer has beautiful modular structure:

src/road_profile_viewer/
├── __init__.py
├── geometry.py      # Pure math functions
├── road.py          # Road generation
├── visualization.py # Dash UI
└── main.py          # Entry point

Your CI pipeline validates every pull request:

✅ Ruff check (PEP8 style)
✅ Ruff format check
✅ Pyright (type hints)

All checks green. Code merged. Deployed to production.

Then this happens:

# A user enters a vertical camera angle
angle = 90.0

# Your code crashes:
def find_intersection(x_road, y_road, angle_degrees, ...):
    angle_rad = -np.deg2rad(angle_degrees)
    slope = np.tan(angle_rad)  # tan(90°) = ∞
    # ZeroDivisionError or infinite loop!

Your CI said nothing was wrong. Because:

✅ Ruff: “Code is formatted nicely!” (it is)
✅ Pyright: “Types look good!” (they do)
❌ Nobody tested if the code actually works

This is the gap we’re filling today.

2. Learning Objectives

By the end of this lecture, you will:

Understand the difference between code quality (style) and code correctness (logic)
Master the testing pyramid - unit, module, and end-to-end tests
Write effective unit tests using equivalence classes and boundary analysis
Use pytest to test your modular Python code
Leverage LLMs to accelerate test writing (while understanding their limitations)
Break the “test cone” pattern that prevents developers from writing tests
Apply feature branch workflow to add tests to your project

What you WILL learn:

How to catch bugs before users do
How to write tests that give you confidence
How to use LLMs as test writing assistants (not autopilots)
Why the testing pyramid matters

What you WON’T learn (yet):

Test-driven development (TDD) - that’s Chapter 03 (TDD and CI)
Mocking and fixtures - advanced topics for later
UI testing - complex and out of scope

3. Part 1: Why Testing Matters

3.1 The Bug That CI Missed

Let’s look at a real bug in your src/road_profile_viewer/geometry.py:

from numpy.typing import NDArray
import numpy as np

def find_intersection(
    x_road: NDArray[np.float64],
    y_road: NDArray[np.float64],
    angle_degrees: float,
    camera_x: float = 0,
    camera_y: float = 1.5,
) -> tuple[float | None, float | None, float | None]:
    """Find intersection between camera ray and road profile."""
    angle_rad = -np.deg2rad(angle_degrees)

    # Handle vertical ray
    if np.abs(np.cos(angle_rad)) < 1e-10:
        return None, None, None  # Currently returns None for vertical angles

    slope = np.tan(angle_rad)
    # ... rest of function

What happens with different angles?

Angle	Expected Behavior	Actual Behavior	CI Status
-10°	Find intersection	✅ Works	✅ Green
-45°	Find intersection	✅ Works	✅ Green
0°	Horizontal ray	✅ Works	✅ Green
90°	Vertical ray	❌ Returns None (should handle gracefully)	✅ Green
Empty array	Handle empty data	❌ Crashes with IndexError	✅ Green

CI is blind to logic bugs. It only checks:

Is the code formatted? (Ruff)
Are types correct? (Pyright)

It doesn’t check:

Does the function return correct results?
Does it handle edge cases?
Does it fail gracefully with invalid input?

3.2 Code Quality ≠ Code Correctness

Code Quality Tools (Chapter 02 (Code Quality in Practice))

What they check: Style, formatting, type hints
What they catch: PEP8 violations, missing type hints
What they miss: Logic errors, edge case bugs, incorrect calculations

Tests (This Lecture)

What they check: Functionality, correctness, behavior
What they catch: Logic bugs, edge case failures, incorrect results
What they miss: Style issues (that’s what Ruff is for!)

You need BOTH:

Code Quality Tools → "Does this look professional?"
Tests              → "Does this actually work?"

4. Part 2: The Testing Pyramid

4.1 What is the Testing Pyramid?

The testing pyramid is a strategy for balancing different types of tests:

       /\
      /E2E\       ← Few, slow, expensive
     /------\       (Full Dash app, browser simulation)
    /Module \     ← Some, moderate speed
   /----------\     (Multiple modules working together)
  /   Unit     \   ← Many, fast, cheap
 /--------------\    (Individual functions)

Three levels of testing:

Unit Tests (Base of pyramid)
- Test individual functions in isolation
- Fast (milliseconds)
- Easy to write and maintain
- Catch bugs early
Module/Integration Tests (Middle of pyramid)
- Test multiple modules working together
- Moderate speed (seconds)
- Catch interface mismatches
End-to-End Tests (Top of pyramid)
- Test entire application flow
- Slow (seconds to minutes)
- Expensive to maintain
- Catch system-level issues

4.2 Why the Pyramid Shape?

Ratio recommendation:

70% Unit tests
20% Module/Integration tests
10% End-to-End tests

Why?

Speed matters for feedback loop:

Unit test:    find_intersection() test runs in 0.01 seconds
Module test:  geometry + road together runs in 0.1 seconds
E2E test:     Start Dash app, simulate clicks runs in 5+ seconds

If you change one line in find_intersection():

✅ 50 unit tests run in 0.5 seconds → instant feedback
❌ 50 E2E tests run in 250 seconds → you go get coffee, lose flow

The feedback loop is critical:

Fast tests → Run them often → Catch bugs immediately → Fix quickly
Slow tests → Run them rarely → Bugs pile up → Hard to debug

4.3 The Anti-Pattern: The Test Cone (Inverted Pyramid)

What happens when developers avoid unit tests:

 /--------------\
  \   E2E     /    ← Many E2E tests (slow!)
   \----------/
    \ Module /      ← Some module tests
     \------/
      \ Unit/       ← Few or no unit tests
       \  /

This is backwards! Problems:

Slow feedback: Every change requires running slow E2E tests
Hard to debug: E2E test fails → which of 10 modules caused it?
Brittle: UI changes break all E2E tests
Developers avoid running tests: Too slow → bugs pile up

Why does the cone happen?

“I don’t know how to write unit tests”
“Writing test setup is tedious”
“The UI is easy to click through manually”
Result: Only E2E tests, testing pyramid inverted

Today, we’re going to fix this by:

Learning unit testing properly
Using LLMs to handle tedious boilerplate
Making unit testing so easy you’ll prefer it

5. Part 3: Unit Testing Deep Dive

5.1 What is a Unit Test?

Definition: A unit test verifies that a single unit (function, method, class) works correctly in isolation.

Example:

# Unit being tested: find_intersection()
def test_find_intersection_normal_angle():
    """Test that find_intersection() works with a normal downward angle."""
    # Arrange: Set up test data
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 3, 4])
    angle = -10.0

    # Act: Call the function
    x, y, dist = find_intersection(x_road, y_road, angle)

    # Assert: Verify the result
    assert x is not None, "Should find an intersection"
    assert dist > 0, "Distance should be positive"

Key characteristics:

✅ Tests ONE function (find_intersection)
✅ No dependencies on Dash, UI, or other modules
✅ Fast (runs in milliseconds)
✅ Clear what’s being tested

5.2 The AAA Pattern: Arrange-Act-Assert

Every good unit test follows this structure:

def test_example():
    # ARRANGE: Set up the test data and conditions
    input_data = prepare_test_data()
    expected_result = calculate_expected()

    # ACT: Call the function being tested
    actual_result = function_under_test(input_data)

    # ASSERT: Verify the result matches expectations
    assert actual_result == expected_result

Why this pattern?

Clear structure makes tests readable
Easy to see what’s being tested
Easy to debug when tests fail

5.3 The Impossibility of Exhaustive Testing

Question: Why can’t we just test every possible input?

Let’s do the math for find_intersection():

def find_intersection(x_road, y_road, angle_degrees, camera_x=0, camera_y=1.5):
    """Find intersection between camera ray and road profile."""
    # ...

Just one parameter: angle_degrees (a float)

Float precision: 32-bit float has ~7 significant digits
Range: -180° to 180° (or -90° to 90° for practical angles)
Possible values: Let’s be conservative and say 1,000,000 distinct values

Time per test: ~1 millisecond (very fast!)

Total time to test just angle_degrees:

1,000,000 tests × 0.001 seconds = 1,000 seconds ≈ 17 minutes

But wait, we have 4 parameters:

angle_degrees: ~1,000,000 values
camera_x: ~1,000,000 values
camera_y: ~1,000,000 values
x_road, y_road: Arrays with variable length and values → effectively infinite

If we test ALL combinations (just for 3 float parameters):

1,000,000³ = 1,000,000,000,000,000,000 tests (1 quintillion!)

At 1ms per test:
= 1,000,000,000,000,000 seconds
= 31,709,791,983 years
≈ 32 BILLION YEARS

For perspective:

Age of the universe: ~14 billion years
Time to test all inputs: ~32 billion years
Conclusion: Exhaustive testing is literally impossible

And this is for ONE function with simple parameters!

5.4 Testing Frameworks: Why We Need pytest

Question: Can’t we just write tests as regular Python functions?

Yes, but it gets messy fast:

# Without a testing framework:
def manual_test():
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    if x is None:
        print("❌ FAILED: x should not be None")
        return False
    if dist <= 0:
        print("❌ FAILED: distance should be positive")
        return False
    print("✅ PASSED")
    return True

# Run test
manual_test()

Problems with manual testing:

❌ No automatic test discovery (you have to call each function manually)
❌ No organized test reports (just prints)
❌ No test isolation (one failure might affect others)
❌ No helpful error messages (you have to write all the checks)
❌ No test fixtures (setup/teardown)
❌ No parallel execution (slow!)
❌ No integration with CI/CD

Enter pytest: The Industry Standard Testing Framework

What pytest provides:

# Automatic test discovery - finds all test_*.py files
$ uv run pytest

# Run with detailed output
$ uv run pytest -v

# Run specific test file
$ uv run pytest tests/test_geometry.py

# Run tests matching a pattern
$ uv run pytest -k "intersection"

# Show print statements even when passing
$ uv run pytest -s

# Stop at first failure
$ uv run pytest -x

What makes pytest great:

✅ Simple syntax: Just use assert (no special methods)
✅ Auto-discovery: Finds all test_*.py files automatically
✅ Rich output: Shows exactly what failed and why
✅ Fixtures: Share setup code across tests
✅ Parametrization: Run same test with different inputs
✅ Plugins: Extend functionality (coverage, benchmarking, etc.)
✅ CI/CD integration: Works seamlessly with GitHub Actions

Comparison:

# Manual testing (primitive)
def test_something():
    result = function()
    if result != expected:
        print("FAILED")
        return False
    return True

# pytest (modern)
def test_something():
    result = function()
    assert result == expected  # pytest handles the rest!

pytest output example:

$ uv run pytest tests/test_geometry.py -v

============================= test session starts ==============================
platform win32 -- Python 3.12.0, pytest-8.0.0, pluggy-1.4.0
cachedir: .pytest_cache
rootdir: C:\...\road-profile-viewer
collected 5 items

tests/test_geometry.py::test_normal_angle PASSED                        [ 20%]
tests/test_geometry.py::test_vertical_angle PASSED                      [ 40%]
tests/test_geometry.py::test_empty_array FAILED                         [ 60%]
tests/test_geometry.py::test_boundary_angle PASSED                      [ 80%]
tests/test_geometry.py::test_upward_angle PASSED                        [100%]

=================================== FAILURES ===================================
________________________________ test_empty_array ______________________________

    def test_empty_array():
        x_road = np.array([])
        y_road = np.array([])
>       x, y, dist = find_intersection(x_road, y_road, -10.0)
E       IndexError: index 0 is out of bounds for axis 0 with size 0

tests/test_geometry.py:42: IndexError
========================= 1 failed, 4 passed in 0.12s =========================

Notice:

✅ Shows exactly which test failed
✅ Shows the exact line that failed
✅ Shows the error type and message
✅ Shows how many tests passed/failed
✅ Shows execution time
✅ Color-coded output (red for failures, green for passes)

Why this matters:

Without pytest: "Something is broken. Good luck finding it!"
With pytest:    "test_empty_array failed at line 42 with IndexError"

Installation (already done if you have uv):

# pytest is typically listed in your pyproject.toml
$ uv add --dev pytest

# Run tests
$ uv run pytest

Now that we understand WHY pytest exists, let’s write our first test!

5.5 Hands-On: Writing Your First Unit Test

Goal: Test the find_intersection() function.

Step 1: Create test file structure

# In your road-profile-viewer directory
$ mkdir tests
$ touch tests/__init__.py
$ touch tests/test_geometry.py

Step 2: Write a simple test

# tests/test_geometry.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.geometry import find_intersection


def test_find_intersection_finds_intersection_for_normal_angle() -> None:
    """
    Test that find_intersection() returns a valid intersection
    for a normal downward angle with a simple road profile.

    Equivalence class: Normal downward angles (-90° < angle < 0°)
    """
    # Arrange: Create simple road going upward
    x_road: NDArray[np.float64] = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 2, 4, 6], dtype=np.float64)
    angle: float = -10.0  # Downward angle
    camera_x: float = 0.0
    camera_y: float = 10.0  # Camera above road

    # Act: Find intersection
    x, y, dist = find_intersection(
        x_road, y_road, angle, camera_x, camera_y
    )

    # Assert: Should find an intersection
    assert x is not None, "x coordinate should not be None"
    assert y is not None, "y coordinate should not be None"
    assert dist is not None, "distance should not be None"
    assert dist > 0, "distance should be positive"
    assert 0 <= x <= 30, "intersection x should be within road bounds"

Step 3: Run the test

$ uv run pytest tests/test_geometry.py -v
============================= test session starts ==============================
collected 1 item

tests/test_geometry.py::test_find_intersection_finds_intersection_for_normal_angle PASSED [100%]

============================== 1 passed in 0.05s ===============================

✅ Your first unit test passes!

5.6 Clean Code Principles for Testing

Before we dive deeper into equivalence classes, let’s establish some clean code principles for writing maintainable tests. These principles come from Robert C. Martin’s “Clean Code” and are essential for keeping your test suite readable and maintainable.

5.6.1 One Concept Per Test

Principle: Each test should verify ONE concept or behavior, not multiple unrelated things.

Why? When a test fails, you want to immediately know what’s broken. Multi-concept tests make debugging harder.

❌ Bad Example: Testing Multiple Concepts

def test_find_intersection_everything():
    """This test tries to verify too many things at once."""
    # Test downward angle
    x_road = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x is not None  # Concept 1: Downward angle finds intersection

    # Test vertical angle
    x2, y2, dist2 = find_intersection(x_road, y_road, 90.0)
    assert x2 is None  # Concept 2: Vertical angle returns None

    # Test empty arrays
    x3, y3, dist3 = find_intersection(np.array([]), np.array([]), -10.0)
    assert x3 is None  # Concept 3: Empty arrays return None

Problem: If this test fails, which concept is broken? You have to read the entire test to figure it out.

✅ Good Example: One Concept Per Test

def test_find_intersection_returns_intersection_for_downward_angle():
    """Test that downward angles find intersections."""
    x_road = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x is not None

def test_find_intersection_returns_none_for_vertical_angle():
    """Test that vertical angles return None."""
    x_road = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, 90.0)
    assert x is None

def test_find_intersection_returns_none_for_empty_arrays():
    """Test that empty arrays return None."""
    x, y, dist = find_intersection(np.array([]), np.array([]), -10.0)
    assert x is None

Benefits:

✅ Test name tells you exactly what failed
✅ Each test is focused and easy to understand
✅ Easy to run or skip individual concepts
✅ Clear test output when failures occur

5.6.2 One Assert Per Test (With Pragmatic Exceptions)

Principle: Ideally, each test should have ONE assert statement. But there are reasonable exceptions.

When ONE assert is ideal:

def test_find_intersection_returns_positive_distance():
    """Test that distance is always positive when intersection found."""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert dist > 0  # One concept: distance is positive

When MULTIPLE asserts are acceptable:

Multiple asserts are okay when they’re testing different aspects of the same concept:

def test_find_intersection_returns_valid_coordinates():
    """Test that returned coordinates are within expected bounds (same concept)."""
    x_road = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # All these asserts verify the SAME concept: valid coordinate bounds
    assert x is not None, "x should not be None"
    assert y is not None, "y should not be None"
    assert 0 <= x <= 30, f"x should be in [0, 30], got {x}"
    assert 0 <= y <= 6, f"y should be in [0, 6], got {y}"

This is acceptable because:

All asserts test the same concept: “returned coordinates are valid”
They’re related aspects of one behavior
Failure of any assert indicates the same problem: invalid coordinates

When to SPLIT into multiple tests:

If asserts test different concepts, split them:

# ❌ Bad: Two different concepts in one test
def test_find_intersection_coordinates_and_distance():
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x > 0  # Concept 1: x is positive
    assert dist > 0  # Concept 2: distance is positive (different concept!)

# ✅ Good: Split into two tests
def test_find_intersection_returns_positive_x():
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x > 0

def test_find_intersection_returns_positive_distance():
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert dist > 0

5.6.3 Descriptive Test Names

Principle: Test names should describe what is being tested, what scenario, and what the expected result is.

Pattern: test_[function]_[scenario]_[expected_behavior]()

❌ Bad Test Names:

def test_1():  # What does this test?
    pass

def test_intersection():  # Too vague
    pass

def test_angle():  # Which angle? What about it?
    pass

def test_edge_case():  # Which edge case?
    pass

✅ Good Test Names:

def test_find_intersection_returns_none_for_empty_arrays():
    """Clear: function + scenario + expected result"""
    pass

def test_find_intersection_finds_intersection_for_downward_angle():
    """You know exactly what this tests"""
    pass

def test_find_intersection_returns_none_when_ray_misses_road():
    """Scenario and outcome are explicit"""
    pass

def test_find_intersection_handles_vertical_angle_gracefully():
    """Clear edge case handling"""
    pass

Why this matters:

Test output when failure occurs:

❌ FAILED test_1
   # What does this tell you? Nothing!

✅ FAILED test_find_intersection_returns_none_for_empty_arrays
   # Immediately clear: empty array handling is broken

Real-world example from find_intersection:

class TestFindIntersection:
    """Test suite for find_intersection() function."""

    def test_find_intersection_returns_intersection_for_downward_angle(self):
        """Test normal downward angle finds intersection."""
        pass

    def test_find_intersection_returns_none_for_vertical_angle(self):
        """Test vertical angle (90°) returns None."""
        pass

    def test_find_intersection_returns_none_for_empty_road_arrays(self):
        """Test empty arrays return None gracefully."""
        pass

    def test_find_intersection_returns_none_for_single_point_road(self):
        """Test single-point road (no segments) returns None."""
        pass

    def test_find_intersection_finds_intersection_on_first_segment(self):
        """Test intersection detection on first road segment."""
        pass

When you run these tests:

$ uv run pytest tests/test_geometry.py -v
tests/test_geometry.py::TestFindIntersection::test_find_intersection_returns_intersection_for_downward_angle PASSED
tests/test_geometry.py::TestFindIntersection::test_find_intersection_returns_none_for_vertical_angle PASSED
tests/test_geometry.py::TestFindIntersection::test_find_intersection_returns_none_for_empty_road_arrays FAILED

You immediately know: “Empty array handling is broken” without reading any code!

5.6.4 Summary of Clean Code Principles for Testing

Principle	Guideline	Why It Matters
One Concept Per Test	Each test verifies ONE behavior/concept	When test fails, you immediately know what's broken
One Assert Per Test	Ideally one assert; multiple okay if testing same concept	Failures are unambiguous; clear what went wrong
Descriptive Names	`test_[function]_[scenario]_[expected]()`	Test output is self-documenting; no need to read code
AAA Pattern	Arrange-Act-Assert (from Section 5.2)	Tests are readable and consistent

These principles work together:

# ✅ Perfect test: One concept, one assert, descriptive name, AAA pattern
def test_find_intersection_returns_positive_distance_for_valid_intersection():
    """Test that distance from camera to intersection is always positive."""
    # Arrange
    x_road = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road = np.array([0, 2, 4, 6], dtype=np.float64)

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert
    assert dist > 0, f"Expected positive distance, got {dist}"

Benefits of following these principles:

✅ Maintainability: Easy to update tests when code changes
✅ Debugging: Failures immediately point to the problem
✅ Documentation: Test names explain what the code should do
✅ Confidence: Clear tests give you confidence to refactor

6. What’s Next?

We’ve covered the fundamentals:

✅ The testing pyramid and why unit tests form the base
✅ The AAA pattern for structuring tests
✅ pytest framework basics
✅ Clean code principles for tests

But there’s a critical question we haven’t answered: How do you decide WHICH test cases to write?

You can’t test everything exhaustively. So how do you choose wisely?

Continue to Chapter 03 (Equivalence Classes): Equivalence Classes to learn advanced test design techniques.