
03 Testing Fundamentals: Boundary Analysis and LLM-Assisted Testing

lecture testing pytest unit-testing test-pyramid equivalence-classes boundary-testing llm-assisted-development

1. Introduction: Welcome Back to Testing - From Theory to Practice

Welcome back! In Part 1 of Chapter 03 (Testing Fundamentals), you learned the fundamentals of software testing:

What We Covered in Part 1:

  1. Why Testing Matters
    • Code quality (Ruff, Pyright) ≠ Code correctness (tests)
    • CI can be green, but your code can still have critical bugs
    • Tests catch logic errors that static analysis tools miss
  2. The Testing Pyramid
    • 70% Unit tests (fast, focused, cheap)
    • 20% Integration tests (moderate speed)
    • 10% E2E tests (slow, expensive)
    • Not the inverted “test cone” anti-pattern!
  3. Unit Testing Fundamentals
    • AAA Pattern: Arrange-Act-Assert
    • Why pytest is the industry standard
    • Writing your first unit test
    • The impossibility of exhaustive testing
  4. Clean Code Principles for Testing
    • One concept per test
    • One assert per test (with pragmatic exceptions)
    • Descriptive test names that tell you what’s broken
  5. Equivalence Classes
    • Simple floats → 5 equivalence classes
    • Multiple parameters → 10+ classes (complexity explosion!)
    • Array inputs → Structural + Value dimensions
    • Complex functions like find_intersection() → 600+ potential combinations!

The Challenge We Left You With:

You now know WHY testing matters and HOW to write basic unit tests. But you're probably wondering: how do you choose the right test inputs without writing hundreds of tests for every equivalence-class combination?

Today’s Mission: From Testing Basics to Testing Mastery

In Part 2, we’re going to solve these problems with three powerful techniques:

  1. Boundary Value Analysis - Where bugs actually hide (not in the middle of equivalence classes!)
  2. LLM-Assisted Testing - Break the “test cone” by using AI for boilerplate (while keeping human oversight)
  3. Integration Testing & Workflow - Make testing a natural part of your development process

What Makes Part 2 Different:

Part 1 was conceptual - understanding what tests are and why they matter.

Part 2 is practical - learning specific techniques to write better tests faster, and integrating testing into your real workflow.

By the end of today's lecture, you'll be able to pinpoint the boundary values where bugs actually hide, use LLMs to generate test boilerplate without surrendering oversight, and make testing a routine part of your workflow.

The Road Ahead:

✅ Part 1: Foundation (Why test? How to write basic unit tests?)
→ Part 2: Mastery (Where do bugs hide? How to test efficiently?)
→ Up next (TDD & CI): Write tests FIRST, automate everything

Let’s dive into the techniques that separate beginners from professionals: boundary value analysis and LLM-assisted testing!


2. Boundary Value Analysis: Where Bugs Hide

Observation: Bugs often lurk at the boundaries between equivalence classes.

Why boundaries? Off-by-one errors, floating-point precision, edge cases in conditional logic.
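A tiny, hypothetical example of the first failure mode, an off-by-one comparison (`is_passing` and the pass mark of 50 are invented for illustration). Values from the middle of each equivalence class behave correctly; only the exact boundary value exposes the bug.

```python
def is_passing(score: int) -> bool:
    """Hypothetical example: passing requires a score of at least 50."""
    return score > 50  # BUG: should be score >= 50


# Mid-class values look fine...
assert is_passing(75) is True
assert is_passing(20) is False

# ...but the boundary value 50 exposes the off-by-one error:
print(is_passing(50))  # the spec says True, but this prints False
```

A test suite that only samples the middle of each class would never catch this.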

Let’s apply boundary analysis to all our examples.


2.1 Boundaries for Simple Float Example: reciprocal(x)

Function: reciprocal(x) = 1/x
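The tests in this section assume a minimal implementation along these lines (a sketch; a real version might add input validation):

```python
def reciprocal(x: float) -> float:
    """Return 1/x. Python raises ZeroDivisionError when x == 0.0."""
    return 1.0 / x
```
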

Equivalence Classes and Their Boundaries:

| Equivalence Class | Boundary Values to Test |
|---|---|
| Positive numbers | x = 0.0001 (near zero), x = 1.0, x = 1000000.0 (very large) |
| Negative numbers | x = -0.0001 (near zero), x = -1.0, x = -1000000.0 |
| Zero | x = 0.0, x = 1e-100 (extremely close to zero) |
| Extreme boundaries | sys.float_info.max, sys.float_info.min, float('inf'), float('nan') |

Python Float Boundaries (Similar to C/C++ FLT_MAX, DBL_MIN):

Python floats are 64-bit IEEE 754 doubles.

What is IEEE 754?

IEEE 754 is the international standard for floating-point arithmetic, defining how computers represent and compute with real numbers. It specifies the binary formats (binary32 "single", binary64 "double"), special values (±infinity, NaN, signed zero), subnormal numbers, and rounding rules.

Why this matters for testing: Understanding IEEE 754 boundaries helps you write tests for edge cases that occur at the limits of numerical precision. All modern programming languages (Python, C, C++, Java, JavaScript) follow this standard.

Official References:

Python provides the sys.float_info module to access platform-specific float limits defined by IEEE 754:

| Python Constant | C/C++ Equivalent | Typical Value | Description |
|---|---|---|---|
| sys.float_info.max | DBL_MAX | 1.7976931348623157e+308 | Maximum representable finite float |
| sys.float_info.min | DBL_MIN | 2.2250738585072014e-308 | Minimum positive normalized float |
| sys.float_info.epsilon | DBL_EPSILON | 2.220446049250313e-16 | Difference between 1.0 and the next representable float |
| math.inf or float('inf') | INFINITY | inf | Positive infinity (non-finite value) |
| -math.inf or float('-inf') | -INFINITY | -inf | Negative infinity (non-finite value) |
| math.nan or float('nan') | NAN | nan | Not a Number (undefined result) |
| math.ulp(0.0) | DBL_TRUE_MIN (C11) | ≈ 5e-324 | Smallest positive subnormal (denormalized) float |
| math.ulp(x) | — | Varies | Unit in the Last Place (spacing to next representable float) |
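A quick sanity check of what epsilon means in practice: it is the smallest step that is still visible when added to 1.0.

```python
import sys

eps = sys.float_info.epsilon

print(1.0 + eps != 1.0)      # True: eps is the smallest visible step at 1.0
print(1.0 + eps / 2 == 1.0)  # True: half an epsilon rounds away entirely
```
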

Critical Distinction: math.inf vs. sys.float_info.max

These are fundamentally different and test different aspects of your code:

  1. math.inf (or float('inf')) - IEEE-754 Positive Infinity
    • Not a finite number: a reserved IEEE-754 bit pattern, not an ordinary mantissa/exponent value
    • A special value that compares larger than every finite float
    • Propagates through many operations: inf + 1 = inf, inf * 2 = inf
    • Result of overflow: sys.float_info.max * 2 = inf
    • Use to test: How your code handles infinity (sentinel logic, clamping, division by zero results)
  2. sys.float_info.max (≈ 1.8e308) - Largest Finite Float
    • The largest finite IEEE-754 double
    • The “upper boundary” before you overflow to infinity
    • Use to test: Finite range limits (overflow/underflow edges)

Example illustrating the difference:

import sys
import math

# These are NOT the same!
print(math.inf > sys.float_info.max)  # True - infinity is bigger than any finite
print(sys.float_info.max * 2)         # inf - overflow!
print(1 / math.inf)                   # 0.0 - reciprocal of infinity
print(1 / sys.float_info.max)         # ≈ 5.6e-309 - tiny but finite!


Other Useful Finite Boundaries (Often Overlooked):

| Boundary Type | Python Expression | Typical Value | When to Test |
|---|---|---|---|
| Smallest positive normalized | sys.float_info.min | ≈ 2.2e-308 | Testing underflow thresholds, relative error near zero |
| Smallest positive subnormal | math.ulp(0.0) | ≈ 5e-324 | Catching denormal handling, flush-to-zero surprises |
| Unit in Last Place (ULP) | math.ulp(x) | Varies by x | Checking "next representable" values in precision-sensitive code |
| Next float toward direction | math.nextafter(x, y) | Varies | Razor-edge boundaries (e.g., stepping from max to infinity) |
| Most negative finite | -sys.float_info.max | ≈ -1.8e308 | Lower bound before overflow to -inf |
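These helpers (math.ulp and math.nextafter, available since Python 3.9) are easy to sanity-check:

```python
import math
import sys

# ulp at 1.0 is exactly machine epsilon
print(math.ulp(1.0) == sys.float_info.epsilon)          # True

# nextafter steps to the adjacent representable float
print(math.nextafter(1.0, 2.0) - 1.0 == math.ulp(1.0))  # True

# one step past the largest finite float IS infinity
print(math.nextafter(sys.float_info.max, math.inf))     # inf
```
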

Practical Guidance: Which Should I Use in Tests?

To test infinity handling (non-finite values):

import math

# Use math.inf (preferred for readability)
def test_reciprocal_handles_infinity():
    result = reciprocal(math.inf)
    assert result == 0.0  # 1/inf = 0

To test finite range limits (overflow/underflow edges):

import sys

def test_reciprocal_handles_max_float():
    result = reciprocal(sys.float_info.max)
    # Reciprocal of largest finite → extremely small
    assert result > 0
    assert result < sys.float_info.min  # Might underflow to subnormal

To test precision limits (ULP testing):

import math

def test_reciprocal_precision_at_boundary():
    # Test the "next" representable value after 1.0
    x = math.nextafter(1.0, 2.0)  # Slightly larger than 1.0
    result = reciprocal(x)
    # Should be slightly less than 1.0
    assert result < 1.0

Tip for NumPy Users:

If your code uses NumPy arrays, use np.finfo() for consistency with the dtype under test:

import numpy as np

# For float64 (Python float equivalent)
fi = np.finfo(np.float64)
print(fi.max)        # Largest finite (≈ 1.8e308)
print(fi.tiny)       # Smallest positive normalized (≈ 2.2e-308)
print(fi.eps)        # Machine epsilon (≈ 2.2e-16)

# Use with nextafter
x_next_to_max = np.nextafter(fi.max, np.inf)  # → inf

# For float32 (if working with single precision)
fi32 = np.finfo(np.float32)
print(fi32.max)      # ≈ 3.4e38 (much smaller than float64!)

This keeps everything consistent with the NumPy dtype you’re actually using.

Comprehensive Boundary Test Suite:

import sys
import math
import pytest

# Define constants for readability (following best practices)
INF = math.inf
NINF = -math.inf
NAN = math.nan
FMAX = sys.float_info.max
FMIN_NORM_POS = sys.float_info.min
FMIN_SUB_POS = math.ulp(0.0)
EPSILON = sys.float_info.epsilon


class TestReciprocalBoundaries:
    """Comprehensive boundary tests for reciprocal(x) function."""

    # BASIC BOUNDARIES (near zero, typical values)

    def test_reciprocal_boundary_near_zero_positive(self):
        """Boundary: Very small positive number"""
        result = reciprocal(1e-6)  # 0.000001
        assert result == pytest.approx(1e6, rel=1e-9)  # Should be 1000000

    def test_reciprocal_boundary_near_zero_negative(self):
        """Boundary: Very small negative number"""
        result = reciprocal(-1e-6)
        assert result == pytest.approx(-1e6, rel=1e-9)

    def test_reciprocal_boundary_exactly_zero(self):
        """Boundary: Exactly zero (should raise exception)"""
        with pytest.raises(ZeroDivisionError):
            reciprocal(0.0)

    def test_reciprocal_boundary_very_large_positive(self):
        """Boundary: Very large positive number"""
        result = reciprocal(1e10)
        assert result == pytest.approx(1e-10, rel=1e-9)

    # FINITE RANGE BOUNDARIES (sys.float_info limits)

    def test_reciprocal_boundary_max_finite_float(self):
        """Boundary: Largest finite float (sys.float_info.max ≈ 1.8e308)

        Tests overflow prevention at upper finite boundary.
        Reciprocal of max finite → extremely small result (may underflow to subnormal).
        """
        result = reciprocal(FMAX)
        assert result > 0, "Reciprocal of max float should be positive"
        assert result < FMIN_NORM_POS, "Result underflows below normalized range (subnormal)"
        # Result is approximately 5.6e-309 (subnormal/denormalized)

    def test_reciprocal_boundary_min_normalized_float(self):
        """Boundary: Smallest positive normalized float (sys.float_info.min ≈ 2.2e-308)

        Reciprocal of min normalized ≈ 4.5e307: huge, but still FINITE.
        1/x only overflows to infinity once x drops below ≈ 5.6e-309
        (the subnormal range near 1/sys.float_info.max).
        """
        result = reciprocal(FMIN_NORM_POS)
        assert math.isfinite(result), "1/min normalized is large but still finite"
        assert result == pytest.approx(1.0 / FMIN_NORM_POS)

    def test_reciprocal_boundary_min_subnormal_float(self):
        """Boundary: Smallest positive subnormal float (math.ulp(0.0) ≈ 5e-324)

        Tests denormalized number handling.
        Reciprocal of smallest representable → massive overflow to infinity.
        """
        result = reciprocal(FMIN_SUB_POS)
        assert result == INF, "Reciprocal of min subnormal overflows to infinity"

    def test_reciprocal_boundary_epsilon(self):
        """Boundary: Machine epsilon (sys.float_info.epsilon ≈ 2.2e-16)

        Tests precision limit behavior.
        Epsilon is smallest value where 1.0 + epsilon != 1.0
        """
        result = reciprocal(EPSILON)
        expected = 1.0 / EPSILON  # ≈ 4.5e15
        assert result == pytest.approx(expected, rel=1e-9)

    def test_reciprocal_boundary_most_negative_finite(self):
        """Boundary: Most negative finite float (-sys.float_info.max ≈ -1.8e308)

        Tests lower finite boundary (there is no sys.float_info.min_negative).
        """
        result = reciprocal(-FMAX)
        assert result < 0, "Reciprocal of most negative finite should be negative"
        assert result > -FMIN_NORM_POS, "Result is tiny negative (subnormal range)"

    # NON-FINITE BOUNDARIES (infinities and NaN)

    def test_reciprocal_boundary_positive_infinity(self):
        """Boundary: Positive infinity (non-finite value)

        Tests special value handling (not overflow prevention).
        1 / inf = 0 (mathematical limit)
        """
        result = reciprocal(INF)
        assert result == 0.0, "Reciprocal of positive infinity should be zero"

    def test_reciprocal_boundary_negative_infinity(self):
        """Boundary: Negative infinity (non-finite value)

        1 / -inf = -0 (negative zero)
        """
        result = reciprocal(NINF)
        # Python treats 0.0 == -0.0 as True (IEEE 754 behavior)
        assert result == 0.0 or result == -0.0, "Reciprocal of -inf should be zero"

    def test_reciprocal_boundary_nan(self):
        """Boundary: Not a Number (undefined/invalid input)

        NaN propagates through operations (doesn't crash).
        Important: nan == nan is always False! Use math.isnan().
        """
        result = reciprocal(NAN)
        assert math.isnan(result), "Reciprocal of NaN should be NaN (propagates)"

    # RAZOR-EDGE BOUNDARIES (using math.nextafter)

    def test_reciprocal_boundary_next_after_max_toward_inf(self):
        """Boundary: Next float after max (toward infinity)

        Uses math.nextafter() to test the exact transition point.
        Next float after FMAX toward infinity IS infinity.
        """
        x_after_max = math.nextafter(FMAX, INF)
        assert x_after_max == INF, "Next float after max toward inf is infinity"
        result = reciprocal(x_after_max)
        assert result == 0.0, "Reciprocal of inf is 0"

    def test_reciprocal_boundary_next_after_one_toward_two(self):
        """Boundary: Next representable float after 1.0

        Tests ULP (Unit in Last Place) sensitivity.
        1.0 + epsilon is the next representable value after 1.0
        """
        x_just_above_one = math.nextafter(1.0, 2.0)
        result = reciprocal(x_just_above_one)
        # Reciprocal should be slightly less than 1.0
        assert result < 1.0, f"Expected < 1.0, got {result}"
        assert result == pytest.approx(1.0, rel=1e-14), "Should be very close to 1.0"

    def test_reciprocal_boundary_ulp_sensitivity(self):
        """Boundary: Testing ULP (Unit in Last Place) around a value

        Demonstrates floating-point granularity.
        """
        x = 1000.0
        ulp_at_x = math.ulp(x)  # Spacing at 1000.0

        result1 = reciprocal(x)
        result2 = reciprocal(x + ulp_at_x)  # Next representable value

        assert result1 != result2, "Adjacent floats should produce different reciprocals"
        assert result1 > result2, "Larger input → smaller reciprocal"

Introducing pytest’s @pytest.mark.parametrize - Avoid Test Duplication

So far, we’ve written separate test functions for each boundary. But what if you need to test the same logic with multiple input values?

The Problem: Repetitive Tests

# ❌ TEDIOUS: Writing 6 nearly-identical tests
def test_reciprocal_max_float():
    result = reciprocal(FMAX)
    assert result > 0 and result < FMIN_NORM_POS

def test_reciprocal_negative_max_float():
    result = reciprocal(-FMAX)
    assert result < 0 and result > -FMIN_NORM_POS

def test_reciprocal_min_norm_float():
    result = reciprocal(FMIN_NORM_POS)
    assert result == INF

# ... 3 more similar tests ...

Problems with this approach:

The Solution: Parametrized Tests

pytest provides @pytest.mark.parametrize to run the same test function with different input data:

# ✅ CLEAN: One test function, multiple inputs
@pytest.mark.parametrize("x, expected_behavior", [
    (FMAX, "underflow to subnormal"),
    (-FMAX, "underflow to negative subnormal"),
    (FMIN_NORM_POS, "large but finite"),
    (FMIN_SUB_POS, "overflow to infinity"),
    (INF, "exact zero"),
    (NINF, "exact zero"),
])
def test_reciprocal_extreme_boundaries_summary(x, expected_behavior):
    """Parametrized test covering all extreme boundaries with descriptions."""
    result = reciprocal(x)

    if "subnormal" in expected_behavior:
        assert math.isfinite(result), f"{expected_behavior}: result should be finite"
        assert abs(result) < FMIN_NORM_POS, f"{expected_behavior}: should be in subnormal range"
    elif "large but finite" in expected_behavior:
        # 1/sys.float_info.min ≈ 4.5e307: huge, but still below FMAX (no overflow)
        assert math.isfinite(result) and abs(result) > 1e300, f"{expected_behavior}: huge but finite"
    elif "overflow to infinity" in expected_behavior:
        assert result == INF, f"{expected_behavior}: should overflow to infinity"
    elif "exact zero" in expected_behavior:
        assert result == 0.0, f"{expected_behavior}: should be exactly zero"

How it works:

  1. Decorator syntax: @pytest.mark.parametrize("param1, param2", [...])
    • First argument: parameter names (comma-separated string)
    • Second argument: list of tuples (one tuple per test case)
  2. Test function receives parameters: def test_function(param1, param2):
    • pytest injects each tuple’s values into the function
  3. pytest runs the test multiple times: Once per tuple in the list
    • Each run is treated as a separate test
    • Test output shows which combination passed/failed

Example pytest output:

$ pytest test_boundaries.py::test_reciprocal_extreme_boundaries_summary -v

test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMAX-underflow to subnormal] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[-FMAX-underflow to negative subnormal] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMIN_NORM_POS-large but finite] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMIN_SUB_POS-overflow to infinity] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[INF-exact zero] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[NINF-exact zero] PASSED

================================== 6 passed in 0.03s ==================================
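One refinement worth knowing: by default pytest derives each test ID from the parameter's repr (a huge float literal, not a name like FMAX). To get readable IDs, attach them explicitly with pytest.param. A sketch, reusing the 1/x reciprocal from earlier:

```python
import math
import sys

import pytest


def reciprocal(x: float) -> float:
    """Function under test (same 1/x sketch as earlier in the lecture)."""
    return 1.0 / x


@pytest.mark.parametrize("x, expected", [
    pytest.param(math.inf, 0.0, id="positive-infinity"),
    pytest.param(-math.inf, 0.0, id="negative-infinity"),
    pytest.param(sys.float_info.max, 1.0 / sys.float_info.max, id="max-finite"),
])
def test_reciprocal_with_readable_ids(x, expected):
    assert reciprocal(x) == pytest.approx(expected)
```

With explicit ids, failures read as `test_reciprocal_with_readable_ids[max-finite]` instead of a 300-digit float.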

Key benefits:

  • One test function covers many inputs; adding a case is a one-line change
  • Each parameter set runs (and reports) as a separate test
  • No copy-paste drift between near-identical test bodies

When to use parametrize:

DO use when:

  • The same assertions apply to every input (only the data changes)
  • You are sweeping many boundary values or equivalence classes

DON'T use when:

  • Each case needs different setup, mocking, or assertions
  • There are only one or two cases (a plain, well-named test reads better)

More examples:

# Testing multiple equivalence classes with same validation
@pytest.mark.parametrize("angle", [-10, -20, -30, -45])
def test_find_intersection_downward_angles_all_succeed(angle):
    """Test that all downward angles find intersection"""
    x, y, dist = find_intersection(x_road, y_road, angle)
    assert x is not None
    assert dist > 0

# Testing boundary values with expected results
@pytest.mark.parametrize("x, expected", [
    (1.0, 1.0),
    (2.0, 0.5),
    (0.5, 2.0),
    (10.0, 0.1),
])
def test_reciprocal_normal_values(x, expected):
    """Test reciprocal with normal values"""
    result = reciprocal(x)
    assert result == pytest.approx(expected, rel=1e-9)


Why Test These Extreme Boundaries?

| Boundary Type | What It Tests | Real-World Scenario |
|---|---|---|
| sys.float_info.max | Finite overflow prevention | Astronomical distances, cosmology calculations |
| sys.float_info.min | Huge-but-finite results, underflow thresholds | Quantum physics, particle simulations |
| math.ulp(0.0) | Subnormal/denormal handling | High-precision scientific computing |
| sys.float_info.epsilon | Precision limits near 1.0 | Financial calculations, numerical stability |
| math.inf | Non-finite value handling | Division by zero results, sentinel values |
| math.nan | Invalid input propagation | Missing data in datasets, undefined operations |
| math.nextafter() | Razor-edge transitions | Verifying exact boundary behavior |

Key Insights:

  • Finite boundaries (FMAX, FMIN_NORM_POS) and non-finite values (inf, NaN) exercise different code paths - test both
  • nan == nan is always False; use math.isnan() in assertions
  • Overflow is asymmetric: 1/FMIN_SUB_POS overflows to infinity, but 1/FMIN_NORM_POS stays finite

Common Pitfall:

# ❌ WRONG: Testing only infinity
def test_reciprocal_large_values():
    result = reciprocal(float('inf'))
    assert result == 0.0

# Problem: This doesn't test finite overflow (sys.float_info.max)!
# A function could handle inf correctly but crash on FMAX.
# ✅ CORRECT: Test BOTH finite and non-finite boundaries
def test_reciprocal_max_finite():
    result = reciprocal(sys.float_info.max)  # Finite
    assert result < sys.float_info.min  # Underflow behavior

def test_reciprocal_infinity():
    result = reciprocal(math.inf)  # Non-finite
    assert result == 0.0  # Special value handling

2.2 Boundaries for Multi-Parameter: reciprocal_sum(x, y, z)

Function: reciprocal_sum(x, y, z) = 1/(x+y+z)
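As with reciprocal(x), the tests below assume a minimal sketch:

```python
def reciprocal_sum(x: float, y: float, z: float) -> float:
    """Return 1/(x + y + z); Python raises ZeroDivisionError when the sum is exactly 0.0."""
    return 1.0 / (x + y + z)
```
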

Boundaries to test:

| Boundary Condition | Test Values | Why Important |
|---|---|---|
| Sum exactly zero | (1.0, -0.5, -0.5) | Division by zero |
| Sum near zero (positive) | (1.0, -0.9999, 0.0) | Large positive result |
| Sum near zero (negative) | (-1.0, 0.9999, 0.0) | Large negative result |
| One parameter zero | (0.0, 1.0, 1.0) | Doesn't affect sum |
| Two parameters zero | (0.0, 0.0, 2.0) | Only one matters |
| All parameters zero | (0.0, 0.0, 0.0) | Division by zero |

Boundary Tests:

def test_reciprocal_sum_boundary_sum_exactly_zero():
    """Boundary: Sum cancels exactly to zero"""
    with pytest.raises(ZeroDivisionError):
        reciprocal_sum(1.0, -0.5, -0.5)

def test_reciprocal_sum_boundary_sum_near_zero_positive():
    """Boundary: Sum is very small positive"""
    result = reciprocal_sum(1.0, -0.9999, 0.0)  # sum = 0.0001
    assert result == pytest.approx(10000.0, rel=1e-5)

def test_reciprocal_sum_boundary_all_zeros():
    """Boundary: All parameters zero"""
    with pytest.raises(ZeroDivisionError):
        reciprocal_sum(0.0, 0.0, 0.0)

def test_reciprocal_sum_boundary_two_zeros():
    """Boundary: Two parameters zero, one non-zero"""
    result = reciprocal_sum(0.0, 0.0, 0.5)  # sum = 0.5
    assert result == pytest.approx(2.0, rel=1e-10)

2.3 Boundaries for Array Example: array_sum(arr)

Function: array_sum(arr) - sum of array elements
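The tests below assume array_sum validates its input. A minimal sketch (the ValueError on empty input is an assumption these examples rely on; np.sum would otherwise return 0.0):

```python
import numpy as np


def array_sum(arr: np.ndarray) -> float:
    """Sum of array elements; rejects empty input instead of silently returning 0.0."""
    if len(arr) == 0:
        raise ValueError("array_sum() requires a non-empty array")
    return float(np.sum(arr))
```
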

Structural Boundaries:

| Boundary | Test Case |
|---|---|
| Empty array | len(arr) = 0 |
| Single element | len(arr) = 1 |
| Two elements | len(arr) = 2 (smallest non-trivial case) |

Value Boundaries:

| Boundary | Test Case |
|---|---|
| All zeros | [0.0, 0.0, 0.0] |
| Contains one zero | [0.0, 1.0, 2.0] |
| All same value | [5.0, 5.0, 5.0] |
| Alternating signs near zero | [1e-10, -1e-10, 1e-10] |

Boundary Tests:

def test_array_sum_boundary_empty_array():
    """Boundary: Empty array (length = 0)"""
    with pytest.raises(ValueError):
        array_sum(np.array([]))

def test_array_sum_boundary_single_element():
    """Boundary: Single element (length = 1)"""
    result = array_sum(np.array([5.0]))
    assert result == 5.0

def test_array_sum_boundary_two_elements():
    """Boundary: Two elements (minimum non-trivial)"""
    result = array_sum(np.array([3.0, 4.0]))
    assert result == 7.0

def test_array_sum_boundary_all_zeros():
    """Boundary: All elements are zero"""
    result = array_sum(np.array([0.0, 0.0, 0.0]))
    assert result == 0.0

def test_array_sum_boundary_alternating_near_zero():
    """Boundary: Values near zero with alternating signs"""
    result = array_sum(np.array([1e-10, -1e-10, 1e-10]))
    assert abs(result - 1e-10) < 1e-15  # Sum should be close to 1e-10

2.3.1 What About Large Arrays?

Great question! You might be wondering: “We tested 0, 1, 2 elements… but what about 1000 or 1,000,000 elements?”

The Short Answer: Unit tests should generally use small, representative arrays (typically 10-100 elements), not massive ones. Large arrays belong in performance tests, not unit tests.

Why Not Test with Huge Arrays in Unit Tests?

| Problem | Impact | Example |
|---|---|---|
| Slow tests | Unit tests should run in milliseconds; large arrays slow them to seconds | Array of 1 million → 100x slower test suite |
| Doesn't find more bugs | If your code works with 10 elements, it usually works with 10,000 | Summation logic doesn't change with size |
| Noise in test output | Hard to debug failures with massive data | "Array mismatch at index 47,293" - good luck! |
| Memory usage | CI servers might run out of memory | 10 tests × 1M floats × 8 bytes = 80 MB per test run |

When DO You Need Large Arrays?

DO test large arrays when:

  1. Algorithm complexity matters (e.g., O(n²) vs O(n))
    def test_sort_performance_scales_linearly():
        """Verify sorting doesn't degrade to O(n²)"""
        import time
    
        # Small array
        arr_small = np.random.rand(100)
        start = time.time()
        sort_function(arr_small)
        time_small = time.time() - start
    
        # Large array (10x bigger)
        arr_large = np.random.rand(1000)
        start = time.time()
        sort_function(arr_large)
        time_large = time.time() - start
    
        # O(n log n) should be ~15x slower (10 * log(1000)/log(100) = 15),
        # NOT ~100x slower (which would indicate O(n²))
        assert time_large < time_small * 20, "Sorting appears to be O(n²)!"
    
  2. Memory allocation bugs (buffer overflows, off-by-one at specific sizes)
    def test_array_processing_handles_power_of_two_sizes():
        """Test sizes like 256, 512, 1024 (common buffer boundaries)"""
        for size in [256, 512, 1024, 2048]:
            arr = np.random.rand(size)
            result = process_array(arr)
            assert len(result) == size, f"Failed at size {size}"
    
  3. Regression test for a specific bug that only appeared with large data
    def test_regression_issue_42_overflow_at_10000_elements():
        """Regression: Integer overflow occurred at exactly 10,000 elements
    
        Bug report: https://github.com/yourproject/issues/42
        Fixed by switching from int32 to int64 accumulator
        """
        arr = np.ones(10_000)  # Specific size that caused the bug
        result = array_sum(arr)
        assert result == 10_000.0, "Overflow regression detected!"
    

Best Practice: Separate Performance Tests

# tests/test_geometry.py (unit tests - FAST)
def test_find_intersection_basic():
    """Unit test: Small representative array"""
    x_road = np.array([0, 10, 20])  # Only 3 points
    y_road = np.array([0, 2, 4])
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None

# tests/test_geometry_performance.py (performance tests - SLOW)
@pytest.mark.slow  # Mark as slow so we can skip during development
def test_find_intersection_large_road():
    """Performance test: Realistic large road dataset"""
    x_road = np.linspace(0, 1000, 10_000)  # 10,000 points
    y_road = np.sin(x_road / 10) * 5

    import time
    start = time.time()
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    elapsed = time.time() - start

    assert x is not None
    assert elapsed < 0.1, f"Too slow: {elapsed:.3f}s for 10k points"

Run performance tests separately:

# Normal development: skip slow tests (runs in seconds)
$ pytest tests/ -v -m "not slow"

# Before merging PR: run all tests, including the slow ones (runs in minutes)
$ pytest tests/ -v

# Just performance tests
$ pytest tests/test_geometry_performance.py -v
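For the @pytest.mark.slow marker to work cleanly with -m selection (and not trigger "unknown marker" warnings), register it once, e.g. in pytest.ini (a minimal sketch):

```ini
# pytest.ini
[pytest]
markers =
    slow: marks tests as slow (deselect with -m "not slow")
```
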

Property-Based Testing with Hypothesis (Advanced)

For thorough testing with varying sizes, use Hypothesis:

import numpy as np
import pytest
from hypothesis import given, strategies as st

@given(st.lists(st.floats(allow_nan=False, allow_infinity=False),
                min_size=0, max_size=100))
def test_array_sum_any_valid_array(arr):
    """Property: array_sum should never crash on valid float arrays

    Hypothesis will automatically generate hundreds of test cases
    with different array sizes and values.
    """
    if len(arr) == 0:
        with pytest.raises(ValueError):
            array_sum(np.array(arr))
    else:
        result = array_sum(np.array(arr))
        assert isinstance(result, (int, float, np.number))

How Large is “Large Enough” for Unit Tests?

Rule of thumb:

| Array Size | When to Use | Example |
|---|---|---|
| 0-2 elements | Boundary testing | Empty, single, pair (edge cases) |
| 3-10 elements | Unit tests (most common) | Enough to test logic without noise |
| 100-1000 elements | Realistic scenario tests | Typical real-world data size |
| 10,000+ elements | Performance tests (separate suite) | Algorithm complexity, memory usage |
| 1,000,000+ elements | Stress tests (rare, CI only) | Production-scale data validation |

Example: Right-Sized Unit Tests

# ✅ GOOD: Representative sizes for unit tests
def test_find_intersection_typical_road():
    """Test with typical road size (~100 points)"""
    x_road = np.linspace(0, 80, 100)  # Realistic: 100 points over 80 meters
    y_road = generate_road_profile(num_points=100)

    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None  # Fast test, runs in ~1ms

# ❌ BAD: Unnecessarily large for a unit test
def test_find_intersection_huge_road():
    """This belongs in performance tests, not unit tests"""
    x_road = np.linspace(0, 10000, 1_000_000)  # Overkill!
    y_road = np.random.rand(1_000_000)

    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None  # Slow test, runs in ~500ms

Summary: Testing Array Sizes

Key principle: Unit tests verify correctness, performance tests verify speed. Keep them separate!


2.4 Boundaries for Complex Array Example: find_intersection()

Function: find_intersection(x_road, y_road, angle_degrees, camera_x, camera_y)

Array Length Boundaries:

| Boundary | Test Case | Why Important |
|---|---|---|
| len(x_road) = 0 | Empty arrays | No road to intersect |
| len(x_road) = 1 | Single point | Can't form a segment |
| len(x_road) = 2 | Two points | Minimum valid road (one segment) |
| len(x_road) ≠ len(y_road) | Mismatched lengths | Invalid input |

Angle Boundaries:

| Boundary | Test Values | Why Important |
|---|---|---|
| Exactly -90° | -90.0 | Vertical downward (tan = ∞) |
| Near -90° | -89.9, -89.999 | Near vertical |
| Exactly 0° | 0.0 | Horizontal ray |
| Near 0° | -0.1, 0.1 | Nearly horizontal |
| Exactly 90° | 90.0 | Vertical upward (tan = ∞) |
| Near 90° | 89.9, 89.999 | Near vertical |

Camera Position Boundaries:

| Boundary | Test Case | Why Important |
|---|---|---|
| camera_x = x_road[0] | Camera at road start | Edge of road |
| camera_x = x_road[-1] | Camera at road end | Edge of road |
| camera_y = y_road[i] | Camera at road level | Might be tangent |
| camera_y = min(y_road) | Camera at lowest point | Boundary case |
| camera_y = max(y_road) | Camera at highest point | Boundary case |

Intersection Position Boundaries:

| Boundary | Test Case | Why Important |
|---|---|---|
| Intersection at x_road[0] | Ray hits first point exactly | Endpoint handling |
| Intersection at x_road[-1] | Ray hits last point exactly | Endpoint handling |
| Intersection between two segments | Ray hits at segment boundary | Interpolation edge case |

Comprehensive Boundary Tests:

class TestFindIntersectionBoundaries:
    """Boundary value tests for find_intersection()"""

    # ARRAY LENGTH BOUNDARIES

    def test_boundary_empty_arrays(self):
        """Boundary: Empty road arrays (len = 0)"""
        x, y, dist = find_intersection(np.array([]), np.array([]), -10.0)
        assert x is None

    def test_boundary_single_point(self):
        """Boundary: Single point (len = 1)"""
        x, y, dist = find_intersection(np.array([5.0]), np.array([2.0]), -10.0)
        assert x is None  # Can't form segment

    def test_boundary_two_points_minimum_valid(self):
        """Boundary: Two points (len = 2, minimum valid)"""
        x_road = np.array([0.0, 10.0])
        y_road = np.array([0.0, 2.0])
        # -45° guarantees the ray meets the single segment within [0, 10]
        x, y, dist = find_intersection(x_road, y_road, -45.0, camera_x=0.0, camera_y=5.0)
        assert x is not None  # Should work

    # ANGLE BOUNDARIES

    def test_boundary_angle_exactly_negative_90(self):
        """Boundary: Angle exactly -90° (vertical downward)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -90.0)
        # Implementation returns None for vertical - verify this is intentional
        assert x is None

    def test_boundary_angle_near_negative_90(self):
        """Boundary: Angle near -90° (-89.9°)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -89.9, camera_x=0.0, camera_y=10.0)
        # Nearly vertical - hits the road just ahead of x = 0
        assert x is not None

    def test_boundary_angle_exactly_zero(self):
        """Boundary: Angle exactly 0° (horizontal)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([2.0, 2.0, 2.0])  # Flat road at y=2
        x, y, dist = find_intersection(x_road, y_road, 0.0, camera_x=-5.0, camera_y=2.0)
        assert x is not None  # Horizontal ray should hit flat road

    def test_boundary_angle_near_zero(self):
        """Boundary: Angle near 0° (0.1° - nearly horizontal)"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([0, 1, 2, 3])
        x, y, dist = find_intersection(x_road, y_road, 0.1, camera_x=0.0, camera_y=1.5)
        # Very shallow angle
        assert x is None or dist > 0  # sanity: must not crash; any hit has positive distance

    def test_boundary_angle_exactly_90(self):
        """Boundary: Angle exactly 90° (vertical upward)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, 90.0)
        assert x is None  # Implementation returns None for vertical

    # CAMERA POSITION BOUNDARIES

    def test_boundary_camera_at_road_start(self):
        """Boundary: Camera x-position at road start"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=10.0)
        assert x is not None

    def test_boundary_camera_at_road_end(self):
        """Boundary: Camera x-position at road end"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=20.0, camera_y=10.0)
        # Camera at end, looking down - might or might not intersect
        assert x is None or dist > 0  # either outcome is acceptable, but it must not crash

    def test_boundary_camera_at_road_level(self):
        """Boundary: Camera y-position at road level"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([2.0, 2.0, 2.0])  # Flat at y=2
        x, y, dist = find_intersection(x_road, y_road, 0.0, camera_x=5.0, camera_y=2.0)
        # Camera ON the road, horizontal ray - tangent case
        assert x is None or dist >= 0  # tangent may touch at distance 0; must not crash

    def test_boundary_camera_at_minimum_y(self):
        """Boundary: Camera at lowest point of road"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([5, 2, 3, 6])  # min at x=10, y=2
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=15.0, camera_y=2.0)
        # Camera at same height as lowest road point
        assert x is None or dist > 0  # sanity: must not crash

    # INTERSECTION POSITION BOUNDARIES

    def test_boundary_intersection_at_first_point(self):
        """Boundary: Ray intersects at first road point"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([5, 3, 1])
        # Position camera so ray hits exactly at (0, 5)
        x, y, dist = find_intersection(x_road, y_road, -45.0, camera_x=-5.0, camera_y=10.0)
        if x is not None:
            assert abs(x - 0.0) < 0.1  # Should be near first point

    def test_boundary_intersection_at_last_point(self):
        """Boundary: Ray intersects at last road point"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        # Position so ray hits last point
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=15.0, camera_y=10.0)
        if x is not None and x > 18:
            assert abs(x - 20.0) < 2.0  # Should be near last point

    def test_boundary_intersection_at_segment_boundary(self):
        """Boundary: Ray intersects exactly where two segments meet"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([0, 5, 5, 10])
        # Ray aimed to hit exactly at (10, 5)
        x, y, dist = find_intersection(x_road, y_road, -30.0, camera_x=5.0, camera_y=10.0)
        if x is not None:
            # Could be reported as end of first segment or start of second
            assert 9.0 <= x <= 11.0

Key Insights for Boundary Testing:

  1. Test at the boundary, just inside, and just outside
    • Angle = 90°, 89.9°, 90.1°
    • Length = 0, 1, 2
  2. Floating-point precision matters
    • Use pytest.approx() for floating-point comparisons
    • Test values near zero (1e-10, 1e-15)
  3. Array boundaries are both structural and positional
    • Length boundaries (empty, single, two)
    • Position boundaries (first element, last element, boundaries between)
  4. Combination boundaries are critical
    • Camera at road level + horizontal ray = tangent case
    • Empty array + any angle = should handle gracefully
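Insight #2 in action - a minimal sketch of why pytest.approx() (and the stdlib's math.isclose) exist:

```python
import math
import pytest

# Exact equality on floats is a trap: 0.1 + 0.2 is not exactly 0.3 in binary.
assert 0.1 + 0.2 != 0.3

# Compare with a tolerance instead:
assert 0.1 + 0.2 == pytest.approx(0.3)   # pytest's idiom (default relative tolerance 1e-6)
assert math.isclose(0.1 + 0.2, 0.3)      # stdlib equivalent outside of tests
```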

General Boundary Testing Strategy:

| Data Type | Boundary Values to Test | Python Tools |
|---|---|---|
| Integers | 0, 1, -1, max_value, min_value | sys.maxsize, -sys.maxsize-1 |
| Floats (Finite) | 0.0, near-zero (1e-10), 1.0, very large (1e10), sys.float_info.max, -sys.float_info.max, sys.float_info.min (smallest normalized), math.ulp(0.0) (smallest subnormal), sys.float_info.epsilon | sys.float_info, math.ulp(x), math.nextafter(x, y) |
| Floats (Non-Finite) | math.inf, -math.inf, math.nan | math.isnan(x), math.isinf(x), math.isfinite(x) |
| Arrays/Lists | Empty ([]), Single element, Two elements, Very large | len(), np.size |
| Angles (degrees) | 0°, ±90°, ±180°, ±360°, values near these (89.9°, 90.1°) | np.deg2rad(), np.rad2deg() |
| Strings | Empty string (""), Single char, Very long string, Unicode edge cases | len(), str.encode() |
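One caveat on the Integers row: pure-Python ints are arbitrary precision, so sys.maxsize is not an overflow boundary inside Python itself - it only matters where values cross into C-backed code (sequence indices and lengths, fixed-width numpy dtypes). A quick demonstration:

```python
import sys

# No overflow at sys.maxsize: Python ints simply keep growing.
big = sys.maxsize + 1
assert big > sys.maxsize  # no OverflowError raised

# sys.maxsize still bounds C-level interfaces, e.g. it is the
# largest value len() can ever return and the largest legal index.
```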

Quick Reference: Python Float Boundary Testing Cheat Sheet

import sys
import math

# CONSTANTS (define once, use everywhere)
FMAX = sys.float_info.max              # Largest finite (≈1.8e308)
FMIN_NORM = sys.float_info.min         # Smallest normalized (≈2.2e-308)
FMIN_SUB = math.ulp(0.0)               # Smallest subnormal (≈5e-324)
EPSILON = sys.float_info.epsilon       # Machine epsilon (≈2.2e-16)
INF = math.inf                         # Positive infinity
NINF = -math.inf                       # Negative infinity
NAN = math.nan                         # Not a number

# TEST CHECKLIST FOR FLOAT FUNCTIONS

# ✅ Finite boundaries (test overflow/underflow)
test_function(FMAX)           # Largest finite
test_function(-FMAX)          # Most negative finite
test_function(FMIN_NORM)      # Smallest normalized positive
test_function(FMIN_SUB)       # Smallest subnormal positive

# ✅ Non-finite values (test special value handling)
test_function(INF)            # Positive infinity
test_function(NINF)           # Negative infinity
test_function(NAN)            # Not a Number

# ✅ Precision boundaries
test_function(EPSILON)        # Machine epsilon
test_function(1.0 + EPSILON)  # Next after 1.0

# ✅ Razor-edge transitions (using math.nextafter)
math.nextafter(FMAX, INF)     # → inf (overflow transition)
math.nextafter(1.0, 2.0)      # Next representable after 1.0
math.nextafter(1.0, 0.0)      # Previous representable before 1.0

# ✅ Checking results
math.isfinite(x)              # True if not inf/nan
math.isinf(x)                 # True if +inf or -inf
math.isnan(x)                 # True if nan (don't use x == nan!)
math.ulp(x)                   # Spacing at x (precision)
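The razor-edge transitions above can be verified directly - a small stdlib-only sanity script demonstrating overflow and underflow behavior:

```python
import math
import sys

FMAX = sys.float_info.max      # largest finite float
FMIN_SUB = math.ulp(0.0)       # smallest subnormal float

# Overflow: one representable step past FMAX is infinity,
# and arithmetic past FMAX also produces infinity (no exception).
assert math.nextafter(FMAX, math.inf) == math.inf
assert math.isinf(FMAX * 2)

# Underflow: halving the smallest subnormal rounds to exactly 0.0.
assert FMIN_SUB / 2 == 0.0
```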

Decision Tree: Which Boundaries Should I Test?

Does your function do arithmetic with floats?
│
├─ YES, division or reciprocal → Test ALL boundaries (finite + non-finite)
│   ├─ sys.float_info.max (finite overflow)
│   ├─ sys.float_info.min (underflow-to-overflow)
│   ├─ math.inf (division by zero result)
│   ├─ math.nan (invalid input)
│   └─ Near zero (1e-10, 1e-100)
│
├─ YES, but simple operations (+, -, *) → Test finite boundaries
│   ├─ sys.float_info.max (overflow detection)
│   ├─ sys.float_info.epsilon (precision loss)
│   └─ Normal range values
│
└─ NO, just comparison/sorting → Test basic boundaries
    ├─ 0.0, 1.0, -1.0
    ├─ math.inf, -math.inf (sorting edge cases)
    └─ math.nan (comparison always returns False!)
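The last branch deserves a demonstration, because it surprises almost everyone the first time: NaN is unordered, so every comparison involving it is False - including equality with itself:

```python
import math

nan = math.nan

# All comparisons with NaN are False - even nan == nan.
assert (nan == nan) is False
assert (nan < 1.0) is False
assert (nan > 1.0) is False

# This is why sorting data containing NaN gives unreliable orderings,
# and why NaN must be detected with math.isnan(), never with ==.
assert math.isnan(nan)
```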

2.5 Boundaries for Discrete Functions: Thresholds Are Boundaries Too!

Common Misconception: “Discrete functions don’t have boundary values - they just have categories.”

Reality: Discrete functions DO have boundaries - they’re actually easier to identify than continuous function boundaries!

What we’ve seen so far:

All boundary examples in this lecture (reciprocal, reciprocal_sum, array_sum, find_intersection) were continuous real-valued functions - they return floats from an infinite range. For these, we had to think carefully about:

  • How close to a problematic value counts as "near" (1e-10? 1e-100?)
  • Special values: math.inf, -math.inf, math.nan
  • Floating-point precision: epsilon, subnormals, overflow and underflow

The good news: Functions with discrete/finite outputs make boundary identification much simpler! The boundaries are explicit thresholds in the code.

Key insight: For discrete functions, boundaries are threshold values where the output changes category. Instead of worrying about “how close to zero?”, you test both sides of each threshold.


2.5.1 Example: Simple Threshold Boundaries - calculate_grade(score)

Let’s start with the simplest case: a function that maps continuous inputs to discrete outputs.

Function Definition:

def calculate_grade(score: int) -> str:
    """
    Calculate letter grade based on score.

    Args:
        score: Integer between 0 and 100

    Returns:
        Letter grade (A, B, C, D, F)

    Raises:
        ValueError: If score is not in valid range
    """
    if score < 0 or score > 100:
        raise ValueError("Score must be between 0 and 100")

    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"

Question: Where are the boundaries for this function?

Analysis:

Looking at the code, we immediately see explicit threshold values:

  • score >= 90 → "A"
  • score >= 80 → "B"
  • score >= 70 → "C"
  • score >= 60 → "D"
  • 0 <= score <= 100 defines the valid input range

Equivalence classes and their boundaries:

| Grade Class | Range | Boundaries to Test | Why These Boundaries? |
|---|---|---|---|
| Grade A | 90 ≤ score ≤ 100 | 90, 100 | Lower threshold (A/B boundary), upper limit |
| Grade B | 80 ≤ score < 90 | 80, 89 | Lower threshold (B/C boundary), upper edge (B/A boundary) |
| Grade C | 70 ≤ score < 80 | 70, 79 | Lower threshold (C/D boundary), upper edge (C/B boundary) |
| Grade D | 60 ≤ score < 70 | 60, 69 | Lower threshold (D/F boundary), upper edge (D/C boundary) |
| Grade F | 0 ≤ score < 60 | 0, 59 | Lower limit, upper edge (F/D boundary) |
| Invalid | score < 0 or score > 100 | -1, 101 | Just outside valid range |

Key insight - Two types of boundary values:

  1. Threshold boundaries - Where output category changes:
    • score = 89 should return “B”
    • score = 90 should return “A”
    • These are critical! Off-by-one errors are common (using > instead of >=)
  2. Range boundaries - Valid input range limits:
    • score = 0 (minimum valid)
    • score = 100 (maximum valid)
    • score = -1 (just below valid range)
    • score = 101 (just above valid range)

Boundary Value Test Code:

import pytest

def test_grade_boundary_A_B_threshold():
    """Boundary: score = 89 (B) vs 90 (A)"""
    assert calculate_grade(89) == "B"  # Just below A threshold
    assert calculate_grade(90) == "A"  # At A threshold

def test_grade_boundary_B_C_threshold():
    """Boundary: score = 79 (C) vs 80 (B)"""
    assert calculate_grade(79) == "C"  # Just below B threshold
    assert calculate_grade(80) == "B"  # At B threshold

def test_grade_boundary_C_D_threshold():
    """Boundary: score = 69 (D) vs 70 (C)"""
    assert calculate_grade(69) == "D"  # Just below C threshold
    assert calculate_grade(70) == "C"  # At C threshold

def test_grade_boundary_D_F_threshold():
    """Boundary: score = 59 (F) vs 60 (D)"""
    assert calculate_grade(59) == "F"  # Just below D threshold
    assert calculate_grade(60) == "D"  # At D threshold

def test_grade_boundary_valid_range_lower():
    """Boundary: score = 0 (valid) vs -1 (invalid)"""
    assert calculate_grade(0) == "F"   # Minimum valid score
    with pytest.raises(ValueError):
        calculate_grade(-1)             # Just below valid range

def test_grade_boundary_valid_range_upper():
    """Boundary: score = 100 (valid) vs 101 (invalid)"""
    assert calculate_grade(100) == "A"  # Maximum valid score
    with pytest.raises(ValueError):
        calculate_grade(101)             # Just above valid range

Test design observations:

  • Each test covers exactly one threshold and asserts both sides of it (value-1 and value)
  • Invalid inputs are verified with pytest.raises(ValueError)
  • Test names state which boundary is exercised, so a failure pinpoints the broken threshold

Contrast with continuous functions:

| Aspect | Continuous (reciprocal) | Discrete (calculate_grade) |
|---|---|---|
| Boundary identification | Must decide: "How close to zero?" (1e-10? 1e-100?) | Explicit in code: 90, 80, 70, 60 |
| Number of boundaries | Depends on your choice of partitioning | Fixed by function logic |
| Boundary precision | Floating-point precision matters (epsilon) | Integer values - no precision issues |
| Testing strategy | Test near-zero, infinity, NaN, epsilon | Test threshold ± 1 |

Why discrete boundaries are easier:

  1. Explicit thresholds - The code literally says if score >= 90, so you know to test 89 vs 90
  2. No precision worries - Integer thresholds mean no floating-point edge cases
  3. Clear pass/fail - Either the grade is correct or it isn’t - no “close enough” judgment calls
  4. Off-by-one detection - Boundary tests immediately catch > vs >= mistakes

2.5.2 Example: Multi-Dimensional Boundaries - calculate_shipping_cost(weight, distance, express)

Now let’s see a more complex case: multiple parameters, each with their own boundaries.

Function Definition:

def calculate_shipping_cost(weight: float, distance: float, express: bool = False) -> float:
    """
    Calculate shipping cost based on weight and distance.

    Args:
        weight: Weight in kg (0.1 to 50)
        distance: Distance in km (1 to 5000)
        express: Whether express shipping is requested

    Returns:
        float: Shipping cost in EUR

    Raises:
        ValueError: If weight or distance is out of range
    """
    if weight < 0.1 or weight > 50:
        raise ValueError("Weight must be between 0.1 and 50 kg")

    if distance < 1 or distance > 5000:
        raise ValueError("Distance must be between 1 and 5000 km")

    # Base cost calculation
    if weight <= 5:
        base_cost = 5.0
    elif weight <= 20:
        base_cost = 10.0
    else:
        base_cost = 20.0

    # Distance multiplier
    if distance <= 100:
        distance_multiplier = 1.0
    elif distance <= 500:
        distance_multiplier = 1.5
    else:
        distance_multiplier = 2.0

    cost = base_cost * distance_multiplier

    # Express shipping adds 50%
    if express:
        cost *= 1.5

    return round(cost, 2)
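To see how the stages combine, trace one call by hand using the bands above:

```python
# 10 kg over 200 km with express shipping:
#   weight 10 kg    -> medium band (5 < w <= 20)       -> base_cost = 10.0
#   distance 200 km -> regional band (100 < d <= 500)  -> multiplier = 1.5
#   express = True  -> extra factor of 1.5
expected = round(10.0 * 1.5 * 1.5, 2)
assert expected == 22.5  # so calculate_shipping_cost(10, 200, True) should return 22.5
```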

Question: Where are the boundaries for this function?

Analysis:

This function has three dimensions, each with its own boundaries:

Dimension 1: Weight boundaries

| Boundary | Threshold Value | Test Cases | Expected Behavior Change |
|---|---|---|---|
| Minimum weight | 0.1 kg | 0.09 kg (error) vs 0.1 kg (valid) | Error → Light category |
| Light/Medium | 5 kg | 5.0 kg (light) vs 5.1 kg (medium) | €5 base → €10 base |
| Medium/Heavy | 20 kg | 20.0 kg (medium) vs 20.1 kg (heavy) | €10 base → €20 base |
| Maximum weight | 50 kg | 50 kg (valid) vs 50.1 kg (error) | Heavy category → Error |

Dimension 2: Distance boundaries

| Boundary | Threshold Value | Test Cases | Expected Behavior Change |
|---|---|---|---|
| Minimum distance | 1 km | 0.9 km (error) vs 1 km (valid) | Error → Local category |
| Local/Regional | 100 km | 100 km (local) vs 101 km (regional) | 1.0× multiplier → 1.5× multiplier |
| Regional/Long-distance | 500 km | 500 km (regional) vs 501 km (long) | 1.5× multiplier → 2.0× multiplier |
| Maximum distance | 5000 km | 5000 km (valid) vs 5001 km (error) | Long-distance → Error |

Dimension 3: Express flag

| Value | Effect |
|---|---|
| False | 1.0× (standard shipping) |
| True | 1.5× (express multiplier) |

Note: Boolean parameters don’t have “boundaries” in the traditional sense - they only have two values. But we still need to test both!

Challenge: Combinatorial explosion

If we tested every combination of boundaries:

  • 4 weight boundaries × 4 distance boundaries × 2 express values = 32 combinations
  • Adding "just inside/just outside" variants for each would push this well past 100 tests

Practical strategy: Test each dimension independently

Instead, we test each dimension’s boundaries while holding other dimensions constant:

import pytest

# ===== Weight boundaries (distance and express fixed) =====

def test_shipping_weight_boundary_minimum():
    """Boundary: weight = 0.1 (valid) vs 0.09 (invalid)"""
    assert calculate_shipping_cost(0.1, 100, False) == 5.0  # Valid
    with pytest.raises(ValueError, match="Weight must be between"):
        calculate_shipping_cost(0.09, 100, False)  # Invalid

def test_shipping_weight_boundary_light_medium():
    """Boundary: weight = 5.0 (light) vs 5.1 (medium)"""
    cost_light = calculate_shipping_cost(5.0, 100, False)
    cost_medium = calculate_shipping_cost(5.1, 100, False)
    assert cost_light == 5.0   # Light: base=5.0, dist_mult=1.0 → 5.0
    assert cost_medium == 10.0  # Medium: base=10.0, dist_mult=1.0 → 10.0

def test_shipping_weight_boundary_medium_heavy():
    """Boundary: weight = 20.0 (medium) vs 20.1 (heavy)"""
    cost_medium = calculate_shipping_cost(20.0, 100, False)
    cost_heavy = calculate_shipping_cost(20.1, 100, False)
    assert cost_medium == 10.0  # Medium: base=10.0, dist_mult=1.0 → 10.0
    assert cost_heavy == 20.0   # Heavy: base=20.0, dist_mult=1.0 → 20.0

def test_shipping_weight_boundary_maximum():
    """Boundary: weight = 50 (valid) vs 50.1 (invalid)"""
    assert calculate_shipping_cost(50, 100, False) == 20.0  # Valid
    with pytest.raises(ValueError, match="Weight must be between"):
        calculate_shipping_cost(50.1, 100, False)  # Invalid

# ===== Distance boundaries (weight and express fixed) =====

def test_shipping_distance_boundary_minimum():
    """Boundary: distance = 1 (valid) vs 0.9 (invalid)"""
    assert calculate_shipping_cost(5.0, 1, False) == 5.0  # Valid
    with pytest.raises(ValueError, match="Distance must be between"):
        calculate_shipping_cost(5.0, 0.9, False)  # Invalid

def test_shipping_distance_boundary_local_regional():
    """Boundary: distance = 100 (local) vs 101 (regional)"""
    cost_local = calculate_shipping_cost(5.0, 100, False)
    cost_regional = calculate_shipping_cost(5.0, 101, False)
    assert cost_local == 5.0     # Local: base=5.0, dist_mult=1.0 → 5.0
    assert cost_regional == 7.5  # Regional: base=5.0, dist_mult=1.5 → 7.5

def test_shipping_distance_boundary_regional_long():
    """Boundary: distance = 500 (regional) vs 501 (long-distance)"""
    cost_regional = calculate_shipping_cost(5.0, 500, False)
    cost_long = calculate_shipping_cost(5.0, 501, False)
    assert cost_regional == 7.5  # Regional: base=5.0, dist_mult=1.5 → 7.5
    assert cost_long == 10.0     # Long: base=5.0, dist_mult=2.0 → 10.0

def test_shipping_distance_boundary_maximum():
    """Boundary: distance = 5000 (valid) vs 5001 (invalid)"""
    assert calculate_shipping_cost(5.0, 5000, False) == 10.0  # Valid
    with pytest.raises(ValueError, match="Distance must be between"):
        calculate_shipping_cost(5.0, 5001, False)  # Invalid

# ===== Express flag (weight and distance fixed) =====

def test_shipping_express_flag():
    """Dimension: express = False vs True"""
    cost_standard = calculate_shipping_cost(5.0, 100, False)
    cost_express = calculate_shipping_cost(5.0, 100, True)
    assert cost_standard == 5.0   # Standard: base=5.0, dist=1.0, express=1.0 → 5.0
    assert cost_express == 7.5    # Express: base=5.0, dist=1.0, express=1.5 → 7.5

Test design observations:

  1. Independent dimension testing - Each dimension is tested while holding others constant
  2. Total: ~12 tests instead of 32 (all combinations) or 100+ (exhaustive)
  3. We still catch boundary bugs - If weight threshold is wrong (weight < 5 instead of weight <= 5), our test will fail
  4. Clear test names - Each test explicitly states what boundary it’s testing

When to test combinations:

You should add combination tests when:

  • Dimensions interact in the code (a condition references two parameters at once)
  • The specification implies interaction (e.g., a surcharge that applies only to heavy AND long-distance shipments)
  • A bug report involves a specific combination of inputs

For this function, dimensions are independent (just multiplication), so independent testing is sufficient.

Key insight for multi-dimensional boundaries:

Test each dimension independently with representative values in other dimensions. This gives you \(O(d \times b)\) tests (d dimensions, b boundaries each) instead of \(O(b^d)\) exhaustive tests!

For shipping cost: 12 tests (3 dimensions × ~4 boundaries) instead of 32 tests (all combinations).


2.5.3 Summary: Boundary Testing Strategy for Discrete Functions

Key principles:

  1. Boundaries are explicit - Look for threshold values in conditional logic (if, elif)
  2. Test both sides of each threshold - value-1 (old category) vs value (new category)
  3. Test range limits - Minimum valid, maximum valid, just outside range
  4. For multi-dimensional functions: Test each dimension independently (unless dimensions interact)

Common threshold patterns to look for:

| Pattern in Code | Boundaries to Test | Example |
|---|---|---|
| if x >= threshold: | threshold-1, threshold | score >= 90 → test 89, 90 |
| if x > threshold: | threshold, threshold+1 | weight > 5 → test 5.0, 5.1 |
| if x < threshold: | threshold-1, threshold | distance < 100 → test 99, 100 |
| if x <= threshold: | threshold, threshold+1 | weight <= 5 → test 5.0, 5.1 |
| if min <= x <= max: | min-1, min, max, max+1 | 0 <= score <= 100 → test -1, 0, 100, 101 |

Comparison: Continuous vs Discrete boundaries

| Aspect | Continuous Functions | Discrete Functions |
|---|---|---|
| Boundary identification | Must choose: "How close to problematic value?" | Explicit thresholds in code |
| Test values | Near-zero, infinity, NaN, epsilon | Threshold ± smallest unit (1 for integers, 0.1 for floats) |
| Precision concerns | Floating-point precision critical | Usually none (integer thresholds) |
| Number of boundaries | Depends on partitioning choice | Fixed by function logic |
| Common bugs caught | Division by zero, overflow, precision loss | Off-by-one errors (> vs >=) |

The good news: Discrete functions make boundary testing easier, not harder! The thresholds are right there in the code - you just need to test both sides of each one.

Final insight:

Every conditional creates a boundary. If you see if x >= 90, you know you need to test x=89 and x=90. If you see if weight <= 5, you know you need to test weight=5.0 and weight=5.1.

Boundary testing for discrete functions is systematic and mechanical - which is why it’s so effective!
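Because the procedure is mechanical, you can even script the test-value derivation. A hypothetical helper (not part of the lecture's codebase) that turns an `x >= threshold` condition into its two probe values:

```python
def boundary_probes(threshold: int) -> tuple[int, int]:
    """For an `x >= threshold` condition: (last value of the old
    category, first value of the new category)."""
    return threshold - 1, threshold

# Applied to calculate_grade's thresholds, this reproduces exactly
# the pairs tested in section 2.5.1:
pairs = [boundary_probes(t) for t in (90, 80, 70, 60)]
assert pairs == [(89, 90), (79, 80), (69, 70), (59, 60)]
```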


3. LLM-Assisted Testing - Breaking the Test Cone

Why don’t developers write tests?

Research and surveys show:

  1. “Writing tests is tedious” (42% of developers)
    • Boilerplate: imports, setup, teardown
    • Repetitive: Similar structure for every function
  2. “I don’t know what to test” (38% of developers)
    • What are the equivalence classes?
    • What are the boundary cases?
    • How many tests do I need?
  3. “Initial setup takes forever” (35% of developers)
    • pytest configuration
    • Test file structure
    • First test is always hardest

Result: Developers procrastinate → Only write E2E tests → Inverted pyramid (test cone)

The Solution: Use LLMs to handle the tedious parts, leaving you to focus on correctness.

The Workflow:

1. Human: Identifies what needs testing (equivalence classes)
2. LLM: Generates test boilerplate (imports, structure, AAA pattern)
3. Human: Reviews and refines (fixes logic, adds edge cases)
4. LLM: Generates more tests based on feedback
5. Human: Verifies assertions are correct
6. Run tests: Catch bugs in actual code (not tests!)

Key insight: LLMs excel at boilerplate, humans excel at domain logic.

Scenario: You need to test find_intersection() from src/road_profile_viewer/geometry.py.

Step 1: Prompt the LLM (in tests/test_geometry.py)

# tests/test_geometry.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.geometry import find_intersection

# Prompt: "Write pytest unit tests for find_intersection() covering:
# - Equivalence classes: downward angles, horizontal, upward angles, vertical
# - Boundary cases: empty arrays, single point, angle=90
# - Use AAA pattern (Arrange-Act-Assert)
# - Include proper type hints matching the actual repository code"

Step 2: LLM Generates Initial Tests

Copilot/Claude might generate:

def test_find_intersection_downward_angle() -> None:
    """Test find_intersection with normal downward angle."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x is not None
    assert y is not None
    assert dist is not None and dist > 0

def test_find_intersection_horizontal() -> None:
    """Test find_intersection with horizontal angle."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 0, 0], dtype=np.float64)  # Flat road
    x, y, dist = find_intersection(x_road, y_road, 0.0, 0.0, 1.0)
    assert x is not None

def test_find_intersection_vertical() -> None:
    """Test find_intersection with vertical angle (90 degrees)."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, 90.0)
    assert x is None  # Current implementation returns None for vertical

def test_find_intersection_empty_road() -> None:
    """Test find_intersection with empty road arrays."""
    x_road: NDArray[np.float64] = np.array([], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is None

Value: 80% of boilerplate written instantly!

Step 3: Human Review - Find the Flaws

Now YOU review with domain knowledge:

❌ Problem 1: Missing edge case

# LLM didn't test: What if camera is BELOW the road?
# Add this test:
def test_find_intersection_camera_below_road() -> None:
    """Test when camera is below road level."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([5, 5, 5], dtype=np.float64)  # Flat road at y=5
    x, y, dist = find_intersection(x_road, y_road, 45.0, 0.0, 0.0)  # Camera at y=0, ray up at 45°
    # Should still find intersection (ray goes up, crossing the road above at (5, 5))
    assert x is not None

❌ Problem 2: Weak assertions

# LLM wrote:
assert x is not None  # Too weak!

# Better assertion:
assert 0 <= x <= 30, f"Expected x in [0, 30], got {x}"
assert y >= 0, f"Expected y positive, got {y}"

❌ Problem 3: Invalid test data

# LLM wrote:
x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
y_road: NDArray[np.float64] = np.array([0, 0, 0], dtype=np.float64)  # Flat road at y=0
x, y, dist = find_intersection(x_road, y_road, 0.0, 0.0, 1.0)  # Camera at y=1

# Problem: Horizontal ray from y=1 won't intersect road at y=0!
# Fix: Either raise camera or tilt road
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 1, 2], dtype=np.float64)  # Sloped road

Step 4: Iterative Refinement

# You to LLM: "Add tests for upward angles where ray might not intersect"
# LLM generates:
def test_find_intersection_upward_angle_no_intersection() -> None:
    """Test that upward angle with road below returns None."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 1, 2], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, 45.0, 0.0, 0.0)  # Ray goes up
    # Depending on implementation, might not intersect
    # If None is correct behavior:
    assert x is None

Step 5: Run Tests and Find Real Bugs

$ uv run pytest tests/test_geometry.py -v

Surprise! One test fails:

FAILED test_find_intersection_empty_road - IndexError: index 0 is out of bounds

This is GOOD! The test found a bug in your actual code:

def find_intersection(x_road, y_road, ...):
    # Bug: Doesn't check if arrays are empty before accessing!
    for i in range(len(x_road) - 1):  # Crashes if len(x_road) = 0
        x1, y1 = x_road[i], y_road[i]
        ...

Fix the bug:

def find_intersection(x_road, y_road, ...):
    # Add validation
    if len(x_road) == 0 or len(y_road) == 0:
        return None, None, None

    # ... rest of function

Run tests again:

$ uv run pytest tests/test_geometry.py -v
============================= 8 passed in 0.12s ===============================

All tests pass! Bug fixed before it reached users.

What LLMs are GOOD at:

  • Boilerplate: imports, fixtures, AAA structure
  • Generating many structurally similar tests quickly
  • Descriptive test names and docstrings
  • Suggesting parametrization for repetitive cases

What LLMs are BAD at:

  • Strong, domain-correct assertions (they default to weak checks like assert x is not None)
  • Knowing the expected numeric result of your domain logic
  • Keeping tests free of loops and conditionals
  • Judging when mocking is appropriate (see Section 3.1)

Example of LLM failure #1: Weak assertions

# LLM might generate:
def test_calculate_distance():
    dist = calculate_distance(0, 0, 3, 4)
    assert dist > 0  # Too weak! Just checks it's positive

# Human knows Pythagorean theorem:
def test_calculate_distance():
    dist = calculate_distance(0, 0, 3, 4)
    assert abs(dist - 5.0) < 0.01, f"Expected 5.0, got {dist}"  # 3-4-5 triangle!

Example of LLM failure #2: Logic in tests

LLMs sometimes generate tests with loops, conditionals, or calculations. This makes tests hard to understand and debug.

# ❌ LLM might generate this:
def test_find_intersection_multiple_angles():
    """Test intersection for various angles"""
    angles = [-10, -20, -30, -45]
    for angle in angles:  # ❌ Loop in test!
        x, y, dist = find_intersection(x_road, y_road, angle)
        if x is not None:  # ❌ Conditional in test!
            assert dist > 0
        else:
            assert dist is None  # ❌ Complex logic!

Problems with this test:

  • If one angle fails, the failure message doesn't tell you which one
  • The conditional means some iterations may assert nothing at all
  • The test re-implements decision logic, so a bug shared with the code goes unnoticed

Human fix: Separate tests, no logic

# ✅ Human refactors to simple, clear tests:
def test_find_intersection_returns_positive_distance_for_minus_10_degrees():
    """Test -10° angle returns positive distance"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert dist > 0  # No conditionals, no loops!

def test_find_intersection_returns_positive_distance_for_minus_20_degrees():
    """Test -20° angle returns positive distance"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -20.0, 0.0, 10.0)
    assert dist > 0

# Or use pytest parametrize (cleaner for multiple similar tests):
@pytest.mark.parametrize("angle", [-10, -20, -30, -45])
def test_find_intersection_returns_positive_distance_for_downward_angles(angle):
    """Test that all downward angles return positive distance"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, angle, 0.0, 10.0)
    assert dist > 0  # Simple assertion, pytest runs this test 4 times with different angles

Key principle: Tests should be straight-line code (inspired by Google’s testing practices)

Why?

  • A failing test points directly at one broken input - no debugging of the test itself
  • There is no test logic that can itself contain bugs
  • Anyone can read the test top to bottom without mentally simulating control flow

The Human-LLM Loop:

1. Human: "Test find_intersection for edge cases"
2. LLM: Generates 5 tests
3. Human: "This one is wrong - camera below road should still work"
4. LLM: Fixes that test
5. Human: Runs tests, finds bug in actual code (not test)
6. Human: Fixes code
7. Tests pass → Confidence!

3.1 Warning: LLMs Love Mocking (But You Shouldn’t Overuse It)

When you use LLMs to generate tests, they often suggest mocking - replacing real objects with fake ones to verify function calls. This can lead to brittle tests that break when you refactor code.

Key Google Testing Principle: Test State, Not Interactions

There are two ways to verify that code works:

  1. State Testing: Observe the system after invoking it to verify outcomes
  2. Interaction Testing: Verify expected sequences of function calls on collaborators (using mocks)

State testing is less brittle because it focuses on “what” results occurred rather than “how” results were achieved.


3.1.1 What Mocking Looks Like (and When It’s Problematic)

# ❌ LLM-generated test with excessive mocking
from unittest.mock import patch

def test_find_intersection_calls_tan():
    """Test that find_intersection calls np.tan"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)

    with patch('numpy.tan') as mock_tan:
        mock_tan.return_value = 0.176  # Mocked slope
        x, y, dist = find_intersection(x_road, y_road, -10.0)
        mock_tan.assert_called_once()  # ❌ Tests HOW, not WHAT

Problem: This test breaks if you:

  • Switch from np.tan to math.tan (or precompute the slope differently)
  • Refactor to a vectorized formulation that calls np.tan a different number of times
  • Cache or inline the computation - even though the returned intersection is identical

This is a brittle test - it fails when implementation changes, even though behavior stays the same.


3.1.2 Better Approach: Test the Result, Not the Method Calls

# ✅ Test the RESULT (state), not the METHOD CALLS (interactions)
def test_find_intersection_returns_correct_intersection():
    """Test that intersection position is geometrically correct"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)

    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert FINAL STATE (outcomes)
    assert x is not None, "Should find intersection for downward angle"
    assert 0 <= x <= 20, f"Intersection x should be within road bounds, got {x}"
    assert y >= 0, f"Intersection y should be above ground, got {y}"
    assert dist > 0, f"Distance should be positive, got {dist}"

    # No assumptions about which numpy functions were called!

Benefits:

  - Survives any refactoring that preserves the geometry
  - Verifies what users actually care about: a correct intersection point
  - The assertion messages report the offending values when something fails


3.1.3 When IS Mocking Appropriate?

Mocking is valuable in specific scenarios:

✅ DO mock:

  - External services: HTTP APIs, databases, message queues
  - Slow operations: network and disk I/O
  - Non-deterministic dependencies: clocks, unseeded randomness

❌ DON’T mock:

  - NumPy, math, and other fast, deterministic libraries
  - Your own pure functions (like find_intersection())
  - Simple value objects and data structures


3.1.4 Example: When Mocking Is Appropriate

# ✅ Good use of mocking: External API
from unittest.mock import patch, Mock

def test_fetch_weather_data_returns_temperature():
    """Test weather fetching without hitting real API"""
    # Mock the external HTTP request (slow, requires network)
    with patch('requests.get') as mock_get:
        mock_response = Mock()
        mock_response.json.return_value = {'temp': 22.5}
        mock_get.return_value = mock_response

        # Now test your function
        temp = fetch_weather_data('Berlin')

        # Assert RESULT (not implementation)
        assert temp == 22.5

Why this mocking is good:

  - The real dependency (the network) is slow, external, and flaky
  - The test still asserts the result (temp == 22.5), not the call sequence
  - It runs fast and deterministically - no API key or internet connection required

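If the requests library isn’t available, the same pattern works with only the standard library. A self-contained sketch - `fetch_weather_data` is a hypothetical helper (not part of the course code), written out here so the mocked test actually runs:

```python
import json
import urllib.request
from unittest.mock import MagicMock, patch


def fetch_weather_data(city: str) -> float:
    """Hypothetical helper: fetch the current temperature for a city."""
    with urllib.request.urlopen(f"https://api.example.com/weather/{city}") as resp:
        return json.load(resp)["temp"]


def test_fetch_weather_data_returns_temperature():
    """Mock the external HTTP call, but still assert on the RESULT."""
    fake_resp = MagicMock()
    fake_resp.read.return_value = b'{"temp": 22.5}'
    fake_resp.__enter__.return_value = fake_resp  # support the with-statement
    with patch("urllib.request.urlopen", return_value=fake_resp):
        temp = fetch_weather_data("Berlin")
    assert temp == 22.5  # state testing: the outcome, not which calls were made
```

The mock replaces only the network boundary; the JSON parsing and return logic still run for real.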

3.1.5 Rule of Thumb: Prefer Real Objects Over Mocks

For find_intersection() and similar functions:

  - They are pure computations on NumPy arrays: fast and deterministic
  - Use real arrays as inputs; there is nothing worth mocking
  - Mocking NumPy internals only couples the test to the implementation
Google’s guideline: “Use real objects when they’re fast and deterministic. Mock only when necessary.”
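A minimal illustration of that guideline: NumPy is the real collaborator, and nothing is mocked. `road_height_at` is a hypothetical helper invented for this sketch:

```python
import numpy as np


def road_height_at(x_road, y_road, x):
    """Hypothetical helper: linearly interpolate the road height at position x."""
    return float(np.interp(x, x_road, y_road))


def test_road_height_interpolates_between_points():
    x_road = np.array([0.0, 10.0, 20.0])
    y_road = np.array([0.0, 2.0, 4.0])
    # Real NumPy, no mocks: fast, deterministic, and survives refactoring
    assert abs(road_height_at(x_road, y_road, 5.0) - 1.0) < 1e-9
    assert abs(road_height_at(x_road, y_road, 15.0) - 3.0) < 1e-9
```

The whole test runs in microseconds - there is simply no speed problem for a mock to solve.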


3.1.6 Summary: State Testing vs. Interaction Testing

| Aspect | State Testing (Preferred) | Interaction Testing (Use Sparingly) |
|---|---|---|
| What it tests | Final results/outcomes | Sequence of function calls |
| Assertion style | `assert x == expected_value` | `mock.assert_called_once()` |
| Brittleness | Low - survives refactoring | High - breaks when implementation changes |
| When to use | Always, when possible | External dependencies only |
| Example | `assert dist > 0` | `mock_api.get.assert_called()` |

Key takeaway: When LLMs suggest mocking, ask yourself: “Is this dependency slow or external?” If not, use the real object and test the state!


Best Practice: Use LLM to START, human to VERIFY and REFINE. Watch out for over-mocking!


4. Part 4: Hands-On Exercise - Test geometry.py

4.1 Exercise: Write Comprehensive Tests for geometry.py

Goal: Test all functions in src/road_profile_viewer/geometry.py using equivalence classes and boundary analysis.

Setup:

# Create test structure if not already done
$ mkdir -p tests
$ touch tests/__init__.py
$ touch tests/test_geometry.py

Task 1: Test find_intersection() - Equivalence Classes

Use an LLM to generate initial tests, then refine:

# tests/test_geometry.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.geometry import find_intersection, calculate_ray_line


class TestFindIntersection:
    """Test suite for find_intersection() function."""

    def test_normal_downward_angle(self) -> None:
        """Equivalence class: Normal downward angle (-90° < angle < 0°)"""
        x_road: NDArray[np.float64] = np.array([0, 10, 20, 30], dtype=np.float64)
        y_road: NDArray[np.float64] = np.array([0, 2, 4, 6], dtype=np.float64)
        x, y, dist = find_intersection(x_road, y_road, -45.0, 0.0, 10.0)
        assert x is not None and y is not None and dist is not None

    def test_horizontal_angle(self) -> None:
        """Equivalence class: Horizontal ray (angle = 0°)"""
        # Your test here (let LLM generate, then review)
        pass

    def test_vertical_angle_boundary(self) -> None:
        """Boundary case: Vertical angle (90°)"""
        pass

    def test_empty_road_arrays(self) -> None:
        """Boundary case: Empty arrays"""
        pass

    def test_single_point_road(self) -> None:
        """Boundary case: Road with only one point"""
        pass

Task 2: Test calculate_ray_line() - Boundary Cases

class TestCalculateRayLine:
    """Test suite for calculate_ray_line() function."""

    def test_normal_angle(self) -> None:
        """Test with normal angle"""
        x_ray: NDArray[np.float64]
        y_ray: NDArray[np.float64]
        x_ray, y_ray = calculate_ray_line(-10.0, camera_x=0.0, camera_y=2.0)
        assert len(x_ray) == 2
        assert len(y_ray) == 2
        assert x_ray[0] == 0.0  # Starts at camera
        assert y_ray[0] == 2.0

    def test_vertical_angle(self) -> None:
        """Boundary: Vertical angle (90°)"""
        x_ray, y_ray = calculate_ray_line(90.0)
        # Should handle vertical line
        assert x_ray[0] == x_ray[1]  # Vertical means same x

    def test_zero_angle(self) -> None:
        """Boundary: Horizontal angle (0°)"""
        x_ray, y_ray = calculate_ray_line(0.0, camera_y=2.0)
        # Horizontal line at y=2.0
        assert y_ray[0] == y_ray[1] == 2.0

Task 3: Run Tests and Iterate

$ uv run pytest tests/test_geometry.py -v

# If failures occur:
# 1. Is the test wrong? Fix the test.
# 2. Is the code wrong? Fix the code.
# 3. Unsure? Add a print statement in the test to see actual values.

Success criteria:

  - At least one test per equivalence class of find_intersection()
  - Boundary cases covered: vertical angle, empty arrays, single-point road
  - All tests pass with uv run pytest tests/test_geometry.py -v


5. Part 5: Module and Integration Testing

5.1 What are Module/Integration Tests?

Unit test: Tests one function in isolation.

Module/Integration test: Tests multiple modules working together.

Example:

# Unit test: Test ONLY geometry.find_intersection()
def test_find_intersection() -> None:
    x_road = np.array([0, 10])
    y_road = np.array([0, 2])
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None

# Integration test: Test geometry + road together
def test_road_generation_with_intersection() -> None:
    """Test that road generation produces data that geometry can process."""
    from road_profile_viewer.road import generate_road_profile
    from road_profile_viewer.geometry import find_intersection

    # Generate road using road.py
    x_road: NDArray[np.float64]
    y_road: NDArray[np.float64]
    x_road, y_road = generate_road_profile(num_points=100, x_max=80)

    # Use geometry.py to find intersection
    x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=2.0)

    # Verify integration works
    assert x is not None, "Generated road should intersect camera ray"
    assert 0 <= x <= 80, "Intersection should be within road bounds"
    assert y >= 0, "Intersection should be above ground"

Key difference:

  - The unit test feeds find_intersection() hand-crafted arrays and isolates a single function
  - The integration test feeds it real output from road.py, so it also verifies the contract between the two modules

5.2 When to Write Integration Tests

Write integration tests to catch:

  1. Interface mismatches
    # road.py returns (x, y)
    # geometry.py expects (x, y)
    # If road.py changes to return (x, y, metadata), tests will catch it!
    
  2. Data format issues
    # What if generate_road_profile() returns list instead of np.array?
    # Integration test will fail!
    
  3. Assumptions about data
    # geometry.py assumes x_road is sorted
    # What if road.py returns unsorted data?
    # Integration test catches this!
    

5.3 Example Integration Test Suite

# tests/test_integration.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.road import generate_road_profile
from road_profile_viewer.geometry import find_intersection, calculate_ray_line


class TestRoadGeometryIntegration:
    """Test integration between road and geometry modules."""

    def test_generated_road_intersects_downward_ray(self) -> None:
        """Verify that generated roads can be processed by geometry functions."""
        # Use real road generation
        x_road: NDArray[np.float64]
        y_road: NDArray[np.float64]
        x_road, y_road = generate_road_profile(num_points=100, x_max=80)

        # Verify it works with geometry module
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=10.0)

        assert x is not None, "Should find intersection with generated road"
        assert isinstance(x, (int, float, np.number)), "Should return numeric type"
        assert isinstance(y, (int, float, np.number)), "Should return numeric type"
        assert isinstance(dist, (int, float, np.number)), "Should return numeric type"

    def test_road_data_format_compatible_with_geometry(self) -> None:
        """Verify road.py returns data in format geometry.py expects."""
        x_road: NDArray[np.float64]
        y_road: NDArray[np.float64]
        x_road, y_road = generate_road_profile()

        # Check data types
        assert isinstance(x_road, np.ndarray), "x_road should be numpy array"
        assert isinstance(y_road, np.ndarray), "y_road should be numpy array"

        # Check lengths match
        assert len(x_road) == len(y_road), "Road arrays should have same length"

        # Check data is sorted
        assert np.all(np.diff(x_road) > 0), "x_road should be strictly increasing"

    def test_varying_road_parameters_work_with_geometry(self) -> None:
        """Test that different road generation parameters work with geometry."""
        for num_points in [10, 50, 100, 200]:
            for x_max in [40, 80, 120]:
                x_road, y_road = generate_road_profile(num_points, x_max)
                x, y, dist = find_intersection(x_road, y_road, -10.0)
                # Should not crash, might or might not find intersection
                assert x is None or isinstance(x, (int, float, np.number))

Run integration tests:

$ uv run pytest tests/test_integration.py -v

5.4 E2E Testing: Why We Skip It (For Now)

End-to-End test would mean:

  1. Start the Dash application
  2. Open a browser (using Selenium/Playwright)
  3. Enter angle in input field
  4. Verify graph updates correctly

Why skip it in this course?

  1. Complex setup: Requires Selenium, browser drivers, etc.
  2. Slow: Each test takes seconds to run
  3. Brittle: UI changes break tests frequently
  4. Diminishing returns: Unit + integration tests catch 90% of bugs

For this course:

  - We focus on unit tests (the base of the pyramid) and integration tests
  - Manual checks of the Dash UI stand in for the pyramid’s thin E2E tip

In real projects: Yes, E2E tests matter. But get your pyramid base solid first!


5.5 Test Maintainability: Writing Tests That Don’t Break

You’ve learned how to write unit tests, integration tests, and how to use LLMs to accelerate testing. Now let’s talk about a critical quality: test maintainability.

The Problem:

You write 20 unit tests. They all pass. ✅

You refactor find_intersection() to improve performance (no behavior change). Suddenly 10 tests fail. ❌

Question: Is this good or bad?

Bad! These tests are brittle - they break when implementation changes, even though behavior stayed the same.


5.5.1 The Brittleness Problem

Brittle tests fail in response to unrelated production code changes that introduce no real bugs. They:

  - Waste engineering time on fixing tests instead of fixing code
  - Erode trust in the suite (“it’s probably just the tests again”)
  - Discourage refactoring, because every cleanup breaks dozens of tests

Google’s insight: In large codebases, brittle tests are a major productivity killer. Engineers spend more time fixing tests than fixing actual bugs!


5.5.2 Example: Brittle Test (Tests Implementation Details)

# ❌ BRITTLE: Tests HOW the function works internally
import numpy as np
from unittest.mock import patch

from road_profile_viewer.geometry import find_intersection

def test_find_intersection_uses_tan_for_slope():
    """Test that function uses np.tan to calculate slope"""
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)

    # Mock np.tan to verify it's called
    with patch('numpy.tan') as mock_tan:
        mock_tan.return_value = 0.176  # Mocked slope value
        find_intersection(x_road, y_road, -10.0)
        assert mock_tan.called  # ❌ Breaks if you change slope calculation!

Problem: If you refactor to use a different trigonometry approach (e.g., using atan2 or pre-computed lookup tables), this test fails even though:

  - The function still returns the correct intersection
  - The observable behavior is completely unchanged

This test is testing the IMPLEMENTATION, not the BEHAVIOR.


5.5.3 Better: Robust Test (Tests Public Behavior)

# ✅ ROBUST: Tests WHAT the code does, not HOW it does it
import numpy as np

from road_profile_viewer.geometry import find_intersection

def test_find_intersection_returns_correct_position_for_downward_angle():
    """Test that intersection position is geometrically correct for downward ray"""
    # Arrange
    x_road = np.array([0, 10, 20], dtype=np.float64)
    y_road = np.array([0, 2, 4], dtype=np.float64)

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert behavior: intersection should be within road bounds
    assert x is not None, "Should find intersection for downward angle"
    assert 0 <= x <= 20, f"Intersection x should be in road bounds [0, 20], got {x}"
    assert 0 <= y <= 4, f"Intersection y should be in road bounds [0, 4], got {y}"
    assert dist > 0, f"Distance should be positive, got {dist}"

    # No assumptions about HOW it calculated this!
    # Works with tan, atan2, lookup tables, or any other implementation

Benefits:

  - Passes with tan, atan2, lookup tables, or any other correct implementation
  - Fails only when the intersection is actually wrong - every failure is a real bug
  - Follows the AAA pattern, so the intent is obvious at a glance


5.5.4 The Key Principle: Test Via Public APIs

Google’s guideline: “Write tests that invoke the system being tested in the same way its users would.”

What does this mean for find_intersection()?

The public API (what users see):

x, y, dist = find_intersection(x_road, y_road, angle_degrees, camera_x, camera_y)

Public contract (what users expect):

  - Given a road profile and a downward angle, return the first intersection as (x, y, dist)
  - Return (None, None, None) when the ray cannot hit the road (e.g. a vertical ray)
  - dist is the positive distance from the camera to the intersection point

✅ DO test:

  - Return values for representative angles (downward, horizontal, vertical)
  - The (None, None, None) contract for rays that miss the road
  - Edge cases: empty arrays, single-point roads

❌ DON’T test:

  - Which NumPy functions are called internally (tan vs. atan2)
  - The search strategy (linear vs. binary search)
  - Intermediate variables or private helpers
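Put together, a contract test exercises only that public API. The sketch below uses a simplified stand-in implementation so the snippet runs on its own - it is NOT the course’s geometry.py - but the test function itself would run unchanged against the real code:

```python
import numpy as np


def find_intersection(x_road, y_road, angle_degrees, camera_x=0.0, camera_y=1.5):
    """Simplified stand-in (NOT the course implementation): first crossing of a
    straight camera ray with a piecewise-linear road profile.

    Angle is measured from horizontal; negative means downward.
    Returns (None, None, None) when no crossing exists.
    """
    if len(x_road) < 2:
        return None, None, None
    angle_rad = np.deg2rad(angle_degrees)
    if abs(np.cos(angle_rad)) < 1e-12:  # vertical ray: no slope form
        return None, None, None
    slope = np.tan(angle_rad)
    for i in range(len(x_road) - 1):
        x1, x2 = float(x_road[i]), float(x_road[i + 1])
        y1, y2 = float(y_road[i]), float(y_road[i + 1])
        seg_slope = (y2 - y1) / (x2 - x1)
        denom = slope - seg_slope
        if abs(denom) < 1e-12:  # ray parallel to this segment
            continue
        # Solve: camera_y + slope*(x - camera_x) == y1 + seg_slope*(x - x1)
        x = (y1 - camera_y + slope * camera_x - seg_slope * x1) / denom
        if x1 <= x <= x2 and x >= camera_x:
            y = camera_y + slope * (x - camera_x)
            return x, y, float(np.hypot(x - camera_x, y - camera_y))
    return None, None, None


def test_contract_holds_for_downward_angles():
    """Contract only: intersection exists, lies on the road, distance positive."""
    x_road = np.array([0.0, 10.0, 20.0])
    y_road = np.array([0.0, 2.0, 4.0])
    for angle in (-5.0, -10.0, -20.0, -45.0):
        x, y, dist = find_intersection(x_road, y_road, angle, 0.0, 2.0)
        assert x is not None
        assert 0.0 <= x <= 20.0
        assert np.isclose(y, np.interp(x, x_road, y_road))  # point is ON the road
        assert dist > 0.0
```

Nothing in the test cares whether the search is linear, binary, or vectorized - only the contract.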


5.5.5 Four Categories of Code Changes (Google’s Framework)

Google categorizes production code changes and how tests should respond:

| Change Type | Example | Should Tests Change? | Why |
|---|---|---|---|
| Pure Refactoring | Optimize find_intersection() algorithm | ❌ NO | Tests verify behavior remains constant |
| New Features | Add find_all_intersections() function | ✅ Add new tests only | Existing tests stay unchanged |
| Bug Fixes | Fix crash on empty arrays | ✅ Add new test case | Test the bug to prevent regression |
| Behavior Changes | Return list of intersections instead of first | ✅ Modify existing tests | Contract changed, tests must reflect new behavior |

The ideal: Tests only change when behavior changes (category 4). All other changes should leave tests untouched!
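To make the bug-fix row concrete, a hedged sketch: suppose the empty-array crash was fixed with an early guard. The regression test below pins the fixed behavior; `find_intersection_safe` is a simplified stand-in, not the course’s actual code:

```python
import numpy as np


def find_intersection_safe(x_road, y_road, angle_degrees,
                           camera_x=0.0, camera_y=1.5):
    """Simplified stand-in illustrating only the guard added by the bug fix."""
    if len(x_road) < 2:  # the fix: empty or single-point roads can't intersect
        return None, None, None
    # ... the real intersection search would follow here
    return None, None, None


def test_empty_road_returns_none_instead_of_crashing():
    """Regression test (category 3): pins the fixed empty-array behavior."""
    empty = np.array([], dtype=np.float64)
    x, y, dist = find_intersection_safe(empty, empty, -10.0)
    assert x is None and y is None and dist is None
```

Note that the existing tests stay untouched; the fix only adds this new case.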


5.5.6 Real-World Example: Refactoring Scenario

Scenario: You want to optimize find_intersection() by using a faster algorithm.

Before (current implementation):

def find_intersection(x_road, y_road, angle_degrees, camera_x=0, camera_y=1.5):
    """Find intersection using linear search"""
    angle_rad = -np.deg2rad(angle_degrees)

    # Check vertical
    if np.abs(np.cos(angle_rad)) < 1e-10:
        return None, None, None

    slope = np.tan(angle_rad)

    # Linear search through segments
    for i in range(len(x_road) - 1):
        x1, y1 = x_road[i], y_road[i]
        x2, y2 = x_road[i + 1], y_road[i + 1]
        # ... intersection calculation

After (optimized implementation):

def find_intersection(x_road, y_road, angle_degrees, camera_x=0, camera_y=1.5):
    """Find intersection using binary search (10x faster!)"""
    angle_rad = -np.deg2rad(angle_degrees)

    # Different vertical check using sine
    if np.abs(np.sin(angle_rad - np.pi/2)) < 1e-10:  # Changed!
        return None, None, None

    # Use atan2 instead of tan (more numerically stable)
    direction = np.array([np.cos(angle_rad), np.sin(angle_rad)])  # Changed!

    # Binary search through segments (faster for large roads)
    left, right = 0, len(x_road) - 1
    # ... binary search logic (completely different!)

What happens to tests?

❌ Brittle tests (testing implementation):

def test_find_intersection_uses_tan():
    with patch('numpy.tan') as mock_tan:
        find_intersection(...)
        assert mock_tan.called  # ❌ FAILS! We use atan2 now!

✅ Robust tests (testing behavior):

def test_find_intersection_returns_correct_position():
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert 0 <= x <= 20  # ✅ PASSES! Still finds correct intersection!

Result:

  - The brittle test fails and must be rewritten, even though nothing is broken
  - The robust test passes and confirms the refactoring preserved behavior

This is the difference between productive testing and test hell!


5.5.7 Practical Guidelines for Maintainable Tests

✅ DO:

  1. Test public API only - Call functions the same way users would
  2. Test final outcomes - Assert on return values, not internal state
  3. Use real objects when fast - Avoid mocking NumPy, math functions
  4. Test behaviors, not methods - One test per behavior (not one per function)
  5. Expect tests to be stable - Good tests only change when requirements change

❌ DON’T:

  1. Mock internal implementation - Don’t mock np.tan() in your own code
  2. Assert on private state - Don’t check internal variables
  3. Test method call sequences - Don’t verify tan was called before cos
  4. Couple tests to algorithm - Don’t assume linear vs. binary search
  5. Test performance in unit tests - Speed is separate from correctness
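Point 4 (test behaviors, not methods) in practice: several small, named tests instead of one catch-all test per function. `generate_straight_road` is a trivial stand-in (not the course’s road.py) so the snippet runs on its own:

```python
import numpy as np


def generate_straight_road(num_points: int = 100, x_max: float = 80.0):
    """Trivial stand-in: a flat, evenly spaced road profile."""
    x = np.linspace(0.0, x_max, num_points)
    y = np.zeros_like(x)
    return x, y


# One behavior per test - each name tells you exactly WHAT broke,
# and none of them care HOW the road is generated.
def test_road_starts_at_origin():
    x, y = generate_straight_road()
    assert x[0] == 0.0 and y[0] == 0.0


def test_road_x_is_strictly_increasing():
    x, _ = generate_straight_road()
    assert np.all(np.diff(x) > 0)


def test_road_arrays_have_matching_lengths():
    x, y = generate_straight_road(num_points=50)
    assert len(x) == len(y) == 50
```

When one behavior regresses, exactly one test fails, which keeps debugging fast.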

5.5.8 Summary: Maintainable vs. Brittle Tests

| Aspect | Maintainable Tests | Brittle Tests |
|---|---|---|
| What they test | Public behavior (WHAT) | Implementation details (HOW) |
| Assertion style | `assert x is not None` | `mock_tan.assert_called()` |
| Refactoring impact | Tests still pass ✅ | Tests break ❌ |
| Developer experience | "Tests just work!" | "Ugh, fix tests again..." |
| When they fail | Real bug found | Often false alarm |

Key takeaway: The best tests are the ones you never have to touch until a real bug appears. Test WHAT your code does, not HOW it does it!


6. Part 6: Applying Feature Branch Workflow

Let’s add these tests using the professional workflow from Chapter 02 (Feature Development)!

6.1 Step 1: Create Feature Branch

$ git checkout main
$ git pull origin main
$ git checkout -b feature/add-unit-tests

6.2 Step 2: Create Test Files

$ mkdir -p tests
$ touch tests/__init__.py
$ touch tests/test_geometry.py
$ touch tests/test_road.py
$ touch tests/test_integration.py

6.3 Step 3: Write Tests (Using LLM Assistance)

Use Copilot/Claude to generate initial tests, then refine:

# tests/test_geometry.py
# (See examples from Part 5 above)

# tests/test_road.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.road import generate_road_profile


class TestGenerateRoadProfile:
    def test_default_parameters(self) -> None:
        """Test road generation with default parameters."""
        x: NDArray[np.float64]
        y: NDArray[np.float64]
        x, y = generate_road_profile()
        assert len(x) == 100  # Default num_points
        assert x[0] == 0.0
        assert y[0] == 0.0  # Road starts at origin

    def test_custom_num_points(self) -> None:
        """Test road generation with custom number of points."""
        x, y = generate_road_profile(num_points=50)
        assert len(x) == 50
        assert len(y) == 50

    def test_road_is_continuous(self) -> None:
        """Verify generated road has no gaps."""
        x, y = generate_road_profile(num_points=100)
        assert np.all(np.diff(x) > 0), "x should be strictly increasing"

6.4 Step 4: Run Tests Locally

$ uv run pytest tests/ -v
============================= test session starts ==============================
tests/test_geometry.py::TestFindIntersection::test_normal_downward_angle PASSED
tests/test_geometry.py::TestFindIntersection::test_empty_road_arrays PASSED
tests/test_road.py::TestGenerateRoadProfile::test_default_parameters PASSED
...
============================== 15 passed in 0.25s ===============================

6.5 Step 5: Commit Your Tests

$ git add tests/
$ git commit -m "Add unit tests for geometry and road modules

- test_geometry.py: 10 tests covering equivalence classes and boundaries
- test_road.py: 5 tests for road generation
- test_integration.py: 3 integration tests
- All tests use AAA pattern (Arrange-Act-Assert)
- Tests generated with LLM assistance, refined by human review

Total: 18 tests, all passing"

6.6 Step 6: Push and Create PR

$ git push -u origin feature/add-unit-tests

$ gh pr create --title "Add unit tests for geometry and road modules" \
  --body "Adds comprehensive test suite using pytest.

**Coverage:**
- geometry.py: 10 unit tests (equivalence classes + boundaries)
- road.py: 5 unit tests
- Integration: 3 tests verifying modules work together

**Testing approach:**
- Used LLM to generate initial test structure
- Human review to fix assertions and add edge cases
- All tests pass locally

**Next steps:**
- Chapter 03 (TDD and CI) will add CI integration
- TDD workflow for new features"

6.7 Step 7: CI Will Run (in Chapter 03)

In the next lecture, we’ll update CI to run tests automatically!


7. Summary: What You’ve Accomplished

7.1 Before This Lecture

✅ Ruff (style)
✅ Pyright (types)
❌ No tests → Logic bugs slip through

7.2 After This Lecture

✅ Ruff (style)
✅ Pyright (types)
✅ Pytest (correctness) → 18 tests catch bugs!

7.3 Key Concepts You’ve Mastered

1. Testing Pyramid

  - 70% unit, 20% integration, 10% E2E - not the inverted test cone

2. Unit Testing Skills

  - AAA pattern, equivalence classes, and boundary value analysis
  - Descriptive test names that tell you exactly what broke

3. LLM-Assisted Testing

  - LLMs generate boilerplate; humans verify and refine
  - Watch out for over-mocking: test state, not interactions

4. Practical Skills

  - pytest, integration tests, and the feature branch workflow applied to tests

7.4 The Difference Tests Make

Without tests:

Developer: *Changes find_intersection()*
Developer: *Manually tests by clicking UI*
Developer: "Looks good!" *Pushes to main*
User: *Discovers bug with vertical angle* "App crashed!"

With tests:

Developer: *Changes find_intersection()*
Developer: $ pytest tests/
Developer: "❌ Test failed! Bug caught before commit"
Developer: *Fixes bug, tests pass*
Developer: *Pushes with confidence*
User: *Everything works!*

8. Key Takeaways

Remember these principles:

  1. Code quality ≠ Code correctness - Ruff catches style, tests catch bugs
  2. Testing pyramid, not cone - More unit tests, fewer E2E tests
  3. Equivalence classes + boundaries - Test smart, not exhaustive
  4. AAA pattern - Arrange, Act, Assert for readable tests
  5. LLMs assist, humans verify - Use LLMs for boilerplate, not correctness
  6. Fast tests = frequent testing - Unit tests run in milliseconds
  7. Tests give confidence - Refactor without fear
  8. Feature branch workflow applies to tests - Tests are a feature too!

You’re now ready for Chapter 03 (TDD and CI): Test-Driven Development!

In the next lecture, we’ll flip the script: write tests BEFORE code (TDD), integrate tests into CI, and make failing tests block merges.


9. Further Reading

On Testing:

On the Testing Pyramid:

On Equivalence Classes:

Interactive Learning:


Last Updated: 2025-11-04
Prerequisites: Lectures 1-4 (especially Chapter 02 (Refactoring) - Modular Code and Chapter 03 (Testing Basics))
Next Lecture: Chapter 03 (TDD and CI) - Test-Driven Development & CI Integration

© 2026 Dominik Mueller