03 Testing Fundamentals: Boundary Analysis and LLM-Assisted Testing
November 2025 (19548 Words, 109 Minutes)
1. Introduction: Welcome Back to Testing - From Theory to Practice
Welcome back! In Part 1 of Chapter 03 (Testing Fundamentals), you learned the fundamentals of software testing:
What We Covered in Part 1:
- Why Testing Matters
  - Code quality (Ruff, Pyright) ≠ Code correctness (tests)
  - CI can be green, but your code can still have critical bugs
  - Tests catch logic errors that static analysis tools miss
- The Testing Pyramid
  - 70% Unit tests (fast, focused, cheap)
  - 20% Integration tests (moderate speed)
  - 10% E2E tests (slow, expensive)
  - Not the inverted “test cone” anti-pattern!
- Unit Testing Fundamentals
  - AAA Pattern: Arrange-Act-Assert
  - Why pytest is the industry standard
  - Writing your first unit test
  - The impossibility of exhaustive testing
- Clean Code Principles for Testing
  - One concept per test
  - One assert per test (with pragmatic exceptions)
  - Descriptive test names that tell you what’s broken
- Equivalence Classes
  - Simple floats → 5 equivalence classes
  - Multiple parameters → 10+ classes (complexity explosion!)
  - Array inputs → Structural + Value dimensions
  - Complex functions like find_intersection() → 600+ potential combinations!
The Challenge We Left You With:
You now know WHY testing matters and HOW to write basic unit tests. But you’re probably thinking:
- “How do I know if I’ve tested enough?”
- “What about those tricky edge cases at the boundaries?”
- “Writing 20-30 tests for one function sounds tedious!”
- “Can I use AI to help me write tests faster?”
Today’s Mission: From Testing Basics to Testing Mastery
In Part 2, we’re going to solve these problems with three powerful techniques:
- Boundary Value Analysis - Where bugs actually hide (not in the middle of equivalence classes!)
- LLM-Assisted Testing - Break the “test cone” by using AI for boilerplate (while keeping human oversight)
- Integration Testing & Workflow - Make testing a natural part of your development process
What Makes Part 2 Different:
Part 1 was conceptual - understanding what tests are and why they matter.
Part 2 is practical - learning specific techniques to write better tests faster, and integrating testing into your real workflow.
By the end of today’s lecture, you’ll be able to:
- Identify boundary values systematically for any function
- Use LLMs to generate test boilerplate (while avoiding common AI pitfalls)
- Write integration tests that verify modules work together
- Apply feature branch workflow to add tests to your Road Profile Viewer
- Know when you have “enough” tests (realistic coverage, not perfection)
The Road Ahead:
✅ Part 1: Foundation (Why test? How to write basic unit tests?)
→ Part 2: Mastery (Where do bugs hide? How to test efficiently?)
→ Chapter 03 (TDD & CI): Write tests FIRST, automate everything
Let’s dive into the techniques that separate beginners from professionals: boundary value analysis and LLM-assisted testing!
2. Boundary Value Analysis: Where Bugs Hide
Observation: Bugs often lurk at the boundaries between equivalence classes.
Why boundaries? Off-by-one errors, floating-point precision, edge cases in conditional logic.
Let’s apply boundary analysis to all our examples.
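To see why boundaries matter, here is a minimal sketch (the `is_valid_speed` validator is a hypothetical example, not from our project) of an off-by-one bug that mid-class test values never expose:

```python
def is_valid_speed(speed: float) -> bool:
    """Intended range: 0 to 130 km/h INCLUSIVE - but '<' should be '<='."""
    return 0 <= speed < 130  # BUG: rejects the legal boundary value 130

print(is_valid_speed(65.0))   # True  - a mid-class test never notices
print(is_valid_speed(130.0))  # False - only the boundary test exposes the bug
```

A test suite that only samples the middle of each equivalence class (say, 65.0) passes forever; one boundary test at 130.0 finds the bug immediately.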
2.1 Boundaries for Simple Float Example: reciprocal(x)
Function: reciprocal(x) = 1/x
Equivalence Classes and Their Boundaries:
| Equivalence Class | Boundary Values to Test |
|---|---|
| Positive numbers | x = 0.0001 (near zero), x = 1.0, x = 1000000.0 (very large) |
| Negative numbers | x = -0.0001 (near zero), x = -1.0, x = -1000000.0 |
| Zero | x = 0.0, x = 1e-100 (extremely close to zero) |
| Extreme boundaries | sys.float_info.max, sys.float_info.min, float('inf'), float('nan') |
Python Float Boundaries (Similar to C/C++ FLT_MAX, DBL_MIN):
Python floats are 64-bit IEEE 754 doubles.
What is IEEE 754?
IEEE 754 is the international standard for floating-point arithmetic, defining how computers represent and compute with real numbers. It specifies:
- How numbers are stored in binary (sign, exponent, mantissa)
- Special values (infinity, NaN)
- Rounding behavior
- Operations (+, -, ×, ÷)
Why this matters for testing: Understanding IEEE 754 boundaries helps you write tests for edge cases that occur at the limits of numerical precision. All modern programming languages (Python, C, C++, Java, JavaScript) follow this standard.
Official References:
- IEEE 754 Standard: https://standards.ieee.org/standard/754-2019.html
- Wikipedia (Good overview): https://en.wikipedia.org/wiki/IEEE_754
- What Every Computer Scientist Should Know About Floating-Point Arithmetic: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
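To make the sign/exponent/mantissa layout concrete, you can inspect the raw 64 bits of a Python float with the standard struct module. A small sketch (the `float_bits` helper is ours, not part of the standard library):

```python
import struct

def float_bits(x: float) -> str:
    """Return the 64 IEEE-754 bits of x, grouped as 'sign exponent mantissa'."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret the 8 bytes as uint64
    bits = f"{as_int:064b}"
    return f"{bits[0]} {bits[1:12]} {bits[12:]}"  # 1 sign + 11 exponent + 52 mantissa bits

print(float_bits(1.0))           # sign 0, exponent 01111111111 (the bias, 1023), mantissa all zeros
print(float_bits(-1.0))          # only the sign bit flips
print(float_bits(float("inf")))  # exponent all ones, mantissa all zeros
```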
Python provides the sys.float_info module to access platform-specific float limits defined by IEEE 754:
| Python Constant | C/C++ Equivalent | Typical Value | Description |
|---|---|---|---|
| sys.float_info.max | DBL_MAX | 1.7976931348623157e+308 | Maximum representable finite float |
| sys.float_info.min | DBL_MIN | 2.2250738585072014e-308 | Minimum positive normalized float |
| sys.float_info.epsilon | DBL_EPSILON | 2.220446049250313e-16 | Difference between 1.0 and next representable float |
| math.inf or float('inf') | INFINITY | inf | Positive infinity (non-finite value) |
| -math.inf or float('-inf') | -INFINITY | -inf | Negative infinity (non-finite value) |
| math.nan or float('nan') | NAN | nan | Not a Number (undefined result) |
| math.ulp(0.0) | — | ≈ 5e-324 | Smallest positive subnormal (denormalized) float |
| math.ulp(x) | — | Varies | Unit in the Last Place (spacing to next representable float) |
Critical Distinction: math.inf vs. sys.float_info.max
These are fundamentally different and test different aspects of your code:
math.inf (or float('inf')) - IEEE-754 Positive Infinity
- NOT a number you can store in a finite mantissa/exponent
- A special value that compares larger than every finite float
- Propagates through many operations: inf + 1 = inf, inf * 2 = inf
- Result of overflow: sys.float_info.max * 2 = inf
- Use to test: how your code handles infinity (sentinel logic, clamping, division-by-zero results)
sys.float_info.max (≈ 1.8e308) - Largest Finite Float
- The largest finite IEEE-754 double
- The “upper boundary” before you overflow to infinity
- Use to test: finite range limits (overflow/underflow edges)
Example illustrating the difference:
import sys
import math
# These are NOT the same!
print(math.inf > sys.float_info.max) # True - infinity is bigger than any finite
print(sys.float_info.max * 2) # inf - overflow!
print(1 / math.inf) # 0.0 - reciprocal of infinity
print(1 / sys.float_info.max) # ≈ 5.6e-309 - tiny but finite!
Important distinctions:
- sys.float_info.min is NOT the most negative number (that would be -sys.float_info.max)
- It’s the smallest positive normalized number (closest to zero without being denormalized)
- Similar to how C’s DBL_MIN works
- There is NO sys.float_info.min_negative - just use -sys.float_info.max
Other Useful Finite Boundaries (Often Overlooked):
| Boundary Type | Python Expression | Typical Value | When to Test |
|---|---|---|---|
| Smallest positive normalized | sys.float_info.min | ≈ 2.2e-308 | Testing underflow thresholds, relative error near zero |
| Smallest positive subnormal | math.ulp(0.0) | ≈ 5e-324 | Catching denormal handling, flush-to-zero surprises |
| Unit in Last Place (ULP) | math.ulp(x) | Varies by x | Checking "next representable" values in precision-sensitive code |
| Next float toward direction | math.nextafter(x, y) | Varies | Razor-edge boundaries (e.g., stepping from max to infinity) |
| Most negative finite | -sys.float_info.max | ≈ -1.8e308 | Lower bound before overflow to -inf |
Practical Guidance: Which Should I Use in Tests?
To test infinity handling (non-finite values):
import math
# Use math.inf (preferred for readability)
def test_reciprocal_handles_infinity():
result = reciprocal(math.inf)
assert result == 0.0 # 1/inf = 0
To test finite range limits (overflow/underflow edges):
import sys
def test_reciprocal_handles_max_float():
result = reciprocal(sys.float_info.max)
# Reciprocal of largest finite → extremely small
assert result > 0
assert result < sys.float_info.min # Might underflow to subnormal
To test precision limits (ULP testing):
import math
def test_reciprocal_precision_at_boundary():
# Test the "next" representable value after 1.0
x = math.nextafter(1.0, 2.0) # Slightly larger than 1.0
result = reciprocal(x)
# Should be slightly less than 1.0
assert result < 1.0
Tip for NumPy Users:
If your code uses NumPy arrays, use np.finfo() for consistency with the dtype under test:
import numpy as np
# For float64 (Python float equivalent)
fi = np.finfo(np.float64)
print(fi.max) # Largest finite (≈ 1.8e308)
print(fi.tiny) # Smallest positive normalized (≈ 2.2e-308)
print(fi.eps) # Machine epsilon (≈ 2.2e-16)
# Use with nextafter
x_next_to_max = np.nextafter(fi.max, np.inf) # → inf
# For float32 (if working with single precision)
fi32 = np.finfo(np.float32)
print(fi32.max) # ≈ 3.4e38 (much smaller than float64!)
This keeps everything consistent with the NumPy dtype you’re actually using.
Comprehensive Boundary Test Suite:
import sys
import math
import pytest
# Define constants for readability (following best practices)
INF = math.inf
NINF = -math.inf
NAN = math.nan
FMAX = sys.float_info.max
FMIN_NORM_POS = sys.float_info.min
FMIN_SUB_POS = math.ulp(0.0)
EPSILON = sys.float_info.epsilon
class TestReciprocalBoundaries:
"""Comprehensive boundary tests for reciprocal(x) function."""
# BASIC BOUNDARIES (near zero, typical values)
def test_reciprocal_boundary_near_zero_positive(self):
"""Boundary: Very small positive number"""
result = reciprocal(1e-6) # 0.000001
assert result == pytest.approx(1e6, rel=1e-9) # Should be 1000000
def test_reciprocal_boundary_near_zero_negative(self):
"""Boundary: Very small negative number"""
result = reciprocal(-1e-6)
assert result == pytest.approx(-1e6, rel=1e-9)
def test_reciprocal_boundary_exactly_zero(self):
"""Boundary: Exactly zero (should raise exception)"""
with pytest.raises(ZeroDivisionError):
reciprocal(0.0)
def test_reciprocal_boundary_very_large_positive(self):
"""Boundary: Very large positive number"""
result = reciprocal(1e10)
assert result == pytest.approx(1e-10, rel=1e-9)
# FINITE RANGE BOUNDARIES (sys.float_info limits)
def test_reciprocal_boundary_max_finite_float(self):
"""Boundary: Largest finite float (sys.float_info.max ≈ 1.8e308)
Tests overflow prevention at upper finite boundary.
Reciprocal of max finite → extremely small result (may underflow to subnormal).
"""
result = reciprocal(FMAX)
assert result > 0, "Reciprocal of max float should be positive"
assert result < FMIN_NORM_POS, "Result underflows below normalized range (subnormal)"
# Result is approximately 5.6e-309 (subnormal/denormalized)
def test_reciprocal_boundary_min_normalized_float(self):
"""Boundary: Smallest positive normalized float (sys.float_info.min ≈ 2.2e-308)
Tests behavior at the underflow threshold.
Note: 1 / sys.float_info.min ≈ 4.5e307, which is still FINITE
(below sys.float_info.max ≈ 1.8e308) - it does NOT overflow to infinity.
Only the reciprocal of a small enough subnormal overflows.
"""
result = reciprocal(FMIN_NORM_POS)
assert math.isfinite(result), "Reciprocal of min normalized float is large but finite"
assert result == pytest.approx(1.0 / FMIN_NORM_POS)
def test_reciprocal_boundary_min_subnormal_float(self):
"""Boundary: Smallest positive subnormal float (math.ulp(0.0) ≈ 5e-324)
Tests denormalized number handling.
Reciprocal of smallest representable → massive overflow to infinity.
"""
result = reciprocal(FMIN_SUB_POS)
assert result == INF, "Reciprocal of min subnormal overflows to infinity"
def test_reciprocal_boundary_epsilon(self):
"""Boundary: Machine epsilon (sys.float_info.epsilon ≈ 2.2e-16)
Tests precision limit behavior.
Epsilon is smallest value where 1.0 + epsilon != 1.0
"""
result = reciprocal(EPSILON)
expected = 1.0 / EPSILON # ≈ 4.5e15
assert result == pytest.approx(expected, rel=1e-9)
def test_reciprocal_boundary_most_negative_finite(self):
"""Boundary: Most negative finite float (-sys.float_info.max ≈ -1.8e308)
Tests lower finite boundary (there is no sys.float_info.min_negative).
"""
result = reciprocal(-FMAX)
assert result < 0, "Reciprocal of most negative finite should be negative"
assert result > -FMIN_NORM_POS, "Result is tiny negative (subnormal range)"
# NON-FINITE BOUNDARIES (infinities and NaN)
def test_reciprocal_boundary_positive_infinity(self):
"""Boundary: Positive infinity (non-finite value)
Tests special value handling (not overflow prevention).
1 / inf = 0 (mathematical limit)
"""
result = reciprocal(INF)
assert result == 0.0, "Reciprocal of positive infinity should be zero"
def test_reciprocal_boundary_negative_infinity(self):
"""Boundary: Negative infinity (non-finite value)
1 / -inf = -0.0 (negative zero)
"""
result = reciprocal(NINF)
# IEEE 754: -0.0 == 0.0 compares True, so check the sign explicitly
assert result == 0.0, "Reciprocal of -inf should be (negative) zero"
assert math.copysign(1.0, result) == -1.0, "Result should be negative zero"
def test_reciprocal_boundary_nan(self):
"""Boundary: Not a Number (undefined/invalid input)
NaN propagates through operations (doesn't crash).
Important: nan == nan is always False! Use math.isnan().
"""
result = reciprocal(NAN)
assert math.isnan(result), "Reciprocal of NaN should be NaN (propagates)"
# RAZOR-EDGE BOUNDARIES (using math.nextafter)
def test_reciprocal_boundary_next_after_max_toward_inf(self):
"""Boundary: Next float after max (toward infinity)
Uses math.nextafter() to test the exact transition point.
Next float after FMAX toward infinity IS infinity.
"""
x_after_max = math.nextafter(FMAX, INF)
assert x_after_max == INF, "Next float after max toward inf is infinity"
result = reciprocal(x_after_max)
assert result == 0.0, "Reciprocal of inf is 0"
def test_reciprocal_boundary_next_after_one_toward_two(self):
"""Boundary: Next representable float after 1.0
Tests ULP (Unit in Last Place) sensitivity.
1.0 + epsilon is the next representable value after 1.0
"""
x_just_above_one = math.nextafter(1.0, 2.0)
result = reciprocal(x_just_above_one)
# Reciprocal should be slightly less than 1.0
assert result < 1.0, f"Expected < 1.0, got {result}"
assert result == pytest.approx(1.0, rel=1e-14), "Should be very close to 1.0"
def test_reciprocal_boundary_ulp_sensitivity(self):
"""Boundary: Testing ULP (Unit in Last Place) around a value
Demonstrates floating-point granularity.
"""
x = 1000.0
ulp_at_x = math.ulp(x) # Spacing at 1000.0
result1 = reciprocal(x)
result2 = reciprocal(x + ulp_at_x) # Next representable value
assert result1 != result2, "Adjacent floats should produce different reciprocals"
assert result1 > result2, "Larger input → smaller reciprocal"
Introducing pytest’s @pytest.mark.parametrize - Avoid Test Duplication
So far, we’ve written separate test functions for each boundary. But what if you need to test the same logic with multiple input values?
The Problem: Repetitive Tests
# ❌ TEDIOUS: Writing 6 nearly-identical tests
def test_reciprocal_max_float():
result = reciprocal(FMAX)
assert result > 0 and result < FMIN_NORM_POS
def test_reciprocal_negative_max_float():
result = reciprocal(-FMAX)
assert result < 0 and result > -FMIN_NORM_POS
def test_reciprocal_min_norm_float():
result = reciprocal(FMIN_NORM_POS)
assert result == INF
# ... 3 more similar tests ...
Problems with this approach:
- Code duplication (same test logic repeated)
- Hard to maintain (change logic → update 6 tests)
- Adds clutter (6 test functions for one concept)
- If one fails, you don’t know how many others would fail
The Solution: Parametrized Tests
pytest provides @pytest.mark.parametrize to run the same test function with different input data:
# ✅ CLEAN: One test function, multiple inputs
@pytest.mark.parametrize("x, expected_behavior", [
(FMAX, "underflow to subnormal"),
(-FMAX, "underflow to negative subnormal"),
(FMIN_NORM_POS, "large but finite"), # 1 / 2.2e-308 ≈ 4.5e307, still below FMAX
(FMIN_SUB_POS, "overflow to infinity"),
(INF, "exact zero"),
(NINF, "exact zero"),
])
def test_reciprocal_extreme_boundaries_summary(x, expected_behavior):
"""Parametrized test covering all extreme boundaries with descriptions."""
result = reciprocal(x)
if "subnormal" in expected_behavior:
assert math.isfinite(result), f"{expected_behavior}: result should be finite"
assert abs(result) < FMIN_NORM_POS, f"{expected_behavior}: should be in subnormal range"
elif "large but finite" in expected_behavior:
assert math.isfinite(result), f"{expected_behavior}: result should stay finite"
assert result > 1e307, f"{expected_behavior}: result should be huge but not inf"
elif "overflow to infinity" in expected_behavior:
assert result == INF, f"{expected_behavior}: should overflow to infinity"
elif "exact zero" in expected_behavior:
assert result == 0.0, f"{expected_behavior}: should be exactly zero"
How it works:
- Decorator syntax: @pytest.mark.parametrize("param1, param2", [...]) - the first argument names the parameters (comma-separated string), the second is a list of tuples (one tuple per test case)
- The test function receives the parameters: def test_function(param1, param2): - pytest injects each tuple’s values into the function
- pytest runs the test multiple times, once per tuple in the list - each run is treated as a separate test, and the output shows which combination passed or failed
Example pytest output (test IDs simplified for readability - pytest prints the parameter values, not the constant names):
$ pytest test_boundaries.py::test_reciprocal_extreme_boundaries_summary -v
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMAX-underflow to subnormal] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[-FMAX-underflow to negative subnormal] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMIN_NORM_POS-large but finite] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[FMIN_SUB_POS-overflow to infinity] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[INF-exact zero] PASSED
test_boundaries.py::test_reciprocal_extreme_boundaries_summary[NINF-exact zero] PASSED
================================== 6 passed in 0.03s ==================================
Key benefits:
- ✅ DRY principle: Test logic written once, reused for all cases
- ✅ Easy to add cases: Just add a tuple to the list
- ✅ Clear output: Each combination is a separate test result
- ✅ Maintainable: Change logic in one place
- ✅ Readable: Parameter names document what’s being tested
When to use parametrize:
✅ DO use when:
- Testing the same function with multiple inputs
- Equivalence classes have the same validation logic
- Boundary values follow the same pattern
❌ DON’T use when:
- Each test needs different assertions (use separate test functions)
- Test setup/teardown differs per case (use fixtures instead)
- Tests become hard to read due to complex conditionals
More examples:
# Testing multiple equivalence classes with same validation
@pytest.mark.parametrize("angle", [-10, -20, -30, -45])
def test_find_intersection_downward_angles_all_succeed(angle):
"""Test that all downward angles find intersection"""
x, y, dist = find_intersection(x_road, y_road, angle)
assert x is not None
assert dist > 0
# Testing boundary values with expected results
@pytest.mark.parametrize("x, expected", [
(1.0, 1.0),
(2.0, 0.5),
(0.5, 2.0),
(10.0, 0.1),
])
def test_reciprocal_normal_values(x, expected):
"""Test reciprocal with normal values"""
result = reciprocal(x)
assert result == pytest.approx(expected, rel=1e-9)
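When the setup differs per case (the fixture point from the DON’T list above), a pytest fixture is usually the better tool than parametrize. A minimal sketch, reusing a small flat-road profile (the `make_flat_road` / `flat_road` names are ours, invented for illustration):

```python
import numpy as np
import pytest

def make_flat_road():
    """A small flat road at y = 2.0 (plain helper, easy to reuse outside pytest)."""
    return np.array([0.0, 10.0, 20.0]), np.array([2.0, 2.0, 2.0])

@pytest.fixture
def flat_road():
    """Shared Arrange step: pytest injects this into any test that names it."""
    return make_flat_road()

def test_flat_road_is_level(flat_road):
    x_road, y_road = flat_road
    assert np.all(y_road == y_road[0])  # every point at the same height
```

The fixture keeps the Arrange step in one place while each test keeps its own Act and Assert, which is exactly what parametrize cannot give you when per-case setup differs.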
Why Test These Extreme Boundaries?
| Boundary Type | What It Tests | Real-World Scenario |
|---|---|---|
| sys.float_info.max | Finite overflow prevention | Astronomical distances, cosmology calculations |
| sys.float_info.min | Underflow threshold behavior | Quantum physics, particle simulations |
| math.ulp(0.0) | Subnormal/denormal handling | High-precision scientific computing |
| sys.float_info.epsilon | Precision limits near 1.0 | Financial calculations, numerical stability |
| math.inf | Non-finite value handling | Division by zero results, sentinel values |
| math.nan | Invalid input propagation | Missing data in datasets, undefined operations |
| math.nextafter() | Razor-edge transitions | Verifying exact boundary behavior |
Key Insights:
- ✅ Test BOTH finite and non-finite boundaries: sys.float_info.max (finite) AND math.inf (non-finite) are different!
- ✅ Use constants for readability: Define FMAX, INF, etc. at module level
- ✅ Prefer math.inf over float('inf'): Better readability (same with math.nan)
- ✅ Use math.nextafter() for exact transitions: Test the exact point where behavior changes
- ✅ Python handles inf/nan gracefully: No crashes, but behavior may surprise you
- ✅ Use math.isnan() to check for NaN: nan == nan is always False!
- ✅ Document WHY each test matters: Future developers need context
Common Pitfall:
# ❌ WRONG: Testing only infinity
def test_reciprocal_large_values():
result = reciprocal(float('inf'))
assert result == 0.0
# Problem: This doesn't test finite overflow (sys.float_info.max)!
# A function could handle inf correctly but crash on FMAX.
# ✅ CORRECT: Test BOTH finite and non-finite boundaries
def test_reciprocal_max_finite():
result = reciprocal(sys.float_info.max) # Finite
assert result < sys.float_info.min # Underflow behavior
def test_reciprocal_infinity():
result = reciprocal(math.inf) # Non-finite
assert result == 0.0 # Special value handling
2.2 Boundaries for Multi-Parameter: reciprocal_sum(x, y, z)
Function: reciprocal_sum(x, y, z) = 1/(x+y+z)
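As before, the tests assume a direct implementation (a sketch: plain division, so a sum of exactly zero raises ZeroDivisionError):

```python
def reciprocal_sum(x: float, y: float, z: float) -> float:
    """Return 1/(x + y + z); raises ZeroDivisionError when the sum is exactly 0.0."""
    return 1.0 / (x + y + z)

print(reciprocal_sum(0.0, 0.0, 0.5))  # 2.0
```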
Boundaries to test:
| Boundary Condition | Test Values | Why Important |
|---|---|---|
| Sum exactly zero | (1.0, -0.5, -0.5) | Division by zero |
| Sum near zero (positive) | (1.0, -0.9999, 0.0) | Large positive result |
| Sum near zero (negative) | (-1.0, 0.9999, 0.0) | Large negative result |
| One parameter zero | (0.0, 1.0, 1.0) | Doesn't affect sum |
| Two parameters zero | (0.0, 0.0, 2.0) | Only one matters |
| All parameters zero | (0.0, 0.0, 0.0) | Division by zero |
Boundary Tests:
def test_reciprocal_sum_boundary_sum_exactly_zero():
"""Boundary: Sum cancels exactly to zero"""
with pytest.raises(ZeroDivisionError):
reciprocal_sum(1.0, -0.5, -0.5)
def test_reciprocal_sum_boundary_sum_near_zero_positive():
"""Boundary: Sum is very small positive"""
result = reciprocal_sum(1.0, -0.9999, 0.0) # sum = 0.0001
assert result == pytest.approx(10000.0, rel=1e-5)
def test_reciprocal_sum_boundary_all_zeros():
"""Boundary: All parameters zero"""
with pytest.raises(ZeroDivisionError):
reciprocal_sum(0.0, 0.0, 0.0)
def test_reciprocal_sum_boundary_two_zeros():
"""Boundary: Two parameters zero, one non-zero"""
result = reciprocal_sum(0.0, 0.0, 0.5) # sum = 0.5
assert result == pytest.approx(2.0, rel=1e-10)
2.3 Boundaries for Array Example: array_sum(arr)
Function: array_sum(arr) - sum of array elements
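The tests below assume an implementation that rejects empty input, along these lines (a sketch):

```python
import numpy as np

def array_sum(arr: np.ndarray) -> float:
    """Return the sum of the array's elements; empty input raises ValueError."""
    if arr.size == 0:
        raise ValueError("array_sum() requires a non-empty array")
    return float(np.sum(arr))

print(array_sum(np.array([3.0, 4.0])))  # 7.0
```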
Structural Boundaries:
| Boundary | Test Case |
|---|---|
| Empty array | len(arr) = 0 |
| Single element | len(arr) = 1 |
| Two elements | len(arr) = 2 (smallest non-trivial case) |
Value Boundaries:
| Boundary | Test Case |
|---|---|
| All zeros | [0.0, 0.0, 0.0] |
| Contains one zero | [0.0, 1.0, 2.0] |
| All same value | [5.0, 5.0, 5.0] |
| Alternating signs near zero | [1e-10, -1e-10, 1e-10] |
Boundary Tests:
def test_array_sum_boundary_empty_array():
"""Boundary: Empty array (length = 0)"""
with pytest.raises(ValueError):
array_sum(np.array([]))
def test_array_sum_boundary_single_element():
"""Boundary: Single element (length = 1)"""
result = array_sum(np.array([5.0]))
assert result == 5.0
def test_array_sum_boundary_two_elements():
"""Boundary: Two elements (minimum non-trivial)"""
result = array_sum(np.array([3.0, 4.0]))
assert result == 7.0
def test_array_sum_boundary_all_zeros():
"""Boundary: All elements are zero"""
result = array_sum(np.array([0.0, 0.0, 0.0]))
assert result == 0.0
def test_array_sum_boundary_alternating_near_zero():
"""Boundary: Values near zero with alternating signs"""
result = array_sum(np.array([1e-10, -1e-10, 1e-10]))
assert abs(result - 1e-10) < 1e-15 # Sum should be close to 1e-10
2.3.1 What About Large Arrays?
Great question! You might be wondering: “We tested 0, 1, 2 elements… but what about 1000 or 1,000,000 elements?”
The Short Answer: Unit tests should generally use small, representative arrays (typically 10-100 elements), not massive ones. Large arrays belong in performance tests, not unit tests.
Why Not Test with Huge Arrays in Unit Tests?
| Problem | Impact | Example |
|---|---|---|
| Slow tests | Unit tests should run in milliseconds. Large arrays slow them to seconds. | Array of 1 million → 100x slower test suite |
| Doesn't find more bugs | If your code works with 10 elements, it usually works with 10,000. | Summation logic doesn't change with size |
| Noise in test output | Hard to debug failures with massive data | "Array mismatch at index 47,293" - good luck! |
| Memory usage | CI servers might run out of memory | 10 tests × 1M floats × 8 bytes = 80 MB per test run |
When DO You Need Large Arrays?
✅ DO test large arrays when:
- Algorithm complexity matters (e.g., O(n²) vs O(n))

def test_sort_performance_scales_linearly():
    """Verify sorting doesn't degrade to O(n²)"""
    import time
    # Small array
    arr_small = np.random.rand(100)
    start = time.time()
    sort_function(arr_small)
    time_small = time.time() - start
    # Large array (10x bigger)
    arr_large = np.random.rand(1000)
    start = time.time()
    sort_function(arr_large)
    time_large = time.time() - start
    # O(n log n) should be ~15x slower (10 * log(1000)/log(100) = 15)
    # NOT 100x slower (which would be O(n²))
    assert time_large < time_small * 20, "Sorting appears to be O(n²)!"

- Memory allocation bugs (buffer overflows, off-by-one at specific sizes)

def test_array_processing_handles_power_of_two_sizes():
    """Test sizes like 256, 512, 1024 (common buffer boundaries)"""
    for size in [256, 512, 1024, 2048]:
        arr = np.random.rand(size)
        result = process_array(arr)
        assert len(result) == size, f"Failed at size {size}"

- Regression test for a specific bug that only appeared with large data

def test_regression_issue_42_overflow_at_10000_elements():
    """Regression: Integer overflow occurred at exactly 10,000 elements
    Bug report: https://github.com/yourproject/issues/42
    Fixed by switching from int32 to int64 accumulator
    """
    arr = np.ones(10_000)  # Specific size that caused the bug
    result = array_sum(arr)
    assert result == 10_000.0, "Overflow regression detected!"
Best Practice: Separate Performance Tests
# tests/test_geometry.py (unit tests - FAST)
def test_find_intersection_basic():
"""Unit test: Small representative array"""
x_road = np.array([0, 10, 20]) # Only 3 points
y_road = np.array([0, 2, 4])
x, y, dist = find_intersection(x_road, y_road, -10.0)
assert x is not None
# tests/test_geometry_performance.py (performance tests - SLOW)
@pytest.mark.slow # Mark as slow so we can skip during development
def test_find_intersection_large_road():
"""Performance test: Realistic large road dataset"""
x_road = np.linspace(0, 1000, 10_000) # 10,000 points
y_road = np.sin(x_road / 10) * 5
import time
start = time.time()
x, y, dist = find_intersection(x_road, y_road, -10.0)
elapsed = time.time() - start
assert x is not None
assert elapsed < 0.1, f"Too slow: {elapsed:.3f}s for 10k points"
Run performance tests separately:
# Normal development: deselect slow tests (runs in seconds)
$ pytest tests/ -v -m "not slow"
# Before merging PR: run all tests including slow ones (runs in minutes)
$ pytest tests/ -v
# Just performance tests
$ pytest tests/test_geometry_performance.py -v
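For the @pytest.mark.slow marker to be recognized without warnings, register it once, e.g. in pytest.ini (a sketch; the same markers key also works in pyproject.toml under [tool.pytest.ini_options]):

```ini
[pytest]
markers =
    slow: marks a test as slow (deselect with -m "not slow")
```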
Property-Based Testing with Hypothesis (Advanced)
For thorough testing with varying sizes, use Hypothesis:
from hypothesis import given, strategies as st
@given(st.lists(st.floats(allow_nan=False, allow_infinity=False),
min_size=0, max_size=100))
def test_array_sum_any_valid_array(arr):
"""Property: array_sum should never crash on valid float arrays
Hypothesis will automatically generate hundreds of test cases
with different array sizes and values.
"""
if len(arr) == 0:
with pytest.raises(ValueError):
array_sum(np.array(arr))
else:
result = array_sum(np.array(arr))
assert isinstance(result, (int, float, np.number))
How Large is “Large Enough” for Unit Tests?
Rule of thumb:
| Array Size | When to Use | Example |
|---|---|---|
| 0-2 elements | Boundary testing | Empty, single, pair (edge cases) |
| 3-10 elements | Unit tests (most common) | Enough to test logic without noise |
| 100-1000 elements | Realistic scenario tests | Typical real-world data size |
| 10,000+ elements | Performance tests (separate suite) | Algorithm complexity, memory usage |
| 1,000,000+ elements | Stress tests (rare, CI only) | Production-scale data validation |
Example: Right-Sized Unit Tests
# ✅ GOOD: Representative sizes for unit tests
def test_find_intersection_typical_road():
"""Test with typical road size (~100 points)"""
x_road = np.linspace(0, 80, 100) # Realistic: 100 points over 80 meters
y_road = generate_road_profile(num_points=100)
x, y, dist = find_intersection(x_road, y_road, -10.0)
assert x is not None # Fast test, runs in ~1ms
# ❌ BAD: Unnecessarily large for a unit test
def test_find_intersection_huge_road():
"""This belongs in performance tests, not unit tests"""
x_road = np.linspace(0, 10000, 1_000_000) # Overkill!
y_road = np.random.rand(1_000_000)
x, y, dist = find_intersection(x_road, y_road, -10.0)
assert x is not None # Slow test, runs in ~500ms
Summary: Testing Array Sizes
- ✅ Unit tests: Use small arrays (3-100 elements) - fast and sufficient
- ✅ Boundary tests: Always test 0, 1, 2 elements (edge cases)
- ✅ Performance tests: Use large arrays (10k+) in separate test suite
- ✅ Property-based tests: Use Hypothesis to generate varying sizes automatically
- ❌ Don’t: Put large arrays in regular unit tests (slows down development)
Key principle: Unit tests verify correctness, performance tests verify speed. Keep them separate!
2.4 Boundaries for Complex Array Example: find_intersection()
Function: find_intersection(x_road, y_road, angle_degrees, camera_x, camera_y)
Array Length Boundaries:
| Boundary | Test Case | Why Important |
|---|---|---|
| len(x_road) = 0 | Empty arrays | No road to intersect |
| len(x_road) = 1 | Single point | Can't form a segment |
| len(x_road) = 2 | Two points | Minimum valid road (one segment) |
| len(x_road) ≠ len(y_road) | Mismatched lengths | Invalid input |
Angle Boundaries:
| Boundary | Test Values | Why Important |
|---|---|---|
| Exactly -90° | -90.0 | Vertical downward (tan = ∞) |
| Near -90° | -89.9, -89.999 | Near vertical |
| Exactly 0° | 0.0 | Horizontal ray |
| Near 0° | -0.1, 0.1 | Nearly horizontal |
| Exactly 90° | 90.0 | Vertical upward (tan = ∞) |
| Near 90° | 89.9, 89.999 | Near vertical |
Camera Position Boundaries:
| Boundary | Test Case | Why Important |
|---|---|---|
| camera_x = x_road[0] | Camera at road start | Edge of road |
| camera_x = x_road[-1] | Camera at road end | Edge of road |
| camera_y = y_road[i] | Camera at road level | Might be tangent |
| camera_y = min(y_road) | Camera at lowest point | Boundary case |
| camera_y = max(y_road) | Camera at highest point | Boundary case |
Intersection Position Boundaries:
| Boundary | Test Case | Why Important |
|---|---|---|
| Intersection at x_road[0] | Ray hits first point exactly | Endpoint handling |
| Intersection at x_road[-1] | Ray hits last point exactly | Endpoint handling |
| Intersection between two segments | Ray hits at segment boundary | Interpolation edge case |
Comprehensive Boundary Tests:
class TestFindIntersectionBoundaries:
    """Boundary value tests for find_intersection()"""

    # ARRAY LENGTH BOUNDARIES
    def test_boundary_empty_arrays(self):
        """Boundary: Empty road arrays (len = 0)"""
        x, y, dist = find_intersection(np.array([]), np.array([]), -10.0)
        assert x is None

    def test_boundary_single_point(self):
        """Boundary: Single point (len = 1)"""
        x, y, dist = find_intersection(np.array([5.0]), np.array([2.0]), -10.0)
        assert x is None  # Can't form segment

    def test_boundary_two_points_minimum_valid(self):
        """Boundary: Two points (len = 2, minimum valid)"""
        x_road = np.array([0.0, 10.0])
        y_road = np.array([0.0, 2.0])
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=5.0)
        assert x is not None  # Should work

    # ANGLE BOUNDARIES
    def test_boundary_angle_exactly_negative_90(self):
        """Boundary: Angle exactly -90° (vertical downward)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -90.0)
        # Implementation returns None for vertical - verify this is intentional
        assert x is None

    def test_boundary_angle_near_negative_90(self):
        """Boundary: Angle near -90° (-89.9°)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -89.9, camera_x=0.0, camera_y=10.0)
        # Nearly vertical ray from above the road - should still find an intersection
        assert x is not None

    def test_boundary_angle_exactly_zero(self):
        """Boundary: Angle exactly 0° (horizontal)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([2.0, 2.0, 2.0])  # Flat road at y=2
        x, y, dist = find_intersection(x_road, y_road, 0.0, camera_x=-5.0, camera_y=2.0)
        assert x is not None  # Horizontal ray should hit flat road

    def test_boundary_angle_near_zero(self):
        """Boundary: Angle near 0° (0.1° - nearly horizontal)"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([0, 1, 2, 3])
        x, y, dist = find_intersection(x_road, y_road, 0.1, camera_x=0.0, camera_y=1.5)
        # Very shallow angle - the road rises faster than the ray, so they should meet
        assert x is not None

    def test_boundary_angle_exactly_90(self):
        """Boundary: Angle exactly 90° (vertical upward)"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, 90.0)
        assert x is None  # Implementation returns None for vertical

    # CAMERA POSITION BOUNDARIES
    def test_boundary_camera_at_road_start(self):
        """Boundary: Camera x-position at road start"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=10.0)
        assert x is not None

    def test_boundary_camera_at_road_end(self):
        """Boundary: Camera x-position at road end"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=20.0, camera_y=10.0)
        # Camera at end, looking down - may or may not intersect,
        # but the returned tuple must be consistent either way
        assert (x is None) == (dist is None)

    def test_boundary_camera_at_road_level(self):
        """Boundary: Camera y-position at road level"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([2.0, 2.0, 2.0])  # Flat at y=2
        x, y, dist = find_intersection(x_road, y_road, 0.0, camera_x=5.0, camera_y=2.0)
        # Camera ON the road, horizontal ray - tangent case; must not crash
        assert (x is None) == (dist is None)

    def test_boundary_camera_at_minimum_y(self):
        """Boundary: Camera at lowest point of road"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([5, 2, 3, 6])  # min at x=10, y=2
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=15.0, camera_y=2.0)
        # Camera at same height as lowest road point - must handle gracefully
        assert (x is None) == (dist is None)

    # INTERSECTION POSITION BOUNDARIES
    def test_boundary_intersection_at_first_point(self):
        """Boundary: Ray intersects at first road point"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([5, 3, 1])
        # Position camera so ray hits exactly at (0, 5)
        x, y, dist = find_intersection(x_road, y_road, -45.0, camera_x=-5.0, camera_y=10.0)
        if x is not None:
            assert abs(x - 0.0) < 0.1  # Should be near first point

    def test_boundary_intersection_at_last_point(self):
        """Boundary: Ray intersects at last road point"""
        x_road = np.array([0, 10, 20])
        y_road = np.array([0, 2, 4])
        # Position so ray hits last point
        x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=15.0, camera_y=10.0)
        if x is not None and x > 18:
            assert abs(x - 20.0) < 2.0  # Should be near last point

    def test_boundary_intersection_at_segment_boundary(self):
        """Boundary: Ray intersects exactly where two segments meet"""
        x_road = np.array([0, 10, 20, 30])
        y_road = np.array([0, 5, 5, 10])
        # Ray aimed to hit exactly at (10, 5)
        x, y, dist = find_intersection(x_road, y_road, -30.0, camera_x=5.0, camera_y=10.0)
        if x is not None:
            # Could be reported as end of first segment or start of second
            assert 9.0 <= x <= 11.0
Key Insights for Boundary Testing:
- Test at the boundary, just inside, and just outside
  - Angle = 90°, 89.9°, 90.1°
  - Length = 0, 1, 2
- Floating-point precision matters
  - Use `pytest.approx()` for floating-point comparisons
  - Test values near zero (1e-10, 1e-15)
- Array boundaries are both structural and positional
  - Length boundaries (empty, single, two)
  - Position boundaries (first element, last element, boundaries between)
- Combination boundaries are critical
  - Camera at road level + horizontal ray = tangent case
  - Empty array + any angle = should handle gracefully
General Boundary Testing Strategy:
| Data Type | Boundary Values to Test | Python Tools |
|---|---|---|
| Integers | 0, 1, -1, max_value, min_value | `sys.maxsize`, `-sys.maxsize-1` |
| Floats (Finite) | 0.0, near-zero (1e-10), 1.0, very large (1e10), `sys.float_info.max`, `-sys.float_info.max`, `sys.float_info.min` (smallest normalized), `math.ulp(0.0)` (smallest subnormal), `sys.float_info.epsilon` | `sys.float_info`, `math.ulp(x)`, `math.nextafter(x, y)` |
| Floats (Non-Finite) | `math.inf`, `-math.inf`, `math.nan` | `math.isnan(x)`, `math.isinf(x)`, `math.isfinite(x)` |
| Arrays/Lists | Empty (`[]`), Single element, Two elements, Very large | `len()`, `np.size` |
| Angles (degrees) | 0°, ±90°, ±180°, ±360°, values near these (89.9°, 90.1°) | `np.deg2rad()`, `np.rad2deg()` |
| Strings | Empty string (`""`), Single char, Very long string, Unicode edge cases | `len()`, `str.encode()` |
Quick Reference: Python Float Boundary Testing Cheat Sheet
import sys
import math
# CONSTANTS (define once, use everywhere)
FMAX = sys.float_info.max # Largest finite (≈1.8e308)
FMIN_NORM = sys.float_info.min # Smallest normalized (≈2.2e-308)
FMIN_SUB = math.ulp(0.0) # Smallest subnormal (≈5e-324)
EPSILON = sys.float_info.epsilon # Machine epsilon (≈2.2e-16)
INF = math.inf # Positive infinity
NINF = -math.inf # Negative infinity
NAN = math.nan # Not a number
# TEST CHECKLIST FOR FLOAT FUNCTIONS
# ✅ Finite boundaries (test overflow/underflow)
test_function(FMAX) # Largest finite
test_function(-FMAX) # Most negative finite
test_function(FMIN_NORM) # Smallest normalized positive
test_function(FMIN_SUB) # Smallest subnormal positive
# ✅ Non-finite values (test special value handling)
test_function(INF) # Positive infinity
test_function(NINF) # Negative infinity
test_function(NAN) # Not a Number
# ✅ Precision boundaries
test_function(EPSILON) # Machine epsilon
test_function(1.0 + EPSILON) # Next after 1.0
# ✅ Razor-edge transitions (using math.nextafter)
math.nextafter(FMAX, INF) # → inf (overflow transition)
math.nextafter(1.0, 2.0) # Next representable after 1.0
math.nextafter(1.0, 0.0) # Previous representable before 1.0
# ✅ Checking results
math.isfinite(x) # True if not inf/nan
math.isinf(x) # True if +inf or -inf
math.isnan(x) # True if nan (don't use x == nan!)
math.ulp(x) # Spacing at x (precision)
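To make the checklist concrete, here is what those probes look like against a hypothetical `reciprocal(x) = 1/x` function (a standalone sketch - `reciprocal` itself is not from the repository):

```python
import math
import sys


def reciprocal(x: float) -> float:
    """Hypothetical function under test: returns 1/x."""
    return 1.0 / x


# Finite boundaries: largest/smallest normalized inputs stay finite
assert reciprocal(sys.float_info.max) > 0.0        # tiny but still positive
assert math.isfinite(reciprocal(sys.float_info.min))

# Non-finite inputs propagate the way IEEE 754 dictates
assert reciprocal(math.inf) == 0.0
assert math.isnan(reciprocal(math.nan))

# Razor edge: the smallest subnormal input overflows to infinity
assert math.isinf(reciprocal(math.ulp(0.0)))
```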
Decision Tree: Which Boundaries Should I Test?
Does your function do arithmetic with floats?
│
├─ YES, division or reciprocal → Test ALL boundaries (finite + non-finite)
│ ├─ sys.float_info.max (finite overflow)
│ ├─ sys.float_info.min (underflow-to-overflow)
│ ├─ math.inf (division by zero result)
│ ├─ math.nan (invalid input)
│ └─ Near zero (1e-10, 1e-100)
│
├─ YES, but simple operations (+, -, *) → Test finite boundaries
│ ├─ sys.float_info.max (overflow detection)
│ ├─ sys.float_info.epsilon (precision loss)
│ └─ Normal range values
│
└─ NO, just comparison/sorting → Test basic boundaries
├─ 0.0, 1.0, -1.0
├─ math.inf, -math.inf (sorting edge cases)
└─ math.nan (comparison always returns False!)
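The warning in the last branch - NaN comparisons always return False - deserves a concrete demonstration, since it regularly surprises people (standalone sketch):

```python
import math

nan = math.nan

# NaN compares False with everything, including itself
assert not (nan == nan)
assert not (nan < 1.0)
assert not (nan > 1.0)

# The only reliable check is math.isnan()
assert math.isnan(nan)

# Consequence for sorting: with NaN present, sorted() gives no meaningful
# order, because every comparison against NaN answers False.
```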
2.5 Boundaries for Discrete Functions: Thresholds Are Boundaries Too!
Common Misconception: “Discrete functions don’t have boundary values - they just have categories.”
Reality: Discrete functions DO have boundaries - they’re actually easier to identify than continuous function boundaries!
What we’ve seen so far:
All boundary examples in this lecture (reciprocal, reciprocal_sum, array_sum, find_intersection) were continuous real-valued functions - they return floats from an infinite range. For these, we had to think carefully about:
- What values are “close to zero”? (1e-10? 1e-100?)
- Where do we test precision loss? (epsilon?)
- How do we handle infinity and NaN?
The good news: Functions with discrete/finite outputs make boundary identification much simpler! The boundaries are explicit thresholds in the code.
Key insight: For discrete functions, boundaries are threshold values where the output changes category. Instead of worrying about “how close to zero?”, you test both sides of each threshold.
2.5.1 Example: Simple Threshold Boundaries - calculate_grade(score)
Let’s start with the simplest case: a function that maps continuous inputs to discrete outputs.
Function Definition:
def calculate_grade(score: int) -> str:
    """
    Calculate letter grade based on score.

    Args:
        score: Integer between 0 and 100

    Returns:
        Letter grade (A, B, C, D, F)

    Raises:
        ValueError: If score is not in valid range
    """
    if score < 0 or score > 100:
        raise ValueError("Score must be between 0 and 100")
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"
Question: Where are the boundaries for this function?
Analysis:
Looking at the code, we immediately see explicit threshold values:
- `score >= 90` → Grade "A" (boundary at 90)
- `score >= 80` → Grade "B" (boundary at 80)
- `score >= 70` → Grade "C" (boundary at 70)
- `score >= 60` → Grade "D" (boundary at 60)
- `score < 60` → Grade "F" (boundary at 60 from below)
- `score < 0 or score > 100` → Error (boundaries at 0 and 100)
Equivalence classes and their boundaries:
| Grade Class | Range | Boundaries to Test | Why These Boundaries? |
|---|---|---|---|
| Grade A | 90 ≤ score ≤ 100 | 90, 100 | Lower threshold (A/B boundary), upper limit |
| Grade B | 80 ≤ score < 90 | 80, 89 | Lower threshold (B/C boundary), upper edge (B/A boundary) |
| Grade C | 70 ≤ score < 80 | 70, 79 | Lower threshold (C/D boundary), upper edge (C/B boundary) |
| Grade D | 60 ≤ score < 70 | 60, 69 | Lower threshold (D/F boundary), upper edge (D/C boundary) |
| Grade F | 0 ≤ score < 60 | 0, 59 | Lower limit, upper edge (F/D boundary) |
| Invalid | score < 0 or score > 100 | -1, 101 | Just outside valid range |
Key insight - Two types of boundary values:
- Threshold boundaries - Where output category changes:
  - `score = 89` should return "B"
  - `score = 90` should return "A"
  - These are critical! Off-by-one errors are common (using `>` instead of `>=`)
- Range boundaries - Valid input range limits:
  - `score = 0` (minimum valid)
  - `score = 100` (maximum valid)
  - `score = -1` (just below valid range)
  - `score = 101` (just above valid range)
Boundary Value Test Code:
import pytest


def test_grade_boundary_A_B_threshold():
    """Boundary: score = 89 (B) vs 90 (A)"""
    assert calculate_grade(89) == "B"  # Just below A threshold
    assert calculate_grade(90) == "A"  # At A threshold


def test_grade_boundary_B_C_threshold():
    """Boundary: score = 79 (C) vs 80 (B)"""
    assert calculate_grade(79) == "C"  # Just below B threshold
    assert calculate_grade(80) == "B"  # At B threshold


def test_grade_boundary_C_D_threshold():
    """Boundary: score = 69 (D) vs 70 (C)"""
    assert calculate_grade(69) == "D"  # Just below C threshold
    assert calculate_grade(70) == "C"  # At C threshold


def test_grade_boundary_D_F_threshold():
    """Boundary: score = 59 (F) vs 60 (D)"""
    assert calculate_grade(59) == "F"  # Just below D threshold
    assert calculate_grade(60) == "D"  # At D threshold


def test_grade_boundary_valid_range_lower():
    """Boundary: score = 0 (valid) vs -1 (invalid)"""
    assert calculate_grade(0) == "F"  # Minimum valid score
    with pytest.raises(ValueError):
        calculate_grade(-1)  # Just below valid range


def test_grade_boundary_valid_range_upper():
    """Boundary: score = 100 (valid) vs 101 (invalid)"""
    assert calculate_grade(100) == "A"  # Maximum valid score
    with pytest.raises(ValueError):
        calculate_grade(101)  # Just above valid range
Test design observations:
- Each test covers exactly one boundary - Makes it easy to identify which boundary is broken if a test fails
- We test both sides of each threshold - `score=89` (B) and `score=90` (A)
- We test valid range limits - Minimum (0), maximum (100), and just outside (-1, 101)
- Total: 6 boundary tests - Much more focused than testing all 100 possible scores!
Contrast with continuous functions:
| Aspect | Continuous (reciprocal) | Discrete (calculate_grade) |
|---|---|---|
| Boundary identification | Must decide: "How close to zero?" (1e-10? 1e-100?) | Explicit in code: 90, 80, 70, 60 |
| Number of boundaries | Depends on your choice of partitioning | Fixed by function logic |
| Boundary precision | Floating-point precision matters (epsilon) | Integer values - no precision issues |
| Testing strategy | Test near-zero, infinity, NaN, epsilon | Test threshold ± 1 |
Why discrete boundaries are easier:
- Explicit thresholds - The code literally says `if score >= 90`, so you know to test 89 vs 90
- No precision worries - Integer thresholds mean no floating-point edge cases
- Clear pass/fail - Either the grade is correct or it isn't - no "close enough" judgment calls
- Off-by-one detection - Boundary tests immediately catch `>` vs `>=` mistakes
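Because each boundary test has the same shape (input score, expected grade), the six tests above can also be collapsed into a single parametrized test. A sketch, repeating a compact `calculate_grade` so the snippet is self-contained:

```python
import pytest


def calculate_grade(score: int) -> str:
    """Same logic as the function above, repeated for a self-contained sketch."""
    if score < 0 or score > 100:
        raise ValueError("Score must be between 0 and 100")
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    return "F"


# Each tuple is one side of a threshold: (score, expected grade)
@pytest.mark.parametrize(
    "score, expected",
    [(89, "B"), (90, "A"), (79, "C"), (80, "B"),
     (69, "D"), (70, "C"), (59, "F"), (60, "D"),
     (0, "F"), (100, "A")],
)
def test_grade_thresholds(score: int, expected: str) -> None:
    assert calculate_grade(score) == expected
```

The trade-off: parametrization is terser, but a failure report shows the failing parameter set rather than a descriptive test name - both styles are defensible here.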
2.5.2 Example: Multi-Dimensional Boundaries - calculate_shipping_cost(weight, distance, express)
Now let’s see a more complex case: multiple parameters, each with their own boundaries.
Function Definition:
def calculate_shipping_cost(weight: float, distance: float, express: bool = False) -> float:
    """
    Calculate shipping cost based on weight and distance.

    Args:
        weight: Weight in kg (0.1 to 50)
        distance: Distance in km (1 to 5000)
        express: Whether express shipping is requested

    Returns:
        float: Shipping cost in EUR

    Raises:
        ValueError: If weight or distance is out of range
    """
    if weight < 0.1 or weight > 50:
        raise ValueError("Weight must be between 0.1 and 50 kg")
    if distance < 1 or distance > 5000:
        raise ValueError("Distance must be between 1 and 5000 km")

    # Base cost calculation
    if weight <= 5:
        base_cost = 5.0
    elif weight <= 20:
        base_cost = 10.0
    else:
        base_cost = 20.0

    # Distance multiplier
    if distance <= 100:
        distance_multiplier = 1.0
    elif distance <= 500:
        distance_multiplier = 1.5
    else:
        distance_multiplier = 2.0

    cost = base_cost * distance_multiplier

    # Express shipping adds 50%
    if express:
        cost *= 1.5

    return round(cost, 2)
Question: Where are the boundaries for this function?
Analysis:
This function has three dimensions, each with its own boundaries:
Dimension 1: Weight boundaries
| Boundary | Threshold Value | Test Cases | Expected Behavior Change |
|---|---|---|---|
| Minimum weight | 0.1 kg | 0.09 kg (error) vs 0.1 kg (valid) | Error → Light category |
| Light/Medium | 5 kg | 5.0 kg (light) vs 5.1 kg (medium) | €5 base → €10 base |
| Medium/Heavy | 20 kg | 20.0 kg (medium) vs 20.1 kg (heavy) | €10 base → €20 base |
| Maximum weight | 50 kg | 50 kg (valid) vs 50.1 kg (error) | Heavy category → Error |
Dimension 2: Distance boundaries
| Boundary | Threshold Value | Test Cases | Expected Behavior Change |
|---|---|---|---|
| Minimum distance | 1 km | 0.9 km (error) vs 1 km (valid) | Error → Local category |
| Local/Regional | 100 km | 100 km (local) vs 101 km (regional) | 1.0× multiplier → 1.5× multiplier |
| Regional/Long-distance | 500 km | 500 km (regional) vs 501 km (long) | 1.5× multiplier → 2.0× multiplier |
| Maximum distance | 5000 km | 5000 km (valid) vs 5001 km (error) | Long-distance → Error |
Dimension 3: Express flag
| Value | Effect |
|---|---|
| `False` | 1.0× (standard shipping) |
| `True` | 1.5× (express multiplier) |
Note: Boolean parameters don’t have “boundaries” in the traditional sense - they only have two values. But we still need to test both!
Challenge: Combinatorial explosion
If we tested every combination of boundaries:
- 4 weight boundaries × 4 distance boundaries × 2 express values = 32 tests
Practical strategy: Test each dimension independently
Instead, we test each dimension’s boundaries while holding other dimensions constant:
import pytest


# ===== Weight boundaries (distance and express fixed) =====

def test_shipping_weight_boundary_minimum():
    """Boundary: weight = 0.1 (valid) vs 0.09 (invalid)"""
    assert calculate_shipping_cost(0.1, 100, False) == 5.0  # Valid
    with pytest.raises(ValueError, match="Weight must be between"):
        calculate_shipping_cost(0.09, 100, False)  # Invalid


def test_shipping_weight_boundary_light_medium():
    """Boundary: weight = 5.0 (light) vs 5.1 (medium)"""
    cost_light = calculate_shipping_cost(5.0, 100, False)
    cost_medium = calculate_shipping_cost(5.1, 100, False)
    assert cost_light == 5.0    # Light: base=5.0, dist_mult=1.0 → 5.0
    assert cost_medium == 10.0  # Medium: base=10.0, dist_mult=1.0 → 10.0


def test_shipping_weight_boundary_medium_heavy():
    """Boundary: weight = 20.0 (medium) vs 20.1 (heavy)"""
    cost_medium = calculate_shipping_cost(20.0, 100, False)
    cost_heavy = calculate_shipping_cost(20.1, 100, False)
    assert cost_medium == 10.0  # Medium: base=10.0, dist_mult=1.0 → 10.0
    assert cost_heavy == 20.0   # Heavy: base=20.0, dist_mult=1.0 → 20.0


def test_shipping_weight_boundary_maximum():
    """Boundary: weight = 50 (valid) vs 50.1 (invalid)"""
    assert calculate_shipping_cost(50, 100, False) == 20.0  # Valid
    with pytest.raises(ValueError, match="Weight must be between"):
        calculate_shipping_cost(50.1, 100, False)  # Invalid


# ===== Distance boundaries (weight and express fixed) =====

def test_shipping_distance_boundary_minimum():
    """Boundary: distance = 1 (valid) vs 0.9 (invalid)"""
    assert calculate_shipping_cost(5.0, 1, False) == 5.0  # Valid
    with pytest.raises(ValueError, match="Distance must be between"):
        calculate_shipping_cost(5.0, 0.9, False)  # Invalid


def test_shipping_distance_boundary_local_regional():
    """Boundary: distance = 100 (local) vs 101 (regional)"""
    cost_local = calculate_shipping_cost(5.0, 100, False)
    cost_regional = calculate_shipping_cost(5.0, 101, False)
    assert cost_local == 5.0     # Local: base=5.0, dist_mult=1.0 → 5.0
    assert cost_regional == 7.5  # Regional: base=5.0, dist_mult=1.5 → 7.5


def test_shipping_distance_boundary_regional_long():
    """Boundary: distance = 500 (regional) vs 501 (long-distance)"""
    cost_regional = calculate_shipping_cost(5.0, 500, False)
    cost_long = calculate_shipping_cost(5.0, 501, False)
    assert cost_regional == 7.5  # Regional: base=5.0, dist_mult=1.5 → 7.5
    assert cost_long == 10.0     # Long: base=5.0, dist_mult=2.0 → 10.0


def test_shipping_distance_boundary_maximum():
    """Boundary: distance = 5000 (valid) vs 5001 (invalid)"""
    assert calculate_shipping_cost(5.0, 5000, False) == 10.0  # Valid
    with pytest.raises(ValueError, match="Distance must be between"):
        calculate_shipping_cost(5.0, 5001, False)  # Invalid


# ===== Express flag (weight and distance fixed) =====

def test_shipping_express_flag():
    """Dimension: express = False vs True"""
    cost_standard = calculate_shipping_cost(5.0, 100, False)
    cost_express = calculate_shipping_cost(5.0, 100, True)
    assert cost_standard == 5.0  # Standard: base=5.0, dist=1.0, express=1.0 → 5.0
    assert cost_express == 7.5   # Express: base=5.0, dist=1.0, express=1.5 → 7.5
Test design observations:
- Independent dimension testing - Each dimension is tested while holding others constant
- Total: ~12 tests instead of 32 (all combinations) or 100+ (exhaustive)
- We still catch boundary bugs - If the weight threshold is wrong (`weight < 5` instead of `weight <= 5`), our test will fail
- Clear test names - Each test explicitly states what boundary it's testing
When to test combinations:
You should add combination tests when:
- Dimensions interact - E.g., “heavy packages get discount on long distances”
- Edge case combinations - E.g., “minimum weight + maximum distance + express”
- Critical business logic - E.g., “free shipping for light local packages”
For this function, dimensions are independent (just multiplication), so independent testing is sufficient.
Key insight for multi-dimensional boundaries:
Test each dimension independently with representative values in other dimensions. This gives you \(O(d \times b)\) tests (d dimensions, b boundaries each) instead of \(O(b^d)\) exhaustive tests!
For shipping cost: 12 tests (3 dimensions × ~4 boundaries) instead of 32 tests (all combinations).
2.5.3 Summary: Boundary Testing Strategy for Discrete Functions
Key principles:
- Boundaries are explicit - Look for threshold values in conditional logic (`if`, `elif`)
- Test both sides of each threshold - `value-1` (old category) vs `value` (new category)
- Test range limits - Minimum valid, maximum valid, just outside range
- For multi-dimensional functions: Test each dimension independently (unless dimensions interact)
Common threshold patterns to look for:
| Pattern in Code | Boundaries to Test | Example |
|---|---|---|
| `if x >= threshold:` | threshold-1, threshold | `score >= 90` → test 89, 90 |
| `if x > threshold:` | threshold, threshold+1 | `weight > 5` → test 5.0, 5.1 |
| `if x < threshold:` | threshold-1, threshold | `distance < 100` → test 99, 100 |
| `if x <= threshold:` | threshold, threshold+1 | `weight <= 5` → test 5.0, 5.1 |
| `if min <= x <= max:` | min-1, min, max, max+1 | `0 <= score <= 100` → test -1, 0, 100, 101 |
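For float thresholds, "the smallest unit" is only a convention (5.1 works for the shipping example, but is arbitrary). If you want the tightest possible probes, `math.nextafter()` gives the adjacent representable floats - a standalone sketch, using the 5 kg weight threshold as the example:

```python
import math

THRESHOLD = 5.0  # e.g. the light/medium weight threshold

# Closest representable floats on either side of the threshold
just_below = math.nextafter(THRESHOLD, -math.inf)
just_above = math.nextafter(THRESHOLD, math.inf)

assert just_below < THRESHOLD < just_above

# Nothing representable fits between these probes and the threshold
assert math.nextafter(just_below, math.inf) == THRESHOLD
assert math.nextafter(just_above, -math.inf) == THRESHOLD
```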
Comparison: Continuous vs Discrete boundaries
| Aspect | Continuous Functions | Discrete Functions |
|---|---|---|
| Boundary identification | Must choose: "How close to problematic value?" | Explicit thresholds in code |
| Test values | Near-zero, infinity, NaN, epsilon | Threshold ± smallest unit (1 for integers, 0.1 for floats) |
| Precision concerns | Floating-point precision critical | Usually none (integer thresholds) |
| Number of boundaries | Depends on partitioning choice | Fixed by function logic |
| Common bugs caught | Division by zero, overflow, precision loss | Off-by-one errors (> vs >=) |
The good news: Discrete functions make boundary testing easier, not harder! The thresholds are right there in the code - you just need to test both sides of each one.
Final insight:
Every conditional creates a boundary. If you see `if x >= 90`, you know you need to test `x=89` and `x=90`. If you see `if weight <= 5`, you know you need to test `weight=5.0` and `weight=5.1`. Boundary testing for discrete functions is systematic and mechanical - which is why it's so effective!
3. LLM-Assisted Testing - Breaking the Test Cone
Why don’t developers write tests?
Research and surveys show:
- “Writing tests is tedious” (42% of developers)
- Boilerplate: imports, setup, teardown
- Repetitive: Similar structure for every function
- “I don’t know what to test” (38% of developers)
- What are the equivalence classes?
- What are the boundary cases?
- How many tests do I need?
- “Initial setup takes forever” (35% of developers)
- pytest configuration
- Test file structure
- First test is always hardest
Result: Developers procrastinate → Only write E2E tests → Inverted pyramid (test cone)
The Solution: Use LLMs to handle the tedious parts, leaving you to focus on correctness.
The Workflow:
1. Human: Identifies what needs testing (equivalence classes)
2. LLM: Generates test boilerplate (imports, structure, AAA pattern)
3. Human: Reviews and refines (fixes logic, adds edge cases)
4. LLM: Generates more tests based on feedback
5. Human: Verifies assertions are correct
6. Run tests: Catch bugs in actual code (not tests!)
Key insight: LLMs excel at boilerplate, humans excel at domain logic.
Scenario: You need to test find_intersection() from src/road_profile_viewer/geometry.py.
Step 1: Prompt the LLM (in tests/test_geometry.py)
# tests/test_geometry.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.geometry import find_intersection
# Prompt: "Write pytest unit tests for find_intersection() covering:
# - Equivalence classes: downward angles, horizontal, upward angles, vertical
# - Boundary cases: empty arrays, single point, angle=90
# - Use AAA pattern (Arrange-Act-Assert)
# - Include proper type hints matching the actual repository code"
Step 2: LLM Generates Initial Tests
Copilot/Claude might generate:
def test_find_intersection_downward_angle() -> None:
    """Test find_intersection with normal downward angle."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20, 30], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 2, 4, 6], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
    assert x is not None
    assert y is not None
    assert dist is not None and dist > 0


def test_find_intersection_horizontal() -> None:
    """Test find_intersection with horizontal angle."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 0, 0], dtype=np.float64)  # Flat road
    x, y, dist = find_intersection(x_road, y_road, 0.0, 0.0, 1.0)
    assert x is not None


def test_find_intersection_vertical() -> None:
    """Test find_intersection with vertical angle (90 degrees)."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 2, 4], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, 90.0)
    assert x is None  # Current implementation returns None for vertical


def test_find_intersection_empty_road() -> None:
    """Test find_intersection with empty road arrays."""
    x_road: NDArray[np.float64] = np.array([], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is None
Value: 80% of boilerplate written instantly!
Step 3: Human Review - Find the Flaws
Now YOU review with domain knowledge:
❌ Problem 1: Missing edge case
# LLM didn't test: What if camera is BELOW the road?
# Add this test:
def test_find_intersection_camera_below_road() -> None:
    """Test when camera is below road level."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([5, 5, 5], dtype=np.float64)  # Flat road at y=5
    x, y, dist = find_intersection(x_road, y_road, 45.0, 0.0, 0.0)  # Camera at y=0, ray goes up
    # Should still find intersection (upward ray crosses the road above at (5, 5))
    assert x is not None
❌ Problem 2: Weak assertions
# LLM wrote:
assert x is not None # Too weak!
# Better assertion:
assert 0 <= x <= 30, f"Expected x in [0, 30], got {x}"
assert y >= 0, f"Expected y positive, got {y}"
❌ Problem 3: Invalid test data
# LLM wrote:
x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
y_road: NDArray[np.float64] = np.array([0, 0, 0], dtype=np.float64) # Flat road at y=0
x, y, dist = find_intersection(x_road, y_road, 0.0, 0.0, 1.0) # Camera at y=1
# Problem: Horizontal ray from y=1 won't intersect road at y=0!
# Fix: Either raise camera or tilt road
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 1, 2], dtype=np.float64) # Sloped road
Step 4: Iterative Refinement
# You to LLM: "Add tests for upward angles where ray might not intersect"
# LLM generates:
def test_find_intersection_upward_angle_no_intersection() -> None:
    """Test that upward angle with road below returns None."""
    x_road: NDArray[np.float64] = np.array([0, 10, 20], dtype=np.float64)
    y_road: NDArray[np.float64] = np.array([0, 1, 2], dtype=np.float64)
    x, y, dist = find_intersection(x_road, y_road, 45.0, 0.0, 0.0)  # Ray goes up
    # Depending on implementation, might not intersect
    # If None is correct behavior:
    assert x is None
Step 5: Run Tests and Find Real Bugs
$ uv run pytest tests/test_geometry.py -v
Surprise! One test fails:
FAILED test_find_intersection_empty_road - IndexError: index 0 is out of bounds
This is GOOD! The test found a bug in your actual code:
def find_intersection(x_road, y_road, ...):
    # Bug: Doesn't check if arrays are empty before accessing!
    for i in range(len(x_road) - 1):  # Crashes if len(x_road) = 0
        x1, y1 = x_road[i], y_road[i]
        ...
Fix the bug:
def find_intersection(x_road, y_road, ...):
    # Add validation
    if len(x_road) == 0 or len(y_road) == 0:
        return None, None, None
    # ... rest of function
Run tests again:
$ uv run pytest tests/test_geometry.py -v
============================= 8 passed in 0.12s ===============================
✅ All tests pass! Bug fixed before it reached users.
What LLMs are GOOD at:
- ✅ Boilerplate (imports, test structure, AAA pattern)
- ✅ Standard patterns (testing return values, basic assertions)
- ✅ Coverage (generating tests for each function)
What LLMs are BAD at:
- ❌ Domain knowledge (“Is it correct for vertical angles to return None?”)
- ❌ Edge cases specific to your problem (“What if camera is inside the road?”)
- ❌ Subtle bugs in test logic (test always passes even when it shouldn’t)
- ❌ Determining correct expected values (“What SHOULD the distance be?”)
- ❌ Restraint with test logic (they tend to add unnecessary loops, conditionals, and calculations)
Example of LLM failure #1: Weak assertions
# LLM might generate:
def test_calculate_distance():
dist = calculate_distance(0, 0, 3, 4)
assert dist > 0 # Too weak! Just checks it's positive
# Human knows Pythagorean theorem:
def test_calculate_distance():
dist = calculate_distance(0, 0, 3, 4)
assert abs(dist - 5.0) < 0.01, f"Expected 5.0, got {dist}" # 3-4-5 triangle!
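The same assertion reads more idiomatically with `pytest.approx()` (mentioned earlier in this lecture). A self-contained sketch - the `calculate_distance` implementation here is a stand-in, not repository code:

```python
import pytest


def calculate_distance(x1: float, y1: float, x2: float, y2: float) -> float:
    """Stand-in implementation: Euclidean distance between two points."""
    return ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5


def test_calculate_distance_approx() -> None:
    # pytest.approx handles the floating-point tolerance for you
    assert calculate_distance(0, 0, 3, 4) == pytest.approx(5.0)
```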
Example of LLM failure #2: Logic in tests
LLMs sometimes generate tests with loops, conditionals, or calculations. This makes tests hard to understand and debug.
# ❌ LLM might generate this:
def test_find_intersection_multiple_angles():
"""Test intersection for various angles"""
angles = [-10, -20, -30, -45]
for angle in angles: # ❌ Loop in test!
x, y, dist = find_intersection(x_road, y_road, angle)
if x is not None: # ❌ Conditional in test!
assert dist > 0
else:
assert dist is None # ❌ Complex logic!
Problems with this test:
- Loop makes it hard to see which angle failed
- Conditional logic means test might not actually test anything
- If test fails, which angle was the problem?
- The test itself needs testing!
Human fix: Separate tests, no logic
# ✅ Human refactors to simple, clear tests:
def test_find_intersection_returns_positive_distance_for_minus_10_degrees():
"""Test -10° angle returns positive distance"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
assert dist > 0 # No conditionals, no loops!
def test_find_intersection_returns_positive_distance_for_minus_20_degrees():
"""Test -20° angle returns positive distance"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
x, y, dist = find_intersection(x_road, y_road, -20.0, 0.0, 10.0)
assert dist > 0
# Or use pytest parametrize (cleaner for multiple similar tests):
@pytest.mark.parametrize("angle", [-10, -20, -30, -45])
def test_find_intersection_returns_positive_distance_for_downward_angles(angle):
"""Test that all downward angles return positive distance"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
x, y, dist = find_intersection(x_road, y_road, angle, 0.0, 10.0)
assert dist > 0 # Simple assertion, pytest runs this test 4 times with different angles
Key principle: Tests should be straight-line code (inspired by Google’s testing practices)
- ✅ DO: Write simple, linear test code
- ✅ DO: Use pytest's `@pytest.mark.parametrize` for multiple inputs
- ❌ DON’T: Write conditionals in tests
- ❌ DON’T: Calculate expected values in tests (use hardcoded values)
Why?
- Tests must be trivially correct by inspection
- If your test has logic, you need tests for your tests!
- Complex test logic hides bugs in the test itself
- When a test fails, you want to know immediately what’s wrong
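Hardcoding expected values pairs naturally with parametrize: each case is an (input, expected) tuple, and nothing is computed inside the test body. A minimal self-contained sketch (toy slope math, not the project's geometry code):

```python
# Parametrized tests with hardcoded expected values (toy example):
# each case carries its own literal expectation instead of recomputing it.
import math

import pytest


@pytest.mark.parametrize(
    ("angle_deg", "expected_slope"),
    [
        (0.0, 0.0),     # horizontal ray
        (45.0, 1.0),    # hardcoded literal, not calculated in the test
        (-45.0, -1.0),
    ],
)
def test_slope_matches_hardcoded_expectation(angle_deg: float, expected_slope: float) -> None:
    slope = math.tan(math.radians(angle_deg))
    assert slope == pytest.approx(expected_slope)
```

If a case fails, pytest reports exactly which (angle_deg, expected_slope) pair broke: the same diagnosability you'd lose with a loop.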
The Human-LLM Loop:
1. Human: "Test find_intersection for edge cases"
2. LLM: Generates 5 tests
3. Human: "This one is wrong - camera below road should still work"
4. LLM: Fixes that test
5. Human: Runs tests, finds bug in actual code (not test)
6. Human: Fixes code
7. Tests pass → Confidence!
3.1 Warning: LLMs Love Mocking (But You Shouldn’t Overuse It)
When you use LLMs to generate tests, they often suggest mocking - replacing real objects with fake ones to verify function calls. This can lead to brittle tests that break when you refactor code.
Key Google Testing Principle: Test State, Not Interactions
There are two ways to verify that code works:
- State Testing: Observe the system after invoking it to verify outcomes
- Interaction Testing: Verify expected sequences of function calls on collaborators (using mocks)
State testing is less brittle because it focuses on “what” results occurred rather than “how” results were achieved.
3.1.1 What Mocking Looks Like (and When It’s Problematic)
# ❌ LLM-generated test with excessive mocking
from unittest.mock import patch
def test_find_intersection_calls_tan():
"""Test that find_intersection calls np.tan"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
with patch('numpy.tan') as mock_tan:
mock_tan.return_value = 0.176 # Mocked slope
x, y, dist = find_intersection(x_road, y_road, -10.0)
mock_tan.assert_called_once() # ❌ Tests HOW, not WHAT
Problem: This test breaks if you:
- Change the trigonometry implementation (e.g., use cosine/sine instead of tan)
- Optimize the slope calculation
- Use a different math library
- But the function still produces correct results!
This is a brittle test - it fails when implementation changes, even though behavior stays the same.
3.1.2 Better Approach: Test the Result, Not the Method Calls
# ✅ Test the RESULT (state), not the METHOD CALLS (interactions)
def test_find_intersection_returns_correct_intersection():
"""Test that intersection position is geometrically correct"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
# Assert FINAL STATE (outcomes)
assert x is not None, "Should find intersection for downward angle"
assert 0 <= x <= 20, f"Intersection x should be within road bounds, got {x}"
assert y >= 0, f"Intersection y should be above ground, got {y}"
assert dist > 0, f"Distance should be positive, got {dist}"
# No assumptions about which numpy functions were called!
Benefits:
- ✅ Tests behavior, not implementation
- ✅ Survives refactoring (you can change HOW it calculates, as long as WHAT it returns is correct)
- ✅ Clear what’s being tested (intersection position and distance)
- ✅ Fails only when actual behavior changes
3.1.3 When IS Mocking Appropriate?
Mocking is valuable in specific scenarios:
✅ DO mock:
- External APIs (network calls, HTTP requests)
- Databases (slow, require setup)
- File I/O (slow, requires filesystem state)
- Time/Randomness (non-deterministic behavior)
- Expensive operations (takes seconds/minutes to run)
❌ DON’T mock:
- Pure functions (like find_intersection() - deterministic, fast)
- Simple data structures (NumPy arrays, lists, dicts)
- Internal implementation details (which functions your code calls)
- Math libraries (NumPy, math module - they’re fast and deterministic)
3.1.4 Example: When Mocking Is Appropriate
# ✅ Good use of mocking: External API
from unittest.mock import patch, Mock
def test_fetch_weather_data_returns_temperature():
"""Test weather fetching without hitting real API"""
# Mock the external HTTP request (slow, requires network)
with patch('requests.get') as mock_get:
mock_response = Mock()
mock_response.json.return_value = {'temp': 22.5}
mock_get.return_value = mock_response
# Now test your function
temp = fetch_weather_data('Berlin')
# Assert RESULT (not implementation)
assert temp == 22.5
Why this mocking is good:
- External API is slow and requires network
- API might be rate-limited or cost money
- API might not be available in test environment
- We’re still testing the RESULT (temp == 22.5), not method calls
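The same reasoning applies to the "Time/Randomness" bullet above. A minimal sketch with a hypothetical make_timestamped_label() helper (not part of road_profile_viewer), freezing time.time() so the assertion is deterministic:

```python
# ✅ Good use of mocking: non-deterministic time
import time
from unittest.mock import patch


def make_timestamped_label(prefix: str) -> str:
    """Hypothetical helper: build a label like 'run-1700000000' from Unix time."""
    return f"{prefix}-{int(time.time())}"


def test_make_timestamped_label_is_deterministic_under_frozen_time():
    # Freeze the clock so the test can assert an exact RESULT
    with patch("time.time", return_value=1_700_000_000.0):
        assert make_timestamped_label("run") == "run-1700000000"
```

As with the weather example, we still assert the result (the returned string), not which functions were called.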
3.1.5 Rule of Thumb: Prefer Real Objects Over Mocks
For find_intersection() and similar functions:
- ✅ Use real NumPy arrays - they’re fast (microseconds) and deterministic
- ✅ Use real math functions - np.tan() and np.cos() are instant
- ✅ Test the final state - intersection coordinates, distance values
- ❌ Don’t mock NumPy - no benefit, adds brittleness
Google’s guideline: “Use real objects when they’re fast and deterministic. Mock only when necessary.”
3.1.6 Summary: State Testing vs. Interaction Testing
| Aspect | State Testing (Preferred) | Interaction Testing (Use Sparingly) |
|---|---|---|
| What it tests | Final results/outcomes | Sequence of function calls |
| Assertion style | assert x == expected_value | mock.assert_called_once() |
| Brittleness | Low - survives refactoring | High - breaks when implementation changes |
| When to use | Always, when possible | External dependencies only |
| Example | assert dist > 0 | mock_api.get.assert_called() |
Key takeaway: When LLMs suggest mocking, ask yourself: “Is this dependency slow or external?” If not, use the real object and test the state!
Best Practice: Use LLM to START, human to VERIFY and REFINE. Watch out for over-mocking!
4. Part 4: Hands-On Exercise - Test geometry.py
4.1 Exercise: Write Comprehensive Tests for geometry.py
Goal: Test all functions in src/road_profile_viewer/geometry.py using equivalence classes and boundary analysis.
Setup:
# Create test structure if not already done
$ mkdir -p tests
$ touch tests/__init__.py
$ touch tests/test_geometry.py
Task 1: Test find_intersection() - Equivalence Classes
Use an LLM to generate initial tests, then refine:
# tests/test_geometry.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.geometry import find_intersection, calculate_ray_line
class TestFindIntersection:
"""Test suite for find_intersection() function."""
def test_normal_downward_angle(self) -> None:
"""Equivalence class: Normal downward angle (-90° < angle < 0°)"""
x_road: NDArray[np.float64] = np.array([0, 10, 20, 30], dtype=np.float64)
y_road: NDArray[np.float64] = np.array([0, 2, 4, 6], dtype=np.float64)
x, y, dist = find_intersection(x_road, y_road, -45.0, 0.0, 10.0)
assert x is not None and y is not None and dist is not None
def test_horizontal_angle(self) -> None:
"""Equivalence class: Horizontal ray (angle = 0°)"""
# Your test here (let LLM generate, then review)
pass
def test_vertical_angle_boundary(self) -> None:
"""Boundary case: Vertical angle (90°)"""
pass
def test_empty_road_arrays(self) -> None:
"""Boundary case: Empty arrays"""
pass
def test_single_point_road(self) -> None:
"""Boundary case: Road with only one point"""
pass
Task 2: Test calculate_ray_line() - Boundary Cases
class TestCalculateRayLine:
"""Test suite for calculate_ray_line() function."""
def test_normal_angle(self) -> None:
"""Test with normal angle"""
x_ray: NDArray[np.float64]
y_ray: NDArray[np.float64]
x_ray, y_ray = calculate_ray_line(-10.0, camera_x=0.0, camera_y=2.0)
assert len(x_ray) == 2
assert len(y_ray) == 2
assert x_ray[0] == 0.0 # Starts at camera
assert y_ray[0] == 2.0
def test_vertical_angle(self) -> None:
"""Boundary: Vertical angle (90°)"""
x_ray, y_ray = calculate_ray_line(90.0)
# Should handle vertical line
assert x_ray[0] == x_ray[1] # Vertical means same x
def test_zero_angle(self) -> None:
"""Boundary: Horizontal angle (0°)"""
x_ray, y_ray = calculate_ray_line(0.0, camera_y=2.0)
# Horizontal line at y=2.0
assert y_ray[0] == y_ray[1] == 2.0
Task 3: Run Tests and Iterate
$ uv run pytest tests/test_geometry.py -v
# If failures occur:
# 1. Is the test wrong? Fix the test.
# 2. Is the code wrong? Fix the code.
# 3. Unsure? Add a print statement in the test to see actual values.
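As you fill in these tests, beware exact `==` comparisons on floats: geometry results carry rounding error. A stdlib sketch (independent of geometry.py) of tolerance-based comparison; inside pytest tests, pytest.approx is the idiomatic equivalent:

```python
# Floating-point results rarely compare exactly equal - use a tolerance
import math

slope = math.tan(math.radians(45.0))
# ❌ `slope == 1.0` may fail due to binary rounding
# ✅ tolerance-based comparison is robust
assert math.isclose(slope, 1.0, rel_tol=1e-9)
```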
Success criteria:
- ✅ At least 10 tests for find_intersection()
- ✅ At least 5 tests for calculate_ray_line()
- ✅ All equivalence classes covered
- ✅ Boundary cases tested
- ✅ All tests pass
5. Part 5: Module and Integration Testing
5.1 What are Module/Integration Tests?
Unit test: Tests one function in isolation.
Module/Integration test: Tests multiple modules working together.
Example:
# Unit test: Test ONLY geometry.find_intersection()
def test_find_intersection():
x_road = np.array([0, 10])
y_road = np.array([0, 2])
x, y, dist = find_intersection(x_road, y_road, -10.0)
assert x is not None
# Integration test: Test geometry + road together
def test_road_generation_with_intersection() -> None:
"""Test that road generation produces data that geometry can process."""
from road_profile_viewer.road import generate_road_profile
from road_profile_viewer.geometry import find_intersection
# Generate road using road.py
x_road: NDArray[np.float64]
y_road: NDArray[np.float64]
x_road, y_road = generate_road_profile(num_points=100, x_max=80)
# Use geometry.py to find intersection
x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=2.0)
# Verify integration works
assert x is not None, "Generated road should intersect camera ray"
assert 0 <= x <= 80, "Intersection should be within road bounds"
assert y >= 0, "Intersection should be above ground"
Key difference:
- Unit test: Mock/fake data (np.array([0, 10]))
- Integration test: Real data from actual modules (generate_road_profile())
5.2 When to Write Integration Tests
Write integration tests to catch:
- Interface mismatches
  # road.py returns (x, y)
  # geometry.py expects (x, y)
  # If road.py changes to return (x, y, metadata), tests will catch it!
- Data format issues
  # What if generate_road_profile() returns list instead of np.array?
  # Integration test will fail!
- Assumptions about data
  # geometry.py assumes x_road is sorted
  # What if road.py returns unsorted data?
  # Integration test catches this!
5.3 Example Integration Test Suite
# tests/test_integration.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.road import generate_road_profile
from road_profile_viewer.geometry import find_intersection, calculate_ray_line
class TestRoadGeometryIntegration:
"""Test integration between road and geometry modules."""
def test_generated_road_intersects_downward_ray(self) -> None:
"""Verify that generated roads can be processed by geometry functions."""
# Use real road generation
x_road: NDArray[np.float64]
y_road: NDArray[np.float64]
x_road, y_road = generate_road_profile(num_points=100, x_max=80)
# Verify it works with geometry module
x, y, dist = find_intersection(x_road, y_road, -10.0, camera_x=0.0, camera_y=10.0)
assert x is not None, "Should find intersection with generated road"
assert isinstance(x, (int, float, np.number)), "Should return numeric type"
assert isinstance(y, (int, float, np.number)), "Should return numeric type"
assert isinstance(dist, (int, float, np.number)), "Should return numeric type"
def test_road_data_format_compatible_with_geometry(self) -> None:
"""Verify road.py returns data in format geometry.py expects."""
x_road: NDArray[np.float64]
y_road: NDArray[np.float64]
x_road, y_road = generate_road_profile()
# Check data types
assert isinstance(x_road, np.ndarray), "x_road should be numpy array"
assert isinstance(y_road, np.ndarray), "y_road should be numpy array"
# Check lengths match
assert len(x_road) == len(y_road), "Road arrays should have same length"
# Check data is sorted
assert np.all(np.diff(x_road) > 0), "x_road should be strictly increasing"
    @pytest.mark.parametrize("num_points", [10, 50, 100, 200])
    @pytest.mark.parametrize("x_max", [40, 80, 120])
    def test_varying_road_parameters_work_with_geometry(self, num_points: int, x_max: int) -> None:
        """Test that different road generation parameters work with geometry."""
        # Parametrize instead of nested loops (tests should be straight-line
        # code!): each of the 12 combinations runs as its own test case
        x_road, y_road = generate_road_profile(num_points, x_max)
        x, y, dist = find_intersection(x_road, y_road, -10.0)
        # Should not crash; may or may not find an intersection
        assert x is None or isinstance(x, (int, float, np.number))
Run integration tests:
$ uv run pytest tests/test_integration.py -v
5.4 E2E Testing: Why We Skip It (For Now)
End-to-End test would mean:
- Start the Dash application
- Open a browser (using Selenium/Playwright)
- Enter angle in input field
- Verify graph updates correctly
Why skip it in this course?
- Complex setup: Requires Selenium, browser drivers, etc.
- Slow: Each test takes seconds to run
- Brittle: UI changes break tests frequently
- Diminishing returns: Unit + integration tests catch 90% of bugs
For this course:
- ✅ 70% Unit tests (fast, focused)
- ✅ 20% Integration tests (moderate)
- ❌ 10% E2E tests (skip - too advanced)
In real projects: Yes, E2E tests matter. But get your pyramid base solid first!
6. Test Maintainability: Writing Tests That Don't Break
You’ve learned how to write unit tests, integration tests, and how to use LLMs to accelerate testing. Now let’s talk about a critical quality: test maintainability.
The Problem:
You write 20 unit tests. They all pass. ✅
You refactor find_intersection() to improve performance (no behavior change). Suddenly 10 tests fail. ❌
Question: Is this good or bad?
Bad! These tests are brittle - they break when implementation changes, even though behavior stayed the same.
6.1 The Brittleness Problem
Brittle tests fail in response to unrelated production code changes that introduce no real bugs. They:
- Force you to repeatedly tweak tests with each refactoring
- Consume disproportionate maintenance time
- Scale poorly in large codebases
- Undermine the “automated” nature of test suites
- Make you afraid to refactor
Google’s insight: In large codebases, brittle tests are a major productivity killer. Engineers spend more time fixing tests than fixing actual bugs!
6.2 Example: Brittle Test (Tests Implementation Details)
# ❌ BRITTLE: Tests HOW the function works internally
from unittest.mock import patch
def test_find_intersection_uses_tan_for_slope():
"""Test that function uses np.tan to calculate slope"""
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
# Mock np.tan to verify it's called
with patch('numpy.tan') as mock_tan:
mock_tan.return_value = 0.176 # Mocked slope value
find_intersection(x_road, y_road, -10.0)
assert mock_tan.called # ❌ Breaks if you change slope calculation!
Problem: If you refactor to use a different trigonometry approach (e.g., using atan2 or pre-computed lookup tables), this test fails even though:
- The intersection position is still correct
- The behavior is identical
- No bugs were introduced
This test is testing the IMPLEMENTATION, not the BEHAVIOR.
6.3 Better: Robust Test (Tests Public Behavior)
# ✅ ROBUST: Tests WHAT the code does, not HOW it does it
def test_find_intersection_returns_correct_position_for_downward_angle():
"""Test that intersection position is geometrically correct for downward ray"""
# Arrange
x_road = np.array([0, 10, 20], dtype=np.float64)
y_road = np.array([0, 2, 4], dtype=np.float64)
# Act
x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)
# Assert behavior: intersection should be within road bounds
assert x is not None, "Should find intersection for downward angle"
assert 0 <= x <= 20, f"Intersection x should be in road bounds [0, 20], got {x}"
assert 0 <= y <= 4, f"Intersection y should be in road bounds [0, 4], got {y}"
assert dist > 0, f"Distance should be positive, got {dist}"
# No assumptions about HOW it calculated this!
# Works with tan, atan2, lookup tables, or any other implementation
Benefits:
- ✅ Tests behavior, not implementation
- ✅ Survives refactoring (change HOW, behavior stays same)
- ✅ Clear what’s being tested (intersection correctness)
- ✅ Fails only when actual behavior changes
6.4 The Key Principle: Test Via Public APIs
Google’s guideline: “Write tests that invoke the system being tested in the same way its users would.”
What does this mean for find_intersection()?
The public API (what users see):
x, y, dist = find_intersection(x_road, y_road, angle_degrees, camera_x, camera_y)
Public contract (what users expect):
- Input: Arrays and angles
- Output: Intersection coordinates or None
- Behavior: Find where camera ray intersects road
✅ DO test:
- Does it return correct intersection coordinates?
- Does it handle edge cases (empty arrays, vertical angles)?
- Does it return None when ray misses road?
❌ DON’T test:
- Does it call np.tan() internally?
- Does it use a specific algorithm?
- What order does it process road segments?
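A self-contained illustration of the principle (toy code standing in for the geometry module): the test touches only the public return value, so renaming or replacing the private helper can never break it.

```python
import math


def _slope_from_angle(angle_deg: float) -> float:
    # Private helper - an implementation detail, free to change or disappear
    return math.tan(math.radians(angle_deg))


def ray_height_at(x: float, angle_deg: float, camera_y: float = 1.5) -> float:
    """Public API: height of the camera ray at horizontal distance x."""
    return camera_y + _slope_from_angle(angle_deg) * x


def test_horizontal_ray_stays_at_camera_height() -> None:
    # Asserts the public contract only - no mocking of _slope_from_angle
    assert ray_height_at(10.0, angle_deg=0.0, camera_y=2.0) == 2.0
```

Swap `_slope_from_angle` for a lookup table tomorrow and this test still passes, because the public contract is unchanged.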
6.5 Four Categories of Code Changes (Google’s Framework)
Google categorizes production code changes and how tests should respond:
| Change Type | Example | Should Tests Change? | Why |
|---|---|---|---|
| Pure Refactoring | Optimize find_intersection() algorithm | ❌ NO | Tests verify behavior remains constant |
| New Features | Add find_all_intersections() function | ✅ Add new tests only | Existing tests stay unchanged |
| Bug Fixes | Fix crash on empty arrays | ✅ Add new test case | Test the bug to prevent regression |
| Behavior Changes | Return list of intersections instead of first | ✅ Modify existing tests | Contract changed, tests must reflect new behavior |
The ideal: Tests only change when behavior changes (category 4). All other changes should leave tests untouched!
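Category 3 ("Bug Fixes") deserves emphasis: pin every fixed bug with a regression test so it cannot silently return. A self-contained sketch with a hypothetical angle parser (not from the project):

```python
def parse_angle(text: str) -> float:
    """Hypothetical parser: accepts '-10.5' or '-10.5deg', returns degrees."""
    # Bug fix: a trailing 'deg' suffix used to raise ValueError
    return float(text.strip().removesuffix("deg"))


def test_parse_angle_regression_trailing_deg_suffix() -> None:
    # Regression test added alongside the fix - existing tests stay untouched
    assert parse_angle("-10.5deg") == -10.5


def test_parse_angle_plain_number_still_works() -> None:
    assert parse_angle("-10.5") == -10.5
```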
6.6 Real-World Example: Refactoring Scenario
Scenario: You want to optimize find_intersection() by using a faster algorithm.
Before (current implementation):
def find_intersection(x_road, y_road, angle_degrees, camera_x=0, camera_y=1.5):
"""Find intersection using linear search"""
angle_rad = -np.deg2rad(angle_degrees)
# Check vertical
if np.abs(np.cos(angle_rad)) < 1e-10:
return None, None, None
slope = np.tan(angle_rad)
# Linear search through segments
for i in range(len(x_road) - 1):
x1, y1 = x_road[i], y_road[i]
x2, y2 = x_road[i + 1], y_road[i + 1]
# ... intersection calculation
After (optimized implementation):
def find_intersection(x_road, y_road, angle_degrees, camera_x=0, camera_y=1.5):
"""Find intersection using binary search (10x faster!)"""
angle_rad = -np.deg2rad(angle_degrees)
# Different vertical check using sine
if np.abs(np.sin(angle_rad - np.pi/2)) < 1e-10: # Changed!
return None, None, None
# Use atan2 instead of tan (more numerically stable)
direction = np.array([np.cos(angle_rad), np.sin(angle_rad)]) # Changed!
# Binary search through segments (faster for large roads)
left, right = 0, len(x_road) - 1
# ... binary search logic (completely different!)
What happens to tests?
❌ Brittle tests (testing implementation):
def test_find_intersection_uses_tan():
with patch('numpy.tan') as mock_tan:
find_intersection(...)
assert mock_tan.called # ❌ FAILS! We use atan2 now!
✅ Robust tests (testing behavior):
def test_find_intersection_returns_correct_position():
x, y, dist = find_intersection(x_road, y_road, -10.0)
assert 0 <= x <= 20 # ✅ PASSES! Still finds correct intersection!
Result:
- Brittle tests: 10 failures (all false positives!)
- Robust tests: 0 failures (behavior unchanged!)
This is the difference between productive testing and test hell!
6.7 Practical Guidelines for Maintainable Tests
✅ DO:
- Test public API only - Call functions the same way users would
- Test final outcomes - Assert on return values, not internal state
- Use real objects when fast - Avoid mocking NumPy, math functions
- Test behaviors, not methods - One test per behavior (not one per function)
- Expect tests to be stable - Good tests only change when requirements change
❌ DON’T:
- Mock internal implementation - Don't mock np.tan() in your own code
- Assert on private state - Don't check internal variables
- Test method call sequences - Don't verify tan was called before cos
- Couple tests to algorithm - Don't assume linear vs. binary search
- Test performance in unit tests - Speed is separate from correctness
6.8 Summary: Maintainable vs. Brittle Tests
| Aspect | Maintainable Tests | Brittle Tests |
|---|---|---|
| What they test | Public behavior (WHAT) | Implementation details (HOW) |
| Assertion style | assert x is not None | mock_tan.assert_called() |
| Refactoring impact | Tests still pass ✅ | Tests break ❌ |
| Developer experience | "Tests just work!" | "Ugh, fix tests again..." |
| When they fail | Real bug found | Often false alarm |
Key takeaway: The best tests are the ones you never have to touch until a real bug appears. Test WHAT your code does, not HOW it does it!
7. Part 7: Applying Feature Branch Workflow
Let’s add these tests using the professional workflow from Chapter 02 (Feature Development)!
7.1 Step 1: Create Feature Branch
$ git checkout main
$ git pull origin main
$ git checkout -b feature/add-unit-tests
7.2 Step 2: Create Test Files
$ mkdir -p tests
$ touch tests/__init__.py
$ touch tests/test_geometry.py
$ touch tests/test_road.py
$ touch tests/test_integration.py
7.3 Step 3: Write Tests (Using LLM Assistance)
Use Copilot/Claude to generate initial tests, then refine:
# tests/test_geometry.py
# (See examples from Part 5 above)
# tests/test_road.py
import numpy as np
from numpy.typing import NDArray
import pytest
from road_profile_viewer.road import generate_road_profile
class TestGenerateRoadProfile:
def test_default_parameters(self) -> None:
"""Test road generation with default parameters."""
x: NDArray[np.float64]
y: NDArray[np.float64]
x, y = generate_road_profile()
assert len(x) == 100 # Default num_points
assert x[0] == 0.0
assert y[0] == 0.0 # Road starts at origin
def test_custom_num_points(self) -> None:
"""Test road generation with custom number of points."""
x, y = generate_road_profile(num_points=50)
assert len(x) == 50
assert len(y) == 50
def test_road_is_continuous(self) -> None:
"""Verify generated road has no gaps."""
x, y = generate_road_profile(num_points=100)
assert np.all(np.diff(x) > 0), "x should be strictly increasing"
7.4 Step 4: Run Tests Locally
$ uv run pytest tests/ -v
============================= test session starts ==============================
tests/test_geometry.py::TestFindIntersection::test_normal_downward_angle PASSED
tests/test_geometry.py::TestFindIntersection::test_empty_road_arrays PASSED
tests/test_road.py::TestGenerateRoadProfile::test_default_parameters PASSED
...
============================== 15 passed in 0.25s ===============================
7.5 Step 5: Commit Your Tests
$ git add tests/
$ git commit -m "Add unit tests for geometry and road modules
- test_geometry.py: 10 tests covering equivalence classes and boundaries
- test_road.py: 5 tests for road generation
- test_integration.py: 3 integration tests
- All tests use AAA pattern (Arrange-Act-Assert)
- Tests generated with LLM assistance, refined by human review
Total: 18 tests, all passing"
7.6 Step 6: Push and Create PR
$ git push -u origin feature/add-unit-tests
$ gh pr create --title "Add unit tests for geometry and road modules" \
--body "Adds comprehensive test suite using pytest.
**Coverage:**
- geometry.py: 10 unit tests (equivalence classes + boundaries)
- road.py: 5 unit tests
- Integration: 3 tests verifying modules work together
**Testing approach:**
- Used LLM to generate initial test structure
- Human review to fix assertions and add edge cases
- All tests pass locally
**Next steps:**
- Chapter 03 (TDD and CI) will add CI integration
- TDD workflow for new features"
7.7 Step 7: CI Will Run (in Chapter 03)
In the next lecture, we’ll update CI to run tests automatically!
8. Summary: What You've Accomplished
8.1 Before This Lecture
✅ Ruff (style)
✅ Pyright (types)
❌ No tests → Logic bugs slip through
8.2 After This Lecture
✅ Ruff (style)
✅ Pyright (types)
✅ Pytest (correctness) → 18 tests catch bugs!
8.3 Key Concepts You've Mastered
1. Testing Pyramid
- 70% Unit tests (fast, focused)
- 20% Integration tests (moderate)
- 10% E2E tests (slow, expensive)
2. Unit Testing Skills
- AAA pattern (Arrange-Act-Assert)
- Equivalence class partitioning
- Boundary value analysis
- Writing focused, fast tests
3. LLM-Assisted Testing
- LLMs generate boilerplate (fast start)
- Humans verify correctness (essential!)
- Iterative refinement loop
- Breaks the “test cone” pattern
4. Practical Skills
- Created test file structure
- Wrote 18+ unit tests
- Used pytest to run tests
- Applied feature branch workflow
8.4 The Difference Tests Make
Without tests:
Developer: *Changes find_intersection()*
Developer: *Manually tests by clicking UI*
Developer: "Looks good!" *Pushes to main*
User: *Discovers bug with vertical angle* "App crashed!"
With tests:
Developer: *Changes find_intersection()*
Developer: $ pytest tests/
Developer: "❌ Test failed! Bug caught before commit"
Developer: *Fixes bug, tests pass*
Developer: *Pushes with confidence*
User: *Everything works!*
9. Key Takeaways
Remember these principles:
- Code quality ≠ Code correctness - Ruff catches style, tests catch bugs
- Testing pyramid, not cone - More unit tests, fewer E2E tests
- Equivalence classes + boundaries - Test smart, not exhaustive
- AAA pattern - Arrange, Act, Assert for readable tests
- LLMs assist, humans verify - Use LLMs for boilerplate, not correctness
- Fast tests = frequent testing - Unit tests run in milliseconds
- Tests give confidence - Refactor without fear
- Feature branch workflow applies to tests - Tests are a feature too!
You’re now ready for Chapter 03 (TDD and CI): Test-Driven Development!
In the next lecture, we’ll flip the script: write tests BEFORE code (TDD), integrate tests into CI, and make failing tests block merges.
10. Further Reading
On Testing:
- Kent Beck’s “Test-Driven Development by Example”
- Martin Fowler’s “Testing Strategies” article
- pytest documentation: https://docs.pytest.org/
On the Testing Pyramid:
- Martin Fowler: TestPyramid
- Google Testing Blog: Testing on the Toilet series
On Equivalence Classes:
- Software Testing Fundamentals: Equivalence Partitioning
- Boundary Value Analysis techniques
Interactive Learning:
- pytest tutorial: https://docs.pytest.org/en/stable/getting-started.html
- Real Python: Effective Python Testing With Pytest
Last Updated: 2025-11-04 Prerequisites: Lectures 1-4 (Especially Chapter 02 (Refactoring) - Modular Code), Chapter 03 (Testing Basics) Next Lecture: Chapter 03 (TDD and CI) - Test-Driven Development & CI Integration