Appendix 3: Pytest Assertion Reference - Beyond the Basics
November 2025 (12357 Words, 69 Minutes)
Introduction: Your Testing Toolkit
You’ve been writing tests for weeks now. You know the basics: assert, pytest.approx(), pytest.raises(). But as you write more complex tests, you start noticing patterns. Your tests are getting messy. LLM-generated tests don’t quite look right. You’re not sure how to test warnings, or how to make exception tests more specific.
This appendix is your reference guide. Think of it as the pytest documentation, but focused on the patterns you’ll actually use in practice.
What This Appendix Covers
This is NOT a tutorial. This is a practical reference for assertion patterns, organized by use case:
- Advanced Exception Testing - Beyond basic pytest.raises()
- pytest.approx Deep Dive - Floating-point comparisons for complex data structures
- Warning Testing - Testing deprecation warnings and user warnings
- Test Control - Skipping, failing, and marking tests
- Assertion Introspection - Understanding pytest’s assertion rewriting magic
- Parametrization Patterns - Testing multiple scenarios efficiently
- Common Anti-Patterns - What NOT to do (especially common in LLM-generated tests)
- Quick Reference - Cheat sheets and lookup tables
What You Already Know
From Chapter 03 (Testing Fundamentals), you already understand:
- Boundary value analysis - Testing at boundaries like sys.float_info.max, math.inf, math.ulp()
- Basic pytest.approx() - Comparing floating-point numbers with tolerances
- Basic pytest.raises() - Testing that code raises exceptions
- State testing - Testing results, not method calls
- AAA pattern - Arrange, Act, Assert
This appendix builds on that foundation with advanced patterns and best practices.
How to Use This Appendix
While writing tests:
- Jump to the relevant chapter for quick syntax lookup
- Check anti-patterns section if something feels wrong
- Use quick reference tables for at-a-glance help
When reviewing LLM-generated tests:
- Check Chapter 7 for common anti-patterns
- Verify exception tests use the match parameter (Chapter 1)
- Ensure parametrization is used appropriately (Chapter 6)
When debugging failing tests:
- Chapter 5 explains assertion introspection output
- Chapter 2 covers pytest.approx tolerance issues
- Chapter 4 covers test control for flaky tests
Key References
Throughout this appendix, we’ll reference:
- Official pytest documentation: https://docs.pytest.org/
- pytest API reference: https://docs.pytest.org/en/stable/reference/reference.html
- Software Engineering at Google (O’Reilly) - Testing chapter
- IEEE 754 Standard - Floating-point arithmetic
- NerdWallet Engineering Blog - pytest best practices
Let’s dive in.
Advanced Exception Testing
You already know the basics of pytest.raises():
def test_division_by_zero():
    with pytest.raises(ZeroDivisionError):
        1 / 0
But what if you want to verify the exception message? Or access the exception object for detailed assertions? Or test code that raises multiple exceptions?
The match Parameter: Validating Exception Messages
The most common improvement to pytest.raises() is the match parameter, which validates the exception message using a regular expression.
Basic syntax:
with pytest.raises(ValueError, match=r"must be positive"):
    validate_input(-5)
Why use match?
- Specificity - Ensures you’re raising the RIGHT exception for the RIGHT reason
- Regression prevention - Catches changes to error messages
- Documentation - Makes test intent clearer
Example: Testing input validation
# road_profile_viewer/filters.py
from numpy.typing import NDArray
import numpy as np

def apply_lowpass_filter(data: NDArray[np.float64], cutoff_freq: float) -> NDArray[np.float64]:
    """Apply lowpass filter to road profile data.

    Args:
        data: Input road profile data
        cutoff_freq: Cutoff frequency in Hz (must be positive)

    Raises:
        ValueError: If cutoff_freq is not positive
    """
    if cutoff_freq <= 0:
        raise ValueError(f"Cutoff frequency must be positive, got {cutoff_freq}")
    # ... filter implementation
❌ Bad test (no message validation):
def test_lowpass_filter_negative_cutoff():
    data = np.array([1.0, 2.0, 3.0])
    with pytest.raises(ValueError):  # Too vague!
        apply_lowpass_filter(data, cutoff_freq=-10.0)
This test would pass even if the code raised ValueError("Invalid data") instead of the cutoff frequency error.
✅ Good test (validates message):
def test_lowpass_filter_negative_cutoff():
    # Arrange
    data = np.array([1.0, 2.0, 3.0])
    invalid_cutoff = -10.0

    # Act & Assert
    with pytest.raises(ValueError, match=r"Cutoff frequency must be positive"):
        apply_lowpass_filter(data, cutoff_freq=invalid_cutoff)
Pro Tips for match parameter:
- Use raw strings - match=r"pattern" avoids escaping backslashes
- match uses re.search() - The pattern can appear anywhere in the message (not a full-string match)
- Escape special regex characters - Use re.escape() for literal strings:
# If the error message contains parentheses, brackets, or special chars
import re

expected_msg = "Invalid range: [0, 100]"
with pytest.raises(ValueError, match=re.escape(expected_msg)):
    validate_range(200)
- Test dynamic parts with regex groups - For messages with dynamic values:
# Message: "Cutoff frequency must be positive, got -10.0"
with pytest.raises(ValueError, match=r"must be positive, got -?\d+\.?\d*"):
    apply_lowpass_filter(data, cutoff_freq=-10.0)
Reference: pytest.raises() documentation
Accessing Exception Details: ExceptionInfo
Sometimes you need to inspect the exception object itself - not just its message. Use pytest.raises() as a context manager and access the exception via .value:
Syntax:
with pytest.raises(ValueError) as exc_info:
    risky_operation()

# Access exception details
assert exc_info.type is ValueError
assert exc_info.value.args[0] == "Expected message"
assert "partial message" in str(exc_info.value)
Example: Testing exception attributes
# road_profile_viewer/exceptions.py
class ProfileDataError(Exception):
    """Custom exception for profile data errors."""

    def __init__(self, message: str, data_length: int, expected_length: int):
        super().__init__(message)
        self.data_length = data_length
        self.expected_length = expected_length

def test_profile_data_error_attributes():
    # Arrange
    invalid_data = np.array([1.0, 2.0])  # Too short
    expected_length = 100

    # Act & Assert
    with pytest.raises(ProfileDataError) as exc_info:
        load_profile_data(invalid_data, expected_length=expected_length)

    # Access custom attributes
    assert exc_info.value.data_length == 2
    assert exc_info.value.expected_length == 100
    assert "length" in str(exc_info.value).lower()
ExceptionInfo attributes:
- exc_info.type - Exception class (e.g., ValueError)
- exc_info.value - Exception instance (the actual exception object)
- exc_info.traceback - Traceback object (for advanced debugging)
When to use ExceptionInfo:
- Testing custom exception attributes
- Validating exception causes (via __cause__ or __context__)
- Debugging complex exception chains
- Testing multiple conditions about the exception
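For example, validating an exception's cause looks like this. The convert() helper and its message are hypothetical; the pattern is raising with "raise ... from ..." and asserting on exc_info.value.__cause__:

```python
import pytest

def convert(text: str) -> float:
    """Hypothetical helper that wraps a low-level error in a domain error."""
    try:
        return float(text)
    except ValueError as err:
        raise RuntimeError(f"Could not convert {text!r}") from err

def test_conversion_error_preserves_cause():
    with pytest.raises(RuntimeError, match=r"Could not convert") as exc_info:
        convert("not-a-number")
    # __cause__ holds the original exception set by 'raise ... from ...'
    assert isinstance(exc_info.value.__cause__, ValueError)
```

This catches regressions where a refactor accidentally drops the "from err" clause and loses the original traceback.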
Reference: ExceptionInfo API
Testing Exception Groups (Python 3.11+)
Python 3.11 introduced exception groups - exceptions that bundle multiple exceptions together. Use pytest.raises() with ExceptionGroup (or, on recent pytest versions, the RaisesGroup helper):
Example: Testing concurrent operations
def test_multiple_validation_errors():
    """Test that all validation errors are reported together."""
    invalid_config = {
        "cutoff_freq": -10.0,      # Invalid (negative)
        "sample_rate": 0,          # Invalid (zero)
        "window_size": "invalid",  # Invalid (wrong type)
    }

    # Python 3.11+ syntax
    with pytest.raises(ExceptionGroup) as exc_info:
        validate_filter_config(invalid_config)

    # Check that the group contains the expected exceptions
    exceptions = exc_info.value.exceptions
    assert len(exceptions) == 3
    assert any(isinstance(e, ValueError) and "cutoff" in str(e) for e in exceptions)
    assert any(isinstance(e, ValueError) and "sample_rate" in str(e) for e in exceptions)
    assert any(isinstance(e, TypeError) and "window_size" in str(e) for e in exceptions)
Note: Exception groups are advanced Python 3.11+ features. For this course, focus on standard exception testing patterns above.
Reference: PEP 654 - Exception Groups
When NOT to Use pytest.raises()
Anti-pattern: Testing that code does NOT raise
# ❌ WRONG - Pointless test
def test_division_does_not_raise():
    with pytest.raises(ZeroDivisionError):
        pass  # This fails: pytest.raises errors out when nothing raises!
    # ... what were we testing?
✅ CORRECT - Just call the function
def test_division_with_nonzero_divisor():
    # Arrange
    numerator = 10.0
    denominator = 2.0

    # Act
    result = numerator / denominator

    # Assert
    assert result == 5.0  # If this runs, no exception was raised!
If a test completes without raising an exception, it passes. You don’t need to explicitly test that exceptions DON’T occur.
Quick Reference: Exception Testing
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic exception | with pytest.raises(ValueError): | Test that code raises a specific exception type |
| Exception message | with pytest.raises(ValueError, match=r"pattern"): | Validate that the exception message matches a regex |
| Exception details | with pytest.raises(ValueError) as exc_info: | Access the exception object for detailed assertions |
| Literal message | match=re.escape("literal [text]") | Match an exact message containing special characters |
| Exception groups | with pytest.raises(ExceptionGroup): | Test bundled exceptions (Python 3.11+) |
pytest.approx Deep Dive
You already use pytest.approx() for floating-point comparisons:
assert result == pytest.approx(expected, rel=1e-9)
But pytest.approx() is more powerful than you think. It works with sequences, dictionaries, NumPy arrays, and even nested structures.
How pytest.approx Works: Relative vs Absolute Tolerance
Default tolerances:
- rel=1e-6 (relative tolerance: 0.0001%)
- abs=1e-12 (absolute tolerance)
Comparison rule: A value is considered equal if it satisfies EITHER tolerance:
\[ |\text{actual} - \text{expected}| \leq \max(\text{rel} \times |\text{expected}|, \text{abs}) \]
Example:
import pytest

# For expected = 1000.0, rel=1e-6, abs=1e-12:
# tolerance = max(1e-6 * 1000.0, 1e-12) = 0.001
assert 1000.001 == pytest.approx(1000.0)   # Within 0.001
assert 999.999 == pytest.approx(1000.0)    # Within 0.001
assert 1000.002 != pytest.approx(1000.0)   # Outside 0.001
Why two tolerances?
- Relative tolerance scales with magnitude (good for large numbers)
- Absolute tolerance handles values near zero (where relative tolerance breaks down)
Example: Near-zero values
# For expected = 0.0, rel=1e-6 alone would give zero tolerance!
# That's why we also need abs=1e-12.
assert 1e-13 == pytest.approx(0.0)  # Uses absolute tolerance
assert 1e-11 != pytest.approx(0.0)  # Outside absolute tolerance
Choosing tolerances:
From Chapter 03 (Boundary Analysis), you learned about math.ulp() - the unit in last place:
import math

# For high-precision requirements, use a ULP-based tolerance
x = 1.0
tolerance = 10 * math.ulp(x)  # 10 ULPs = 10 * 2.220446049250313e-16
assert result == pytest.approx(expected, abs=tolerance)
Reference: pytest.approx documentation
pytest.approx with Sequences (Lists, Tuples)
pytest.approx() works element-wise with sequences:
import pytest

# Lists
assert [0.1 + 0.2, 0.2 + 0.4] == pytest.approx([0.3, 0.6])

# Tuples
assert (0.1 + 0.2, 0.2 + 0.4) == pytest.approx((0.3, 0.6))

# Use the same container type on both sides!
result_list = [1.0000001, 2.0000001, 3.0000001]
expected_list = [1.0, 2.0, 3.0]
assert result_list == pytest.approx(expected_list)
Example: Testing filter output
def test_lowpass_filter_output_values():
    # Arrange
    input_data = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
    cutoff_freq = 1.0
    sample_rate = 10.0
    # Expected output (pre-computed or from a reference implementation)
    expected_output = [0.98, 1.95, 2.89, 2.05, 1.02]

    # Act
    result = apply_lowpass_filter(input_data, cutoff_freq, sample_rate)

    # Assert - compare as a list
    assert result.tolist() == pytest.approx(expected_output, rel=1e-2)
Important notes:
- Lengths must match - pytest.approx will fail if sequences have different lengths
- Types must match - Can’t compare list to tuple (convert first)
- Applies to each element - Every element is checked with the same tolerance
pytest.approx with Dictionaries
pytest.approx() compares dictionary values element-wise:
import pytest

result_dict = {
    "mean": 0.1 + 0.2,
    "std": 0.2 + 0.4,
    "max": 1.0000001,
}
expected_dict = {
    "mean": 0.3,
    "std": 0.6,
    "max": 1.0,
}
assert result_dict == pytest.approx(expected_dict)
Example: Testing statistical summary
def test_profile_statistics():
    # Arrange
    profile_data = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

    # Act
    stats = compute_profile_statistics(profile_data)

    # Assert - compare as a dictionary
    expected_stats = {
        "mean": 1.5,
        "median": 1.5,
        "std": 0.70710678,  # sqrt(0.5)
        "min": 0.5,
        "max": 2.5,
    }
    assert stats == pytest.approx(expected_stats, rel=1e-6)
Important notes:
- Keys must match exactly - The comparison fails if the dictionaries have different keys
- Only numeric values are compared approximately - Non-numeric values are checked with ==
- Nested dictionaries require special handling (see below)
pytest.approx with NumPy Arrays
pytest.approx() works with NumPy arrays (most common use case in this course):
import numpy as np
import pytest

# 1D arrays
result = np.array([0.1 + 0.2, 0.2 + 0.4, 0.3 + 0.6])
expected = np.array([0.3, 0.6, 0.9])
assert result == pytest.approx(expected)

# 2D arrays
result_2d = np.array([[1.0000001, 2.0000001],
                      [3.0000001, 4.0000001]])
expected_2d = np.array([[1.0, 2.0],
                        [3.0, 4.0]])
assert result_2d == pytest.approx(expected_2d)
Example: Testing FFT output
def test_fft_output_magnitudes():
    # Arrange - 5 Hz sine wave, exactly periodic over the window
    # (endpoint=False avoids spectral leakage into neighboring bins)
    t = np.linspace(0, 1, 100, endpoint=False)
    signal = np.sin(2 * np.pi * 5.0 * t)

    # Act
    fft_result = np.fft.fft(signal)
    magnitudes = np.abs(fft_result)

    # Expected: peak at the 5 Hz frequency bin
    expected_peak_index = 5
    expected_magnitudes = np.zeros(100)
    expected_magnitudes[expected_peak_index] = 50.0   # N/2 for a unit-amplitude sine
    expected_magnitudes[-expected_peak_index] = 50.0  # Negative-frequency bin

    # Assert - approximate comparison of the full array
    assert magnitudes == pytest.approx(expected_magnitudes, rel=1e-1, abs=1e-10)
When to use numpy.testing instead:
For more advanced NumPy testing, consider numpy.testing module:
import numpy.testing as npt

# More control over NaN and infinity handling
npt.assert_allclose(result, expected, rtol=1e-6, atol=1e-12)

# Assert arrays are exactly equal (no tolerance)
npt.assert_array_equal(result_int, expected_int)

# Check shapes directly with a plain assert
assert result.shape == expected.shape
Comparison:
| Feature | pytest.approx | numpy.testing.assert_allclose |
|---|---|---|
| Syntax | assert result == pytest.approx(expected) | npt.assert_allclose(result, expected) |
| Error messages | pytest's assertion introspection | NumPy-specific error messages |
| NaN handling | Requires nan_ok=True | Built-in with equal_nan=True |
| Infinity handling | Works by default | Works by default |
| Consistency | Same syntax for all pytest tests | NumPy-specific, separate from pytest |
Recommendation: Use pytest.approx() for consistency unless you need NumPy-specific features.
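The equal_nan behavior from the table, sketched with assert_allclose (the array values here are arbitrary):

```python
import numpy as np
import numpy.testing as npt

result = np.array([1.0, np.nan, 3.0])
expected = np.array([1.0, np.nan, 3.0])

# Passes: matching NaN positions are treated as equal when equal_nan=True
npt.assert_allclose(result, expected, rtol=1e-6, atol=1e-12, equal_nan=True)
```

With equal_nan=False the same comparison would raise an AssertionError, which is useful when a NaN in the output should be treated as a bug.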
Special Cases: NaN and Infinity
Testing NaN values:
By default, NaN != NaN in floating-point arithmetic. Use nan_ok=True:
import math
import pytest

# ❌ WRONG - This will fail!
result_with_nan = math.nan
assert result_with_nan == pytest.approx(math.nan)  # AssertionError!

# ✅ CORRECT - Use nan_ok=True
assert result_with_nan == pytest.approx(math.nan, nan_ok=True)
Example: Testing numerical algorithm that can produce NaN
def test_safe_division_returns_nan():
    """Test that safe division returns NaN for 0/0."""
    # Arrange
    numerator = 0.0
    denominator = 0.0

    # Act
    result = safe_divide(numerator, denominator)  # Returns NaN instead of raising

    # Assert
    assert math.isnan(result)  # Explicit NaN check
    # OR use pytest.approx with nan_ok
    assert result == pytest.approx(math.nan, nan_ok=True)
Testing infinity:
Infinity works without special handling:
import math
import sys
import pytest

assert math.inf == pytest.approx(math.inf)
assert -math.inf == pytest.approx(-math.inf)

# But remember: inf != very large number!
assert sys.float_info.max != pytest.approx(math.inf)  # Different values!
From Chapter 03 (Boundary Analysis): Remember the critical distinction:
- sys.float_info.max ≈ 1.8 × 10³⁰⁸ - Largest finite float
- math.inf - Non-finite special value (larger than any finite number)
def test_overflow_to_infinity():
    """Test that overflow produces infinity, not sys.float_info.max."""
    # Arrange
    huge_number = sys.float_info.max

    # Act
    result = huge_number * 2  # Overflows to infinity

    # Assert
    assert result == pytest.approx(math.inf)  # NOT sys.float_info.max!
    assert math.isinf(result)
Nested Structures and Limitations
pytest.approx() has limited support for nested structures. You cannot directly nest pytest.approx() calls:
❌ WRONG - This doesn’t work:
nested_dict = {
    "outer": {
        "inner": 0.1 + 0.2
    }
}

# This will NOT work - pytest.approx doesn't recurse into nested dicts
assert nested_dict == pytest.approx({"outer": {"inner": 0.3}})
✅ WORKAROUND - Flatten or test separately:
# Option 1: Flatten and test
assert nested_dict["outer"]["inner"] == pytest.approx(0.3)

# Option 2: Test the nested dict separately
assert nested_dict["outer"] == pytest.approx({"inner": 0.3})

# Option 3: Use a custom helper for deep comparison
def approx_nested(expected, **kwargs):
    """Recursively wrap nested structures in pytest.approx."""
    if isinstance(expected, dict):
        return {k: approx_nested(v, **kwargs) for k, v in expected.items()}
    elif isinstance(expected, (list, tuple)):
        return type(expected)(approx_nested(e, **kwargs) for e in expected)
    else:
        return pytest.approx(expected, **kwargs)

# Use the custom helper
assert nested_dict == approx_nested({"outer": {"inner": 0.3}})
Recommendation: Keep test assertions simple. If you need deep nested comparisons, consider refactoring your data structures or testing at different levels.
Quick Reference: pytest.approx Patterns
| Data Structure | Syntax | Notes |
|---|---|---|
| Scalar | assert x == pytest.approx(expected) | Basic floating-point comparison |
| List/Tuple | assert [x, y] == pytest.approx([a, b]) | Element-wise comparison; lengths must match |
| Dictionary | assert {"k": x} == pytest.approx({"k": a}) | Keys must match; values compared element-wise |
| NumPy array | assert arr == pytest.approx(expected_arr) | Works with multi-dimensional arrays |
| NaN values | assert x == pytest.approx(nan, nan_ok=True) | Must enable nan_ok=True |
| Infinity | assert x == pytest.approx(math.inf) | Works without special handling |
| Custom tolerance | pytest.approx(x, rel=1e-9, abs=1e-12) | Adjust relative and absolute tolerances |
Warning and Deprecation Testing
Not all problems in code raise exceptions. Sometimes code issues warnings - signals that something might be wrong, but execution continues.
Common warning types:
- UserWarning - General warnings to users
- DeprecationWarning - Features being phased out
- FutureWarning - Upcoming breaking changes
- RuntimeWarning - Suspicious runtime behavior (e.g., divide by zero in NumPy)
Testing Warnings with pytest.warns()
Similar to pytest.raises(), but for warnings:
Basic syntax:
import pytest
import warnings

def test_deprecated_function_warns():
    with pytest.warns(DeprecationWarning):
        deprecated_function()
With message matching:
def test_deprecated_function_message():
    with pytest.warns(DeprecationWarning, match=r"deprecated.*use new_function instead"):
        deprecated_function()
Example: Testing your own deprecation warnings
# road_profile_viewer/filters.py
def apply_filter(data, cutoff):
    """Apply filter to data.

    .. deprecated:: 2.0
        Use apply_lowpass_filter() instead. This function will be removed in version 3.0.
    """
    warnings.warn(
        "apply_filter() is deprecated, use apply_lowpass_filter() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return apply_lowpass_filter(data, cutoff)

def test_apply_filter_deprecation_warning():
    """Test that apply_filter() issues a deprecation warning."""
    # Arrange
    data = np.array([1.0, 2.0, 3.0])
    cutoff = 1.0

    # Act & Assert
    with pytest.warns(DeprecationWarning, match=r"deprecated.*apply_lowpass_filter"):
        result = apply_filter(data, cutoff)

    # Can still assert on the result
    assert len(result) == len(data)
Important: stacklevel parameter
When issuing warnings in your own code, use stacklevel=2 to report the warning at the caller’s location, not inside your function:
# Without stacklevel - the warning points inside apply_filter()
warnings.warn("deprecated", DeprecationWarning)

# With stacklevel=2 - the warning points to where apply_filter() was called
warnings.warn("deprecated", DeprecationWarning, stacklevel=2)
Reference: pytest.warns documentation
Testing Deprecation Warnings: pytest.deprecated_call()
For specifically testing deprecation warnings, use pytest.deprecated_call():
def test_deprecated_function():
    with pytest.deprecated_call():
        deprecated_function()
This is equivalent to:
with pytest.warns((DeprecationWarning, PendingDeprecationWarning)):
    deprecated_function()
When to use each:
- pytest.warns(DeprecationWarning) - When you want to match the message
- pytest.deprecated_call() - When you just want to verify deprecation (any deprecation warning)
Accessing Warning Details
Like pytest.raises(), you can access warning details:
def test_warning_details():
    with pytest.warns(UserWarning) as warning_info:
        issue_warning()

    # Access warning details
    assert len(warning_info) == 1  # Number of warnings
    assert "specific text" in str(warning_info[0].message)
    assert warning_info[0].category is UserWarning
Example: Testing NumPy runtime warnings
def test_divide_by_zero_warning():
    """Test that dividing by zero in NumPy issues a RuntimeWarning."""
    # Arrange
    numerator = np.array([1.0, 2.0, 3.0])
    denominator = np.array([1.0, 0.0, 1.0])  # Contains zero!

    # Act & Assert
    with pytest.warns(RuntimeWarning, match="divide by zero"):
        result = numerator / denominator

    # Result contains inf at index 1
    assert math.isinf(result[1])
Configuring Warning Filters
Sometimes you want to suppress warnings during tests (e.g., third-party library warnings you can’t control).
In pytest.ini:
[pytest]
filterwarnings =
    # Turn warnings into errors (strict mode)
    error
    # Ignore all deprecation warnings
    ignore::DeprecationWarning
    # Ignore NumPy deprecations
    ignore:.*deprecated.*:DeprecationWarning:numpy.*
In individual tests:
@pytest.mark.filterwarnings("ignore::DeprecationWarning")
def test_with_suppressed_warnings():
    # This test won't fail on deprecation warnings
    deprecated_function()
Common use case: Ignoring third-party warnings while keeping your own:
[pytest]
filterwarnings =
    # Fail on warnings
    error
    # Except matplotlib deprecations
    ignore::DeprecationWarning:matplotlib.*
    # Except NumPy pending deprecations
    ignore::PendingDeprecationWarning:numpy.*
Reference: Warnings capture configuration
When to Test Warnings
Test warnings when:
- You’re deprecating your own API - Ensure warnings are issued correctly
- You’re working around library warnings - Document expected warnings in tests
- You’re testing numerical code - Verify warnings for edge cases (overflow, underflow, division by zero)
Don’t test warnings when:
- They’re from third-party libraries - Not your responsibility (filter them instead)
- They’re unrelated to test intent - Focus on primary behavior
Quick Reference: Warning Testing
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic warning | with pytest.warns(UserWarning): | Test that code issues a specific warning type |
| Warning message | with pytest.warns(UserWarning, match=r"pattern"): | Validate that the warning message matches a regex |
| Deprecation | with pytest.deprecated_call(): | Test that code issues a deprecation warning |
| Warning details | with pytest.warns(UserWarning) as w: | Access the warning object for detailed assertions |
| Suppress warnings | @pytest.mark.filterwarnings("ignore") | Ignore warnings in a specific test |
Test Control and Organization
Sometimes you need to control when tests run, or explicitly mark them as failing. pytest provides several helpers for test control.
Explicit Test Failure: pytest.fail()
Sometimes you need to fail a test explicitly with a custom message:
Syntax:
import pytest

def test_something():
    if complex_condition():
        pytest.fail("Custom failure message")
When to use pytest.fail():
- Complex conditional logic - When a simple assert isn't expressive enough
- Placeholder tests - Mark tests as TODO
- Unreachable code - Fail if code reaches an unexpected state
Example: Testing that code path is NOT taken
def test_error_handling_path_not_taken():
    """Test that the error handling path is NOT triggered for valid input."""
    # Arrange
    valid_data = np.array([1.0, 2.0, 3.0])

    # Act
    try:
        result = process_data(valid_data)
    except ValueError:
        pytest.fail("ValueError should not be raised for valid data")

    # Assert
    assert len(result) == len(valid_data)
Alternative (more Pythonic):
def test_error_handling_path_not_taken_v2():
    """Test that the error handling path is NOT triggered for valid input."""
    # Arrange
    valid_data = np.array([1.0, 2.0, 3.0])

    # Act - just call it; if an exception is raised, the test fails automatically
    result = process_data(valid_data)

    # Assert
    assert len(result) == len(valid_data)
The second version is preferred - pytest fails the test automatically if an unexpected exception occurs.
When pytest.fail() IS useful:
def test_switch_statement_coverage():
    """Test all branches of switch-like logic."""
    for case in ["option_a", "option_b", "option_c"]:
        result = handle_option(case)
        if case == "option_a":
            assert result == "handled_a"
        elif case == "option_b":
            assert result == "handled_b"
        elif case == "option_c":
            assert result == "handled_c"
        else:
            pytest.fail(f"Unexpected case: {case}")  # Should never be reached
Reference: pytest.fail documentation
Skipping Tests: pytest.skip()
Skip tests conditionally at runtime:
Syntax:
import pytest
import sys

def test_windows_only():
    if sys.platform != "win32":
        pytest.skip("Test only runs on Windows")
    # Windows-specific test code
    ...
When to use pytest.skip():
- Platform-specific tests - Skip on unsupported platforms
- Dependency-based tests - Skip if optional dependency missing
- Slow tests - Skip during rapid development
- External resource tests - Skip if resource unavailable
Example: Skip if optional dependency missing
def test_with_optional_dependency():
    try:
        import matplotlib.pyplot as plt
    except ImportError:
        pytest.skip("matplotlib not installed")
    # Test code using matplotlib
    ...
Better approach: Use decorator or importorskip
# Option 1: Decorator
@pytest.mark.skipif(sys.platform != "win32", reason="Windows only")
def test_windows_only():
    ...

# Option 2: importorskip
def test_with_matplotlib():
    plt = pytest.importorskip("matplotlib.pyplot")
    # Test code using plt
    ...
Recommendation: Prefer decorators for static conditions, pytest.skip() for dynamic runtime conditions.
Reference: Skipping tests
Expected Failures: pytest.xfail()
Mark tests as “expected to fail” - useful for known bugs or incomplete features:
Syntax:
import pytest

def test_known_bug():
    pytest.xfail("Known bug #123 - division by zero not handled")
    buggy_function()  # Never runs: pytest.xfail() ends the test immediately
When to use pytest.xfail():
- Known bugs - Document bugs with failing tests (better than deleting tests!)
- Incomplete features - Write tests for features before implementation (TDD)
- Platform-specific failures - Mark tests that fail on specific platforms
Example: Known bug documentation
def test_edge_case_known_issue():
    """Test an edge case with a known issue.

    See: https://github.com/user/repo/issues/123
    TODO: Fix in version 2.1
    """
    if sys.float_info.max * 2 == math.inf:  # Overflow handling differs by platform
        pytest.xfail("Known issue: overflow handling platform-dependent")

    result = handle_overflow(sys.float_info.max)
    assert result is not None
Decorator form:
@pytest.mark.xfail(reason="Known bug #123")
def test_known_bug():
    buggy_function()

# Conditional xfail
@pytest.mark.xfail(sys.platform == "win32", reason="Fails on Windows")
def test_unix_specific():
    ...
Difference from skip:
- skip - The test doesn't run; it's reported as "skipped"
- xfail - The test runs; it's reported as "xfail" if it fails, or "xpass" if it unexpectedly passes
Reference: Expected failures
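One knob worth knowing on top of the examples above: strict=True makes an unexpected pass ("xpass") count as a failure, which keeps known-bug markers honest once the bug is fixed. A minimal sketch with a hypothetical buggy_add():

```python
import pytest

def buggy_add(a: int, b: int) -> int:
    # Hypothetical function standing in for code with a known bug
    return a + b + 1  # off-by-one bug

@pytest.mark.xfail(reason="Known off-by-one bug #123", strict=True)
def test_buggy_add():
    assert buggy_add(2, 2) == 4  # Fails today, so the test reports as xfail
```

Without strict=True, fixing the bug silently turns the test into an xpass; with strict=True, the xpass is reported as a failure, prompting you to remove the stale marker.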
Conditional Import: pytest.importorskip()
Skip test if module cannot be imported:
Syntax:
def test_with_optional_dependency():
    np = pytest.importorskip("numpy", minversion="1.20")
    # Test code using numpy
    ...
Advantages over try/except:
- Clearer intent - Obviously skipping due to missing dependency
- Version checking - Can specify minimum version
- Better pytest output - Marked as “skipped” with reason
Example: Testing with optional dependencies
def test_advanced_plotting():
    """Test advanced plotting features (requires matplotlib)."""
    plt = pytest.importorskip("matplotlib.pyplot")
    sns = pytest.importorskip("seaborn", minversion="0.11")

    # Arrange
    data = np.array([1, 2, 3, 4, 5])

    # Act
    fig, ax = plt.subplots()
    sns.lineplot(x=range(len(data)), y=data, ax=ax)

    # Assert
    assert len(ax.lines) == 1
Reference: pytest.importorskip documentation
Decorators vs Imperative Calls
When to use decorators:
# Static conditions (known before the test runs)
@pytest.mark.skipif(sys.platform == "win32", reason="Unix only")
def test_unix_feature():
    ...

@pytest.mark.xfail(reason="Known bug")
def test_buggy_feature():
    ...
When to use imperative calls:
# Dynamic conditions (determined during test execution)
def test_conditional_skip():
    config = load_config()  # Need to run code to determine the condition
    if not config.feature_enabled:
        pytest.skip("Feature disabled in config")
    ...
Recommendation: Use decorators when possible (clearer, shows skip reason before test runs).
Quick Reference: Test Control
| Function | Syntax | Use Case |
|---|---|---|
| pytest.fail() | pytest.fail("message") | Explicitly fail a test with a custom message |
| pytest.skip() | pytest.skip("reason") | Skip a test at runtime (dynamic condition) |
| @pytest.mark.skip | @pytest.mark.skip(reason="...") | Skip a test (static condition) |
| @pytest.mark.skipif | @pytest.mark.skipif(condition, reason="...") | Skip a test conditionally |
| pytest.xfail() | pytest.xfail("reason") | Mark a test as expected to fail (runtime) |
| @pytest.mark.xfail | @pytest.mark.xfail(reason="...") | Mark a test as expected to fail (static) |
| pytest.importorskip() | pytest.importorskip("module") | Skip a test if a module is unavailable |
Assertion Introspection Mastery
One of pytest’s most powerful features is assertion introspection - the ability to show detailed information about why assertions fail.
How Assertion Rewriting Works
When pytest imports your test modules, it rewrites their assert statements before execution. This allows pytest to:
- Capture intermediate values - Shows subexpressions in failed assertions
- Provide context - Shows surrounding lines of code
- Format output - Pretty-prints complex data structures
Example of introspection output:
def test_list_comparison():
    result = [1, 2, 3, 4]
    expected = [1, 2, 5, 4]
    assert result == expected
Pytest output:
    def test_list_comparison():
        result = [1, 2, 3, 4]
        expected = [1, 2, 5, 4]
>       assert result == expected
E       AssertionError: assert [1, 2, 3, 4] == [1, 2, 5, 4]
E         At index 2 diff: 3 != 5
E         Use -v to get the full diff
Notice how pytest automatically:
- Shows the values of result and expected
- Identifies which index differs
- Suggests using -v for more details
What Introspection Shows for Different Types
Strings - Context diff:
def test_long_string():
    result = "The quick brown fox jumps over the lazy dog"
    expected = "The quick brown cat jumps over the lazy dog"
    assert result == expected
Output:
E       AssertionError: assert 'The quick br...he lazy dog' == 'The quick br...he lazy dog'
E         - The quick brown cat jumps over the lazy dog
E         ?                 ^^^
E         + The quick brown fox jumps over the lazy dog
E         ?                 ^^^
Lists - First differing element:
def test_list_diff():
    result = [1, 2, 3, 4, 5]
    expected = [1, 2, 3, 99, 5]
    assert result == expected
Output:
E       AssertionError: assert [1, 2, 3, 4, 5] == [1, 2, 3, 99, 5]
E         At index 3 diff: 4 != 99
Dictionaries - Differing entries:
def test_dict_diff():
    result = {"a": 1, "b": 2, "c": 3}
    expected = {"a": 1, "b": 99, "c": 3}
    assert result == expected
Output:
E AssertionError: assert {'a': 1, 'b': 2, 'c': 3} == {'a': 1, 'b': 99, 'c': 3}
E Differing items:
E {'b': 2} != {'b': 99}
Sets - Extra/missing items:
def test_set_diff():
result = {1, 2, 3, 4}
expected = {1, 2, 3, 5}
assert result == expected
Output:
E AssertionError: assert {1, 2, 3, 4} == {1, 2, 3, 5}
E Extra items in the left set:
E {4}
E Extra items in the right set:
E {5}
Custom Assertion Messages
You can add custom messages to assertions:
Syntax:
assert condition, "Custom failure message"
Example:
def test_with_custom_message():
result = compute_value()
expected = 42
assert result == expected, f"Expected {expected}, but got {result}"
Important: Modern pytest (7.0+) preserves introspection even with custom messages!
Old pytest (before 7.0):
- Custom message disabled introspection
- You had to choose: introspection OR custom message
Modern pytest (7.0+):
- Custom message appends to introspection
- You get both introspection AND custom message!
Example output (pytest 7.0+):
> assert result == expected, f"Expected {expected}, but got {result}"
E AssertionError: Expected 42, but got 41
E assert 41 == 42
When to add custom messages:
- Complex conditions - Explain WHY the assertion matters
- Domain-specific checks - Add context about what’s being tested
- Debugging hints - Suggest fixes or related tests
When NOT to add custom messages:
- Simple comparisons - Introspection already clear
- Redundant information - Just repeating what introspection shows
Example: Good use of custom message
def test_filter_cutoff_frequency():
"""Test that filter cutoff is within valid range."""
# Arrange
sample_rate = 100.0 # Hz
cutoff_freq = 60.0 # Hz
# Act
filter_config = create_filter(sample_rate, cutoff_freq)
# Assert with helpful message
assert filter_config.cutoff_freq < sample_rate / 2, \
f"Cutoff frequency ({cutoff_freq} Hz) must be less than Nyquist frequency ({sample_rate/2} Hz)"
Custom Assertion Explanations (Advanced)
For very complex custom types, you can define custom assertion explanations via the pytest_assertrepr_compare hook.
Example: Custom comparison for NumPy arrays
# conftest.py
import numpy as np
def pytest_assertrepr_compare(op, left, right):
"""Custom assertion representation for NumPy arrays."""
if isinstance(left, np.ndarray) and isinstance(right, np.ndarray) and op == "==":
return [
"NumPy array comparison:",
f" Shape: {left.shape} vs {right.shape}",
f" Dtype: {left.dtype} vs {right.dtype}",
f" Max difference: {np.max(np.abs(left - right))}",
f" Mean difference: {np.mean(np.abs(left - right))}",
]
Result:
E AssertionError: NumPy array comparison:
E Shape: (100,) vs (100,)
E Dtype: float64 vs float64
E Max difference: 0.05
E Mean difference: 0.01
When to use:
- Custom types with complex comparison logic
- When default introspection isn’t helpful
- Domain-specific types (e.g., road profile data structures)
Note: This is advanced usage. For most tests, default introspection is sufficient.
Reference: Custom assertion explanations
Debugging Assertion Rewriting
Sometimes assertion rewriting doesn’t work. Common causes:
1. Module imported too early:
Assertion rewriting happens at import time. If a module is imported before pytest (or your conftest.py) registers it for rewriting, its assertions stay plain:
# ❌ WRONG - module already imported, so registration has no effect
import my_module
pytest.register_assert_rewrite("my_module")  # Too late!
# ✅ CORRECT - register before the first import
pytest.register_assert_rewrite("my_module")
import my_module
2. Assertions in imported modules:
Pytest only rewrites assertions in test files (files matching test_*.py or *_test.py).
For assertions in non-test files, use pytest.register_assert_rewrite():
# conftest.py
import pytest

pytest.register_assert_rewrite("my_package.helpers")
3. Bytecode caching issues:
If assertion introspection stops working after code changes, clear pytest cache:
pytest --cache-clear
Reference: Assertion rewriting
Quick Reference: Assertion Introspection
| Type | Introspection Shows | Example Output |
|---|---|---|
| Numbers | Values and comparison | assert 5 == 10 → 5 != 10 |
| Strings | Context diff | Character-by-character diff with markers |
| Lists | First differing index | At index 3 diff: 4 != 99 |
| Dicts | Differing items | Differing items: {'b': 2} != {'b': 99} |
| Sets | Extra/missing items | Extra items in left: {4} |
| Custom types | Default repr() | Use pytest_assertrepr_compare for custom |
Parametrization Patterns
One of the most powerful pytest features is parametrization - running the same test with different inputs.
Basic Parametrization: @pytest.mark.parametrize
Syntax:
import pytest
@pytest.mark.parametrize("input,expected", [
(1, 2),
(2, 4),
(3, 6),
])
def test_double(input, expected):
assert double(input) == expected
This creates three separate tests, one for each parameter set:
test_module.py::test_double[1-2] PASSED
test_module.py::test_double[2-4] PASSED
test_module.py::test_double[3-6] PASSED
Example: Testing boundary values
From Chapter 03 (Boundary Analysis), you learned to test boundary values. Parametrization makes this cleaner:
@pytest.mark.parametrize("x,expected", [
(0.0, 0.0), # Zero
(1.0, 1.0), # Normal value
(sys.float_info.max, sys.float_info.max), # Largest finite
(math.inf, None), # Infinity (returns None)
(-math.inf, None), # Negative infinity
])
def test_safe_sqrt_boundaries(x, expected):
"""Test safe_sqrt at floating-point boundaries."""
result = safe_sqrt(x)
if expected is None:
assert result is None
else:
assert result == pytest.approx(expected)
Advantages:
- Reduces duplication - Same test logic, different data
- Clear test names - Each parameter set creates separate test
- Granular failures - Know exactly which input failed
- Easy to add cases - Just add to list
Multiple Parameters
You can parametrize multiple arguments:
@pytest.mark.parametrize("x,y,expected", [
(1, 2, 3),
(0, 0, 0),
(-1, 1, 0),
(sys.float_info.max, 1, math.inf), # Overflow
])
def test_add(x, y, expected):
result = x + y
assert result == pytest.approx(expected)
Parametrizing Fixtures: indirect=True
Sometimes you want to parametrize fixtures instead of test parameters:
Example: Testing with different data files
# conftest.py
import numpy as np
import pytest

@pytest.fixture
def profile_data(request):
    """Load profile data from file."""
    filename = request.param  # Get parameter value
    return np.loadtxt(f"test_data/{filename}")
# test_processing.py
@pytest.mark.parametrize("profile_data", [
"smooth_road.txt",
"rough_road.txt",
"highway.txt",
], indirect=True) # Pass parameter to fixture!
def test_process_profile(profile_data):
result = process_profile(profile_data)
assert len(result) == len(profile_data)
How it works:
- indirect=True tells pytest to pass the parameter to the fixture, not the test
- Fixture receives parameter via request.param
- Fixture returns processed value to test
When to use:
- Expensive setup (load data, create database, etc.)
- Complex test data generation
- Shared setup across multiple tests
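To make the expensive-setup case concrete, here is a minimal sketch; the dataset names and the load_dataset helper are hypothetical, with a cache standing in for a genuinely slow load:

```python
import pytest

_CACHE = {}

def load_dataset(name):
    """Hypothetical slow loader; the cache ensures each dataset loads only once."""
    if name not in _CACHE:
        _CACHE[name] = list(range(10 if name == "small" else 1000))
    return _CACHE[name]

@pytest.fixture
def dataset(request):
    """Fixture receives the parametrized dataset name via request.param."""
    return load_dataset(request.param)

@pytest.mark.parametrize("dataset", ["small", "large"], indirect=True)
def test_dataset_nonempty(dataset):
    assert len(dataset) > 0
```

Each parametrized test triggers the fixture with a different name, while repeated requests for the same dataset hit the cache instead of reloading.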
Combining Fixtures and Parametrize
You can mix regular fixtures with parametrized values:
@pytest.fixture
def sample_rate():
"""Sample rate fixture (not parametrized)."""
return 100.0 # Hz
@pytest.mark.parametrize("cutoff_freq,expected_response", [
(10.0, "lowpass"),
(50.0, "highpass"),
])
def test_filter_response(sample_rate, cutoff_freq, expected_response):
"""Test filter response at different cutoff frequencies."""
# sample_rate from fixture, cutoff_freq from parametrize
filter = create_filter(sample_rate, cutoff_freq)
assert filter.response_type == expected_response
Using ids for Readable Test Names
By default, pytest generates test names from parameter values. For complex values, this can be ugly:
# Default test names (hard to read):
# test_something[<object object at 0x...>-expected0]
# test_something[<object object at 0x...>-expected1]
# Better: Use ids parameter
@pytest.mark.parametrize("data,expected", [
(smooth_road_data, "smooth"),
(rough_road_data, "rough"),
], ids=["smooth_road", "rough_road"])
def test_classification(data, expected):
...
Result:
test_module.py::test_classification[smooth_road] PASSED
test_module.py::test_classification[rough_road] PASSED
Using functions for ids:
def idfn(val):
"""Generate test ID from parameter value."""
if isinstance(val, np.ndarray):
return f"array_len_{len(val)}"
return str(val)
@pytest.mark.parametrize("data", [
np.array([1, 2, 3]),
np.array([1, 2, 3, 4, 5]),
], ids=idfn)
def test_with_arrays(data):
...
Result:
test_module.py::test_with_arrays[array_len_3] PASSED
test_module.py::test_with_arrays[array_len_5] PASSED
Parametrize Anti-Patterns
❌ Anti-pattern 1: Over-parametrization
Don’t parametrize unrelated behaviors:
# ❌ WRONG - Testing different behaviors, not different inputs
@pytest.mark.parametrize("operation,x,y,expected", [
("add", 1, 2, 3),
("subtract", 5, 3, 2),
("multiply", 2, 3, 6),
])
def test_calculator(operation, x, y, expected):
if operation == "add":
result = x + y
elif operation == "subtract":
result = x - y
elif operation == "multiply":
result = x * y
assert result == expected
✅ CORRECT - Separate tests for different behaviors:
@pytest.mark.parametrize("x,y,expected", [(1, 2, 3), (0, 0, 0), (-1, 1, 0)])
def test_add(x, y, expected):
assert add(x, y) == expected
@pytest.mark.parametrize("x,y,expected", [(5, 3, 2), (0, 0, 0), (-1, -1, 0)])
def test_subtract(x, y, expected):
assert subtract(x, y) == expected
@pytest.mark.parametrize("x,y,expected", [(2, 3, 6), (0, 5, 0), (-1, -1, 1)])
def test_multiply(x, y, expected):
assert multiply(x, y) == expected
Why: Different behaviors should be separate tests. Parametrize same behavior with different inputs.
❌ Anti-pattern 2: Hidden test logic in parameters
# ❌ WRONG - Test logic in parameter list
@pytest.mark.parametrize("x,expected", [
(0, 0),
(1, 1),
(2, 4),
(3, 9),
(4, 16), # Pattern: expected = x**2, but not obvious!
])
def test_square(x, expected):
assert square(x) == expected
✅ CORRECT - Make pattern explicit:
@pytest.mark.parametrize("x", [0, 1, 2, 3, 4])
def test_square(x):
expected = x ** 2 # Pattern clear in test body
assert square(x) == expected
Or use a helper function:
def square_test_cases():
"""Generate test cases for square function."""
return [(x, x**2) for x in range(10)]
@pytest.mark.parametrize("x,expected", square_test_cases())
def test_square(x, expected):
assert square(x) == expected
❌ Anti-pattern 3: Too many parameters
# ❌ WRONG - Too many parameters, hard to read
@pytest.mark.parametrize(
"sample_rate,cutoff_freq,window_size,overlap,detrend,filter_type,expected_length",
[
(100, 10, 256, 128, True, "lowpass", 100),
(200, 20, 512, 256, False, "highpass", 200),
# ... 20 more cases ...
]
)
def test_complex_filter(...): # 7 parameters!
...
✅ CORRECT - Use dataclasses or dicts:
from dataclasses import dataclass
@dataclass
class FilterConfig:
sample_rate: float
cutoff_freq: float
window_size: int
overlap: int
detrend: bool
filter_type: str
expected_length: int
@pytest.mark.parametrize("config", [
FilterConfig(100, 10, 256, 128, True, "lowpass", 100),
FilterConfig(200, 20, 512, 256, False, "highpass", 200),
], ids=["config_1", "config_2"])
def test_complex_filter(config):
result = apply_filter(
sample_rate=config.sample_rate,
cutoff_freq=config.cutoff_freq,
# ...
)
assert len(result) == config.expected_length
Quick Reference: Parametrization
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic parametrize | @pytest.mark.parametrize("x,y", [(1,2), (3,4)]) | Run test with different inputs |
| Single parameter | @pytest.mark.parametrize("x", [1, 2, 3]) | Vary single input |
| Fixture parametrize | @pytest.mark.parametrize("fix", [...], indirect=True) | Pass parameters to fixture |
| Custom test IDs | @pytest.mark.parametrize(..., ids=[...]) | Readable test names |
| ID function | @pytest.mark.parametrize(..., ids=func) | Generate IDs from parameter values |
Reference: Parametrization documentation
Common Anti-Patterns
LLM-generated tests (from ChatGPT, Claude, etc.) often contain anti-patterns. Here are the most common mistakes to watch for.
Anti-Pattern 1: Tests That Return Values
❌ WRONG:
def test_addition():
result = 1 + 1
return result == 2 # ❌ Tests must NOT return!
Why it’s wrong: pytest ignores return values. This test will ALWAYS pass, even if result != 2.
✅ CORRECT:
def test_addition():
result = 1 + 1
assert result == 2 # ✅ Use assert!
Detection: Search for return statements in test functions (almost always wrong).
Anti-Pattern 2: Looping Over Test Cases
❌ WRONG:
def test_multiple_cases():
test_cases = [(1, 2), (2, 4), (3, 6)]
for input, expected in test_cases:
assert double(input) == expected # ❌ Stops at first failure!
Why it’s wrong:
- Stops at first failure - You don’t see all failing cases
- No granular reporting - Don’t know which case failed from test name
- Harder to debug - Need to add print statements to see which iteration failed
✅ CORRECT:
@pytest.mark.parametrize("input,expected", [
(1, 2),
(2, 4),
(3, 6),
])
def test_double(input, expected):
assert double(input) == expected # ✅ Use parametrize!
Benefits:
- All failures reported - Runs all cases even if some fail
- Granular test names - test_double[1-2], test_double[2-4], etc.
- Easy to debug - Know exactly which case failed
Anti-Pattern 3: Assertions with Side Effects
❌ WRONG:
def test_data_processing():
data = []
assert data.append(1) is None # ❌ Modifies data!
assert len(data) == 1
Why it’s wrong: Assertions should NOT modify state. This makes tests:
- Hard to understand - Assertion isn’t just checking, it’s doing!
- Fragile - Depends on assertion execution order
- Confusing when skipped - If assertion doesn’t run, state is different
✅ CORRECT:
def test_data_processing():
data = []
data.append(1) # ✅ Separate action from assertion
assert len(data) == 1
Anti-Pattern 4: Over-Mocking
LLM-generated tests often abuse mocking. From Chapter 03 (Boundary Analysis):
❌ WRONG:
def test_data_processing(mocker):
# ❌ Mocking internal implementation details
mocker.patch("module.internal_helper")
mocker.patch("module.another_helper")
mocker.patch("module.yet_another_helper")
result = process_data([1, 2, 3])
# ❌ Testing method calls, not results
module.internal_helper.assert_called_once()
module.another_helper.assert_called_with(ANY)
Why it’s wrong:
- Tests implementation, not behavior - Breaks when you refactor
- Doesn’t test actual logic - Mocks bypass real code
- False confidence - Tests pass even if code is broken
✅ CORRECT:
def test_data_processing():
# ✅ No mocks - test the actual behavior
result = process_data([1, 2, 3])
# ✅ Assert on result (state), not method calls (interaction)
assert len(result) == 3
assert all(isinstance(x, int) for x in result)
assert result == [2, 4, 6] # Actual expected result
When to mock:
- External services - APIs, databases, file systems
- Slow operations - Network calls, large computations
- Non-deterministic behavior - Random, time-dependent
When NOT to mock:
- Your own functions - Test them directly
- Simple helpers - Faster to run than mock
- Business logic - The core of what you’re testing
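To illustrate where the boundary sits, here is a hedged sketch: fetch_reading is a hypothetical stand-in for a network call, and the test swaps in a fake via simple injection (the same idea mocker.patch expresses), while the conversion logic runs for real:

```python
import pytest

def fetch_reading():
    """Hypothetical external call (network) - the only thing worth faking."""
    raise RuntimeError("no network access in tests")

def average_temperature_celsius(fetch=fetch_reading):
    """Our own logic - always tested for real; only the external fetch is swappable."""
    readings_kelvin = fetch()
    return sum(readings_kelvin) / len(readings_kelvin) - 273.15

def test_average_temperature():
    # Fake ONLY the external boundary; the conversion logic executes for real
    fake_fetch = lambda: [300.15, 302.15]
    assert average_temperature_celsius(fetch=fake_fetch) == pytest.approx(28.0)
```

If the averaging code had a bug, this test would still catch it, because nothing between the faked boundary and the assertion is mocked.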
Anti-Pattern 5: Non-Deterministic Tests
❌ WRONG:
import random
def test_random_sampling():
data = random.sample(range(100), 10) # ❌ Different every run!
result = process_data(data)
assert len(result) == 10 # Might pass or fail randomly
Why it’s wrong:
- Flaky tests - Pass sometimes, fail sometimes (destroys trust in tests)
- Hard to debug - Can’t reproduce failures
- Wastes time - Developers re-run tests hoping for pass
✅ CORRECT - Option 1: Seed random generator
import random
def test_random_sampling():
random.seed(42) # ✅ Consistent results
data = random.sample(range(100), 10)
result = process_data(data)
assert len(result) == 10
✅ CORRECT - Option 2: Use fixture with fixed data
@pytest.fixture
def sample_data():
"""Fixed test data (deterministic)."""
return [1, 5, 10, 15, 20, 25, 30, 35, 40, 45]
def test_sampling(sample_data):
result = process_data(sample_data)
assert len(result) == 10
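Since the examples in this appendix use NumPy, the same seeding idea carries over to NumPy's Generator API; make_noisy_profile is a hypothetical helper:

```python
import numpy as np

def make_noisy_profile(n, seed=42):
    """Deterministic 'random' road profile via a seeded NumPy generator."""
    rng = np.random.default_rng(seed)  # same seed -> same sequence every run
    return rng.normal(loc=0.0, scale=0.01, size=n)

def test_noisy_profile_is_reproducible():
    a = make_noisy_profile(100)
    b = make_noisy_profile(100)
    assert np.array_equal(a, b)  # fully deterministic across runs
```

Prefer passing an explicit seed (or Generator) into functions over seeding the global random module, so each test controls its own randomness.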
Other non-deterministic sources:
# ❌ WRONG - Time-dependent
import time
def test_timestamp():
timestamp = time.time() # Different every run!
...
# ✅ CORRECT - Mock or fix time
def test_timestamp(mocker):
mocker.patch("time.time", return_value=1234567890)
timestamp = time.time()
assert timestamp == 1234567890
# ❌ WRONG - Order-dependent (dict/set iteration)
def test_keys():
data = {"a": 1, "b": 2, "c": 3}
keys = list(data.keys())
assert keys[0] == "a" # Order not guaranteed in Python < 3.7!
# ✅ CORRECT - Don't depend on order
def test_keys():
data = {"a": 1, "b": 2, "c": 3}
assert "a" in data.keys()
assert set(data.keys()) == {"a", "b", "c"}
Anti-Pattern 6: Test Pollution
❌ WRONG:
# Shared global state between tests
global_cache = []
def test_add_to_cache():
global_cache.append(1)
assert len(global_cache) == 1 # ✅ Passes first time
def test_cache_contains_items():
assert len(global_cache) > 0 # ❌ Depends on test order!
Why it’s wrong:
- Order-dependent - Tests pass/fail depending on execution order
- Breaks isolation - Tests affect each other
- Hard to debug - Failures only appear when tests run in specific order
✅ CORRECT:
@pytest.fixture
def cache():
"""Fresh cache for each test."""
return []
def test_add_to_cache(cache):
cache.append(1)
assert len(cache) == 1 # ✅ Isolated
def test_cache_contains_items(cache):
cache.append(1)
cache.append(2)
assert len(cache) == 2 # ✅ Isolated
Fixture scope:
# Function scope (default) - New instance per test
@pytest.fixture(scope="function")
def fresh_cache():
return []
# Module scope - Shared across tests in same module
@pytest.fixture(scope="module")
def shared_resource():
# Use for expensive setup (database, etc.)
return expensive_resource()
# Session scope - Shared across entire test session
@pytest.fixture(scope="session")
def global_config():
return load_config()
Anti-Pattern 7: Overly Strict Assertions
❌ WRONG:
def test_error_message():
with pytest.raises(ValueError) as exc_info:
validate_input(-1)
# ❌ Too strict - breaks if message wording changes slightly
assert str(exc_info.value) == "Input must be positive integer greater than zero"
Why it’s wrong: Minor message changes break tests (e.g., “positive integer” → “positive int”).
✅ CORRECT:
def test_error_message():
with pytest.raises(ValueError, match=r"must be positive"): # ✅ Flexible regex
validate_input(-1)
Balance strictness:
- Too loose: with pytest.raises(Exception) - Catches ANY exception
- Too strict: Exact string match - Breaks on minor changes
- Just right: Regex match for key phrases
Anti-Pattern 8: Testing Built-in Functionality
❌ WRONG:
def test_list_append():
"""Test that list.append works."""
lst = [1, 2]
lst.append(3)
assert lst == [1, 2, 3] # ❌ Testing Python, not your code!
Why it’s wrong: You’re testing Python’s list implementation, not your code.
✅ CORRECT - Test YOUR code:
def test_data_processor_uses_append_correctly():
"""Test that DataProcessor adds items correctly."""
processor = DataProcessor()
processor.add_value(1)
processor.add_value(2)
# ✅ Testing YOUR code's behavior
assert processor.get_values() == [1, 2]
assert processor.count() == 2
Quick Reference: Anti-Patterns
| Anti-Pattern | Why Wrong | Fix |
|---|---|---|
| Return values | pytest ignores returns | Use assert |
| Looping test cases | Stops at first failure | Use @pytest.mark.parametrize |
| Assertions with side effects | Modifies state during check | Separate action from assertion |
| Over-mocking | Tests implementation, not behavior | Test state, mock only external deps |
| Non-deterministic tests | Flaky, hard to debug | Seed random, fix time, use deterministic data |
| Test pollution | Tests affect each other | Use fixtures for isolation |
| Overly strict | Breaks on minor changes | Use regex, check key properties only |
| Testing built-ins | Not testing your code | Test your code's behavior |
Quick Reference Tables
All pytest Assertion Helpers
| Helper | Purpose | Example |
|---|---|---|
| pytest.raises() | Test exceptions | with pytest.raises(ValueError, match=r"pattern"): |
| pytest.warns() | Test warnings | with pytest.warns(UserWarning, match=r"pattern"): |
| pytest.deprecated_call() | Test deprecations | with pytest.deprecated_call(): |
| pytest.approx() | Floating-point comparison | assert x == pytest.approx(expected, rel=1e-6) |
| pytest.fail() | Explicit failure | pytest.fail("Custom message") |
| pytest.skip() | Skip test | pytest.skip("Reason") |
| pytest.xfail() | Expected failure | pytest.xfail("Known bug") |
| pytest.importorskip() | Skip if import fails | plt = pytest.importorskip("matplotlib.pyplot") |
| @pytest.mark.parametrize | Run with multiple inputs | @pytest.mark.parametrize("x,y", [(1,2), (3,4)]) |
pytest.approx Default Tolerances
| Parameter | Default Value | Meaning |
|---|---|---|
| rel | 1e-6 | Relative tolerance (0.0001%) |
| abs | 1e-12 | Absolute tolerance |
| nan_ok | False | Allow NaN comparisons |
Tolerance formula: \(|\text{actual} - \text{expected}| \leq \max(\text{rel} \times |\text{expected}|, \text{abs})\)
Common Pytest Markers
| Marker | Purpose | Example |
|---|---|---|
| @pytest.mark.skip | Skip test | @pytest.mark.skip(reason="Not implemented") |
| @pytest.mark.skipif | Conditional skip | @pytest.mark.skipif(sys.platform == "win32", reason="Unix only") |
| @pytest.mark.xfail | Expected failure | @pytest.mark.xfail(reason="Known bug #123") |
| @pytest.mark.parametrize | Parametrize test | @pytest.mark.parametrize("x", [1, 2, 3]) |
| @pytest.mark.filterwarnings | Filter warnings | @pytest.mark.filterwarnings("ignore::DeprecationWarning") |
When to Use Each Pattern
| Scenario | Use This | Not This |
|---|---|---|
| Testing exceptions | pytest.raises(ValueError, match=r"...") | try/except blocks |
| Floating-point comparison | pytest.approx() | abs(x - y) < epsilon |
| Multiple test cases | @pytest.mark.parametrize | for loops in tests |
| Testing your code | Assert on results (state) | Mock internal functions (interaction) |
| Test isolation | Fixtures | Global variables |
| Platform-specific tests | @pytest.mark.skipif | if sys.platform ... in test |
| Expected failures | @pytest.mark.xfail | Commenting out tests |
Conclusion
You now have a comprehensive reference for pytest assertion patterns and best practices. Key takeaways:
Most Important Patterns
- Use match parameter - Validate exception messages: pytest.raises(ValueError, match=r"pattern")
- Use pytest.approx() - For all floating-point comparisons, not just scalars
- Parametrize tests - Use @pytest.mark.parametrize, not loops
- Test state, not interactions - Assert on results, not method calls
- Keep tests deterministic - Seed random, fix time, avoid shared state
Red Flags in LLM-Generated Tests
Watch for these anti-patterns:
- return statements in tests
- for loops over test cases
- Excessive mocking (especially of your own code)
- No match parameter in pytest.raises()
- Random/time-dependent data without seeding
When in Doubt
- Check pytest docs - https://docs.pytest.org/
- Look for existing patterns - Search your codebase for similar tests
- Ask: “Am I testing behavior or implementation?” - Test behavior
- Ask: “Is this test deterministic?” - Make it deterministic
- Ask: “Could this be parametrized?” - Probably yes
Next Steps
- Apply these patterns to your Road Profile Viewer tests
- Review existing tests for anti-patterns
- Refactor LLM-generated tests using these best practices
- Move to TDD (Chapter 03 (TDD and CI)) with solid assertion skills
Happy testing!
References
Official Documentation
- pytest Documentation: https://docs.pytest.org/
- pytest API Reference: https://docs.pytest.org/en/stable/reference/reference.html
- Assertion Introspection: https://docs.pytest.org/en/stable/how-to/assert.html
- Parametrization: https://docs.pytest.org/en/stable/how-to/parametrize.html
- Warning Capture: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
Python Standards
- IEEE 754 Floating-Point: https://en.wikipedia.org/wiki/IEEE_754
- PEP 654 - Exception Groups: https://peps.python.org/pep-0654/
- What Every Computer Scientist Should Know About Floating-Point: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
Best Practices and Articles
- Software Engineering at Google (O’Reilly) - Testing chapter
- Real Python: Effective Python Testing With Pytest: https://realpython.com/pytest-python-testing/
- NerdWallet Engineering: 5 Pytest Best Practices: https://www.nerdwallet.com/blog/engineering/5-pytest-best-practices/
Course Materials
- Chapter 03 (Testing Basics): Testing fundamentals, AAA pattern
- Chapter 03 (Boundary Analysis): Boundary value analysis, state testing, IEEE 754
- Chapter 03 (TDD and CI): Test-Driven Development (TDD) and CI/CD