Appendix 3: Pytest Assertion Reference - Beyond the Basics
November 2025 (12357 Words, 69 Minutes)
Introduction: Your Testing Toolkit
You’ve been writing tests for weeks now. You know the basics: assert, pytest.approx(), pytest.raises(). But as you write more complex tests, you start noticing patterns. Your tests are getting messy. LLM-generated tests don’t quite look right. You’re not sure how to test warnings, or how to make exception tests more specific.
This appendix is your reference guide. Think of it as the pytest documentation, but focused on the patterns you’ll actually use in practice.
What This Appendix Covers
This is NOT a tutorial. This is a practical reference for assertion patterns, organized by use case:
- Advanced Exception Testing - Beyond basic pytest.raises()
- pytest.approx Deep Dive - Floating-point comparisons for complex data structures
- Warning Testing - Testing deprecation warnings and user warnings
- Test Control - Skipping, failing, and marking tests
- Assertion Introspection - Understanding pytest’s assertion rewriting magic
- Parametrization Patterns - Testing multiple scenarios efficiently
- Common Anti-Patterns - What NOT to do (especially common in LLM-generated tests)
- Quick Reference - Cheat sheets and lookup tables
What You Already Know
From Chapter 03 (Testing Fundamentals), you already understand:
- Boundary value analysis - Testing at boundaries like sys.float_info.max, math.inf, math.ulp()
- Basic pytest.approx() - Comparing floating-point numbers with tolerances
- Basic pytest.raises() - Testing that code raises exceptions
- State testing - Testing results, not method calls
- AAA pattern - Arrange, Act, Assert
This appendix builds on that foundation with advanced patterns and best practices.
How to Use This Appendix
While writing tests:
- Jump to the relevant chapter for quick syntax lookup
- Check anti-patterns section if something feels wrong
- Use quick reference tables for at-a-glance help
When reviewing LLM-generated tests:
- Check Chapter 7 for common anti-patterns
- Verify exception tests use the match parameter (Chapter 1)
- Ensure parametrization is used appropriately (Chapter 6)
When debugging failing tests:
- Chapter 5 explains assertion introspection output
- Chapter 2 covers pytest.approx tolerance issues
- Chapter 4 covers test control for flaky tests
Key References
Throughout this appendix, we’ll reference:
- Official pytest documentation: https://docs.pytest.org/
- pytest API reference: https://docs.pytest.org/en/stable/reference/reference.html
- Software Engineering at Google (O’Reilly) - Testing chapter
- IEEE 754 Standard - Floating-point arithmetic
- NerdWallet Engineering Blog - pytest best practices
Let’s dive in.
Advanced Exception Testing
You already know the basics of pytest.raises():
def test_division_by_zero():
    with pytest.raises(ZeroDivisionError):
        1 / 0
But what if you want to verify the exception message? Or access the exception object for detailed assertions? Or test code that raises multiple exceptions?
The match Parameter: Validating Exception Messages
The most common improvement to pytest.raises() is the match parameter, which validates the exception message using a regular expression.
Basic syntax:
with pytest.raises(ValueError, match=r"must be positive"):
    validate_input(-5)
Why use match?
- Specificity - Ensures you’re raising the RIGHT exception for the RIGHT reason
- Regression prevention - Catches changes to error messages
- Documentation - Makes test intent clearer
Example: Testing input validation
# road_profile_viewer/filters.py
from numpy.typing import NDArray
import numpy as np

def apply_lowpass_filter(data: NDArray[np.float64], cutoff_freq: float) -> NDArray[np.float64]:
    """Apply lowpass filter to road profile data.

    Args:
        data: Input road profile data
        cutoff_freq: Cutoff frequency in Hz (must be positive)

    Raises:
        ValueError: If cutoff_freq is not positive
    """
    if cutoff_freq <= 0:
        raise ValueError(f"Cutoff frequency must be positive, got {cutoff_freq}")
    # ... filter implementation
❌ Bad test (no message validation):
def test_lowpass_filter_negative_cutoff():
    data = np.array([1.0, 2.0, 3.0])
    with pytest.raises(ValueError):  # Too vague!
        apply_lowpass_filter(data, cutoff_freq=-10.0)
This test would pass even if the code raised ValueError("Invalid data") instead of the cutoff frequency error.
✅ Good test (validates message):
def test_lowpass_filter_negative_cutoff():
    # Arrange
    data = np.array([1.0, 2.0, 3.0])
    invalid_cutoff = -10.0

    # Act & Assert
    with pytest.raises(ValueError, match=r"Cutoff frequency must be positive"):
        apply_lowpass_filter(data, cutoff_freq=invalid_cutoff)
Pro Tips for match parameter:
- Use raw strings - match=r"pattern" avoids escaping backslashes
- match uses re.search() - The pattern can appear anywhere in the message (not a full-string match)
- Escape special regex characters - Use re.escape() for literal strings:
# If the error message contains parentheses, brackets, or special chars
import re

expected_msg = "Invalid range: [0, 100]"
with pytest.raises(ValueError, match=re.escape(expected_msg)):
    validate_range(200)
- Test dynamic parts with regex groups - For messages with dynamic values:
# Message: "Cutoff frequency must be positive, got -10.0"
with pytest.raises(ValueError, match=r"must be positive, got -?\d+\.?\d*"):
    apply_lowpass_filter(data, cutoff_freq=-10.0)
Reference: pytest.raises() documentation
Accessing Exception Details: ExceptionInfo
Sometimes you need to inspect the exception object itself - not just its message. Use pytest.raises() as a context manager and access the exception via .value:
Syntax:
with pytest.raises(ValueError) as exc_info:
    risky_operation()

# Access exception details
assert exc_info.type is ValueError
assert exc_info.value.args[0] == "Expected message"
assert "partial message" in str(exc_info.value)
Example: Testing exception attributes
# road_profile_viewer/exceptions.py
class ProfileDataError(Exception):
    """Custom exception for profile data errors."""

    def __init__(self, message: str, data_length: int, expected_length: int):
        super().__init__(message)
        self.data_length = data_length
        self.expected_length = expected_length

def test_profile_data_error_attributes():
    # Arrange
    invalid_data = np.array([1.0, 2.0])  # Too short
    expected_length = 100

    # Act & Assert
    with pytest.raises(ProfileDataError) as exc_info:
        load_profile_data(invalid_data, expected_length=expected_length)

    # Access custom attributes
    assert exc_info.value.data_length == 2
    assert exc_info.value.expected_length == 100
    assert "length" in str(exc_info.value).lower()
ExceptionInfo attributes:
- exc_info.type - Exception class (e.g., ValueError)
- exc_info.value - Exception instance (the actual exception object)
- exc_info.traceback - Traceback object (for advanced debugging)
When to use ExceptionInfo:
- Testing custom exception attributes
- Validating exception causes (via __cause__ or __context__)
- Debugging complex exception chains
- Testing multiple conditions about the exception
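For example, validating an exception's cause looks like this. The convert() helper and its message are hypothetical; the pattern is raising with "raise ... from ..." and asserting on exc_info.value.__cause__:

```python
import pytest

def convert(text: str) -> float:
    """Hypothetical helper that wraps a low-level error in a domain error."""
    try:
        return float(text)
    except ValueError as err:
        raise RuntimeError(f"Could not convert {text!r}") from err

def test_conversion_error_preserves_cause():
    with pytest.raises(RuntimeError, match=r"Could not convert") as exc_info:
        convert("not-a-number")
    # __cause__ holds the original exception set by 'raise ... from ...'
    assert isinstance(exc_info.value.__cause__, ValueError)
```

This catches regressions where a refactor accidentally drops the "from err" clause and loses the original traceback.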
Reference: ExceptionInfo API
Testing Exception Groups (Python 3.11+)
Python 3.11 introduced exception groups - exceptions that bundle multiple exceptions together. Use pytest.raises() with ExceptionGroup (or, on recent pytest versions, the RaisesGroup helper):
Example: Testing concurrent operations
def test_multiple_validation_errors():
    """Test that all validation errors are reported together."""
    invalid_config = {
        "cutoff_freq": -10.0,      # Invalid (negative)
        "sample_rate": 0,          # Invalid (zero)
        "window_size": "invalid",  # Invalid (wrong type)
    }

    # Python 3.11+ syntax
    with pytest.raises(ExceptionGroup) as exc_info:
        validate_filter_config(invalid_config)

    # Check that the group contains the expected exceptions
    exceptions = exc_info.value.exceptions
    assert len(exceptions) == 3
    assert any(isinstance(e, ValueError) and "cutoff" in str(e) for e in exceptions)
    assert any(isinstance(e, ValueError) and "sample_rate" in str(e) for e in exceptions)
    assert any(isinstance(e, TypeError) and "window_size" in str(e) for e in exceptions)
Note: Exception groups are advanced Python 3.11+ features. For this course, focus on standard exception testing patterns above.
Reference: PEP 654 - Exception Groups
When NOT to Use pytest.raises()
Anti-pattern: Testing that code does NOT raise
# ❌ WRONG - Pointless test
def test_division_does_not_raise():
    with pytest.raises(ZeroDivisionError):
        pass  # This fails: pytest.raises errors out when nothing raises!
    # ... what were we testing?
✅ CORRECT - Just call the function
def test_division_with_nonzero_divisor():
    # Arrange
    numerator = 10.0
    denominator = 2.0

    # Act
    result = numerator / denominator

    # Assert
    assert result == 5.0  # If this runs, no exception was raised!
If a test completes without raising an exception, it passes. You don’t need to explicitly test that exceptions DON’T occur.
Quick Reference: Exception Testing
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic exception | with pytest.raises(ValueError): | Test that code raises a specific exception type |
| Exception message | with pytest.raises(ValueError, match=r"pattern"): | Validate that the exception message matches a regex |
| Exception details | with pytest.raises(ValueError) as exc_info: | Access the exception object for detailed assertions |
| Literal message | match=re.escape("literal [text]") | Match an exact message containing special characters |
| Exception groups | with pytest.raises(ExceptionGroup): | Test bundled exceptions (Python 3.11+) |
pytest.approx Deep Dive
You already use pytest.approx() for floating-point comparisons:
assert result == pytest.approx(expected, rel=1e-9)
But pytest.approx() is more powerful than you think. It works with sequences, dictionaries, NumPy arrays, and even nested structures.
How pytest.approx Works: Relative vs Absolute Tolerance
Default tolerances:
- rel=1e-6 (relative tolerance: 0.0001%)
- abs=1e-12 (absolute tolerance)
Comparison rule: A value is considered equal if it satisfies EITHER tolerance:
\[ |\text{actual} - \text{expected}| \leq \max(\text{rel} \times |\text{expected}|, \text{abs}) \]
Example:
import pytest

# For expected = 1000.0, rel=1e-6, abs=1e-12:
# tolerance = max(1e-6 * 1000.0, 1e-12) = 0.001
assert 1000.001 == pytest.approx(1000.0)   # Within 0.001
assert 999.999 == pytest.approx(1000.0)    # Within 0.001
assert 1000.002 != pytest.approx(1000.0)   # Outside 0.001
Why two tolerances?
- Relative tolerance scales with magnitude (good for large numbers)
- Absolute tolerance handles values near zero (where relative tolerance breaks down)
Example: Near-zero values
# For expected = 0.0, rel=1e-6 alone would give zero tolerance!
# That's why we also need abs=1e-12.
assert 1e-13 == pytest.approx(0.0)  # Uses absolute tolerance
assert 1e-11 != pytest.approx(0.0)  # Outside absolute tolerance
Choosing tolerances:
From Chapter 03 (Boundary Analysis), you learned about math.ulp() - the unit in last place:
import math

# For high-precision requirements, use a ULP-based tolerance
x = 1.0
tolerance = 10 * math.ulp(x)  # 10 ULPs = 10 * 2.220446049250313e-16
assert result == pytest.approx(expected, abs=tolerance)
Reference: pytest.approx documentation
pytest.approx with Sequences (Lists, Tuples)
pytest.approx() works element-wise with sequences:
import pytest

# Lists
assert [0.1 + 0.2, 0.2 + 0.4] == pytest.approx([0.3, 0.6])

# Tuples
assert (0.1 + 0.2, 0.2 + 0.4) == pytest.approx((0.3, 0.6))

# Use the same container type on both sides!
result_list = [1.0000001, 2.0000001, 3.0000001]
expected_list = [1.0, 2.0, 3.0]
assert result_list == pytest.approx(expected_list)
Example: Testing filter output
def test_lowpass_filter_output_values():
    # Arrange
    input_data = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
    cutoff_freq = 1.0
    sample_rate = 10.0
    # Expected output (pre-computed or from a reference implementation)
    expected_output = [0.98, 1.95, 2.89, 2.05, 1.02]

    # Act
    result = apply_lowpass_filter(input_data, cutoff_freq, sample_rate)

    # Assert - compare as a list
    assert result.tolist() == pytest.approx(expected_output, rel=1e-2)
Important notes:
- Lengths must match - pytest.approx will fail if sequences have different lengths
- Types must match - Can’t compare list to tuple (convert first)
- Applies to each element - Every element is checked with the same tolerance
pytest.approx with Dictionaries
pytest.approx() compares dictionary values element-wise:
import pytest

result_dict = {
    "mean": 0.1 + 0.2,
    "std": 0.2 + 0.4,
    "max": 1.0000001,
}
expected_dict = {
    "mean": 0.3,
    "std": 0.6,
    "max": 1.0,
}
assert result_dict == pytest.approx(expected_dict)
Example: Testing statistical summary
def test_profile_statistics():
    # Arrange
    profile_data = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

    # Act
    stats = compute_profile_statistics(profile_data)

    # Assert - compare as a dictionary
    expected_stats = {
        "mean": 1.5,
        "median": 1.5,
        "std": 0.70710678,  # sqrt(0.5)
        "min": 0.5,
        "max": 2.5,
    }
    assert stats == pytest.approx(expected_stats, rel=1e-6)
Important notes:
- Keys must match exactly - The comparison fails if the dictionaries have different keys
- Only numeric values are compared approximately - Non-numeric values are checked with ==
- Nested dictionaries require special handling (see below)
pytest.approx with NumPy Arrays
pytest.approx() works with NumPy arrays (most common use case in this course):
import numpy as np
import pytest

# 1D arrays
result = np.array([0.1 + 0.2, 0.2 + 0.4, 0.3 + 0.6])
expected = np.array([0.3, 0.6, 0.9])
assert result == pytest.approx(expected)

# 2D arrays
result_2d = np.array([[1.0000001, 2.0000001],
                      [3.0000001, 4.0000001]])
expected_2d = np.array([[1.0, 2.0],
                        [3.0, 4.0]])
assert result_2d == pytest.approx(expected_2d)
Example: Testing FFT output
def test_fft_output_magnitudes():
    # Arrange - 5 Hz sine wave, exactly periodic over the window
    # (endpoint=False avoids spectral leakage into neighboring bins)
    t = np.linspace(0, 1, 100, endpoint=False)
    signal = np.sin(2 * np.pi * 5.0 * t)

    # Act
    fft_result = np.fft.fft(signal)
    magnitudes = np.abs(fft_result)

    # Expected: peak at the 5 Hz frequency bin
    expected_peak_index = 5
    expected_magnitudes = np.zeros(100)
    expected_magnitudes[expected_peak_index] = 50.0   # N/2 for a unit-amplitude sine
    expected_magnitudes[-expected_peak_index] = 50.0  # Negative-frequency bin

    # Assert - approximate comparison of the full array
    assert magnitudes == pytest.approx(expected_magnitudes, rel=1e-1, abs=1e-10)
When to use numpy.testing instead:
For more advanced NumPy testing, consider numpy.testing module:
import numpy.testing as npt

# More control over NaN and infinity handling
npt.assert_allclose(result, expected, rtol=1e-6, atol=1e-12)

# Assert arrays are exactly equal (no tolerance)
npt.assert_array_equal(result_int, expected_int)

# Check shapes directly with a plain assert
assert result.shape == expected.shape
Comparison:
| Feature | pytest.approx | numpy.testing.assert_allclose |
|---|---|---|
| Syntax | assert result == pytest.approx(expected) | npt.assert_allclose(result, expected) |
| Error messages | pytest's assertion introspection | NumPy-specific error messages |
| NaN handling | Requires nan_ok=True | Built-in with equal_nan=True |
| Infinity handling | Works by default | Works by default |
| Consistency | Same syntax for all pytest tests | NumPy-specific, separate from pytest |
Recommendation: Use pytest.approx() for consistency unless you need NumPy-specific features.
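The equal_nan behavior from the table, sketched with assert_allclose (the array values here are arbitrary):

```python
import numpy as np
import numpy.testing as npt

result = np.array([1.0, np.nan, 3.0])
expected = np.array([1.0, np.nan, 3.0])

# Passes: matching NaN positions are treated as equal when equal_nan=True
npt.assert_allclose(result, expected, rtol=1e-6, atol=1e-12, equal_nan=True)
```

With equal_nan=False the same comparison would raise an AssertionError, which is useful when a NaN in the output should be treated as a bug.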
Special Cases: NaN and Infinity
Testing NaN values:
By default, NaN != NaN in floating-point arithmetic. Use nan_ok=True:
import math
import pytest

# ❌ WRONG - This will fail!
result_with_nan = math.nan
assert result_with_nan == pytest.approx(math.nan)  # AssertionError!

# ✅ CORRECT - Use nan_ok=True
assert result_with_nan == pytest.approx(math.nan, nan_ok=True)
Example: Testing numerical algorithm that can produce NaN
def test_safe_division_returns_nan():
    """Test that safe division returns NaN for 0/0."""
    # Arrange
    numerator = 0.0
    denominator = 0.0

    # Act
    result = safe_divide(numerator, denominator)  # Returns NaN instead of raising

    # Assert
    assert math.isnan(result)  # Explicit NaN check
    # OR use pytest.approx with nan_ok
    assert result == pytest.approx(math.nan, nan_ok=True)
Testing infinity:
Infinity works without special handling:
import math
import sys
import pytest

assert math.inf == pytest.approx(math.inf)
assert -math.inf == pytest.approx(-math.inf)

# But remember: inf != very large number!
assert sys.float_info.max != pytest.approx(math.inf)  # Different values!
From Chapter 03 (Boundary Analysis): Remember the critical distinction:
- sys.float_info.max ≈ 1.8 × 10³⁰⁸ - Largest finite float
- math.inf - Non-finite special value (larger than any finite number)
def test_overflow_to_infinity():
    """Test that overflow produces infinity, not sys.float_info.max."""
    # Arrange
    huge_number = sys.float_info.max

    # Act
    result = huge_number * 2  # Overflows to infinity

    # Assert
    assert result == pytest.approx(math.inf)  # NOT sys.float_info.max!
    assert math.isinf(result)
Nested Structures and Limitations
pytest.approx() has limited support for nested structures. You cannot directly nest pytest.approx() calls:
❌ WRONG - This doesn’t work:
nested_dict = {
    "outer": {
        "inner": 0.1 + 0.2
    }
}

# This will NOT work - pytest.approx doesn't recurse into nested dicts
assert nested_dict == pytest.approx({"outer": {"inner": 0.3}})
✅ WORKAROUND - Flatten or test separately:
# Option 1: Flatten and test
assert nested_dict["outer"]["inner"] == pytest.approx(0.3)

# Option 2: Test the nested dict separately
assert nested_dict["outer"] == pytest.approx({"inner": 0.3})

# Option 3: Use a custom helper for deep comparison
def approx_nested(expected, **kwargs):
    """Recursively wrap nested structures in pytest.approx."""
    if isinstance(expected, dict):
        return {k: approx_nested(v, **kwargs) for k, v in expected.items()}
    elif isinstance(expected, (list, tuple)):
        return type(expected)(approx_nested(e, **kwargs) for e in expected)
    else:
        return pytest.approx(expected, **kwargs)

# Use the custom helper
assert nested_dict == approx_nested({"outer": {"inner": 0.3}})
Recommendation: Keep test assertions simple. If you need deep nested comparisons, consider refactoring your data structures or testing at different levels.
Quick Reference: pytest.approx Patterns
| Data Structure | Syntax | Notes |
|---|---|---|
| Scalar | assert x == pytest.approx(expected) | Basic floating-point comparison |
| List/Tuple | assert [x, y] == pytest.approx([a, b]) | Element-wise comparison; lengths must match |
| Dictionary | assert {"k": x} == pytest.approx({"k": a}) | Keys must match; values compared element-wise |
| NumPy array | assert arr == pytest.approx(expected_arr) | Works with multi-dimensional arrays |
| NaN values | assert x == pytest.approx(nan, nan_ok=True) | Must enable nan_ok=True |
| Infinity | assert x == pytest.approx(math.inf) | Works without special handling |
| Custom tolerance | pytest.approx(x, rel=1e-9, abs=1e-12) | Adjust relative and absolute tolerances |
Warning and Deprecation Testing
Not all problems in code raise exceptions. Sometimes code issues warnings - signals that something might be wrong, but execution continues.
Common warning types:
- UserWarning - General warnings to users
- DeprecationWarning - Features being phased out
- FutureWarning - Upcoming breaking changes
- RuntimeWarning - Suspicious runtime behavior (e.g., divide by zero in NumPy)
Testing Warnings with pytest.warns()
Similar to pytest.raises(), but for warnings:
Basic syntax:
import pytest
import warnings

def test_deprecated_function_warns():
    with pytest.warns(DeprecationWarning):
        deprecated_function()
With message matching:
def test_deprecated_function_message():
    with pytest.warns(DeprecationWarning, match=r"deprecated.*use new_function instead"):
        deprecated_function()
Example: Testing your own deprecation warnings
# road_profile_viewer/filters.py
def apply_filter(data, cutoff):
    """Apply filter to data.

    .. deprecated:: 2.0
        Use apply_lowpass_filter() instead. This function will be removed in version 3.0.
    """
    warnings.warn(
        "apply_filter() is deprecated, use apply_lowpass_filter() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return apply_lowpass_filter(data, cutoff)

def test_apply_filter_deprecation_warning():
    """Test that apply_filter() issues a deprecation warning."""
    # Arrange
    data = np.array([1.0, 2.0, 3.0])
    cutoff = 1.0

    # Act & Assert
    with pytest.warns(DeprecationWarning, match=r"deprecated.*apply_lowpass_filter"):
        result = apply_filter(data, cutoff)

    # Can still assert on the result
    assert len(result) == len(data)
Important: stacklevel parameter
When issuing warnings in your own code, use stacklevel=2 to report the warning at the caller’s location, not inside your function:
# Without stacklevel - the warning points inside apply_filter()
warnings.warn("deprecated", DeprecationWarning)

# With stacklevel=2 - the warning points to where apply_filter() was called
warnings.warn("deprecated", DeprecationWarning, stacklevel=2)
Reference: pytest.warns documentation
Testing Deprecation Warnings: pytest.deprecated_call()
For specifically testing deprecation warnings, use pytest.deprecated_call():
def test_deprecated_function():
    with pytest.deprecated_call():
        deprecated_function()
This is equivalent to:
with pytest.warns((DeprecationWarning, PendingDeprecationWarning)):
    deprecated_function()
When to use each:
- pytest.warns(DeprecationWarning) - When you want to match the message
- pytest.deprecated_call() - When you just want to verify deprecation (any deprecation warning)
Accessing Warning Details
Like pytest.raises(), you can access warning details:
def test_warning_details():
    with pytest.warns(UserWarning) as warning_info:
        issue_warning()

    # Access warning details
    assert len(warning_info) == 1  # Number of warnings
    assert "specific text" in str(warning_info[0].message)
    assert warning_info[0].category is UserWarning
Example: Testing NumPy runtime warnings
def test_divide_by_zero_warning():
    """Test that dividing by zero in NumPy issues a RuntimeWarning."""
    # Arrange
    numerator = np.array([1.0, 2.0, 3.0])
    denominator = np.array([1.0, 0.0, 1.0])  # Contains zero!

    # Act & Assert
    with pytest.warns(RuntimeWarning, match="divide by zero"):
        result = numerator / denominator

    # Result contains inf at index 1
    assert math.isinf(result[1])
Configuring Warning Filters
Sometimes you want to suppress warnings during tests (e.g., third-party library warnings you can’t control).
In pytest.ini:
[pytest]
filterwarnings =
    # Turn warnings into errors (strict mode)
    error
    # Ignore all deprecation warnings
    ignore::DeprecationWarning
    # Ignore NumPy deprecations
    ignore:.*deprecated.*:DeprecationWarning:numpy.*
In individual tests:
@pytest.mark.filterwarnings("ignore::DeprecationWarning")
def test_with_suppressed_warnings():
    # This test won't fail on deprecation warnings
    deprecated_function()
Common use case: Ignoring third-party warnings while keeping your own:
[pytest]
filterwarnings =
    # Fail on warnings
    error
    # Except matplotlib deprecations
    ignore::DeprecationWarning:matplotlib.*
    # Except NumPy pending deprecations
    ignore::PendingDeprecationWarning:numpy.*
Reference: Warnings capture configuration
When to Test Warnings
Test warnings when:
- You’re deprecating your own API - Ensure warnings are issued correctly
- You’re working around library warnings - Document expected warnings in tests
- You’re testing numerical code - Verify warnings for edge cases (overflow, underflow, division by zero)
Don’t test warnings when:
- They’re from third-party libraries - Not your responsibility (filter them instead)
- They’re unrelated to test intent - Focus on primary behavior
Quick Reference: Warning Testing
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic warning | with pytest.warns(UserWarning): | Test that code issues a specific warning type |
| Warning message | with pytest.warns(UserWarning, match=r"pattern"): | Validate that the warning message matches a regex |
| Deprecation | with pytest.deprecated_call(): | Test that code issues a deprecation warning |
| Warning details | with pytest.warns(UserWarning) as w: | Access the warning object for detailed assertions |
| Suppress warnings | @pytest.mark.filterwarnings("ignore") | Ignore warnings in a specific test |
Test Control and Organization
Sometimes you need to control when tests run, or explicitly mark them as failing. pytest provides several helpers for test control.
Explicit Test Failure: pytest.fail()
Sometimes you need to fail a test explicitly with a custom message:
Syntax:
import pytest

def test_something():
    if complex_condition():
        pytest.fail("Custom failure message")
When to use pytest.fail():
- Complex conditional logic - When a simple assert isn't expressive enough
- Placeholder tests - Mark tests as TODO
- Unreachable code - Fail if code reaches an unexpected state
Example: Testing that code path is NOT taken
def test_error_handling_path_not_taken():
    """Test that the error handling path is NOT triggered for valid input."""
    # Arrange
    valid_data = np.array([1.0, 2.0, 3.0])

    # Act
    try:
        result = process_data(valid_data)
    except ValueError:
        pytest.fail("ValueError should not be raised for valid data")

    # Assert
    assert len(result) == len(valid_data)
Alternative (more Pythonic):
def test_error_handling_path_not_taken_v2():
    """Test that the error handling path is NOT triggered for valid input."""
    # Arrange
    valid_data = np.array([1.0, 2.0, 3.0])

    # Act - just call it; if an exception is raised, the test fails automatically
    result = process_data(valid_data)

    # Assert
    assert len(result) == len(valid_data)
The second version is preferred - pytest fails the test automatically if an unexpected exception occurs.
When pytest.fail() IS useful:
def test_switch_statement_coverage():
    """Test all branches of switch-like logic."""
    for case in ["option_a", "option_b", "option_c"]:
        result = handle_option(case)
        if case == "option_a":
            assert result == "handled_a"
        elif case == "option_b":
            assert result == "handled_b"
        elif case == "option_c":
            assert result == "handled_c"
        else:
            pytest.fail(f"Unexpected case: {case}")  # Should never be reached
Reference: pytest.fail documentation
Skipping Tests: pytest.skip()
Skip tests conditionally at runtime:
Syntax:
import pytest
import sys

def test_windows_only():
    if sys.platform != "win32":
        pytest.skip("Test only runs on Windows")
    # Windows-specific test code
    ...
When to use pytest.skip():
- Platform-specific tests - Skip on unsupported platforms
- Dependency-based tests - Skip if optional dependency missing
- Slow tests - Skip during rapid development
- External resource tests - Skip if resource unavailable
Example: Skip if optional dependency missing
def test_with_optional_dependency():
    try:
        import matplotlib.pyplot as plt
    except ImportError:
        pytest.skip("matplotlib not installed")
    # Test code using matplotlib
    ...
Better approach: Use decorator or importorskip
# Option 1: Decorator
@pytest.mark.skipif(sys.platform != "win32", reason="Windows only")
def test_windows_only():
    ...

# Option 2: importorskip
def test_with_matplotlib():
    plt = pytest.importorskip("matplotlib.pyplot")
    # Test code using plt
    ...
Recommendation: Prefer decorators for static conditions, pytest.skip() for dynamic runtime conditions.
Reference: Skipping tests
Expected Failures: pytest.xfail()
Mark tests as “expected to fail” - useful for known bugs or incomplete features:
Syntax:
import pytest

def test_known_bug():
    pytest.xfail("Known bug #123 - division by zero not handled")
    buggy_function()  # Never runs: pytest.xfail() ends the test immediately
When to use pytest.xfail():
- Known bugs - Document bugs with failing tests (better than deleting tests!)
- Incomplete features - Write tests for features before implementation (TDD)
- Platform-specific failures - Mark tests that fail on specific platforms
Example: Known bug documentation
def test_edge_case_known_issue():
    """Test an edge case with a known issue.

    See: https://github.com/user/repo/issues/123
    TODO: Fix in version 2.1
    """
    if sys.float_info.max * 2 == math.inf:  # Overflow handling differs by platform
        pytest.xfail("Known issue: overflow handling platform-dependent")

    result = handle_overflow(sys.float_info.max)
    assert result is not None
Decorator form:
@pytest.mark.xfail(reason="Known bug #123")
def test_known_bug():
    buggy_function()

# Conditional xfail
@pytest.mark.xfail(sys.platform == "win32", reason="Fails on Windows")
def test_unix_specific():
    ...
Difference from skip:
- skip - The test doesn't run; it's reported as "skipped"
- xfail - The test runs; it's reported as "xfail" if it fails, or "xpass" if it unexpectedly passes
Reference: Expected failures
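One knob worth knowing on top of the examples above: strict=True makes an unexpected pass ("xpass") count as a failure, which keeps known-bug markers honest once the bug is fixed. A minimal sketch with a hypothetical buggy_add():

```python
import pytest

def buggy_add(a: int, b: int) -> int:
    # Hypothetical function standing in for code with a known bug
    return a + b + 1  # off-by-one bug

@pytest.mark.xfail(reason="Known off-by-one bug #123", strict=True)
def test_buggy_add():
    assert buggy_add(2, 2) == 4  # Fails today, so the test reports as xfail
```

Without strict=True, fixing the bug silently turns the test into an xpass; with strict=True, the xpass is reported as a failure, prompting you to remove the stale marker.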
Conditional Import: pytest.importorskip()
Skip test if module cannot be imported:
Syntax:
def test_with_optional_dependency():
    np = pytest.importorskip("numpy", minversion="1.20")
    # Test code using numpy
    ...
Advantages over try/except:
- Clearer intent - Obviously skipping due to missing dependency
- Version checking - Can specify minimum version
- Better pytest output - Marked as “skipped” with reason
Example: Testing with optional dependencies
def test_advanced_plotting():
    """Test advanced plotting features (requires matplotlib)."""
    plt = pytest.importorskip("matplotlib.pyplot")
    sns = pytest.importorskip("seaborn", minversion="0.11")

    # Arrange
    data = np.array([1, 2, 3, 4, 5])

    # Act
    fig, ax = plt.subplots()
    sns.lineplot(x=range(len(data)), y=data, ax=ax)

    # Assert
    assert len(ax.lines) == 1
Reference: pytest.importorskip documentation
Decorators vs Imperative Calls
When to use decorators:
# Static conditions (known before the test runs)
@pytest.mark.skipif(sys.platform == "win32", reason="Unix only")
def test_unix_feature():
    ...

@pytest.mark.xfail(reason="Known bug")
def test_buggy_feature():
    ...
When to use imperative calls:
# Dynamic conditions (determined during test execution)
def test_conditional_skip():
    config = load_config()  # Need to run code to determine the condition
    if not config.feature_enabled:
        pytest.skip("Feature disabled in config")
    ...
Recommendation: Use decorators when possible (clearer, shows skip reason before test runs).
Quick Reference: Test Control
| Function | Syntax | Use Case |
|---|---|---|
| pytest.fail() | pytest.fail("message") | Explicitly fail a test with a custom message |
| pytest.skip() | pytest.skip("reason") | Skip a test at runtime (dynamic condition) |
| @pytest.mark.skip | @pytest.mark.skip(reason="...") | Skip a test (static condition) |
| @pytest.mark.skipif | @pytest.mark.skipif(condition, reason="...") | Skip a test conditionally |
| pytest.xfail() | pytest.xfail("reason") | Mark a test as expected to fail (runtime) |
| @pytest.mark.xfail | @pytest.mark.xfail(reason="...") | Mark a test as expected to fail (static) |
| pytest.importorskip() | pytest.importorskip("module") | Skip a test if a module is unavailable |
Assertion Introspection Mastery
One of pytest’s most powerful features is assertion introspection - the ability to show detailed information about why assertions fail.
How Assertion Rewriting Works
When pytest imports your test modules, it rewrites their assert statements before execution. This allows pytest to:
- Capture intermediate values - Shows subexpressions in failed assertions
- Provide context - Shows surrounding lines of code
- Format output - Pretty-prints complex data structures
Example of introspection output:
def test_list_comparison():
    result = [1, 2, 3, 4]
    expected = [1, 2, 5, 4]
    assert result == expected
Pytest output:
    def test_list_comparison():
        result = [1, 2, 3, 4]
        expected = [1, 2, 5, 4]
>       assert result == expected
E       AssertionError: assert [1, 2, 3, 4] == [1, 2, 5, 4]
E         At index 2 diff: 3 != 5
E         Use -v to get the full diff
Notice how pytest automatically:
- Shows the values of result and expected
- Identifies which index differs
- Suggests using -v for more details
What Introspection Shows for Different Types
Strings - Context diff:
def test_long_string():
    result = "The quick brown fox jumps over the lazy dog"
    expected = "The quick brown cat jumps over the lazy dog"
    assert result == expected
Output:
E       AssertionError: assert 'The quick br...he lazy dog' == 'The quick br...he lazy dog'
E         - The quick brown cat jumps over the lazy dog
E         ?                 ^^^
E         + The quick brown fox jumps over the lazy dog
E         ?                 ^^^
Lists - First differing element:
def test_list_diff():
    result = [1, 2, 3, 4, 5]
    expected = [1, 2, 3, 99, 5]
    assert result == expected
Output:
E       AssertionError: assert [1, 2, 3, 4, 5] == [1, 2, 3, 99, 5]
E         At index 3 diff: 4 != 99
Dictionaries - Differing entries:
def test_dict_diff():
    result = {"a": 1, "b": 2, "c": 3}
    expected = {"a": 1, "b": 99, "c": 3}
    assert result == expected
Output:
E AssertionError: assert {'a': 1, 'b': 2, 'c': 3} == {'a': 1, 'b': 99, 'c': 3}
E Differing items:
E {'b': 2} != {'b': 99}
Sets - Extra/missing items:
def test_set_diff():
result = {1, 2, 3, 4}
expected = {1, 2, 3, 5}
assert result == expected
Output:
E AssertionError: assert {1, 2, 3, 4} == {1, 2, 3, 5}
E Extra items in the left set:
E {4}
E Extra items in the right set:
E {5}
Custom Assertion Messages
You can add custom messages to assertions:
Syntax:
assert condition, "Custom failure message"
Example:
def test_with_custom_message():
result = compute_value()
expected = 42
assert result == expected, f"Expected {expected}, but got {result}"
Important: Modern pytest (7.0+) preserves introspection even with custom messages!
Old pytest (before 7.0):
- Custom message disabled introspection
- You had to choose: introspection OR custom message
Modern pytest (7.0+):
- Custom message appends to introspection
- You get both introspection AND custom message!
Example output (pytest 7.0+):
> assert result == expected, f"Expected {expected}, but got {result}"
E AssertionError: Expected 42, but got 41
E assert 41 == 42
When to add custom messages:
- Complex conditions - Explain WHY the assertion matters
- Domain-specific checks - Add context about what’s being tested
- Debugging hints - Suggest fixes or related tests
When NOT to add custom messages:
- Simple comparisons - Introspection already clear
- Redundant information - Just repeating what introspection shows
Example: Good use of custom message
def test_filter_cutoff_frequency():
"""Test that filter cutoff is within valid range."""
# Arrange
sample_rate = 100.0 # Hz
cutoff_freq = 60.0 # Hz
# Act
filter_config = create_filter(sample_rate, cutoff_freq)
# Assert with helpful message
assert filter_config.cutoff_freq < sample_rate / 2, \
f"Cutoff frequency ({cutoff_freq} Hz) must be less than Nyquist frequency ({sample_rate/2} Hz)"
Custom Assertion Explanations (Advanced)
For very complex custom types, you can define custom assertion explanations via the pytest_assertrepr_compare hook.
Example: Custom comparison for NumPy arrays
# conftest.py
import numpy as np
def pytest_assertrepr_compare(op, left, right):
"""Custom assertion representation for NumPy arrays."""
if isinstance(left, np.ndarray) and isinstance(right, np.ndarray) and op == "==":
return [
"NumPy array comparison:",
f" Shape: {left.shape} vs {right.shape}",
f" Dtype: {left.dtype} vs {right.dtype}",
f" Max difference: {np.max(np.abs(left - right))}",
f" Mean difference: {np.mean(np.abs(left - right))}",
]
Result:
E AssertionError: NumPy array comparison:
E Shape: (100,) vs (100,)
E Dtype: float64 vs float64
E Max difference: 0.05
E Mean difference: 0.01
When to use:
- Custom types with complex comparison logic
- When default introspection isn’t helpful
- Domain-specific types (e.g., road profile data structures)
Note: This is advanced usage. For most tests, default introspection is sufficient.
Reference: Custom assertion explanations
Debugging Assertion Rewriting
Sometimes assertion rewriting doesn’t work. Common causes:
1. Module imported too early:
Assertion rewriting happens at import time. If a module is imported before pytest (or your conftest.py) registers it for rewriting, its assertions stay plain:
# ❌ WRONG - module already imported, so registration has no effect
import my_module
pytest.register_assert_rewrite("my_module")  # Too late!
# ✅ CORRECT - register before the first import
pytest.register_assert_rewrite("my_module")
import my_module
2. Assertions in imported modules:
Pytest only rewrites assertions in test files (files matching test_*.py or *_test.py).
For assertions in non-test files, use pytest.register_assert_rewrite():
# conftest.py
import pytest

pytest.register_assert_rewrite("my_package.helpers")
3. Bytecode caching issues:
If assertion introspection stops working after code changes, clear pytest cache:
pytest --cache-clear
Reference: Assertion rewriting
Quick Reference: Assertion Introspection
| Type | Introspection Shows | Example Output |
|---|---|---|
| Numbers | Values and comparison | assert 5 == 10 → 5 != 10 |
| Strings | Context diff | Character-by-character diff with markers |
| Lists | First differing index | At index 3 diff: 4 != 99 |
| Dicts | Differing items | Differing items: {'b': 2} != {'b': 99} |
| Sets | Extra/missing items | Extra items in left: {4} |
| Custom types | Default repr() | Use pytest_assertrepr_compare for custom |
Parametrization Patterns
One of the most powerful pytest features is parametrization - running the same test with different inputs.
Basic Parametrization: @pytest.mark.parametrize
Syntax:
import pytest
@pytest.mark.parametrize("input,expected", [
(1, 2),
(2, 4),
(3, 6),
])
def test_double(input, expected):
assert double(input) == expected
This creates three separate tests, one for each parameter set:
test_module.py::test_double[1-2] PASSED
test_module.py::test_double[2-4] PASSED
test_module.py::test_double[3-6] PASSED
Example: Testing boundary values
From Chapter 03 (Boundary Analysis), you learned to test boundary values. Parametrization makes this cleaner:
@pytest.mark.parametrize("x,expected", [
(0.0, 0.0), # Zero
(1.0, 1.0), # Normal value
(sys.float_info.max, sys.float_info.max), # Largest finite
(math.inf, None), # Infinity (returns None)
(-math.inf, None), # Negative infinity
])
def test_safe_sqrt_boundaries(x, expected):
"""Test safe_sqrt at floating-point boundaries."""
result = safe_sqrt(x)
if expected is None:
assert result is None
else:
assert result == pytest.approx(expected)
Advantages:
- Reduces duplication - Same test logic, different data
- Clear test names - Each parameter set creates separate test
- Granular failures - Know exactly which input failed
- Easy to add cases - Just add to list
Multiple Parameters
You can parametrize multiple arguments:
@pytest.mark.parametrize("x,y,expected", [
(1, 2, 3),
(0, 0, 0),
(-1, 1, 0),
(sys.float_info.max, 1, math.inf), # Overflow
])
def test_add(x, y, expected):
result = x + y
assert result == pytest.approx(expected)
Parametrizing Fixtures: indirect=True
Sometimes you want to parametrize fixtures instead of test parameters:
Example: Testing with different data files
# conftest.py
import numpy as np
import pytest

@pytest.fixture
def profile_data(request):
    """Load profile data from file."""
    filename = request.param  # Get parameter value
    return np.loadtxt(f"test_data/{filename}")
# test_processing.py
@pytest.mark.parametrize("profile_data", [
"smooth_road.txt",
"rough_road.txt",
"highway.txt",
], indirect=True) # Pass parameter to fixture!
def test_process_profile(profile_data):
result = process_profile(profile_data)
assert len(result) == len(profile_data)
How it works:
- indirect=True tells pytest to pass the parameter to the fixture, not the test
- Fixture receives parameter via request.param
- Fixture returns processed value to test
When to use:
- Expensive setup (load data, create database, etc.)
- Complex test data generation
- Shared setup across multiple tests
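To make the expensive-setup case concrete, here is a minimal sketch; the dataset names and the load_dataset helper are hypothetical, with a cache standing in for a genuinely slow load:

```python
import pytest

_CACHE = {}

def load_dataset(name):
    """Hypothetical slow loader; the cache ensures each dataset loads only once."""
    if name not in _CACHE:
        _CACHE[name] = list(range(10 if name == "small" else 1000))
    return _CACHE[name]

@pytest.fixture
def dataset(request):
    """Fixture receives the parametrized dataset name via request.param."""
    return load_dataset(request.param)

@pytest.mark.parametrize("dataset", ["small", "large"], indirect=True)
def test_dataset_nonempty(dataset):
    assert len(dataset) > 0
```

Each parametrized test triggers the fixture with a different name, while repeated requests for the same dataset hit the cache instead of reloading.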
Combining Fixtures and Parametrize
You can mix regular fixtures with parametrized values:
@pytest.fixture
def sample_rate():
"""Sample rate fixture (not parametrized)."""
return 100.0 # Hz
@pytest.mark.parametrize("cutoff_freq,expected_response", [
(10.0, "lowpass"),
(50.0, "highpass"),
])
def test_filter_response(sample_rate, cutoff_freq, expected_response):
"""Test filter response at different cutoff frequencies."""
# sample_rate from fixture, cutoff_freq from parametrize
filter = create_filter(sample_rate, cutoff_freq)
assert filter.response_type == expected_response
Using ids for Readable Test Names
By default, pytest generates test names from parameter values. For complex values, this can be ugly:
# Default test names (hard to read):
# test_something[<object object at 0x...>-expected0]
# test_something[<object object at 0x...>-expected1]
# Better: Use ids parameter
@pytest.mark.parametrize("data,expected", [
(smooth_road_data, "smooth"),
(rough_road_data, "rough"),
], ids=["smooth_road", "rough_road"])
def test_classification(data, expected):
...
Result:
test_module.py::test_classification[smooth_road] PASSED
test_module.py::test_classification[rough_road] PASSED
Using functions for ids:
def idfn(val):
"""Generate test ID from parameter value."""
if isinstance(val, np.ndarray):
return f"array_len_{len(val)}"
return str(val)
@pytest.mark.parametrize("data", [
np.array([1, 2, 3]),
np.array([1, 2, 3, 4, 5]),
], ids=idfn)
def test_with_arrays(data):
...
Result:
test_module.py::test_with_arrays[array_len_3] PASSED
test_module.py::test_with_arrays[array_len_5] PASSED
Parametrize Anti-Patterns
❌ Anti-pattern 1: Over-parametrization
Don’t parametrize unrelated behaviors:
# ❌ WRONG - Testing different behaviors, not different inputs
@pytest.mark.parametrize("operation,x,y,expected", [
("add", 1, 2, 3),
("subtract", 5, 3, 2),
("multiply", 2, 3, 6),
])
def test_calculator(operation, x, y, expected):
if operation == "add":
result = x + y
elif operation == "subtract":
result = x - y
elif operation == "multiply":
result = x * y
assert result == expected
✅ CORRECT - Separate tests for different behaviors:
@pytest.mark.parametrize("x,y,expected", [(1, 2, 3), (0, 0, 0), (-1, 1, 0)])
def test_add(x, y, expected):
assert add(x, y) == expected
@pytest.mark.parametrize("x,y,expected", [(5, 3, 2), (0, 0, 0), (-1, -1, 0)])
def test_subtract(x, y, expected):
assert subtract(x, y) == expected
@pytest.mark.parametrize("x,y,expected", [(2, 3, 6), (0, 5, 0), (-1, -1, 1)])
def test_multiply(x, y, expected):
assert multiply(x, y) == expected
Why: Different behaviors should be separate tests. Parametrize same behavior with different inputs.
❌ Anti-pattern 2: Hidden test logic in parameters
# ❌ WRONG - Test logic in parameter list
@pytest.mark.parametrize("x,expected", [
(0, 0),
(1, 1),
(2, 4),
(3, 9),
(4, 16), # Pattern: expected = x**2, but not obvious!
])
def test_square(x, expected):
assert square(x) == expected
✅ CORRECT - Make pattern explicit:
@pytest.mark.parametrize("x", [0, 1, 2, 3, 4])
def test_square(x):
expected = x ** 2 # Pattern clear in test body
assert square(x) == expected
Or use a helper function:
def square_test_cases():
"""Generate test cases for square function."""
return [(x, x**2) for x in range(10)]
@pytest.mark.parametrize("x,expected", square_test_cases())
def test_square(x, expected):
assert square(x) == expected
❌ Anti-pattern 3: Too many parameters
# ❌ WRONG - Too many parameters, hard to read
@pytest.mark.parametrize(
"sample_rate,cutoff_freq,window_size,overlap,detrend,filter_type,expected_length",
[
(100, 10, 256, 128, True, "lowpass", 100),
(200, 20, 512, 256, False, "highpass", 200),
# ... 20 more cases ...
]
)
def test_complex_filter(...): # 7 parameters!
...
✅ CORRECT - Use dataclasses or dicts:
from dataclasses import dataclass
@dataclass
class FilterConfig:
sample_rate: float
cutoff_freq: float
window_size: int
overlap: int
detrend: bool
filter_type: str
expected_length: int
@pytest.mark.parametrize("config", [
FilterConfig(100, 10, 256, 128, True, "lowpass", 100),
FilterConfig(200, 20, 512, 256, False, "highpass", 200),
], ids=["config_1", "config_2"])
def test_complex_filter(config):
result = apply_filter(
sample_rate=config.sample_rate,
cutoff_freq=config.cutoff_freq,
# ...
)
assert len(result) == config.expected_length
Quick Reference: Parametrization
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic parametrize | @pytest.mark.parametrize("x,y", [(1,2), (3,4)]) | Run test with different inputs |
| Single parameter | @pytest.mark.parametrize("x", [1, 2, 3]) | Vary single input |
| Fixture parametrize | @pytest.mark.parametrize("fix", [...], indirect=True) | Pass parameters to fixture |
| Custom test IDs | @pytest.mark.parametrize(..., ids=[...]) | Readable test names |
| ID function | @pytest.mark.parametrize(..., ids=func) | Generate IDs from parameter values |
Reference: Parametrization documentation
Common Anti-Patterns
LLM-generated tests (from ChatGPT, Claude, etc.) often contain anti-patterns. Here are the most common mistakes to watch for.
Anti-Pattern 1: Tests That Return Values
❌ WRONG:
def test_addition():
result = 1 + 1
return result == 2 # ❌ Tests must NOT return!
Why it’s wrong: pytest ignores return values. This test will ALWAYS pass, even if result != 2.
✅ CORRECT:
def test_addition():
result = 1 + 1
assert result == 2 # ✅ Use assert!
Detection: Search for return statements in test functions (almost always wrong).
Anti-Pattern 2: Looping Over Test Cases
❌ WRONG:
def test_multiple_cases():
test_cases = [(1, 2), (2, 4), (3, 6)]
for input, expected in test_cases:
assert double(input) == expected # ❌ Stops at first failure!
Why it’s wrong:
- Stops at first failure - You don’t see all failing cases
- No granular reporting - Don’t know which case failed from test name
- Harder to debug - Need to add print statements to see which iteration failed
✅ CORRECT:
@pytest.mark.parametrize("input,expected", [
(1, 2),
(2, 4),
(3, 6),
])
def test_double(input, expected):
assert double(input) == expected # ✅ Use parametrize!
Benefits:
- All failures reported - Runs all cases even if some fail
- Granular test names - test_double[1-2], test_double[2-4], etc.
- Easy to debug - Know exactly which case failed
Anti-Pattern 3: Assertions with Side Effects
❌ WRONG:
def test_data_processing():
data = []
assert data.append(1) is None # ❌ Modifies data!
assert len(data) == 1
Why it’s wrong: Assertions should NOT modify state. This makes tests:
- Hard to understand - Assertion isn’t just checking, it’s doing!
- Fragile - Depends on assertion execution order
- Confusing when skipped - If assertion doesn’t run, state is different
✅ CORRECT:
def test_data_processing():
data = []
data.append(1) # ✅ Separate action from assertion
assert len(data) == 1
Anti-Pattern 4: Over-Mocking
LLM-generated tests often abuse mocking. From Chapter 03 (Boundary Analysis):
❌ WRONG:
def test_data_processing(mocker):
# ❌ Mocking internal implementation details
mocker.patch("module.internal_helper")
mocker.patch("module.another_helper")
mocker.patch("module.yet_another_helper")
result = process_data([1, 2, 3])
# ❌ Testing method calls, not results
module.internal_helper.assert_called_once()
module.another_helper.assert_called_with(ANY)
Why it’s wrong:
- Tests implementation, not behavior - Breaks when you refactor
- Doesn’t test actual logic - Mocks bypass real code
- False confidence - Tests pass even if code is broken
✅ CORRECT:
def test_data_processing():
# ✅ No mocks - test the actual behavior
result = process_data([1, 2, 3])
# ✅ Assert on result (state), not method calls (interaction)
assert len(result) == 3
assert all(isinstance(x, int) for x in result)
assert result == [2, 4, 6] # Actual expected result
When to mock:
- External services - APIs, databases, file systems
- Slow operations - Network calls, large computations
- Non-deterministic behavior - Random, time-dependent
When NOT to mock:
- Your own functions - Test them directly
- Simple helpers - Faster to run than mock
- Business logic - The core of what you’re testing
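To illustrate where the boundary sits, here is a hedged sketch: fetch_reading is a hypothetical stand-in for a network call, and the test swaps in a fake via simple injection (the same idea mocker.patch expresses), while the conversion logic runs for real:

```python
import pytest

def fetch_reading():
    """Hypothetical external call (network) - the only thing worth faking."""
    raise RuntimeError("no network access in tests")

def average_temperature_celsius(fetch=fetch_reading):
    """Our own logic - always tested for real; only the external fetch is swappable."""
    readings_kelvin = fetch()
    return sum(readings_kelvin) / len(readings_kelvin) - 273.15

def test_average_temperature():
    # Fake ONLY the external boundary; the conversion logic executes for real
    fake_fetch = lambda: [300.15, 302.15]
    assert average_temperature_celsius(fetch=fake_fetch) == pytest.approx(28.0)
```

If the averaging code had a bug, this test would still catch it, because nothing between the faked boundary and the assertion is mocked.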
Anti-Pattern 5: Non-Deterministic Tests
❌ WRONG:
import random
def test_random_sampling():
data = random.sample(range(100), 10) # ❌ Different every run!
result = process_data(data)
assert len(result) == 10 # Might pass or fail randomly
Why it’s wrong:
- Flaky tests - Pass sometimes, fail sometimes (destroys trust in tests)
- Hard to debug - Can’t reproduce failures
- Wastes time - Developers re-run tests hoping for pass
✅ CORRECT - Option 1: Seed random generator
import random
def test_random_sampling():
random.seed(42) # ✅ Consistent results
data = random.sample(range(100), 10)
result = process_data(data)
assert len(result) == 10
✅ CORRECT - Option 2: Use fixture with fixed data
@pytest.fixture
def sample_data():
"""Fixed test data (deterministic)."""
return [1, 5, 10, 15, 20, 25, 30, 35, 40, 45]
def test_sampling(sample_data):
result = process_data(sample_data)
assert len(result) == 10
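Since the examples in this appendix use NumPy, the same seeding idea carries over to NumPy's Generator API; make_noisy_profile is a hypothetical helper:

```python
import numpy as np

def make_noisy_profile(n, seed=42):
    """Deterministic 'random' road profile via a seeded NumPy generator."""
    rng = np.random.default_rng(seed)  # same seed -> same sequence every run
    return rng.normal(loc=0.0, scale=0.01, size=n)

def test_noisy_profile_is_reproducible():
    a = make_noisy_profile(100)
    b = make_noisy_profile(100)
    assert np.array_equal(a, b)  # fully deterministic across runs
```

Prefer passing an explicit seed (or Generator) into functions over seeding the global random module, so each test controls its own randomness.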
Other non-deterministic sources:
# ❌ WRONG - Time-dependent
import time
def test_timestamp():
timestamp = time.time() # Different every run!
...
# ✅ CORRECT - Mock or fix time
def test_timestamp(mocker):
mocker.patch("time.time", return_value=1234567890)
timestamp = time.time()
assert timestamp == 1234567890
# ❌ WRONG - Order-dependent (dict/set iteration)
def test_keys():
data = {"a": 1, "b": 2, "c": 3}
keys = list(data.keys())
assert keys[0] == "a" # Order not guaranteed in Python < 3.7!
# ✅ CORRECT - Don't depend on order
def test_keys():
data = {"a": 1, "b": 2, "c": 3}
assert "a" in data.keys()
assert set(data.keys()) == {"a", "b", "c"}
Anti-Pattern 6: Test Pollution
❌ WRONG:
# Shared global state between tests
global_cache = []
def test_add_to_cache():
global_cache.append(1)
assert len(global_cache) == 1 # ✅ Passes first time
def test_cache_contains_items():
assert len(global_cache) > 0 # ❌ Depends on test order!
Why it’s wrong:
- Order-dependent - Tests pass/fail depending on execution order
- Breaks isolation - Tests affect each other
- Hard to debug - Failures only appear when tests run in specific order
✅ CORRECT:
@pytest.fixture
def cache():
"""Fresh cache for each test."""
return []
def test_add_to_cache(cache):
cache.append(1)
assert len(cache) == 1 # ✅ Isolated
def test_cache_contains_items(cache):
cache.append(1)
cache.append(2)
assert len(cache) == 2 # ✅ Isolated
Fixture scope:
# Function scope (default) - New instance per test
@pytest.fixture(scope="function")
def fresh_cache():
return []
# Module scope - Shared across tests in same module
@pytest.fixture(scope="module")
def shared_resource():
# Use for expensive setup (database, etc.)
return expensive_resource()
# Session scope - Shared across entire test session
@pytest.fixture(scope="session")
def global_config():
return load_config()
Anti-Pattern 7: Overly Strict Assertions
❌ WRONG:
def test_error_message():
with pytest.raises(ValueError) as exc_info:
validate_input(-1)
# ❌ Too strict - breaks if message wording changes slightly
assert str(exc_info.value) == "Input must be positive integer greater than zero"
Why it’s wrong: Minor message changes break tests (e.g., “positive integer” → “positive int”).
✅ CORRECT:
def test_error_message():
with pytest.raises(ValueError, match=r"must be positive"): # ✅ Flexible regex
validate_input(-1)
Balance strictness:
- Too loose: with pytest.raises(Exception) - Catches ANY exception
- Too strict: Exact string match - Breaks on minor changes
- Just right: Regex match for key phrases
Anti-Pattern 8: Testing Built-in Functionality
❌ WRONG:
def test_list_append():
"""Test that list.append works."""
lst = [1, 2]
lst.append(3)
assert lst == [1, 2, 3] # ❌ Testing Python, not your code!
Why it’s wrong: You’re testing Python’s list implementation, not your code.
✅ CORRECT - Test YOUR code:
def test_data_processor_uses_append_correctly():
"""Test that DataProcessor adds items correctly."""
processor = DataProcessor()
processor.add_value(1)
processor.add_value(2)
# ✅ Testing YOUR code's behavior
assert processor.get_values() == [1, 2]
assert processor.count() == 2
Quick Reference: Anti-Patterns
| Anti-Pattern | Why Wrong | Fix |
|---|---|---|
| Return values | pytest ignores returns | Use assert |
| Looping test cases | Stops at first failure | Use @pytest.mark.parametrize |
| Assertions with side effects | Modifies state during check | Separate action from assertion |
| Over-mocking | Tests implementation, not behavior | Test state, mock only external deps |
| Non-deterministic tests | Flaky, hard to debug | Seed random, fix time, use deterministic data |
| Test pollution | Tests affect each other | Use fixtures for isolation |
| Overly strict | Breaks on minor changes | Use regex, check key properties only |
| Testing built-ins | Not testing your code | Test your code's behavior |
Quick Reference Tables
All pytest Assertion Helpers
| Helper | Purpose | Example |
|---|---|---|
| pytest.raises() | Test exceptions | with pytest.raises(ValueError, match=r"pattern"): |
| pytest.warns() | Test warnings | with pytest.warns(UserWarning, match=r"pattern"): |
| pytest.deprecated_call() | Test deprecations | with pytest.deprecated_call(): |
| pytest.approx() | Floating-point comparison | assert x == pytest.approx(expected, rel=1e-6) |
| pytest.fail() | Explicit failure | pytest.fail("Custom message") |
| pytest.skip() | Skip test | pytest.skip("Reason") |
| pytest.xfail() | Expected failure | pytest.xfail("Known bug") |
| pytest.importorskip() | Skip if import fails | plt = pytest.importorskip("matplotlib.pyplot") |
| @pytest.mark.parametrize | Run with multiple inputs | @pytest.mark.parametrize("x,y", [(1,2), (3,4)]) |
pytest.approx Default Tolerances
| Parameter | Default Value | Meaning |
|---|---|---|
| rel | 1e-6 | Relative tolerance (0.0001%) |
| abs | 1e-12 | Absolute tolerance |
| nan_ok | False | Allow NaN comparisons |
Tolerance formula: \(|\text{actual} - \text{expected}| \leq \max(\text{rel} \times |\text{expected}|, \text{abs})\)
Common Pytest Markers
| Marker | Purpose | Example |
|---|---|---|
| @pytest.mark.skip | Skip test | @pytest.mark.skip(reason="Not implemented") |
| @pytest.mark.skipif | Conditional skip | @pytest.mark.skipif(sys.platform == "win32", reason="Unix only") |
| @pytest.mark.xfail | Expected failure | @pytest.mark.xfail(reason="Known bug #123") |
| @pytest.mark.parametrize | Parametrize test | @pytest.mark.parametrize("x", [1, 2, 3]) |
| @pytest.mark.filterwarnings | Filter warnings | @pytest.mark.filterwarnings("ignore::DeprecationWarning") |
When to Use Each Pattern
| Scenario | Use This | Not This |
|---|---|---|
| Testing exceptions | pytest.raises(ValueError, match=r"...") | try/except blocks |
| Floating-point comparison | pytest.approx() | abs(x - y) < epsilon |
| Multiple test cases | @pytest.mark.parametrize | for loops in tests |
| Testing your code | Assert on results (state) | Mock internal functions (interaction) |
| Test isolation | Fixtures | Global variables |
| Platform-specific tests | @pytest.mark.skipif | if sys.platform ... in test |
| Expected failures | @pytest.mark.xfail | Commenting out tests |
Conclusion
You now have a comprehensive reference for pytest assertion patterns and best practices. Key takeaways:
Most Important Patterns
- Use match parameter - Validate exception messages: pytest.raises(ValueError, match=r"pattern")
- Use pytest.approx() - For all floating-point comparisons, not just scalars
- Parametrize tests - Use @pytest.mark.parametrize, not loops
- Test state, not interactions - Assert on results, not method calls
- Keep tests deterministic - Seed random, fix time, avoid shared state
Red Flags in LLM-Generated Tests
Watch for these anti-patterns:
- return statements in tests
- for loops over test cases
- Excessive mocking (especially of your own code)
- No match parameter in pytest.raises()
- Random/time-dependent data without seeding
When in Doubt
- Check pytest docs - https://docs.pytest.org/
- Look for existing patterns - Search your codebase for similar tests
- Ask: “Am I testing behavior or implementation?” - Test behavior
- Ask: “Is this test deterministic?” - Make it deterministic
- Ask: “Could this be parametrized?” - Probably yes
Next Steps
- Apply these patterns to your Road Profile Viewer tests
- Review existing tests for anti-patterns
- Refactor LLM-generated tests using these best practices
- Move to TDD (Chapter 03 (TDD and CI)) with solid assertion skills
Happy testing!
References
Official Documentation
- pytest Documentation: https://docs.pytest.org/
- pytest API Reference: https://docs.pytest.org/en/stable/reference/reference.html
- Assertion Introspection: https://docs.pytest.org/en/stable/how-to/assert.html
- Parametrization: https://docs.pytest.org/en/stable/how-to/parametrize.html
- Warning Capture: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
Python Standards
- IEEE 754 Floating-Point: https://en.wikipedia.org/wiki/IEEE_754
- PEP 654 - Exception Groups: https://peps.python.org/pep-0654/
- What Every Computer Scientist Should Know About Floating-Point: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
Best Practices and Articles
- Software Engineering at Google (O’Reilly) - Testing chapter
- Real Python: Effective Python Testing With Pytest: https://realpython.com/pytest-python-testing/
- NerdWallet Engineering: 5 Pytest Best Practices: https://www.nerdwallet.com/blog/engineering/5-pytest-best-practices/
Course Materials
- Chapter 03 (Testing Basics): Testing fundamentals, AAA pattern
- Chapter 03 (Boundary Analysis): Boundary value analysis, state testing, IEEE 754
- Chapter 03 (TDD and CI): Test-Driven Development (TDD) and CI/CD