
03 Exercise: Test Quality Comparison


Introduction

Welcome to Testing Fundamentals: Test Quality Comparison! These exercises help you recognize the difference between well-written tests and poorly-written tests.

Learning Objectives:

  • Recognize descriptive test names versus meaningless ones
  • Identify the Arrange-Act-Assert structure in tests
  • Distinguish one-concept tests from multi-concept tests
  • Judge when multiple asserts in one test are appropriate
  • Spot test-isolation problems caused by shared state

Instructions:

  1. Study both the “Poor Quality” and “Good Quality” versions
  2. Identify ALL differences between the two versions
  3. Understand which clean code principles are being applied
  4. Consider how each improvement makes tests more maintainable
  5. Review the analysis to compare your observations

Difficulty: Intermediate
Time: 10-15 minutes per exercise
Prerequisites: Understanding of Chapter 03 (Testing Basics) concepts

Why This Exercise Matters: Well-written tests are your safety net. When you refactor code, good tests give you confidence. When tests fail, good tests tell you exactly what’s broken. Poor tests waste time and provide false security.


Exercise 1: Test Naming

Poor Quality

import numpy as np
from road_profile_viewer.geometry import find_intersection

def test_1():
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None

def test_2():
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    x, y, dist = find_intersection(x_road, y_road, 90.0)
    assert x is None

Good Quality

import numpy as np
from road_profile_viewer.geometry import find_intersection

def test_find_intersection_finds_intersection_for_downward_angle():
    """Test that downward angles successfully find road intersections."""
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    x, y, dist = find_intersection(x_road, y_road, -10.0)
    assert x is not None

def test_find_intersection_returns_none_for_vertical_angle():
    """Test that vertical angles (90°) return None gracefully."""
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    x, y, dist = find_intersection(x_road, y_road, 90.0)
    assert x is None

Questions

  1. What are the key differences between Poor and Good quality?
  2. What happens when tests fail in each version?
  3. Which principles from clean code for testing are being applied?

Analysis

Key Differences

Test Names:

  • Poor: test_1(), test_2() - meaningless
  • Good: test_find_intersection_finds_intersection_for_downward_angle() - descriptive
    • Follows pattern: test_[function]_[scenario]_[expected_behavior]()

Docstrings:

  • Poor: No docstrings
  • Good: Clear docstring explaining what’s being tested

Impact of Failure

Poor Quality - Unhelpful Output:

FAILED test_1
# Questions: What broke? Which function? Which scenario?
# You have to read the test code to understand what failed

Good Quality - Immediately Clear:

FAILED test_find_intersection_returns_none_for_vertical_angle
# Immediately clear: "Vertical angle handling is broken"
# No need to read code - test name tells you what's broken

Principles Applied

  1. Descriptive Test Names
    • Pattern: test_[function]_[scenario]_[expected]()
    • Communicates: What + How + Expected outcome
  2. Self-Documenting Tests
    • Test name + docstring make the test’s purpose crystal clear
    • Future developers (or you in 6 months) understand instantly
  3. Better Debugging Experience
    • When test fails, you know exactly what to investigate
    • No time wasted reading test code to understand what it tests

Real-World Impact

Imagine a project with 500 tests. When test_1() fails:

  • ❌ You waste 5-10 minutes finding and reading the test
  • ❌ Multiply by dozens of failures during refactoring
  • ❌ Result: Hours wasted

When test_find_intersection_returns_none_for_vertical_angle() fails:

  • ✅ You know immediately: “vertical angle handling is the issue”
  • ✅ Go directly to that code
  • ✅ Result: Fix in minutes

Key Takeaway: Descriptive names aren’t about being verbose - they’re about saving time.


Exercise 2: AAA Pattern

Poor Quality

def test_distance_calculation():
    x, y, dist = find_intersection(np.array([0, 10, 20, 30]), np.array([0, 2, 4, 6]), -10.0, 0.0, 10.0)
    assert dist > 0 and x is not None

Good Quality

def test_find_intersection_returns_positive_distance():
    """Test that distance is positive when intersection is found."""
    # Arrange: Set up test data
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    angle = -10.0
    camera_x = 0.0
    camera_y = 10.0

    # Act: Find intersection
    x, y, dist = find_intersection(x_road, y_road, angle, camera_x, camera_y)

    # Assert: Verify distance is positive
    assert dist > 0

Questions

  1. What structural differences make the Good version easier to read?
  2. Why is separating Arrange-Act-Assert important?
  3. What makes the assertion clearer in the Good version?

Analysis

Structural Differences

Poor Quality:

  • Everything crammed into one line
  • No variable names for clarity
  • Multiple assertions combined with and
  • No clear structure

Good Quality:

  • Arrange section: Test data clearly defined with meaningful variable names
  • Act section: Function call on its own line
  • Assert section: Single, clear assertion
  • Comments marking each section

Why AAA Structure Matters

1. Readability:

# Poor: What's being tested?
x, y, dist = find_intersection(np.array([0, 10, 20, 30]), ...)
assert dist > 0 and x is not None

# Good: Crystal clear what each part does
# Arrange: I'm setting up a simple upward road
# Act: I'm finding the intersection
# Assert: I'm checking distance is positive

2. Debugging: When test fails, you immediately know:

  • Arrange failed: Test setup is wrong
  • Act failed: Function crashed
  • Assert failed: Result doesn’t match expectations

3. Maintenance: Easy to modify each section independently:

# Need to test different angle? Change Arrange section only
# Arrange
angle = -45.0  # Changed from -10.0

# Act and Assert sections unchanged

Assertion Clarity

Poor:

assert dist > 0 and x is not None
# Which failed? Distance or x?
# Testing two things (bad practice)

Good:

assert dist > 0
# One thing tested: distance positivity
# If it fails, you know exactly what's wrong

Key Takeaway

The AAA pattern isn’t about following rules - it’s about making tests readable 6 months from now when you’ve forgotten what you were testing.


Exercise 3: One Concept Per Test

Poor Quality

def test_find_intersection_everything():
    """Test find_intersection with various inputs."""
    # Test normal angle
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])
    x1, y1, dist1 = find_intersection(x_road, y_road, -10.0)
    assert x1 is not None

    # Test vertical angle
    x2, y2, dist2 = find_intersection(x_road, y_road, 90.0)
    assert x2 is None

    # Test empty arrays
    x3, y3, dist3 = find_intersection(np.array([]), np.array([]), -10.0)
    assert x3 is None

    # Test horizontal angle
    x4, y4, dist4 = find_intersection(x_road, y_road, 0.0)
    assert x4 is not None

Good Quality

def test_find_intersection_finds_intersection_for_normal_angle():
    """Test that normal downward angles find intersections."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0)

    # Assert
    assert x is not None


def test_find_intersection_returns_none_for_vertical_angle():
    """Test that vertical angles return None gracefully."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, 90.0)

    # Assert
    assert x is None


def test_find_intersection_returns_none_for_empty_arrays():
    """Test that empty road arrays return None gracefully."""
    # Arrange
    x_road = np.array([])
    y_road = np.array([])

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0)

    # Assert
    assert x is None


def test_find_intersection_finds_intersection_for_horizontal_angle():
    """Test that horizontal angles (0°) find intersections."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, 0.0)

    # Assert
    assert x is not None

Questions

  1. What happens when test_find_intersection_everything() fails?
  2. What are the benefits of splitting tests?
  3. How does this affect debugging and maintenance?

Analysis

Problem with Multi-Concept Tests

When test_find_intersection_everything() fails:

FAILED test_find_intersection_everything
# Which scenario failed?
# - Normal angle?
# - Vertical angle?
# - Empty arrays?
# - Horizontal angle?
# You have to read the entire test to find out!

Worse: If the first assertion fails, pytest stops there. You don’t even know if the other scenarios work!

# If this fails:
assert x1 is not None  # ❌ Fails here

# You never test these:
assert x2 is None      # ❓ Unknown
assert x3 is None      # ❓ Unknown
assert x4 is not None  # ❓ Unknown

Benefits of One Concept Per Test

1. Clear Failure Messages:

# Poor quality - vague
FAILED test_find_intersection_everything

# Good quality - specific
FAILED test_find_intersection_returns_none_for_vertical_angle
FAILED test_find_intersection_returns_none_for_empty_arrays
# Now you know BOTH vertical angles AND empty arrays are broken!

2. Independent Testing: Each scenario is tested independently:

  • Vertical angle test fails → Doesn’t prevent empty array test from running
  • See all failures at once, not one at a time
  • Faster debugging (fix all issues in one go)

3. Selective Testing:

# Run just one scenario
$ pytest -k "vertical_angle"

# Run all except one scenario
$ pytest -k "not empty_arrays"

4. Better Git History:

# Poor - unclear what changed
commit: "fix test_find_intersection_everything"

# Good - clear what was added/fixed
commit: "add test for vertical angle handling"
commit: "fix empty array test"

Real-World Scenario

You refactor find_intersection() to handle empty arrays better.

With Poor Quality:

  1. Run test_find_intersection_everything()
  2. It fails (but you don’t know why until you read the whole test)
  3. Debug and fix
  4. Re-run entire test
  5. Repeat

With Good Quality:

  1. Run pytest -k "empty_arrays"
  2. Test fails with clear message: “empty arrays test failed”
  3. Debug and fix
  4. Re-run just that test (fast!)
  5. All other tests still passing (confidence!)

Key Takeaway

“Test one thing at a time” isn’t about being pedantic - it’s about:

  • Clear, specific failure messages
  • Independent test execution
  • Faster debugging
  • Better test organization

Exercise 4: Multiple Asserts - Appropriate vs Inappropriate

Example A: Inappropriate Use

def test_find_intersection_mixed_concerns():
    """Test various properties of find_intersection."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert - testing DIFFERENT concepts
    assert x > 0              # Concept 1: x is positive
    assert dist > 0           # Concept 2: distance is positive
    assert dist < 100         # Concept 3: distance is reasonable

Example B: Appropriate Use

def test_find_intersection_returns_positive_x():
    """Test that x coordinate is positive for downward angles."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert
    assert x > 0


def test_find_intersection_returns_valid_distance():
    """Test that distance is positive and reasonable."""
    # Arrange
    x_road = np.array([0, 10, 20, 30])
    y_road = np.array([0, 2, 4, 6])

    # Act
    x, y, dist = find_intersection(x_road, y_road, -10.0, 0.0, 10.0)

    # Assert - testing SAME concept (valid distance)
    assert dist > 0, "Distance should be positive"
    assert dist < 100, "Distance should be reasonable"

Questions

  1. Why is Example A inappropriate?
  2. Why is Example B’s use of multiple asserts acceptable?
  3. How do you decide when to split tests?

Analysis

Example A: The Problem

Testing THREE different concepts in one test:

assert x > 0        # Concept 1: "x coordinate is positive"
assert dist > 0     # Concept 2: "distance is positive"
assert dist < 100   # Concept 3: "distance is reasonable"

Why this is problematic:

1. Unclear failure:

FAILED test_find_intersection_mixed_concerns
# Which concept failed?
# - Is x wrong?
# - Is distance negative?
# - Is distance unreasonably large?
# You have to investigate!

2. Coupled concepts:

  • x coordinate and distance are independent properties
  • Testing them together creates unnecessary coupling
  • If distance calculation changes, this test might fail even if x is still correct

3. Hard to maintain:

  • Want to change distance validation logic? Have to modify this test
  • Want to add more x coordinate checks? Have to modify this test
  • Changes to one concept affect the other

Example B: Appropriate Multiple Asserts

First test - ONE concept (x positivity):

def test_find_intersection_returns_positive_x():
    assert x > 0  # Single concept: x is positive

Second test - ONE concept (distance validity):

def test_find_intersection_returns_valid_distance():
    assert dist > 0    # Both asserts test SAME concept:
    assert dist < 100  # "distance is valid"

Why multiple asserts here are OK:

  • Both asserts verify the same concept: “distance is valid”
  • They’re different aspects of the same thing (positive AND reasonable)
  • Failing either assert indicates the same problem: distance validation is broken

Decision Framework

Ask yourself: “What concept is this test verifying?”

Can describe in one sentence? → OK for multiple asserts

# "This test verifies that coordinates are within road bounds"
def test_coordinates_within_bounds():
    assert 0 <= x <= 30   # ✅ Same concept
    assert 0 <= y <= 6    # ✅ Same concept

Need multiple sentences? → Split tests

# "This test verifies x is positive AND distance is positive AND..."
# ❌ That's multiple concepts - split them!

Practical Example

Inappropriate - Mixed concerns:

def test_road_profile_generation():
    x, y = generate_road_profile()
    assert len(x) == 100         # Concept 1: Array length
    assert np.min(x) >= 0        # Concept 2: x bounds
    assert np.std(y) > 0.1       # Concept 3: y variation
    assert np.max(y) < 10        # Concept 4: y bounds

If this fails, what’s broken? Length? Bounds? Variation? No idea!

Appropriate - Focused concepts:

def test_road_profile_has_correct_length():
    x, y = generate_road_profile()
    assert len(x) == 100

def test_road_profile_x_within_bounds():
    x, y = generate_road_profile()
    assert np.min(x) >= 0    # Same concept: x bounds
    assert np.max(x) <= 100  # Same concept: x bounds

def test_road_profile_y_has_variation():
    x, y = generate_road_profile()
    assert np.std(y) > 0.1

def test_road_profile_y_within_bounds():
    x, y = generate_road_profile()
    assert np.min(y) >= -5   # Same concept: y bounds
    assert np.max(y) <= 10   # Same concept: y bounds

Now when a test fails: Test name tells you exactly which property is broken!

Key Takeaway

Multiple asserts are OK when:

  • ✅ Testing different aspects of the same concept
  • ✅ All asserts fail for the same underlying reason

Split into separate tests when:

  • ❌ Testing different concepts
  • ❌ Asserts could fail for independent reasons

Rule of thumb: If you can’t name the test with one concept, you’re testing too many things.


Exercise 5: Test Isolation

Poor Quality (Dependent Tests)

# Global state shared between tests
test_road_data = None

def test_generate_road_profile():
    """Test road profile generation."""
    global test_road_data
    test_road_data = generate_road_profile(num_points=100)
    assert test_road_data is not None

def test_road_profile_has_correct_length():
    """Test that generated road has correct length."""
    global test_road_data
    # Depends on test_generate_road_profile running first!
    assert len(test_road_data[0]) == 100

def test_road_profile_x_values():
    """Test that x values are correct."""
    global test_road_data
    # Also depends on test_generate_road_profile!
    assert np.max(test_road_data[0]) <= 80

Good Quality (Isolated Tests)

def test_generate_road_profile_returns_non_null():
    """Test that road profile generation returns data."""
    # Arrange
    num_points = 100

    # Act
    road_data = generate_road_profile(num_points=num_points)

    # Assert
    assert road_data is not None


def test_generate_road_profile_has_correct_length():
    """Test that generated road has correct number of points."""
    # Arrange
    num_points = 100

    # Act
    x_road, y_road = generate_road_profile(num_points=num_points)

    # Assert
    assert len(x_road) == 100
    assert len(y_road) == 100


def test_generate_road_profile_x_values_within_bounds():
    """Test that x values are within expected range."""
    # Arrange
    num_points = 100
    x_max = 80

    # Act
    x_road, y_road = generate_road_profile(num_points=num_points, x_max=x_max)

    # Assert
    assert np.min(x_road) >= 0
    assert np.max(x_road) <= x_max

Questions

  1. What happens if tests run in different order?
  2. What are the problems with global state?
  3. How do isolated tests improve reliability?

Analysis

Problems with Dependent Tests

1. Order Dependency:

# Poor quality - MUST run in this order:
test_generate_road_profile()        # Sets global state
test_road_profile_has_correct_length()  # Reads global state
test_road_profile_x_values()            # Reads global state

# What if pytest runs them differently?
$ pytest -k "x_values"  # ❌ Fails! Global state not set
$ pytest tests/ --reverse  # ❌ Fails! Wrong order

2. Debugging Nightmare:

# Run one test in isolation
$ pytest tests/test_road.py::test_road_profile_x_values

# ❌ FAILED: TypeError: 'NoneType' object is not subscriptable
# Why? Because test_generate_road_profile didn't run first!
# Test failure has nothing to do with the actual logic

3. Flaky Tests:

# Sometimes passes (when order is right)
$ pytest tests/  # ✅ PASSED (lucky!)

# Sometimes fails (when order is wrong)
$ pytest tests/ --random-order  # ❌ FAILED (unlucky!)

# Developers waste hours investigating "sometimes it fails"

4. Can’t Run in Parallel:

# Parallel execution fails
$ pytest -n 4  # ❌ All tests fail! Global state collision

Benefits of Isolated Tests

1. Run Anywhere, Anytime:

# ✅ Run one test
$ pytest tests/test_road.py::test_generate_road_profile_x_values_within_bounds
# Works perfectly!

# ✅ Run in random order
$ pytest tests/ --random-order
# All pass!

# ✅ Run in parallel
$ pytest -n 8
# Fast AND reliable!

2. Clear Test Failures:

# When test fails, you know it's the actual logic
def test_generate_road_profile_x_values_within_bounds():
    x_road, y_road = generate_road_profile(num_points=100, x_max=80)
    assert np.max(x_road) <= 80  # If this fails, x generation is broken

# Not because of:
# - Missing setup
# - Wrong test order
# - Global state corruption

3. Independent Debugging:

Each test is self-contained:

def test_generate_road_profile_x_values_within_bounds():
    # Everything needed is right here
    x_road, y_road = generate_road_profile(num_points=100, x_max=80)
    assert np.max(x_road) <= 80

# No need to:
# - Find setup function
# - Check global state
# - Understand test order

4. Maintenance:

# Want to delete a test? Just delete it!
# No worrying about:
# - "Does another test depend on this?"
# - "Will this break other tests?"

# Want to add a test? Just add it!
# No worrying about:
# - "What order should this run in?"
# - "Do I need to set up global state?"

How to Achieve Isolation

Use fixtures (pytest’s solution):

# Instead of global state, use fixtures
import pytest

@pytest.fixture
def sample_road():
    """Fixture that provides test road data."""
    return generate_road_profile(num_points=100, x_max=80)

def test_road_profile_length(sample_road):
    """Each test gets fresh road data."""
    x_road, y_road = sample_road
    assert len(x_road) == 100

def test_road_profile_x_values(sample_road):
    """Independent from other tests."""
    x_road, y_road = sample_road
    assert np.max(x_road) <= 80

Or simply: Create data in each test

# Simple approach - just create what you need
def test_road_profile_length():
    x_road, y_road = generate_road_profile(num_points=100)
    assert len(x_road) == 100

def test_road_profile_x_values():
    x_road, y_road = generate_road_profile(num_points=100, x_max=80)
    assert np.max(x_road) <= 80

“But that’s duplicate setup code!”

That’s OK! Benefits outweigh the duplication:

  • ✅ Tests are independent and reliable
  • ✅ Tests are self-documenting (all setup visible)
  • ✅ Tests can run in any order
  • ✅ Easy to understand each test in isolation

If setup is really complex, use fixtures. Otherwise, keep it simple.

Key Takeaway

Test isolation = test reliability

Each test should:

  • ✅ Set up its own data
  • ✅ Run independently
  • ✅ Clean up after itself (if needed)
  • ✅ Not affect other tests

Benefits:

  • Can run tests in any order
  • Can run tests in parallel (fast!)
  • Can debug tests in isolation
  • Failures are always meaningful

Remember: Flaky tests (sometimes pass, sometimes fail) are usually caused by lack of isolation.
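
The "clean up after itself" point is where yield fixtures shine: code before the yield is setup, code after it is teardown, and pytest runs the teardown even when the test fails. A minimal sketch (the results_file fixture and test are hypothetical, not part of the course project):

```python
import os
import tempfile

import pytest

@pytest.fixture
def results_file():
    # Arrange: create a temporary file this test owns
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    yield path
    # Teardown: runs after the test, even if it failed
    os.remove(path)

def test_can_write_results(results_file):
    # Act: write to the isolated file
    with open(results_file, "w") as f:
        f.write("x,y\n0,0\n")

    # Assert: the data round-trips
    with open(results_file) as f:
        assert f.read().startswith("x,y")
```

Because every test gets its own fresh file, tests stay isolated even when run in random order or in parallel.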


Summary and Key Principles

What You’ve Learned

After completing these exercises, you should be able to recognize:

  • ✅ Descriptive test names - test_[function]_[scenario]_[expected]()
  • ✅ AAA pattern - Clear Arrange-Act-Assert structure
  • ✅ One concept per test - Focused, clear purpose
  • ✅ Appropriate use of multiple asserts - Same concept, different aspects
  • ✅ Test isolation - Independent, reliable tests

The Five Principles of Good Tests

  1. Descriptive Names
    • Pattern: test_[function]_[scenario]_[expected_behavior]()
    • Why: Failing tests immediately tell you what’s broken
  2. AAA Structure
    • Arrange → Act → Assert
    • Why: Readable, debuggable, maintainable
  3. One Concept Per Test
    • Each test verifies one behavior
    • Why: Clear failures, independent testing
  4. Smart Use of Multiple Asserts
    • Multiple asserts OK for same concept
    • Split tests for different concepts
    • Why: Balance between focus and practicality
  5. Test Isolation
    • No shared state between tests
    • Each test sets up its own data
    • Why: Reliable, parallelizable, debuggable

Before vs After Impact

Before (Poor Quality Tests):

  • Vague names force you to read test code on every failure
  • One failing assert hides the results of the others
  • Shared state makes results depend on test order

After (Good Quality Tests):

  • Failing test names pinpoint the broken behavior
  • Tests run independently, in any order, even in parallel
  • Failures are fixed in minutes instead of hours

Checklist for Writing Tests

Use this checklist for every test you write:

  • Name follows test_[function]_[scenario]_[expected]()
  • Docstring states what is being tested
  • Arrange-Act-Assert sections are clearly separated
  • Test verifies exactly one concept
  • Test sets up its own data - no shared state

Practice Exercise

Your Turn!

Take this poorly-written test and improve it:

def test_stuff():
    data = [5, 15, 25, 35]
    result = double_values_above_threshold(data, 20)
    assert result == [50, 70] and len(result) == 2

Improvements to make:

  1. Give it a descriptive name
  2. Add AAA structure
  3. Split multiple concepts if needed
  4. Add a docstring
  5. Make assertions clear

Improved Version

def test_double_values_above_threshold_filters_and_doubles_values():
    """
    Test that double_values_above_threshold correctly filters values above
    threshold and doubles them.
    """
    # Arrange: Set up input data with values above and below threshold
    input_data = [5, 15, 25, 35]
    threshold = 20

    # Act: Process the data
    result = double_values_above_threshold(input_data, threshold)

    # Assert: Verify correct values are doubled
    expected = [50, 70]  # 25*2=50, 35*2=70 (5 and 15 filtered out)
    assert result == expected


def test_double_values_above_threshold_returns_correct_count():
    """Test that correct number of values pass the threshold."""
    # Arrange
    input_data = [5, 15, 25, 35]
    threshold = 20

    # Act
    result = double_values_above_threshold(input_data, threshold)

    # Assert
    assert len(result) == 2

Improvements made:

  1. ✅ Descriptive name explaining function + behavior
  2. ✅ Clear AAA structure with comments
  3. ✅ Split into two tests (values vs. count are different concepts)
  4. ✅ Docstrings added
  5. ✅ Clear assertions with expected value defined separately
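
For reference, here is a minimal implementation consistent with the expected values above (the real double_values_above_threshold in your project may differ):

```python
def double_values_above_threshold(values, threshold):
    """Return each value strictly above threshold, doubled."""
    return [value * 2 for value in values if value > threshold]

# Matches the practice exercise: only 25 and 35 exceed the threshold of 20
print(double_values_above_threshold([5, 15, 25, 35], 20))  # [50, 70]
```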

Next Steps

Keep Practicing

  1. Review your own tests - Apply these principles to tests you’ve written
  2. Code review practice - Review classmates’ tests with these principles
  3. Read pytest docs - Learn about fixtures and parametrize for advanced techniques
  4. Write tests first - Before fixing a bug, write a test that reproduces it
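
As a taste of parametrize: the four near-identical tests in Exercise 3 differ only in their inputs and expectations, and @pytest.mark.parametrize can express that in one test while still reporting each case separately. The sketch below uses a toy is_downward helper (hypothetical, not from the course project) so it runs standalone:

```python
import pytest

def is_downward(angle_deg):
    # Hypothetical helper: True when the viewing angle points below horizontal
    return angle_deg < 0

@pytest.mark.parametrize(
    "angle, expected",
    [
        (-10.0, True),   # normal downward angle
        (90.0, False),   # vertical angle
        (0.0, False),    # horizontal angle
    ],
)
def test_is_downward_classifies_angle(angle, expected):
    assert is_downward(angle) == expected
```

Each parameter set runs and fails independently, so you keep the one-concept-per-failure benefit without repeating the setup four times.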

What’s Coming

Chapter 03 (Boundary Analysis):

Chapter 03 (TDD and CI):

Resources


Feedback

Found these exercises helpful? Have suggestions? Reach out during office hours or via the course forum.

Remember: Writing good tests is a skill that improves with practice. Every test you write is an opportunity to apply these principles!

Happy testing! 🧪✨

© 2026 Dominik Mueller