Mon 21 July 2025
Strategic Testing
Are our tests good? Are they bad? Do we have enough tests?
Beyond writing isolated and targeted unit tests, there are methods that help ensure our tests are appropriate. This article covers some strategies that answer these questions, namely: measuring test coverage, applying mutation testing, writing fuzz tests and, finally, using test fixtures.
Test Coverage
We can measure the number of lines executed when running our tests. For example, the test suite in the following snippet never executes the final line, return False.
def is_odd(n: int) -> bool:
    if n % 2:
        return True
    return False

def test_is_odd():
    assert is_odd(7)
The ratio of lines executed to total executable lines gives us the test coverage metric.
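As a hedged sketch, one extra test (the name test_is_odd_even_number is ours) executes the remaining line and takes the snippet above to full coverage; coverage.py reports the metric when run with coverage run -m pytest followed by coverage report.

def test_is_odd_even_number():
    # executes the previously uncovered `return False` line
    assert not is_odd(8)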
Paired with context about how important each line of our code is, test coverage is helpful; as a performance metric or a blind target, however, it's quite useless.
If you've got 20% coverage and the code is critical to your business, then getting that to 80% is crucial. Trying to eke out an extra 0.1% of coverage when you're at 98% is fruitless, and a goal of going from 98% to 99% is a poor man's KPI.
Unless they're easy to add, you'll be trying to make very minor coverage gains for edge cases that might rarely be hit. At this point there is likely something more impactful to focus on.
It might be an interesting exercise to understand whether the code you're not covering is reachable during the program's lifetime. If not, the dead code should simply be removed instead of tested.
Mutation Testing
We rely on tests to ensure our code is correct and works as expected, but how do we ensure our tests are correct and work as expected? What tests our tests? Mutation testing aims to fill this gap.
How often have you written a passing test and then purposely made it fail, just to ensure the test is catching the case you intend it to catch? This forms the basis of mutation testing.
When applied, a mutation testing tool goes through the code under test and makes subtle changes, producing what are called mutants. Using is_odd as an example, it might change n % 2 to n % 3; elsewhere it might bump a constant from 7 to 8, remove a character from a string, or swap operators such as changing <= to <.
The test suite then runs as normal, with the expectation that it should fail against each mutant. If our suite still passes when n % 2 has been mutated to n % 3, something is wrong: the test isn't working as expected, which might indicate that we are mocking too many dependencies, we aren't being specific enough, or nothing is really being tested.
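To make this concrete, here is a hedged sketch of a surviving mutant; the name is_odd_mutant is ours, and n % 3 stands in for the kind of change a tool such as mutmut might make.

def is_odd_mutant(n: int) -> bool:
    if n % 3:  # mutated from `n % 2`
        return True
    return False

def test_is_odd():
    # still passes: 7 % 3 == 1 is truthy, so this mutant survives
    assert is_odd_mutant(7)

def test_is_odd_even_number():
    # kills the mutant: 8 % 3 == 2 is truthy, so is_odd_mutant(8)
    # wrongly returns True and the assertion fails
    assert not is_odd_mutant(8)

The surviving mutant in the first test tells us exactly which behaviour the original assert is_odd(7) fails to pin down.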
Fuzz Tests
Some software may receive user-provided or malformed data, and in these cases you don't want the system to behave irregularly. A developer might not know all the funky data that could be provided to a method upfront, so they may rely on writing fuzz tests.
As an example, if we had a method that expects a user-provided string, we can define a fuzz test which enumerates a data bank of known edge cases for strings, such as an emoji, an empty string or a large string of zero-width characters.
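A minimal sketch of that data bank approach, assuming a hypothetical normalise_username method as the code under test:

import pytest

EDGE_CASES = [
    "",                 # empty string
    "🙂",               # emoji
    "\u200b" * 10_000,  # a large string of zero-width characters
]

def normalise_username(raw: str) -> str:
    return raw.strip().lower()

@pytest.mark.parametrize("raw", EDGE_CASES)
def test_normalise_username_handles_funky_input(raw):
    # the property under test: funky input should never crash the method
    assert isinstance(normalise_username(raw), str)

Libraries such as Hypothesis take this further by generating the edge cases for you instead of relying on a hand-maintained bank.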
Test Fixtures
As the code base grows you might notice repeated lines of code that set up a user object or prepare data before passing it to the method under test.
Large projects get around this by defining test fixtures. These can be passed as parameters to our tests so that we know the setup a test requires before it runs. The benefit of keeping the fixture separate from the test is that it reduces the amount of code duplicated across tests, and if the setup for the user changes then only the fixture requires changing.
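A minimal pytest sketch, assuming a hypothetical User class; the fixture owns the setup and each test receives the finished object as a parameter:

import pytest

class User:
    def __init__(self, name: str, is_admin: bool = False):
        self.name = name
        self.is_admin = is_admin

@pytest.fixture
def user():
    # shared setup lives in one place; change it here, not in every test
    return User(name="alice")

def test_new_users_are_not_admins(user):
    assert not user.is_admin

def test_users_keep_their_name(user):
    assert user.name == "alice"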
Tests should be focused on asserting one thing, and fewer lines in a test make it easier to see what's going wrong when something breaks.
Finally
If you're printing it, maybe you should assert it.
Further Reading
The "Software Testing" series:
- 1: Assert
- 2: (here) Strategic Testing