One of the major quality metrics for test cases is coverage. We informally define test coverage, as exercised by our test suite, as the portion of the total (fixed) set of functionality for which our tests can reliably tell whether it works as intended. In other words, it is the percentage of functionality for which we can provide evidence that it works.
Unit tests are often tracked with code coverage tools such as gcov. With these, we can compare the lines of code (LOC) executed during tests against the total LOC, and consider unit test coverage good when the ratio approaches 100%.
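As an illustration, a coverage summary can be extracted from gcov's output after running the instrumented test binary. The following is a minimal sketch assuming a source file named foo.c built with gcc's -fprofile-arcs -ftest-coverage options; the exact output format varies between gcc versions.

    import re
    import subprocess

    # Run gcov on a source file after the instrumented tests have executed;
    # foo.c is a placeholder name.
    out = subprocess.run(["gcov", "foo.c"], capture_output=True, text=True).stdout

    # gcov prints a summary line such as "Lines executed:85.71% of 7".
    m = re.search(r"Lines executed:([\d.]+)% of (\d+)", out)
    if m:
        percent, total_loc = float(m.group(1)), int(m.group(2))
        print(f"{percent:.1f}% of {total_loc} lines executed")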
When testing the functionality of a component in a system, the picture is less clear. We can rarely call API methods one by one and claim full coverage when done. Instead, we need to identify situations where each function is applicable, enumerate different parameters for each possible state, and design a test case using this information. This may lead to situations where important functionality is only tested lightly at best, missing corner cases or risk areas.
For example, to test cellular voice call functionality using oFono, we need to make sure that the cellular modem is powered and registered to the network (which might also involve handling SIM PINs, for example). As test parameters, we decide which numbers to call (What numbers are valid? What happens if we call 911? Do we have a test network available?), and only then call the method to start the cellular call. Afterwards, we need to bring the system back to a known state, regardless of what happened during the test, so that it does not interfere with later tests.
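A rough sketch of this flow over oFono's D-Bus API might look as follows. The modem path and the dialed number are placeholders, and SIM PIN handling and network registration checks are omitted; treat this as an outline of the setup/test/cleanup structure rather than a complete test.

    import dbus

    bus = dbus.SystemBus()
    modem_path = "/phonesim"  # placeholder; normally discovered via org.ofono.Manager

    # Setup: power the modem and bring it online before the actual test.
    modem = dbus.Interface(bus.get_object("org.ofono", modem_path),
                           "org.ofono.Modem")
    modem.SetProperty("Powered", dbus.Boolean(True))
    modem.SetProperty("Online", dbus.Boolean(True))

    vcm = dbus.Interface(bus.get_object("org.ofono", modem_path),
                         "org.ofono.VoiceCallManager")
    try:
        # The functionality under test: start a call to a placeholder number.
        call_path = vcm.Dial("0123456", "")
    finally:
        # Cleanup: return to a known state regardless of what happened above.
        vcm.HangupAll()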
We generally want to separate the setup and cleanup parts of tests and keep them fixed, in order to limit the variation in test cases to a manageable level. In the example, we don't want to deal with registration details or modem states; instead, we focus on the details of the cellular call: addressing, what happens during the call, how it is disconnected, and so on. Of course, the setup and cleanup parts should also be tested, but this is best done separately if at all possible.
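With Python's unittest, this separation maps naturally onto fixtures. In the sketch below, ensure_modem_registered and hangup_all_calls are hypothetical helpers standing in for the fixed setup and cleanup steps.

    import unittest

    class VoiceCallTests(unittest.TestCase):
        def setUp(self):
            # Fixed setup, identical for every test case.
            ensure_modem_registered()

        def tearDown(self):
            # Fixed cleanup; runs even when the test body fails.
            hangup_all_calls()

        def test_dial_and_hangup(self):
            # Only the call-specific details vary between test cases.
            ...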
The techniques presented here are intended to help design functional tests, track coverage over the tested API, discover gaps in current test coverage, and present the overall status of the test suite to an audience.
Deciding what parts of the API should be tracked for test coverage can be tricky. Components with good testability will have documentation available for this work. The situation is not so easy with all components, though, especially when they are under heavy development. Here are some places to look for information about a component's interfaces:
Note that none of these methods is sufficient by itself without a good understanding of how the tested system works.
The situation these techniques deal with has some similarities to the difficulties encountered during black-box testing, but we should consider it a separate problem: the techniques presented here were developed mostly for analyzing open, but often incompletely documented, components.
Once a listing of the API functionality is available, we can build a matrix with test cases on one axis and API functions on the other, a simple task in a spreadsheet. For the following techniques, we keep the API functions on the top row and the test cases in the leftmost column.
For each test case, we mark each function it covers. This way, the total API coverage the suite actually tests can be measured from the column sums under each API function. We'll call this sort of table a coverage matrix.
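For instance, a coverage matrix can be kept as a simple table and its column sums computed mechanically. The function and test names below are illustrative only.

    # Rows are test cases, columns are API functions; a 1 marks a function
    # exercised by that test case.
    api = ["Dial", "Hangup", "SwapCalls"]
    matrix = {
        "test_basic_call":   {"Dial": 1, "Hangup": 1, "SwapCalls": 0},
        "test_call_waiting": {"Dial": 1, "Hangup": 1, "SwapCalls": 1},
    }

    # The column sum under each API function gives its coverage count.
    for func in api:
        hits = sum(row[func] for row in matrix.values())
        status = "UNCOVERED" if hits == 0 else f"covered by {hits} case(s)"
        print(f"{func}: {status}")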
It is recommended to record the version numbers of the tested component and of each test suite used, and to keep the whole document in a version control system.
One thing missing from the above picture is the parameters each tested function takes. This data should be tracked together with the test cases and APIs, so that changes in the component or in the test content can be followed in one place. Here are some general guidelines for how to track test parameters:
In practice, we often create test cases that take a set of values for each parameter as input, and generate sub-cases by permuting the variables in the test software (obviously, some care is needed here to avoid combinatorial explosion). This is not the only way to count test cases (some test designers prefer to explicitly list each tested parameter set as a test case), but it appears to work for us.
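A sketch of this permutation approach, using illustrative parameter values; itertools.product enumerates every combination, which is why the per-parameter value sets must be kept small.

    import itertools

    # Illustrative parameter sets for a dial test.
    numbers = ["0123456", "911"]
    hide_callerid = ["", "enabled"]

    # Each combination becomes one sub-case of the test.
    for number, hide in itertools.product(numbers, hide_callerid):
        print(f"sub-case: Dial({number!r}, {hide!r})")
        # ...run the actual dial test with these parameters...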
The above discussion often refers to test cases, but does not really describe what sort of set makes sense. Designing test cases for a component can be viewed as an iterative process in which test cases are added, combined, removed, or refined, repeated over the life of the component. The following lists some inputs to this process:
Further iterations over the test set should be done to identify redundant tests (but take care to remove only those tests whose covered functionality is fully exercised elsewhere).
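One mechanical aid for this, sketched below against the coverage matrix from the earlier example: a test is only a removal candidate if every function it covers is also exercised by some other test. A human should still review the candidates, since the matrix does not capture parameter-level differences.

    def redundant_candidates(matrix):
        """Return test names whose covered functions are all covered elsewhere."""
        candidates = []
        for name, row in matrix.items():
            covered = {f for f, hits in row.items() if hits}
            covered_by_others = set()
            for other, other_row in matrix.items():
                if other != name:
                    covered_by_others |= {f for f, hits in other_row.items() if hits}
            if covered and covered <= covered_by_others:
                candidates.append(name)
        return candidates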
A large quantity of material has been written about test techniques, but a wider literature overview is beyond the scope of this document.
Using a test coverage matrix like the above suggests several possible enhancements. Some ideas for future development are listed below.
(There will be links here to real coverage analysis documentation, once they are up on the wiki.)