How to Test Programs

by Troy Vasiga

Motivation

If I told you that I could control all of the landings and takeoffs at an international airport, you would invariably respond:

Prove it!

That is, you would not take my word for it: you would like some evidence that I can do what I claim to do. This is the underlying reason for testing a computer program: we want a certificate that the program (approximately) does what it is supposed to do. Moreover, we wish to prove to ourselves and others (e.g. managers, customers, TAs) that our program satisfies the requirements asked of it.

So, how do we go about proving our program matches the given specifications?

In short, we can't.

Alan Turing proved that no algorithm can determine, for an arbitrary program and input, whether the program will complete execution (halt) on that input (this is known as the "Halting Problem"). If even this basic question cannot be decided in general, we can't just pass off the testing to a grand "Testing" program. There is no panacea, no magic bullet: we are going to have to do the proving ourselves. If we throw up our hands and say "Forget testing: the program seems to work," there will be little confidence in our computer program, and the economic/academic repercussions are obvious.

Additionally, there are standards (ISO 9000 and European Union (EU) standards, for instance) that require software to meet certain levels of quality: the only way to meet these standards is through testing [Sanders and Curran, p. 44-66].

Views of Testing

Since it is evidence of correctness that we are looking for, we can view the testing of software as a jury or judge views evidence: the better the quality of the evidence (not necessarily the greater the quantity), the stronger the case that the program satisfies the requirements. As the programmer, we are the person on trial: innocent until proven guilty. As the tester, we are the accuser: we must provide substantial evidence that the program does not meet the specifications. If strong evidence cannot be presented to this end, the program is deemed to be "correct enough" and it is set free into the world.

This analogy leads us to view testing as a process of finding bugs: some academics (Beizer, Marick and others) feel the sole purpose of testing is to find bugs. Once the bugs are found (since there will always be bugs in a non-trivial system), testing gives way to the debugging phase: find the line(s) of code in error and correct them.

So, how do we test? In the next two sections, general techniques and problem-specific techniques of testing will be discussed.

General Techniques of Testing

There are two main methodologies of testing: white-box and black-box testing.

White-box testing examines the internal structure of a program and attempts to test each logical case. White-box testing can be thought of as "transparent" box testing: the tester can see and test a specific section of code. For instance, in white-box testing, an IF-THEN-ELSE statement would be tested with both a TRUE condition and a FALSE condition. Unfortunately, there are a few problems with white-box testing:

the tester often does not have access to the source code
the number of white-box test cases can grow exponentially (for n independent IF-THEN-ELSE statements, there are up to 2ⁿ different combinations of branch outcomes) [Myers, p. 9-10]
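
As a minimal sketch of the white-box idea, consider a hypothetical Python function classify (invented here for illustration, not taken from the cited texts) that contains two IF-THEN-ELSE statements. Covering every combination of TRUE and FALSE outcomes takes 2² = 4 test cases, and the count doubles with each additional independent statement:

```python
def classify(n):
    """Hypothetical function under test with two IF-THEN-ELSE statements."""
    if n < 0:
        sign = "negative"
    else:
        sign = "non-negative"
    if n % 2 == 0:
        parity = "even"
    else:
        parity = "odd"
    return sign, parity


# One input per combination of branch outcomes (first IF, second IF).
branch_tests = [
    (-2, ("negative", "even")),      # TRUE,  TRUE
    (-1, ("negative", "odd")),       # TRUE,  FALSE
    (4, ("non-negative", "even")),   # FALSE, TRUE
    (3, ("non-negative", "odd")),    # FALSE, FALSE
]


def run_branch_tests():
    """Return one pass/fail flag per test case."""
    return [classify(n) == expected for n, expected in branch_tests]


def combinations(decisions):
    """Upper bound on branch-outcome combinations for independent decisions."""
    return 2 ** decisions
```

With ten independent decisions, combinations(10) is already 1024, which is why exhaustive white-box testing quickly becomes impractical.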

These problems with white-box testing lead to the more practical black-box testing methodology. In black-box testing (also known as data-driven or input/output-driven testing), the tester views the program as a black box: the inner workings of the program are unknown. The main tool used in black-box testing is the specification of the program: that is, the tester attempts to determine what input causes the output of the program to differ from what the specifications require.

As a general rule within black-box testing, the tester should test the "good" input (e.g. a positive integer), the "bad" input (e.g. casual mistakes, such as 04 instead of the integer 4), and the "ugly" input (e.g. malicious mistakes, such as the string "Hello" instead of the integer 4). If you view "ugly" testing as unnecessary, and feel that "Garbage In, Garbage Out" (GIGO) should be the motto of testing, note that others would strongly disagree: for instance, Beizer states "[GIGO] is one of the worst cop-outs ever invented by the computer industry" [Beizer, p. 284]. If a program is designed to ensure that nuclear reactors run safely, and the user happens to type "1.0" instead of "1" (Garbage In), it would be disastrous to have a meltdown (Garbage Out). In summary, the motto of proper programming should be: "Garbage In, Nice-error-message Out."
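
The "good, bad, ugly" rule can be illustrated with a small Python sketch. The function parse_positive_integer and its error wording are assumptions made for this example; a real specification would dictate, for instance, whether "04" should be accepted or rejected (this sketch accepts it).

```python
def parse_positive_integer(text):
    """Hypothetical input routine: return a positive int or raise ValueError."""
    try:
        value = int(text.strip())
    except ValueError:
        # Garbage In, Nice-error-message Out.
        raise ValueError(f"expected a positive integer, got {text!r}") from None
    if value <= 0:
        raise ValueError(f"expected a positive integer, got {value}")
    return value


def report(text):
    """Describe what the routine does with one input, without crashing."""
    try:
        return f"ok: {parse_positive_integer(text)}"
    except ValueError as error:
        return f"error: {error}"


good_inputs = ["4", "123"]          # what the specification asks for
bad_inputs = ["04", " 4 "]          # casual mistakes, accepted in this sketch
ugly_inputs = ["Hello", "1.0", ""]  # must produce a message, never a crash
```

Calling report on each of the ugly inputs should yield a string beginning with "error:", which is the behaviour a black-box tester would check against the specification.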

You may be asking:

These methods are fine, but what if I have 100000 lines of code to test?

To answer this question, consider the following quote [Beizer, p. 7]:

"tests must be designed and tested: designed by a process no less rigorous and no less controlled than that used for code."

Hence, the processes used in creating a program (such as modularization, preconditions, postconditions, documentation, etc.) are necessary for testing. For example, consider the concept of modularization in terms of testing: to test a program with modules A and B, test

module A
module B
module A used in conjunction with module B

That is, after testing each module, the integration of the modules needs to be tested.
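
A minimal Python sketch of this three-step plan follows. The two "modules" (parse_numbers and average) are hypothetical stand-ins chosen for brevity; they are not from the source.

```python
def parse_numbers(line):
    """Module A (hypothetical): turn a line of text into a list of integers."""
    return [int(token) for token in line.split()]


def average(numbers):
    """Module B (hypothetical): arithmetic mean of a non-empty list."""
    return sum(numbers) / len(numbers)


def test_module_a():
    assert parse_numbers("1 2 3") == [1, 2, 3]
    assert parse_numbers("") == []


def test_module_b():
    assert average([1, 2, 3]) == 2


def test_a_with_b():
    # Integration: the output of module A is fed to module B.
    assert average(parse_numbers("1 2 3")) == 2
    # A may return an empty list, which B does not accept. Each module
    # passes its own tests; only the combined test exposes the mismatch.
    try:
        average(parse_numbers(""))
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected the empty-input mismatch to surface")
```

The last case shows why the third step matters: both modules behave as documented in isolation, yet their combination fails on empty input.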

Since programming is not a "blind" process (it should be somewhat deterministic), testing should be predictable as well. A tester should be able to determine the output before the test is run. If this is not the case, the code/specifications are not known well enough, and errors will go unnoticed, since the tester won't be able to realize when the "actual" output doesn't match the "specified" output.

Finally, since errors are detected in testing, it is strongly recommended that "test suites" be used. A test suite is a file of test cases which can be used as input for a program, and as such, can be run repeatedly to verify that an error has been fixed.
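
One possible shape for such a test suite, sketched in Python: each line of the suite holds an input and the output the tester predicted before running the program. The "input => expected" line format, the add_line program, and the use of io.StringIO in place of a real file are all assumptions made to keep the example self-contained.

```python
import io

# Stand-in for the contents of a test-suite file.
SUITE_TEXT = """\
2 3 => 5
-1 1 => 0
0 0 => 0
"""


def add_line(line):
    """Hypothetical program under test: sum the integers on one line."""
    return str(sum(int(token) for token in line.split()))


def run_suite(program, suite_file):
    """Run every case in suite_file; return a list of failure records."""
    failures = []
    for number, raw in enumerate(suite_file, start=1):
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        given, expected = (part.strip() for part in raw.split("=>"))
        actual = program(given)
        if actual != expected:
            failures.append((number, given, expected, actual))
    return failures


failures = run_suite(add_line, io.StringIO(SUITE_TEXT))
```

Because the suite is data rather than code, the same cases can be rerun unchanged after every fix, and an empty failures list indicates that all predicted outputs were matched.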

Specific Techniques of Testing

This section provides a small checklist of test considerations for specific types of programs (based on [Marick, Appendix A & B]). Note that these are by no means complete: for a given program, you may have to test all of these cases and more, depending on the specifications of your program. In addition, not all of these cases will be applicable to all of your programs.

Numerically based

good values of different types (e.g. positive, negative, zero)
boundary conditions
maximum, minimum
outside of max and min
gaps in domain (e.g. prime numbers, even numbers, etc.)
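
As an illustration of the boundary items in this checklist, here is a short Python sketch for a hypothetical routine that accepts percentages from 0 to 100 inclusive. The function name and the range are assumptions made for the example.

```python
def is_valid_percentage(n):
    """Hypothetical routine under test: accept integers from 0 to 100."""
    return 0 <= n <= 100


numeric_cases = [
    (50, True),    # typical good value
    (0, True),     # minimum
    (100, True),   # maximum
    (-1, False),   # just outside the minimum
    (101, False),  # just outside the maximum
]


def failed_numeric_cases():
    """Return the cases whose actual result differs from the expected one."""
    return [(n, want) for n, want in numeric_cases
            if is_valid_percentage(n) != want]
```

Off-by-one mistakes (writing < where <= was intended, for example) tend to show up only at the minimum and maximum rows, which is why those rows are listed explicitly.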

String based

delimiter problems (missing or too many)
mixed case (hello, Hello, HeLlo)
input is too long for string
input has white space or other delimiter
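
A small Python sketch of the string checklist, assuming a hypothetical parse_record routine that expects exactly three comma-separated fields and treats case and surrounding white space as insignificant:

```python
def parse_record(line, delimiter=",", field_count=3):
    """Hypothetical routine under test: split a line into normalized fields."""
    fields = [field.strip().lower() for field in line.split(delimiter)]
    if len(fields) != field_count:
        raise ValueError(
            f"expected {field_count} fields, got {len(fields)}")
    return fields


def outcome(line):
    """Return the parsed fields, or the error message for a rejected line."""
    try:
        return parse_record(line)
    except ValueError as error:
        return str(error)


string_cases = {
    "a,b,c": ["a", "b", "c"],                          # good input
    "a,b": "expected 3 fields, got 2",                 # missing delimiter
    "a,b,c,d": "expected 3 fields, got 4",             # too many delimiters
    "hello,Hello,HeLlo": ["hello", "hello", "hello"],  # mixed case
    " a , b ,c ": ["a", "b", "c"],                     # white space
}
```

Whether mixed case and padding should be normalized or rejected depends on the specification; the point is that each checklist item maps to at least one concrete case.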

File based

file exists and contains correct data
file exists but data is wrong type/format
file exists but is empty
file exists but is corrupt
file does not exist
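
The file checklist can be exercised with temporary files, as in the following Python sketch. The read_numbers routine, its error messages, and the file name input.txt are assumptions for illustration; the corrupt-file case is omitted because what counts as "corrupt" depends on the file format.

```python
import os
import tempfile


def read_numbers(path):
    """Hypothetical routine under test: read whitespace-separated integers."""
    try:
        with open(path, encoding="utf-8") as handle:
            tokens = handle.read().split()
    except FileNotFoundError:
        raise ValueError("file does not exist") from None
    if not tokens:
        raise ValueError("file is empty")
    try:
        return [int(token) for token in tokens]
    except ValueError:
        raise ValueError("file contains non-integer data") from None


def check_file_case(contents):
    """Create a file with the given contents (no file if None) and test it."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "input.txt")
        if contents is not None:
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(contents)
        try:
            return read_numbers(path)
        except ValueError as error:
            return str(error)
```

For example, check_file_case("1\n2\n3\n") exercises the first item in the list, while check_file_case(None) exercises the last.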

Logic based

Boolean values of 0/false, 1/true, and something else (e.g. 7/Hello)
ensure nested statements are tested thoroughly
case statements should test all conditions (including ELSE clause)

Loops

ensure the entry condition of the loop is true
are the exit values what is expected?
is the loop exited at the correct iteration?
loop body executes zero times, once, or multiple times
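
A brief Python sketch of the loop checklist, using a hypothetical function that counts the values appearing before the first negative number:

```python
def count_until_negative(values):
    """Hypothetical routine under test: count values before the first negative."""
    count = 0
    for value in values:
        if value < 0:
            break  # early exit: the remaining values are ignored
        count += 1
    return count


loop_cases = [
    ([], 0),             # loop body executes zero times
    ([5], 1),            # loop body executes once
    ([1, 2, 3], 3),      # loop body executes multiple times
    ([1, 2, -1, 4], 2),  # loop must exit at the third value
    ([-1, 1, 2], 0),     # loop must exit on the very first value
]


def failed_loop_cases():
    """Return the cases whose exit value differs from the expected one."""
    return [(values, want) for values, want in loop_cases
            if count_until_negative(values) != want]
```

The last two cases check both the exit value and the iteration at which the loop stops, since a loop that exits one iteration late would return 3 and 1 respectively.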

Data structures and pointers

ordered data structure

first element added/removed
middle element added/removed
last element added/removed

unordered data structure

empty structure
single item
multiple items
full structure
duplicate items

pointers

nil
pointer is not nil (i.e. points to object)
two pointers pointing to same object (e.g. pointers A and B point to object X)
pointer to a list of multiple objects
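
Several of the data structure and pointer cases can be sketched in Python, with the caveat that Python has object references rather than raw pointers, so None stands in for nil here. The helper names are hypothetical; bisect.insort is the standard-library routine for inserting into a sorted list.

```python
import bisect


def insert_sorted(items, value):
    """Insert value into the sorted list items, keeping it sorted."""
    bisect.insort(items, value)
    return items


def append_item(target, item):
    """Append item to target, rejecting None (the analogue of nil)."""
    if target is None:
        raise ValueError("target list is None")
    target.append(item)


ordered_cases = [
    ([10, 20, 30], 5, [5, 10, 20, 30]),    # added as first element
    ([10, 20, 30], 25, [10, 20, 25, 30]),  # added as middle element
    ([10, 20, 30], 35, [10, 20, 30, 35]),  # added as last element
    ([], 1, [1]),                          # empty structure
    ([10, 20], 10, [10, 10, 20]),          # duplicate item
]

# Two names referring to the same object: a change made through one
# must be visible through the other.
a = [1]
b = a
append_item(a, 2)
```

After this code runs, b is [1, 2] even though only a was passed to append_item, which is exactly the aliasing behaviour the "two pointers pointing to same object" item is meant to probe.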

Summary

Testing is a necessary stage in the software life cycle: it gives the programmer and user some sense of correctness, though never "proof" of correctness. With effective testing techniques, software is more easily debugged, less likely to "break," more "correct", and, in summary, better.

References

Beizer, B. (1990). Software Testing Techniques (2nd ed.). New York: Van Nostrand Reinhold.

DeMillo, R.A., McCracken, W.M., Martin, R.J., and Passafiume, J.F. (1987). Software Testing and Evaluation. Don Mills: Benjamin/Cummings.

Marick, B. (1995). The Craft of Software Testing: Subsystem testing including object-based and object-oriented testing. Englewood Cliffs: PTR Prentice Hall.

Myers, G. J. (1979). The Art of Software Testing. New York: John Wiley & Sons.

Sanders, J. and Curran, E. (1994). Software Quality: A Framework for Success in Software Development and Support. Don Mills: Addison-Wesley.