
FEATURE

Jumping Into Automation Adventure with Your Eyes Open

by Mark Fewster

Automating the execution of tests is becoming more and more popular as the need to improve software quality amidst increasing system complexity becomes ever stronger. The appeal of having the computer run the tests in a fraction of the time it takes to perform them manually has led many organizations to attempt test automation without a clear understanding of all that is involved.

Consequently, many attempts have failed to achieve real or lasting benefits. This paper highlights a few of the more common mistakes that have contributed to these failures and offers some thoughts on how they may be avoided.

Confusing Automation and Testing
Testing is a skill. While this may come as a surprise to some people, it is a simple fact. For any system there is an astronomical number of possible test cases, although we have time to run only a very small number of them. Yet this small number of test cases is expected to find most of the bugs in the software, so the job of selecting which test cases to build and run is an important one. Both experiment and experience have told us that selecting test cases at random is not an effective approach to testing. A more thoughtful approach is required if good test cases are to be developed.

What exactly is a good test case? Well, there are four attributes that describe the quality of a test case. Perhaps the most important of these is its effectiveness: whether or not it finds bugs, or at least whether or not it is likely to find bugs. Another attribute reflects how much the test case does. A good test case should be exemplary, testing more than one thing and thereby reducing the total number of test cases required. The other two attributes are both cost considerations: how economical a test case is to perform, analyze and debug; and how evolvable it is, or how much maintenance effort is required on the test case each time the software changes.

Figure 1 depicts the four quality attributes of a test case in a Kiviat diagram and compares the likely measures of each on the same test case when it is performed manually (shown as an interactive test in the figure) and after it has been automated.

These four attributes must often be balanced one against another. For example, a single test case that tests a lot of things is likely to cost a lot to perform, analyze and debug. It may also require a lot of maintenance each time the software changes. Consequently, a high measure on the exemplary scale is likely to result in low measures on the economic and evolvable scales. As this demonstrates, testing is indeed a skill. Not only must testers ensure that the test cases they use are going to find a high proportion of the bugs, but they must also ensure that the test cases are well designed in order to avoid excessive costs.

Automating tests is another skill requiring a different approach and extensive effort. For most organizations, it is expensive to automate a test compared with the cost of performing it once manually. In order to achieve a respectable return on investment (ROI), they have to automate with the understanding that each test will need to be performed many times throughout its useful life.

Whether a test is automated or performed manually affects neither its effectiveness nor how exemplary it is. It doesn’t matter how clever you are at automating a test or how well you do it, if the test itself is unable to detect or confirm the absence of a bug. In such an instance, automation only accelerates failure. Automating a test affects only how economic and evolvable it is. Once implemented, an automated test is generally much more economic; the cost of running it is a mere fraction of the effort to perform it manually. However, automated tests generally cost more to create and maintain. The better the approach to automating tests, the cheaper it will be to implement new automated

March 2002 http://www.testinginstitute.com Journal of Software Testing Professionals 5


tests in the long term. Similarly, if no thought is given to maintenance when tests are automated, updating an entire automated test suite can cost as much as, if not more than, performing all the tests manually. See FEWS99 for case histories.

[Figure 1: three Kiviat diagrams (1a, 1b, 1c), each with four axes: Effective, Economic, Evolvable, Exemplary]

Figure 1a The ‘goodness’ of a test case can be illustrated by considering the four attributes in this Kiviat diagram. The greater the measure of each attribute, the greater the area enclosed by the joining lines and the better the test case. This shows a ‘good’ manual test case.

Figure 1b When the ‘good’ manual test case is automated, its measures of goodness change as shown by the broken line. While the test case is equally effective (it has the potential to find the same faults as the manual test case), it proves to be more economic but less evolvable.

Figure 1c Although an automated test is more economic than the same test case performed manually, after it has been executed for the first time it is much less economic since it has cost more to automate it (shown by the dotted line in this diagram). It is important to understand that the economic benefits of automation will be achieved only after the automated tests have been used a number of times.

For an effective and efficient automated set of tests (tests that have a low cost but a high probability of finding bugs) you have to start with the raw ingredient of a good test set, a set of tests skilfully designed by a tester to exercise the most important things. You then have to apply automation skills to automate the tests in such a way that they can be created and maintained at a reasonable cost.

Believe Capture/Replay = Automation
Capture/replay technology is indeed a useful part of test automation but it is only a very small part of it. The ability to capture all the keystrokes and mouse movements a tester makes is an enticing proposition, particularly when these exact keystrokes and mouse movements can be replayed by the tool time and time again. The test tool records the information in a file called an automated test script. When it is replayed, the tool reads the script and passes the same inputs on to the software under test (SUT), which usually has no idea it is a tool controlling it rather than a real person sitting at a computer. In addition, the test tool generates a log file, recording precise information on when the replay was performed and perhaps some details of the machine. Figure 2 depicts the replay of a single test case.

For many people this is all that is required to automate tests. After all, what else is there to testing but entering a whole series of inputs? However, merely replaying the captured input to the SUT does not amount to performing a complete test.

There is no verification of the results. How will we know if the software generated the same outputs? If testers are required to sit and watch each test being replayed, they might as well have been typing the inputs in, since they are unlikely to be able to keep up with the progress of the tool, particularly if it is a long test. It is necessary for the tool to perform some checking of the output from the application to determine that its behavior is the same as when the inputs were first recorded. This implies that as well as recording the inputs, the tool must record at least some of the output from the SUT. But which particular outputs should be recorded? How often should the outputs be recorded? Which characteristics of the output should be recorded? These are questions that have to be answered by the tester as the inputs are captured or possibly (depending on the particular test tool in use) during a replay.

Alternatively, the testers may prefer to edit the script, inserting the required instructions for the tool to perform comparisons between the actual output from the SUT and the expected output now determined by the tester. This presupposes that the tester will be able to understand the script sufficiently well to make the right changes in the right places. It also assumes that the tester will know exactly what instructions to edit in the script, their precise syntax, and how to specify the expected output.

In either approach, the tests themselves may not end up as particularly good tests.



Even if it was thought out carefully at the start, the omission of just one important comparison, or the inclusion of one unnecessary or erroneous comparison, can destroy a good test. Such tests may never spot that important bug or may repeatedly fail good software.

[Figure 2 diagram labels: Test script (test input); SUT showing a Main Menu with 1. Generate report, 2. Edit report definition, 3. Utilities, 4. Exit; Log (audit trail from tool)]

Figure 2: Capture/Replay of a Single Test Case

Scripts generated by testing tools are usually not very readable. Will the whole series of individual actions really convey what has been going on and where comparison instructions are to be inserted? Scripts are written in a programming language so anyone editing them has to have some understanding of programming. Also, while it may be possible for the person who has just recorded the script to understand it immediately afterwards, after some time has elapsed or for anyone else this may be more difficult.

Even if the comparison instructions are inserted by the tool under the tester’s control, the script is likely to need editing at some stage. This is most likely to occur when the SUT changes. A new field here, a new window there, will soon cause untold misery for testers who then have to review each script looking for the places that need updating. Of course, the scripts could be re-recorded but this defeats the objective of recording them in the first place.

Recording test cases that are performed once manually so they can be replayed is a low cost way of starting test automation. That is probably why it is so appealing to those who opt for this approach. The cost of maintaining automated scripts created in this way becomes prohibitive as soon as the software changes. If we are to minimise maintenance costs, it is necessary to invest more effort up front implementing automated scripts. Figure 3 depicts this in the form of a graph.

[Figure 3 graph labels: Cost (vertical axis); Simple implementation to Sophisticated implementation (horizontal axis); curves: Effort to implement, Maintenance cost]

Figure 3: The cost of test maintenance is related to the cost of test implementation. It is necessary to spend time building the test in order to avoid high maintenance costs later on.

Verify Only Screen Based Information
Testers are often seen in front of a computer screen so it is perhaps natural to assume that only the output to the screen by the SUT is checked. This view is further strengthened by many of the testing tools that make it particularly easy to check information that appears on the screen both during and after a test has been executed. However, this assumes that a correct screen display indicates success, but it is often the output that ends up elsewhere (in an output file or a database, for example) that is more important. Just because information appears on the screen correctly, it does not always guarantee that it will be recorded elsewhere correctly.

For good testing it is often necessary to check these other outputs from the SUT. Perhaps not only the files and database records that have been created and changed, but also those that have not been changed and those that have (or at least should have) been deleted or removed. Checking some of these aspects of the outcome of a test (rather than merely the output) will make tests more sensitive to unexpected changes and help ensure that more bugs are found.

Without a good mechanism to enable comparison of results other than those that appear on the screen, tests that undertake these comparisons can become very complex and unwieldy. A common solution is to have the information presented on the



screen after the test has completed. This is the subject of the next common mistake.

Use Only Screen Based Comparison
Many testing tools make screen based comparisons very easy indeed. It is a simple matter of capturing the display on a screen or a portion of it and instructing the tool to make the same capture at the same point in the test and compare the result with the original version. As described at the end of the previous common mistake, this can easily be used to compare information that did not originally appear on the screen but was a part of the overall outcome of the test.

However, the amount of information in files and databases is often huge and to display it all on the screen one page at a time is usually impractical if not impossible. Thus, compromise sets in. Because it becomes so difficult to do, little comparison of the tests’ true outcome is performed. Where a tester does labour long and hard to ensure that the important information is checked, the test becomes complex and unwieldy once again, and worse, very sensitive to a wide range of changes that frequently occur with each new release of the SUT. Of course, this in turn adversely impacts the maintenance costs for the test.

In one case, I came across a situation where a PC-based tool vendor had struggled long and hard to perform a comparison of a large file generated on a mainframe computer. The file was brought down to the PC one page at a time where the tool then performed a comparison with the original version. It turned out that the file comprised records that exceeded the maximum record length that the tool could handle. This, together with the length of time the whole process took, caused the automated comparison of this file to be abandoned.

In this case, and many others like it, it would have been relatively simple to invoke a comparison process on the mainframe computer to compare the whole file (or just a part of it) in one pass. This would have been completed in a matter of seconds (compared with something exceeding an hour when downloaded to the PC).

Let Testware Architecture Evolve Naturally
Like a number of other common mistakes, this one isn’t made through a deliberate decision (by choice); rather, it is made through a lack of understanding. The problem that is commonly and unwittingly ignored is not having a consistent and well organised home for all the data files, databases, scripts, expected results, etc. Everything that makes up the tests and is required to run them, the results from their execution, and other information comprise the ‘testware’. Where and how these artefacts are stored (e.g. grouped by test case, grouped by artefact type, or not grouped at all) is called the testware architecture.

There are three key issues to address: scale, re-use, and multiple versions. Scale is simply the number of things that comprise the testware. For any one test there can be several (10, 15 or even 20) things (files) that are unique (files and records containing test input, test data, scripts, expected results, actual results and differences, log files, audit trails and reports). Figure 4 depicts one such test case.

[Figure 4 diagram labels: Script (ascii), Script (binary), Input Data, Expected Screen Data, Captured Screen Data, Diffs, Log, Accounts Report]

Figure 4 Executing a single test inevitably results in a large number of different files and types of information, all of which have to be stored somewhere. Configuration management is essential for efficient test automation.

Re-use is an important consideration for efficient automation. The ability to share scripts and test data not only reduces the effort required to build new tests but also reduces the effort required for maintenance. But re-use will only be possible if testers can easily (and quickly) find out what there is to re-use, quickly locate it, and understand how to use it. I’m told that a programmer will spend no more than 2 minutes looking for a re-useable function before he or she will give up and write their own. I’m sure this behavior applies to testers and that it may be a lot less than 2 minutes. Of course, while test automation is implemented by only one or two people this will not be much of a problem, at least while those people remain on the automation team. But once more people become involved, either on the same project or on other projects, the need for a more formal testware architecture (indeed a standard / common architecture) becomes much greater.
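As a rough illustration of the testware architecture idea, the Python sketch below assumes one possible layout: artefacts grouped by test case, with a fixed set of sub-directories in each. The directory names, the function names create_test_case and find_artefacts, and the .scr file extension are hypothetical choices made for this example, not anything prescribed by the article or by a particular tool. The point is only that a consistent home for every artefact makes it quick to find what already exists and could be re-used.

```python
from pathlib import Path

# Hypothetical artefact kinds, loosely based on the items the article lists.
ARTEFACT_KINDS = ("scripts", "input_data", "expected", "actual", "logs")


def create_test_case(root, name):
    """Create one directory per test case, with one sub-directory per artefact kind."""
    case_dir = Path(root) / name
    for kind in ARTEFACT_KINDS:
        (case_dir / kind).mkdir(parents=True, exist_ok=True)
    return case_dir


def find_artefacts(root, kind, pattern="*"):
    """Return every file of one artefact kind across all test cases, sorted by path."""
    if kind not in ARTEFACT_KINDS:
        raise ValueError(f"unknown artefact kind: {kind}")
    return sorted(p for p in Path(root).glob(f"*/{kind}/{pattern}") if p.is_file())
```

With a layout like this, a tester looking for an existing script could call find_artefacts(root, "scripts", "*.scr") rather than browsing directories by hand. Grouping by artefact type instead of by test case would be an equally valid choice; what matters is that the choice is made deliberately and applied consistently.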
Multiple versions can be a real problem in environments where previous versions of software have to be supported while a new version is being prepared. When an emergency bug fix is undertaken, we would like to run as many of our automated tests as seems appropriate to ensure that the bug fix has not had any adverse effects on the rest of the software. But if we have had to change our tests to make them compatible with the new version of the software, this will not be possible unless we have saved the old versions of the tests. Of course the problem becomes even worse if we have to manage more than one older version or software system.

If we have only a few automated tests it will be practical to simply copy the whole set of automated tests for each new version of the software. Bug fixes to the tests themselves may then have to be repeated across two or more sets but this should be a relatively rare occurrence. However, if we have a large number of tests this approach soon becomes impractical. In this case, we have to look to configuration management for an effective answer.

Trying to Automate Too Much
There are two aspects to this common mistake: automating too much too soon, and automating too much, full stop. Automating too much too soon leaves you with a lot of poorly automated tests that are difficult (and therefore costly) to maintain. It is much better to start small. Identify a few good, but diverse, tests (say 10 or 20 tests, or 2 to 3 hours’ worth of interactive testing) and automate them on an old (stable) version of software, perhaps a number of times, exploring different techniques and approaches. The aim here should be to find out just what the tool can do and how different tests can best be automated, taking into account the end quality of the automation (that is, how easy it is to implement, analyze, and maintain). Next, run the tests on a later (but still stable) version of the software to explore the test maintenance issues. This may cause you to look for different ways of implementing automated tests that avoid or at least reduce some of the maintenance costs. Then run the tests on an unstable version of the software so you can learn what is involved in analysing failures and explore further implementation enhancements to make this task easier and therefore reduce the analysis effort.

The other aspect, that of automating too much, full stop, may at first seem unlikely. Intuitively, the more tests that are automated the better. But this may not be the case. Continually adding more and more automated tests can result in unnecessary duplication, redundancy, and/or a cumulative maintenance cost. James Bach has an excellent way of describing this [BACH97]. James points out that eventually the test suite will take on a life of its own: testers will depart, new testers will arrive and the test suite grows ever larger. Nobody will know exactly what all the tests do and nobody will be willing to remove any of them, just in case they are important. In this situation many inappropriate tests will be automated as automation becomes an end in itself. People will automate tests because “that’s what we do here - automate tests” regardless of the relative benefits of doing so.

James Bach [BACH97] reports a case history in which it was discovered that 80% of the bugs found by testing were found by manual tests and not the automated tests, despite the fact that the automated tests had been developed over a number of years and formed a large part of the testing that took place. A sobering thought indeed.

Automating the Wrong Tests
Not every test case should be automated because the benefit of automating some tests is outweighed by the cost of doing so. Indeed, some test cases cannot be automated but this fact does not stop some people trying (at high cost but with no benefit gained). Once we have some experience of automating tests it will be possible to estimate reasonably well the time it will take to automate a particular test. A crude but adequate measure of the likely savings can be calculated by multiplying the manual test effort by the number of times it is likely to be run.

The decision as to which test cases to automate, and which of these to automate first, has to be based on the potential pay back. That is, the extent of the benefits gained by automating one test case compared with the benefits gained by automating a different test case.

The characteristics that would make a test case a likely candidate for test automation are given below.

• It will be run many times. Clearly the more times a test case can be usefully run the more beneficial an automated version of it will be. Regression tests (those that will be run every time a new version of software is created) are particularly suitable for automation.

• It is mundane (but important). An example is input verification tests such as checking that an edit box accepts only the valid range of values. These tests are by nature uninteresting and therefore error prone to perform but are nevertheless important to have done.

• It is expensive to perform manually. Examples are multi-user tests or tests that take a long time to run.

• It is difficult to perform manually. For example, test cases that are timing critical or are particularly complex to perform.

• It requires special knowledge. Test cases that require people with particular knowledge (business or system knowledge) can be good candidates for automation since more or less anyone can perform a test that has been automated.

The characteristics that would make a test case an unlikely candidate for test automation are given below.



• It will not be run many times. If a test case is not run often because there is no need to run it often (rather than because it is too expensive to run often manually) then it should not be automated.

• It is not important. By definition, test cases that are not important will not find important bugs. If a bug is not important it doesn’t seem sensible to invest a lot of effort into finding it, particularly if such bugs are not going to be fixed.

• It is a usability test. These cannot be automated since usability is a human interaction issue.

• It is hard to automate. Test cases that will take a lot of effort to automate are generally not worth automating. For example, if it were to take a few days to find a solution to a technical problem that prevented a test case from being automated, it would only be worthwhile doing if the cost could be recouped.

Two further considerations are:

• How expensive will it be to maintain the automated test case? Even if a test case is simple to automate it may be vulnerable to changes in the software. This would make it costly to maintain and therefore less attractive to automate since the benefits of automating it may be wiped out or at least severely curtailed by the maintenance cost involved in updating it with each new version of software.

• How much value will this add to the existing automated test cases? Although a test case may in itself offer a lot of value, if it duplicates a part or all of an existing test case then the value of the new one is much reduced.

Where full automation is not warranted, consider partial automation. For example, it may be difficult to automate the execution of a particular complex test case but it may be possible and beneficial to automate the comparison of some of the results of the test case with the expected results. Conversely, where the execution of a test case cannot be automated (say for technical reasons) it may be possible and beneficial to automate some parts of it (such as data preparation and clear-up).

Conclusion
Appreciating that automation is a separate task from testing is important for successful test automation. Automation is neither easy nor straightforward; it has to be worked at and is rarely successful when undertaken as an incidental task. If insufficient resources are dedicated to automation, it will not deliver the significant benefits that are possible. Simple approaches to automation like capture/replay are a low cost way of starting automation but then incur a high maintenance cost. More sophisticated approaches cost more in time and effort to start with but incur only a fraction of the maintenance costs of the simple approaches.

When automating testing, automating the execution of tests is only part of the job. Verifying correctly that test cases passed or failed requires a number of important decisions to be made as to how often checks are to be made, what is to be checked, and how much of it is to be checked. If the wrong choices are taken, good tests can easily be compromised.

Another lesson many organizations have learnt the hard way concerns testware architecture, the structure of the testware, the things we use and create when testing (such as scripts, data, expected results, etc.). A good architecture will encourage reuse (thereby reducing automated test build and maintenance costs) and be easier to work with (resulting in fewer errors being made when working with automated tests).

When starting test automation, there is a huge learning curve to climb and it is best not to automate a lot of test cases to start with since they are not likely to be as good as the ones we automate later on after we have learnt more about good practices. It is better to focus on relatively few tests, trying out different implementations and assessing their relative strengths and weaknesses before attempting to automate large numbers of tests.

Good test automation does take time and effort and where time is limited it is particularly important that success-threatening problems be avoided since there will be less time to back track and have another go. There are many pitfalls that impair or destroy well-intentioned attempts to automate testing. Knowledge of the most common ones should help organizations steer away from them and will hopefully help make them vigilant as to other problems that may similarly compromise test automation efforts.

About the Author
With over 20 years in the software industry, Mark has held posts from programmer to development manager before joining Grove Consultants in 1993. He provides consultancy and training in software testing, particularly in the application of testing techniques and test automation. Mark serves on the committee of the British Computer Society’s Specialist Interest Group in Software Testing (BCS SIGIST) and has been a member of the Information Systems Examination Board (ISEB) developing a qualification scheme for testing professionals. He has co-authored the book “Software Test Automation” with Dorothy Graham.

References

BACH97 James Bach, “Test Automation Snake Oil”, presented at the 14th International Conference on Testing Computer Software, Washington, USA.

FEWS99 Mark Fewster & Dorothy Graham, “Software Test Automation”, published by Addison-Wesley, 1999.

