Many of the test-takers take the test every month; some of them have
taken the test around ten times and have been given the same set of
speaking material each time. Moreover, at many test sites the
administration is not strict enough, so those who have taken the test
share information with those taking it later. All these problems have,
in one way or another, affected this characteristic of the test.
Test validity refers to the extent to which a test achieves the purpose it
sets out to achieve. If the test fails to achieve this aim, it does not satisfy
the quality of validity. According to Heaton (1988), test validity was
traditionally subdivided into four categories: content, criterion-related,
empirical, and construct validity. When designing a test, it is important to
keep this quality in mind. The following scenario shows how test
validity can be violated and how the situation can be changed to
satisfy this quality:
A listening task consists of a summary with numbered gaps, which
students fill in while listening to a recording. The content of the
recording is familiar to some students sitting the test, and as a result,
they can come up with answers to the gaps from their own knowledge.
Even though these answers are not exactly the same as the answer key,
they are acceptable. In this case, the test no longer measures listening
alone, so the test designer has violated the quality of validity. To improve
this situation, the instructions should clearly state that the numbered
spaces must be filled in with words from the recording.
In short, there is a close link between test reliability and test validity, as
stated by Bachman and Palmer (1996, p. 29): "The two
measurement qualities, reliability and construct validity, are thus
essential to the usefulness of any language test. Reliability is the
necessary condition for construct validity, and hence for usefulness.
However, reliability is not a sufficient condition for either construct
validity or usefulness."
Authenticity comes next. It is an important quality because it shows the
relationship between the test and the real world. The term can be
viewed in two senses. In the broader sense, authenticity refers to the
use of real-life materials, such as recordings taken from news reports
or TV interviews, or reading texts taken from newspapers, magazines,
or brochures. In the narrower sense, it refers to how closely the tasks
in a language test correspond to target language use in specific
domains beyond the test itself. It is therefore important to take into
consideration both the target language use (TLU) and the test tasks
when a test is designed.
Another quality which is closely linked to the above-mentioned
characteristic is test interactiveness. It is defined as test-takers'
reactions to the test given to them. Their reactions can be positive or
Finally, teachers can also use tests to improve their teaching. Good
teachers would look at areas where their students need to practice
more. This leads to the inevitable situation of teaching to the test as
mentioned above. For example, after a test has been taken, the teachers
may see that their students still have problems with the use of
relative pronouns in their writing, and that this particular grammar point
frequently appears in tests. The teachers can then adjust their lesson
plans so that next time their students can deal successfully with this
problem area.
The last component of test usefulness is test practicality. According to
Bachman and Palmer (1996, p. 36), practicality is the relationship
between the resources required for the design, development, and use
of the test and the resources that are available. Although it is
mentioned last among the six qualities, it is no less important than
the others.
For example, consider a school with eight English teachers and twenty
classes of forty students. This school wants to conduct a speaking test
and wants to ensure its reliability by assigning two teachers to each
test room and having students take the test one by one. On the face of
it, there is no problem, because there is an even number of teachers
and no teacher has to work harder than any other. However, with eight
hundred students in total and only four pairs of teachers, each pair
would have to examine two hundred students. This is no easy task.
Tiredness will certainly affect the teachers' marking, so this is not a
practical way to conduct a speaking test.
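The workload in this example can be made concrete with a short calculation. The following is only an illustrative sketch: the figures for classes, students, and teachers come from the example above, while the five minutes per student and the function name `speaking_test_load` are assumptions introduced here, not part of the original scenario.

```python
import math


def speaking_test_load(classes, students_per_class, teachers,
                       examiners_per_room, minutes_per_student):
    """Estimate the examining load for a one-by-one speaking test."""
    students = classes * students_per_class
    # Each test room needs a full team of examiners.
    rooms = teachers // examiners_per_room
    # Students are shared out as evenly as possible between rooms.
    students_per_room = math.ceil(students / rooms)
    hours_per_room = students_per_room * minutes_per_student / 60
    return students, rooms, students_per_room, hours_per_room


# Figures from the example; minutes_per_student=5 is an assumed value.
students, rooms, per_room, hours = speaking_test_load(
    classes=20, students_per_class=40, teachers=8,
    examiners_per_room=2, minutes_per_student=5)

print(f"{students} students, {rooms} rooms, {per_room} students per room")
print(f"About {hours:.1f} hours of examining per pair of teachers")
```

Under the assumed five minutes per student, each pair of teachers would face more than sixteen hours of continuous examining, which illustrates why the required resources exceed those available.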
In conclusion, when designing a test, it is important for test developers
to take into consideration all six qualities discussed above. If even one
quality is missing, the usefulness of the test will be affected; in other
words, such a test will be of little value to both test-takers and
teachers.
Question 2: Measures to ensure the six qualities of a useful test.
1. Measures to ensure test reliability
a. First, ensure that the length of the test is suitable for the
time allocated for test-takers to finish it. If a test is too
short, it may lead to cheating; if it is too long, it may cause
irritation among test-takers because they do not have
enough time to finish. Consequently, testers may not be
able to measure their students' knowledge accurately.
b. If two or more versions of a test are given to students of the
same class, it is important that the test tasks be kept similar.
The only things to change are the positions of the options and
the order of the questions in the test paper.