
Item Response Theory

Shortcomings of Classical True Score Model

Sample dependence
Limitation to the specific test situation.
Dependence on the parallel forms
Same error variance for all

Sample Dependence
The first shortcoming of CTS is that the values of
commonly used item statistics in test development,
such as item difficulty and item discrimination,
depend on the particular examinee samples from
which they are obtained. The average level of
ability and the range of ability scores in an
examinee sample influence, often substantially, the
values of the item statistics.
The difficulty index changes with the ability level
of the sample, and the discrimination index differs
between a heterogeneous sample and a
homogeneous sample.

Limitation to the Specific Test Situation
The task of comparing examinees who have
taken samples of test items of differing
difficulty cannot easily be handled with
standard testing models and procedures.

Dependence on Parallel Forms
The fundamental CTS concept of test reliability is
defined in terms of parallel forms.

Same Error Variance For All


CTS presumes that the variance of errors of
measurement is the same for all examinees.

Item Response Theory


The purpose of any test theory is to describe how
inferences from examinee item responses and/or
test scores can be made about unobservable
examinee characteristics or traits that are
measured by a test.
An individual's expected performance on a
particular test question, or item, is a function of
both the level of difficulty of the item and the
individual's level of ability.

Item Response Theory


Examinee performance on a test can be predicted
(or explained) by defining examinee
characteristics, referred to as traits, or abilities;
estimating scores for examinees on these traits
(called "ability scores"); and using the scores to
predict or explain item and test performance.
Since traits are not directly measurable, they are
referred to as latent traits or abilities. An item
response model specifies a relationship between
the observable examinee test performance and the
unobservable traits or abilities assumed to underlie
performance on the test.

Assumptions of IRT
Unidimensionality
Local independence

Unidimensionality Assumption
It is possible to estimate an examinee's ability on
the same ability scale from any subset of items in
the domain of items that have been fitted to the
model. The domain of items needs to be
homogeneous in the sense of measuring a single
ability: If the domain of items is too heterogeneous,
the ability estimates will have little meaning.
Most of the IRT models that are currently being
applied make the specific assumption that the
items in a test measure a single, or unidimensional
ability or trait, and that the items form a
unidimensional scale of measurement.

Local Independence
This assumption states that an examinee's
responses to different items in a test are
statistically independent. For this
assumption to be true, an examinee's
performance on one item must not affect,
either for better or for worse, his or her
responses on any other items in the test.
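The factoring implied by local independence can be sketched in code: once ability is fixed, the probability of a whole response pattern is just the product of the per-item probabilities. This is an illustrative sketch; the function name and example probabilities are hypothetical, not from the source.

```python
def pattern_likelihood(probs, responses):
    """Likelihood of a full response pattern under local independence.

    probs: per-item probability of a correct answer for one examinee
    responses: observed answers, 1 = correct, 0 = incorrect
    The pattern likelihood factors into a product over items.
    """
    likelihood = 1.0
    for p, u in zip(probs, responses):
        likelihood *= p if u == 1 else (1 - p)
    return likelihood

# Example: P(correct) = 0.8 and 0.6 on two items; examinee gets the
# first right and the second wrong -> 0.8 * 0.4 = 0.32
pattern_likelihood([0.8, 0.6], [1, 0])
```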

Item Characteristic Curves


Specific assumptions about the relationship
between the test taker's ability and his or her
performance on a given item are explicitly
stated in a mathematical formula, whose graph
is the item characteristic curve (ICC).

Item Characteristic Curves


The form of the ICC is determined by the
particular mathematical model on which it is
based. The types of information about item
characteristics may include:
(1) the degree to which the item
discriminates among individuals of differing
levels of ability (the 'discrimination'
parameter a);

Item Characteristic Curves


(2) the level of difficulty of the item (the
'difficulty' parameter b), and
(3) the probability that an individual of low
ability can answer the item correctly (the
'pseudo-chance' or 'guessing' parameter c).
One of the major considerations in the
application of IRT models, therefore, is the
estimation of these item parameters.
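The three parameters combine in the standard three-parameter logistic (3PL) formula, P(θ) = c + (1 − c) / (1 + e^(−Da(θ − b))). A minimal sketch (the function name is illustrative; D = 1.7 is the conventional scaling constant that makes the logistic curve approximate the normal ogive):

```python
import math

def icc_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic ICC: the probability that an examinee
    of ability theta answers the item correctly.

    a: discrimination, b: difficulty, c: pseudo-chance (lower asymptote)
    """
    return c + (1 - c) / (1 + math.exp(-D * a * (theta - b)))

# At theta = b the probability is halfway between c and 1:
# with c = 0.2, icc_3pl(0.0, 1.0, 0.0, 0.2) = 0.6
```

Note that for very low ability the probability approaches c rather than zero, which is what distinguishes the 3PL from the one- and two-parameter models.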

ICC

[Figure: ICCs plotted as the probability of a correct response against the ability scale.]
Pseudo-chance parameter c: the lower asymptote of the curve (p = 0.20 for two of the items shown).
Difficulty parameter b: the ability level at which the probability is halfway between the pseudo-chance parameter and one.
Discrimination parameter a: proportional to the slope of the ICC at the point of the difficulty parameter. The steeper the slope, the greater the discrimination parameter.

Ability Score
1. The test developer collects a set of observed
item responses from a relatively large number of
test takers.
2. After an initial examination of how well
various models fit the data, an IRT model is
selected.
3. Through an iterative procedure, parameter
estimates are assigned to items and ability scores
to individuals, so as to maximize the agreement, or
fit, between the particular IRT model and the test
data.
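Step 3 can be illustrated with a deliberately crude version of the ability side of the procedure: holding the item parameters fixed, search for the θ that maximizes the likelihood of the observed responses. This is a sketch under simplifying assumptions (a two-parameter logistic model, grid search instead of the iterative numerical methods real IRT software uses; all names are hypothetical):

```python
import math

def p_correct(theta, a, b):
    # Two-parameter logistic ICC (no pseudo-chance parameter).
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_ability(items, responses):
    """Maximum-likelihood ability estimate by grid search over
    theta in [-4, 4].

    items: list of (a, b) pairs with known parameter estimates
    responses: observed answers, 1 = correct, 0 = incorrect
    """
    best_theta, best_ll = 0.0, -math.inf
    for step in range(-400, 401):
        theta = step / 100.0
        # Log-likelihood of the response pattern at this theta,
        # using local independence to sum over items.
        ll = 0.0
        for (a, b), u in zip(items, responses):
            p = p_correct(theta, a, b)
            ll += math.log(p if u == 1 else 1 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta
```

In practice both item parameters and abilities are unknown, and estimation alternates between the two sets until the fit stops improving.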


Item Information Function


The limitations on CTS theory approaches to
precision of measurement are addressed in the IRT
concept of information function. The item
information function refers to the amount of
information a given item provides for estimating
an individual's level of ability, and is a function of
both the slope of the ICC and the amount of
variation at each ability level.
The information function of a given item will be at
its maximum for individuals whose ability is at or
near the value of the difficulty parameter.
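For the two-parameter logistic model this relationship has a simple closed form: the item information is I(θ) = a²P(θ)(1 − P(θ)), which peaks where P = 0.5, i.e. at θ = b. A minimal sketch (function names are illustrative):

```python
import math

def p_correct(theta, a, b):
    # Two-parameter logistic ICC.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P).

    Maximal at theta = b, where it equals a^2 / 4, so more
    discriminating items (larger a) are more informative.
    """
    p = p_correct(theta, a, b)
    return a * a * p * (1 - p)
```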

Item Information Function

[Figure: item information functions for three items with different parameters.]
(1) The first item provides the most information about
differences in ability at the lower end of the ability
scale.
(2) The second provides relatively little information at any
point on the ability scale.
(3) The third provides the most information about
differences in ability at the high end of the ability
scale.

Test Information Function


The test information function (TIF) is the sum of
the item information functions, each of which
contributes independently to the total, and is a
measure of how much information a test provides
at different ability levels.
The TIF is the IRT analog of CTS theory
reliability and the standard error of measurement.

Item Bank
If there is a need for regular test administration and
analysis, the construction of an item bank may be
considered.
An item bank is not a simple collection of test items
organized in their raw form, but a collection with
parameters assigned on the basis of CTS or IRT
models.
An item bank should also have a data processing
system that assures the steady quality of the data in
the bank (describing, classifying, accepting, and
rejecting items).

Specifications in CTS Item Bank

Form of items
Type of item parts
Describing data
Classifying data

Form of Items
Dichotomous
Listening comprehension
Statement + question + choices
Short conversation + question + choices
Long conversation / passage + some questions + choices
Reading comprehension
Passage + some questions + choices
Passage + T/F questions
Syntactic knowledge / vocabulary
Question stem with blank/underlined parts + choices
Cloze
Passage + choices

Form of Items
Nondichotomous
Listening comprehension
Dictation
Dictation passage with blanks to be filled

Describing Data

Ability measured
Difficulty index
Discrimination
Storage code
