Johannes Hartig Eckhard Klieme Detlev Leutner (Editors)

Assessment of
Competencies in
Educational Contexts

HOGREFE

Johannes Hartig, Eckhard Klieme, Detlev Leutner: Assessment of Competencies in Educational Contexts. Hogrefe Publishing GmbH, Göttingen 2008

© 2008 Hogrefe Publishing GmbH
No unauthorized distribution or reproduction.

Library of Congress Cataloging in Publication
is available via the Library of Congress Marc Database under the
LC Control Number 2006933006

Library and Archives Canada Cataloguing in Publication
Assessment of competencies in educational contexts / Johannes
Hartig, Eckhard Klieme, Detlev Leutner, editors.
Includes bibliographical references.
ISBN 978-0-88937-297-9
1. Competency based educational tests. 2. Educational evaluation.
I. Klieme, Eckhard, 1954- II. Leutner, Detlev III. Hartig, Johannes, 1970-
LC1034.A88 2007    371.26    C2006-904938-6

© 2008 by Hogrefe & Huber Publishers


PUBLISHING OFFICES
USA: Hogrefe & Huber Publishers, 875 Massachusetts Avenue, 7th Floor, Cambridge, MA 02139
Phone (866) 823-4726, Fax (617) 354-6875, E-mail info@hogrefe.com
EUROPE: Hogrefe & Huber Publishers, Rohnsweg 25, 37085 Göttingen, Germany
Phone +49 551 49609-0, Fax +49 551 49609-88, E-mail hh@hogrefe.com

SALES & DISTRIBUTION
USA: Hogrefe & Huber Publishers, Customer Services Department,
30 Amberwood Parkway, Ashland, OH 44805
Phone (800) 228-3749, Fax (419) 281-6883, E-mail custserv@hogrefe.com
EUROPE: Hogrefe & Huber Publishers, Rohnsweg 25, 37085 Göttingen, Germany
Phone +49 551 49609-0, Fax +49 551 49609-88, E-mail hh@hogrefe.com

OTHER OFFICES
CANADA: Hogrefe & Huber Publishers, 1543 Bayview Avenue, Toronto, Ontario M4G 3B5
SWITZERLAND: Hogrefe & Huber Publishers, Länggass-Strasse 76, CH-3000 Bern 9

Hogrefe & Huber Publishers
Incorporated and registered in the State of Washington, USA, and in Göttingen, Lower Saxony, Germany

No part of this book may be reproduced, stored in a retrieval system or transmitted, in any form or by
any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written
permission from the publisher.

Printed and bound in the USA

ISBN 978-0-88937-297-9


Contents

Preface
Contents
Contributors

Theoretical Perspectives and Developmental Models

1
The Concept of Competence in Educational Contexts
Eckhard Klieme, Johannes Hartig, and Dominique Rauch

2
Competencies for Successful Learning: Developmental Changes and Constraints
Marcus Hasselhorn

3
A Model-Based Test of Competence Profile and Competence Level in Deductive Reasoning
Christiane Spiel and Judith Glück

Psychometric Modeling

4
Psychometric Models for the Assessment of Competencies
Johannes Hartig

5
Explanatory Item Response Models: A Brief Introduction
Mark Wilson, Paul De Boeck, and Claus H. Carstensen

6
Linking Competencies in Horizontal, Vertical, and Longitudinal Settings and Measuring Growth
Alina A. von Davier, Claus H. Carstensen, and Matthias von Davier

7
Reporting Test Outcomes Using Models for Cognitive Diagnosis
Matthias von Davier, Lou DiBello, and Kentaro Yamamoto


Assessment Methods and Technology

8
Measuring Competencies: Introduction to Concepts and Questions of Assessment in Education
Detlev Leutner, Johannes Hartig, and Nina Jude

9
Introduction to the Computer-Based Assessment of Competencies
Astrid Jurecka

10
Adaptive Testing and Item Banking
Theo J. H. M. Eggen

11
Computer-Based Tests: Alternatives for Test and Item Design
Joachim Wirth

12
Computer-Based Assessment in Support of Distance Learning
Gregory K. W. K. Chung, Harold F. O'Neil, William L. Bewley, and Eva L. Baker

Large-Scale Assessment for the Monitoring of Educational Quality

13
Assessment in Large-Scale Studies
Tina Seidel and Manfred Prenzel

14
Introduction of Educational Standards in German-Speaking Countries
Eckhard Klieme and Katharina Maag Merki

15
Causal Effects and Fair Comparison: Considering the Influence of Context Variables on Student Competencies
Christof Nachtigall, Ulf Kröhne, Ulrike Enders, and Rolf Steyer

16
Monitoring and Assurance of School Quality: Principles of Assessment and Internet-Based Feedback of Test Results
Ingmar Hosenfeld


Contributors

Eva L. Baker
National Center for Research on Evaluation, Standards, and Student Testing
(CRESST), University of California, Los Angeles, California, USA.

William L. Bewley
National Center for Research on Evaluation, Standards, and Student Testing
(CRESST), University of California, Los Angeles, California, USA.

Claus H. Carstensen
Department of Educational Science and Research Methodology,
Leibniz-Institute for Science Education (IPN), Kiel, Germany.

Gregory K. W. K. Chung
National Center for Research on Evaluation, Standards, and Student Testing
(CRESST), University of California, Los Angeles, California, USA.

Paul De Boeck
Department of Psychology, Katholieke Universiteit Leuven, Belgium.

Lou DiBello
Learning Sciences Research Institute, University of Illinois at Chicago,
Illinois, USA.

Theo J. H. M. Eggen
Cito, Psychometric Research Centre, Arnhem, Netherlands /
Department of Research Methodology, Measurement, and Data Analysis,
University of Twente, Enschede, Netherlands.

Ulrike Enders
Institute of Psychology, Friedrich Schiller University, Jena, Germany.

Judith Glück
Institute of Psychology, Alpen-Adria University Klagenfurt, Austria.

Johannes Hartig
Center for Educational Quality and Evaluation, German Institute for
International Educational Research (DIPF), Frankfurt, Germany.

Marcus Hasselhorn
Center for Education and Development, German Institute for International
Educational Research (DIPF), Frankfurt, Germany.

Ingmar Hosenfeld
Faculty of Psychology, University of Koblenz-Landau,
Campus Landau, Germany.

Nina Jude
Center for Educational Quality and Evaluation, German Institute for
International Educational Research (DIPF), Frankfurt, Germany.


Astrid Jurecka
Center for Educational Quality and Evaluation, German Institute for
International Educational Research (DIPF), Frankfurt, Germany.

Eckhard Klieme
Center for Educational Quality and Evaluation, German Institute for
International Educational Research (DIPF), Frankfurt, Germany.

Ulf Kröhne
Institute of Psychology, Friedrich Schiller University, Jena, Germany.

Detlev Leutner
Department of Instructional Psychology, School of Education,
University of Duisburg-Essen, Essen, Germany.

Katharina Maag Merki
Institute of Educational Science, University of Education, Freiburg, Germany.

Christof Nachtigall
Institute of Psychology, Friedrich Schiller University, Jena, Germany.

Harold F. O'Neil
Rossier School of Education, University of Southern California / National
Center for Research on Evaluation, Standards, and Student Testing
(CRESST), University of California, Los Angeles, California, USA.

Manfred Prenzel
Department of Educational Science, Leibniz-Institute for Science Education
(IPN), Kiel, Germany.

Dominique Rauch
Center for Educational Quality and Evaluation, German Institute for
International Educational Research (DIPF), Frankfurt, Germany.

Tina Seidel
Institute of Educational Science, Department of Educational Psychology,
Friedrich Schiller University, Jena, Germany.

Christiane Spiel
Faculty of Psychology, University of Vienna, Austria.

Rolf Steyer
Institute of Psychology, Friedrich Schiller University, Jena, Germany.

Alina A. von Davier
Educational Testing Service, Princeton, New Jersey, USA.

Matthias von Davier
Educational Testing Service, Princeton, New Jersey, USA.

Mark Wilson
Graduate School of Education, University of California,
Berkeley, California, USA.

Joachim Wirth
Department of Research on Learning and Instruction, Ruhr-Universität
Bochum, Germany.

Kentaro Yamamoto
Educational Testing Service, Princeton, New Jersey, USA.


Part I
Theoretical Perspectives and
Developmental Models


Chapter 1
The Concept of Competence in Educational Contexts¹
Eckhard Klieme, Johannes Hartig, and Dominique Rauch

Within the social and human sciences, one area undergoing rapid expansion and attracting increased public attention is empirical educational research (Mandl & Kopp, 2005). The increasing knowledge requirements in many areas of work and life and the globalization of labor and educational markets have made the question of the educational system's productivity a crucial one for society. Since the end of the 1980s, the introduction of new oversight strategies for governmental intervention worldwide has led to a stronger focus on outputs and outcomes at all levels of the educational system, from elementary through secondary and tertiary education up to vocational and adult education. These outcomes - or the value added to them - are used as criteria for the productivity of entire educational systems, the quality of individual educational institutions, and the learning achievements of individuals. The role of educational research, then, is to render this educational productivity measurable, to develop models that can explain how educational processes take place, evaluate their effectiveness and efficiency, and propose and analyze strategies for intervention.

In a modern industrial society, education and professional qualifications can no longer be described according to a rigid canon of knowledge in specific subjects passed on from generation to generation. Instead, building competencies has been identified as the main objective of education. And while educational goals themselves are changing, the traditional methods of pedagogical and psychological evaluation - such as the criterion-oriented achievement assessments of the 1970s, which essentially translated hierarchically structured goals specific to the particular subject matter at hand into test items - are reaching their limits (Segers, Dochy, & Cascallar, 2003).

An important theoretical and practical contribution of recent educational research is the reconceptualization and operationalization of educational objectives in conceptual terms of competence, as well as related concepts such as literacy and life skills.
¹ Sections of this chapter previously appeared in Koeppen, K., Hartig, J., Klieme, E., &
Leutner, D. (2008). Current issues in competence modelling and assessment. Zeitschrift
für Psychologie / Journal of Psychology, 216, 61-73.

J. Hartig, E. Klieme, & D. Leutner (Eds.),
Assessment of Competencies in Educational Contexts, 3-22.
© 2008 Hogrefe & Huber Publishers

The concept of competence is central to empirical studies dealing with the development of human resources and the productivity of education. Although it has been in use for decades, the term competence has enjoyed increasing currency in educational research, psychology and neighboring disciplines in the last few years (e.g., Csapo, 2004; Klieme, Funke, Leutner, Reimann, & Wirth, 2001; Klieme & Hartig, 2007; Rychen & Salganik, 2001, 2003; Sternberg & Grigorenko, 2003; Weinert, 2001).

In the first section of this chapter, three well-known yet fundamentally different concepts of competence will be presented and discussed with respect to their potential to guide the measurement of educational outcomes. We will then present a working definition of competence for educational assessment settings, before we discuss current challenges in the assessment of competencies. Here we distinguish four key areas: the development of theoretical models of competence, the construction of psychometric models, the construction of measurement instruments, and research on the use of diagnostic information.

Different Concepts of Competence


In the following we will deal with the generic approach founded by Noam Chomsky, the normative approach which is prominent among educationalists, and the pragmatic approaches developed by David McClelland and other scholars in psychology.

Theoretical Concepts for Linguistic Development and Socialization


In his linguistic theory, Chomsky distanced himself from the behavioristic linguistics predominant at his time, which equated language with observable sound and sentence patterns. He argued that "If we are ever to understand how language is used or acquired, then we must abstract for separate and independent study a cognitive system, a system of knowledge and belief, that develops in early childhood and that interacts with many other factors to determine the kinds of behavior that we observe; to introduce a technical term, we must isolate and study the system of linguistic competence that underlies behavior but that is not realized in any direct or simple way in behavior" (Chomsky, 1968, p. 4). Here, Chomsky proves to be a precursor of modern cognitivism, while he himself refers back to the theories of thought of Descartes and Wilhelm von Humboldt. He attributes to them the idea of conceiving language as a system of rules that enables human beings to be creative: it enables them to express ever new thoughts in ever new situations. The quotation makes clear that the term competence is introduced as a technical term (without reference to its etymological origins) in order to describe the cognitive system underlying these creative linguistic abilities.


The question of measuring individual competencies actually bears no significance to Chomsky. Rather, he is concerned with understanding the cognitive basis of language-related actions that is common to all human beings. In this sense, interindividual differences only relate to performance as the actual realization of a competence, which is influenced by personal and situational factors and is irrelevant to the theory.
The broader sociological discussion, and Habermas (1981) in particular, significantly generalized the competence concept sensu Chomsky to communicative competence as the epitome of socio-cognitive rules and structures that allow individuals to generate communicative situations. This constituted a highly influential framework for social scientists' use of the term competence up until the 1990s. For instance, Schneewind and Pekrun (1994) state in their overview of theories of educational psychology and socialization that:

In the end, the development of communicative competencies emerges as the final goal of socialization. It intends to enable the individual to take part in the discourse of ideal speech situations while on the other hand it is meant to contribute towards expounding the problems associated with the social conditions that hinder an ideal speech situation. (p. 21, translation by the authors)
Generative models such as Chomsky's or, similarly, the Piagetian tradition (Siegler & Alibali, 2005) maintain the distinction between competence and performance. In these theories, the question of whether competence can be modeled and measured is identical to the question of how far it is possible to understand, describe and evaluate the functionalities of a cognitive system that generates contingent behavior (performance) while not being identical to it. As a rule, generative models of competence and its development are not grounded in quantitative measurements; rather, they are determined in a reconstructive manner in a broad sense. From the empirical perspective, such models are grounded less in measurements based on larger samples than in case studies that are qualitatively reconstructed, be it hermeneutically or - as in the case of Chomsky - by formal methods.
The issue of modeling and of empirically assessing competencies has particularly been under discussion in research on language tests. Here, test concepts that address various aspects of language use repeatedly stand in conflict with Chomsky's concept of competence: There is a difference between competence and performance, where competence equals ability equals trait, while performance refers to the actual execution of tasks (Shohamy, 1996, p. 148; see also Beck & Klieme, 2007). By equating competencies to traits, this view opens a path to measuring individual degrees of competencies in larger groups of persons, and builds a link to the pragmatic concept of competence described below.


Competence and the Normative Concept of Bildung

When scholars of educational science speak about the general goals of training within modern societies, they struggle to find a balance between Bildung in the tradition of German philosophy, i.e. developing personality and allowing individuals to participate in human culture, and qualification, i.e. establishing knowledge and skills that are relevant for vocational practice. The German Heinrich Roth seems to have been the first scholar who deliberately used the notion of competence to find a compromise between the two directions. Interestingly, the introduction of the competence concept in the second volume of Roth's Pedagogical Anthropology, which was published in 1971, goes along with a transition from a traditional to an emancipatory concept of education. Roth defines the central objective of education as fostering Mündigkeit (maturity), defined as competence for responsible action. He further defines maturity as the mental constitution of a human being where heteronomy has been substituted by autonomy to the highest possible degree (Roth, 1971). Thus, he immediately connects with the enlightened tradition of education.
Roth does not give any definition of competence. Nevertheless, we can assume that he was acquainted with the variants of the term in social science and built on them, all the more so because he elsewhere refers to the literature on the development of competence motivation, citing White (1959). In any case, as a psychologically trained educational scientist, Roth (1971) views competencies as individual abilities in terms of dispositions for action and judgment:

In our view, maturity [Mündigkeit] should be interpreted as competence in a threefold sense: a) as self-competence - the ability to be responsible for your own action, b) professional competence - the ability to act and judge in a particular profession, and hold responsible, c) social competence - the ability to act and judge, and hold responsible, in professional or social areas that are relevant in social, societal or political terms. (p. 180, translation by the authors)
Roth's competence concept is very broad when compared with the discussion in social science. When mentioning abilities, he does not only mean cognitive dispositions for achievement, but a comprehensive ability to act that includes the affective-motivational area (Roth, 1971). Finally, the emancipatory intention associates competence with a demand for responsibility. Thus, Roth's concept of competence refers to ideal, complex goals of education, resembling a broad normative concept of Bildung. As a scholar who worked in empirical educational research, Roth intended to actually construct measures for this concept, but neither he nor his successors have been able to provide such measures.


A Functional-Pragmatic Concept of Competence


The functional concept of competence, used in the psychology of the early seventies, is explicitly not interested in the generative, cognitive system that is independent from situations or in normative goals of education such as fostering autonomy; instead, it is interested in a person's ability to cope with challenges in particular situations. In 1973, David McClelland demanded "testing for competence rather than for intelligence" (p. 1), criticizing traditional intelligence diagnostics. While intelligence tests are deliberately de-contextualized, McClelland claimed that educational and psychological research needs concepts and assessment procedures that take into account the situation and contextualization of human action. Competence-oriented diagnostics was associated with a hope of improving the adjustment of test contents to real-life situations (e.g., in vocational settings), and thus being better able to predict differences in achievement in these situations. Competence according to McClelland refers to the attributes required for successfully performing particular actions. However, he does not further specify the concept with regard to any particular theory. From his perspective, any kind of individual attribute may be perceived as competence as far as it serves to predict success in concrete achievement: "some of these competencies may be rather traditional cognitive ones involving reading, writing, and calculating skills. Others should involve what traditionally have been personality variables, although they might better be considered competencies" (McClelland, 1973, p. 10).
Hence, the history of the subject reveals that a key feature of the competence concept is
its stronger relation to real life. Bandura (1990), a social psychologist, summarizes that
"there is a marked difference between possessing knowledge and skills, and being able to
use them well under diverse circumstances, many of which contain ambiguous, unpredictable,
stressful elements" (p. 315). Connell, Sheridan, and Gardner (2003) are particularly concise
in describing competence as "realized abilities" (p. 142). While intelligence research
assesses cognitive achievement constructs that are generalized across a broad scope of
situations, competence constructs adhere to specific areas of demands. Thus, the question
"competent for (doing) what?" is essential to any competence definition.
Nevertheless, descriptions of specific competence constructs will differ in the degree to
which the postulated competencies can be applied across different situations. Weinert (2001)
refers to key competencies that are characterized by a particularly broad scope of transfer,
e.g., language competencies, and metacompetencies that facilitate the acquisition and use of
specific competencies. Metacompetencies include strategies of thinking, learning, planning
and governing as well as knowledge about tasks and strategies and knowledge of one's
personal strengths and weaknesses.
If competencies are regarded as context-dependent ability constructs, their development
can only be conceived as resulting from learning processes in which the individual interacts
with his or her environment. This means competencies can be acquired by learning, or even
have to be acquired through learning, while basic cognitive abilities, in contrast, can
only be learned and trained to a far lower degree (Weinert, 2001). Competencies can be
acquired through experience gained in relevant situations of demand, and they can be
influenced by training or other external interventions; through years of practice they may
develop into expertise in the respective domains. In this sense, Mayer (2003) summarizes
more recent approaches to a "psychology of abilities, competencies, and expertise" as
follows: "Ability can be defined as one's potential for learning knowledge that supports
cognitive performance. (...) Competency can be defined as the specialized knowledge one has
acquired that support cognitive performance, and expertise is a very high level of
competency" (p. 265). It is thus possible to treat the fact that competencies can be learned
as a defining characteristic that sets competencies apart from other dispositional constructs
(e.g., Hartig & Klieme, 2006). Simonton (2003) characterizes competence as "any acquired
skill or knowledge that constitutes an essential component for performance or achievement
in a given domain" (p. 230). He illustrates this with the example of a composer who needs
competencies in dealing with melody, rhythm, orchestration, dramatizing, and so on.
In summary, the pragmatic psychological tradition conceives of competencies as
context-specific dispositions for achievement that can be acquired through learning.
Furthermore, they functionally relate to situations and demands in specific domains.
The scope of these domains or of the relevant situations can vary from highly specific
competencies in narrow domains to broadly conceptualized key competencies, but
contextualization and learning are fundamental to all of these concepts. As will be
outlined in the following section, this pragmatic, contextualized concept of competence
is a useful foundation for the empirical assessment of educational outcomes.

A Working Definition of Competence in Educational Assessment


When OECD policy makers set out to define an international program to assess the outcomes
of schooling, the guiding question they had in mind was what young adults at the end of
education would need in terms of skills to be able to play a constructive role as citizens
in society (Trier & Peschar, 1995). Thus, they crossed the boundaries of school curricula
as well as the limitations of classical models of human abilities. They neither restricted
educational assessment to knowledge and skills within a few school subjects nor referred to
psychological theories of general cognitive abilities. Instead, they took a functional view,
asking whether young adults are prepared to cope with the demands and challenges of their
future life. This type of disposition for mastering unforeseen demands and tasks has been
called life skills (Binkley, Sternberg, Jones, & Nohara, 1999) or cross-curricular
competencies (OECD, 1997; Trier & Peschar, 1995).
In fact, the functional understanding of competencies became central to the whole Programme
for International Student Assessment (PISA) as it has been implemented by the OECD since
1998. For example, the PISA framework defines mathematical literacy as "an individual's
ability, in dealing with the world, to identify, to understand, to engage in and to make
well-founded judgments about the role that mathematics plays, as needed for that
individual's current and future life as a constructive, concerned, and reflective citizen"
(OECD, 1999, p. 41). Likewise, reading literacy and science literacy are related to everyday
applications and authentic tasks.
As a reaction to the need for a functional approach to competencies, which developed from
the goals of PISA, Weinert (1999) suggested a concept of competence that should be used in
large-scale assessments of educational outcomes. Competencies should be defined by the range
of situations and tasks which have to be mastered, and assessment might be done by
confronting the student with a sample of such (possibly simulated) situations. This kind of
assessment should be of greater practical use because it goes beyond compartmentalized and
inert knowledge.
Among the scholars who have been working in the field of educational assessment, Richard
Shavelson seems to be closest to this conception of competence, although he has not used
this term systematically. Shavelson was a prominent proponent of performance assessment,
i.e., assessment that incorporates hands-on, authentic activities based on sampling from a
population of relevant situations and activities (Shavelson, Baxter, & Pine, 1991, 1992).
He has applied this concept in diverse vocational and professional domains. Recently,
Shavelson (2007) argued for a broad understanding of educational outcomes that goes well
beyond academic skills and abilities measured by standardized tests:
    These additional outcomes include learning to know, understand, and reason
    in an academic discipline. They also include personal, civic, moral, social,
    and intercultural knowledge and actions - outcomes the Educational Testing
    Service has described as "soft." This set of outcomes - which (...) I will call
    personal and social responsibility (PSR) skills - are every bit as demanding as
    the academic skills that often get labeled exclusively as the cognitive skills and
    are too important not to be measured. (p. 1)
A similarly broad understanding of "cognitive," including cognitive and metacognitive
aspects of self-regulation, social behavior, and moral reasoning, was held by Klieme
and Leutner (2006) when they - with reference to Weinert (1999, 2001) - proposed
a working definition of competencies as context-specific cognitive dispositions that
are acquired by learning and needed to successfully cope with certain situations
or tasks in specific domains. This definition is strongly in line with the pragmatic
definition characterized in the previous section. Competence constructs conceptualized
as learnable, contextualized, cognitive dispositions are adequate criteria for the
productivity of educational processes and systems. The proposed working definition
of competencies refers not merely to abilities required for success within school, but
claims to cover dispositions that are required for success in future situations, e.g.,
vocational ones.


Current Challenges in the Assessment of Competencies


We identify four key areas in the assessment of cognitive competencies: first
and foremost, the development of theoretical models of competence (Area 1; see
Hasselhorn, 2008, and Spiel & Glück, 2008, Chapters 2 and 3 in this book), complemented
by the construction of psychometric models (Area 2; see Hartig, 2008; von Davier,
Carstensen, & von Davier, 2008; von Davier, DiBello, & Yamamoto, 2008; Wilson, De Boeck,
& Carstensen, 2008, Chapters 4 to 7 in this book). This leads on to the construction of
measurement instruments for the empirical assessment of competencies (Area 3; see Eggen,
2008; Jurecka, 2008; Leutner, Hartig, & Jude, 2008; Wirth, 2008, Chapters 8 to 11 in this
book). Research on the use of diagnostic information (Area 4; Chung, O'Neil, Bewley, &
Baker, 2008; Hosenfeld, 2008; Klieme & Maag Merki, 2008; Nachtigall, Kröhne, Enders, &
Steyer, 2008; Seidel & Prenzel, 2008, Chapters 12 to 16 in this book) rounds off the
research field. In the following, we explicate the concrete questions and problems
addressed within each of the four areas and outline the current state of research.

Area 1: Development of Cognitive Models of Competencies


As mentioned above, the shift towards the competence construct has prompted efforts
to improve the assessment of these complex and contextualized constructs.
The first question to arise here is which models provide a basis for developing
measurement instruments and interpreting their results. In current educational research,
only a limited number of competence models exist. Therefore, it is important to
develop cognitive models that explain interindividual differences in domain-specific
performance.
A first challenge in model development is the contextualized character of competencies,
which means that both person- and situation-specific factors have to be taken into
account. For example, when describing foreign language skills with reference to
situational demands, the competencies required to read a text can be distinguished
from those required to engage in conversation (e.g., by distinguishing written vs.
spoken text, or text comprehension vs. text production). On the individual's side,
knowledge structures relevant to different situations must be taken into account; for
example, the available vocabulary, grammatical knowledge, and mastery of sociopragmatic
rules (Chen, 2004; Kobayashi, 2002). This simultaneous consideration of individual- and
situation-specific components has consequences for the structure of competencies as well
as for the description of competence levels. Hence, two groups of theoretical models
devised to describe and explain competencies can be distinguished: models of competence
levels and models of competence structures (Hartig & Klieme, 2006; Klieme, Maag Merki,
& Hartig, 2007). Models of competence levels define the specific situational demands
that can be mastered by individuals with certain levels or profiles of competencies;
levels of competencies are used to provide a criterion-referenced interpretation of
measurement results. These models - also called "construct maps" (Wilson, 2008) - are
particularly useful for assessing and evaluating educational outcomes on an aggregated
level. Models of competence structures deal with the relations between performances in
different contexts and seek to identify common underlying dimensions. These models are
especially interesting for explaining performance in specific domains in terms of
underlying basic abilities, and can provide a basis for more differentiated measurement
results of individual-centered assessments. The two kinds of models relate to
different aspects of competence constructs. They are not mutually exclusive, but
ideally complementary.
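
The criterion-referenced use of competence levels described above can be illustrated
with a minimal Python sketch. It is not taken from any of the models cited here: the
cut scores, the level labels, and the function name are hypothetical, and in practice
each level would be tied to a description of the situational demands a person at that
level can master.

```python
from bisect import bisect_right

# Hypothetical cut scores on an arbitrary reporting scale (assumption,
# not taken from any published competence model).
CUT_SCORES = [400.0, 480.0, 560.0, 640.0]
LEVEL_LABELS = ["below Level 1", "Level 1", "Level 2", "Level 3", "Level 4"]


def competence_level(score: float) -> str:
    """Map a scale score to a competence level.

    A score equal to a cut score is assigned to the higher level.
    """
    return LEVEL_LABELS[bisect_right(CUT_SCORES, score)]


if __name__ == "__main__":
    for score in (350.0, 400.0, 555.5, 700.0):
        print(f"score {score:6.1f} -> {competence_level(score)}")
```

The score itself stays continuous; the levels only add an interpretation in terms of
mastered demands, which is why such a mapping lends itself to aggregated reporting.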
The aspect of development is also very relevant in the context of theoretical competence
models. To date, only a few competence models have addressed the issue
of competence development (primarily in the domain of science; e.g., Bybee, 1997;
Prenzel et al., 2004, 2005; Wilson, 2008). For the most part, these models still have
a limited empirical foundation in terms of longitudinal data, and their conceptualizations
of competence development differ. Some models see competence development
as a continuous progression, shifting successively from the lowest to the highest
competence level (e.g., Prenzel et al., 2004, 2005). The level of elaboration and
systematization increases with the competence level (as described by Bybee, 1997,
for scientific literacy). Other models conceptualize competence development as a
non-continuous process characterized by qualitative leaps (e.g., conceptual change
in science; Schnotz & Preuß, 1997; Schnotz, Vosniadou, & Carretero, 1999). This
process involves a fundamental reorganization of concepts and structures from everyday
life to correspond with new science-based ideas (e.g., Vosniadou, Ioannides,
Dimitrakopoulou, & Papademetriou, 2001; Wilson, 2008).
In addition, the design of cognitive models of competencies depends on the questions
addressed or the decisions to be informed. A model that fits some purposes
(e.g., giving immediate feedback) may be totally ineffective for other purposes (e.g.,
comparative evaluation of educational institutions). A more detailed model of
competencies is needed in the first case than in the second. In one case, precise
estimates might be required on an individual level, in another case on an aggregated
level. Switching between two purposes can cause a whole host of problems, as recent
experiences in the United States have shown (Cheng, Watanabe, & Curtis, 2004;
Fuhrmann & Elmore, 2004).
To summarize, in many domains where the need for well-founded competence
assessments is evident, basic research concerning theoretically as well as empirically
sound models of competence structures, competence levels, and competence
development is still required. Although attempts have been made to interconnect
cognitive competence models with psychometric models and measurement instruments,
they have often failed to meet the demands of the current, more complex
definition of competencies. There is a clear need for more integrative, interdisciplinary
research activities.


Area 2: Psychometric Models


As Embretson (1983) put it, psychometric models are about "modeling the encounter
of a person with an item" (p. 184). Psychometric models are the link between
theoretical constructs and the results of empirical assessments; they provide the
measurement rules by which test scores are assigned based on performance in test situations.
Given the contextualized nature and complexity of competence constructs, psychometric
models for their measurement have to meet certain requirements (Hartig,
2008, Chapter 4 in this book; Hartig & Klieme, 2006). On the one hand, they have to
incorporate all relevant characteristics of the individuals whose competencies are to
be evaluated. Because competencies refer to performance in complex domains, the
models should take into account that multiple abilities may be required. At the same
time, they have to take into account domain-specific situational demands. Because
competencies are conceptualized as context-specific constructs, the results of
competence assessments should be related to the mastery of specific, domain-relevant
situations. Item response theory (IRT) has a long tradition in educational assessment,
and many of its past and recent developments were made to cater for specific
needs in this area. IRT places ability estimates and item difficulties on a common
scale, so that they can be compared directly (Embretson, 2006), thus providing a basis
for models incorporating individual and situational characteristics. Several recent
developments in IRT hold considerable promise for the modeling of competencies, namely
explanatory IRT models, multidimensional IRT models (e.g., Hartig & Höhler, 2008), and
models for cognitive diagnosis.
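
To make concrete what it means for person ability and item difficulty to be comparable,
the following minimal Python sketch uses the Rasch model, the simplest IRT model, as
an assumed example; the chapter itself does not commit to a particular model at this
point. The function name and all numerical values are hypothetical.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    theta: person ability, b: item difficulty, both on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    person_ability = 0.5
    item_difficulties = {"easy item": -1.0, "matched item": 0.5, "hard item": 2.0}

    for name, difficulty in item_difficulties.items():
        p = rasch_probability(person_ability, difficulty)
        print(f"{name}: P(correct) = {p:.2f}")
```

Because only the difference between ability and difficulty enters the model, a person
whose ability equals an item's difficulty has a success probability of exactly one half,
which is what allows ability estimates to be interpreted in terms of the demands of
specific items.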
Explanatory IRT models (Wilson & De Boeck, 2004; Wilson, De Boeck, &
Carstensen, 2008, Chapter 5 in this book) incorporate predictors for successful
interactions of a person with an item, i.e., attributes of the person or features of the
item ("person predictors" or "item predictors"). Specific item features can be used to
represent certain situational demands. Incorporating effects of item features into the
psychometric model is a highly suitable way of constructing a psychometric model
of competence that takes the corresponding demands into account. Although models
including item features have been in use for some time (e.g., the linear-logistic test
model, LLTM; Fischer, 1997), recent developments such as the inclusion of random
effects on the item side (e.g., Janssen, Schepers, & Peres, 2004; Janssen, Tuerlinckx,
Meulders, & De Boeck, 2000) make them more flexible for modeling empirical data
from complex performance situations.
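
The core idea of the LLTM, that an item's difficulty is a weighted sum of the effects of
its features, can be sketched in a few lines of Python. This is an illustration under
stated assumptions rather than an estimation routine: the item features, their effects,
and the intercept are hypothetical values, whereas in an application they would be
estimated from response data.

```python
import math

# Hypothetical item features for a reading test, coded 0/1 per item:
# [long_text, inference_required, open_response]
ITEM_FEATURES = {
    "item_1": [0, 0, 0],
    "item_2": [1, 0, 0],
    "item_3": [1, 1, 0],
    "item_4": [1, 1, 1],
}
# Hypothetical feature effects and intercept on the logit scale (assumptions).
FEATURE_EFFECTS = [0.4, 0.9, 0.6]
INTERCEPT = -1.0


def lltm_difficulty(features, effects, intercept):
    """Item difficulty as a linear combination of item feature effects."""
    return intercept + sum(q * eta for q, eta in zip(features, effects))


def probability_correct(theta, difficulty):
    """Rasch-type response probability given ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


if __name__ == "__main__":
    theta = 0.0  # hypothetical person ability
    for name, features in ITEM_FEATURES.items():
        b = lltm_difficulty(features, FEATURE_EFFECTS, INTERCEPT)
        p = probability_correct(theta, b)
        print(f"{name}: difficulty = {b:+.2f}, P(correct) = {p:.2f}")
```

Each additional demand raises the predicted difficulty by its effect, so the feature
effects can be read as the contribution of specific situational demands to task
difficulty.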
Models with item features that allow situational demands to be incorporated (e.g.,
the LLTM) are typically unidimensional. To model performance in complex situations,
it may be necessary to include more than one ability dimension in the model.
A straightforward way of doing so is to apply multidimensional IRT (MIRT) models
(e.g., McDonald, 2000; Reckase, 1997). MIRT models with multiple correlated
scales, where each item draws on a single ability (so-called "between-item
multidimensionality"), have been applied to data from educational assessments to take
into account relations between performance in different domains (e.g., in the PISA
studies; Adams, 2005; Adams & Wu, 2002). Models that incorporate multiple abilities
for each item (so-called "within-item multidimensionality") are more appealing
for psychometric models of competencies, because within-item multidimensionality
makes it possible to model successful performance as the result of a mixture
of different abilities. However, such models have typically been applied to account
for nuisance dimensions (e.g., local item dependencies within testlets; Wang &
Wilson, 2005) rather than for theoretically defined ability dimensions. Examples of
simple MIRT models with meaningful within-item multidimensionality are given by
Walker and Beretvas (2003), Stout (2007), and Hartig and Höhler (2008). Another
development of great relevance to psychometric models of competencies has been the
emergence of cognitive diagnostic models or multiple classification models (DiBello
& Stout, 2007; Maris, 1999; von Davier, 2005; von Davier, DiBello, & Yamamoto,
2008, Chapter 7 in this book).
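
The distinction between between-item and within-item multidimensionality can be made
concrete with a small Python sketch. It assumes a compensatory MIRT model in which the
logit is a weighted sum of abilities minus an item difficulty; this functional form, the
two ability dimensions, and all numbers are assumptions made for illustration and are
not taken from the studies cited above.

```python
import math


def mirt_probability(theta, loadings, difficulty):
    """Response probability in a compensatory MIRT model (assumed form).

    theta: abilities per dimension; loadings: item weights per dimension.
    """
    logit = sum(a * t for a, t in zip(loadings, theta)) - difficulty
    return 1.0 / (1.0 + math.exp(-logit))


def is_between_item(loading_matrix):
    """True if every item loads on exactly one ability dimension."""
    return all(sum(1 for a in row if a != 0) == 1 for row in loading_matrix)


# Hypothetical dimensions: (reading, mathematics).
BETWEEN_ITEM_LOADINGS = [[1.0, 0.0], [0.0, 1.0]]  # each item taps one ability
WITHIN_ITEM_LOADINGS = [[1.0, 0.0], [0.5, 0.5]]   # second item taps both

if __name__ == "__main__":
    print("between-item design:", is_between_item(BETWEEN_ITEM_LOADINGS))
    print("within-item design: ", not is_between_item(WITHIN_ITEM_LOADINGS))

    # A person who is strong in reading but weak in mathematics (hypothetical).
    theta = [1.0, -1.0]
    for loadings in WITHIN_ITEM_LOADINGS:
        p = mirt_probability(theta, loadings, difficulty=0.0)
        print(f"loadings {loadings}: P(correct) = {p:.2f}")
```

In the within-item case the second item combines both abilities, so in this compensatory
form a strength on one dimension can offset a weakness on the other; whether such
compensation is plausible for a given competence is a substantive modeling decision.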
To summarize, recent years have seen a number of significant developments in
psychometrics that hold great promise for the translation of theoretical models of
competencies into measurement models. Models that succeed in taking both situational
characteristics and individual abilities into account can do more than provide
rules for measurement (i.e., generate test scores). They can also serve as empirically
testable models of the interaction between individual abilities and situational
demands. However, to realize the potential of the advanced psychometric methods
recently developed, these techniques need to be combined with strong theoretical
models.

Area 3: Measurement Concepts and Instruments


This section examines how competence models and psychometric models can be
translated into concrete empirical measurement procedures, with a focus on
computer-based assessment (see also Jurecka, 2008, Chapter 9 in this book).
Competencies are assessed in different educational contexts: in large-scale assessments
(e.g., TIMSS and PISA), in evaluations of specific programs or institutions,
in basic research, and in the assessment of individual qualifications or learning
outcomes. Researchers and stakeholders in educational processes assess student
competencies for purposes of system monitoring, to test the effectiveness of specific forms
of instruction, to give feedback about individual learning progress, or to describe
developments in competencies. For the most part, standardized tests are applied.
However, non-standardized tests and observations of educational processes (e.g.,
teachers' observations in direct interaction with learners) are also common ways
of assessing competencies. Given the complexity of competence constructs, it is
important to adapt and advance these measurement concepts and instruments. They
should be parsimonious, have a firm theoretical foundation, and allow inferences to
be drawn about the mastery of demands in real-life situations.

J o h a n n e s I U r t ! g . K c k h a r d K lie m e . D c tJ e v L e u tn e r : A s s e s s m e n t o f C o m p e t e n c ie s in K d u c a tio n a l C o n te x t s . H o g r e f e P u b li s h in g G m b H . G o ttin g e n 2 0 0 R


2D 0S H o g i c f c P u M i Jiin g G m b H
K e in e u n e r la u b te W c ite r g a b c o d e r V c r v ie lf a tig u n g .

14

E. K liem e. J. Hartig. & D. Rauch

Given the complexity of competence constructs and the need to understand the different
abilities and processes that lead to success in real-life situations, it has become
increasingly important that assessment procedures are based on cognitive models
of competence. An excellent example of empirical assessments based on theoretical
models of competence is the work of the Berkeley Evaluation & Assessment Research
(BEAR) Center, which focuses on the model-based assessment of competencies in science
education (Wilson, 2008, Chapter 5 in this book; Wilson & Draney, 2004; Wilson
& Sloane, 2000). In DESI, a large-scale assessment of language competencies in
Germany, measurement instruments and measurement models were developed on
the basis of cognitive and linguistic models of language competence and language
acquisition (e.g., Beck & Klieme, 2007; DESI-Konsortium, 2008; Nold, 2003; Nold
& Rossa, 2007a, b).
In the context of developing new measurement concepts, it is important not to
overlook innovative measurement procedures, many of which capitalize on new
technologies. Technology-based assessments (TBA) are widely used in educational
settings in the United States and some European countries (Hartig, Krohne, & Jurecka,
2007; Jurecka, 2008, Chapter 9 in this book). This kind of assessment has numerous
advantages: it allows complex stimuli and response formats; interactive testing
procedures, including computerized adaptive testing (CAT; e.g., van der Linden, 2005),
in which the items presented are selected to fit the individual ability level of the test
taker; real-time assessment of cognitive processes (Wirth, 2004; Wirth & Klieme,
2003); and automatized analysis and feedback procedures (Chung, O'Neil, & Baker,
2008, Chapter 12 in this book; Ordinate, 2004; Reeff, 2007), which also make it
possible to assess learning progress (dynamic testing).
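The adaptive item selection mentioned above can be illustrated with a minimal sketch.
It assumes a Rasch model, a maximum-information selection rule, and a grid-based
expected a posteriori (EAP) ability estimate with a standard normal prior. None of
these choices is prescribed by the text, and the item bank, function names, and
responses are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)


def select_next_item(theta, difficulties, administered):
    """Index of the not-yet-administered item with maximum information at theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))


def eap_estimate(responses, difficulties, grid=None):
    """EAP ability estimate on a grid, assuming a standard normal prior.

    responses maps item index -> 1 (correct) or 0 (incorrect).
    """
    if grid is None:
        grid = [g / 10.0 for g in range(-40, 41)]  # -4.0 ... 4.0 in steps of 0.1
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior density
        for i, x in responses.items():
            p = rasch_prob(theta, difficulties[i])
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


# Hypothetical item bank: difficulties on the logit scale (assumption).
bank = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
theta = 0.0      # start at the prior mean
responses = {}   # item index -> scored response
for answer in (1, 1, 0):  # made-up responses, for illustration only
    item = select_next_item(theta, bank, responses)
    responses[item] = answer
    theta = eap_estimate(responses, bank)
```

Under the Rasch model, item information peaks where item difficulty equals the
current ability estimate, so this rule amounts to choosing the unused item whose
difficulty lies closest to that estimate. Operational CAT systems typically add
further requirements such as content constraints (see, e.g., van der Linden, 2005).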
Furthermore, technology-based assessment permits the construction of complex
and interactive stimuli that would be very costly or impossible to realize without the
use of computers. It thus affords the possibility to empirically assess new competence
domains that were not assessable with traditional measurement procedures. Because
technology-based assessment allows the simulation of complex and dynamic
situations, assessment designs can be more valid with respect to the demands of real-life,
complex situations (Drasgow, 2002). The possibility to simulate complex real-life
situations in the assessment situation makes technology-based assessment an
excellent example of assessment of competencies in the sense we defined competencies
above, i.e., as context-specific cognitive dispositions.
The new possibilities afforded by technology-based assessment have been used
in numerous contexts. However, many of these applications are driven by the rapid
development of computer technology rather than by well-founded theories. Much
empirical and theoretical work is still needed to link complex computerized
measurement procedures to cognitive and psychometric models. The assessment of
cross-curricular problem solving in the German national extension of the PISA 2000
study provides an example of the theory-based construction of a computer-based,
interactive assessment application (Klieme, 2004; Klieme, Leutner, & Wirth, 2005).

Area 4: Reception and Usage of Assessment Results


The success of many educational decisions and interventions hinges on accurate
assessments of learners' baseline competencies and learning outcomes. Assessments
may have different practical goals: on an individual level, they allow educators to
select appropriate interventions for individual cases (i.e., to promote individual
learning). The results of individual assessments may also inform decisions on the
admittance to secondary tracks or to higher education. In contrast, assessment
programs that are designed to report achievement on an aggregated level serve to evaluate
educational programs, institutions, or systems, as well as to inform decision makers
on the administrative and political levels. As Pellegrino, Chudowsky, and Glaser
(2001) concluded, "one size of assessment does not fit all" (p. 222). Depending on the
goal of the assessment, different measurement instruments are needed and different
research questions arise. A continuing challenge facing researchers in this area is
thus to determine which models, measurement rules, and measurement procedures
provide the appropriate information for which goal of assessment.
Assessment to promote individual learning can be regarded as formative
evaluation on an individual level. It should allow precise conclusions to be drawn about
individual learning processes and learners' strengths and weaknesses with respect to
specific curricular units. These conclusions can help to support individual instruction
and learning and ideally offer considerable potential to enhance teaching. Teachers
make observations of students' understanding and performance in a variety of ways:
in classroom dialogue, homework assignments, and formal tests (Pellegrino et al.,
2001). These procedures should permit diagnosis on an individual level, in terms of
understanding students' individual solution paths, misconceptions, etc. (Segers et
al., 2003; Wilson, 2008, Chapter 5 in this book). Appropriate individual feedback is
crucial to support the subsequent learning process. A number of research questions
arise in this context: What kind of diagnostic information is best understood by
students, and what kind by teachers? How well can teachers evaluate individual
learning processes? What factors influence teachers' grading decisions? What models of
competence do teachers rely on - implicitly or explicitly? How well founded and
how helpful is the individual student feedback provided by the teacher?
The assessment of individual achievement may also entail the summative
evaluation of an individual's competencies. These evaluations help to determine whether
a student has attained a certain level of competence after completing a particular
phase of education (e.g., in end-of-unit tests or the letter grades assigned at the end
of a course; Pellegrino et al., 2001). These performance measurements are often
high stakes, meaning that their outcomes have significant consequences. Students
who fail to attain certain standards (e.g., passing their final school exams) may be
refused access to the next level. An important question for research on assessment
in this field is how tests can be constructed to reflect educational goals and how
results can be interpreted with reference to curricula (e.g., Cizek, Bunch, & Koons,
2004; Haertel & Lorie, 2004; Klauer & Leutner, 2007; Klieme et al., 2003). A related

14 in this book). Based on these standards, a system has been developed to assess
students' competencies. New evaluation agencies have been founded as part of these
ongoing educational reforms. These agencies assess learning outcomes on both the
classroom and the school level, and provide information for policy makers.
Nowadays the accurate empirical assessment of competencies is regarded as
essential for the enhancement of educational processes and the development of
educational systems. To cope with the theoretical and methodological challenges
highlighted in the previous sections, the German Research Foundation (Deutsche
Forschungsgemeinschaft, DFG) has funded the priority program "Competence
Models for Assessing Individual Learning Outcomes and Evaluating Educational
Processes" (Klieme & Leutner, 2006). The program, which is scheduled to run for
6 years, bundles the German research capacities on competence models: It involves
a network of (currently) 23 individual research projects covering different areas of
competence assessment (Koeppen, Hartig, Klieme, & Leutner, 2008). The program
unites experts in different domains of study with cognitive psychologists and experts
in educational measurement, in order to contribute to research on the development of
theoretical models of competence, the construction of psychometric models, the
construction of measurement as well as research on the use of diagnostic information.

References
Adams, R. (Ed.). (2005). PISA 2003 technical report. Paris: OECD.
Adams, R., & Wu, M. (Eds.). (2002). PISA 2000 technical report. Paris: OECD.
Amelang, M., & Funke, J. (2005). Entwicklung und Implementierung eines kombinierten Beratungs-
und Auswahlverfahrens für die wichtigsten Studiengänge an der Universität Heidelberg.
Psychologische Rundschau, 56, 135-137.
Bandura, A. (1990). Conclusion: Reflections on nonability determinants of competence. In R.
Sternberg & J. Kolligian Jr. (Eds.), Competence considered (pp. 315-362). New Haven, CT: Yale
University Press.
Beck, B., & Klieme, E. (2007). Einleitung. In B. Beck & E. Klieme (Eds.), DESI: Sprachliche
Kompetenzen. Konzepte und Messung (pp. 1-8). Weinheim: Beltz.
Binkley, M. R., Sternberg, R., Jones, S., & Nohara, D. (1999). An overarching framework for
understanding and assessing life skills. Unpublished International Life Skills Survey (ILSS)
Frameworks. Washington.
Bybee, R. W. (1997). Toward an understanding of scientific literacy. In W. Gräber & C. Bolte (Eds.),
Scientific Literacy, an international Symposium (pp. 37-68). Kiel: IPN.
Chen, L. (2004). On text structure, language proficiency, and reading comprehension test format
interactions: a reply to Kobayashi, 2002. Language Testing, 21, 228-234.
Cheng, L., Watanabe, Y., & Curtis, A. (Eds.). (2004). Washback in language testing: research
contexts and methods. Mahwah: Lawrence Erlbaum.
Chomsky, N. (1968). Language and Mind. New York: Harcourt, Brace & World, Inc.
Chung, G. K. W. K., O'Neil, H. F., Bewley, W. L., & Baker, E. L. (2008). Computer-based
assessments to support distance learning. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of
competencies in educational contexts (pp. 253-276). Göttingen: Hogrefe & Huber.
Cizek, G. J. (2001). More unintended consequences of high-stakes testing. Educational Measurement:
Issues and Practice, 20, 19-28.
Cizek, G. J., Bunch, M. B., & Koons, H. (2004). An NCME instructional module on setting performance
standards: Contemporary methods. Educational Measurement: Issues and Practice, 23, 31-50.
Connell, M. W., Sheridan, K., & Gardner, H. (2003). On Abilities and Domains. In R. J. Sternberg
& E. L. Grigorenko (Eds.), The Psychology of Abilities, Competencies, and Expertise (pp. 126-155).
New York: Cambridge University Press.
Csapo, B. (2004). Knowledge and competencies. In J. Letschert (Ed.), The integrated person. How
curriculum development relates to new competencies (pp. 35-49). Enschede: CIDREE/SLO.
DESI-Konsortium (2008). Unterricht und Kompetenzerwerb in Deutsch und Englisch. Ergebnisse
der DESI-Studie. Weinheim: Beltz.
DiBello, L. V., & Stout, W. (2007). Guest editors' introduction and overview: IRT-based cognitive
diagnostic models and related methods. Journal of Educational Measurement, 44, 285-291.
Drasgow, F. (2002). The work ahead: A psychometric infrastructure for computerized adaptive
testing. In C. N. Mills, M. T. Potenza, J. J. Fremer, & W. C. Ward (Eds.), Computer-based
testing: Building the foundation for future assessments (pp. 1-35). Mahwah, NJ: Erlbaum.
Eggen, T. J. H. M. (2008). Adaptive testing and item banking. In J. Hartig, E. Klieme, & D. Leutner
(Eds.), Assessment of competencies in educational contexts (pp. 215-234). Göttingen: Hogrefe
& Huber.
Embretson, S. E. (1983). Construct validity: construct representation vs. nomothetic span.
Psychological Bulletin, 93, 179-197.
Embretson, S. E. (2006). The continued search for nonarbitrary metrics in psychology. American
Psychologist, 61, 50-55.
Fischer, G. H. (1997). Unidimensional linear logistic Rasch models. In W. J. van der Linden & R. K.
Hambleton (Eds.), Handbook of modern item response theory (pp. 225-243). New York, Berlin:
Springer.
Fuhrmann, S. H., & Elmore, R. F. (Eds.). (2004). Redesigning accountability systems for education.
New York: Teachers College Press.
Gold, A., & Souvignier, E. (2005). Prognose der Studierfähigkeit. Ergebnisse aus Längsschnittanalysen.
Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 37, 214-222.
Habermas, J. (1981). Theorie der kommunikativen Kompetenz (Vols. 1 & 2). Frankfurt a. M.:
Suhrkamp.
Haertel, E. H., & Lorie, W. A. (2004). Validating standards-based test score interpretations.
Measurement: Interdisciplinary Research and Perspectives, 2, 61-103.
Hartig, J. (2008). Psychometric models for the assessment of competencies. In J. Hartig, E. Klieme,
& D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 69-90). Göttingen:
Hogrefe & Huber.
Hartig, J., & Höhler, J. (2008). Representation of competencies in multidimensional IRT models
with within-item and between-item multidimensionality. Zeitschrift für Psychologie / Journal of
Psychology, 216, 89-101.
Hartig, J., & Klieme, E. (2006). Kompetenz und Kompetenzdiagnostik. In K. Schweizer (Ed.),
Leistung und Leistungsdiagnostik (pp. 127-143). Berlin: Springer.
Hartig, J., Krohne, U., & Jurecka, A. (2007). Anforderungen an Computer- und Netzwerkbasiertes
Assessment. In J. Hartig & E. Klieme (Eds.), Möglichkeiten und Voraussetzungen
technologiebasierter Kompetenzdiagnostik (pp. 57-67). Berlin: Federal Ministry of Education
and Research.
Hasselhorn, M. (2008). Development of competencies. In J. Hartig, E. Klieme, & D. Leutner (Eds.),
Assessment of competencies in educational contexts (pp. 23-43). Göttingen: Hogrefe & Huber.

Schneewind, K. A., & Pekrun, R. (1994). Theorien der Erziehungs- und Sozialisationspsychologie.
In K. A. Schneewind (Ed.), Psychologie der Erziehung und Sozialisation (pp. 3-40). Göttingen:
Huber.
Schnotz, W., & Preuß, A. (1997). Task-dependent construction of mental models as a basis for
conceptual change. European Journal of Psychology of Education, 12, 185-210.
Schnotz, W., Vosniadou, S., & Carretero, M. (Eds.). (1999). New perspectives on conceptual change.
Oxford: Elsevier.
Segers, M., Dochy, F., & Cascallar, E. (Eds.). (2003). Optimising new modes of assessment: in search
of quality and standards. Dordrecht: Kluwer.
Seidel, T., & Prenzel, M. (2008). Assessment in large-scale studies. In J. Hartig, E. Klieme, & D.
Leutner (Eds.), Assessment of competencies in educational contexts (pp. 279-304). Göttingen:
Hogrefe & Huber.
Shavelson, R. J. (2007). A brief history of student learning assessment: How we got where we are
and a proposal for where to go next. Washington, DC: Association of American Colleges and
Universities.
Shavelson, R. J., Baxter, G. P., & Pine, J. (1991). Performance assessment in science. Applied
Measurement in Education, 4, 347.
Shavelson, R. J., Baxter, G. P., & Pine, J. (1992). Performance assessments: Political rhetoric and
measurement reality. Educational Researcher, 21, 22-27.
Shohamy, E. (1996). Competence and performance in language testing. In G. Brown, K. Malmkjaer,
& J. Williams (Eds.), Performance and Competence in Second Language Acquisition (pp. 136-151).
New York: Cambridge University Press.
Siegler, R. S., & Alibali, M. W. (2005). Children's thinking (4th ed.). Upper Saddle River, NJ:
Prentice Hall.
Simonton, D. K. (2003). Expertise, Competence, and Creative Ability: The Perplexing Complexities.
In R. J. Sternberg & E. L. Grigorenko (Eds.), The Psychology of Abilities, Competencies, and
Expertise (pp. 213-239). New York: Cambridge University Press.
Spiel, C., & Glück, J. (2008). A model-based test of competence profile and competence level in
deductive reasoning. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of competencies
in educational contexts (pp. 45-65). Göttingen: Hogrefe & Huber.
Sternberg, R. J., & Grigorenko, E. (Eds.). (2003). The psychology of abilities, competencies, and
expertise. New York: Cambridge University Press.
Stout, W. (2007). Skills diagnosis using IRT-based continuous latent trait models. Journal of
Educational Measurement, 44, 313-324.
Trier, U. P., & Peschar, J. (1995). Cross-Curricular Competencies: Rationale and Strategy for
Developing a New Indicator. In OECD (Ed.), Measuring what Students Learn (pp. 99-109).
Paris: OECD.
van der Linden, W. (2005). A comparison of item-selection methods for adaptive tests with content
constraints. Journal of Educational Measurement, 42, 283-302.
von Davier, M. (2005). A general diagnostic model applied to language testing data. ETS Research
Report RR-05-16.
von Davier, A. A., Carstensen, K., & von Davier, M. (2008). Linking competencies in horizontal,
vertical, and longitudinal settings and measuring growth. In J. Hartig, E. Klieme, & D. Leutner
(Eds.), Assessment of competencies in educational contexts (pp. 121-149). Göttingen: Hogrefe.
von Davier, M., DiBello, L., & Yamamoto, K. (2008). Reporting test outcomes using models for
cognitive diagnosis. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of competencies in
educational contexts (pp. 151-174). Göttingen: Hogrefe.
Vosniadou, S., Ioannides, C., Dimitrakopoulou, A., & Papademetriou, E. (2001). Designing learning
environments to promote conceptual change in science. Learning and Instruction, 15, 317-419.
Walker, C. M., & Beretvas, S. N. (2003). Comparing multidimensional and unidimensional
proficiency classifications: Multidimensional IRT as a diagnostic aid. Journal of Educational
Measurement, 40, 255-275.
Wang, W.-C., & Wilson, M. (2005). The Rasch testlet model. Applied Psychological Measurement,
29, 126-149.
Weinert, F. E. (1999). Concepts of competence. München: Max-Planck-Institut für Psychologische
Forschung.
Weinert, F. E. (2001). Concept of competence: a conceptual clarification. In D. S. Rychen & L. H.
Salganik (Eds.), Defining and selecting key competencies (pp. 45-65). Seattle: Hogrefe & Huber
Publishers.
White, R. W. (1959). Motivation Reconsidered: The Concept of Competence. Psychological Review,
66, 297-333.
Wilson, M. (2008). Cognitive diagnosis using item response models. Zeitschrift für Psychologie /
Journal of Psychology, 216, 74-88.
Wilson, M., & De Boeck, P. (2004). Descriptive and explanatory item response models. In P.
De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and
nonlinear approach. New York: Springer-Verlag.
Wilson, M., De Boeck, P., & Carstensen, C. H. (2008). Explanatory item response models: a brief
introduction. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of competencies in
educational contexts (pp. 91-120). Göttingen: Hogrefe & Huber.
Wilson, M., & Draney, K. (2004). Some links between large-scale and classroom assessments:
The case of the BEAR Assessment System. In M. Wilson (Ed.), Toward coherence between
classroom assessment and accountability (103rd Yearbook of the National Society for the Study
of Education, Part II). Chicago: University of Chicago Press.
Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system.
Applied Measurement in Education, 13, 181-208.
Wirth, J. (2004). Selbstregulation von Lernprozessen. Münster: Waxmann.
Wirth, J. (2008). Computer-based tests: alternatives for test and item design. In J. Hartig, E. Klieme,
& D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 235-252).
Göttingen: Hogrefe & Huber.
Wirth, J., & Klieme, E. (2003). Computer-based assessment of problem-solving competence.
Assessment in Education: Principles, Policy, and Practice, 10, 329-345.

M. Hasselhorn

Competencies as Individual Preconditions for Successful Learning
During the last few decades the information-processing approach to human learning
and cognition has provided a notably useful theoretical framework for thinking about
learning competencies. This approach has drawn our attention to the components
and mechanisms crucial for the proficiency of our cognitive system. General system
components like attentional capacity, working memory, and metacognition have
been described within this framework. The better these mechanisms work in
learning situations, the better the individual's general competency for learning. Thus, one
might define the functional proficiency of these mechanisms as the individual
preconditions for successful learning. Several years ago, however, Pressley, Borkowski,
and Schneider (1989) reconstructed the characteristics of successful learners within
the information-processing framework and identified the way learners process
information as the crucial factor, thereby concluding that successful learners are those
who intelligently make use of strategies. Strategic processing was thus taken as
a starting point for identifying the individual preconditions necessary or even
sufficient for this kind of "good information processing". Pressley et al. (1989) reported
findings on a number of core characteristics of successful learners, which included
mechanisms in four areas of individual preconditions: (1) the efficiency of selective
attention and working memory during the encoding and processing of information;
(2) the amount and quality of available prior knowledge; (3) the usage and
metacognitive regulation of strategies; and (4) motivational dispositions, with their specific
effects on the intensity and maintenance of engaging in learning activities. More
recently, Corno and Kanfer (1993) discussed a fifth area of individual preconditions
for successful learning. They argued that, although motivational dispositions might
drive students to build up a learning intention and invest effort in a task, motivation
is not the mechanism that enables the initialization and controlled accomplishment
of adequate learning activities. Their conclusion was that, in addition to cognitive
and motivational competencies, successful learning requires volitional dispositions.
The following outline of the main developmental changes and constraints in
students' competencies for successful learning focuses on these cognitive, motivational,
and volitional competencies. The ethical and social components of competencies
mentioned by Weinert (2001) as further areas of relevance are omitted for lack of
space.

this pattern of results to indicate that during the elementary school years the ability
to focus one's attention on relevant information increases, and that from the age of
12 the ability to ignore irrelevant information also emerges. More recent studies have
confirmed that the ability to selectively focus on relevant information increases in
efficiency up to early adulthood (Hasselhorn, Kamm, & Ueffing, 1989).
Similarly, the ability to inhibit irrelevant information during the recall of items
to be processed seems to emerge between the ages of nine and 12 as well, as
demonstrated by studies using the directed forgetting paradigm (Wilson & Kipp, 1998).
In these studies, participants are presented with a list of items accompanied by
instructions to remember some (R-items) and forget others (F-items). In studies that
apply this paradigm using the so-called word method, participants are given specific
instructions to either memorize or forget along with each individual item as it is
successively presented. Research with the word method revealed differences in the
encoding of the R- and F-items, which suggests that the difference between the
number of R- and F-items reproduced is the result of selective encoding (Basden &
Basden, 1998). This ability to selectively encode has been identified in children as
young as seven years of age (Hasselhorn, Hille, & Elster, 1997).
In directed forgetting studies using the list method, the instruction to forget is
given after a longer list of items. Here, subjects also reproduce fewer F- than
R-items. However, the list method guarantees that the F-items and the R-items have
been encoded with comparable intensity, since the instruction to forget (irrelevant)
or memorize (relevant) an item is given after the presentation of a whole list of items.
The difference in performance can therefore be traced back to a retrieval inhibition
(Bjork, 1989). Developmental studies with the list method do not report any
spontaneous directed forgetting effect for children younger than nine years of age, which
has been interpreted as evidence of their retrieval inhibition deficit (Bray, Justice, &
Zahm, 1983; Harnishfeger & Pope, 1996; Lorsbach & Reimer, 1997).
The fact that a directed forgetting effect could not be shown for younger children
using the list method does not necessarily mean that younger children are unable to
inhibit irrelevant information. Zelazo and Frye (1999) suspect that at this age
available strategies are still too limited to actually use the existing inhibition competency.
Indeed, it has been shown that children just starting primary school are able to forget
intentionally if the directed forgetting task is presented in a childlike and motivating
way, and if adequate understanding is ensured by pointing out explicitly that
forgetting is useful in this situation (Hasselhorn & Richter, 2002; Kress & Hasselhorn,
2000).
H owever, a n e c e s s a ry p rereq u isite for properly ap p ly in g selective attention s tra te
gies is an intact w o r k in g m e m o r y system. T h is en ables h u m a n s to store a v a rie ty o f
in fo rm a tio n tem porarily, to keep it available instantaneously, a n d to relate each in d i
vidual piece o f in fo rm a tio n to th e others. T h e tr ip a rtite v i e w o f w o r k in g m e m o r y put
forth b y B a d d eley (1986, 20 0 0 ) prov id es a useful f r a m e w o r k in w h ich to c o n s tr u c t
a detailed p ic tu re o f th e m a jo r c h a n g e s in w o r k in g m e m o r y developm ent. T h is v ie w
c h a ra c te r iz e s w o r k in g m e m o r y as c o m p ris in g a central executive co n tro llin g system


Olver and Greenfield (1971) postulated three different forms of knowledge
representation acquired successively in the course of ontogenesis. According to them,
the first form acquired is an enactive (movement-based) one, which gives way to a
pictorial and later a linguistic-symbolic form of representation. However, the
assumption that different forms of knowledge representation are acquired successively
in the course of development has not proven tenable. Rather, it has been found that
different representational forms exist alongside one another from early childhood
onwards (cf. Krist & Wilkening, 1991; Sodian, 2002).
The basic fact that prior knowledge, when available, facilitates learning leads to the
question of whether this influence changes over the course of human development. If
it does, then experimental studies in which age and prior knowledge are independent
variables should show statistically significant interaction effects. Studies of this kind
have indeed been conducted in which learners were observed from early elementary
school up to adulthood (e.g., Schneider, Gruber, Gold, & Opwis, 1993). None of
these studies was able to identify interactions between the two factors, however,
which would suggest that the influence of prior knowledge is not age-dependent.

Strategies and Their Metacognitive Regulation

The literature on this subject describes the following typical sequence of phases
in the acquisition of a cognitive or metacognitive strategy. A situation in which
the learner does not possess the cognitive preconditions for using the strategy is
described as a mediation deficiency. However, even if the necessary cognitive
preconditions have developed, there is usually a phase in which the learner does not
spontaneously make use of the strategy (production deficiency). An early study by
Moely, Olson, Halwes, and Flavell (1969) on the development of organizational
strategies makes clear how useful it is to differentiate between mediation and
production deficiencies. The authors asked first-grade, third-grade, and fifth-grade students,
among others, to memorize a number of categorically arranged pictures. When the
children were shown the pictures they were told that they could move them around
as much as they wanted to or do whatever else with them that would facilitate
learning and memorization. It was shown that only the fifth-grade students spontaneously
grouped the pictures into categories and then also assembled them in non-random
groups. The third-grade students only used the desired organizational strategy when
the experimenter helped them indirectly by providing names of categories; they did
not use the strategy spontaneously. Even providing names of categories did not help
the first-grade students.
Children of school age often display production deficiencies. This means that
although it is often possible to get them to apply appropriate strategies through
systematic instruction, they still will not produce these strategies spontaneously.
This production deficit impairs independent learning. Depending on the kind of
strategy and amount of practice, the production deficit can be overcome sooner


Defense Mechanisms in the Emergence of the Achievement Motive System


On the path toward a motive system that becomes relatively stable over time from
approximately the twelfth year of life one encounters a series of defense mechanisms
and risk mechanisms. For example, if one asks seven-year-old children to list the best
students in their class, almost all of them include themselves (e.g., Nicholls, 1979).
This over-optimism fulfills an extremely important function: it enables children to
believe that they are capable of achieving anything and doing whatever anyone else
can do simply by exerting enough effort. This fundamental belief leads
school-starters to overestimate their own abilities.
This overly optimistic and unrealistic wishful thinking gives way to a realistic
self-assessment at the age of approximately eight years. The fact that children lose
their over-optimism at this point becomes apparent, for example, in a declining
general enjoyment of learning (Helmke, 1993).
The general predominance of over-optimism in early childhood plays a major role
in the progression of development that competency beliefs follow. Thus, around the
age of four to six years, no concept of ability at all can be identified. This is
apparent, for example, in the fact that children's actual outcomes on a task have almost
no influence on their prognoses of how well they will do on it a second time (cf.
Dweck, 2002). One five-year-old child who had just received a bicycle as a gift and
had already failed 20 times at riding it without training wheels began his twenty-first
attempt convinced that: "This time, I'll succeed!"
Up to the sixth or seventh year of life, social comparisons also seem to be of
relatively little interest to children in general, and of no interest at all when
evaluating their own competencies. This changes very slowly between the ages of six
and eight. In this age range, one begins to see the first domain-specific concepts
of ability which are obviously influenced by the results of previous actions and
which increasingly develop through social comparison (Dweck, 2002). The result
of this development is that excessively optimistic self-assessments held by children
at the beginning of school increasingly give way to more realistic assessments.
Correspondingly, the average value of school-related self-concepts of ability sinks
with increasing age (van Aken, Helmke, & Schneider, 1997).
If one examines the achievement self-assessments of children in the first two years
of school more closely by posing a question such as: "Who are the best students in
your class?" or "How well do you think you will do at the following task?" one
often finds that the over-optimism of girls gives way to a realistic self-assessment
approximately half a year earlier than with boys. Presumably this gender difference
results from a corresponding acceleration in the cognitive development of girls. This
might be interpreted as an indication of girls' cognitive developmental advantage
turning against them into a motivational disadvantage. In other words, with the early
loss of the (naive) belief that they can achieve anything through sheer effort, it may
be that girls are deprived of the motivation to make the extra effort needed to meet
exceptionally high demands.


learning (for further details, see Sodian, 2002). They are left aside, however, in the
following outline.
The main feature of the first turning point is an enormous increase in the
efficiency of working memory, especially of phonological working memory - the
storage system responsible for the processing of verbal and acoustic information. Most
children arrive at this critical moment at some point in the sixth year of life, when
the subvocal rehearsal process or "inner speech" becomes automatically activated
within the phonological loop. Although in principle children are capable of
engaging in inner speech at an even younger age, they do not do so spontaneously or
automatically. With direct guidance, younger children can be induced to engage in
similar inner speech processes, but it is important to remember that this requires a
significant amount of instruction. Without inner speech, the functional capacity of
the phonological working memory is very limited. The phonological store of
working memory has a capacity of only about one and a half to two seconds, after which
it is overwritten.
At the age of approximately eight years - usually somewhat earlier in girls than
in boys - the second turning point in development can be discerned. Its main
characteristic is an overall change in motivational preconditions. The over-optimism of
young children that engenders the feeling of being highly competent is lost. Social
comparisons - above all with peers - increasingly determine self-perception and
self-evaluation. Increasingly realistic self-appraisals are the result. Yet a trace of the young
child's over-optimism remains, visible in the tendency - still perceptible in adulthood
- to distort information in the interests of one's own self-image and overestimate
oneself in relation to the facts (e.g., Taylor & Brown, 1988). This motivational self-defense
becomes very fragile from the ninth year of life. Thus one finds that individuals with
low self-esteem often lack the motivationally beneficial tendency to lift their
self-esteem because they doubt that a more positive self-concept would stand up to reality
(Baumeister, Tice, & Hutton, 1986).
At approximately 10 years of age the child arrives at a third turning point. It is
characterized by the first occurrence of "abstract self-reflection" (Piaget, 1971).
Abstract self-reflection is extremely important for the central executive as well as the
metacognitive facets of learning preconditions. Presumably, it is the same cognitive
process that triggers motivational changes at the age of eight and then brings about a
significant increase in metacognitions between the ages of eight and 10. At this point,
children begin thinking more and more about themselves, their own knowledge, and
their own learning processes. As a consequence, they suddenly begin to
spontaneously employ strategies. The production deficits of many strategies are lost at the end
of elementary school, probably as a result of abstract self-reflection, which leads to
a significant increase in metacognitive knowledge.
In the eleventh or twelfth year of life the individual's performance motive system
stabilizes, characterized by an orientation towards either success or failure. Here we
can identify the fourth turning point in the development of individual preconditions
for successful learning. It occurs in a phase of development characterized by in-


Nicholls, J. G. (1979). Development of perception of own attainment and causal attributions for
success and failure in reading. Journal of Educational Psychology, 71, 94-99.
Nicholls, J. G., & Miller, A. T. (1985). Differentiation of the concepts of luck and skill. Developmental
Psychology, 21, 76-82.
Nisan, M., & Koriat, A. (1977). Children's actual choices and their conception of the wise choice in
a delay-of-gratification situation. Child Development, 48, 488-494.
Piaget, J. (1971). Biology and knowledge. Chicago: Chicago University Press.
Pickering, S. J. (2001). The development of visuo-spatial working memory. Memory, 9, 423-432.
Pickering, S. J., Gathercole, S. E., Hall, M., & Lloyd, S. A. (2001). Development of memory for
pattern and path: Further evidence for the fractionation of visuo-spatial memory. Quarterly
Journal of Experimental Psychology, 54A, 397-420.
Pintrich, P. R., & Zusho, A. (2002). The development of academic self-regulation: The role
of cognitive and motivational factors. In A. Wigfield & J. S. Eccles (Eds.), Development of
achievement motivation (pp. 249-284). San Diego, CA: Academic Press.
Pressley, M., Borkowski, J. G., & Schneider, W. (1989). Good information processing: What it is and
how education can promote it. Journal of Educational Research, 2, 857-867.
Rheinberg, F., Schmalt, H.-D., & Wasser, J. (1978). Ein Lehrerunterschied, der etwas ausmacht.
Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 10, 3-7.
Schneider, W., & Büttner, G. (2002). Entwicklung des Gedächtnisses bei Kindern und Jugendlichen.
In R. Oerter & L. Montada (Eds.), Entwicklungspsychologie (fifth edition, pp. 495-516).
Weinheim: Psychologie Verlags Union.
Schneider, W., Gruber, H., Gold, A., & Opwis, K. (1993). Chess expertise and memory for chess
positions in children and adults. Journal of Experimental Child Psychology, 56, 328-349.
Schumann-Hengsteler, R. (1995). Die Entwicklung des visuell-räumlichen Gedächtnisses. Göttingen:
Hogrefe.
Shoda, Y., Mischel, W., & Peake, P. K. (1990). Predicting adolescent cognitive and self-regulatory
competencies from preschool delay of gratification. Developmental Psychology, 26, 978-986.
Siegel, L. (1994). Working memory and reading: A lifespan perspective. International Journal of
Behavioural Development, 17, 109-124.
Siegler, R. S. (1996). Emerging minds: The process of change in children's thinking. New York, NY:
Oxford University Press.
Skinner, E. A. (1995). Perceived control, motivation, and coping. Newbury Park, CA: Sage.
Skinner, E. A., Zimmer-Gembeck, M. J., & Connell, J. P. (1998). Individual differences and the
development of perceived control. Monographs of the Society for Research in Child Development,
63 (2-3, Serial No. 254).
Sodian, B. (2002). Entwicklung begrifflichen Wissens. In R. Oerter & L. Montada (Eds.),
Entwicklungspsychologie (fifth edition, pp. 443-468). Weinheim: Psychologie Verlags Union.
Taylor, S. E., & Brown, J. D. (1988). Illusion and well-being: A social-psychological perspective on
mental health. Psychological Bulletin, 103, 193-210.
Trudewind, C., & Kohne, W. (1982). Bezugsnormorientierung des Lehrers und Motiventwicklung:
Zusammenhänge mit Schulleistung, Intelligenz und Merkmalen in der häuslichen Umwelt in der
Grundschule. In F. Rheinberg (Ed.), Jahrbuch für Empirische Erziehungswissenschaft 1982 (pp.
115-142). Düsseldorf: Schwann.
van Aken, M. A. G., Helmke, A., & Schneider, W. (1997). Selbstkonzept und Leistung - Dynamik
ihres Zusammenspiels: Ergebnisse aus dem SCHOLASTIK-Projekt. In F. E. Weinert & A.
Helmke (Eds.), Entwicklung im Grundschulalter (pp. 341-350). Weinheim: Beltz.
Weiner, B. (1979). A theory of motivation for some classroom experiences. Journal of Educational
Psychology, 71, 3-25.
Weinert, F. E. (2001). Concept of competence: A conceptual clarification. In D. L. Rychen & L. H.
Salganik (Eds.), Defining and selecting key competencies (pp. 45-65). Göttingen: Hogrefe.


of 12- to 18-year-old students. Based on their competence profile (item response
patterns), the latent classes can be described as concrete-operational stage,
intermediate stage, advanced intermediate stage, and formal-operational stage (Spiel et al.,
2001, 2004). Thus, the DRV provides a state-of-the-art concept and instrument for
diagnosis in educational contexts. To enable flexible use, immediate performance
feedback and many other advantages, we are currently preparing a computer version
of the DRV. In general, the use of e-assessment in educational contexts is perceived
as a very promising developing area.
This paper introduces the DRV, its theoretical background and psychometric
properties. In particular, we present various analyses of the validity of the DRV. In
addition, the computer version of the DRV - still under development - is briefly described.

Theoretical Background
Deductive Reasoning
Reasoning is a process of thought that produces conclusions on the basis of
perceptions, thoughts, or assumptions. Deduction yields valid conclusions that must be true
given that their premises are true (Johnson-Laird, 1999). The most prominent theory
of deductive reasoning is Piaget's cognitive-developmental theory (Piaget, 1971).
According to this theory, children move through four developmental stages which
differ qualitatively in the cognitive processes that they can handle: the sensorimotor,
the preoperational, the concrete-operational and the formal-operational. The
transitions between these stages occur around ages 2, 6, and 12. According to Piaget, the
transition from the concrete-operational to the formal-operational stage is the most
important step in the development of deductive reasoning. Thus, the DRV focuses on
the transition from concrete-operational to formal-operational thinking.
According to Piaget, there are two major characteristics of formal-operational
thinking in comparison to concrete-operational thinking: hypothetico-deductive
reasoning and propositional thought (Inhelder & Piaget, 1958; Piaget, 1972; see also
Berk, 1993). When faced with a problem, formal-operational individuals think of all
possible factors that could affect the outcome. They are also able to take into account
factors that are not immediately suggested by concrete features of the situation. In
addition, formal-operational individuals can evaluate the logic of statements by
reflecting on the statements themselves; they do not need to relate them to real-world
circumstances.
However, as empirical results show, these theoretical classifications reflect what
is possible rather than what is typical of adolescents. When presented with
conditional reasoning problems, adolescents and even adults rarely spontaneously
distinguish deductive arguments from non-deductive arguments (Bullock & Sodian, 2003;


to all studies, thematic (i.e., familiar and concrete) conditionals are easier to solve
than abstract and counterfactual ones. Based on Piaget (Inhelder & Piaget, 1977),
counterfactual tasks should be best suited to prove whether participants are able to
distance themselves from real-world knowledge. However, the empirical evidence
is mixed: while some studies found differences in difficulty between abstract and
counterfactual tasks (Ziegler, 1996), others found no such differences (Markovits &
Vachon, 1989; Roberge & Paulus, 1971).
Mode of Presentation of the Antecedent
The antecedent can be presented with ("If the sun does not shine, Michaela wears a
red T-shirt") and without negation (see example above). While negations are, in
principle, comprehensible for concrete-operational individuals (Gray, 1990; Moshman,
1977), during the transition to the formal-operational stage negations can hinder the
transformation of competence into performance (Moshman, 1977). Empirical
investigations show systematic increases in task difficulty when negations were used
in antecedents, while no effects of negations in the consequent were observed
(Moshman, 1979; Roberge & Mason, 1978; Wildman & Fletcher, 1977).

The Competence Profile Test Deductive Reasoning - Verbal: DRV

Description of the DRV
The DRV assesses competence profile and competence level in deductive reasoning
from the age of about 10 years and up. By analyzing profile and level, the DRV
measures both qualitative and quantitative facets of deductive reasoning performance
(Spiel et al., 2001, 2004). The DRV classifies individual status in the course of the
transition from the concrete-operational stage to the formal-operational stage as
proposed by Piaget (1971).
The DRV consists of 24 single items (6 different premises x 4 types of inference).
The six premises were constructed based on the literature on moderator variables
of syllogistic reasoning, systematically varying the content of the conditionals
(concrete, abstract, and counterfactual) and the mode of presentation of the antecedent
(with and without negation) (see Table 2; Spiel et al., 2001, 2004; Spiel, Gößler, &
Glück, in press).
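The 6 x 4 item design, together with the response key described in the next paragraph, can be illustrated with a short sketch. It is not part of the DRV materials. The labels, the names `INFERENCE_FORMS` and `correct_answer`, and the assumption that every question asks about the affirmative proposition are hypothetical choices made only for this illustration. For a conditional "if p then q", the sketch checks every truth assignment that is consistent with the premise and the given statement, and answers "yes", "no", or "perhaps" accordingly.

```python
from itertools import product

# Design factors as named in the text; the short labels are illustrative.
CONTENTS = ("concrete", "abstract", "counterfactual")
ANTECEDENTS = ("without negation", "with negation")
INFERENCES = ("MP", "MT", "NA", "AC")

# For a conditional "if p then q": the statement given to the respondent
# and the proposition asked about, each as (variable, truth value).
# Assumption: every question asks whether the affirmative proposition holds.
INFERENCE_FORMS = {
    "MP": (("p", True), ("q", True)),    # modus ponens: p holds; does q hold?
    "MT": (("q", False), ("p", True)),   # modus tollens: q fails; does p hold?
    "NA": (("p", False), ("q", True)),   # negation of antecedent
    "AC": (("q", True), ("p", True)),    # affirmation of consequent
}


def correct_answer(inference):
    """Derive the logically correct response by checking all truth assignments."""
    (given_var, given_val), (asked_var, asked_val) = INFERENCE_FORMS[inference]
    outcomes = set()
    for p, q in product((True, False), repeat=2):
        world = {"p": p, "q": q}
        conditional_holds = (not p) or q  # material conditional "if p then q"
        if conditional_holds and world[given_var] == given_val:
            outcomes.add(world[asked_var] == asked_val)
    if outcomes == {True}:
        return "yes"
    if outcomes == {False}:
        return "no"
    return "perhaps"  # the premises leave the question undetermined


# 3 contents x 2 antecedent modes = 6 premises; 6 premises x 4 inferences = 24 items.
PREMISES = list(product(CONTENTS, ANTECEDENTS))
ITEMS = [(content, mode, inference, correct_answer(inference))
         for (content, mode), inference in product(PREMISES, INFERENCES)]
```

Under these assumptions the derived key is the same for all six premises, because content and negation change only what p and q stand for, not the logical form of the inference.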
Each item consists of the premise, a statement and a question (inference), followed
by three possible answers: "yes" (correct answer for modus ponens items), "no"
(correct answer for modus tollens items) or "perhaps" (correct answer for negation of
antecedent and affirmation of consequent items). In addition, to avoid guessing, the


[Figure 1 not reproduced. Vertical axis: 0% to 100%. Series: content of conditional (Concrete, Abstract, Counterfactual). Horizontal axis: type of inference (MT/MP, NA/AC) within each developmental stage (Concrete, Intermediate, Advanced, Formal-Operational).]

Figure 1. Averaged solution probabilities in the three latent classes identified by
MRMs and in the theoretically postulated fourth class (Formal-Operational Stage)
by type of inference (MT/MP: biconditional; NA/AC: fallacies) and content of
conditional.

Advanced Intermediate Stage

Advanced intermediate participants performed better in the fallacies than in the
biconditionals for all items, independent of content. This suggests that they generalized
their preliminary understanding of deductive reasoning to abstract and
counterfactual content. However, their insight that "perhaps" can be the correct solution to
some syllogisms seems to have led to increased use of this response alternative for
MP and MT items also (see Figure 1).
Formal-Operational Stage
Formal-operational individuals are expected to show high solution probabilities for
all items independent of content and mode of presentation of the antecedent. However,
in all empirical investigations we have conducted until now, this group has been too
small to reliably estimate the relevant parameters. Therefore, the assignment to this
group has been done theoretically and not empirically.


the three classes with respect to grade: 27.0% of the advanced intermediate participants were only in the 7th grade, while 18.8% of the concrete-operational participants were in the 12th grade. Thus, the levels of deductive reasoning performance are only partly age- and grade-dependent.
Relation Between DRV Performance and Marks in Mathematics
As deductive reasoning competence is assumed to be a necessary prerequisite for understanding and solving problems in mathematics, we expected a significant relationship between marks in mathematics and DRV performance. A 2 x 2 analysis of variance with the factors of grade (7th to 12th) and cognitive stage (concrete-operational, intermediate, advanced intermediate) showed a significant effect of cognitive stage, F(2, 398) = 5.285, p = .005, no grade effect, and no interaction. In the Austrian grading system, with 1 being the best mark and 5 being the worst mark, concrete-operational students had an average mathematics mark of 3.13 (SD = .97), intermediate students, 2.88 (SD = 1.02), and advanced intermediate students, 2.83 (SD = 1.18). Again, the difference was largely between the concrete-operational and the two intermediate stages. Concrete-operational students had poorer (numerically higher) marks in mathematics than the other two groups.
From a theoretical point of view, relations between competence in deductive reasoning and mathematics can only be expected once students have reached an advanced level in deductive reasoning. To test this assumption, we calculated the Spearman correlation coefficient between mathematics marks and deductive reasoning performance within each latent class. Results supported the assumption. Correlation coefficients were -.05 (p = .521) in the concrete-operational stage, -.10 (p = .245) in the intermediate stage, and -.29 (p = .001) in the advanced intermediate stage.
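
The within-class analysis described above can be illustrated with a minimal Python sketch that computes a Spearman rank correlation separately for each latent class. The data, the class labels, and all function names are made up for illustration; the study's data and the software the authors used are not reproduced here. Because 1 is the best mark in the Austrian system, a negative coefficient means that better reasoning performance goes along with better marks.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_by_class(records):
    """records: iterable of (latent_class, mathematics_mark, reasoning_score)."""
    grouped = defaultdict(lambda: ([], []))
    for latent_class, mark, score in records:
        grouped[latent_class][0].append(mark)
        grouped[latent_class][1].append(score)
    return {c: spearman(marks, scores) for c, (marks, scores) in grouped.items()}


# Hypothetical illustration data (class, mark 1-5, reasoning score).
records = [
    ("concrete", 3, 4), ("concrete", 2, 5), ("concrete", 4, 6), ("concrete", 3, 3),
    ("advanced", 1, 20), ("advanced", 2, 18), ("advanced", 3, 15), ("advanced", 4, 11),
]
result = spearman_by_class(records)
```

Computing the coefficient per class, rather than once over the pooled sample, is what allows the relationship to differ between stages, which is the pattern the authors report.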

The Computer Version of the DRV


In cooperation with the Schuhfried company, we are currently developing a computer version of the DRV (Spiel et al., in press). The first version of the computer-based DRV is ready to test. However, no data are available yet.
The electronic DRV (e-DRV) is largely parallel to the paper and pencil version. It is presented on the PC (mouse-controlled) with no time limit. After brief instructions about mouse use and general information about the task, a simple item is presented. If the item is answered correctly, a second more difficult item (AC item) is presented. Here, the difficulty of fallacies is explicitly pointed out. The screen for the e-assessment of the first item of the e-DRV looks as follows (see Table 3). Generally, the next item is only presented after the participant has selected one of the possible answers, and participants cannot return to earlier items.




J. Hartig

pertinent domain, and that are used to construct ability scores from observed test behavior. The demands on these models constitute the main focus of this chapter. It aims to principally illustrate different possibilities of psychometric modeling suitable for competence constructs, using relatively simple models as illustrations and presenting them in a common notation. The chapter deals with the implications of the context-specific nature of competence constructs for adequate psychometric models for competence assessment. First, some general demands on psychometric models for competence constructs are derived. Subsequently, different possible effects of task demands on item responses are conceptually illustrated. After that, general features that characterize different psychometric models are described. Subsequently, selected examples of psychometric models are used to illustrate differences between models. Based on the general model characteristics and the examples, criteria for the selection of an appropriate model are discussed.

Psychometric Models for Context-Specific Constructs


Educational measurement aims to make inferences from observed behavior in a test situation (e.g., solving a reading test item) to attributes of test takers (e.g., reading competence). Without further going into the details of specific requirements for test content and item construction that are adequate for the assessment of competence constructs, some assumptions regarding the test content and the test items are essential for the considerations in this chapter:
1) The content of the test items adequately represents the real-life situations that are relevant for the competence construct that is measured.
2) The specific demands of the test items represent the actual demands in the relevant real-life situations.
3) The individual test items are explicitly characterized in terms of different demands that need to be mastered by the test taker to solve specific items.
Assumptions (1) and (2) are aspects of content validity. Assumption (3) is not necessarily fulfilled for assessment items used in empirical research. However, if (1) and (2) are fulfilled, the explicit description of test items should generally be feasible. This description is needed, as will be shown later in this chapter, to incorporate specific task demands into psychometric models, and to empirically examine effects of different task demands.
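
One common way to write down the explicit description of items required by assumption (3) is a binary item-by-demand matrix, often called a Q-matrix. The following Python sketch shows such a coding; the item names, the demand labels, and the particular pattern of ones and zeros are hypothetical and serve only to show the data structure.

```python
# Hypothetical task demands and items; the coding below is an illustration only.
DEMANDS = ["a", "b", "c"]

# One row per item: 1 = the item poses this demand, 0 = it does not.
Q_MATRIX = {
    "item1": [1, 0, 0],
    "item2": [1, 1, 0],
    "item3": [1, 1, 1],
}


def demands_of(item):
    """Return the labels of the demands that an item poses."""
    return [d for d, q in zip(DEMANDS, Q_MATRIX[item]) if q == 1]


def items_with(demand):
    """Return all items that pose a given demand."""
    k = DEMANDS.index(demand)
    return [item for item, row in Q_MATRIX.items() if row[k] == 1]
```

A coding of this kind is what a psychometric model can later take as input, either to explain item difficulties or to define which latent variables an item depends on.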
Given the context-specific nature of competence constructs, competence can be characterized as the result of the successful interaction of individual abilities with specific situational demands. Given valid test and item content, a psychometric model intends to model successful behavior in test situations while taking into account relevant abilities of the acting individual as well as the relevant task demands. In this chapter, the term task demand refers to any characteristic of a test item that


items 1 and 2. In technical terms, the task demand would affect the dimensionality of the items. An additional ability dimension would probably be needed to account for the specific variation caused by the specific demands of item 3. In substantive terms, the ability to master demand (c) would be relatively unrelated to the abilities to master demands (a) and (b). It could be regarded as a different ability which is needed additionally to master item 3. The interpretation would be "different abilities are required to master item 3, due to the specific demands of this item". If differences in task demands go along with differentiated correlations between items (as in case 2) and differences in difficulties (as in case 1), the interpretation of this pattern will focus on the dimensionality and probably disregard the item difficulties. Differences in difficulties are only meaningful if the items are supposed to measure the same abilities.
A third possibility in this hypothetical mini-example would be that the task demand has no observable effect at all: items 1, 2, and 3 could turn out to be equally difficult and to have equal correlations among each other. In this case, the task demand would appear to be irrelevant for observable test performance; the assumptions regarding its relevance would have to be revised.
In real assessment situations, typically a large number of items is answered by the test takers, and several task demands may be defined to describe these items. The possible effects of task demands outlined above (effects on difficulty and dimensionality) can be accounted for systematically in different ways in different psychometric models. The next section describes basic features of psychometric models and the substantial implications of these features for the interaction between the abilities of test-takers and task demands.

Elements and Characteristics of Psychometric Models


The psychometric models applied in educational measurement describe how the abilities that are measured are assumed to relate to the test behavior. Based on these assumptions, psychometric models provide rules for constructing test scores that provide estimates for the individual abilities of interest. Different models have different implications as to what type of scores are constructed as results from a test, and how these scores are constructed. For instance, specific models result in single scores on continuous competence scales, characterizing individuals in terms of low to high competence. Other models result in individual profiles, describing individuals with scores in several different dimensions. Other models result in classifications of individuals, e.g. representing specific types of learners.
Modern psychometric models used in educational measurement belong to the large class of statistical models with latent variables. Technically speaking, the structure of observed responses is modeled by assuming one or more random variables that


in multidimensional data structures, multidimensional models are required to adequately represent the abilities underlying successful performance in different tasks. In these cases, task demands are used to define the loading structure of items on multiple latent variables.

Scale of Latent Variables


The models most frequently applied in educational measurement contain continuous latent variables, i.e. the measured abilities are represented by continuous dimensions. With continuous latent variables, learning and mastery of educational content is described as a continuum from low to high levels of proficiency. Psychometric models can, however, also contain latent variables that are categorical. With abilities represented by categorical latent variables, individuals are grouped into categories of a latent variable. These groups, which are defined by the categories of a categorical latent variable, are referred to as latent classes. Models containing categorical latent variables are referred to as latent class models or mixture models, the latter because observed responses are modeled as resulting from a mixture of group-specific response patterns.
The scale of the latent variables has substantive implications for the abilities that are measured. While the description of abilities on continuous dimensions can be characterized as a continuum view of mastery, latent classes can be adequate for a state view of mastery, which only distinguishes between the mastery and non-mastery of specific content (Meskauskas, 1976). In line with the state view of mastery, latent classes can represent ordered categories, e.g. mastery or non-mastery of specific demands. Latent classes can also be used to represent typologies of abilities. In this case, the latent classes do not need to be ordered; individuals are assigned to groups with qualitatively different patterns of abilities. Besides the substantive implications, the scale of the latent variable determines the type of score constructed as a result of the assessment. While continuous latent variables result in continuous scores or score profiles for each individual, latent class models result in a classification of each individual to a specific group.
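
To make the idea of a classification as the test result concrete, the following Python sketch assigns one response pattern to the most probable of two latent classes using Bayes' rule. It assumes the usual latent class setup in which item responses are independent within a class; the class labels, class sizes, and class-specific solution probabilities are invented for illustration and are not estimates from any data set.

```python
def class_posteriors(responses, class_sizes, solution_probs):
    """Posterior probability of each latent class for one response pattern.

    responses:      list of 0/1 item responses
    class_sizes:    dict mapping class -> prior proportion (summing to 1)
    solution_probs: dict mapping class -> list of P(correct) per item
    """
    joint = {}
    for c, prior in class_sizes.items():
        likelihood = 1.0
        # Assumes responses are independent given class membership.
        for x, p in zip(responses, solution_probs[c]):
            likelihood *= p if x == 1 else (1.0 - p)
        joint[c] = prior * likelihood
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}


def classify(responses, class_sizes, solution_probs):
    """Return the class with the highest posterior probability."""
    post = class_posteriors(responses, class_sizes, solution_probs)
    return max(post, key=post.get)


# Hypothetical two-class (state view) example with three items.
class_sizes = {"non-master": 0.5, "master": 0.5}
solution_probs = {"non-master": [0.2, 0.2, 0.2], "master": [0.8, 0.8, 0.8]}

post = class_posteriors([1, 1, 0], class_sizes, solution_probs)
label = classify([1, 1, 0], class_sizes, solution_probs)
```

The reported result is the class label alone, so two individuals with different response patterns but the same assigned class are not distinguished any further.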
Apart from models with only categorical or only continuous latent variables, there also exist psychometric models containing mixtures of both (e.g., Rost, 1990; Rost & von Davier, 1995; Wilson, 1998; see also Spiel & Glück, 2008, and von Davier et al., 2008; Chapters 3 and 7 in this book). This chapter will, however, only illustrate models that contain either continuous or categorical latent variables.

Link Functions for Multiple Latent Variables


If the model for a competence construct incorporates multiple abilities, and if the probability of getting an item right depends on more than one ability, the required


for predictors of item difficulty. Thus, effects of task demands on dimensionality and on difficulty can both be specified separately. Embretson's (1984) general component latent trait model (GLTM) constitutes another model that combines multiple dimensions with effects of task demands on item difficulties. While the multidimensional Rasch model and the multidimensional LLTM are compensatory models, the ability dimensions are combined in a noncompensatory (multiplicative) function in the GLTM. In the GLTM, one set of task demands (ability components) is defined, which defines the dimensionality of the items, and one separate difficulty component is estimated for each item on every dimension.
If the multidimensional Rasch model or another multidimensional IRT model is used as a measurement model, an ability profile $\theta_p$ can be estimated for each individual, containing scores for each dimension in the model (e.g., $\theta_p = \{0.2, 1.1, 0.3, 0.5\}$ in a model with four dimensions). Compared to a unidimensional model, this profile can provide more differentiated diagnostic information. Individuals are not only described with respect to an overall level of ability, but also with respect to individual strengths and weaknesses in different dimensions.
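
The contrast drawn above between compensatory and noncompensatory (multiplicative) combinations of abilities can be sketched in Python as follows. The two functions use generic textbook forms, one logistic function of a sum versus a product of logistic terms with one difficulty per dimension; they are assumptions for illustration and not a reproduction of the chapter's model equations, which are not part of this excerpt. All names and values are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def compensatory_prob(theta, q, beta):
    """Additive combination: required abilities are summed before one
    item difficulty is subtracted, so a high ability can offset a low one."""
    return logistic(sum(q_k * t_k for q_k, t_k in zip(q, theta)) - beta)


def noncompensatory_prob(theta, q, betas):
    """Multiplicative combination: one logistic term per required ability,
    each with its own difficulty, so a low ability cannot be offset."""
    prob = 1.0
    for q_k, t_k, b_k in zip(q, theta, betas):
        if q_k == 1:
            prob *= logistic(t_k - b_k)
    return prob


# Hypothetical person with one high and one low ability; the item requires both.
theta = [2.0, -2.0]
q = [1, 1]
p_comp = compensatory_prob(theta, q, beta=0.0)
p_noncomp = noncompensatory_prob(theta, q, betas=[0.0, 0.0])
```

With these made-up values the additive form gives a solution probability of one half, while the multiplicative form stays below the probability implied by the weaker ability alone, which is the practical meaning of "noncompensatory".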

The DINO Model
The DINO model ("deterministic input, noisy or"; Templin & Henson, 2006; see also Templin, 2006) is a latent class model that contains one dichotomous latent ability for each task demand in the model, i.e. $\theta_{pk} \in \{0, 1\}$. As in the multidimensional Rasch model, there is a corresponding latent variable for each task demand (only the index $k$ is used for both latent variables and task demands). Thus, an individual $p$ either masters a task demand $k$ ($\theta_{pk} = 1$) or he or she doesn't ($\theta_{pk} = 0$). The total ability profile $\theta_p$ of an individual is represented by a vector of dichotomous elements, defining which demands an individual does master and which he or she does not. The DINO model contains one latent class for each possible pattern of abilities $\theta_p$, i.e. $2^K$ classes, where $K$ is the number of task demands. A model with four task demands, for instance, contains 16 latent classes for all combinations of abilities from $\theta_p = \{0, 0, 0, 0\}$ to $\theta_p = \{1, 1, 1, 1\}$.
The DINO model contains a dichotomous latent solution probability $\omega_{pi}$ that only takes values of zero and one ($\omega_{pi} \in \{0, 1\}$; hence the denotation "deterministic input"). The DINO model is a compensatory model; abilities are combined in a disjunctive function:

$$\omega_{pi} = 1 - \prod_{k=1}^{K} \left(1 - \theta_{pk}\right)^{q_{ik}} \qquad (3)$$

In Equation (3), $\theta_{pk}$ is a dichotomous variable indicating if individual $p$ possesses ability $k$ ($\theta_{pk} = 1$) or not ($\theta_{pk} = 0$). The Q-matrix element $q_{ik}$ indicates whether item $i$ requires ability $k$ ($q_{ik} = 1$) or not ($q_{ik} = 0$). If any of the abilities required for the item
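
The deterministic part of the DINO model, Equation (3), together with the enumeration of all dichotomous ability profiles, can be sketched in Python as follows. Only the disjunctive rule is covered; the "noisy" part of the model (its item parameters) is not shown in this excerpt and is therefore left out. The function names and the example Q-matrix row are hypothetical.

```python
from itertools import product


def dino_omega(theta, q):
    """Equation (3): 1 - prod_k (1 - theta_k) ** q_k.

    theta: tuple of 0/1 abilities of one individual
    q:     tuple of 0/1 Q-matrix entries of one item
    """
    prod = 1
    for theta_k, q_k in zip(theta, q):
        prod *= (1 - theta_k) ** q_k  # Python evaluates 0 ** 0 as 1
    return 1 - prod


def ability_patterns(n_demands):
    """All 2 ** n_demands dichotomous ability profiles, one per latent class."""
    return list(product([0, 1], repeat=n_demands))


patterns = ability_patterns(4)
q_item = (1, 0, 1, 0)  # hypothetical item requiring abilities 1 and 3
omegas = {theta: dino_omega(theta, q_item) for theta in patterns}
```

As the formula is written, the product becomes zero as soon as one ability with $q_{ik} = 1$ is mastered, so $\omega_{pi}$ is 1 for every profile that contains at least one of the required abilities and 0 otherwise.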


these assumptions, models incorporating within-item multidimensionality may be more appealing.

Scale of the Latent Variables


The decision between continuous and categorical latent variables can be regarded as bearing substantial implications for the nature of the abilities that are measured. The choice of the scale of the latent variables can be made with reference to theoretical assumptions regarding the constructs that are measured. Are the interindividual differences that are measured gradual (e.g., from low to high knowledge), or are distinct groups of individuals supposed to exist (e.g., students that do understand a particular concept vs. students that do not)? Or does it seem appropriate to characterize differences between individuals by qualitatively different groups (e.g., types of learners)? While a categorical measurement of mastery may be more appropriate for low-level skills or highly specialized knowledge, the modeling of continuous dimensions may be more suitable for higher skills and more broadly defined proficiency constructs. In many applications, however, the decision between continuous or categorical scales will not be unambiguous. De Boeck, Wilson, and Acton (2005) provide and illustrate a formalized framework to distinguish between continuous and categorical constructs, while they characterize this distinction as gradual.
Apart from substantive aspects, the decision for continuous or categorical constructs can be based on rather technical considerations related to the intention of the measurement. If the assessment primarily aims at a classification of individuals, e.g. in a pass/fail type of test, a latent class model may be more appropriate than a model with a continuous dimension. Categorical measurement, be it motivated by theoretical or practical considerations, will provide a classification of individuals as a result of the measurement. It should be kept in mind that this implies homogeneity within the categories, so there is no diagnostic information to distinguish between individuals within the same category. If abilities are modeled as categorical constructs, the assumption of this homogeneity has to be either well justified by theoretical arguments, or explicitly declared as irrelevant for the purpose of an assessment.

Compensatory vs. Noncompensatory Models


If multiple ability dimensions are supposed to underlie performance in a given domain, and if the test items are supposed to require multiple abilities to solve (within-item multidimensionality), we must decide whether the interaction of multiple abilities is compensatory or noncompensatory by nature. This decision has to be made on the basis of careful substantive considerations, since different functions to combine multiple latent variables have substantial implications for the meaning of these variables themselves. All other characteristics of a psychometric model being


J. Hartig

Stout, W. (2007). Skills diagnosis using IRT-based continuous latent trait models. Journal of Educational Measurement, 44, 313-324.
Templin, J. (2006). CDM: Cognitive diagnosis modeling with Mplus. User guide. Retrieved November 2006 from http://www.iqgrads.net/jtemplin/downloads/CDM_user_guide.pdf
Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11, 287-305.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
Van den Noortgate, W., De Boeck, P., & Meulders, M. (2003). Cross-classification multilevel logistic models in psychometrics. Journal of Educational and Behavioral Statistics, 28, 369-386.
von Davier, M., DiBello, L., & Yamamoto, K. (2008). Reporting test outcomes using models for cognitive diagnosis. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 151-174). Göttingen: Hogrefe.
Wilson, M. (1989). Saltus: A model of discontinuity in cognitive development. Psychological Bulletin, 105, 276-289.
Wilson, M., & De Boeck, P. (2004). Descriptive and explanatory item response models. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (pp. 43-74). New York: Springer.
Wilson, M., De Boeck, P., & Carstensen, C. H. (2008). Explanatory item response models: A brief introduction. In J. Hartig, E. Klieme, & D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 91-120). Göttingen: Hogrefe & Huber.


M. Wilson, P. De Boeck, & C. H. Carstensen

the German PISA sample of 15-year-old students in the 2003 administration. The test was developed under the same general guidelines as the PISA mathematics test, where mathematical literacy is a described variable with several successive levels of sophistication in performing mathematical tasks. These levels are shown in Table 1. There are 881 students in total.
The test booklet contained 64 dichotomous items; 18 of these items were selected for this example. Two examples of items are shown in Figure 1. Each item was constructed according to topic area and type of mathematical modeling required. The topic areas were arithmetic, algebra and geometry. The modeling types were technical processing, numerical modeling and abstract modeling (see Neubrand & Neubrand, 2003). The technical processing dimension required students to carry out operations that had been rehearsed, such as computing numerical results using standard procedures. Numerical modeling requires students to construct solutions to problems with given numbers in one or more steps. In contrast, abstract modeling requires students to formulate rules in a more general way, for example, by giving an equation or by describing a general solution in some way.
(Technical Processing in Algebra)
Function
The function given by the equation y = 2x - 1 shall be analysed.
a) Fill in the missing values. [Value table; printed entries: -2, -1, 3, 19]

(Abstract Modelling in Arithmetic)
Difference
Put the digits 3, 6, 1, 9, 4, 7 in the boxes so that the difference between the two three-digit numbers is maximised. (Each digit may be used only once.)
1. number: [ ] [ ] [ ]
2. number: [ ] [ ] [ ]

Figure 1. Two examples of items from the German Mathematical Literacy Test.


the data. The models we will discuss later in this chapter are all based on a one-step procedure with a direct modeling of the effect of external variables.
One can take a totally different view on the data matrix than that described above. This is the view that is most commonly taken for experimental and longitudinal designs. From this more common perspective one would look at the columns of the data matrix to investigate the relation with covariates of the repeated observations, such as manipulated factors in an experimental study, and time and time-covariates in a longitudinal study. In our example this would mean that we want to find out what the effect is of topic area (i.e., arithmetic, algebra, geometry) and modeling type (i.e., T = technical processing, N = numerical modeling, A = abstract modeling), without having much interest in the measurement of individuals. Thus the same data can be used for this other purpose, for which measurement concerns are less important, but are not ignored.
We will call the covariates of the repeated observations item properties because they relate to the items in the test. Item properties are either manipulated within-subject factors, such as in our example, or they relate to an unplanned variation of the items. When the test is intentionally constructed on the basis of these item properties, they, and the way they are combined in items, might be considered the elements of the "test blueprint." An example of unplanned variation might be that which is based on properties derived from a post-hoc content analysis of the items in an extant test.
The design shown in Table 2 is the basis for looking at the item side of the data matrix to answer questions regarding the effect of the covariates. In the rest of this paragraph we note some patterns that can be observed on the item side of the matrix. First, consider topic areas. The mean scores across the three topic areas are: arithmetic = .59, algebra = .41, geometry = .47. This makes sense, as students learn arithmetic before they learn algebra and geometry; hence, one would expect the arithmetic items to be easier. The geometry items are somewhat easier than the algebra items, which is less predictable, though not surprising. The mean scores across the modeling types are: technical processing = .59, numerical modeling = .44, abstract modeling = .44. Again, it makes sense that the mean for technical processing is higher than the other two as, by its description, it is a less complex set of skills. Given the general description above, one might have expected that abstract modeling would be somewhat more difficult for the students than numerical modeling, but indeed this is a matter that can be moderated through the complexities of the items.
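
As a rough illustration of how such item-side means are obtained, the following Python sketch averages a persons-by-items matrix of 0/1 scores within each level of an item property. The small data matrix, the item-to-topic assignment, and the helper name `mean_score_by_property` are invented for illustration; they are not the German Mathematical Literacy Test data, and the resulting numbers are unrelated to the means reported above.

```python
from collections import defaultdict

def mean_score_by_property(responses, item_property):
    """Average 0/1 scores over all persons and all items sharing a property level.

    responses: list of rows (one per person), each a list of 0/1 item scores.
    item_property: list giving the property level of each item (column).
    """
    totals = defaultdict(int)   # sum of scores per property level
    counts = defaultdict(int)   # number of person-by-item observations per level
    for row in responses:
        for score, level in zip(row, item_property):
            totals[level] += score
            counts[level] += 1
    return {level: totals[level] / counts[level] for level in totals}

# Hypothetical data: 4 persons (rows) by 6 items (columns).
responses = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
# Hypothetical assignment of the six items to topic areas.
topic_area = ["arithmetic", "arithmetic", "algebra", "algebra", "geometry", "geometry"]

means = mean_score_by_property(responses, topic_area)
```

The same helper could be called with a list of modeling types instead of topic areas, since it only needs one property label per item.
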
What can be learned from "the other side of the data matrix" nicely complements what can be learned from the person side. The person side yields test scores and the relationship of these scores to other variables, from which we can infer possible sources of individual differences. The item side tells us about general effects that are independent of individual differences. For example, a brief look at mean results over sets of items carried out in the previous paragraph showed: (a) arithmetic is easier than algebra or geometry; and (b) technical processing is easier than either of the two types of modeling. Even more could be learned if the interactions between the


individual differences, so in terms of measurement they range from purely descriptive to fully explanatory. On the other hand, as discussed above, one may see the explanatory models also as models for repeated observation data used to test the effect of person factors and of factors in the item design, as in a psychological experiment. Specifically, for the sample data the person factors are gender, SES and program, while the factors in the item design are topic area and modeling type.
In item response theory the primary focus is on modeling responses to individual items. In this chapter, the term item response models will be used for models with the aim of modeling item responses, independent of the kind of model. Since the responses are mostly categorical, most item response models are models for repeated categorical observations. Because of the categorical and repeated nature of the observations, and given the discussion above, it should not be surprising that most item response models are GLMMs (for a similar observation, see Mellenbergh, 1994). In our discussion of the four item response models we will again concentrate on binary data so that $Y = 0, 1$, but an extension to multicategory data is not too difficult (see Tuerlinckx & Wang, 2004).
As explained above, GLMMs have three components. When they are applied to item response models for binary data, the GLMM components are first, the random component, an independent Bernoulli distribution for each combination of a person and an item with parameter $\pi_{pi}$, the probability of a 1-response for person $p$ on item $i$. Second, the link function is either a probit link or a logit link, linking $\pi_{pi}$ to $\eta_{pi}$. Depending on the kind of link function, the item response model belongs to the family of normal-ogive models or the family of logistic models. In the following we will concentrate on logistic models, but all that is said also applies to normal-ogive models if the logit link is replaced with the probit link. Third, the linear component is a simple linear component that maps a set of predictors into $\eta_{pi}$. The parameters are the intercept and the slopes of the linear function. The linear component requires some further explanation.
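
The three components can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the linear component is assumed to take the Rasch-type form $\eta_{pi} = \theta_p - \beta_i$, and the numeric values are made up.

```python
import math
import random

def linear_component(theta_p, beta_i):
    """Linear predictor eta_pi; the Rasch-type form theta_p - beta_i is assumed."""
    return theta_p - beta_i

def inverse_logit(eta):
    """Inverse of the logit link: maps eta_pi to the success probability pi_pi."""
    return 1.0 / (1.0 + math.exp(-eta))

def draw_response(pi, rng):
    """Random component: one Bernoulli draw (0 or 1) with success probability pi."""
    return 1 if rng.random() < pi else 0

rng = random.Random(0)                              # fixed seed, illustration only
eta = linear_component(theta_p=0.5, beta_i=-0.5)    # hypothetical values
pi = inverse_logit(eta)                             # probability of a 1-response
y = draw_response(pi, rng)                          # simulated binary response
```

Replacing `inverse_logit` with the standard normal cumulative distribution function would give the corresponding normal-ogive (probit) version.
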
The typical intercept in an item response model is one that varies at random over persons. It is therefore called the person parameter. In the notation for item response models it is commonly denoted as $\theta_p$. Often a normal distribution is assumed for $\theta_p$. The higher its value, the larger $\eta_{pi}$, and therefore also $\pi_{pi}$. For the normal-ogive models, $\pi_{pi}$ is a normal-ogive function of $\theta_p$ for each item $i$, while for the logistic models, $\pi_{pi}$ is a logistic function of $\theta_p$ for each item $i$. These functions connecting the random intercept $\theta_p$ to $\pi_{pi}$ are called item-characteristic curves or item response functions.
The random intercept or person parameter fulfills the function that is often the main reason why people are given a test. The person parameter provides a measurement of variables such as abilities, achievement levels, skills, cognitive processes, cognitive strategies, developmental stages, motivations, attitudes, personality traits, emotional states or inclinations. A general term that we will use for what is measured in a test is propensity.
As above, we denote the item predictors by an $X$ with subscript $k$ ($k = 1, \ldots, K$) for the predictors, so $X_{ik}$ is the value of item $i$ on predictor $k$. The most typical predictors


A Doubly Descriptive Model: The Rasch Model


The Rasch model was defined earlier in Equations 2 and 3. We will continue with Equation 3. From this equation, an expression of the odds, or $\pi_{pi}/(1 - \pi_{pi})$, is obtained if on both sides the exponential form is used: $\exp(\eta_{pi}) = \exp(\theta_p - \beta_i)$, so that

$$\frac{\pi_{pi}}{1 - \pi_{pi}} = \frac{\exp(\theta_p)}{\exp(\beta_i)} \tag{4}$$

Equation 4 is the exponential form of the Rasch model. As a way to understand Equation 4, take $\exp(\theta_p)$ as an exponential measure of the ability of person $p$ taking an achievement test, and take $\exp(\beta_i)$ as an exponential measure of the difficulty of item $i$ from that test. In this case, the formula expresses the odds of success¹ as the ratio of a person's ability to the difficulty of the item. The intuition reflected in the formula is that ability makes one succeed, while difficulty makes one fail. From Equation 4, it follows that

$$\pi_{pi} = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)} \tag{5}$$

This is the familiar probability formula for the Rasch model. As a way to understand this alternate formula for the same model, think of a representation where the item difficulties are represented as points along a line, and the ability of the person is shown as a point along the same line. The amount determining the probability of success is then the difference between the two locations, $(\theta_p - \beta_i)$. This representation is sometimes called an "item map" or "construct map". A generic example is provided in Figure 3 where the students are shown on the left-hand side, and the items on the right-hand side. This representation has been used as a way to enhance the interpretability of the results from item response model analyses. Segments of the line can be labeled as exhibiting particular features for both the persons and the items, and the progress of students, or other test-takers, through this set of segments can be interpreted as development in competency. The placement of the person and item points in a direct linear relationship has been the genesis of an extensive methodology for interpreting the measures (Masters, Adams & Wilson, 1990; Wilson, 2005; Wright, 1968, 1977).
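
The relation between the odds form (Equation 4) and the probability form (Equation 5) can be checked numerically. The Python sketch below is only an illustration; the function names and the person and item locations are hypothetical.

```python
import math

def rasch_odds(theta_p, beta_i):
    """Equation 4: odds of success = exp(theta_p) / exp(beta_i)."""
    return math.exp(theta_p) / math.exp(beta_i)

def rasch_probability(theta_p, beta_i):
    """Equation 5: pi_pi = exp(theta_p - beta_i) / (1 + exp(theta_p - beta_i))."""
    z = math.exp(theta_p - beta_i)
    return z / (1.0 + z)

theta_p, beta_i = 1.0, 0.0            # hypothetical person and item locations
odds = rasch_odds(theta_p, beta_i)
prob = rasch_probability(theta_p, beta_i)

# The two forms agree: probability = odds / (1 + odds).
same_model = math.isclose(prob, odds / (1.0 + odds))

# Only the difference theta_p - beta_i matters: shifting both locations
# by the same constant leaves the success probability unchanged.
shift_invariant = math.isclose(rasch_probability(2.0, 1.0), prob)
```

When the person and the item sit at the same point on the line, the difference is zero and the success probability is .5.
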

¹ I.e., the ratio of the success probability ($\pi_{pi}$) to the failure probability ($1 - \pi_{pi}$).


Figure 7. Graphic representation of the LLTM.


The goodness-of-fit values of the LLTM are provided in Table 5. Although these values are clearly inferior to those of the previous models for the deviance and the AIC, the value for the BIC is superior to that of the Rasch model. It appears that using nine parameters is more efficient than defining a separate parameter for each item.
Of course, this LLTM is very close to being descriptive: really all it is assuming is that each pair of items that was generated from the combination of topic area and modeling type has similar difficulties. A stronger assumption would be that the two effects simply combine to give an accurate difficulty estimate. We tested this in a separate analysis, fitting a model with just three levels of each of the item properties (i.e., five free parameters). For this more explanatory analysis, none of the fit indices were in favor of the LLTM, a more standard finding (e.g., see Wilson & De Boeck, 2004).
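
A short Python sketch shows how a model can look worse on the deviance and the AIC yet better on the BIC. It uses the standard definitions AIC = deviance + 2k and BIC = deviance + k ln(N). The deviance values are hypothetical (Table 5 is not reproduced here), the parameter counts assume one person-variance parameter in addition to the item-side parameters, and taking the 881 students as N is likewise an assumption.

```python
import math

def aic(deviance, n_params):
    """Akaike information criterion: deviance + 2 * number of parameters."""
    return deviance + 2 * n_params

def bic(deviance, n_params, n_obs):
    """Bayesian information criterion: deviance + ln(n_obs) * number of parameters."""
    return deviance + math.log(n_obs) * n_params

N_PERSONS = 881   # number of students reported for the sample

# Hypothetical deviances. Parameter counts assume one variance parameter plus
# 18 item parameters (Rasch) or 9 item-property parameters (LLTM).
rasch = {"deviance": 18000.0, "n_params": 19}
lltm = {"deviance": 18040.0, "n_params": 10}

aic_rasch, aic_lltm = aic(**rasch), aic(**lltm)
bic_rasch = bic(**rasch, n_obs=N_PERSONS)
bic_lltm = bic(**lltm, n_obs=N_PERSONS)

# With these made-up numbers the Rasch model has the lower (better) AIC,
# while the more parsimonious LLTM has the lower (better) BIC.
```

The BIC charges ln(881), roughly 6.8, per parameter instead of 2, which is why the model with fewer parameters can come out ahead on the BIC only.
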
The estimated person variance is 1.551. Note that the variance is slightly smaller than for the Rasch model (where it was 1.561). This illustrates how the estimates for the person mode are slightly affected by a different approach for the item mode (explanatory instead of descriptive). This phenomenon can be explained as a scaling effect (Snijders & Bosker, 1999). In the context of our application, the effect follows from the different estimation of the scale of reconstructed item parameters in comparison to the scale of freely estimated item parameters. This is due to the less than perfect explanation of these item parameters on the basis of the item properties (see next paragraph).


A. A. von Davier, C. H. Carstensen, & M. von Davier

2003). This is true in an educational context even if we use the same test form at different points in time, if the construct itself evolves over time.
When do questions like these arise? For example, they arise in achievement and licensure tests where the meaning of the reporting scale should be preserved across administrations and the fairness for the examinees that take different test forms should be ensured. In such a process, also called a horizontal equating process, two test forms X and Y that are built to be parallel are placed on the same scale. By parallel test forms, we mean here test forms that were constructed following the same content and statistical specifications, that were intended to be similar in difficulty, and that were built with the intention of measuring the same construct. However, since the construction of parallel test forms is never perfect, test equating is used to make the scores on the two test forms interchangeable. This form of linking of two parallel test forms is called horizontal equating and is a preliminary step in answering the first research question above. In the horizontal equating designs, even if the groups of examinees who take the parallel forms differ in ability, these differences are usually small.
In other circumstances, we need to make scores of test forms that measure the same domain or construct but differ in difficulty comparable across years of study, to enable measurement of growth in a particular domain. This particular type of linking is often called "vertical linking" in the field of educational measurement. In a vertical linking design, the examinees that take the two (or more) test forms are from representative samples of their cohorts, the examinees are assessed at the same point(s) in time, and obviously, the samples of examinees that take different test forms differ significantly in ability, and the forms to be linked are not parallel (Harris et al., 2004; Harris & Hoover, 1987; Kolen & Brennan, 2004; Yen & Burket, 1997). Vertical linking is a preliminary requirement for answering the second research question. Many elementary and secondary test batteries report scores on a vertical scale, such as Iowa Test of Basic Skills (ITBS; Hoover, Dunbar, & Frisbie, 2001, 2003) or ACT's Educational Planning and Assessment System (EPAS; ACT, Inc., 2000).
In addition to the two practical circumstances described above, there are assessments that focus on the individual development over time under a particular type of treatment or educational exposure. Such situations arise in formative assessments as well as in the classical assessments. In a longitudinal linking process, the measurement instrument might be the same, might be parallel test forms, or might be forms that are less parallel across time points but are taken by the same individuals at different points in time.
Longitudinal linking in a sense is related to vertical linking but it requires a more restrictive design. In the longitudinal (panel) design, the same subjects are asked to answer the same or several measurement instruments/tests at different points in time. This approach enables measurement of growth for the individual examinees as a function of time and (linked) test score or as a function of other variables of interest. The difference between the longitudinal linking and the vertical linking resides in the difference between collecting data from a longitudinal design and collecting


Non-Equivalent Groups with Anchor Tests (NEAT)


In the NEAT design (see, for example, von Davier et al., 2004; Kolen & Brennan, 2004) there are two populations, P and Q, of test takers, and a sample of examinees from each. The sample from P takes test X, the sample from Q takes test Y, and both samples take a set of common items, the anchor test, V. In observed-score equating, the common set of items is used to adjust for the differences in ability between the two non-equivalent groups; in an IRT context, the common set of items is used for scale-linking purposes, which is a way of accounting for differences between populations by constraints on the item parameters in the anchor test. The NEAT design is often used when only one test form can be administered at one test administration because of test security or other practical concerns. The two populations may not be equivalent (i.e., the two samples are not from a common population). The NEAT designs can have both internal (items in V are also part of both X and Y) and external (items in V are neither in X nor in Y) anchor tests. The data structure for a NEAT design is described in Table 3. Usually, when the data is collected following a NEAT design for horizontal linking or equating purposes, the differences in ability between P and Q are relatively small.

Table 3. Data Collection in a Non-Equivalent Groups with Anchor Test (NEAT) Design.

Population   Sample    X    V    Y
P            from P    ✓    ✓
Q            from Q         ✓    ✓

From Tables 2 and 3, we see that the NEAT design contains two independent SG designs, one on population P and one on population Q.

Data Collection Designs Used in Vertical Linking


In vertical linking, the designs used for data collection are also combinations of the three designs described above. Usually, there are three categories of such combination designs (for non-equivalent groups): designs that have levels with overlapping content, levels with no overlapping content, and levels with common items and a scaling test (Patz, 2005).
In order to better adjust for the differences in competency at the extreme levels (in an educational context that is often a lower grade, such as grade three or four, and the highest grade), a combination of the designs is used in some cases, as described in Tables 4, 5 and 6.


and mathematics or science and mathematics, so that covariances between these subject matters can be estimated. This obviously means increasing the number of different booklets.
T h e logistics o f a d m i n i s t e r i n g b o o k le ts c o n s tr u c te d u s in g th e B IB design is roughly
as follows: T h e set o f b o o k le ts is spiralled th r o u g h o u t th e sam ple, so that c la ss ro o m s
te sted w ith in th e a s s e s sm e n t will receive a (p s e u d o -) r a n d o m selection o f the b o o k le ts
(a s im ila r ap p ro ach as in an EG design). In th is way, w h e n the set o f b o o k le ts are
giv en o ut to the e x a m in e e s d u r i n g test a d m in is tr a tio n , c lu ste rin g o f b o o k le ts will be
m i n i m i z e d and ap p ro x im a te ly th e s a m e n u m b e r o f b o o k le ts is given to ap p ro x im a te ly
eq u iv alen t sub-sam ples.
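The spiralling described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure of any particular assessment program: the function name `spiral_booklets`, the booklet labels, and the random starting point are all hypothetical.

```python
from collections import Counter
from itertools import cycle, islice
import random


def spiral_booklets(n_examinees, booklet_ids, seed=0):
    """Hand out booklets in a repeating order, starting at a random booklet.

    Hypothetical helper: consecutive examinees receive consecutive booklets,
    so each booklet is used about equally often and neighbours in the
    sample rarely share a booklet.
    """
    rng = random.Random(seed)
    start = rng.randrange(len(booklet_ids))
    rotated = booklet_ids[start:] + booklet_ids[:start]
    return list(islice(cycle(rotated), n_examinees))


# Seven hypothetical booklets spiralled over 100 examinees.
booklets = ["B1", "B2", "B3", "B4", "B5", "B6", "B7"]
assignment = spiral_booklets(100, booklets)
counts = Counter(assignment)
```

With this scheme the booklet counts differ by at most one, which is the sense in which approximately the same number of booklets reaches approximately equivalent sub-samples.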
Both IRT and traditional equating methods can be used for most of the equating designs, including the NEAT design. However, if the data are collected following a BIB design, then the large amount of data missing by design can be addressed only by the IRT methods. Similarly, in vertical linking settings, where the differences between the abilities of the groups of test takers are large, the IRT methodologies are more appropriate for linking purposes than the observed-score methods, even though there are circumstances where observed-score equating methods are used.
In this section we described the data collection designs used in different types of linking processes. In the next section we provide a description of the statistical tools used to conduct the actual linking once the data have been collected.

Horizontal and Vertical Linking and Scaling


The role of the data collection designs is to provide suitable raw data for statistical methods aimed at producing scale linkages across time points, populations, or multiple test forms. In this section, we provide a succinct description of the statistical methods usually employed for equating and linking.

The statistical methods used to achieve score comparability may be classified either as IRT methods (Hambleton, Swaminathan, & Rogers, 1991; Lord, 1980; and many others) or as classical or traditional equating methods that are based on observed scores (Kolen & Brennan, 2004).
If IRT is used in the equating of the test scores, it is necessary to first use some sort of linking procedure or calibration method to place the IRT parameter estimates on a common scale (see von Davier & von Davier, 2007). Once this is accomplished, additional methods, such as IRT true-score or observed-score equating (Kolen & Brennan, 2004), might be employed as a second step. A third step then places the raw scores onto some reporting scale. However, in many settings, mainly in vertical linking and in large-scale assessments, the person parameter estimates are placed directly onto the reporting scale after the calibration is done (Yen, 1984), and the second step is skipped altogether.
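As an illustration of the first step, placing parameter estimates from two separate calibrations on a common scale, the following Python sketch uses mean/sigma linking on anchor items. This is one well-known linking procedure chosen here for brevity; the text above does not say which procedure the authors prefer, and the function names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_source, b_target):
    """Return (A, B) such that theta_target = A * theta_source + B.

    b_source, b_target: difficulty estimates of the same anchor items
    obtained from two separate calibrations (hypothetical inputs).
    """
    A = pstdev(b_target) / pstdev(b_source)
    B = mean(b_target) - A * mean(b_source)
    return A, B


def to_target_scale(a, b, A, B):
    """Transform a 2PL slope a and difficulty b to the target scale.

    The product a * (theta - b) is unchanged by the transformation,
    so response probabilities are preserved.
    """
    return a / A, A * b + B


# Hypothetical anchor difficulties; the target scale is 2 * source + 0.5.
b_old = [-1.0, 0.0, 1.0]
b_new = [-1.5, 0.5, 2.5]
A, B = mean_sigma_constants(b_old, b_new)
a_linked, b_linked = to_target_scale(1.2, 0.0, A, B)
```

In this constructed example the recovered constants are A = 2 and B = 0.5 up to rounding, because the target difficulties were generated from exactly that linear transformation.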


1981). In the past, the joint maximum likelihood (JML) method has been used; conditional maximum likelihood (CML) methods are used for one-parameter logistic IRT models (1PL, Rasch model, or for 1PL models with fixed slopes such as OPLM; Verhelst & Glas, 1995), or, in some recent developments, empirical Bayesian estimation methods are used in the context of Markov chain Monte Carlo (MCMC) estimation (see Patz & Junker, 1999).
Von Davier and von Davier (2007) propose a new perspective on IRT scale linking by viewing any linking function as a restriction function on the joint log-likelihood function based on the whole data. Rewriting any linking as a restriction function and estimating the model parameters under this restriction implies a larger flexibility in the linking process when dealing with vertical linking, for example. This new method can incorporate the modeling of growth, possibly expressed as a hierarchical structure on the item parameters in the anchor (discussed in more detail below). The approach presented in the paper by von Davier and von Davier (2007) may easily be extended to multi-dimensional IRT models, at least for simple structure multi-scale IRT models (like the one used in NAEP and other large-scale assessments).

IRT Calibration in Balanced Incomplete Block Design


Operationally, matrix samples of item responses from balanced incomplete block designs are calibrated jointly using MML estimation of IRT models (Bock & Aitkin, 1981), since MML estimation does not require that all items are taken by all examinees. The number of items across all blocks in survey assessments reaches hundreds, and the number of examinees in nationally representative samples reaches hundreds of thousands. Due to computational constraints in the past, some assessment programs used only a subset of examinees to carry out item parameter estimation. This reduction of the sample used for estimation is no longer necessary when using MML estimation with recent computer hardware.
MML estimation of IRT models in matrix designs can be done both with Rasch-type models (or the one-parameter logistic model) and with two-parameter (2PL) and three-parameter (3PL) logistic models, as well as with mixed models for dichotomous and polytomous response data. The approach taken is a multi-stage estimation; Patz and Junker (1999) used the phrase "divide and conquer" for this approach. The first step is estimating the structural parameters of the measurement model (item parameters). The second step is estimating the structural parameters of a population model (often a latent regression on grouping variables) that may include a potentially large number of covariates. The third step is concerned with estimating distributions, percentiles, and percentages above cut-points for policy-relevant sub-populations.
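The reason why marginal estimation tolerates items missing by design can be seen in a small Python sketch. It computes the marginal probability of a single response pattern under a 3PL model, simply skipping items that were not in the examinee's booklet. The standard normal ability distribution, the equally spaced quadrature grid, and all names and item parameters are assumptions made for illustration; this is not the estimation routine of any operational program.

```python
import math


def p_correct(theta, a, b, c=0.0):
    """3PL probability of a correct response (2PL when c = 0)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def marginal_likelihood(responses, items, n_nodes=41, lo=-5.0, hi=5.0):
    """Marginal probability of one response pattern.

    responses: list with 1 (correct), 0 (incorrect), or None
               (item not administered to this examinee).
    items:     list of (a, b, c) tuples, one per item.
    Ability is integrated out over a normalized N(0, 1) grid.
    """
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + i * step for i in range(n_nodes)]
    weights = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(weights)
    likelihood = 0.0
    for t, w in zip(nodes, weights):
        p_pattern = 1.0
        for x, (a, b, c) in zip(responses, items):
            if x is None:
                continue  # missing by design: contributes no factor
            p = p_correct(t, a, b, c)
            p_pattern *= p if x == 1 else 1.0 - p
        likelihood += (w / total) * p_pattern
    return likelihood


# Two hypothetical items; the examinee's booklet contained only the first.
items = [(1.0, 0.0, 0.2), (1.5, 0.5, 0.0)]
lik_right = marginal_likelihood([1, None], items)
lik_wrong = marginal_likelihood([0, None], items)
```

Item parameter estimation then maximizes the product of such marginal probabilities over examinees, so each examinee informs only the items actually taken.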
It is important to note that item response models include homogeneity assumptions that may not be met by default in complex samples as well as samples from composite populations. Careful item selection, and if necessary, treatments of a small number of mis-fitting items such as splitting items (if different item parameters have to


\[
e_{Y,T}(x) = G_T^{-1}\bigl(F_T(x)\bigr), \tag{3}
\]

where $F_T(x)$ and $G_T(y)$ are the cumulative distribution functions (cdfs) of X and Y, respectively, on T. Given that X and Y are discrete random variables, their cdfs are step functions. However, in order for this definition to make sense and to ensure that the inverse equating function exists, we also assume that $F_T(x)$ and $G_T(y)$ have been made continuous, or continuized, so that the inverse functions exist for $F_T(x)$ and $G_T(y)$.
Several important classes of observed-score equating methods may be viewed as differing only in the way that the continuization of $F_T(x)$ and $G_T(y)$ is achieved. The traditional equipercentile equating method (also called the percentile rank method) uses linear interpolation of the discrete distribution to make it piecewise linear and, therefore, continuous. The kernel equating method (KE; von Davier et al., 2004; Holland & Thayer, 1989) uses Gaussian kernel smoothing to approximate the discrete histogram by a continuous density function.
The equipercentile equating function leads to linear equating if one assumes that $F_T(x)$ and $G_T(y)$ are continuous and have the same shape while differing in mean and variance. The linear equating function, $\mathrm{Lin}_{Y,T}(x)$, is defined by

\[
\mathrm{Lin}_{Y,T}(x) = \mu_{YT} + \frac{\sigma_{YT}}{\sigma_{XT}}\,(x - \mu_{XT}). \tag{4}
\]

In Theorem 1.1 of von Davier et al. (2004), it is shown that any equipercentile equating function can be decomposed into the corresponding linear equating function and a non-linear part.
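The two equating functions discussed above can be sketched in a few lines of Python. The sketch assumes a single target population with known score probabilities for both forms, and it continuizes each discrete distribution by spreading the probability of score j uniformly over [j - 0.5, j + 0.5], which is the linear-interpolation idea behind the percentile rank method. The function names and the example distributions are hypothetical, and every score is assumed to have positive probability so that the inverse exists.

```python
import math
from bisect import bisect_right


def cdf_knots(probs):
    """Knots of the piecewise-linear cdf for scores 0, 1, ..., len(probs) - 1."""
    xs, ps, cum = [-0.5], [0.0], 0.0
    for score, p in enumerate(probs):
        cum += p
        xs.append(score + 0.5)
        ps.append(cum)
    return xs, ps


def interp(x, xp, fp):
    """Linear interpolation; xp must be strictly increasing."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    i = bisect_right(xp, x) - 1
    t = (x - xp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + t * (fp[i + 1] - fp[i])


def equipercentile(x, probs_x, probs_y):
    """Apply the inverse continuized cdf of Y to the continuized cdf of X."""
    xs, fx = cdf_knots(probs_x)
    ys, gy = cdf_knots(probs_y)
    return interp(interp(x, xs, fx), gy, ys)


def moments(probs):
    """Mean and standard deviation of a discrete score distribution."""
    mu = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mu) ** 2 * p for j, p in enumerate(probs))
    return mu, math.sqrt(var)


def linear_equate(x, probs_x, probs_y):
    """Match the mean and standard deviation of X to those of Y."""
    mu_x, sd_x = moments(probs_x)
    mu_y, sd_y = moments(probs_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical forms: X uniform on 0..4, Y uniform on 0..9.
px = [0.2] * 5
py = [0.1] * 10
e_mid = equipercentile(2, px, py)
l_mid = linear_equate(2, px, py)
```

For these two uniform distributions both functions map the middle score of X to the middle of the Y scale; away from the mean the two methods generally differ, which is the non-linear part referred to in the theorem.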
The observed-score equating functions for the NEAT design, equipercentile and linear, also make assumptions in order to overcome the data missing by design, a feature of the NEAT design. The assumptions and the formulas for classical linear and equipercentile equating are given in Kolen and Brennan (2004) and von Davier et al. (2004).
Von Davier et al. (2004) view any observed-score test equating as having five steps or parts, each of which involves distinct ideas. They are:
1) pre-smoothing of the score distributions;
2) estimation of the score probabilities on the target population;
3) continuization of the discrete fitted score distributions;
4) computing the equating function;
5) computing the standard error of equating and related accuracy measures.


Research Questions

Von Davier and von Davier (in press) identified four research questions that motivate the study of change (regardless of the domain, in psychology, education, or other fields):
1) What is the change that each person experiences over time? (individual change)
2) Do the rates at which each individual changes differ by values/outcomes of background variables? (inter-individual systematic change)
3) Does a specific treatment have an effect on how an individual changes? (causal inferences)
4) How does a cohort change over time?
Obviously, an answer to the first question on the individual trajectories over time is a prerequisite for answers to the subsequent two questions. In other words, modeling the change that an individual person experiences with time is at the core of the study of change. However, the answer to the fourth research question may or may not rely on the individual trajectories.

Assumptions

When we talk about process assumptions, we refer to those assumptions necessary to answer the questions above. Von Davier and von Davier (in press) identified three types of process assumptions: a) assumptions about the data, b) assumptions about the instrument/outcome variable, and c) assumptions about the model(s).

Assumptions About the Data


The data should have appropriate features depending on which research question a study aims to answer. In order to answer research questions such as the first and second above, ideally data are available longitudinally on many individuals (when time points and individuals have been sampled representatively) for at least three time points. Ideally data are balanced (although the H/MLM type of approaches can relax this requirement). If one wants to make causal inferences (the third research question above), then the similarity between units/examinees across treatment and control groups should be ensured (Holland, 2005; Raudenbush, 2004).
In order to answer questions such as the fourth one above, the data do not necessarily need to be longitudinal. A design where random samples are independently drawn with replacement from a cohort at different time points might suffice.

Assumptions About the Outcome Variable


In SEM/HLM approaches the outcome variable must be a continuous variable at either the interval or ratio level. In other approaches, however, the outcome variable does not need to be continuous. Models for categorical outcome variables can be handled by extensions of SEM/HLM models or IRT models for change (Cronbach & Furby, 1970;


Discussion
This chapter reviews the existing methodologies for equating and linking tests that measure the same construct over time. The first part of the chapter describes horizontal equating, where interchangeability of the scores is desired. Then, vertical scaling is discussed as it is used in educational assessments, where the aim is to measure growth in a particular domain and to achieve comparability of scores on test forms that measure the same construct but differ in difficulty. We also address the challenges of covering large content domains in educational survey assessments over many cycles and discuss some of the solutions that were developed using extensions of item response models for reporting subgroup distributions. In the previous section, we discussed some existing explanatory models for inter- and intra-individual growth.
Each of these areas is a large field in itself, and it potentially has a strong impact on educational policies, on the life of students and parents, or on the life of professionals.
Nowadays, when more and more standardized testing is used nationally and internationally, we are also discovering more challenges in ensuring that the process and the results are fair and accurate. For example, among the challenges and research opportunities for test linking we can easily mention the definition of growth, the construction of the anchor sets, the choice of the reporting scale, and the characteristics of the samples used for establishing the scale, which ideally should be representative of the population of test takers. In addition, psychometricians worry about the maintenance of the scale: how to introduce new forms, how to monitor the scale over time, and how to adjust to changes in the administration mode.
In conclusion, we notice that many researchers and practitioners already work together in addressing these challenges, and we hope that many universities will consider implementing training curricula that prepare the future generation of psychometricians for the challenges in the field of education in the 21st century.

References
ACT, Inc. (2000). ACT educational planning and assessment system. Iowa City, IA: Author.
Allen, N. L., Jenkins, F., & Schoeps, T. L. (2004). The NAEP 1997 arts technical analysis report (ETS-NAEP 04-T01). Princeton, NJ: ETS.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-445.
Comprehensive tests of basic skills, forms U and V (Preliminary Technical Report). (1982). Monterey, CA: CTB/McGraw-Hill.


M. von Davier, L. DiBello, & K. Yamamoto

examinee z will solve all items A to F with comparably high probability, since examinee z is in possession of all three skills required by this set of items.

Table 1. Fictitious Q-matrix for six items (A to F), three skills (Add, Sub, Mult) and two examinees y and z with different skill sets.

Skill profile    Add    Sub    Mult
Examinee y       no     no     yes
Examinee z       yes    yes    yes

[Task-by-skill entries of the Q-matrix are not legible in this copy.]

Note. Add = Addition; Sub = Subtraction; Mult = Multiplication.
The implied rule when converting a Q-matrix and a skill pattern to a set of expected responses is: the more required skills present, the higher the probability of success. This assists in determining the most probable responses for each set of skills. For examinee y in the example one may argue that (A = 0, B = 0, C = 0, D = 0, E = 0, F = 1, G = 0) is the most plausible vector of responses if the presence of all required skills is necessary to solve a specific task. This view would represent a non-compensatory approach underlying the way in which skills are expressed or translated into success rates. A somewhat more forgiving view could argue that examinee y may either show the above pattern of responses or may produce at least one other response pattern, namely (A = 0, B = 1, C = 0, D = 0, E = 1, F = 1, G = 1), since at least a fraction of the required skills is present. This represents a compensatory assumption of how skill presence is expressed in higher or lower probabilities of succeeding in tasks. For examinee z, however, all required skills are present, so the typical response from this examinee should be (A = 1, B = 1, C = 1, D = 1, E = 1, F = 1, G = 1).
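The two readings of the Q-matrix can be made concrete with a short Python sketch. The Q-matrix below is hypothetical: it was chosen only so that it reproduces the response vectors quoted in the text for examinees y and z, and it should not be read as the original Table 1. The function names are likewise assumptions.

```python
# Hypothetical Q-matrix: the set of skills each task requires.
Q_MATRIX = {
    "A": {"Add"},
    "B": {"Add", "Mult"},
    "C": {"Sub"},
    "D": {"Add", "Sub"},
    "E": {"Sub", "Mult"},
    "F": {"Mult"},
    "G": {"Add", "Sub", "Mult"},
}


def non_compensatory(q_matrix, mastered):
    """Expected response is 1 only if every required skill is mastered."""
    return {task: int(req <= mastered) for task, req in q_matrix.items()}


def share_of_required_skills(q_matrix, mastered):
    """Fraction of the required skills that the examinee has mastered."""
    return {task: len(req & mastered) / len(req)
            for task, req in q_matrix.items()}


def compensatory(q_matrix, mastered):
    """Expected response is 1 if at least one required skill is mastered."""
    shares = share_of_required_skills(q_matrix, mastered)
    return {task: int(share > 0) for task, share in shares.items()}


examinee_y = {"Mult"}                # masters multiplication only
examinee_z = {"Add", "Sub", "Mult"}  # masters all three skills

y_strict = non_compensatory(Q_MATRIX, examinee_y)
y_lenient = compensatory(Q_MATRIX, examinee_y)
z_strict = non_compensatory(Q_MATRIX, examinee_z)
```

Under this assumed Q-matrix, the strict rule gives examinee y a success on task F only, the lenient rule adds tasks B, E, and G, and examinee z is expected to succeed on all tasks under either rule.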


\[
a_{k1} = \begin{cases} 0 & \text{if } a_k = 0, 1, \ldots, l_k - 1 \\ 1 & \text{if } a_k = l_k \end{cases}
\qquad \ldots \qquad
a_{k l_k} = \begin{cases} 0 & \text{if } a_k = 0 \\ 1 & \text{if } a_k = 1, \ldots, l_k - 1, l_k \end{cases}
\]

These new dichotomous sub-attributes, by definition, satisfy the following order constraint: $a_{k1} \le a_{k2} \le \ldots \le a_{k l_k}$. In other words, the only allowable combinations of these sub-attributes are the Guttman patterns:

\[
(a_{k1}, \ldots, a_{k l_k}) = (000\ldots00), (000\ldots01), \ldots, (011\ldots11), (111\ldots11)
\]

It can easily be shown that the normal fusion model parameterization applied to these sub-attributes with the enforced order constraint given above is equivalent to the original parameterization of the ordered polytomous fusion model (Templin, 2004).
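The replacement of an ordinal skill by order-constrained dichotomous sub-attributes can be sketched in Python as follows. The indexing is an assumption: sub-attribute j (counting from 1) is set to 1 when the skill level is at least l - j + 1, where l is the highest level, which reproduces the Guttman patterns listed above from (000...00) to (111...11). The function name is hypothetical.

```python
def sub_attributes(level, max_level):
    """Recode an ordinal skill level 0..max_level as dichotomous sub-attributes.

    Assumed convention: position j (1-based) is 1 when
    level >= max_level - j + 1, so higher levels fill the tuple
    with ones from the right.
    """
    if not 0 <= level <= max_level:
        raise ValueError("level must lie between 0 and max_level")
    return tuple(int(level >= max_level - j + 1)
                 for j in range(1, max_level + 1))


# All admissible patterns for a skill with levels 0, 1, 2, 3.
patterns = [sub_attributes(level, 3) for level in range(4)]
```

A skill with l + 1 ordered levels therefore yields exactly l + 1 admissible patterns out of the 2^l possible combinations of l dichotomous sub-attributes, and each pattern is non-decreasing from left to right.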

The General Diagnostic Model


This section introduces a general diagnostic model (GDM; von Davier & Yamamoto, 2004c; von Davier, 2005) for dichotomous and polytomous data and ordinal skill levels. The class of diagnostic models is defined by a discrete, multidimensional, latent variable $\theta$, i.e., $\theta = (a_1, \ldots, a_K)$ with discrete, user-defined skill levels $a_k \in \{s_{k1}, \ldots, s_{kL_k}\}$.
In the simplest (and most common) case the skills are dichotomous, i.e., the skills will take on only two values, $a_k \in \{0, 1\}$. In this case, the skill levels are interpreted as mastery (1) versus non-mastery (0) of skill k. Let $\theta = (a_1, \ldots, a_K)$ be a K-dimensional skill profile consisting of K polytomous skill levels $a_k$, $k = 1, \ldots, K$. Then define the item specific logits as


PIRLS - Progress in International Reading Literacy Study - (PIRLS is known as IGLU in Germany).

Estimation and Data Requirements


An implementation of the EM algorithm based on a program for discrete mixture distribution IRT models (von Davier, 2001; von Davier & Yamamoto, 2004c) has been developed. This extended program, called mdltm, can be used to estimate parameters of the model as given in equation GDM.5. The program employs the EM algorithm and provides information about convergence, numbers of required iteration cycles, and descriptive measures of model-data fit and item fit. The program is controlled by a scripting language that is used to describe the data input format and the skill model, i.e., the item-skill combination as given in the Q-matrix, the number of skill levels, skill level scores $a_k$ for each skill, and whether the $\gamma$ parameters are constrained across items or estimated freely.
The software has been tested with samples of up to 200,000 examinees implementing a two-dimensional IRT model, as well as with up to 50,000 examinees and an eight-dimensional dichotomous skill variable $\theta = (a_1, \ldots, a_8)$. Larger numbers of skills very likely will pose problems with identifiability, whether MCMC (in Bayes nets or other approaches) or MML methods are used, unless the number of items per skill variable is sufficiently large. The mdltm software allows imposing various types of constraints that may help to achieve identifiability in such cases. Currently, the following skill profile models can be estimated using the software:
- multiple classification latent class models (Maris, 1999), diagnostic models with dichotomous skill variables, a compensatory Fusion/Arpeggio model (sometimes referred to as RUM; Hartz, Roussos, & Stout, 2002);
- direct extensions of these diagnostic models to polytomous response data and polytomous, ordinal skill levels (von Davier & Yamamoto, 2004c; von Davier, 2005), without the need for replacement of ordinal skills by dichotomous sub-skills with order constraints;
- unidimensional IRT models such as the Rasch model (Rasch, 1960), the partial credit Rasch model (Masters, 1982), the 2PL IRT model (Birnbaum, 1968), and the generalized partial credit model (Muraki, 1992);
- other latent structure models such as located latent class models (Haberman, 1979; Formann, 1985), confirmatory multivariate IRT models, and discrete mixture IRT models (von Davier & Rost, 2006) such as polytomous mixed Rasch models (von Davier & Rost, 1995).
The software can read ASCII data files in arbitrary format, and the scripting language used to control the software enables the user to specify which columns represent which variables. The software also handles weighted data, multiple group data (multiple populations), data missing by design (matrix samples) in response variables, and data missing at random in response variables as well as in grouping variables.


D. Leutner, J. Hartig, & N. Jude

should be applied to tests used for the assessment of competencies as well: objectivity,
reliability, and validity. These three quality criteria are characterized in the
following sections, with specific reference to the assessment of competencies. We
also discuss one further criterion of particular relevance to competence assessment:
the necessity for criterion-oriented interpretation of test scores. It should be kept
in mind that, as a rule, these quality criteria are used to describe the psychometric
quality of the scales and subscales of a test (see Figure 1), but can also be used for
characterizations of a test consisting of several scales. They are typically not
appropriate, however, for describing individual tasks or items.

Objectivity
The objectivity of a test means above all that the test results reflect only characteristics
of the individual test subject and not characteristics of the person administering
the test or characteristics of the test situation. Often the concept of objectivity also
encompasses the idea that the test results themselves, and the interpretation of these
results, are independent of the person evaluating and interpreting them. In line with
the different criteria used for evaluating a test's objectivity, it is common to distinguish
between objectivity of implementation, analysis, and interpretation.
The most common strategy for ensuring the objectivity of a diagnostic procedure
is to standardize and document each step required in administering the test and in
analyzing and interpreting its results. If the test procedure has been standardized,
and if the test has been administered by a trained tester, objectivity is usually
considered as given. Although objectivity can be considered a necessary precondition
for the reliability and validity of assessment procedures (Rost, 2004), this quality
criterion is far more rarely the subject of discussion and analysis in educational and
psychological assessment. For standardized psychological tests, which usually only
contain closed answer formats, it may indeed be true that objectivity is not a critical
issue. In educational contexts, however, other data sources are often used as the basis
to identify inter-individual differences in competencies, including behavioral
observations, judgments of freely formulated texts, or portfolios. Since competencies are
defined as context-dependent realistic constructs, such a broad range of methods
for empirical analysis appears appropriate. As the evaluation of open answer formats
usually requires the appraisal of observed behavior by raters, however, the question
of the objectivity of test results is significant in this case. If an assessment process
does involve judgments by a rater, it is all the more indispensable that thorough
documentation and instructions for the assessment be provided and that raters be
trained on this basis. It is equally indispensable that the consistency of the raters be
analyzed, at least for a random sample of their ratings. A lack of consistency among raters
would indicate the need to revise the evaluation criteria, the evaluation process, or
the documentation.
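
One way to analyze the consistency of two raters is sketched below. The chapter does not name a particular statistic; Cohen's kappa is used here as an assumption because it corrects raw agreement for agreement expected by chance. The rating data are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of responses.")
    n = len(rater_a)
    # Proportion of responses on which the two raters assign the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented scores (0, 1, or 2 points) given by two raters to eight open answers.
scores_a = [0, 1, 2, 1, 0, 2, 1, 1]
scores_b = [0, 1, 2, 0, 0, 2, 1, 2]
kappa = cohens_kappa(scores_a, scores_b)
print(round(kappa, 3))
```

A low value on a random sample of double-scored responses would be one signal that the evaluation criteria or the rater training need revision.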


no coincidence that self-evaluations are often referred to as measures of perceived
competence or of academic self-concept and not simply regarded as competence.
Self-concept constitutes an independent construct that has been described
and researched extensively (e.g., Marsh, Trautwein, Lüdtke, Köller, & Baumert,
2005, 2006). Research on academic self-concept - but also studies on self-assessment
of general intelligence and other constructs of ability - have shown that test
results and self-assessments are positively but at most moderately correlated (e.g.,
Hacker, Bol, Horgan, & Rakow, 2000; Kruger & Dunning, 1999; Tousignant &
DesMarchais, 2002).
In the present volume, self-assessment is not considered an adequate measurement
instrument for assessing competencies. The terms assessment instrument,
measurement instrument, or test refer to measurement procedures that are based
on behavioral data from performance situations.

Strategies for Test Construction: Generation and Selection of Test and Task
Content
The central and primary question in the construction of any new assessment instrument
is how the test and the content of items are to be defined, delimited, and selected.
The selection of test contents ultimately determines the nature of the measured
construct. In the following, four general strategies used in psychology to derive test
contents will be presented: external test construction, deductive test construction,
inductive test construction, and criterion sampling. In practice, mixtures of these
four strategies are used as well.
External Test Construction
The primary goal of external construction is to predict a particular external criterion,
e.g., whether the test-taker will fall into a particular category of individuals. The
origin of item contents is of secondary importance here, and usually the range of
items chosen is as broad and heterogeneous as possible (both in tests and questionnaires).
The items are investigated empirically to see whether the groups of interest
differ in any respect. Those items that differ most widely by group are combined
into a measurement instrument. This instrument can then be used with individuals
in future studies to estimate the group to which they belong. This procedure is
used, for example, to assign psychiatric patients to specific diagnostic groups (see,
for example, Hathaway & McKinley, 1951) or to distinguish between successful and
unsuccessful applicants for a job or university enrollment.
In the case of external test construction, the pragmatic question of how to predict
a criterion best is of sole interest, and theoretical assumptions about the constructs
that are measured are usually disregarded. For competence assessments, however,
this approach is hardly appropriate.
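
The item-selection step of external test construction can be sketched as follows. This is only an illustration under stated assumptions: items are scored 0/1, there are exactly two criterion groups, and "differ most widely" is operationalized as the absolute difference in the proportion of positive responses. The function name and the data are hypothetical, and a real application would need to cross-validate the selection on new data.

```python
def select_items(responses, groups, k):
    """Return the indices of the k items whose response rates differ most by group.

    responses: one list of 0/1 item scores per person
    groups:    criterion group (0 or 1) of each person
    """
    n_items = len(responses[0])

    def rate(group, item):
        scores = [r[item] for r, g in zip(responses, groups) if g == group]
        return sum(scores) / len(scores)

    # Difference in the proportion of positive responses between the two groups.
    diffs = [rate(1, j) - rate(0, j) for j in range(n_items)]
    ranked = sorted(range(n_items), key=lambda j: abs(diffs[j]), reverse=True)
    return sorted(ranked[:k]), diffs


# Invented data: six persons, four items, three persons per criterion group.
responses = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
groups = [0, 0, 0, 1, 1, 1]
selected, diffs = select_items(responses, groups, k=2)
print(selected)
```

Nothing in this procedure refers to what the items measure, which is the reason the text considers the approach hardly appropriate for competence assessment.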


Introduction to Concepts and Questions of Assessment in Education

competence-oriented tests from standardized psychological tests. With regard to
competence-oriented tests, three topics were discussed that need to be taken into
account regarding test construction: (1) Questionnaires where respondents are asked to
assess their own competencies cannot be taken as adequate instruments for measuring
competencies. (2) Deductive strategies and criterion sampling approaches rather
than external or inductive approaches are suitable strategies for test construction. (3)
Test models based on item response theory - and especially the Rasch model and its
extensions - are particularly apt for describing the test behavior of test takers and
assessing their competencies.
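
For orientation, the Rasch model mentioned in point (3) is commonly written as shown below. The notation is an assumption chosen for this note and is not taken from the chapter: $\theta_p$ denotes the ability (competence) of person $p$, $\beta_i$ the difficulty of item $i$, and $X_{pi} = 1$ a correct response.

```latex
P(X_{pi} = 1 \mid \theta_p, \beta_i)
  = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}
```

Because person ability and item difficulty are located on the same scale, a person's score can be interpreted with reference to the items he or she is likely to solve, which is what makes this class of models attractive for criterion-oriented interpretation.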

References
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological
Review, 111, 1061-1071.
Brown, J. D., & Hudson, T. (2002). Criterion-referenced language testing. Cambridge: Cambridge
University Press.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological
Bulletin, 52, 281-302.
Embretson, S. (2006). The continued search for nonarbitrary metrics in psychology. American
Psychologist, 61, 50-55.
Flexer, R. W., & Baer, R. M. (2005). Description and evaluation of a university-based transition
endorsement program. Career Development for Exceptional Individuals, 28, 80-91.
Gaugler, B. B., Rosenthal, D. B., Thornton, G. C., & Bentson, C. (1987). Meta-analysis of assessment
center validity. Journal of Applied Psychology, 72, 493-511.
Goldberg, L. R. (1990). An alternative description of personality: The Big-Five factor structure.
Journal of Personality and Social Psychology, 59, 1216-1229.
Hacker, D. J., Bol, L., Horgan, D. D., & Rakow, E. A. (2000). Test prediction and performance in a
classroom context. Journal of Educational Psychology, 92, 160-170.
Hambleton, R. K., & Zenisky, A. (2003). Advances in criterion-referenced testing methods and
practices. In C. R. Reynolds & R. W. Kamphaus (Eds.), Handbook of psychological and
educational assessment of children (2nd ed., pp. 377-404). New York: Guilford Press.
Hartig, J. (2006). Skalierung und Kompetenzniveaus [Scaling and competence levels]. In B. Beck
& E. Klieme (Eds.), Sprachliche Kompetenzen. Konzepte und Messung (pp. 83-99). Weinheim:
Beltz.
Hartig, J. (2008). Psychometric Models for the Assessment of Competencies. In J. Hartig, E. Klieme,
& D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 69-90). Göttingen:
Hogrefe & Huber.
Hartig, J., & Klieme, E. (2006). Kompetenz und Kompetenzdiagnostik [Competencies and
competence assessment]. In K. Schweizer (Ed.), Leistung und Leistungsdiagnostik (pp. 127-143).
Berlin: Springer.
Hathaway, S. R., & McKinley, J. C. (1951). The Minnesota multiphasic personality inventory
(revised). Minneapolis: University of Minnesota.
Klauer, K. J. (1987). Kriteriumsorientierte Tests [Criterion-referenced tests]. Göttingen: Hogrefe.
Koeppen, K., Hartig, J., Klieme, E., & Leutner, D. (2008). Current issues in competence modeling
and assessment. Zeitschrift für Psychologie / Journal of Psychology, 216, 61-73.
Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing
one's own incompetence lead to inflated self-assessments. Journal of Personality and Social
Psychology, 77, 1121-1134.


Introduction to the Computer-Based Assessment of Competencies

Classification of E-Assessment

The following passages attempt to outline a classification of some of the expressions
mentioned above, by characterizing them with reference to three nested classes (see
Figure 1).

Figure 1. Classes of computer-based assessment.


Within this classification, computer-based assessment (CBA) is the most general
term: computers are used for the actual assessment process. As pointed out before,
this includes the presentation and selection of stimulus material and items as well as
the collection and storage of responses and reactions of the testees. Automated scoring
constitutes another feature of many CBA programs. In some cases, even item
formats like essay writing and open answer formats can be scored automatically. In
any case, those item formats can be presented and responded to by computer, but the
actual scoring of the responses is often still performed by human beings. Here, the
computer serves as an interface for the scorer. CBA refers to assessment techniques
where the computer is used.
The other two levels of this hierarchy, network-based and Internet-based assessment,
belong to CBA as well. Each of them, however, is subject to some restrictions:
Network-based assessment (NBA) includes assessment techniques where usually
all examinees are administered the same test or test battery, and all responses are
stored on one central server. Usually it is possible to administer those tests
simultaneously to two or more persons. The software which is needed for test administration
is typically installed on all computers within the network. For example, this
might be the case in a computer pool where all computers are connected by a Local
Area Network. Sometimes, the server collecting and storing the relevant test data is


expected to be capable of measuring parts of the construct that traditional PPT-items
cannot assess.
On the other hand, a disadvantage of CBA regarding validity might be caused
by an under-representation of the construct that is measured. This might result
from introducing error into examinees' test scores by measuring construct-irrelevant
material, or by improper estimates of examinee scores and item-parameters
(Huff & Sireci, 2001, p. 18). They conclude that this might be due to measuring
construct-irrelevant features like computer-proficiency, computer platform familiarity,
influencing the user interface, speededness, or computer anxiety. One of the
most frequently discussed issues which must be taken into account when carrying
out computer-based assessment regards computer- and software familiarity of both
testee and test administrator, examinees suffering from computer anxiety, or test
scores being dependent on computer experience. Also, there may be cultural, ethnic
or gender differences in test performance, due to the application of the computer
for assessment purposes. This may especially affect test fairness, and thus lead to
discrimination against the people concerned. Consequently, since the aforementioned
problems might well lead to a decrease in data quality, validity could be affected as
well. However, some of those problems might very easily be prevented by applying
training phases before test administration.
Presumably, in the near future, many newly developed tests will solely be constructed
for computerized and not for paper-pencil administration. Studies concerning
the equivalence of test forms and research on test-mode effects will most likely
become less important. It is also likely that in the future, more and more people
will become used to being exposed to computers in almost every sphere of life.
At present already, many people are experienced in handling computers and the
Internet. However, this is particularly true for highly industrialized countries. In less
developed countries this will most likely not happen so soon. In times of globalization
and Internet assessment, the newly emerged expression digital divide has become
widely discussed. Poorer countries are especially affected by the so-called digital
divide, which refers to inequitable access to information and communication
technologies (ICTs) between wealthy and poor countries and between privileged and
underprivileged social groups within all countries (Gasperini & McLean, 2001, p.
1). Thus, before administering large-scale studies on computers in all participating
countries, it will be absolutely necessary to conduct thorough research on population
differences in order to guarantee test fairness for all participants. This must
especially be taken into account during the development process of the tests, and it
must be assured that the mode in which the test administration takes place does not
depend on computer experience and/or computer anxiety, and has thus no effect on
the results. At least, it should be possible for the participating examinees to practice
before the administration of computer-based tests. However, there is also evidence
that in general, the "digital divide does not apply to test performance" (Gershon,
2005, p. 110).


A. Jurecka

of already previously existing tests (Brocke & Vock, 2002). Those can be combined
depending on the demands of the customer. Furthermore, it is possible to integrate
new parts into the system.
The definition of competence for this assessment system depends on the underlying
concepts of the respective scales and tests (Oenning, 2003). The system
combines different tests and questionnaire scales for the measurement of different
competencies, such as emotional and social competence, intelligence, concentration,
working habits, customer orientation, stressing factors, leadership and management,
sales and management, attitudes towards work, motivation, and working samples for
call center agents.
The testing system consists of two modules: the organizer module and the testing module.
They can be installed on a local computer. In the organizer module, depending on
job specifications and requirements, 94 dimensions can be combined to assess various
job specifications. Within the testing module the test is administered to
the candidate.
Testing criteria depend on the scale or test which is being used. Many tests were
published independently, and specifications for reliability and validity can be found
within the respective manuals (Brocke & Vock, 2002).
A further example is the assessment system of the German Armed Forces. Each
year approximately 200 000 aptitude tests are administered to volunteers and to
persons who are liable for military service (GiKOM CSE, 2007). These tests have
been administered by computer since the year 2000. For some of them, Computer-Adaptive
Testing is applied. The tests were developed by the psychological staff of
the armed forces and the software company GiKOM CSE. The testing system is
called CAT4 (GiKOM CSE, 2007).
The CAT4 is a network-based assessment system for personnel selection and
placement. Its core is a central server with an SQL database. The PCs where the
tests are administered can be steered and supervised by one central server. Up to 50
persons can be tested with different testing batteries at the same time. The testing
batteries, i.e. the collection of tests which are administered to a candidate, function
in an adaptive way. In accordance with the candidate's achievement during the assessment
process, the computer continuously matches the candidate's profile with the
requirements for the vacant jobs. As long as no match is indicated, new tests can be
administered to the candidate (GiKOM CSE, 2007). Applicants for an officer's career
must complete a computerized adaptive test in mathematics to qualify for studying
at the German Armed Forces. The correlations between the adaptive (mathematics,
analogies, and matrices) and the conventional (PPT) tests reach r > .6 (p < 0.01)
(Storm, 1999). The decrease of testing time for the adaptive tests, compared with
conventional tests, came up to 5.5 minutes per person and subtest. Also, the number
of items to be administered decreased significantly (Melter, Hamann, Kutschke, &
Storm, 2002). Furthermore, the CATs were able to differentiate between people with
and without higher secondary education.


International Test Commission (2005). International Guidelines on Computer-Based and Internet-Delivered
Testing. Retrieved April 28, 2008 from
http://www.intestcom.org/Downloads/ITC%20Guidelines%20on%20Computer%20-%20version%202005%20approved.pdf
Irvine, S., Kutschke, T., & Walker, R. (2000). Screening conscripts in Germany using Item
Generative-Tests. Arbeitsbericht Nr. 1/2000 des Psychologischen Dienstes der Bundeswehr.
Bonn: Bundesministerium der Verteidigung PSZ III 4.
Jodoin, M. G. (2003). Measurement Efficiency of Innovative Item Formats in Computer-Based
Testing. Journal of Educational Measurement, 40, 1-15.
Kirbach, C., & Montel, C. (2003). Das Internetrecruitingtool PERLS. In J. Erpenbeck & L. von
Rosenstiel (Eds.), Handbuch Kompetenzmessung (pp. 460-470). Stuttgart: Schäffer-Poeschel.
Konak, Ö., Duindam, T., & Kamphuis, F. (2005). CITO-Sprachtest. Wissenschaftlicher Bericht.
Retrieved January 28, 2006 from
http://www.cito.com/de/sprachtest/resources/Cito-Sprachtest-Wissenschaftlicher-Bericht.PDF
Kubinger, K. D., & Farkas, M. G. (1991). Die Brauchbarkeit der Normen des Papier-Bleistift-Tests für
die Computer-Vorgabe: Ein Experiment am Beispiel der SPM von Raven als kritischer Beitrag.
Zeitschrift für Differentielle und Diagnostische Psychologie, 12, 257-266.
Kubinger, K. D. (1993). Testtheoretische Probleme der Computerdiagnostik. Zeitschrift für Arbeits-
und Organisationspsychologie, 37, 130-137.
Lumsden, J. A., Sampson, J. P., Reardon, R. C., & Lenz, J. G. (2002). A comparison study of
the paper, personal computer (PC), and internet version of Holland's Self-Directed Search:
Technical Report No. 30. Tallahassee: Center for the Study of Technology in Counseling and
Career Development, The Career Center, Florida State University.
Mead, A. D., & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive
ability tests: A meta-analysis. Psychological Bulletin, 114, 449-458.
Meijer, R. R., & Nering, M. L. (1999). Computerized adaptive testing: Overview and introduction.
Applied Psychological Measurement, 23, 187-194.
Melter, A., Hamann, I., Kutschke, T., & Storm, E. G. (2002). Modernisierung der Eignungsdiagnostik
im Psychologischen Dienst der Bundeswehr - Ergebnisse und Perspektiven. Zeitschrift für
Personalpsychologie, 1, 35-41.
Moreland, K. L. (1992). Computer-assisted psychological assessment. In M. Zeidner & R. Most (Eds.),
Psychological testing: An inside view (pp. 343-376). Palo Alto, CA: Consulting Psychologists
Press.
Naglieri, J. A., Drasgow, F., Schmit, M., Handler, L., Prifitera, A., Margolis, A., & Velasquez, A.
(2004). Psychological Testing on the Internet: New Problems, Old Issues. American Psychologist,
59, 150-162.
Neumann, G., & Baydoun, R. (1998). Computerization of paper-and-pencil tests: When are they
equivalent? Applied Psychological Measurement, 22, 71-83.
Oenning, S. (2003). Das ELIGO-System. In J. Erpenbeck & L. von Rosenstiel (Eds.), Handbuch
Kompetenzmessung (pp. 454-459). Stuttgart: Schäffer-Poeschel Verlag.
Ordinate (2004). Set 10 Test Description. Retrieved November 11, 2007 from
http://www.versanttest.de/pdf/ValidationReport.pdf
PASS-IT (2003). Phase I interim research report summary. Retrieved April 2008 from
http://www.pass-it.org.uk/resources/031117-researchpl-summary.pdf
PASS-IT (2005). Developing the future potential of assessment through technology. Retrieved April
2008 from http://www.pass-it.org.uk/resources/pass-it_leaflet_pdf
Pater, E. (2004). Sprachstandserhebung für die Kinder des Einschulungsjahrgangs 2004 in Duisburg.
Weiterentwicklung und Anwendung des zweisprachigen Cito-Tests. Deutsch als Zweitsprache,
2, 26-33.
Poggio, J., Glasnapp, D. R., Yang, X., & Poggio, A. J. (2005). A comparative evaluation of score
results from computerized and paper & pencil mathematics testing in a large scale state
assessment program. The Journal of Technology, Learning, and Assessment, 3, 4-30.


T. H. J. M. Eggen

ing procedure, but emphasis is on the gain in measurement efficiency. It has been shown
that CATs need fewer items to measure the proficiency of the test-taker with the same
precision. Compared to a linear, non-adaptive test only about 50% to 60% of the number
of items is needed. Since the publication of the basic ideas on modern computerized
adaptive testing by Lord (1970), the educational and psychometric community has produced
numerous articles and books on this subject. Recently, a number of books have presented
overviews of the literature. A volume edited by Wainer (2000) gives the historical
development and the basics of computerized adaptive testing and describes possibilities
for building, maintaining and using CATs. A volume edited by Van der Linden and Glas
(2000) is a compilation of recent psychometric research on CATs. Finally, a book edited
by Parshall, Spray, Kalohn, and Davey (2002) gives a more practical overview of issues in
computerized (adaptive) testing. The following overview presents the basic aspects of CAT.

Differences Between CATs and Linear Computerized Tests

In a linear CBT (computer-based test) the testing is computerized and before the test
is administered, the order of the presentation of the items in the test is fixed. The
main distinctive features of CATs compared to linear CBTs are:
Every individual gets a personalized test, that is, the content as well as the length
of the test can differ from person to person. A CAT is optimized for the individual,
which has at least two favorable consequences:
a) the measurement efficiency is enhanced since fewer items are needed to get the
same precision; and
b) every student can be challenged at his or her own level, which generally has a
stimulating effect and is experienced as pleasant.
Fewer items are needed, so individual students, test constructors and test organizers
can save time and/or money. Since items and tests are developed within the framework
of a sound test theory (IRT), the testing has a number of known characteristics
and therefore probably better measurement quality.
Besides these distinctive advantages, CAT has several advantages in common
with linear CBTs. Compared to paper-based testing the most important advantages
are (see also Jurecka, 2008, Chapter 9; Wirth, 2008, Chapter 11; and Chung, O'Neil,
Bewley, & Baker, Chapter 12 in this book):
1) Direct scoring and feedback is possible and the scoring is objective (errorless).
2) New item types become available, that is, not only new formats for answering
items, but richer, more authentic items can be presented.
3) Test administration is more efficient.
4) Flexibility is possible in test moment and location.
5) Test security is better.
6) Students are more motivated.
7) There are more possibilities to apply modern test theory.


the parameters as if they were known. Furthermore, students get different items and
inaccuracies in item parameters could work differently for different students.
To achieve a high quality item calibration, the choice of a specific IRT model, the
choice of an estimation method of the parameters, the choice of the model fit statistics,
and the nature and the size of the student sample play important interrelated
roles. For detailed considerations about this, one is referred to the psychometrical
literature (e.g., Fischer & Molenaar, 1995 and Van der Linden & Hambleton, 1996).
Following are some summarizing remarks and recommendations.
With respect to the choice of an IRT model, simpler models require smaller samples
and better statistical methods for estimating and testing model fit are available.
To estimate the parameters with reasonable accuracy and to test the model with
some power in the 1PL-, 2PL- and 3PL-model, one needs, respectively, at least 200,
500 and 1,000 student answers per item. On the other hand, it is more difficult to
obtain a good fit for a simpler model, which could mean that some items have to be
deleted from the bank, possibly threatening the validity of the item bank.
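
To make the three models named above concrete, the following minimal sketch writes out their item response functions in one place. The symbols a (discrimination), b (difficulty) and c (lower asymptote, or guessing) follow common IRT notation and are assumptions here, as are the function names; the small table only restates the minimum numbers of answers per item quoted in the text.

```python
import math

# Minimum number of student answers per item, as quoted in the text.
MIN_ANSWERS_PER_ITEM = {"1PL": 200, "2PL": 500, "3PL": 1000}


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct answer under the 3PL model.

    theta: proficiency of the student
    b: item difficulty, a: item discrimination, c: guessing parameter
    With c = 0 this is the 2PL model; with c = 0 and a common, fixed
    discrimination (here a = 1) it is the 1PL model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def enough_answers(model, answers_per_item):
    """Check a calibration sample against the rule of thumb in the text."""
    return answers_per_item >= MIN_ANSWERS_PER_ITEM[model]
```

The nesting of the models is visible in the defaults: each simpler model is the more general one with a parameter held fixed, which is one way to see why simpler models can be estimated from smaller samples.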
Two general likelihood-based methods are available for the estimation of the
item parameters. First, there is the generally applicable marginal maximum likelihood
method (MML). If we use this method, we have to assume that the sample
is a random one from a specified distribution of the proficiency in the population.
(Usually the normal distribution is assumed.) With MML, the item parameters and
the parameters of the proficiency distribution are estimated simultaneously. This is
not the case if we use conditional maximum likelihood (CML) for the estimation
of the item parameters. With CML, we do not need assumptions on the proficiency
distribution of the students; one only needs samples from the population. In practice,
this could be very favorable because in education real random samples are not easily
obtained. Having samples drawn in more steps and in clusters does not invalidate
the CML estimation method of the item parameters. However, the CML estimation
method is not applicable in every model. It can be used in the 1PL-model and
in a somewhat restricted form of the 2PL-model (Eggen, 1990; Verhelst & Glas,
1995). This model is implemented in the OPLM computer program (Verhelst, Glas,
& Verstralen, 1995) which also contains item fit statistics with proven good statistical
properties. If one uses CML for estimating item parameters and model fit testing,
the calibration is to be completed by separately estimating the (parameters of a)
proficiency distribution.
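
The reason CML needs no assumption about the proficiency distribution can be illustrated for the 1PL-model: once the total score is given, the probability of a response pattern no longer contains the proficiency. The sketch below is an illustration under that model only, not the OPLM implementation; the function names and the use of elementary symmetric functions of exp(-b) are assumptions based on the standard textbook derivation.

```python
import math


def elementary_symmetric(eps):
    """gamma[r] = sum over all r-item subsets of the product of their eps."""
    gamma = [1.0] + [0.0] * len(eps)
    for k, e in enumerate(eps, start=1):
        # Update in place, highest order first, so each eps is used once.
        for r in range(k, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma


def conditional_prob(pattern, difficulties):
    """P(pattern | total score) under the 1PL-model; theta has cancelled."""
    eps = [math.exp(-b) for b in difficulties]
    numerator = math.prod(e for x, e in zip(pattern, eps) if x)
    return numerator / elementary_symmetric(eps)[sum(pattern)]


def joint_prob(pattern, difficulties, theta):
    """P(pattern | theta) under the 1PL-model, for comparison."""
    prob = 1.0
    for x, b in zip(pattern, difficulties):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        prob *= p if x else 1.0 - p
    return prob
```

Dividing `joint_prob` of a pattern by the sum of `joint_prob` over all patterns with the same total score gives the same value as `conditional_prob` for any theta, which is why the sampling of students does not have to follow a specified distribution.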

Incomplete Calibration Designs

The size of item banks in use for CATs is normally such that it is not feasible to
use complete testing designs in the calibration of the bank. Therefore, only a part
of the students in a calibration study will be able to answer only a part of the items.
Sometimes there are practical reasons for using incomplete designs; often efficiency
is a motivating factor for using incomplete designs. Efficiency in item calibration,


Adaptive Testing and Item Banking

Start
In this phase, the starting item(s) in the CAT is (are) indicated. As there is, in general,
no information available on the proficiency of the student, one could start with:
- one or more randomly selected items from the item bank;
- one or more randomly selected items from certain sub-domains of the item bank.
One could decide to start with items which have a certain common instruction for
the students. An example would be to start with a test of mathematics with a few
items not allowing the test-taker to use paper and pencil to make computations;
- one or more randomly selected items from a certain subset of the item bank, based
on psychometrical characteristics of the items. An example would be to start with
a random selection of a few easy items.
If there is information available on the proficiency of the student before testing,
selecting the first items on the basis of that information is of course possible.

Select
Before the administration of every item, an item selection procedure is carried out.
From the item bank an item is chosen that is in accordance with the answers given by
the student thus far: the test is adapted or tailored to the proficiency of the student
being tested. The item selection procedure is responsible for the gain in efficiency that
can be achieved with a CAT. It is based on an information concept of which the basic
idea is that the item that promises to give the most information on the student's
proficiency, as demonstrated thus far, is administered next. In a separate section of this
chapter, the criteria for item selection will be described in more detail.
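
As a hedged illustration of the information concept, the sketch below selects the unused item with the largest Fisher information at the current proficiency estimate. It assumes a 2PL-model, for which the information of an item is a² P (1 - P); the bank layout (item id mapped to a pair of discrimination and difficulty) and all names are hypothetical and not taken from the chapter.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at proficiency theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta_hat, bank, administered):
    """Return the id of the unused item with maximum information.

    bank: dict mapping item id -> (a, b)
    administered: collection of item ids already presented
    """
    candidates = [i for i in bank if i not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, *bank[i]))
```

With equal discriminations this rule picks the item whose difficulty lies closest to the current estimate, which matches the idea of tailoring the test to the student.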

Administer and Score
These phases in running the CAT algorithm are clear: the items are presented and
answered and the answers are scored.

Compute
In the computation phase, the scores of the student are processed. Based on the
scores, statistical procedures (discussed in more detail in the next section) determine
the student's proficiency and an indication of its accuracy.
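
The text leaves the statistical procedure open at this point. Purely as one possible illustration, the sketch below computes an expected a posteriori (EAP) estimate and its posterior standard deviation on a grid, assuming a 2PL-model and a standard normal prior. The choice of EAP, the grid from -4 to 4 and the function name are assumptions, not the procedure described by the author.

```python
import math


def eap_estimate(responses, grid_points=81):
    """Return (theta_hat, standard_error) for a list of scored answers.

    responses: list of (a, b, score) tuples, score being 0 or 1
    The prior is standard normal, evaluated on an even grid from -4 to 4.
    """
    grid = [-4.0 + 8.0 * k / (grid_points - 1) for k in range(grid_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)       # unnormalized prior density
        for a, b, score in responses:
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            w *= p if score == 1 else 1.0 - p    # likelihood of this answer
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)
```

The second return value serves as the indication of accuracy: it shrinks as more answers are processed.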

Stop
After the administration of each item a decision is made whether a new item is to be
selected or whether testing can be terminated. Criteria for stopping are:
- the accuracy of the estimate of the student's proficiency;
- the accuracy of the decision for classifying the student;
- the maximum (and possibly also the minimum) practically available testing time
or number of items that can be administered.
The last criterion in combination with one of the first two criteria is often chosen.
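
The combination mentioned last can be sketched as a small decision function. The example pairs the first criterion (accuracy of the proficiency estimate, expressed as a standard error) with the third (minimum and maximum number of items); the threshold values are illustrative assumptions and are not taken from the chapter.

```python
def should_stop(n_items, standard_error,
                se_target=0.3, min_items=5, max_items=30):
    """Decide whether the CAT can be terminated after the current item.

    n_items: number of items administered so far
    standard_error: current accuracy of the proficiency estimate
    se_target, min_items, max_items: hypothetical thresholds
    """
    if n_items >= max_items:      # practical upper limit always ends the test
        return True
    if n_items < min_items:       # never stop before the minimum length
        return False
    return standard_error <= se_target
```

A classification-based variant would replace the standard error check with a check on whether the confidence interval around the estimate still contains the cut-off point.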


tions has resulted in more consideration being given to content-based and practical
requirements or conditions in item selection algorithms. In modern CATs, psychometrically
optimal items meeting these practical conditions are selected. Common
to these conditions is that they all have a small detrimental effect on the measurement
accuracy of the CAT. However, the size of the loss in accuracy generally does
not outweigh the practical requirements. Two different conditions, which
are almost always applied, will be discussed next.

Content Control
When only maximum information selection takes place, it could provide results that
are in conflict with the desired content specification of the test. A test constructor
could demand that sub-domains of the measured proficiency are represented in a
certain proportion in each CAT. One reason for this demand could be the content
and face validity of the test, another the requirement to report separate estimates on
the sub-domains for diagnostic purposes. For example, a test on elementary arithmetic
should have an equal number of addition and subtraction items. There are
several possibilities for realizing such a specification, two of which are:
The item bank is partitioned into sub-banks. For each sub-domain to be distinguished
there is a sub-bank, and a student takes a CAT for each sub-bank. If one
uses a variable length for each sub-CAT, there is no complete control on the number,
and thus on the proportions between the number of items, in the sub-domains.
Nevertheless, this approach is often applied. One main reason to do this is that for
certain sub-domains sometimes a specific stimulus or item type is used. Alternating
items of different sub-domains in one CAT then may cause problems. A CAT on
language could, for example, consist of reading and listening items. It is practical not
to mix these item types during the administration of the CAT.
If it is required to have items on sub-domains in fixed proportions in every CAT, it
is possible to adapt the CAT algorithm to achieve that. The idea is that the algorithm
takes care of the best approximation of the desired specification during the administration
of the CAT. An elegant and simple method to achieve this was proposed by
Kingsbury and Zara (1991), which operates as follows: after each item, the percentages
of items in the sub-domains of all items administered thus far are subtracted
from the desired percentages of items in the sub-domains. From the sub-domain that
is furthest below its desired percentage, an item that has maximum information is
administered next. So, in the algorithm first the sub-domain is determined and within
this domain the most informative item.
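
The two-step rule attributed to Kingsbury and Zara (1991) can be sketched as follows. The sketch reads the rule as choosing the sub-domain whose administered proportion lies furthest below its desired proportion; that reading, the 2PL information function, the bank layout and all names are assumptions made for illustration.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at proficiency theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def next_item_balanced(theta_hat, bank, administered, targets):
    """Pick the next item under content control.

    bank: dict item id -> (domain, a, b)
    administered: list of item ids presented so far
    targets: dict domain -> desired proportion of items
    Raises ValueError if no unused item is left in any target domain.
    """
    n = len(administered)
    counts = {d: 0 for d in targets}
    for i in administered:
        counts[bank[i][0]] += 1
    # Shortfall = desired proportion minus proportion administered so far.
    shortfall = {d: targets[d] - (counts[d] / n if n else 0.0) for d in targets}
    # Step 1: among domains with unused items, take the largest shortfall.
    open_domains = [d for d in targets
                    if any(bank[i][0] == d and i not in administered for i in bank)]
    domain = max(open_domains, key=lambda d: shortfall[d])
    # Step 2: within that domain, take the most informative unused item.
    pool = [i for i in bank if bank[i][0] == domain and i not in administered]
    return max(pool, key=lambda i: item_information(theta_hat, bank[i][1], bank[i][2]))
```

For the arithmetic example in the text, `targets` would be `{"add": 0.5, "sub": 0.5}`, so that after an addition item the next item is drawn from the subtraction items.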


J. Wirth

aminee. From each perspective, advantages of new test and item designs will be
discussed as well as problems that emerge from using them.

Competencies
Computer technology and use affect educational research in two ways. First, computers
provide the opportunity to develop new types of learning environments and
new assessment methods that can be used in research on traditional concepts or on
new aspects of those concepts. Second, computer use and learning with new media
have established new and independent fields of educational research leading to the
definition of new constructs and the development of new tests assessing these new
constructs. Thus, computer technology provides the opportunity to operationalize conventional
constructs in new ways or to develop new constructs, raising the prospect of
being able to re-characterize conventional constructs or to define new constructs
and new areas of research (Hadwin et al., 2005).
One of the most prominent examples of re-characterizing a conventional construct
is demonstrated by the work on complex problem solving undertaken by Dörner and
his group in Germany (Dörner, Kreuzig, Reither, & Stäudel, 1983; Dörner & Preußler,
1990; Dörner, Schaub, & Strohschneider, 1999). Dörner created a computer-simulated
town called Lohhausen. Subjects were appointed mayor of Lohhausen and instructed
to govern the town. The simulation included approximately 2000 variables
each of which was somehow connected to the others. Variables changed their values
either as an effect of a mayor's intervention and/or as a function of time. Dörner and
his colleagues were the first to use the computer to simulate such a highly complex
and dynamic system. Their (and related) work had a strong impact on research on
problem solving. Complex problem solving as the competence required to learn how
to control a complex and dynamic system became a new construct in cognitive psychology
(Frensch & Funke, 1995), and the distinction between knowledge acquisition
and knowledge application became prominent in definitions of problem solving
(Funke, 1985). Because low or even negative correlations were found between complex
problem-solving performance and intelligence (e.g., Putz-Osterloh, 1981), even
research on definitions and measures of intelligence was highly influenced by this
and related work (e.g., Kröner, 2001; Leutner, 2002; Süß, 1996, 1999). Nowadays,
computer-based tests are indispensable tools for the assessment of problem-solving
competencies even in large-scale assessments (Baker & O'Neil, 2002; Klieme,
Leutner, & Wirth, 2005; Wirth & Klieme, 2003). They are also used in the assessment
of tacit knowledge about procedures and strategies that cannot easily be verbalized
and therefore is very difficult to assess using conventional paper-based tests
(Berry & Broadbent, 1995; Buchner, Funke, & Berry, 1995; Krauss et al., 2004).


highly complicated cases are presented in an interactive problem-solving situation in
an attempt to reflect real life situations as authentically as possible.
Furthermore, computer-based items have the potential to go beyond being realistic.
Animations can allow examinees to visualize processes that are difficult to perceive
or not observable at all in real life, for example, metabolic processes (Nerdel, 2003).
Animations can also simulate experiments and reactions that are too dangerous to
conduct in reality (Mikelskis, 1997; Prenzel, von Davier, Bleschke, Senkbeil, &
Urhahne, 2000). The latest technology even allows subjects to augment their real-life
observations with further information while they walk around (Knight et al., 2005).
Until now, such technology has been used for instruction and training in learning
environments but it is just a question of time until it is also used for assessment.
In summary, new designs of computer-based tests often use multiple media to present
information in different complex, dynamic and interactive modes. Multimedia,
complexity, dynamism and interactivity are considered good features of more authentic
and realistic, and therefore more valid, tests - although differences in performance
between real and computer-simulated test situations may remain (Shavelson,
Baxter, & Gao, 1993). Increased authenticity leads to increased ecological validity
of a test which in turn may lead to higher acceptance of the test from both test
deliverers and examinees.
However, authenticity does not automatically lead to higher construct validity.
"Adding more realism to test items does not automatically lead to valid measures.
[...] Any new feature added to a test that is not essential to the variable the test is
intended to measure is a potential threat to [construct] validity" (van der Linden,
2002, p. 93). This means that the more complex and dynamic a test is and the more
different media are used to present the information, the more difficult it is to ensure that
the test situation reflects only key aspects of the context and domain that is part of
the definition of the competence and that test performance only reflects the level of
the competence the test is intended to measure. Thus, when designing new computer-based
items and tests there is always a trade-off between ecological validity (and
acceptance by test deliverer and examinee) and construct validity.
The trade-off between task complexity and scoring simplicity is another issue to
be considered when designing computer-based items and tests (Luecht & Clauser,
2002). Complex tasks lead to complex structures of the collected data, and scoring
becomes a complex and laborious task. Conversely, restricting task complexity to
ensure economic scoring can lead to artificial and oversimplified tasks. It is important
to evaluate task complexity and authenticity within the context of the test's
purpose to ensure that the task provides appropriate and valid information for the
test's purpose.


Computer-based measures have the potential to be highly objective and inexpensive,
especially when complex data structures like concept maps are to be analyzed.
However, there are issues regarding the classical criteria of test quality that are often
overlooked. Reliability of computer-based measures is difficult to achieve, especially
when a measure is based on a highly complex data structure. Indices to estimate the
reliability of measures of the process of change are still to be developed. Evaluation
of construct validity and the ability of new computer-based items and tests to be
generalized is often insufficient while test developers often seem to be content with
face validity or ecological validity.

Summary and Conclusion


Computer-based competence testing and assessment is a valuable addition to conventional
formats. Computer-based items and tests lead to the definition of new constructs
or the re-characterization of traditional constructs. They also make it possible
to assess competencies in ways they were never able to be assessed before. The use
of multimedia can enhance test-fairness and authenticity of the test situation, which
in turn can lead to higher acceptance of the test by test deliverers and examinees
alike and, if carefully designed, also to improved construct validity. Collecting data
on computers means no data values are missed. Data, including time recording, can
be collected unobtrusively online. These data provide the groundwork for the definition
of innovative measures such as process measures and can be coded objectively
and inexpensively.
Because of these advantages, computer-based tests can provide new objective,
valid and reliable measures for traditional or new competencies. However, there are
at least three issues that have to be carefully considered when designing new computer-based
items and tests: (1) Ecological validity and construct validity are not
the same, and developing highly authentic test environments does not automatically
lead to the provision of valid construct measures (van der Linden, 2002). In contrast,
increasing authenticity and complexity of a test environment often affects construct
validity because features are added to the test situation that are not an essential part
of the definition of the competence. Thus there is always a trade-off between ecological
validity and construct validity. (2) The more complex the structure of collected
data, the more difficult it is to filter the signal from the noise and to develop a reliable
and valid measure. Thus, the second trade-off is between task and data complexity
on one hand and scoring simplicity on the other (Luecht & Clauser, 2002). (3) As is
true for all new measures, computer-based items and tests have to be evaluated carefully.
Multi-method designs for the evaluation of validity as well as the development
of new estimates of reliability are highly desirable.



Computer-Based Assessment in Support of Distance Learning

255

learning and assessment research, cognitive science, natural language processing and instructional design.
Unfortunately, comprehensive, research-based guidelines for the design, development, delivery, and evaluation of DL systems do not exist. There are some research-based guidelines for part of the DL design and evaluation process (O'Neil, 2005) that are based on instructional and assessment research in general, but not on DL research in particular. The cause of this deficiency is that most DL research has focused on comparisons of traditional instruction to instruction delivered by DL rather than examining the variables that influence the effectiveness of DL for different people learning different skills in different environments.
We believe a comprehensive approach to DL instruction is needed. Over the last decade there have been numerous reviews and studies of the effectiveness of DL in academic and military settings (Fletcher, 2003). Many studies related to DL have evaluated the effectiveness of DL compared to learning in the traditional schoolhouse. In general, most of the military studies have found effect sizes ranging from .39 for computer-based instruction to 1.05 for able tutors. However, we believe that these effects are due to the instructional design, not the media per se. These findings are consistent with prior media comparison studies that have examined and compared each wave of technical innovation with traditional instruction (Barry & Runyan, 1995; Clark, 1983, 1989; Lockee, Burton, & Cross, 1999; Machtmes & Asher, 2000; Phipps & Merisotis, 1999; Smith & Dillon, 1999; Wisher et al., 1999). There is little evidence that simply making content available over the Web to individuals, as is done in many DL implementations, will result in effective instruction. As Clark (1983, 1989) noted, it is the instructional method that influences learning, not the media.
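
The effect sizes quoted above can be made concrete with a small calculation. The chapter does not say which effect-size statistic the cited studies used; the sketch below assumes a standardized mean difference (Cohen's d with a pooled standard deviation), and the function name and the scores are hypothetical illustrations, not data from any study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: this is one common effect-size definition; the source
    does not state which statistic the reviewed studies reported.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical post-test scores for a DL group and a classroom group.
dl_group = [2, 4, 6]
classroom_group = [1, 3, 5]
effect = cohens_d(dl_group, classroom_group)
```

Under this definition, an effect size of .39 would mean that the average learner in the treatment group scored about four tenths of a pooled standard deviation above the average learner in the comparison group.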

Assessment Components for DL Systems


Our conception of the assessment component of the next-generation DL environment has been culled from reviews of the DL literature (Barry & Runyan, 1995; Bonk & Wisher, 2000; Moore & Kearsley, 1996; O'Neil & Perez, in press; Phipps & Merisotis, 1999; Wisher et al., 1999), instructional and design guidelines for DL (Advanced Distributed Learning, 2004; AICC Courseware Technology Subcommittee, 1997; American Council on Education, 1996; O'Neil, 2005; Reigeluth, 2005; U.S. Department of Defense, 1996), and the National Center for Research on Evaluation, Standards, and Student Testing's (CRESST) experience and expertise in assessment and evaluation (Baker, 1994; Baker, Abedi, Linn, & Niemi, 1996; Baker, Aguirre-Munoz, Wang, & Niemi, 2005; Baker & Brown, 2003; Baker & Mayer, 1999; Baker, Niemi, & Herl, 1994; Baker & O'Neil, 1994, 2003, in press; Baker, O'Neil, & Linn, 1993; O'Neil & Baker, 1991, 1994, 1997; O'Neil, Ni, Baker, & Wittrock, 2002).

259

tional purposes (e.g., Armbruster & Anderson, 1984; Chang, Sung, & Chiou, 2002; Heinze-Fry & Novak, 1990; Holley & Dansereau, 1984; Jonassen, Beissner, & Yacci, 1993; McClure, Sonak, & Suen, 1999; Novak, 1998; Novak & Gowin, 1984). Over the last decade there has been interest in using knowledge maps as a way to assess students' conceptual understanding of a domain (e.g., Delacruz, Chung, & Bewley, 2003; Herl, Baker, & Niemi, 1996; Ruiz-Primo, Schultz, Li, & Shavelson, 2001a). Of particular relevance to DL are the efforts at automating the administration, scoring, and reporting of knowledge maps (e.g., Alpert, 2003; Chung et al., 2003; Herl, O'Neil, Chung, & Schacter, 1999; Hoeft et al., 2003). For in-depth reviews of assessment issues related to knowledge maps, see Herl et al. (1999), Chung et al. (2003), and Ruiz-Primo, Shavelson, Li, and Schultz (2001b).

Figure 1. Example of Proposition.


A critical validity issue of an assessment is the scoring procedure, regardless of automated capability. In this section we briefly describe how knowledge maps have been scored. In general, scoring knowledge maps can be referent-free or referent-based. Referent-free methods evaluate the student's map against a rubric or with other criteria (e.g., judging the quality of the propositions [node-link-node tuple], or counting the number of concepts in the map). Referent-based methods compare a student's map against a referent map (e.g., an expert's map or other gold standard). In either case, different scoring approaches use, to different degrees, the configural and semantic properties of the network. Table 1 summarizes the two scoring methods.
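
The distinction between referent-free and referent-based scoring can be sketched in a few lines. This is a minimal illustration, not the scoring procedure of any system cited above: it assumes a map is stored as a collection of (node, link, node) propositions, uses a simple concept count as the referent-free criterion, and uses exact proposition matching against an expert map as the referent-based criterion. All names and the example maps are hypothetical.

```python
def referent_free_score(student_map):
    """Referent-free example: count the distinct concepts (nodes) in the map."""
    concepts = set()
    for source, _link, target in student_map:
        concepts.update((source, target))
    return len(concepts)


def referent_based_score(student_map, referent_map):
    """Referent-based example: count student propositions that also
    appear, with the same nodes and link label, in the referent map."""
    return len(set(student_map) & set(referent_map))


# Hypothetical expert (referent) map and student map about the water cycle.
expert_map = {
    ("sun", "causes", "evaporation"),
    ("evaporation", "leads to", "condensation"),
    ("condensation", "leads to", "precipitation"),
}
student_map = [
    ("sun", "causes", "evaporation"),
    ("evaporation", "leads to", "precipitation"),
    ("condensation", "leads to", "precipitation"),
]

concept_count = referent_free_score(student_map)
matches = referent_based_score(student_map, expert_map)
```

Exact matching uses only part of the semantic information in the network; a scoring approach that also credits near-synonymous links or weighs the map's structure would draw more heavily on the configural and semantic properties mentioned above.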

265

[Figure 3: a query vector holding one weight per term, qw1, qw2, qw3, ..., qwm, for the terms t1, t2, t3, ..., tm.]

Figure 3. Query vector.


Once the to-be-scored content vector is set, a comparison is made between the to-be-scored content vector and each document in the term-by-document matrix, as shown in Figure 4.

[Figure 4: the query vector (weights qw1 ... qwm for terms t1 ... tm) is compared with each column of the term-by-document matrix (weights w11 ... wmn for terms t1 ... tm and documents doc1 ... docn). Panel labels: Query Vector; Term-by-Document Matrix; Compute similarity between query vector and each document vector; Similarity Scores (s1 ... sn, one per document).]

Because the matrix operations return a vector of scores, these scores can be used in a variety of ways. For example, each unscored essay could be considered a query vector, and the score of the essay could be the mean of the 10 most similar documents from the term-by-document matrix.

Figure 4. Computing similarity scores between the term-by-document matrix and query vector.
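
The comparison of a query vector with each document vector can be sketched as follows. The text only speaks of "similarity"; the cosine measure used here is an assumption, as are the function names, the tiny term-by-document matrix, and the human scores attached to the documents. The last function follows the use suggested in the figure note: an unscored essay is treated as the query, and its score is the mean score of the k most similar documents (k = 10 in the note).

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length weight vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_scores(query, term_by_doc):
    """Compare the query with every column (document) of the matrix.

    term_by_doc is a list of rows, one row per term, one column per document.
    """
    documents = zip(*term_by_doc)  # transpose: iterate over document columns
    return [cosine(query, doc) for doc in documents]


def score_essay(query, term_by_doc, doc_scores, k=10):
    """Mean score of the k documents most similar to the query essay."""
    sims = similarity_scores(query, term_by_doc)
    ranked = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
    top = ranked[:k]
    return sum(doc_scores[j] for j in top) / len(top)


# Hypothetical data: 3 terms (rows) by 3 already-scored documents (columns).
term_by_doc = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
doc_scores = [4, 2, 3]   # hypothetical human scores for doc1..doc3
query = [1, 0, 1]        # term weights of the unscored essay

sims = similarity_scores(query, term_by_doc)
nearest_score = score_essay(query, term_by_doc, doc_scores, k=1)
```

With real essays the weights would come from a term-weighting scheme and the matrix would hold many more documents; both choices are outside what this sketch covers.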

271

Baker, E. L., O'Neil, H. F., Jr., & Linn, R. L. (1993). Policy and validity prospects for performance-based assessment. American Psychologist, 48, 1210-1218.
Barry, M., & Runyan, G. B. (1995). A review of distance-learning studies in the U.S. military. American Journal of Distance Education, 9, 37-47.
Bennett, R. E., & Bejar, I. I. (1998). Validity and automated scoring: It's not only the scoring. Educational Measurement: Issues and Practice, 17, 9-17.
Berry, M. W., & Young, P. G. (1995). Using latent semantic indexing for multilanguage information retrieval. Computers and the Humanities, 29, 413-429.
Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13, 4-16.
Bonk, C. J., & Wisher, R. A. (2000, September). Applying collaborative and e-learning tools to military education distance learning: A research framework (Interim Technical Report). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
Breuer, K., Molkenthin, R., & Tennyson, R. D. (in press). Role of simulation in web-based learning. In H. F. O'Neil & R. Perez (Eds.), Web-based learning: Theory, research, and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Burstein, J. (2003). The e-rater scoring engine: Automated essay scoring with natural language processing. In M. D. Shermis & J. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 113-121). Hillsdale, NJ: Erlbaum.
Burstein, J., Kukich, K., Wolff, S., & Lu, C. (1998, April). Computer analysis of essay content for automated score prediction. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
Burstein, J., & Marcu, D. (2003). Automated evaluation of discourse structure in student essays. In M. D. Shermis & J. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 209-229). Mahwah, NJ: Lawrence Erlbaum Associates.
Chang, K.-E., Sung, Y.-T., & Chiou, S. K. (2002). Use of hierarchical hyper concept map in web-based courses. Journal of Educational Computing Research, 27, 335-353.
Chang, W.-C., Hsu, H.-H., Smith, T. K., & Wang, C.-C. (2004). Enhancing SCORM metadata for assessment authoring in e-Learning. Journal of Computer Assisted Learning, 20, 305-316.
Chuang, S.-H., & O'Neil, H. F. (in press). Role of task-specific adapted feedback on a computer-based collaborative problem-solving task. In H. F. O'Neil & R. Perez (Eds.), Web-based learning: Theory, research, and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Chung, G. K. W. K., Baker, E. L., Brill, D. G., Sinha, R., Saadat, F., & Bewley, W. L. (2003). Automated assessment of domain knowledge with online knowledge mapping. Proceedings of the I/ITSEC, 25, 1168-1179.
Chung, G. K. W. K., de Vries, L. F., Cheak, A. C., & Bewley, W. (2005). Process measures of problem solving. Manuscript in preparation.
Chung, G. K. W. K., Harmon, T. C., & Baker, E. L. (2001). The impact of a simulation-based learning design project on student learning. IEEE Transactions on Education, 44, 390-398.
Clark, R. E. (1983). Reconsidering research on learning from media. Review of Educational Research, 53, 445-459.
Clark, R. E. (1989). Current progress and future directions for research in instructional technology. Educational Technology Research & Development, 37, 57-66.
Clark, R. E. (2005). Motivation strategies. In H. F. O'Neil (Ed.), What works in distance learning: Guidelines (pp. 89-109). Greenwich, CT: Information Age Publishing Inc.
Clauser, B. E. (2000). Recurrent issues and recent advances in scoring performance assessments. Applied Psychological Measurement, 24, 310-324.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41, 391-407.

276

G. K. W. K. Chung, H. F. O'Neil, W. L. Bewley, & E. L. Baker

Singhal, A., Salton, G., Mitra, M., & Buckley, C. (1996). Document length normalization. Information Processing and Management, 32, 619-633.
Smith, P. L., & Dillon, C. L. (1999). Comparing distance learning and classroom learning: Conceptual considerations. American Journal of Distance Education, 13, 6-36.
Tennyson, R. D. (2005). Learning theories and instructional design: A historical perspective of the linking model. In J. M. Spector, C. Ohrazda, A. Van Schaack, & D. A. Wiley (Eds.), Innovations in instructional technology: Essays in honor of M. David Merrill (pp. 219-235). Mahwah, NJ: Lawrence Erlbaum Associates.
U.S. Department of Defense. (1996). Department of Defense handbook: Development of interactive multimedia instruction (IMI) (part 3 of 4 parts). Lakehurst, NJ: Naval Air Systems Command.
West, C. D., Pomeroy, J. R., Park, J. K., Gerstenberger, E. A., & Sandoval, J. (2000). Critical thinking in graduate medical education: A role for concept-mapping assessment. Journal of the American Medical Association, 284, 1105-1110.
Wirt, J., Rooney, P., Hussar, B., Choy, S., Provasnik, S., & Hampden-Thompson, C. (2005). The Condition of Education 2005 (NCES 2005-094). Washington, DC: U.S. Department of Education, National Center for Education Statistics.
Wisher, R. A., Champagne, M. V., Pawluk, J. L., Eaton, A., Thornton, D. M., & Curnow, C. K. (1999). Training through distance learning: An assessment of research findings (Tech. Rep. No. 1095). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
Wong, S. K. M., Ziarko, W., Raghavan, V. V., & Wong, P. C. N. (1987). On modeling of information retrieval concepts in vector spaces. ACM Transactions on Database Systems, 12, 299-321.
Wu, C., Berry, M., Shivakumar, S., & McLarty, J. (1995). Neural networks for full-scale protein sequence classification: Sequence encoding with singular value decomposition. Machine Learning, 21, 177-193.

281

Assessment in Large-Scale Studies

such as questionnaires, observations or interviews. Thus, large-scale assessment studies cover a broad range of educational areas and allow for the analysis of relationships between measured variables. In this context large-scale assessment studies are usually referred to as "ex post facto", "causal-comparative" or "correlational research".
"Correlational studies that are ex post facto research focus on the relationships between variables as they occur in natural settings. (...) Ex post facto research can be considered a part of survey research because of its non-experimental nature and the way data are collected; in essence, subjects are surveyed" (Wiersma, 2000, p. 158).
The choice of design, instruments and analysis strategies determines whether relationships between variables can be described or whether relationships can be used to explain and predict educational processes and outcomes (Mandl & Kopp, 2005; Prenzel, 2005; Shavelson & Towne, 2002). Despite these possible variations, large-scale studies generally can be referred to as a form of survey research employing specific research designs. The designs differ in the population or sample to be studied and the sampling procedures. This categorization also can be used to differentiate between large-scale assessments. An overview of research designs is given in Table 1.

Table 1. Research designs in large-scale studies.

Design | Data collection times | Population studied (e.g., age groups) | Sampling
Cross-sectional | One-shot | > 1 (e.g., grades 4, 8, 12) | Random samples for each population at one point in time
Trend | > 1 | > 1 | Random samples for each population at each data collection time
Simple panel | > 1 | | One initial random sample is used repeatedly throughout data collection times
Complex panel | > 1 | > 1 | Multiple random samples are used repeatedly throughout data collection times and could be combined with cross-sectional designs

Longitudinal

286

T. Seidel & M. Prenzel

trend design with a four-year cycle. In addition, TIMSS 2008 assesses students at the end of schooling (grade 12 or 13) in advanced mathematics and physics courses. These studies are commissioned by IEA.

Table 3. Classification of prominent large-scale assessment studies.

 | PISA | TIMSS | PIRLS | NAEP | NAAL
Sample | Random | Random | Random | Random | Random
Scope | International | International | International | National | National
Focus assessment | Reading, Mathematics, Science | Mathematics, Science | Reading | Reading, Mathematics, Science, Writing; other subjects periodically | English literacy
Background | Student background | Student background | Student background | Student background | Adult background
Target group | Students (15-year-olds) | Students (grades 4/8/12) | Students (grades 3/4) | Students (grades 4/8/12) | Adults
Design | Trend (3-year cycle) | Trend (4-year cycle) | Trend (5-year cycle) | Trend (1-year cycle) | Trend (10-year cycle)
Initiator | OECD | IEA | IEA | NCES | NCES

The Progress in International Reading Literacy Study (PIRLS) is the third example of a prominent international large-scale assessment. PIRLS targets reading literacy of students in grades 3 and 4 and applies a trend design with a five-year cycle. PIRLS is another large-scale assessment study commissioned by IEA (Mullis, Kennedy, Martin, & Sainsbury, 2006; Mullis, Martin, Kennedy, & Flaherty, 2002).
The fourth example is the National Assessment of Educational Progress (NAEP), a large-scale assessment study commissioned by the U.S. National Center for Education Statistics (NCES). NAEP encompasses all subjects, with reading, mathematics, science and writing as the major assessment components; other subjects are tested periodically. NAEP targets students in grades 4, 8, and 12. Cohorts are tested in a one-year cycle (National Assessment of Educational Progress, 2001a, 2001b, 2001c, 2001d, 2001e, 2001f; National Center for Education Statistics, 2005).
The last example is the National Assessment of Adult Literacy (NAAL), commissioned by NCES. NAAL is focused on adult English literacy, and two adult cohorts have been investigated in a 10-year cycle (Kutner, Greenberg, & Baer, 2005; National Center for Education Statistics, 2005).

293

major fields of physics, chemistry, biological science, earth and space science, and science-based technology. "Knowledge about science" refers to knowledge of the means ("scientific enquiry") and goals ("scientific explanations") of science (Australian Council for Educational Research, 2006, p. 8).
This definition of scientific literacy is translated to a structural model (Figure 3). In addition, each model component is described in detail and multiple examples are given with regard to different learning contexts, competencies and knowledge facets.

Test Development in PISA

Frameworks represent a blueprint for test development. In PISA, test items are developed by different test centers such as the Australian Council for Educational Research (ACER), the Netherlands National Institute of Educational Measurement (Citogroup), Educational Testing Service (ETS) in the United States, and the Japanese National Institute for Educational Policy Research (NIER). The test centers collaborate with additional subcontractors. In PISA 2006, for example, subcontractors are the Institute for Learning Science (ILS) in Norway and the Leibniz-Institute for Science Education (IPN) in Germany.

[Figure 3 shows four framework components:
Contexts: authentic situations where science and technology are important.
Competencies: identifying scientific questions; explaining phenomena scientifically by applying scientific knowledge; using scientific evidence to make and communicate decisions.
Knowledge: knowledge of science (basic concepts); knowledge about science.
Attitudinal responses: attitudes towards scientific and technological issues.]

Figure 3. Framework for PISA 2006 Science Assessment (Australian Council for Educational Research, 2006, p. 13).

The test development follows a strict technical procedure (Organisation for Economic Co-operation and Development, 2002a, 2002b, 2005a, 2005b, 2005c) which includes the development of a manual for item construction (item types, item format, verbal

299

ticipate effectively in modern society (Organisation for Economic Co-operation and Development, 1999). Despite the efforts of experts to define literacy in a way that meets this objective, PISA has not yet taken steps to investigate how proficiency levels as assessed in PISA are linked to young adults' long-term development and their success in participating in society. Thus, research on the prognostic validity is required to put PISA on solid ground over the long term.

Integrating Knowledge Gained Through Different Large-Scale Assessments


A number of national and international assessment studies are currently realized
in education. As we have shown in the first part of this chapter, these studies differ
with regard to their focus of assessment, their target population and their development
by leading organizations and institutions. As a result, findings are not readily
comparable among large-scale assessments; for example, results of PIRLS focusing
on fourth graders cannot be easily compared to results of TIMSS on grade 7/8 or
PISA on 15-year-olds. Even more difficult is the comparative interpretation of results
when additional national assessments are included. A competitive approach among
different large-scale studies can certainly be fruitful for a while; however, in the
long run the integration of results has to be ensured in order to maximize the output
of these cost-intensive studies. Integration refers to all milestones in the realization
of large-scale assessments such as development of frameworks (e.g., most of the
studies concentrate on domains such as reading, mathematics and science), test
development (e.g., linking items that are used in TIMSS as well as in PISA), technical
procedures for field and main test administration (e.g., using the organizational and
logistic structures of countries), scaling methodologies (e.g., comparing different
approaches in scaling), and lessons learned with regard to producing research reports
and communicating findings to different audiences (e.g., policy makers, stakeholders,
school administrators, teachers, parents, etc.).

Invite Re-Analyses and Additional Research Components


Our final question deals with how to ensure that large-scale assessments are in fact
research studies and not merely institutionalized mechanisms to evaluate the output
of education systems. The size of the samples in large-scale assessments almost
inevitably invites additional analysis that goes beyond the scope of the overall research
objectives of these studies. Thus, it seems like a waste when a research community
does not take advantage of these data sets. The reanalysis of the TIMSS data sets
(Martin, Gregory, & Stemler, 2000; Martin, Mullis, & Chrostowski, 2004; National
Research Council, 1999) as well as the complementary video studies (Hiebert et al.,
2003; Roth et al., 2006; Stigler, Gonzales, Kawanaka, Knoll, & Serrano, 1999) serve
as examples of how these issues can be tackled. A number of additional research

T. Seidel & M. Prenzel

Shavelson, R. J. (1996). Statistical reasoning for the behavioral sciences (3rd ed.). Needham Heights,
MA: Allyn & Bacon.
Shavelson, R. J., & Towne, L. (Eds.). (2002). Scientific research in education. Washington, DC:
National Academy Press.
Stigler, J. W., Gonzales, P., Kawanaka, T., Knoll, S., & Serrano, A. (1999). The TIMSS Videotape
Classroom Study. Methods and findings from an exploratory research project on eighth-grade
mathematics instruction in Germany, Japan, and the United States. Washington, DC: U.S.
Department of Education.
van der Linden, W. J., & Glas, C. A. (Eds.). (2000). Computerized adaptive testing. Theory and
practice. Dordrecht: Kluwer.
Weinert, F. E. (2001). Vergleichende Leistungsmessung in Schulen - eine umstrittene
Selbstverständlichkeit. In F. E. Weinert (Ed.), Leistungsmessungen in Schulen (pp. 17-31).
Weinheim: Beltz Verlag.
Wiersma, W. (2000). Research methods in education: An introduction (7th ed.). Needham Heights,
MA: Pearson.
Wirth, J. (2008). Computer Based Tests: Alternatives for Test and Item Design. In J. Hartig, E.
Klieme, & D. Leutner (Eds.), Assessment of competencies in educational contexts (pp. 235-252).
Göttingen: Hogrefe & Huber.
Wu, M., & Adams, R. (2004). Plausible values: Why they are important. Unpublished paper.

E. Klieme & K. Maag Merki

standards in the German-speaking countries. While different methods of representing
competence levels in psychometric models do exist (Rost, 2004), detailed models
have only been developed for very few subjects - for example, mathematics.
Consequently, the PISA studies form the main point of reference for competence
models in mathematics, natural sciences and first languages, while the Common
European Reference Framework (Council of Europe, 2001) plays the same role in
modeling levels of competence for foreign languages. So far, however, too little
experience has been gained to specify general models that will be consistent over all
subgroups of the student population. Furthermore, we can assume that the systematic
description of competence levels differs depending on the domain. In general, the
successive levels represent combinations of the facets listed above (knowledge, skill,
understanding, action, motivation, etc.; ...). (...) It may be the case that the levels of
a competency model can also be interpreted as consecutive steps in the acquisition
process. (Klieme et al., 2004, pp. 68). A competence model of this kind is capable
not only of expressing particular levels within a specific age or birth cohort of
students but also of describing how the different components of competence evolve
in interaction with one another in the learning biographies of children and young
people, and how competencies are fundamentally acquired. Nevertheless, we may
assume that defining levels of competence from the perspective of developmental
psychology would increase the complexity of the models dramatically. Overall, vast
room remains for further investigation on this aspect of implementing educational
standards.
Owing to their persisting lack of an empirical foundation for competence models,
German educational standards do not structure competencies into levels of
competence. Instead, they break the standards down into core requirement areas. These
areas differ fundamentally from levels of competence (Kultusministerkonferenz,
2004, p. 17), however, and instead of being grounded in empirically validated testing
procedures they are based on teachers' professional experience and task formats
from traditional testing materials. The core requirement areas are therefore
provisional. At the Institute for Educational Progress (Institut zur Qualitätsentwicklung
im Bildungswesen, IQB), which was founded specifically to develop testing materials
on a sound scientific basis, experts work to draft test items that concretize these
standards and render them measurable. In 2006, the first standardized tasks for the
subject of mathematics were developed for all of the German Länder (federal states)
parallel to the 2006 PISA study.
Switzerland has taken a different approach. While in Germany educational
standards were implemented without the existence of any competence models
(Kultusministerkonferenz, 2004, p. 6), the Swiss Conference of the Cantonal
Ministers of Education (EDK) decided to first develop and empirically test the
standards along with the corresponding competence models in a multi-phase process
(Schweizerische Konferenz der kantonalen Erziehungsdirektoren, 2004). In this
process, different scientific consortia have been established that comprise specialists
from a range of fields, teachers of different subjects, and social scientists, who work

Hamilton, L. S., Stecher, B. M., Marsh, J. A., McCombs, J. S., Robyn, A., Russell, J. L., Naftel, S., &
Barney, H. (2007). Standards-Based Accountability Under No Child Left Behind. Experiences of
Teachers and Administrators in Three States. Santa Monica, CA: RAND Education.
Hameyer, U., Frey, K., & Haft, H. (Eds.). (1983). Handbuch der Curriculumforschung: Übersichten
zur Forschung 1970-1981. Weinheim: Beltz.
Herman, J. L. (2004). The Effects of Testing on Instruction. In S. H. Fuhrman & R. F. Elmore (Eds.),
Redesigning Accountability Systems for Education (pp. 141-166). New York/London: Teachers
College Press.
Horn, R. A. J. (2004). Standards. New York: Peter Lang.
Klieme, E., Avenarius, H., Blum, W., Döbrich, P., Gruber, H., Prenzel, M., et al. (2004). The
Development of National Educational Standards. An Expertise. Berlin: Bundesministerium
für Bildung und Forschung. Available URL: http://www.bmbf.de/pub/the_development_of_national_educationel_standards.pdf
Kultusministerkonferenz (2004). Bildungsstandards der Kultusministerkonferenz. Erläuterungen
zur Konzeption und Entwicklung. München: Luchterhand.
Lucyshyn, J. (undated). Handreichung Bildungsstandards. Ein Beitrag zur Qualitätssicherung an
Österreichs Schulen. Salzburg: ohne Verlag.
McLeod, D. B., Stake, R. E., Schappelle, B., Mellissinos, M., & Gierl, M. J. (1996). Setting the
Standards: NCTM's role in the reform of mathematics education. In S. A. Raizen & E. D. Britton
(Eds.), Bold ventures: U.S. innovations in science and mathematics education (Vol. 3: Cases in
mathematics education, pp. 13-132). Dordrecht: Kluwer.
Nichols, S. L., & Berliner, D. C. (2007). How high-stakes testing corrupts America's schools.
Cambridge: Harvard Education Press.
O'Day, J. A. (2004). Complexity, Accountability, and School Improvement. In S. H. Fuhrman & R.
F. Elmore (Eds.), Redesigning Accountability Systems for Education (pp. 15-43). New York/
London: Teachers College Press.
Ravitch, D. (1995). National Standards in American Education. A Citizen's Guide. Washington,
DC: Brookings Institution Press.
Rost, J. (2004). Psychometrische Modelle zur Überprüfung von Bildungsstandards anhand von
Kompetenzmodellen. Zeitschrift für Pädagogik, 50(5), 662-678.
Rychen, D. S., & Salganik, L. H. (Eds.). (2001). Defining and Selecting Key Competencies. Seattle:
Hogrefe & Huber Publishers.
Rychen, D. S., & Salganik, L. H. (Eds.). (2003). Key Competencies for a Successful Life and
Well-Functioning Society. Seattle: Hogrefe & Huber Publishers.
Swiss Conference of the Cantonal Ministers of Education (2004). HARMOS. Zielsetzungen und
Konzeption. Bern: EDK.
Senk, S. L., & Thompson, D. R. (Eds.). (2003). Standards-Based School Mathematics Curricula.
What Are They? What Do Students Learn? Mahwah/London: Lawrence Erlbaum.
Stecher, B. M. (2002). Consequences of large-scale, high-stakes testing on school and classroom
practice. In L. S. Hamilton, B. M. Stecher, & S. P. Klein (Eds.), Test-based accountability in
education (pp. 79-100). Santa Monica: RAND Education.
Swanson, C. B., & Stevenson, D. L. (2002). Standards-based reform in practice: Evidence on state
policy and classroom instruction from the NAEP state assessment. Educational Evaluation and
Policy Analysis, 24, 1-27.
Weinert, F. E. (2001a). Concept of Competence: A Conceptual Clarification. In D. S. Rychen & L. H.
Salganik (Eds.), Defining and Selecting Key Competencies (pp. 45-65). Seattle, Bern: Hogrefe
& Huber Publishers.
Weinert, F. E. (2001b). Schulleistungen - Leistungen der Schule oder der Schüler? In F. E. Weinert
(Ed.), Leistungsmessungen in Schulen (pp. 73-86). Weinheim: Beltz.

C. Nachtigall, U. Kröhne, U. Enders, & R. Steyer

variables. Third, there are those determinants that studies measuring the
effectiveness of schooling want to evaluate.

Student Characteristics
There are a lot of well-known characteristics that determine students' achievement
and competencies. The student's sex, age or language are well known as important
but will not be discussed because the importance of those is obvious and their
measurement unproblematic. However, it is important to notice that some of the
variables described next temporally precede the instruction at school, while
others may interact with past and current instruction (e.g., students' motivation).
One of the most important determinants of competencies taken into account in
school effectiveness research is the social background of the students, which can be
divided into SES and cultural capital (Baumert & Schümer, 2001). In current
research there are different indicators for SES, such as the gainful employment of
the students' parents, their level of education or the relative prosperity of the family.
Possessions in this context include, for example, electronic equipment, cars,
bathrooms, housing conditions in general, and so on (Baumert & Schümer, 2001). The
term "cultural capital" goes back to Bourdieu. In the 1960s he pointed out that
participation in the current culture is an important determinant of school achievement,
and coined the concept of cultural capital (Bourdieu & Passeron, 1964). Cultural
capital comprises many different aspects of daily living such as cultural assets and
resources. PISA used different indicators to measure cultural capital, such as the
national origin of students and their parents, human capital¹, or cultural practices
in the family, meaning the family's closeness to its community's common culture
(Baumert & Schümer, 2001), which is also called the socio-cultural milieu. The often
cited terms "closeness to education" or "social class" are mostly combinations of
two or more of the above-mentioned indicators of social background.
Today's studies show that Bourdieu was at least partially right; determinants such
as immigrant status have a substantive influence, but only when both parents are
from a foreign country is the influence of any practical relevance. Effect sizes of social
background on school achievement are small or intermediate. For example, the study
Qualitätsuntersuchung an Schulen zum Unterricht in Mathematik (QuaSUM) shows
an effect size of r = .31 for SES and competence in mathematics (Helmke, Hosenfeld,
Schrader, & Wagner, 2002a). The same effect size was found by the German part
of PIRLS, called Internationale Grundschul-Lese-Untersuchung (IGLU), between
social background and reading competence (Schwippert, Bos, & Lankes, 2003).
However, there are some restrictions to these results. Some computations in previously
mentioned studies showed that the indicators measure partially redundant constructs

¹ Human capital: comprises knowledge, qualifications and different positive individual
characteristics that provide chances in finding a job but also positive consequences in
non-economic domains such as health or well-being.

Comparison to Expected Values


Strategy 3 offers comparisons not to existing classes or schools but to expected values.
Such expected values describe the achievement which can be expected under specific
constellations of relevant determinants of students' competencies. In PISA 2000, the
school feedback reports in Germany contained expected values which were calculated
on the basis of the following background variables: students' sex and basic cognitive
skills (measured with the Kognitiver Fähigkeitstest by Heller, Gaedike, & Weinlander,
1985); social status; relative wealth of the family; the parents' level of education;
colloquial language in the family; and level of parental support with homework. For each
student, an expected value was calculated using a multiple linear regression model with
the predictors described above. Only data from the same state and school type were used.
The difference between the expected value and the student's actual test score (residual) was of
interest. The average of these differences over all tested students of a school was taken
as an indicator for the effectiveness of this school (Waterman & Stanat, 2004). This
calculation of expected values was based on specific assumptions, i.e., that the relation
is linear and that interactions do not exist. An alternative which does not need these
assumptions is the use of a saturated model of the ANOVA type. However, the predictors
have to be discrete for this approach. For example, this procedure is used in the context
of the Thuringian Competency Tasks, a comparative test in the German state of Thuringia
(www.kompetenztests.de). An expected value is estimated by the cell mean within
a multifactor analysis of variance. Relevant temporally preceding predictors such
as students' mother tongue or socio-economic background are the different factors;
students' test score is the dependent variable. The average of all students' expected
values within one class provides the comparison value for this class (cf. Nachtigall,
Kröhne, & Müller, 2005). In the case of mixed (discrete and continuous) predictors, a
combination of linear and saturated models within the General Linear Model
framework should work without problems.
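
The two ways of obtaining expected values described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure actually used in PISA 2000 or in the Thuringian tests: the function names, the single predictor and the tiny data set are hypothetical, and the restriction to data from the same state and school type is ignored.

```python
import numpy as np


def expected_values_linear(predictors, scores):
    """Expected score per student from an OLS regression of scores on predictors."""
    predictors = np.asarray(predictors, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if predictors.ndim == 1:
        predictors = predictors[:, None]
    design = np.column_stack([np.ones(len(scores)), predictors])  # intercept + predictors
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return design @ coef


def mean_residual_by_group(expected, scores, groups):
    """Average of (actual - expected) per group, e.g. per school."""
    residuals = np.asarray(scores, dtype=float) - np.asarray(expected, dtype=float)
    groups = list(groups)
    result = {}
    for group in sorted(set(groups)):
        mask = np.array([g == group for g in groups])
        result[group] = float(residuals[mask].mean())
    return result


def expected_values_cell_means(factor_rows, scores):
    """Saturated ANOVA-type alternative: expected value = mean score of the student's cell."""
    scores = np.asarray(scores, dtype=float)
    cells = [tuple(row) for row in factor_rows]  # one cell per combination of factor levels
    totals, counts = {}, {}
    for cell, score in zip(cells, scores):
        totals[cell] = totals.get(cell, 0.0) + score
        counts[cell] = counts.get(cell, 0) + 1
    return np.array([totals[c] / counts[c] for c in cells])


# Hypothetical data: one continuous background predictor, two schools.
ses = [0, 1, 2, 3, 0, 1, 2, 3]
school = ["A"] * 4 + ["B"] * 4
score = [11, 13, 15, 17, 9, 11, 13, 15]

expected = expected_values_linear(ses, score)
school_effects = mean_residual_by_group(expected, score, school)
```

In this constructed example the students of school A score one point above and those of school B one point below what their background predicts, so the averaged residuals separate the two schools. The cell-mean function needs discrete predictors, as noted in the text; averaging its output within a class would give that class's comparison value.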
Another way to deal with expected values was applied in MARKUS (Helmke &
Jäger, 2002). Similar to PISA, relevant context variables were used to calculate
expected values via linear regression models. In this case, the calculation took place on
the class level. The residual (i.e., the difference between class mean and expected
class mean) was computed for each class and added to the expected value for the
average predictor constellation. This value was reported. It is interpreted as the score
a class would have achieved if students' background conditions were not as they
actually are but if context conditions had been equal over all classes.
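
A minimal sketch of this class-level adjustment, assuming a single context variable and ordinary least squares, might look as follows. The function name and the numbers are hypothetical and do not reproduce the MARKUS analyses.

```python
import numpy as np


def adjusted_class_scores(context, class_means):
    """Class residual plus the expected value at the average context constellation."""
    context = np.asarray(context, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    if context.ndim == 1:
        context = context[:, None]
    design = np.column_stack([np.ones(len(class_means)), context])
    coef, *_ = np.linalg.lstsq(design, class_means, rcond=None)
    expected = design @ coef                      # expected class means
    residual = class_means - expected             # class mean minus expected class mean
    expected_at_average = design.mean(axis=0) @ coef
    return expected_at_average + residual


# Hypothetical data: four classes, one context variable.
context = [0, 1, 2, 3]
class_means = [51, 51, 53, 57]
adjusted = adjusted_class_scores(context, class_means)
```

With these constructed numbers the regression line is 50 + 2 * context, the expected value at the average context is 53, and the reported scores are 54, 52, 52 and 54. Classes 1 and 4 thus come out equal once context is held constant, although their raw means differ by six points.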
As has been shown, expected values are used in different ways - to calculate
the results which could be expected given actual conditions or to calculate
scores which could be expected if conditions were different. However, using
expected values in the latter way implies that the regression model used reflects a
causal model, an assumption which may be wrong. In general, the expected

this topic. The concept of mediators, which plays a crucial role in the area of school
effectiveness, has to be incorporated with this theory. For instance, school type is
such a variable. It explains much of the variance in achievement at school, although
it mediates the effect of other context variables such as cognitive abilities or SES,
which themselves affect the choice of school type.
On the substantive side, there are many small-range models and theories which
provide insight into the functioning of selected determinants. However, an integrated
theory of determinants and their causal functioning with respect to students'
competencies is still missing. Research synthesis is needed here, but the task is tricky.
Determinants of students' competencies are divided into three groups: those that are
temporally preceding; those that are partially preceding and partially classified in
interaction with instruction; and those that measure the attributes and circumstances
of instruction and schooling themselves. Temporally preceding predictors should
be used for adjustment procedures. But what about variables that interact with
instruction like motivation, parental support or previous knowledge? The predictors
of achievement and the achievements themselves influence each other. Thus, the
independent variable is partially caused by the dependent variable and vice versa.
The fact that the predictor includes some of the effect of instruction has the
consequence that the influence of instruction is underestimated. In contrast, the
temporally preceding predictors can be taken as independent variables without such
problems. Including such interactive variables in the adjustment model might lead to
underestimations of the effect of instruction and school type; leaving them out leads
to an overestimation. Strategies are needed to cope with this problem. One approach
might be the differentiation between states and traits of those interactive variables.
For example, the measuring of a motivational trait of the student might be a useful
predictor for adjustment, while the actual state (or better, state residual) indicates the
effect of instruction and should not be used for adjustment purposes.
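
The direction of these biases can be illustrated with a small simulation. The sketch below is purely hypothetical: the data-generating model, all coefficients and the variable names are assumptions chosen for illustration and are not taken from any of the studies discussed. In it, instruction has a total effect of 1.5 score points, of which 0.5 runs through the motivational state.

```python
import numpy as np


def ols_coefficients(y, *predictors):
    """OLS coefficients (intercept first) for y regressed on the given predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate_adjustment_bias(n=20000, seed=0):
    """Compare instruction-effect estimates under three adjustment choices (simulated data)."""
    rng = np.random.default_rng(seed)
    # Assumed model: the trait precedes instruction and also affects who receives it.
    trait = rng.normal(size=n)
    instruction = (trait + rng.normal(size=n) > 0).astype(float)
    # The motivational state is partly caused by instruction (interactive variable).
    state = trait + 0.5 * instruction + rng.normal(scale=0.5, size=n)
    score = 1.0 * instruction + 1.0 * state + rng.normal(size=n)
    return {
        "true_total_effect": 1.5,  # 1.0 direct + 0.5 via the state
        "unadjusted": float(ols_coefficients(score, instruction)[1]),
        "adjusted_for_state": float(ols_coefficients(score, instruction, state)[1]),
        "adjusted_for_trait": float(ols_coefficients(score, instruction, trait)[1]),
    }


estimates = simulate_adjustment_bias()
```

Under these assumptions, leaving out any adjustment overestimates the instruction effect, adjusting for the state underestimates it, and adjusting for the temporally preceding trait recovers it approximately. The result depends entirely on the assumed causal structure, which is exactly the point made in the text.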

Hierarchical Structure
Students are nested in classes, classes are nested in schools and schools in regions
and countries. Though there have been important statistical advances in dealing
with hierarchical data (e.g., Bryk & Raudenbush, 1992; Raudenbush, Hong, &
Rowan, in press), many theoretical and practical questions remain unanswered. On
the theoretical side, approaches to causal modelling still lack adaptation to such a
hierarchical context. On the practical side, the effects of determinants in former
studies have been measured quite heterogeneously, sometimes at the student level and
sometimes at the class or school level. The following short example should describe
the problem. MARKUS examined the effect of television and video consumption
on the class level and on the single student level. In the first case the correlation
between the predictor and the competency in mathematics ranges between r = -.02
and r = -.20, depending on the school type. On the student level, a zero correla-

Chapter 16
Monitoring and Assurance of School Quality:
Principles of Assessment and Internet-Based
Feedback of Test Results
Ingmar Hosenfeld

In 2002 the 16 German Länder (federal states) agreed to implement nationwide
educational standards in order to assure educational quality. A group of experts, headed
by Eckhard Klieme, was asked to outline the basic idea, the function, the required
features and possible problems of implementing such educational standards. In the
resulting document (Klieme et al., 2003), the experts state that educational standards
facilitate the necessary change in educational policy in Germany because they can
be regarded as a scale against which student achievement should be measured, thus
enabling the installation of an output-orientation (as opposed to the input-orientation
that characterized the dominant view in Germany at that time). In order to fulfill this
function, educational standards (cf. Klieme & Maag Merki, 2008; Chapter 14 in this
book) need to have some specific properties:
1) They must represent a core curriculum, which should be common to the curricula
of the 16 Länder. Not all topics of traditional curricula can be covered; a reduction
towards central topics is necessary.
2) They need to be oriented towards competence models and should name the
competencies students of certain grades need to acquire in order to perform
successfully at subsequent educational stages. Therefore, educational standards
should define the minimal competencies required of the students of a specific
grade level. If elaborated models of competence are available, it should be possible
to also define more advanced levels of competencies, but the central regulatory
function lies within the definition of minimal standards.
3) They should be defined in such an explicit and well-structured way that they allow
a straightforward test construction along the lines of the defined competencies.
Shortly after the publication of this expert report, standard-setting commissions were
set up and began defining educational standards for the final grades of different
secondary school tracks and for primary school levels, e.g. grades 4, 9, and 10. All of
the Länder were involved in these commissions so that curricular validity could be
ensured. The orientation towards models of competence posed problems as no
elaborated and empirically founded competence models were available. Accordingly, it
J. Hartig, E. Klieme, & D. Leutner (Eds.),
Assessment of Competencies in Educational Contexts, 337-356.
© 2008 Hogrefe & Huber Publishers

Monitoring School Quality: Assessment and Internet-Based Feedback

[Figure 1, partially recovered. Box "Information about": level of achievement; variation of achievement; types of errors; aspects of diagnostic competency. Box "Comparison to": classrooms of the same school; similar classrooms (context-wise); Federal State; Educational Standards; results of previous tests.]

Figure 1. The way from diagnosis to innovation (Helmke, 2004).


The first aspect of the pathway from diagnosis to innovation concerns the richness
of the diagnosis of the current state, e.g. content and degree of detail of the feedback.
Logically speaking, the more information is provided the more information might
be used. However, not all feedback is necessarily comprehended as intended, so
that the selection of information to be reported is of crucial importance. I believe
it is necessary to inform teachers not only about average achievement but to also
include information on the variation within their classroom. Different pedagogical
orientations might not be reflected adequately if only central tendencies (e.g., mean
scores) are reported. Moreover, providing detailed results in different content areas
(e.g., reading comprehension, writing, spelling, geometry, arithmetic) rather than
reporting scores on the subject level (e.g., mathematics, German) makes it easier for
teachers to relate the results to their previous instruction. Another aspect that
facilitates this link between instruction and result is the availability of different scales of
comparison. In the VERA-project we try to provide information for all three
possible types of comparison:
1) Social comparison: comparing the results of one's classroom to those of the other
classrooms of the same school, the same Land, or a group of similar classrooms.
2) Comparison with a subject-matter based criterion: How many of the students of a
given classroom possess the competencies of a certain proficiency level or those
explicated in the educational standards?
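
The reporting ideas in this passage (average achievement plus variation, a social comparison, and a criterion-based comparison) can be illustrated with a minimal Python sketch. It is not the VERA implementation: the function names, the integer coding of proficiency levels, and the example numbers are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def summarize_classroom(scores, levels, criterion_level):
    """Summarize one classroom (hypothetical helper, not part of VERA).

    scores: one test score per student.
    levels: one proficiency level per student, coded as integers (assumed).
    criterion_level: the level regarded as meeting the criterion (assumed).
    """
    if not scores or len(scores) != len(levels):
        raise ValueError("scores and levels must be non-empty and equally long")
    reached = sum(1 for level in levels if level >= criterion_level)
    return {
        "mean": mean(scores),              # central tendency
        "sd": pstdev(scores),              # variation within the classroom
        "share_at_criterion": reached / len(levels),  # criterion comparison
    }


def social_comparison(classroom_mean, reference_means):
    """Difference between a classroom mean and the mean of a reference group,
    e.g. the other classrooms of the same school (illustrative only)."""
    return classroom_mean - mean(reference_means)


# Invented example data for a classroom of five students.
example_scores = [12, 18, 25, 9, 21]
example_levels = [1, 2, 3, 1, 2]
summary = summarize_classroom(example_scores, example_levels, criterion_level=2)
difference = social_comparison(summary["mean"], [14, 16, 15])
print(summary, difference)
```

Reporting the standard deviation next to the mean reflects the argument that central tendencies alone may hide differences between pedagogical orientations.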


4) Finally, information concerning the correspondence between predicted and actual
numbers of students solving the mathematics items selected by the school is
made available to participating teachers. This is summarized in terms of absolute
differences between prediction and actual solution frequency (level information)
as well as in terms of the correlation between the predicted and actual rank orders
of the item difficulty per classroom (precision information).
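
The two summaries named in this item can be sketched in a few lines of Python. The sketch rests on assumptions: the text does not say how the absolute differences are aggregated, so a mean absolute difference is used here, and the rank-order correlation is computed as a Spearman coefficient (a Pearson correlation of average ranks). All function names and numbers are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / sqrt(var_x * var_y)


def prediction_feedback(predicted, actual):
    """Return (level, precision) for one classroom.

    predicted / actual: number of students solving each item, in item order.
    level: mean absolute difference (the aggregation is an assumption).
    precision: Spearman rank correlation of predicted and actual values.
    """
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    level = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
    precision = pearson(average_ranks(predicted), average_ranks(actual))
    return level, precision


# Invented example: four items, predicted vs. observed numbers of solvers.
level, precision = prediction_feedback([20, 15, 10, 5], [18, 16, 6, 8])
print(level, precision)
```

Ranking items by the number of students solving them, instead of by difficulty, reverses both rank orders at once and therefore leaves the correlation unchanged.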
In order to estimate the attained achievement at the level of Länder, a sample of
schools is drawn for extended analyses. The participating teachers of these schools
are asked to fill in an internet-based teacher questionnaire about the classroom
composition and to report back their basic scores within a shorter time frame. The
second wave of feedback starts when all data from these schools have been collected. It
contains four pieces of additional information:
1) The display of distributions across levels of competence comparing the classroom
with the school level is extended to the level of Federal State.
2) The display of distributions across levels of competence comparing the classroom
with the school level is extended to a group of classrooms with a similar classroom
composition, e.g. comparable context situation.
3) The coefficient concerning the correspondence of rank orders of predicted and
actual item difficulty is located in the distribution of all such correlation coefficients
of participating teachers.
4) The relative frequency of correct answers within the classroom is contrasted with
the relative frequency of correct answers within the representative sample of
drawn schools. This is done graphically for each test item.
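
Points 3 and 4 of this list lend themselves to a small numerical illustration. The following Python sketch is an assumption-laden example, not the procedure used in the project: the position of a teacher's coefficient is expressed as a percentile rank (with ties counted half), and the per-item contrast is returned as a table of numbers rather than drawn as a graph. All names and data are invented.

```python
def percentile_rank(value, all_values):
    """Percentage of coefficients below `value`, counting ties as one half
    (one common definition; the text does not specify which is used)."""
    if not all_values:
        raise ValueError("all_values must not be empty")
    below = sum(1 for v in all_values if v < value)
    equal = sum(1 for v in all_values if v == value)
    return 100.0 * (below + 0.5 * equal) / len(all_values)


def item_contrast(class_correct, class_size, sample_correct, sample_size):
    """Contrast relative frequencies of correct answers per item.

    class_correct / sample_correct: dicts mapping an item label to the
    number of correct answers in the classroom / in the sample (assumed).
    Returns tuples (item, classroom share, sample share, difference).
    """
    rows = []
    for item, n_correct in class_correct.items():
        class_share = n_correct / class_size
        sample_share = sample_correct[item] / sample_size
        rows.append((item, class_share, sample_share, class_share - sample_share))
    return rows


# Invented example: one teacher's coefficient among four, and two test items.
position = percentile_rank(0.8, [0.2, 0.5, 0.8, 0.9])
rows = item_contrast({"item_1": 15, "item_2": 5}, 20,
                     {"item_1": 600, "item_2": 400}, 1000)
print(position)
for row in rows:
    print(row)
```

A graphical version would plot the two shares side by side for each item; that step is omitted here to keep the sketch free of plotting dependencies.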
A final task within this third branch of the VERA project is to gather evaluative
feedback by way of an internet-based questionnaire. Two aspects are in focus:
1) How are the various steps of the project (administration, scoring, feedback of
results, etc.) evaluated?
2) How do individual teachers and staffs of schools master the challenges/tasks
imposed on them by the project and how do they deal with the provided results?
Results concerning the first topic are used to improve the procedures, materials and
feedback of results within the project. Results related to the second aspect provide
first insights into the triggered processes of school development as well as into the
inevitable problems arising from the new task of developing a culture of dealing with
evaluation.

Illustration of Internet Use


As mentioned above, the VERA-project utilizes the internet throughout the different
stages. Generally, three functions of internet interactions that are described in the
following sections can be distinguished:


their computer equipment continually at a functional level. In cases where school
computers are well maintained this is often due to the individual efforts of teachers.
Accordingly, it is quite common for participating teachers to use their own personal
computers at home in the course of the project. Although this is helpful - particularly
in the face of hard- or software problems with the school's machines - it masks the
true dimension of the problem and should not be necessary. This situation requires
special attention during programming as we cannot assume that the users' computers
or operating systems are up to date. Moreover, it is of vital necessity to provide help
concerning all technical aspects - not just the ones directly related to the project. In
the VERA-project this is achieved by offering a telephone hotline which is operated
by specially trained students.
This hotline also deals with the problem of a very large variance of computer
proficiency of German primary school teachers. In some cases the VERA-project
provided the first interaction with the internet, especially in the group of teachers aged
50 and older. Another closely related aspect concerns the usability of the system.
As the users have to complete all steps within certain time frames and in addition to
their usual daily work schedules, it is of crucial importance that the system is easy
to understand and use and error-tolerant, and that it provides informative feedback
at all times. Every problem the teachers experience along the way absorbs resources
(time, effort, motivation) that would better be invested in the reflection on the results
or possible consequences.
The most time-consuming technical part for the teachers is the input of the
evaluation of student responses, because several hundred mouse clicks are necessary in
this step. In the far future the test might no longer be administered in a
paper-and-pencil format, but as a computerized test, thus eliminating this laborious part. This
would also help to improve measurement accuracy, as tests could be adaptive. In the
nearer future the lack of sufficient numbers of computers in schools - each student
would need one - probably prohibits this option.
A different, although related, technical problem concerns the distribution of the
test booklets through the internet. Although this approach has clear organizational
advantages over the classical approach of sending printed booklets by mail,
especially when schools have a degree of choice of the items contained in the test, it also
has specific downsides: As the schools use whatever equipment is available to them
to print and copy the material, the layout of the booklets needs to be fairly simple and
should not rely on colors or very detailed pictures for conveying relevant
information. It is not even possible to guarantee that intended opposing pages (e.g.,
containing a text on one and the questions concerning this text on the other page) are indeed
presented to the students as intended (many schools have direct access only to
copying machines that cannot print on both sides of a sheet of paper). Furthermore,
the duplication of the test booklets consumes considerable amounts of time (often
that of teachers) and funds, and it thus needs to be done in a fairly short period of
time (for mathematics: from Friday through Monday). The required organizational
planning is sometimes perceived as an unjustified extra burden by the schools. In
