
DOI: 10.1111/j.1475-679X.2011.00421.x
Journal of Accounting Research
Vol. 49 No. 5 December 2011
Printed in U.S.A.

The Role of Financial Incentives in Balanced Scorecard-Based
Performance Evaluations: Correcting Mood Congruency Biases
SHUJUN DING AND PHILIP BEAULIEU

Received 14 September 2010; accepted 9 June 2011

ABSTRACT

Moods are low-intensity affective states that individuals bring to a decision,
and may be especially important when the balanced scorecard (BSC) is
used for performance evaluation purposes. We propose that financial incen-
tives can motivate decision-makers to correct mood congruency biases, in
which judgments and decisions are consistent with moods. In experiment 1,
participants rated the performance of one division manager based on two

University of Ottawa; University of Calgary. An earlier version of this paper was presented
at the 2008 AAA Midwest Regional Meeting, the 2008 AAA Annual Meeting, the 2008 CAAA
Annual Meeting, and the 2009 ABO conference. It is based on the PhD dissertation (University
of Calgary) of the first author, supervised by the second author. The authors would like to
thank the members of the committee, Cynthia Simmons, Kate White, and Michael Wright,
and external examiners, Teresa Kline and Alan Webb. We also thank Douglas Skinner (the
editor) and an anonymous referee, and acknowledge the helpful comments of Fodil Adjaoud,
Cam Graham, Linda Grensing-Pophal, Susan Haka, Irene Herremans, Steven Kaplan, Joanne
Leck, Tim Miller, Cam Morrill, Janet Morrill, Sean Peffer, Steve Salterio, Parbudyal Singh, and
Gary Spraakman, and the comments of workshop participants at Brock University, Concordia
University, University of Lethbridge, University of Manitoba, University of Ottawa, and York
University. Financial support was provided by the University of Calgary to the first author. We
thank Kate White for sharing her mood induction instrument.

Copyright ©, University of Chicago on behalf of the Accounting Research Center, 2011
1224 S. DING AND P. BEAULIEU

accounting measures and another manager based on a 16-measure BSC;
there were mood congruency biases at both levels of information load. Fi-
nancial incentives to make benchmark-consistent judgments eliminated bias
in the former condition but not in the BSC condition. In experiment 2, in-
centives were offered and performance evaluations were based on an eight-
measure BSC; mood congruency bias was eliminated. Results suggest that
management control systems, specifically financial incentives, should be in-
cluded in future affect correction research.

1. Introduction
Many claim that the balanced scorecard (BSC) is one of the most important
management accounting innovations in the last two decades, and surveys
consistently list it as one of the most popular management tools around
the world (Rigby and Bilodeau [2009]). Accounting researchers, however,
have documented several biases and problems associated with its applica-
tion (e.g., Lipe and Salterio [2000], Banker, Chang, and Pizzini [2004],
Ittner, Larcker, and Meyer [2003]). The complexity of the BSC is believed
to result in information overloading, which, in turn, compromises decision
quality when using the BSC for performance evaluation, as is commonly
found in practice. We test whether the BSC's complexity leaves it vulnerable
to the well-documented mood congruency bias, and whether this bias
is affected by financial incentives, another element of management control
systems (MCS). Our study links BSC-related biases, affect, and MCS litera-
ture.
The term affect refers to feelings, including both moods and emotions.
Moods are defined as low-intensity affective states that individuals bring to
the decision context, while emotions are defined as more-intensive affective
states with a definite cause and clear cognitive content related to decisions
(Forgas [1992], Kida, Moreno, and Smith [2001], Moreno, Kida, and Smith
[2002]). The moods and emotions of decision-makers are critically impor-
tant because individuals rarely make decisions devoid of feeling. Forgas and
George [2001, p. 5] commented that moods are especially important in
examining individuals' behaviors and play a crucial role in organizational
settings, because:

Moods thus provide the underlying affective context for most of our on-
going thought processes and behaviors. Enduring mood states may be
triggered by such fleeting cues as a passing smile, the weather, a pleasant
room, a tone of voice, or a nonverbal gesture. Indeed, mild, nonspecific
moods often have a more subtle and insidious influence on organizational
behavior precisely because they lack elaborate cognitive content and thus
often escape conscious scrutiny.

The distinction between emotions and moods is very important in judgment
and decision-making research. Emotions may or may not be useful in
organizational settings; sometimes they provide content that is irrelevant to
INCENTIVES IN BALANCED SCORECARD-BASED EVALUATIONS 1225

decisions, but positive emotions resulting from being treated fairly by team
members increase willingness to cooperate with them in a common task
(Cremer and Hiel [2006]). By definition, though, moods are unequivocally
irrelevant to decision-making contexts. It is crucial, therefore, to improve
our understanding of how, when, and why mood will influence decision-makers'
thinking and behavior (Forgas [2001a, 2001b]).
Numerous studies have demonstrated that mood states impact decision-makers'
behavior, resulting in mood-congruent judgments (Fiedler [2001],
Schwarz and Clore [1983]). For example, a manager may read a newspaper
story about poor prospects for recovery from a global recession, and be in a
pessimistic mood later that day when rating a subordinate's performance as
below expectations. The cause of the mood, speculation about the future
course of the economy, is unrelated to the judgment of past performance,
and the manager is unaware of its influence. Mood congruency, also known
as affect infusion, has been found in a variety of contexts and is a reliable
everyday phenomenon (Forgas [2001a, 2001b]). It has also been studied
in behavioral economics, including the role played by moods in investors'
decision-making (e.g., Saunders [1993], Hirshleifer and Shumway [2003],
Kamstra, Kramer, and Levi [2003], Edmans, Garcia, and Norli [2007],
Kaplanski and Levy [2010]).
Research on correcting for mood congruency biases (McFarland and
Buehler [1998], McFarland, White, and Newth [2003], Schwarz and Clore
[1988], Tice and Bratslavsky [2000]) has focused on attending to and ac-
knowledging moods. Mood acknowledgment is based on the assumptions
that people can correct for biasing influences better if they have a theory
to explain them and they are motivated to correct their judgments (Wil-
son and Brekke [1994], Wegener and Petty [1995]). However, mood ac-
knowledgment strategies have not always been effective in prior research
(Detweiler-Bedell and Salovey [2003], Gohm [2003], Showers and Kling
[1996], Smith and Petty [1995]).
We contribute to the affect literature by proposing that MCS, which have
not been considered in bias correction models, provide an alternative mo-
tivation to correct mood congruency biases. Conversely, we contribute to
the MCS literature by suggesting a benefit that has been neglected: reduc-
tion of mood congruency biases in judgment and decision-making. Prior re-
search in accounting and auditing has drawn upon the affect literature and
employed acknowledgment to reduce congruency biases (Kadous [2001]),
but has not discussed or tested the ability of conventional MCS components
to perform the same function, especially in the environment of the BSC, a
popular and complex evaluation tool. In other words, mood congruency
biases have not been as fully incorporated into accounting contexts as we
attempt. Given that moods constitute the underlying context of organiza-
tional behavior (Forgas and George [2001]) and that MCS are in place in
most, if not all, organizations, the coexistence of moods and MCS in orga-
nizations presents an interesting setting to understand the role MCS play
in correcting for congruency bias.
We design two experiments to examine our research questions. Our
experiments tie monetary rewards to performance evaluations, an ar-
rangement that is documented in both research and practice (Salvem-
ini, Reilly, and Smither [1993], Murphy and Cleveland [1995], Grensing-
Pophal [2001], Roch [2005], St-Onge et al. [2009]). Consistent with prior
literature (e.g., Lipe and Salterio [2000, 2002], Libby, Salterio, and Webb
[2004], Banker, Chang, and Pizzini [2004]), participants in our study make
subjective performance evaluations. Although the use of subjectivity in a
BSC-based evaluation context may raise concerns, as some firms apply
a formula-based approach to BSC evaluations (e.g., Malina and Selto
[2001]), the involvement of subjectivity in performance evaluations is well
documented (Gibbs et al. [2004], Bol [2008], Ittner, Larcker, and Meyer
[2003]).
In experiment 1, we establish a mood congruency bias affecting perfor-
mance evaluation judgments in the absence of financial incentives; partici-
pants who were induced to feel good (bad) gave higher (lower) evaluation
scores to divisional managers. The bias appeared at a low level of infor-
mation load, in which evaluations were based on two financial measures,
and at a high level, where evaluations were based on a BSC including 16
measures. Our finding of mood congruency bias at both levels thus sug-
gests that performance evaluation itself is a complex task that motivates
evaluators to adopt a heuristic approach to complete it; the finding is also
consistent with a recent affect study in which auditors feeling good (bad)
made a higher (lower) valuation of inventory (Chung, Cohen, and Monroe
[2008]). When financial incentives to make benchmark-consistent evalua-
tions were added in experiment 1, they eliminated mood congruency bias
in the low, but not the high, information load condition, reflecting that
information load and evaluators' skills may influence the effectiveness of
financial incentives as a correction tool. Experiment 2 introduced an inter-
mediate level of information load in which performance evaluations were
based on a reduced BSC that comprised eight measures. Financial incen-
tives for benchmark-consistent evaluations were offered in experiment 2
and they eliminated mood congruency bias. This result suggests that, to en-
able financial incentives to reduce mood congruency biases, the number
of measures included in a BSC must be reduced to the capacity of working
memory (Miller [1956]).

2. Theory and Hypotheses


We first review in this section prior literature in psychology, accounting,
and behavioral economics/finance on the mood congruency effect, then
develop our first hypothesis, in which such an effect is expected when in-
dividuals perform the task of evaluations without financial incentives. In
section 2.1, we discuss mood de-biasing mechanisms proposed in prior re-
search, and effects of financial incentives in accounting, economics, and
organizational contexts. The second hypothesis on the correcting role of
incentives follows from this discussion. In section 2.2, we develop a research
question regarding the role of information load in mood congruency bias
correction.
Mood congruency bias has been well documented in psychology litera-
ture (e.g., Forgas and George [2001]), but affect research in accounting has
been devoted primarily to biases caused by emotions in capital budgeting
and investment decisions (e.g., Kida, Moreno, and Smith [2001], Moreno,
Kida, and Smith [2002], Sawers [2005], Kaplan, Petersen, and Samuels
[2007]). A notable exception is Chung, Cohen, and Monroe [2008], who
document a mood congruency bias among auditing professionals and stu-
dents.1 Emotional reactions occur in accounting contexts, but mood states
are much more common and constitute the underlying affective context
in which judgments are made (Forgas and George [2001], Chung, Cohen,
and Monroe [2008]). The morale in firms, a result of factors such as merg-
ers, expansion, good or bad leadership, success, and failure, probably cre-
ates moods that employees bring to their tasks, and those who evaluate the
performance of others may be especially prone to affective influences.
Moods influence judgment and decision-making when individuals
heuristically process information (Schwarz and Clore [1983], Forgas
[1995], Forgas and George [2001]). Affect-as-information heuristics are
found in both psychology and accounting literature (Clore, Schwarz, and
Conway [1994], Schwarz and Clore [1983, 2003], Kadous [2001]); in these
heuristics, affect itself is a source of information. For example, in an au-
diting litigation setting, jurors may feel really bad when they learn about
negative audit outcomes, and may use their negative feelings as relevant
information to blame auditors (Kadous [2001]). Kadous studied emotions,
but recent research examining the behavior of investors also shows that
stock returns are significantly affected by investors' moods arising from aviation
disasters (e.g., Kaplanski and Levy [2010]), soccer game losses (e.g.,
Edmans, Garcia, and Norli [2007]), and weather (Hirshleifer and Shumway
[2003]) in a mood-congruent way. Therefore, the affect-as-information
heuristic is related to both moods and emotions. Wilson and Brekke [1994,
p. 128] argue that people tend to adopt heuristics when processing infor-
mation in order to reduce effort; they are not aware that they use such
heuristics, and thus the influence of irrelevant information in particular
is difficult to correct (Wilson, Centerbar, and Brekke [2002]). As Kadous
[2001, p. 429] argues, the affect-as-information heuristic could be imple-
mented when individuals make complex social judgments and difficult and
complex evaluations. The first hypothesis applies the affect-as-information
heuristic literature to performance evaluations based on accounting infor-
mation, predicting that individuals may heuristically conduct performance

1 Our examination of mood effects is different from Chung, Cohen, and Monroe [2008].
First, we focus on the application of the BSC to performance evaluations, while they investi-
gated auditor judgments. Second, more importantly, we focus on correction of mood congru-
ency bias, while they did not.
evaluations, a complex task, when there is no financial incentive in place. It
aims to replicate the mood congruency bias in this setting, and serves as a
necessary baseline against which we examine the effects of financial incen-
tives. We assert that H1 holds regardless of the degree of information load,
and defer a discussion of the role of information load in bias correction to
section 2.2.
H1: When financial incentives are not in place, performance evalua-
tions will be higher (lower) when positive (negative) moods are
present.

2.1 FINANCIAL INCENTIVES AND MOOD CONGRUENCY CORRECTION


Financial incentives have rarely been discussed in prior research on bias
correction and performance evaluation. This section reviews the limited
research in these areas relevant to our second hypothesis and extends the
literature review to the field of economics. We pay particular attention to
Rickman and Witt [2008] and Salvemini, Reilly, and Smither [1993]; both
examine the effects of financial incentives on correcting biases. Drawing
upon these ideas, we predict the effect of financial incentives in mood cor-
rection in H2.
Performance-contingent financial rewards are a conventional means of
motivating improved performance in organizations (Awasthi and Pratt
[1990], Drake, Haka, and Ravenscroft [1999], Bonner et al. [2000],
Bonner and Sprinkle [2002]). Bonner and Sprinkle [2002] point out that a
monetary incentive works first by improving the incentive-effort link, then
by enhancing the effort-performance relation. Improved performance re-
quires success in both links.
In general, affect correction research in psychology is uninformed by
accounting research on cognitive effects of monetary incentives. The pri-
mary method of correcting affect-induced biases in the psychology litera-
ture is acknowledgment/attribution, in which decision-makers are directed
to evaluate their affective states or attribute their feelings to other sources
before making their judgments (McFarland, White, and Newth [2003],
Schwarz and Clore [1983]). Acknowledgment/attribution has also been
studied as a correction mechanism in auditing research (Kadous [2001]).
However, affect acknowledgment has not always been effective in correct-
ing biases. Wilson and Brekke [1994, p. 126] attributed mixed results to
the fact that people underestimate their own susceptibility to bias, and
they overestimate the extent to which they can control their judgments and
feelings. We propose that financial rewards can create motivation pow-
erful enough to correct a mood congruency bias without requiring mood
acknowledgment.2

2 Some may argue that the effect of financial incentives on mood congruency correc-
tion is not due to the motivation induced, but because moods could be changed by the in-
centives. Empirical evidence in psychology and accounting did not find that bias-correction
We now turn to the fields of economics and organizational behavior to
further review the effect of financial incentives. The economics paper most
relevant to our second hypothesis is that by Rickman and Witt [2008], in
which a financial incentive eliminated unconscious bias. The bias was a fa-
voritism in English soccer, in which referees awarded more injury (extra)
time when the home team was behind by one goal, and less time when it
was ahead by one goal. In the 2001–2002 season, for the first time, the English
Premier League employed truly professional referees and paid them
salaries. Rickman and Witt found that favoritism that had existed before
that season was eliminated, and, after controlling for effects like changes in
referee quality, concluded that incentives were responsible. In H2 we pre-
dict that incentives can eliminate mood congruency, another unconscious
bias. Before we review the particularly relevant study in performance eval-
uation involving financial incentives, we first establish the practicality of
incentivizing performance evaluations.
The quality of performance evaluation and how to motivate managers
to make effective evaluations concern both academics and managers. Ac-
cording to Grensing-Pophal [2010], a Senior Professional in Human Re-
sources (SPHR), it is common practice to rate managers on the timeliness
and quality of their performance evaluations. Human resources (HR) ex-
perts claim that managers should be held accountable for the quality of
performance evaluation and such quality should be tied to managers' compensation
(Grensing-Pophal [2001, p. 47]). Leading scholars in the human
relations field (Murphy and Cleveland [1995]), and HR managers (St-Onge
et al. [2009]), also recommend that the accuracy and quality of perfor-
mance evaluations should be financially rewarded in order to provide suf-
ficient incentives. Our study thus reflects both actual and recommended
performance evaluation practices.
Few judgment and decision-making studies have addressed the prac-
tice of incentivizing performance evaluations; the studies by Salvemini and
his coauthors are exceptions. Salvemini [1988] and Salvemini, Reilly, and
Smither [1993] offered evaluators financial incentives to be accurate; they
were motivated to give more accurate appraisals to customer sales repre-
sentatives after watching videotapes in which such representatives were per-
forming their tasks. In Salvemini, Reilly, and Smither [1993], a 3 × 3 factorial
design was adopted, with financial incentives and prior performance
information being independent variables. Financial incentives were either

mechanisms changed participants' experience of moods. In Schwarz and Clore's [1983] classic
study, for example, participants receiving attribution manipulation no longer rely on
their moods to assess their life satisfaction, but retain either positive or negative moods
after the manipulation. Kadous [2001, p. 439] explicitly indicates that attribution instruc-
tions/manipulations are not expected to change the experience of affect, and the empirical
evidence presented in her experiment is consistent with her arguments. As discussed later, we
did not measure mood states, but prior studies suggest that bias-correction schemes alone do
not change moods.
present or absent; when they were present, raters were informed about such
rewards either prior to viewing the tapes or after they viewed the tapes.
Similarly, raters in the positive (negative) prior performance information
condition were informed that ratees had received above (below) average
ratings for their prior work; the third group did not receive any information
on prior performance. Salvemini, Reilly, and Smither [1993] manipulated
financial incentives in a way quite similar to ours, which is discussed later
in the section on experimental design. More specifically, they informed the
raters that the true performance scores of ratees were provided by a group
of experts, and raters would be rewarded with cash depending on the extent to
which their evaluations came close to expert ratings. Those who provided
the most accurate evaluations, that is, closest to expert ratings, would re-
ceive $200.
Consistent with their hypotheses, when financial incentives were absent,
prior performance information led to biased appraisals of the current pe-
riod such that raters receiving positive (negative) prior performance in-
formation gave more (less) favorable ratings than their counterparts with-
out this information; that is, the lack of motivation resulting from the lack
of financial incentives led to an assimilation effect. Furthermore, raters
who were provided with prior performance information, either positive or
negative, evaluated ratees less accurately compared to those with no such
information. When monetary incentives were offered, as predicted, prior
performance information failed to influence either the average rating or
accuracy. Salvemini et al. further found that the timing of the incentive offer
mattered; raters who were provided with the incentives before viewing the
tapes of ratees' performance gave evaluations of the highest quality.
Financial incentives offered in these studies enabled participants to cor-
rect for prior information bias without being made aware of it. Our proposi-
tion differs significantly from Salvemini, Reilly, and Smither [1993] because
we address mood congruency biases and correction, whereas Salvemini et
al. were interested strictly in decision-making accuracy; as noted before,
mood congruency bias is understudied in accounting, and using financial
incentives to correct for such an effect has not been examined in account-
ing and psychology. More importantly, we examine the application of the
BSC, a popular yet complex performance evaluation tool, and consider the
possible impact arising from information load when financial incentives
are employed to debias moods. The widespread application of the BSC and
the well-documented concerns arising from its application indicate that
more work is needed to examine the design of this popular tool; our in-
vestigation of BSC in the performance evaluation setting in which moods
could be an important contextual factor sheds new light on this manage-
ment accounting innovation. Evaluators in Salvemini et al. were required
to evaluate customer sales representatives in just three dimensions, but
the BSC employed by our study involves four perspectives, with each per-
spective involving multiple measures, as discussed below. The evaluation
task thus is much more complex in our study, but Salvemini et al. does
provide us with a precedent for experimental use of financial incentives in
performance evaluation tasks. H2 proposes using incentives to eliminate
an unconscious bias, similar to the result of Rickman and Witt [2008], but
in a controlled, rather than natural, experiment. We examine two levels of
information load in the context of performance evaluation, but H2 does
not predict whether incentives will be effective at only one or at both lev-
els of information load. H2 reflects that individuals would be motivated to
exert greater effort when financial incentives are present, but defers the
role that information load may play in influencing the effectiveness of such
incentives to section 2.2.
H2: When financial incentives are in place, moods will not influence
performance evaluations.

2.2 THE EFFECT OF INFORMATION LOAD


Complexity of accounting measurement is an important consideration
because, as information complexity increases, individuals are required to
have higher skill levels to improve task performance (Bonner et al. [2000],
Bonner and Sprinkle [2002]). Among the 131 experiments reviewed by
Bonner et al. [2000], financial incentives are found to lead to improved per-
formance only in 50% of the experiments; information complexity, among
other things, is blamed for the failure of financial incentives to improve per-
formance in almost half of the experiments. According to these authors,
when information becomes cognitively complex, it increases the gap between
requirements of the task of interest and individuals' knowledge and
skill, thus eliminating the effect of monetary incentives.
In experiment 1, we include two information load conditions. The popu-
larity of the BSC has attracted many firms to use it as a performance evalu-
ation tool, with multiple perspectives and many measures, while others do
not use the BSC framework and adopt only a few traditional financial mea-
sures. The BSC serves as a proxy for the high information load condition,
while the latter is used for the low information load condition.
More specifically, the high information load condition bases perfor-
mance evaluations on 16 BSC measures in four perspectives. The devel-
opers of the BSC have indicated that it should be used for strategic plan-
ning and management (Kaplan and Norton [1996a, 1996b, 2001a, 2001b])
rather than as a performance evaluation tool. Nevertheless, the BSC is used
for control purposes, such as performance evaluations, as evidenced by
many accounting studies (e.g., Banker, Chang, and Pizzini [2004], Ittner
and Larcker [1998], Lipe and Salterio [2000, 2002]).
H1 predicts that the absence of financial incentives will lead to a mood
congruency effect, even for the evaluation based on two financial mea-
sures only. This seems a plausible prediction because the task of perfor-
mance evaluation itself is a complex one. According to Bonner et al. [2000],
performance evaluations should be classified as problem-solving tasks, the
most difficult and cognitively challenging among the five types of tasks
they reviewed. Problem-solving tasks, according to them, generally can be
completed in numerous ways, and the existence of multiple options further
increases the complexity of these tasks. Therefore, when it comes to perfor-
mance evaluations, even with traditional financial measures only, individu-
als may still find the task complex and effortful; when financial incentives
are not in place, they may therefore use what they were feeling to make a
shortcut to complete the evaluation task, that is, the affect-as-information
heuristic. The mood congruency effect may arise due to the adoption of
such heuristics.
Bonner et al. [2000] suggest that complexity plays a significant role in
determining whether incentives will affect performance. However, their re-
view also indicates that, in any specific study, the precise point at which in-
creasing information load eliminates incentive effects is an empirical ques-
tion. When financial incentives are provided, a greater level of effort is
expected, but whether an increased level of effort leads to improved per-
formance and/or higher quality of decision-making depends on the in-
formation load and the skills/abilities of individuals performing the task.
Consequently, we examine the impact of information load in the following
research question.
RQ1: At what level of information load will financial incentives elimi-
nate the mood congruency bias?

3. Method and Results


3.1 EXPERIMENT 1
The dependent variable in experiment 1 was performance ratings on
a 101-point scale. There were three independent variables, each having
two levels: mood (positive and negative), financial incentive (absent and
present), and information load (two accounting measures and 16 BSC
measures).
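
As a minimal illustration of the design just described, and not a reproduction of the authors' materials, the following Python sketch enumerates the eight cells formed by crossing the three two-level factors and checks that a rating lies on the 101-point (0 to 100) scale. The names `FACTORS`, `design_cells`, and `is_valid_rating` are hypothetical and introduced only for this sketch; it makes no assumption about which factors were manipulated between or within subjects.

```python
from itertools import product

# Factor levels as named in the text; the variable names are illustrative.
FACTORS = {
    "mood": ("positive", "negative"),
    "financial_incentive": ("absent", "present"),
    "information_load": ("2 accounting measures", "16 BSC measures"),
}


def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def is_valid_rating(score):
    """A rating is valid if it is an integer on the 101-point scale (0..100)."""
    return isinstance(score, int) and 0 <= score <= 100


if __name__ == "__main__":
    for cell in design_cells(FACTORS):
        print(cell)
```

Crossing three factors with two levels each yields 2 × 2 × 2 = 8 cells, and a scale running from 0 to 100 in whole points has 101 possible values, which is why it is called a 101-point scale.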

3.1.1. Participants. One hundred and four MBA students participated in
three sessions, one in the no-incentive conditions and two in the incentive
conditions.3 There were 52 subjects in each mood condition, and 37 (67)
subjects were in the no-incentive (incentive) condition. The mean age of
subjects was 33.4 years and they had been in business for 9.6 years on av-
erage (median = 8). There were 75 male (72.1%) and 29 female (27.9%)
students. Among 104 participants, the majority (75, 72.1%) had experience
of evaluating others, and 34 (32.7%) of them had used the BSC as a perfor-
mance evaluation tool. On average, participants had been evaluating others
in their organizations for 4 years (median = 3).
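
The reported sample figures are internally consistent, which the short Python check below illustrates. It uses only the counts stated in this paragraph; the variable names (`N_TOTAL`, `counts`, `shares`) are hypothetical and the snippet is a sketch for the reader, not part of the original analysis.

```python
# Counts as reported in the participants paragraph.
N_TOTAL = 104
mood_groups = (52, 52)          # subjects per mood condition
incentive_groups = (37, 67)     # no-incentive, incentive

counts = {
    "male": 75,
    "female": 29,
    "has evaluated others": 75,
    "has used the BSC for evaluation": 34,
}

# Percentage of the full sample, rounded to one decimal place.
shares = {label: round(100 * n / N_TOTAL, 1) for label, n in counts.items()}

if __name__ == "__main__":
    print(sum(mood_groups), sum(incentive_groups))
    for label, pct in shares.items():
        print(f"{label}: {pct}%")
```

Both groupings sum to the full sample of 104, and the rounded shares match the percentages given in the text (72.1%, 27.9%, 72.1%, and 32.7%).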

3 Three sessions were available and it was necessary to designate each as either no-incentive
or incentive. The no-incentive condition replicates prior congruency bias research, and we
accordingly introduced H1 as a baseline. The third session was designated incentive because
this condition is the focal point of the paper.
3.1.2. Procedure. Students were invited to participate in a study designed
to improve our understanding of performance evaluations involving the
BSC. They were informed that the study included two tasks. The first task
was called visual imagery style, asking them to recall and describe a life
event in order to show their visual imagery style, and the second task in-
volved performance evaluations. Participants were informed that the re-
searchers were interested in how their imagery style influences perfor-
mance evaluations. Participants were paid $10 for their participation in a
40- or 50-minute session, depending on the condition (discussed below);
in addition to the participation fee, there was a lottery at the end of
the session in which the winner would receive $400 cash.
Participants listened to the experimenter explain the general instruc-
tions; completed the first task, visual imagery; and read the evaluation case
(the second task) themselves. The first task, visual imagery, was used to in-
duce moods. Participants were not asked between the two tasks to acknowl-
edge their moods and no theory or instructions for mood correction were
given.
The case began with a passage containing background information illus-
trating the evaluation systems used by two divisions. Included in the case
information was a selection of performance data on two divisions of a firm
that were being evaluated for annual review. Included also was the name of
each division manager, the target market of each division, and the divisional
strategies. In the case information, participants were informed that this was
an initial evaluation for annual performance review, and they needed to try
their best to give evaluations on each division manager given data limitations.
They rated divisional managers' performance based on a 101-point
scale that has been used in prior studies (Lipe and Salterio [2000, 2002],
Libby, Salterio, and Webb [2004]). A rating of 100 indicated excellent per-
formance and 0 indicated extremely poor performance. Participants were
then asked to indicate the extent to which they were happy with each divisional
manager's performance and whether or not they would promote
each manager based on the information given. After completing the case
evaluation, they completed a demographic questionnaire. Before partici-
pants left the session, a debriefing was conducted to remove the impact of
the mood induction.

3.1.3. Manipulations. Moods were manipulated between subjects at two
levels, positive and negative, by the first visual imagery style task. Par-
ticipants were asked to recall and describe an event they experienced, a
mood induction method that has been widely used by affective studies in
psychology literature (e.g., DeSteno et al. [2000], McFarland, White, and
Newth [2003], Ruder and Bless [2003], Rusting and DeHart [2000], Tamir
and Robinson [2004]). Participants were required to remember a particu-
larly positive and pleasant (negative and unpleasant) event, such as hear-
ing some very good (bad) news. They visualized either positive or negative
events that still made them have very positive or negative feelings for about
2 minutes, and then spent 10 minutes writing a description of them on
sheets provided, in as much detail as they could as though these events were
happening again. We did not include a manipulation check of moods in
the experiments because Erber, Wegner, and Therriault [1996] and McFar-
land, White, and Newth [2003] found that explicit questions about moods
may enable participants to acknowledge their moods and correct biases.
Incentives to give benchmark-consistent evaluations were manipulated
as either present or absent. As noted before, offering financial incentives
to evaluators was adopted in prior experimental research (e.g., Salvem-
ini, Reilly, and Smither [1993], Roch [2005]), and is an emerging busi-
ness practice. When financial incentives were absent, participants were in-
formed that their chances of winning the lottery did not depend on their
performance evaluation answers; there would be one card containing their
participant number (i.e., one chance) in the lottery box for each of them
regardless of their answers. This instruction clarified that the draw was a
random result unrelated to performance. When financial incentives were
in place, participants were also informed that a group of professionals had
given a benchmark performance score for each divisional manager using
the same evaluation scale, and that incentives would be determined by the
degree of correspondence between their scores and the scores given by the
professionals. Specifically, they were told that if both of their scores fell
within the range of ±3 of the benchmark scores, they would have 10 cards
containing their participant number placed in the lottery box; otherwise
they had only 1. Thus, participants giving benchmark-consistent perfor-
mance scores would be rewarded by having nine additional cards in the
lottery box. This design follows Roch [2005] and Salvemini, Reilly, and
Smither [1993], in which evaluators were given monetary incentives and
those whose ratings came closest to the expert ratings would receive a cash
reward.
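
The card-allocation rule described above can be sketched in a few lines. This is an illustration only, not the authors' procedure: the function name `lottery_cards`, the reading of the band as 3 points on either side of each benchmark with inclusive bounds, and the pairing of scores with benchmarks by position are all assumptions. The benchmark values 53.3 and 58.3 are the ones reported for the two divisions elsewhere in this section.

```python
def lottery_cards(scores, benchmarks, tolerance=3.0):
    """Return the number of lottery cards a participant receives.

    Assumption: a score is benchmark-consistent when it lies within
    `tolerance` points on either side of its benchmark (bounds inclusive),
    and BOTH scores must qualify to earn the larger allocation.
    """
    all_within = all(abs(s - b) <= tolerance for s, b in zip(scores, benchmarks))
    return 10 if all_within else 1


# Benchmarks reported in the paper: WorkWear (two measures), TeenWear (BSC).
BENCHMARKS = (53.3, 58.3)

print(lottery_cards((55, 60), BENCHMARKS))  # both scores inside the band
print(lottery_cards((55, 65), BENCHMARKS))  # TeenWear score outside the band
```

Under this reading a single out-of-band score is enough to lose the nine additional cards, which is what makes the incentive depend on both evaluations.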
As was explained in section 2, information load may affect the ability of
decision-makers to correct biases. Therefore, information load was manip-
ulated within subjects at two levels: the task was based on two accounting
measures (low information load) and 16 accounting measures in a BSC
(high information load). As in the Lipe and Salterio [2000, 2002] instru-
ment, participants were informed that RACK Incorporated is a firm that
specializes in women's apparel, and has two divisions: TeenWear and WorkWear,
with each having a different strategy. WorkWear Division specialized
in business uniforms and focused on its financial performance. Only two
financial measures were provided for the division; this was the low informa-
tion load condition. The measures, return on sales and revenues per sales
visit, were indicated to be equally relevant and important to reaching WorkWear
Division's strategic goal. One measure was above target, one measure
was below target, and the percentage by which one measure exceeded the
target was equal to the percentage by which the other measure fell short
of the target. Return on sales is a common measure (used by both divi-
sions), while revenues per sales visit is a measure unique to WorkWear. Nine
faculty members in several business schools were asked to give a performance
score, which indicated an approximately average performance
(mean = 53.3).
TeenWear Division was the high information load condition. It specialized
in clothing for teenagers and had developed a BSC. The BSC had four categories:
financial, customer-related, internal business processes, and learning
and growth, with each category having four measures. Within each
category, two measures were above target, two measures were below target,
and the percentage by which two measures exceeded targets was equal to
the percentage by which another two measures fell short of targets (Bhattacharjee
and Moreno [2005]). It was further indicated that each category
and measure in TeenWear's BSC was equally relevant and important
to reaching its strategic goal. This design was employed in Bhattacharjee
and Moreno [2005] in order to indicate an approximately average performance.
A group of managers in Bhattacharjee and Moreno's study (the
control group) did rate performance as approximately average (mean =
58.3) using this design and the same scale employed in this study, and this
mean score was used as the benchmark for the BSC division. The BSC in-
formation given for TeenWear Division is presented in the appendix.
With a total of 16 measures organized into four related BSC categories,
TeenWear Division's accounting information was more complex than WorkWear
Division's two measures. Half of the participants evaluated WorkWear
Division first and TeenWear Division second, while the other half performed
the task in a reverse order, to counterbalance order effects.4

3.1.4. Results. At the end of sessions in experiment 1, participants were
asked to indicate the extent to which they agreed with two statements
based on a seven-point scale, with one (seven) meaning strongly disagree
(agree). The first stated specifically that chances of winning the lottery de-
pended on their performance evaluation answers, and the second stated
that they had a financial incentive. For statement (1), the mean value for
participants in the nonincentive condition (incentive condition) was 1.68
(4.91), a significant difference (F = 76.092, p < .001). For statement (2),
the mean response for participants in the nonincentive condition (incen-
tive condition) was 3.11 (4.76), also significant (F = 14.383, p < .001).
These differences indicate that the manipulation of financial incentives was
successful. As noted before, the mood manipulation was not checked.5

4 Order did not significantly affect performance evaluations as a main effect or interacting
with the independent variables.


5 Before participants left the session, they were asked to indicate the extent to which they
were feeling good or bad, with seven (one) denoting very good (bad). They were also asked
a second question about whether they felt positive or negative, using a similar scale where
seven (one) indicated very positive (negative). The purpose of these questions was to obtain
some information about how subjects were feeling, so that the debriefing process that followed
could be more targeted and effective. To some extent, the questions may help in assessing the

TABLE 1
Evaluation Scores by Financial Incentive, Moods, and Information Load a

                                                   Positive                Negative              Difference
Incentive condition      Information load          Moods (1)               Moods (2)             (1) − (2)
No financial incentive   BSC (16-measure)          64.42b (15.14)c [19]d   52.78 (12.53) [18]     11.64
No financial incentive   Traditional (2-measure)   57.05 (19.67) [19]      46.67 (13.28) [18]     10.38
Financial incentive      BSC (16-measure)          62.67 (15.42) [33]      53.18 (13.04) [34]      9.49
Financial incentive      Traditional (2-measure)   53.85 (15.02) [33]      56.47 (13.88) [34]     −2.62

a Evaluations were made using a 101-point scale adapted from Lipe and Salterio [2000].
b Mean evaluation scores.
c Standard deviation.
d Cell size.

Performance evaluation scores given by participants are presented in
table 1, and will be discussed further below for each condition of the experiment.
First, a 2 × 2 × 2 Repeated Measures ANCOVA was conducted on the
complete data set, including experience as a covariate. Both general work-
ing experience and experience of evaluating others were employed in the
tests. The sample sizes of experimental groups in this study were not equal
and this inequality might affect the F tests performed. However, Boneau
[1960, 1961] found that the combination of heterogeneity of variance and
different sample sizes could be the only situation in which significance tests
will be adversely affected. Levene's test was conducted to test whether the
error variance of the dependent variable was equal across groups. The p-values
were .353 and .561, respectively, failing to reject the null hypothesis
of equal error variance. Therefore, the inequality of sample sizes should
not adversely affect the significance tests.
The ANCOVA results are presented in table 2, with general working ex-
perience as the covariate. Table 2 indicates that moods significantly affected
evaluation judgments (p = .005). The interactive effect of information load
(the BSC vs. traditional financial measures) and moods was also signifi-
cant (p = .041), suggesting that moods differentially affected performance
evaluations, depending on the information load employed. The difference
in mean evaluation scores between positive and negative moods with tra-
ditional measures (incentive and no-incentive conditions combined) was
1.94, but with BSC measures was 10.27 (untabulated). The moods ×
information load × incentive interaction is marginally statistically significant
(p = .096).
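
The untabulated combined differences quoted above can be recovered from the cell means and cell sizes in table 1. The sketch below assumes that "combined" means a cell-size-weighted average of the raw means across the incentive and no-incentive conditions; that pooling rule and the names `TABLE_1`, `pooled_mean`, and `mood_difference` are illustrative assumptions, not the authors' stated procedure.

```python
# Raw cell means and cell sizes copied from table 1: (mean, n).
TABLE_1 = {
    ("no_incentive", "bsc", "positive"): (64.42, 19),
    ("no_incentive", "bsc", "negative"): (52.78, 18),
    ("no_incentive", "traditional", "positive"): (57.05, 19),
    ("no_incentive", "traditional", "negative"): (46.67, 18),
    ("incentive", "bsc", "positive"): (62.67, 33),
    ("incentive", "bsc", "negative"): (53.18, 34),
    ("incentive", "traditional", "positive"): (53.85, 33),
    ("incentive", "traditional", "negative"): (56.47, 34),
}


def pooled_mean(load, mood):
    """Cell-size-weighted mean across both incentive conditions (assumed rule)."""
    cells = [v for (_, l, m), v in TABLE_1.items() if l == load and m == mood]
    total_n = sum(n for _, n in cells)
    return sum(mean * n for mean, n in cells) / total_n


def mood_difference(load):
    """Positive-mood mean minus negative-mood mean for one information load."""
    return pooled_mean(load, "positive") - pooled_mean(load, "negative")


for load in ("traditional", "bsc"):
    print(load, round(mood_difference(load), 2))
```

Under this weighting the differences round to the 1.94 and 10.27 reported in the text, which suggests that the combined figures are weighted raw means rather than covariate-adjusted means.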
We then examine the effect of mood congruency under the no-incentive
and incentive conditions, respectively. When no incentives are in place,
as can be seen from table 1, performance evaluations were lower in the

effectiveness of the mood manipulation. For a combined variable in which responses to the
questions are summed, the mean response of participants who were instructed to visualize
negative and unpleasant events was 8.00, whereas those who visualized positive and pleasant
events had a mean response of 9.47. This difference is consistent with our expectations (F =
3.466, p = .071). Statistical results are consistent for each question considered separately.
TABLE 2
Results of Repeated Measures ANCOVA of Performance Evaluations: Business Working Experience as a Covariate

Variables                                 df   SS          MS         F       P
Between subjects:
  Moods                                    1    2,481.04   2,481.04   8.210   .005
  Incentive                                1       78.16      78.16    .259   .612
  Business working experience              1        .318       .318    .001   .974
  Moods × incentive                        1      684.78     684.78   2.266   .135
  Error                                   99   29,919.28     302.22
Within subjects:
  Information load                         1        2.36       2.36    .018   .894
  Information load × moods                 1      571.16     571.16   4.305   .041
  Information load × incentive             1      378.51     378.51   2.853   .094
  Information load × moods × incentive     1      375.66     375.66   2.831   .096
  Error                                   99   13,135.94     132.69

TABLE 3
Experiment 1: Simple Effects Tests Holding Financial Incentives Absent

                                   Adjusted Mean
                          Positive Moods   Negative Moods   F Value   P Value
BSC (16-measure)               65.03            53.36         6.30      .014
Traditional (2-measure)        56.41            46.05         4.22      .043

negative mood condition than in the positive mood condition in both lev-
els of information load; the difference was 10.38 with two measures and
11.64 with 16 measures. Table 3 presents simple effects tests by holding fi-
nancial incentives constant at the absent level; the main effect of moods is
significant for both levels of information load (p = .014 for the BSC condi-
tion and p = .043 for the two-measure condition), supporting mood con-
gruency as predicted in H1; the result is also presented in figure 1. The

FIG. 1. Experiment 1: performance evaluations when financial incentives are absent.

FIG. 2. Experiment 1: performance evaluations when financial incentives are present.

finding suggests that even with two financial measures evaluators may still
consider performance evaluation a complex and difficult task and thus
rely on their moods to make a shortcut; they may have used the well-
documented affect-as-information heuristic when evaluating the perfor-
mance of divisional managers, affirming the impression based on figure 1
that the mood effect was equally strong with respect to the two financial
measure and 16-measure BSC conditions.6 Years of business experience is
included as a covariate in the simple effects tests.7
When financial incentives are provided, results are as presented in figure 2.
Mood congruency was completely eliminated with two measures; the
(raw) mean was 56.47 in the negative mood condition and 53.85 in the positive
mood condition (t = 0.742). Mood congruency persisted in the 16-measure BSC
condition, where mean evaluation scores were 62.67 and 53.18 in the positive
and negative mood conditions, respectively. The difference in these scores, 9.49,
is similar to the difference in the nonincentive condition, 11.64. Table 4 presents
simple effects tests by holding financial incentives constant at the present
level. The main effect of moods was not significant (p = .450) for the
two-measure condition, but was highly significant for the BSC condition

6 As previously mentioned, in addition to performance evaluation scores, we employed two
supplemental measures: participants' happiness with managers' performance and their
willingness to promote managers. Simple effects tests were also conducted for these two measures
by holding financial incentives constant at the absent level. For participants' happiness with
managers' performance, the results of the simple effects tests remain unchanged; the main
effect of mood was significant for the two-measure condition (p = .022) and marginally
significant for the BSC condition (p = .058). For participants' willingness to promote managers,
the main effect of moods was significant (p = .040) for the two-measure condition only; the
main effect of moods for the BSC condition was not significant at the conventional level (p = .190).
7 When years of evaluating others was used as the covariate, the results remain unchanged.
The main effect of moods was significant for the two-measure condition (p = .042) and for
the BSC condition as well (p = .015).
TABLE 4
Experiment 1: Simple Effects Tests Holding Financial Incentives Present

                                   Adjusted Mean
                          Positive Moods   Negative Moods   F Value   P Value
BSC (16-measure)               62.44            52.74         7.87      .006
Traditional (2-measure)        54.08            56.93          .57      .450

(p = .006).8 Years of business experience is included as the covariate in the simple
effects tests.9 These results support H2 with respect to the two-measure
condition, where mood congruency bias was eliminated, but suggest that
decision-makers have difficulty eliminating mood congruency when a 16-
measure BSC is used. The finding that mood congruency bias persisted
in the BSC environment when financial incentives were in place partially
answers our research question; a conventional BSC, with four perspectives
and 16 measures, could be too complex for financial incentives to elimi-
nate mood congruency biases. Having clearly obtained mood correction in
the two-measure condition but not with a 16-measure BSC, we designed a
second experiment to see whether less complex BSC-based performance
evaluations would respond to financial incentives.
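
The raw-mean comparison for the two-measure condition under incentives can be checked from the summary statistics in table 1. The sketch below assumes that the reported t = 0.742 is an equal-variance (pooled) two-sample t statistic on the raw means; that choice of test and the helper name `pooled_t` are assumptions made for illustration.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic computed from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error


# Table 1, financial incentive present, traditional (2-measure) row:
# negative moods 56.47 (13.88) [34]; positive moods 53.85 (15.02) [33].
t_two_measure = pooled_t(56.47, 13.88, 34, 53.85, 15.02, 33)
print(round(t_two_measure, 3))
```

The statistic has 65 degrees of freedom (34 + 33 − 2). The simple effects tests in table 4 are not reproduced by this calculation because they use covariate-adjusted means.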

3.2 EXPERIMENT 2
A total of eight measures were included in experiment 2, two in each of
four BSC perspectives. Eight measures fall within Miller's [1956] boundary
condition regarding the capacity of working memory of seven items plus
or minus two. Regardless of how participants consider items from different
perspectives, with a grand total of eight it is unlikely that working mem-
ory would be overloaded, and financial incentives may be able to eliminate
mood congruency bias. The inclusion of only two measures under one per-
spective is found in practice; for example, the BSC examined by Campbell
[2008] is used by a major fast-food retailer, and only two measures are included
under the perspective of "people."
In experiment 2, the dependent variable is performance ratings, the
same as in experiment 1. However, mood is the only independent variable,

8 Consistent with our previous analyses, we further conducted simple effects tests using
participants' happiness with managers' performance and their willingness to promote managers,
respectively, with years of business experience as the covariate; financial incentives were held
constant at the present level. The results remain unchanged. When participants' happiness
with managers' performance was used, the main effect of moods was not significant (p =
.099) for the two-measure condition, but was significant for the BSC condition (p = .037).
Regarding participants' willingness to promote managers, again the main effect of moods was not
significant (p = .473) for the two-measure condition, but was significant for the BSC condition
(p = .011).
9 When years of evaluating others was used as the covariate, the results remain unchanged.
The main effect of moods was not significant for the two-measure condition (p = .426), but
was highly significant for the BSC condition (p = .008).
again at two levels (positive and negative). A financial incentive is provided
and an eight-measure BSC is the only level of information load.

3.2.1. Participants. Thirty-two MBA students participated; 16 were randomly
assigned to each mood condition, positive and negative. They also
had business experience (mean years = 4.2) and experience in evaluating
others (mean years = 2.8).

3.2.2. Procedure. The same procedure and materials were used as in experiment 1
except that only eight measures were presented, two in each of
the same four BSC perspectives, and there was no manipulation of incentives
(incentives were always present in experiment 2). These eight measures
were a subset of the 16 measures used in experiment 1. Consistent
with the design of experiment 1, within each perspective one measure was above target, one
measure was below target, and the percentage by which they exceeded or
fell short of targets was equal. The same group of faculty members that
provided performance scores for the low information load (two financial
measures) condition in experiment 1 gave scores for this version of the in-
strument. They indicated an approximately average performance (mean =
53.0).

3.2.3. Results. The same financial incentive manipulation check questions
were asked in experiment 2. Recall that participants were asked to
indicate their agreement with two statements on a seven-point scale, with
one (seven) meaning strongly disagree (agree). For the statement about
chances of winning the lottery depending on performance evaluation answers,
the mean response in experiment 2 was 5.62. Recall that the mean responses
in experiment 1 were 1.68 for the nonincentive condition and 4.91
for the incentive condition, respectively. For the statement about having a
financial incentive, the mean response was 5.19 (3.11 and 4.76 for the non-
incentive and incentive conditions in experiment 1, respectively). Two one-
way ANOVAs (untabulated) were conducted for the two experiments, one
for each manipulation check question, and both models were significant
(F = 57.111 and 10.288, respectively, and both p-values < .001). Post-hoc
multiple comparisons were also conducted; for chances of winning the lottery
depending on performance, the mean response score for the no-financial-incentive
condition in experiment 1 is significantly lower than that in the
incentive condition in experiment 1 and in experiment 2 (Bonferroni and
Scheffe tests were consistent, p-values < .001). The difference between the
incentive condition of experiment 1 and experiment 2 (financial incentives
were in place in both) is not significant (p = .139). Regarding individuals'
incentives, the mean response score in the nonincentive condition in experiment 1
is significantly lower than that in the incentive condition in
experiment 1 (p = .001) and in experiment 2 (p < .001). No difference is
found between the incentive condition of experiment 1 and experiment 2
(p = .633).
TABLE 5
Results of One-Way ANCOVA of Performance Evaluations: Experiment 2

                                       df   SS          MS        F      P
Business working experience as a covariate
  Moods                                 1       .070      .070    .000   .987
  Business working experience           1     58.267    58.267    .220   .642
  Error                                29  7,667.671   264.402
Experience of evaluating others as a covariate
  Moods                                 1      1.191     1.191    .004   .947
  Experience of evaluating others       1     21.631    21.631    .081   .777
  Error                                29  7,704.307   265.666

Evaluation scores in experiment 2 were almost identical in the positive
mood condition (mean = 56.50, std. dev. = 16.95) and negative condition
(mean = 56.44, std. dev. = 15.09). As presented in table 5, the main effect
of moods on performance evaluations was not significant in one-way AN-
COVA using either business working experience or experience of evaluat-
ing others as a covariate (main effect F < .005 in both models). The results
of experiment 2 indicate that, in the case of a four-perspective scorecard
with eight measures in total, financial incentives were able to eliminate the
mood congruency bias. Therefore, an eight-measure BSC seems to reflect
an appropriate level of information load that enables financial incentives
to correct for mood congruency bias.
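
The F statistics in table 5 follow from the mean squares in the same table, since each F is the effect mean square divided by the error mean square. The short sketch below recomputes them from the tabulated values; the dictionary layout and the helper name `f_ratio` are illustrative assumptions.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of the effect mean square to the error mean square."""
    return ms_effect / ms_error


# Mean squares copied from table 5 (experiment 2), one entry per covariate model.
TABLE_5 = {
    "business working experience": {"moods": 0.070, "covariate": 58.267, "error": 264.402},
    "experience of evaluating others": {"moods": 1.191, "covariate": 21.631, "error": 265.666},
}

f_moods = {
    model: f_ratio(ms["moods"], ms["error"]) for model, ms in TABLE_5.items()
}
f_covariate = {
    model: f_ratio(ms["covariate"], ms["error"]) for model, ms in TABLE_5.items()
}

for model in TABLE_5:
    print(model, round(f_moods[model], 4), round(f_covariate[model], 3))
```

Both mood F ratios come out below .005, consistent with the statement in the text, and the covariate ratios round to the .220 and .081 shown in the table.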
Given that participants in experiment 2 are less experienced than the
participants in experiment 1, it is necessary to rule out the possibility that
it is the difference in experience, rather than information load, that led
to our findings in experiment 2. We combined data from experiment 1
(high information load condition with incentives) and experiment 2 (intermediate
information load condition) to run a 2 × 2 ANCOVA. The first
between-subject factor is information load, and the second is moods; evalu-
ation scores for the BSC-based division are used as the dependent variable.
The ANCOVA was run twice, once each with business experience and ex-
perience of evaluating others as a covariate. The interaction of information
load and moods was significant at the 5% level with both covariates, indi-
cating that results reported in table 5 cannot be attributed to differences in
experience. Business experience and experience of evaluating others were
not significant covariates.
Although experiment 2 was conducted after experiment 1 (incentive
condition) with a different sample of MBA students, these combined results
suggest that the rule of seven measures plus or minus two (Miller [1956])
could be an important complexity factor. However, given that participants
were not randomly assigned between experiment 1 (incentive condition)
and experiment 2, a more conservative interpretation is that it remains for
future research to establish whether incentives can eliminate the bias when
four-perspective scorecards having more than eight total measures are
employed.

4. Conclusions
Field and experimental accounting studies on the application of the BSC
to performance evaluation provide evidence regarding its limitations, in-
cluding: the common-measure bias (Lipe and Salterio [2000]), the group-
ing effect (Lipe and Salterio [2002]), the overreliance on financial mea-
sures (Ittner, Larcker, and Meyer [2003]), and the conditional use of
strategy-linked measures (Banker, Chang, and Pizzini [2004]). This stream
of research indicates that effects of the BSC's complex design on individuals'
cognitive effort should be considered, especially in the performance
evaluation context. Our study further examines this context by directing
attention to another well-documented effect, the mood congruency
bias.
In experiment 1, we establish that mood congruent judgments are a re-
liable phenomenon in an application of the BSC; this mood-congruency
effect had been found in other decision-making research (Forgas [1995],
Chung, Cohen, and Monroe [2008]) as well. We then explore whether a
conventional element of MCS, monetary incentives, enables individuals to
correct for mood congruency bias, and find that evaluation judgments us-
ing a 16-measure BSC are susceptible to affective influences even in the
presence of incentives to make benchmark-consistent judgments. In ex-
periment 2, we show that financial incentives successfully eliminate mood
congruency bias when a simplified BSC with only eight measures, but re-
taining four perspectives, is employed. This result suggests that the critical
aspect of information load is the upper limit of individuals' processing capacity
(Miller [1956]). Our finding in experiment 1 that mood congruency
bias occurs even when only two financial measures are used for evaluation
purposes also provides supporting evidence that information load is an im-
portant issue to consider in the context of performance evaluations, as the
evaluation task itself is complex and cognitively difficult.
The results are also consistent with the existence of a boundary con-
dition between autonomic responses in regions of the brain not subject
to conscious control (described in Critchley [2005], Kerfoot, Chattillion,
and Williams [2008]) and conscious judgment in decision-making when
anticipating gain. It may be that incentive response and mood correction
were autonomic processes in the two- and eight-measure conditions, but at
the level of complexity with 16 measures, correction cannot be autonomic.
This is speculation now, but future judgment and decision-making research
aided by advances in neural science may explore the boundaries between
autonomic and deliberate bias correction.
Future avenues may be explored to solve the information load problem
associated with BSC applications. A limitation of this study is that it does not
directly address judgment accuracy. Incentives to make judgments consis-
tent with organizational benchmarks were offered, as in Roch [2005], but
consistency is not equivalent to accuracy, which is difficult to determine
in performance evaluation contexts. Evaluation accuracy is sometimes
judged in practice by correlating assessments with future performance of
evaluatees (DeNisi and Pritchard [2006]), but we could not replicate
this extended time frame in our experiment. Like Salvemini, Reilly, and
Smither [1993] and Roch [2005], we employed expert ratings in our
experiments as a benchmark, but we acknowledge that our panel of fac-
ulty may not be experts at this task and it could be difficult to know the
benchmark for performance evaluations in practice. Finally, we acknowl-
edge that using financial incentives in performance evaluations could be a
challenge.
The experimental task excluded many performance evaluation duties,
such as interactions with evaluation committees and those being evaluated.
To the extent that feedback and complaints by the latter may deter evalua-
tors from judging performance with bias, and that the performance review
may involve interactions among members of the evaluation committee, the
role of moods in influencing performance evaluations could be exagger-
ated. To the best of our knowledge, the role of group decision processes in
correcting mood congruency biases has not been studied.
Beyond the topic of mood congruency bias in performance evaluation
applications of the BSC, we make an original theoretical contribution to af-
fect correction research by inserting the concept of MCS. We propose that
one component of MCS, financial incentives, may provide enough moti-
vation for decision-makers to overcome mood congruency bias without re-
quiring them to acknowledge their moods, when information load is not
too high. This represents a significant departure from extant research, in-
cluding studies of affect in accounting and auditing.
Financial incentives are just one component of MCS; other components,
including budgets and organizational hierarchies, might also correct bias
without requiring decision-makers to acknowledge their moods. Budgets
and hierarchies impose accountability on decision-makers, and account-
ability has been shown to affect social judgment (Tetlock and Kim [1987]),
although its ability to attenuate the effects of emotions is complex and
equivocal (Lerner and Gonzalez [2005]). However, accountability has cor-
rected judgment biases involving accounting information (Dezoort, Harri-
son, and Taylor [2006], Libby, Salterio, and Webb [2004], Webb [2002]).
Financial incentives are one of the most frequently used and effective
management control techniques (Bonner and Sprinkle [2002], Sprinkle
[2003]), but it may be that other organizational controls correct mood con-
gruency biases more effectively and efficiently.

APPENDIX
The Balanced Scorecard for the TeenWear Division a

Measure                                                  Target   Actual
Financial
  1. Return on sales                                      15%      16.5%
  2. Sales growth                                         10%      9%
  3. New store sales                                      25%      27%
  4. Market share relative to retail space                $80      $73.6
Customer-related
  1. Repeat sales                                         30%      29.37%
  2. Customer satisfaction rating (1–100)                 95       97
  3. Mystery shopper program rating (1–100)               96       93
  4. Returns by customer as % of sales                    10%      9.69%
Internal business processes
  1. Average major brand names/store                      34       35
  2. Returns to suppliers                                 5%       4.8%
  3. Sales from new market leaders                        25%      24.26%
  4. Average markdowns (average % markdown                10%      10.4%
     from original retail price)
Learning and growth
  1. Hours of employee training/employee                  20       17.5
  2. Average tenure of sales personnel (in months)        24       22
  3. Employee suggestions/employee                        8        9
  4. Store computerizing                                  90%      97.5%

a The BSC is adapted from Lipe and Salterio [2000].

Measures in italics were on the eight-measure BSC used in experiment 2.

An increase in these measures represents deterioration in performance.
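
The balanced design of the scorecard can be checked directly from the targets and actuals in the appendix. The sketch below computes each measure's percentage deviation from target and flips the sign for measures where an increase represents deterioration. Which measures fall in that group is an assumption here (returns by customer, returns to suppliers, and average markdowns), because the footnote markers did not survive in this copy; the names `SCORECARD`, `LOWER_IS_BETTER`, and `favorable_deviation` are likewise illustrative.

```python
# (measure, target, actual) per category, copied from the appendix.
SCORECARD = {
    "Financial": [
        ("Return on sales", 15, 16.5),
        ("Sales growth", 10, 9),
        ("New store sales", 25, 27),
        ("Market share relative to retail space", 80, 73.6),
    ],
    "Customer-related": [
        ("Repeat sales", 30, 29.37),
        ("Customer satisfaction rating", 95, 97),
        ("Mystery shopper program rating", 96, 93),
        ("Returns by customer as % of sales", 10, 9.69),
    ],
    "Internal business processes": [
        ("Average major brand names/store", 34, 35),
        ("Returns to suppliers", 5, 4.8),
        ("Sales from new market leaders", 25, 24.26),
        ("Average markdowns", 10, 10.4),
    ],
    "Learning and growth": [
        ("Hours of employee training/employee", 20, 17.5),
        ("Average tenure of sales personnel", 24, 22),
        ("Employee suggestions/employee", 8, 9),
        ("Store computerizing", 90, 97.5),
    ],
}

# ASSUMPTION: measures for which an increase is a deterioration.
LOWER_IS_BETTER = {
    "Returns by customer as % of sales",
    "Returns to suppliers",
    "Average markdowns",
}


def favorable_deviation(name, target, actual):
    """Percentage deviation from target, signed so that positive is favorable."""
    pct = (actual - target) / target * 100
    return -pct if name in LOWER_IS_BETTER else pct


summary = {}
for category, measures in SCORECARD.items():
    devs = [favorable_deviation(*m) for m in measures]
    summary[category] = {
        "favorable": sum(d > 0 for d in devs),
        "unfavorable": sum(d < 0 for d in devs),
        "net": sum(devs),
    }
    print(category, summary[category])
```

Under this assumption every category shows two favorable and two unfavorable measures with a net deviation close to zero, matching the description of offsetting percentages in section 3.1.3.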

REFERENCES
AWASTHI, V., AND J. PRATT. The Effects of Monetary Incentives on Effort and Decision Perfor-
mance: The Role of Cognitive Characteristics. The Accounting Review 65 (1990): 797811.
BANKER, R.; H. CHANG; AND M. PIZZINI. The Balanced Scorecard: Judgmental Effects of Per-
formance Measures Linked to Strategy. The Accounting Review 79 (2004): 123.
BHATTACHARJEE, S., AND K. MORENO. Debiasing the Impact of Positive and Negative Prior Im-
pressions on Managers Performance Evaluation Judgments. Working paper, Virginia Poly-
technic Institute and State University, and Northeastern University, 2005. Website, SSRN:
http://ssrn.com/abstract=491683.
BOL, J. Subjectivity in Compensation Contracting. Journal of Accounting Literature 27 (2008):
124.
BONEAU, C. The Effects of Violations of Assumptions Underlying the T Test. Psychological
Bulletin 57 (1960): 4964.
BONEAU, C. A Note on Measurement Scales and Statistical Tests. American Psychologists 16
(1961): 26061.
BONNER, S.; R. HASTIE; G. SPRINKLE; AND M. YOUNG. A Review of the Effects of Financial
Incentives on Performance in Laboratory Tasks: Implications for Management Accounting.
Journal of Management Accounting Research 12 (2000): 19–64.
BONNER, S., AND G. SPRINKLE. The Effects of Monetary Incentives on Effort and Task Performance:
Theories, Evidence, and a Framework for Research. Accounting, Organizations and
Society 27 (2002): 303–45.
CAMPBELL, D. Nonfinancial Performance Measures and Promotion-Based Incentives. Journal
of Accounting Research 46 (2008): 297–332.
CHUNG, J.; J. COHEN; AND G. MONROE. The Effect of Moods on Auditors' Inventory Valuation
Decisions. Auditing: A Journal of Practice and Theory 27 (2008): 137–59.
CLORE, G.; N. SCHWARZ; AND M. CONWAY. Affective Causes and Consequences of Social Information
Processing, in Handbook of Social Cognition, Second Edition, edited by R. Wyer and
T. Srull. Hillsdale, NJ: Lawrence Erlbaum, 1994: 323–417.
CREMER, D., AND A. HIEL. Effects of Another Person's Fair Treatment on One's Own Emotions
and Behaviors: The Moderating Role of How Much the Other Cares for You. Organizational
Behavior and Human Decision Processes 100 (2006): 231–49.
CRITCHLEY, H. D. Neural Mechanisms of Autonomic, Affective, and Cognitive Integration.
The Journal of Comparative Neurology 493 (2005): 154–66.
DENISI, A. S., AND R. D. PRITCHARD. Performance Appraisal, Performance Management and
Improving Individual Performance: A Motivational Framework. Management and Organization
Review 2 (2006): 253–77.
DESTENO, D.; R. PETTY; D. WEGENER; AND D. RUCKER. Beyond Valence in the Perception of
Likelihood: The Role of Emotion Specificity. Journal of Personality and Social Psychology 78
(2000): 397–416.
DETWEILER-BEDELL, J. B., AND P. SALOVEY. Striving for Happiness or Fleeing from Sadness?
Motivating Mood Repair Using Differentially Framed Messages. Journal of Social and Clinical
Psychology 22 (2003): 627–64.
DEZOORT, T.; P. HARRISON; AND M. TAYLOR. Accountability and Auditors' Materiality Judgments:
The Effects of Differential Pressure Strength on Conservatism, Variability, and Effort.
Accounting, Organizations and Society 31 (2006): 373–90.
DRAKE, A.; S. HAKA; AND S. RAVENSCROFT. Cost System and Incentive Structure Effects on
Innovation, Efficiency and Profitability in Teams. The Accounting Review 74 (1999): 323–45.
EDMANS, A.; D. GARCIA; AND Ø. NORLI. Sports Sentiment and Stock Returns. Journal of Finance
62 (2007): 1967–98.
ERBER, R.; D. WEGNER; AND N. THERRIAULT. On Being Cool and Collected: Mood Regulation
in Anticipation of Social Interaction. Journal of Personality and Social Psychology 70 (1996):
757–66.
FIEDLER, K. Affective Influences on Social Information Processing, in Handbook of Affect and
Social Cognition, edited by J. P. Forgas. Mahwah, NJ: Lawrence Erlbaum, 2001: 165–86.
FORGAS, J. P. Affect in Social Judgments and Decisions: A Multi-Process Model, in Advances in
Experimental Social Psychology 25, edited by M. Zanna. San Diego, CA: Academic Press, 1992:
227–75.
FORGAS, J. P. Mood and Judgment: The Affect Infusion Model (AIM). Psychological Bulletin
117 (1995): 39–66.
FORGAS, J. P. Introduction: Affect and Social Cognition, in Handbook of Affect and Social Cognition,
edited by J. P. Forgas. Mahwah, NJ: Lawrence Erlbaum, 2001a: 1–24.
FORGAS, J. P. The Affect Infusion Model (AIM): An Integrative Theory of Mood Effects on
Cognition and Judgments, in Theories of Mood and Cognition: A User's Handbook, edited by L.
Martin and G. Clore. NJ: Lawrence Erlbaum, 2001b: 99–134.
FORGAS, J. P., AND J. GEORGE. Affective Influences on Judgments and Behavior in Organizations:
An Information Processing Perspective. Organizational Behavior and Human Decision
Processes 86 (2001): 3–34.
GIBBS, M.; K. MERCHANT; W. VAN DER STEDE; AND M. VARGUS. Determinants and Effects of
Subjectivity in Incentives. The Accounting Review 79 (2004): 409–36.
GOHM, C. Mood Regulation and Emotional Intelligence: Individual Differences. Journal of
Personality and Social Psychology 84 (2003): 594–607.
GRENSING-POPHAL, L. Motivate Managers to Review Performance. HRMagazine 46 (2001):
44–48.
GRENSING-POPHAL, L. Personal communication by email. 2010.
HIRSHLEIFER, D., AND T. SHUMWAY. Good Day Sunshine: Stock Returns and the Weather.
Journal of Finance 58 (2003): 1009–32.
ITTNER, C., AND D. LARCKER. Innovations in Performance Measurement: Trends and Research
Implications. Journal of Management Accounting Research 10 (1998): 205–38.
ITTNER, C.; D. LARCKER; AND M. MEYER. Subjectivity and the Weighting of Performance Measures:
Evidence from a Balanced Scorecard. The Accounting Review 78 (2003): 725–58.
KADOUS, K. Improving Jurors' Evaluations of Auditors in Negligence Cases. Contemporary
Accounting Research 18 (2001): 425–44.
KAMSTRA, M. J.; L. A. KRAMER; AND M. D. LEVI. Winter Blues: A SAD Stock Market Cycle.
American Economic Review 93 (2003): 324–43.
KAPLAN, R. S., AND D. P. NORTON. Using the Balanced Scorecard as a Strategic Management
System. Harvard Business Review January–February (1996a): 75–85.
KAPLAN, R. S., AND D. P. NORTON. The Balanced Scorecard: Translating Strategy into Action. Boston,
MA: Harvard Business School Press, 1996b.
KAPLAN, R. S., AND D. P. NORTON. Transforming the Balanced Scorecard from Performance
Measurement to Strategic Management: Part I. Accounting Horizons 15 (2001a): 87–104.
KAPLAN, R. S., AND D. P. NORTON. Transforming the Balanced Scorecard from Performance
Measurement to Strategic Management: Part II. Accounting Horizons 15 (2001b): 147–60.
KAPLAN, S.; M. PETERSEN; AND J. SAMUELS. Effects of Subordinate Likeability and Balanced
Scorecard Format on Performance-Related Judgments. Advances in Accounting 23 (2007):
85–111.
KAPLANSKI, G., AND H. LEVY. Sentiment and Stock Prices: The Case of Aviation Disasters.
Journal of Financial Economics 95 (2010): 174–201.
KERFOOT, E. C.; E. A. CHATTILLION; AND C. L. WILLIAMS. Functional Interactions Between the
Nucleus Tractus Solitarius (NTS) and Nucleus Accumbens Shell in Modulating Memory for
Arousing Experiences. Neurobiology of Learning and Memory 89 (2008): 47–60.
KIDA, T. E.; K. MORENO; AND J. SMITH. The Influence of Affect on Managers' Capital-Budgeting
Decisions. Contemporary Accounting Research 18 (2001): 477–94.
LERNER, J., AND R. GONZALEZ. Forecasting One's Future Based on Fleeting Subjective Experiences.
Personality and Social Psychology Bulletin 31 (2005): 454–66.
LIBBY, T.; S. SALTERIO; AND A. WEBB. The Balanced Scorecard: The Effects of Assurance and
Process Accountability on Managerial Judgment. The Accounting Review 79 (2004): 1075–94.
LIPE, M. G., AND S. SALTERIO. The Balanced Scorecard: Judgmental Effects of Common and
Unique Performance Measures. The Accounting Review 75 (2000): 283–98.
LIPE, M. G., AND S. SALTERIO. A Note on the Judgmental Effects of the Balanced Scorecard's
Information Organization. Accounting, Organizations and Society 27 (2002): 531–40.
MALINA, M., AND F. SELTO. Communicating and Controlling Strategy: An Empirical Study of
the Effectiveness of the Balanced Scorecard. Journal of Management Accounting Research 13
(2001): 47–90.
MCFARLAND, C., AND R. BUEHLER. The Impact of Negative Affect on Autobiographical Memory:
The Role of Self-Focused Attention to Moods. Journal of Personality and Social Psychology
75 (1998): 1424–40.
MCFARLAND, C.; K. WHITE; AND S. NEWTH. Mood Acknowledgment and Correction for the
Mood-Congruency Bias in Social Judgment. Journal of Experimental Social Psychology 39
(2003): 483–91.
MILLER, G. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for
Processing Information. The Psychological Review 63 (1956): 81–97.
MORENO, K.; T. KIDA; AND J. SMITH. The Impact of Affective Reactions on Risky Decision
Making in Accounting Contexts. Journal of Accounting Research 40 (2002): 1331–49.
MURPHY, K., AND J. CLEVELAND. Performance Appraisal: Social, Organizational, and Goal-Based
Perspectives. Thousand Oaks, CA: Sage, 1995.
RICKMAN, N., AND R. WITT. Favouritism and Financial Incentives: A Natural Experiment.
Economica 75 (2008): 296–309.
RIGBY, D., AND B. BILODEAU. Management Tools and Trends 2009. Bain & Company survey,
2009. Website, http://www.bain.com/bainweb/PDFs/cms/Public/Management Tools
2009.pdf.
ROCH, S. G. An Investigation of Motivational Factors Influencing Performance Ratings: Rating
Audience and Incentive. Journal of Managerial Psychology 20 (2005): 695–711.
RUDER, M., AND H. BLESS. Mood and the Reliance on the Ease of Retrieval Heuristic. Journal
of Personality and Social Psychology 85 (2003): 20–32.
RUSTING, C., AND T. DEHART. Retrieving Positive Memories to Regulate Negative Mood:
Consequences for Mood Congruent Memory. Journal of Personality and Social Psychology 78
(2000): 737–52.
SALVEMINI, N. The Effects of Rate Rewards and Prior Ratee Performance Upon Rating Accuracy:
An Investigation of the Motivational Component in Performance Appraisal. Unpublished
dissertation, Stevens Institute of Technology, 1988.
SALVEMINI, N.; R. REILLY; AND J. SMITHER. The Influence of Rater Motivation on Assimilation
Effects and Accuracy in Performance Ratings. Organizational Behavior and Human Decision
Processes 55 (1993): 41–60.
SAUNDERS, E. M. Stock Prices and Wall Street Weather. American Economic Review 83 (1993):
1337–45.
SAWERS, K. Evidence of Choice Avoidance in Capital-Investment Judgments. Contemporary
Accounting Research 22 (2005): 1063–92.
SCHWARZ, N., AND G. CLORE. Mood, Misattribution, and Judgments of Well-Being: Informative
and Directive Functions of Affective States. Journal of Personality and Social Psychology 45
(1983): 513–23.
SCHWARZ, N., AND G. CLORE. How Do I Feel About It? The Informative Function of Affective
States, in Affect, Cognition, and Social Behavior, edited by K. Fiedler and J. P. Forgas.
Göttingen, Germany: Hogrefe, 1988: 44–62.
SCHWARZ, N., AND G. CLORE. Mood as Information: 20 Years Later. Psychological Inquiry 14
(2003): 296–303.
SHOWERS, C., AND K. KLING. Organization of Self-Knowledge: Implications for Recovery from
Sad Moods. Journal of Personality and Social Psychology 70 (1996): 578–90.
SMITH, S., AND R. PETTY. Personality Moderators of Mood Congruency Effects on Cognition:
The Role of Self-Esteem and Negative Mood Regulation. Journal of Personality and Social
Psychology 68 (1995): 1092–107.
SPRINKLE, G. Perspectives on Experimental Research in Managerial Accounting. Accounting,
Organizations and Society 28 (2003): 287–318.
ST-ONGE, S.; D. MORIN; M. BELLEHUMEUR; AND F. DUPUIS. Managers' Motivation to Evaluate
Subordinate Performance. Qualitative Research in Organizations and Management 4 (2009):
273–93.
TAMIR, M., AND M. ROBINSON. Knowing Good from Bad: The Paradox of Neuroticism, Negative
Affect, and Evaluation Processing. Journal of Personality and Social Psychology 87 (2004):
913–25.
TETLOCK, P. E., AND J. KIM. Accountability and Judgment Processes in a Personality Prediction
Task. Journal of Personality and Social Psychology 52 (1987): 700–09.
TICE, D., AND E. BRATSLAVSKY. Giving in to Feel Good: The Place of Emotion Regulation in
the Context of General Self-Control. Psychological Inquiry 11 (2000): 149–59.
WEBB, A. The Impact of Reputation and Variance Investigations on the Creation of Budget
Slack. Accounting, Organizations and Society 27 (2002): 361–78.
WEGENER, D., AND R. PETTY. Flexible Correction Processes in Social Judgment: The Role of
Naive Theories in Corrections for Perceived Bias. Journal of Personality and Social Psychology
68 (1995): 36–51.
WILSON, T., AND N. BREKKE. Mental Contamination and Mental Correction: Unwanted Influences
on Judgments and Evaluations. Psychological Bulletin 116 (1994): 117–42.
WILSON, T.; D. CENTERBAR; AND N. BREKKE. Mental Contamination and the Debiasing Problem,
in Heuristics and Biases: The Psychology of Intuitive Judgment, edited by T. Gilovich, D.
Griffin, and D. Kahneman. Cambridge: Cambridge University Press, 2002: 185–200.
