5/19/2010
CONTENTS
Introduction
Heuristic Evaluations
Background
Hypotheses
Method
Participants
Equipment
Metrics
Procedure
References
INTRODUCTION
Proficiency in conducting heuristic evaluations does not come easily; it is an acquired skill that
takes years to master. It is also difficult to convey an effective evaluation strategy verbally:
while explaining an evaluation, people can describe where they focused their attention, but this
is hard to communicate precisely. Through an eye tracking study, the relationship between an
expert’s gaze while performing a task and a novice’s ability to perform a better heuristic
evaluation will be explored. While processing complex stimuli, novices concentrate on basic but
irrelevant parts of a task, whereas experts process stimuli more quickly while focusing on
relevant aspects (Jarodzka et al., 2009). Finding a way to convey this difference to novices would
make their approach quicker and more efficient. It has already been shown in several domains that
watching an expert’s gaze helps novices perform certain tasks (Stein & Brennan, 2004; Jarodzka
et al., 2009). Through this study, it will be shown that this method of knowledge transfer can be
extended to the heuristic evaluation process.
Eye tracking reveals where people direct their visual attention while performing a task. Many
usability studies do not use eye tracking because they assume the required results can be
obtained through standard methods. However, participants are often not aware of what they have
seen even when their eyes fixate on an element on screen (Mack & Rock, 2000), and participants
are biased in reporting their own behavior for several reasons. A study conducted by Schiessl et
al. (2003) shows a disconnect between self-reported data (what participants say) and eye tracking
data (what they actually attend to), which skews the data collected in usability studies. Eye
tracking helps determine where a person looks, which in turn helps us understand which parts of
an interface receive the most visual attention. This is important for the present study because a
gaze replay of experts will be used to convey visual attention to novices, helping them better
understand the experts’ intentions.
HEURISTIC EVALUATIONS
Heuristic evaluation is one of the most widely used Usability Evaluation Methods (UEMs) (Law &
Hvannberg, 2004). A typical heuristic evaluation is conducted by one or more experts and is based
on Nielsen’s ten recommended heuristics (Nielsen & Molich, 1990). The time taken to conduct an
evaluation depends on the complexity of the interface; it typically takes an hour or two. There
are no step-by-step instructions for performing a heuristic evaluation. An evaluator usually scans
the interface a few times and develops a list of notable design features based on the ten
heuristics. The outcome of an evaluation is usually a list of design characteristics, including
good design elements and usability flaws, categorized by the respective heuristic. Because a
single evaluator is unlikely to identify all of the positive and negative elements in an
interface, evaluations by more than one evaluator are usually combined. Although heuristic
evaluations are cheap and easy to conduct, they are open-ended and can produce unreliable results
(Nielsen & Molich, 1990; Chattratichart & Lindgaard, 2008). When even experienced usability
professionals can miss issues, it is easy for novices to miss major problems in an evaluation.
Learning from an expert would extend a novice’s expertise in heuristic evaluations. Experts also
use knowledge-based shortcuts while conducting tasks; passing these shortcuts on to novices would
further facilitate quick and effective evaluation.
BACKGROUND
EXPERT-NOVICE RESEARCH
A study conducted by Stein and Brennan (2004) showed that experts’ eye gaze, used as an input, can
help novice programmers debug code better. The study was conducted in two phases. In phase 1,
four experts debugged three Java programs while using a retrospective think-aloud protocol; they
found a total of eight bugs, and each expert was given up to ten minutes to find them. Eight of
the 32 recorded videos, each between 30 and 150 seconds long, were chosen for use in phase 2
based on factors such as success, accuracy, length, typicality, and insightfulness. Phase 2 had
six programmers divided into two groups of three, and each programmer was asked to find the eight
bugs in the code. The first group was told they would watch videos of experts finding some, but
not all, of the bugs, and they were allowed to view the videos as many times as they wanted. The
other group was not shown any videos at all.
The findings of Stein and Brennan (2004) suggest that it may be confusing for a novice to watch an
expert’s eye gaze when it follows a complex path. Also, if an expert switches between two sections
of the code while debugging, there may be a connection between those two parts. The study also
suggests that novices may be able to identify the section where a bug occurs, but not the bug
itself; this is advantageous because the expert is not directly pointing out the broken code. The
authors recommend adding an expert’s voice, captured through a retrospective think-aloud protocol,
to help novices beyond simply watching the expert’s gaze replay. By looking at novices’ eye gaze,
we can better determine their state of (mis)understanding.
Another study, conducted by Jarodzka et al. (2009), showed that displaying experts’ eye gaze to
novices helped them complete complex tasks with rich visual components. While processing complex
stimuli, novices tend to attend to “saliency rather than to those aspects that are relevant for
task performance” (Lowe, 1999; cited in Jarodzka et al., 2009, p. 2920), whereas experts focus on
the relevant aspects. This may be attributed to the length and quality of the experts’ experience.
It was also noted that experts tend to use knowledge-based shortcuts. The study had to address a
design issue: should the averaged gaze of several experts be displayed rather than the gaze of a
single expert? The authors chose to display one expert’s gaze instead of an average of several
experts. Results showed that the more closely a participant’s eye movements matched those of the
expert, the better the resulting task performance.
The study was conducted with 51 students, five of whom were excluded due to poor eye tracking
data. Of the remaining 46 participants, 32 were female and 14 were male. The participants were
divided into two groups: the control group (24 participants) and the gaze display group (22
participants). All participants were shown four videos and were evaluated based on a free
description of the locomotive pattern of the fish shown in the videos. A fixation cross was
displayed for two seconds before each video began, and the participants’ eye movements were
recorded while they watched. The control group watched unaltered videos of the locomotion
pattern, whereas the gaze display group watched the same videos with an expert’s gaze replay
superimposed.
The study uncovered two issues in modeling an expert’s behavior. First, the task needed to be
analyzed in detail, which was accomplished by comparing experts and novices performing the task;
for tasks requiring additional perceptual skills, eye tracking proved to be the method of choice.
Second, modeling the task according to that analysis was itself an issue. A common problem is
that experts perform tasks using shortcuts, and modeling these shortcuts for novices who lack the
underlying understanding is useless; the experts were therefore instructed to behave in a
“didactical” manner. Although shortcuts are not desirable at the novice level, they are extremely
effective at higher levels of expertise. The study suggests that future research could include
showing participants the gaze displays of several experts, since learning from multiple
approaches to solving a problem has already been shown to be beneficial (Atkinson et al., 2000;
cited in Jarodzka et al., 2009).
HEURISTIC EVALUATION
A paper by Law and Hvannberg (2004) studied strategies for improving the effectiveness of
heuristic evaluation, including the choice of heuristics and evaluator training. The authors
compared Nielsen’s ten usability heuristics to Gerhardt-Powals’ cognitive engineering principles
(Gerhardt-Powals, 1996; cited in Law & Hvannberg, 2004) (see Table 1). According to the findings
of this study, Nielsen’s heuristics are more effective at uncovering usability problems.
Nielsen’s heuristics are also relatively easy to comprehend because, unlike Gerhardt-Powals’
principles, they are not phrased in difficult technical terms. In addition, Nielsen’s heuristics
are well known to both novice and expert usability practitioners and will therefore be the
usability evaluation method of choice for this study.
The think-aloud method is a common usability method used to report what a participant is thinking
while performing a task. The two most commonly used think-aloud methods are Concurrent Think
Aloud (CTA) and Retrospective Think Aloud (RTA), and both have drawbacks. According to a study by
Van Gog et al. (2005), RTA yields less information than CTA because RTA only references the
actions the participant actually performed during the task; in addition, participants forget
their actions and may fabricate information while reporting retrospectively. According to another
study, by Guan et al. (2006), CTA may have detrimental effects such as distraction, inattention,
and changes in approach, because the participant must concentrate on the task while concurrently
reporting his or her thoughts. To overcome these problems, a technique called Cued Retrospective
Reporting (CRR), proposed by Van Gog et al. (2005), will be used. In the CRR method, a participant
first performs the task without any think-aloud reporting. After completing the task, the
participant performs an RTA while watching the gaze replay video of that task; the gaze replay
serves as the cue. This way, the participant generates more information than with uncued RTA. The
participant’s verbalization is then superimposed over the gaze replay video to form a composite
video, which will serve as the treatment for this study.
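The proposal does not specify how the composite video will be produced. As a minimal sketch,
assuming the expert’s gaze replay is exported as an MP4 file and the voice-over is recorded
separately as a WAV file (both file names are hypothetical), a tool such as ffmpeg could mux the
narration onto the unchanged video track:

# Illustrative sketch only: the proposal does not name a tool for building the
# composite treatment video. File names are placeholders.
import subprocess

def build_composite(gaze_video: str, voiceover: str, out_path: str) -> None:
    """Combine a gaze replay video and an expert's voice-over into one file."""
    subprocess.run(
        [
            "ffmpeg",
            "-i", gaze_video,   # input 0: gaze replay (video)
            "-i", voiceover,    # input 1: expert narration (audio)
            "-map", "0:v:0",    # take the video stream from input 0
            "-map", "1:a:0",    # take the audio stream from input 1
            "-c:v", "copy",     # keep the gaze replay video unchanged
            "-c:a", "aac",      # encode the narration as AAC
            "-shortest",        # stop at the end of the shorter stream
            out_path,
        ],
        check=True,
    )

build_composite("gaze_replay.mp4", "voiceover.wav", "treatment_video.mp4")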
Heuristic analyses are open-ended and can produce unreliable results (Chattratichart & Lindgaard,
2008). For a novice usability practitioner, it is easy to produce false positives (Bailey et al.,
1992) and unreliable results while uncovering usability problems through a heuristic evaluation.
Expert usability practitioners perform better heuristic evaluations than novices owing to practice
and experience. This study will provide a learning method through which a novice can perform
better heuristic evaluations through the use of an expert’s gaze replay.
AIM OF STUDY
The aim of this study is to verify that novice usability practitioners who have seen a gaze replay
of expert usability practitioners perform better heuristic evaluations than those who have not.
This comparison will be made with respect to the number of usability problems uncovered, the
relevance of each reported problem to a heuristic, and the average fixation duration within an
Area of Interest.
HYPOTHESES
H1: The treatment group will uncover a higher number of usability problems than the control
group.
H2: The treatment group will be more effective in performing a heuristic evaluation.
H3: The average fixation duration within an Area of Interest will be lower for the treatment
group than for the control group.
METHOD
PARTICIPANTS
This research study will have at least 12 HCI students as (novice) participants. Participants are
required to have conducted at least one heuristic evaluation in the past year. Participants will
be randomly assigned to two groups: a Control Group, which will evaluate a website without
watching an expert’s gaze replay, and a Treatment Group, which will evaluate a website after
watching an expert’s gaze replay video. Participants will be recruited from the Rochester
Institute of Technology (RIT).
Definitions:
• Novice usability practitioners are graduate and undergraduate students who have conducted at
least one heuristic evaluation in the past year. These novices will serve as the participants in
phase 2 of the study.
• Expert usability practitioners are experienced in both HCI and usability evaluation: professors
and/or usability engineers who have at least five years of experience in HCI and in conducting
heuristic evaluations.
TEST INSTRUMENTS
• Screener
• Post-Evaluation Questionnaire
EQUIPMENT
The study will use an SMI iView X RED eye tracker with a data rate of 60 Hz or 250 Hz. It is a
remote (non-obtrusive) eye tracker positioned below the test monitor that captures the corneal
reflection at an operating distance of 60-80 cm. The system calibrates automatically and displays
the calibration results; if the calibration is poor, the participant can be recalibrated. The SMI
comes bundled with an application called BeGaze, which provides analysis tools for Areas of
Interest (AOIs), Key Performance Indicators (KPIs), scan paths, attention maps, and more. BeGaze
operates on the data recorded during the experiment.
METRICS
• Success rate: Success will be compared between the control group and the treatment group based
on the number of design features uncovered and the validity of the reported problems.
• Fixation duration within an Area of Interest: The duration of fixations, in seconds, within an
AOI will be recorded and compared between the treatment group and the control group. Shorter
fixation durations within an AOI after watching the expert’s gaze video will be interpreted as
more efficient evaluation.
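As an illustration of how this metric could be computed once fixations are exported from BeGaze,
the sketch below assumes a CSV export with hypothetical columns (participant, x, y, duration_ms)
and a rectangular AOI with placeholder coordinates; the actual export format and AOI boundaries
would come from the study materials.

# Minimal sketch of the AOI fixation-duration metric. Column names and AOI
# coordinates are assumptions for illustration, not the real BeGaze export.
import csv
from collections import defaultdict

AOI = {"left": 100, "top": 200, "right": 700, "bottom": 600}  # pixels (placeholder)

def in_aoi(x: float, y: float, aoi: dict) -> bool:
    """Return True if a fixation at (x, y) falls inside the rectangular AOI."""
    return aoi["left"] <= x <= aoi["right"] and aoi["top"] <= y <= aoi["bottom"]

def aoi_fixation_seconds(path: str) -> dict:
    """Sum fixation durations inside the AOI, per participant, in seconds."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if in_aoi(float(row["x"]), float(row["y"]), AOI):
                totals[row["participant"]] += float(row["duration_ms"]) / 1000.0
    return dict(totals)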
EXPERIMENT DESIGN
This study will use a single-factor between-subjects design with two participant groups and two
websites. The participants are divided into two groups, the Control Group and the Treatment
Group. Participants in the control group will not watch any gaze replay videos; they will perform
a heuristic evaluation on website-B. Participants in the treatment group will watch the treatment
video obtained at the end of the Cued Retrospective Reporting phase. The treatment video is a
consolidated video consisting of an expert’s gaze replay and verbalization for website-A. After
watching the treatment video, treatment-group participants will perform a heuristic evaluation on
website-B. Only HCI students from the RIT population are recruited as participants for the second
phase of the study. Participants in both the control group and the treatment group are tested in
the same eye tracking lab and on the same equipment, to eliminate the effect of environmental
differences in the between-subjects design. Further, both groups will have the same number of
participants with similar backgrounds, and all participants in the treatment group will watch the
same treatment video. Lastly, students are assigned to the two groups at random.
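A minimal sketch of this random assignment, assuming the recruited novices are identified by
placeholder IDs:

# Sketch of the random assignment described above: split the recruited novices
# into two equally sized groups. Participant IDs and the seed are placeholders.
import random

def assign_groups(participants: list, seed: int = 42) -> dict:
    """Randomly split participants into equally sized control and treatment groups."""
    pool = participants[:]
    random.Random(seed).shuffle(pool)
    half = len(pool) // 2
    return {"control": pool[:half], "treatment": pool[half:]}

groups = assign_groups([f"P{i:02d}" for i in range(1, 13)])  # e.g., 12 novices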
PROCEDURE
Participants will be recruited through a website and selected on the basis of the following
criteria:
1. Participants must be HCI students (see the participant description above).
2. Participants must have performed heuristic evaluations (at least one evaluation in the past
year).
The study is carried out in two phases. In phase 1, one or two expert usability practitioners will
perform a heuristic evaluation on both website-A and website-B (see Appendix D). Each expert is
informed that his or her gaze replay will be used as the treatment in phase 2 of the study and is
instructed to perform the evaluation in a didactical manner, so that a novice can understand what
the expert is doing by watching the treatment video. Once the evaluation is complete, the experts
will be asked to perform cued retrospective reporting, i.e., to describe what they looked at based
on their gaze replay. This whole process is expected to take about 30 minutes to complete. The
expert’s voice-over is then added to the gaze video; this video, consisting of the gaze replay and
the expert’s verbalization, is the composite video. This process is repeated for all the experts.
After compiling the gaze replay videos from all the experts, one video will be chosen as the
treatment based on validity (was the heuristic evaluation satisfactory?), clarity (is the gaze
replay clear and easy to follow?), and run-time (the shortest video will be chosen). Videos with
audio or video problems will not be chosen. The final chosen video will serve as the treatment
video in phase 2.
In phase 2, participants in the control group and the treatment group perform a heuristic
evaluation of website-B. Participants are randomly assigned to the two groups. Participants in the
control group will perform the evaluation without watching the treatment video. Participants in
the treatment group will first watch the treatment video of an expert performing a heuristic
evaluation on website-A and will then evaluate website-B. Before the evaluation, all novices are
asked to perform a common task on website-B (see Appendix E) so that they are familiar with the
website. Novices are given a heuristics template to complete. The template will cover three or
five heuristics and include space for the participant’s comments. The heuristic evaluation is
timed and must be completed in less than 10 minutes. Once participants complete the heuristic
evaluation, they hand the heuristics template to the test administrator and are then given a
post-test questionnaire to complete. They will then be thanked and are free to leave. A t-test
will be used to analyze the collected data to verify whether there is a statistically significant
difference between the control and treatment groups.
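As a sketch of the planned analysis, assuming the dependent variable is the number of usability
problems each participant uncovers, an independent-samples t-test could be run as follows (the
scores below are placeholders, not collected data):

# Sketch of the planned analysis: independent-samples t-test comparing the
# number of usability problems found by each group. Scores are placeholders.
from scipy import stats

control_scores = [4, 5, 3, 6, 4, 5]    # problems found by control participants (example)
treatment_scores = [7, 6, 8, 5, 7, 9]  # problems found by treatment participants (example)

result = stats.ttest_ind(treatment_scores, control_scores)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")  # p < .05 would support H1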
REFERENCES
Bailey, R. W., Allan, R. W., & Raiello, P. (1992). Usability testing vs. heuristic evaluation: A
head-to-head comparison. In Proceedings of the Human Factors Society 36th Annual Meeting
(pp. 409-413). Atlanta, GA: Human Factors and Ergonomics Society.
Guan, Z., Lee, S., Cuddihy, E., & Ramey, J. (2006). The validity of the stimulated retrospective
think-aloud method as measured by eye tracking. In Conference on Human Factors in
Computing Systems. ACM.
Jarodzka, H., Scheiter, K., Gerjets, P., van Gog, T., & Dorr, M. (2009). How to Convey Perceptual
Skills by Displaying Experts’ Gaze Data. COGSCI (pp. 2920-2925). Amsterdam.
Lavery, D., Cockton, G., & Atkinson, M. (2010). Heuristic Evaluation: Usability Evaluation
Materials. Retrieved from http://www.dcs.gla.ac.uk/asp/materials/HE_1.0/.
Law, E. L., & Hvannberg, E. T. (2004). Analysis of strategies for improving and estimating the
effectiveness of heuristic evaluation. In Proceedings of the third Nordic conference on
Human-computer interaction. Tampere, Finland: ACM
Mack, A., & Rock, I. (2000). Inattentional Blindness (1st., p. 287). The MIT Press.
Nielsen, J., & Molich, R. (1990). Heuristic evaluation of user interfaces. In Proceedings of the
SIGCHI conference on Human factors in computing systems: Empowering people.
Seattle, Washington, United States.
Schiessl, M., Duda, S., Thölke, A., & Fischer, R. (2003). Eye tracking and its application in
usability and media research. MMI-Interaktiv. Retrieved from http://eye-
square.com/documents/EyeTracking-ResearchApplications.pdf.
Stein, R., & Brennan, S. E. (2004). Another person's eye gaze as a cue in solving programming
problems. In Proceedings of the 6th international conference on Multimodal interfaces.
State College, PA, USA: ACM.
Van Gog, T., Paas, F., van Merriënboer, J. J., & Witte, P. (2005). Uncovering the problem-solving
process: cued retrospective reporting versus concurrent and retrospective reporting.
Journal of experimental psychology. Applied, 11(4), 237-44.
APPENDICES
A. Screener
F. Heuristics template

APPENDIX A: SCREENER
1. Name:
2. Sex:
Male
Female
3. Age:
4. Occupation:
1
2
3
4
5
Greater than 5
9. Which of the following best describes the reason you perform heuristic
evaluations:
As a part of a course
As a part of your job
As a part of a project
For no specific reason
10. Do you use any usability evaluation methods other than Nielsen’s ten heuristics?
(e.g., Gerhardt-Powals’ cognitive principles)
Yes
No
If you answered "YES" to the question above, please specify the usability evaluation
method(s) used:
11. Do you use a screen reader, screen magnifier or other assistive technology to
use the computer and the Web?
Yes
No
Use of experts' gaze by novice usability practitioners to perform a better heuristic evaluation
INTRODUCTION
You are invited to join an eye tracking study that aims to establish a relationship between watching an expert’s gaze replay and a
novice’s ability to perform a better heuristic evaluation. Please take whatever time you need to discuss the study with your family
and friends, or anyone else you wish to. The decision to join, or not to join, is up to you.
RISKS
We do not foresee any risks in this study.
BENEFITS
Previous studies in different domains suggest that novices learn better from experts by watching the experts’ eye gaze while they
perform tasks. We are trying to establish that this also holds for usability practitioners conducting heuristic evaluations. However,
we cannot guarantee that you will personally experience benefits from participating in this study. Others may benefit in the future
from the information we find in this study.
CONFIDENTIALITY
The information in the study records will be kept strictly confidential. Data will be stored securely and will be made available only to
persons conducting the study unless you specifically give permission in writing to do otherwise. No reference will be made in oral or
written reports that could link you to the study. Publications related to this work will not make reference to any individuals.
I have read and understand the above information. I have received a copy of this form. I agree to participate in this study.
1. Visibility of system status
The system should always keep users informed about what is going on, through appropriate
feedback within reasonable time.
2. Match between system and the real world
The system should speak the users' language, with words, phrases and concepts familiar to the
user, rather than system‐oriented terms. Follow real‐world conventions, making information
appear in a natural and logical order.
3. User control and freedom
Users often choose system functions by mistake and will need a clearly marked "emergency
exit" to leave the unwanted state without having to go through an extended dialogue. Support
undo and redo.
4. Consistency and standards
Users should not have to wonder whether different words, situations, or actions mean the same
thing. Follow platform conventions.
5. Error prevention
Even better than good error messages is a careful design which prevents a problem from
occurring in the first place. Either eliminate error‐prone conditions or check for them and
present users with a confirmation option before they commit to the action.
6. Recognition rather than recall
Minimize the user's memory load by making objects, actions, and options visible. The user
should not have to remember information from one part of the dialogue to another.
Instructions for use of the system should be visible or easily retrievable whenever appropriate.
7. Flexibility and efficiency of use
Accelerators ‐‐ unseen by the novice user ‐‐ may often speed up the interaction for the expert
user such that the system can cater to both inexperienced and experienced users. Allow users to
tailor frequent actions.
8. Aesthetic and minimalist design
Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit
of information in a dialogue competes with the relevant units of information and diminishes
their relative visibility.
9. Help users recognize, diagnose, and recover from errors
Error messages should be expressed in plain language (no codes), precisely indicate the
problem, and constructively suggest a solution.
10. Help and documentation
Even though it is better if the system can be used without documentation, it may be necessary
to provide help and documentation. Any such information should be easy to search, focused on
the user's task, list concrete steps to be carried out, and not be too large.
REFERENCES
Nielsen, J. (2005). 10 Heuristics for User Interface Design. Retrieved from
http://www.useit.com/papers/heuristic/heuristic_list.html
APPENDIX D: LIST OF CITY WEBSITES
City Website
Please conduct a heuristic evaluation and fill in information relevant to the following heuristics: