
CHI 2006 · Work-in-Progress April 22-27, 2006 · Montréal, Québec, Canada

Do We Need Eye Trackers to Tell Where People Look?

Sune Alstrup Johansen
IT University of Copenhagen
Rued Langgaards Vej 7,
DK-2300 Copenhagen S, Denmark
sune@itu.dk

John Paulin Hansen
IT University of Copenhagen
Rued Langgaards Vej 7,
DK-2300 Copenhagen S, Denmark
paulin@itu.dk

Abstract
We investigated the validity of two low-cost alternatives to state-of-the-art eye tracking technology: 1) prompting users to report from memory on their own eye movements, and 2) asking experienced web designers to predict the eye movements of a typical user. Users could reliably remember 70 % of the web elements they had actually seen. Web designers could only predict 46 % of the elements typically seen. Users were not particularly good at remembering the order of their fixations. We discuss how to further improve the validity of self-reported gaze patterns and suggest new areas that it may be used in.

Keywords
Eye Tracking, Usability Evaluation, Visual Attention,
Web Design, Cognition, User Centered Design.

ACM Classification Keywords
H.5.2 User Interfaces: Evaluation/methodology, H.1.2
User/Machine Systems: Human information processing

Copyright is held by the author/owner(s).
CHI 2006, April 22–27, 2006, Montréal, Québec, Canada.
ACM 1-59593-298-4/06/0004.
Introduction
Visual perception is an essential part of users’ interaction with interfaces. Modern eye tracking equipment makes it possible to record and analyze parts of this process. Which elements are actually seen? Where do users look first? What do users look at the most? Did modifications of the graphic design lead to a desired change in user gaze patterns?

Eye tracking has been criticized for being costly and tedious [1,9]. Older generations of eye tracking equipment did not deliver sufficient value for usability professionals. Difficulties calibrating the equipment to users with glasses, contact lenses, heavy make-up, or even dark/brown eyes were common. Precision was low, and tiny head movements could jeopardize the validity of the recorded eye tracking data.

State-of-the-art eye tracking equipment has solved most of these problems, and accurate recordings of eye movements can be made without obtrusive head-mountings or unnatural fixations of the head. However, some barriers to deploying eye tracking in usability studies still remain: data analysis can be very time-consuming, even with the use of advanced software tools that come with modern eye trackers. Furthermore, state-of-the-art eye trackers are relatively expensive (especially when purchasing both the hardware and analysis software), with market prices above $20,000 (as of January 2006). Last but not least, eye tracking equipment may reduce the test validity, for instance by noticeably slowing the system response or by requiring users to re-calibrate between tasks.

These barriers may well disappear with improvements of technology. Until this happens, there will be a need for alternatives to eye tracking by costly equipment. Even when tracking technology becomes mature, there could still be room in the usability toolbox for a simple and fast method that everybody could use. In a similar way, paper and pen are often the preferred tools of usability professionals, although they have lots of advanced systems available for data collection and notation. For instance, sticky notes and paper cards have survived over sophisticated digital tools for e.g. “card-sort” analysis of information architectures.

People’s patterns of fixation are known to be highly predictive of what they can remember afterwards [3], and people tend to have a consistent viewing pattern when they re-visit something that they have seen previously, cf. the scan path theory [10]. It is an open issue to what degree people can reliably remember where they have looked. Obviously, they can tell from moment to moment what they look at, and users can explain their own eye movements in detail during retrospections of gaze recordings [4], but can they keep the attended elements in their memory – even for a short while – before reporting on them?

Procedure
In the first experiment, state-of-the-art eye tracking equipment (Tobii 1750 with Clearview 2.5.1 analysis software, cf. Figure 1) was used to track the actual eye movements of 10 users, while they searched for an answer to simple questions on 8 different web pages. For instance, they were asked: “Where would you click on this web page to find the shop nearest to your residence?”

Figure 1: A Tobii eye tracker recording visual attention. Afterwards, the user reproduced gaze patterns from memory.

The user first pressed a button to get the question, and then again to view the webpage and search for an answer. Immediately after answering, the users would report their remembered eye movements by repeating them as accurately as possible by looking at the screen with the original webpage shown again. This made it possible to track remembered eye movements, and later to analyze how well they compared to the original ones (cf. Figure 2).

To analyze the data, each webpage was divided into a number of Areas of Interest (AOIs) on the basis of the gestalt laws (e.g. “law of good continuation”, “law of proximity”, “law of similarity”, “law of closure”, and “law of symmetry”). Each AOI was furthermore given a unique identifier and classified from a list of common web page elements, including banners, contact information, drawings, email address, input fields, logos, mixed content, navigation elements, pictures, search fields, text blocks, and URLs.

Green boxes indicate the defined Areas of Interest (AOIs). Black text indicates the unique identifier given to each AOI for analysis purposes, and a text code according to the nature of the element.

Red circles indicate fixations. The sizes of the circles illustrate time spent looking at a particular point, e.g. bigger circles mean longer time spent. Numbers in circles indicate the order of fixations.

Figure 2: Plot of a user repeating from memory his gaze pattern associated with the question
“Where would you click to find out if this hotel has a pool?”
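
The encoding of a gaze recording as a sequence of AOI identifiers can be illustrated with a small sketch. It is not the Clearview implementation. The class names, the rectangular AOI geometry, the pixel coordinates and the example data are all hypothetical. Applying the 125 ms threshold (taken from the Results section) to each single fixation, and collapsing consecutive fixations inside the same AOI, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    ident: str    # unique one-character identifier (hypothetical scheme)
    kind: str     # element type, e.g. "logo", "navigation", "text"
    x: float      # left edge in screen pixels
    y: float      # top edge in screen pixels
    width: float
    height: float

    def contains(self, px, py):
        """True if the point (px, py) lies inside this rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class Fixation:
    x: float
    y: float
    duration_ms: float


def scan_path(fixations, aois, min_duration_ms=125.0):
    """Return the identifiers of the AOIs visited, in fixation order."""
    path = []
    for fix in fixations:
        if fix.duration_ms <= min_duration_ms:
            continue  # assumption: too short to count as "seen"
        hit = next((a for a in aois if a.contains(fix.x, fix.y)), None)
        if hit is None:
            continue  # fixation fell outside every AOI
        if path and path[-1] == hit.ident:
            continue  # assumption: merge consecutive hits on the same AOI
        path.append(hit.ident)
    return "".join(path)


# Hypothetical page layout and recording.
example_aois = [
    AOI("A", "logo", 0, 0, 200, 100),
    AOI("B", "navigation", 0, 100, 200, 400),
    AOI("C", "text", 200, 100, 600, 400),
]
example_fixations = [
    Fixation(50, 40, 300),    # logo
    Fixation(60, 50, 90),     # below the duration threshold, ignored
    Fixation(100, 200, 250),  # navigation
    Fixation(120, 260, 180),  # navigation again, merged
    Fixation(400, 300, 400),  # text
    Fixation(900, 900, 500),  # outside all AOIs, ignored
]
print(scan_path(example_fixations, example_aois))  # prints "ABC"
```

With one character per AOI, a recorded scan path and a remembered scan path both become plain strings, which is the form the later sequence analyses operate on.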

In the second experiment, 17 web designers (all of them with more than 18 months of experience in web design) were asked to predict the eye movements by marking a “typical user scan path” on print-outs of the 8 web pages that had been used in the first experiment. The designers’ predictions were compared with typical user scan paths constructed by n-gram analysis of the total eye tracking data from the 10 subjects. N-grams are sequences of characters from the ASCII character set, where a bi-gram (N=2) consists of 2 characters, a tri-gram (N=3) consists of 3 characters, etc. Based on the unique identifiers of AOIs, each user scan path could be expressed as a sequence of ASCII characters. “Typical user scan paths” of each web page were constructed from the most frequently occurring bi-grams, matched together in a continuing order to create a complete sequence. The first and last AOI that had been seen by most people would define the beginning and the end of the sequence.

Results
The users could remember 70 % (SD = 17.7 %) of the AOIs they had fixated on for more than 125 ms. The average number of AOIs users remembered they had seen varied across different web pages (from 3.2 to 11.4), as some of the web pages had more elements than others (cf. Figure 3). Also some of the questions were harder to answer than others, thereby initiating a longer visual search of the web page. The number of AOIs that the user could remember having seen depended on how many AOIs they had actually seen, but there were no differences in the relative memory between simple and complex web pages (i.e. web pages with many elements).

Figure 3: For each of the 8 web pages, white bars illustrate the number of AOIs seen, and black bars show AOIs remembered. Numbers at the left side of the bars indicate the total number of AOIs that each web page was divided into.
[Bar chart. Legend: Seen, Remembered. Horizontal axis: 0 to 20. Total AOIs per web page: 7, 14, 17, 20, 22, 24, 28, 28.]

On average, users had 1.9 false memories per web page (SD = 1.4), with a non-significant tendency for web pages with many AOIs to introduce more false memories than simple ones. For instance, on the simplest web page with only 7 pre-defined AOIs, users had only 0.5 false memories, while on the most complex one with 28 pre-defined AOIs, they had an average of almost 3 false memories.

The users were not particularly good at remembering the actual order in which they had looked at AOIs. We measured this by calculating the Levenshtein Distance (LD) between the sequence of AOIs seen [5,8] and the sequence remembered. On average, the LD was 16 (SD = 12.8) and this was highly dependent on the length of the sequence to remember (R² = 0.91, p < 0.05). An LD of 0 would indicate perfect memory, while each basic operation (insertion, deletion, or substitution) needed to get from the correct string of elements to the one actually remembered would increase the LD by one.

We also analyzed if there were any differences in users’ memory of their eye movements for different types of web page elements seen. It turned out that people could only remember having seen a logo in 34 % of the cases they had actually seen it, while they could remember having seen 77 % of the photos they had looked at, 74 % of the navigation elements, and 75 % of the text elements. These differences in memory percentages between logos and the other elements were significant (photos: t(9) = -3.26, p < 0.01; navigation elements: t(9) = -4.74, p < 0.01; text elements: t(9) = -4.97, p < 0.001).
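
The sequence analyses described above (typical scan paths built from bi-grams, the Levenshtein Distance between seen and remembered sequences, and the share of seen AOIs that was remembered) can be sketched as follows. This is an illustrative reading, not the authors' code. The function names are hypothetical. The text does not specify how bi-grams are chained, so the greedy chaining, the rule against revisiting an AOI and the tie-breaking are assumptions. Counting a "false memory" as a reported AOI that was never seen is also an assumption.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def typical_scan_path(paths):
    """Greedy sketch: chain the most frequent bi-grams, starting at the most
    common first AOI and stopping at the most common last AOI."""
    paths = [p for p in paths if p]
    if not paths:
        return ""
    start = Counter(p[0] for p in paths).most_common(1)[0][0]
    end = Counter(p[-1] for p in paths).most_common(1)[0][0]
    bigrams = Counter(p[i:i + 2] for p in paths for i in range(len(p) - 1))
    path = [start]
    while path[-1] != end:
        # Assumption: only continue to AOIs not already on the path.
        candidates = [(count, bg) for bg, count in bigrams.items()
                      if bg[0] == path[-1] and bg[1] not in path]
        if not candidates:
            break
        _, best = max(candidates)  # ties go to the alphabetically last bi-gram
        path.append(best[1])
    return "".join(path)


def recall_and_false_memories(seen, remembered):
    """Share of seen AOIs that were reported, and number of reported AOIs
    that were never seen (assumed definition of a false memory)."""
    seen_set, remembered_set = set(seen), set(remembered)
    recall = len(seen_set & remembered_set) / len(seen_set) if seen_set else 0.0
    return recall, len(remembered_set - seen_set)


# Hypothetical scan paths, one character per AOI.
print(typical_scan_path(["ABCD", "ABCD", "ACD", "ABD"]))  # prints "ABCD"
print(levenshtein("ABCD", "ACD"))                         # prints 1
print(recall_and_false_memories("ABCD", "ACE"))           # prints (0.5, 1)
```

Because every insertion, deletion or substitution adds one to the distance, longer sequences leave more room for error, which is in line with the reported dependence of the LD on sequence length.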

In the second experiment, the group of 17 web designers could predict 46 % (SD = 22.1 %) of a typical user scan path, when given the question that the user had to answer. This is significantly worse than the 70 % that users could remember themselves (t(25) = -3.04, p < 0.01). Still using the same questions for each task, we also asked the web designers to try to predict the first 3 AOIs that a user would be most likely to look at, using the “squint test” recommended by [7] for predictions of scan paths. With this method, designers could only predict 36 % (SD = 27 %) of the first 3 AOIs in the typical user scan paths.

Discussion
Even a “quick-and-dirty” method should produce reliable indications, though not necessarily 100 % complete data sets. It should definitely not lead to wrong conclusions based on skewed data, and the possible limitations of the method should be clear to all using it. For instance, our experiment indicates that there are severe limitations to people’s memory of having seen a logo. So we would not recommend testing logo design and/or placement by the self-reporting method.

Earlier research by [2] totally rejected the trustworthiness of users’ self-reported eye movements. On the basis of the present findings, we find enough reasons to keep improving the self-reporting methods and to identify the best conditions for using them. Reporting of eye movements should be done for one web page at a time, and as soon as the user has found the answer, thus using the principles of recency [6] while retaining the context with the original stimuli still shown in front of the user. Also, we recommend that instructions to the users should be highly task-oriented, which has been shown to increase their incentives for the presented stimuli [11], and we recommend short and simple search tasks, since memory of long sequences tends to be more erroneous.

The results suggest that the predictions of web designers are generally less reliable than asking the users themselves. When web designers are to make important decisions on e.g. how to organize a layout, they should not just trust their “professional intuition” but ask users how they actually looked at the interface. We believe that the self-reporting method is efficient enough to be part of an iterative design routine, and the “squint test” does not seem to be an alternative in tests where users are to search for particular information.

In future research, we intend to analyze if users can remember their scan paths for a longer time than just the few seconds we required them to, and if they can hold the memory of them when conducting several tasks or visiting several web pages before they deliver the report. As an alternative, the mouse can be used to indicate continuously where one is looking – indeed a cheap and ubiquitous attention tracker. What are the differences then between state-of-the-art eye tracking and real-time reports given by mouse?

We would also like to investigate if it is possible to increase the validity of self-reported gaze patterns by reducing the complexity of the stimuli. In practice, this could be done by de-selecting “display of graphic elements” in the browser settings. Also, for tests conducted with simple prototypes, it will be interesting to know how well they will predict gaze patterns on the final interface.

The self-reporting method could prove especially relevant in large-scale user studies conducted over the Internet, since it would then be practically possible to make users report on their scan paths remotely. Can we give precise instructions – in written or multimedia formats – that would teach users to report their gaze patterns from a distance?

Finally, we would like to investigate whether the immediate memory of eye movements on a web page is related to other dimensions of usability. For instance, cluttered web pages are probably more difficult to report eye movements on than well-designed web pages. The immediate memory of own eye movements could also turn out to be a good predictor of the long-term memory of the web page, and thus of the feeling of familiarity with it.

Conclusion
The validity of users’ memory of their scan paths on web pages is limited when compared to actual recordings, especially for logo elements. The validity is high enough to justify further research, and several improvements are possible. This may lead to a range of new usability tools that everybody, including unsupervised users, can apply. Even now, the validity is higher than relying on predictions from web designers.

Acknowledgements
We thank the Royal School of Library and Information Science in Copenhagen for supporting this research.

References
[1] Aaltonen, Antti ‘Eye Tracking in Usability Testing: Is It Worthwhile?’, ‘Usability & eye tracking’ workshop at CHI'99, ACM Press (1999).
[2] Babcock, J.S., Pelz, J.B., & Fairchild, M.D. ‘Eye tracking observers during rank order, paired comparison, and graphical rating tasks’, Proc. PICS Digital Photography Conference 2003 (2003).
[3] Chapman, Peter ‘Remembering what we’ve seen: Predicting recollective experiences from eye movements when viewing everyday scenes’, Cognitive Processes in Eye Guidance, G. Underwood (Ed.), Oxford University Press, UK, 2005, pp. 237-258.
[4] Hansen, J. Paulin ‘The use of eye mark recordings to support verbal retrospection in software testing’, Acta Psychologica 76 (1991), pp. 31-49.
[5] Josephson, S. and Holmes, M. E. ‘Visual attention to repeated internet images: testing the scanpath theory on the world wide web’, Proc. ETRA '02, ACM Press (2002), pp. 43-49.
[6] Matlin, Margaret W. Cognition, Wadsworth Thomson Learning, NY, USA, 2002.
[7] Mullet, Kevin & Sano, Darrell Designing Visual Interfaces – Communication Oriented Techniques, Sun Soft Press, Mountain View, California, USA, 1995.
[8] Pan, B., Hembrooke, H. A., Gay, G. K., Granka, L. A., Feusner, M. K., and Newman, J. K. ‘The determinants of web page viewing behavior: an eye-tracking study’, Proc. ETRA ’04, ACM Press (2004).
[9] Schnipke, S. K. and Todd, M. W. ‘Trials and Tribulations of Using an Eye-tracking System’, Proc. CHI 2000, ACM Press (2000), pp. 273-274.
[10] Stark, L.W. & Ellis, S.R. ‘Scanpaths revisited: cognitive models direct active looking’, Eye movements: cognition and visual perception, D. F. Fisher et al. (Eds.), Hillsdale, NJ: Lawrence Erlbaum Associates (1981), pp. 193-226.
[11] Van Waes, Luuk ‘Thinking Aloud as a Method for Testing the Usability of Websites: The Influence of Task Variation on the Evaluation of Hypertext’, IEEE Transactions on Professional Communication, vol. 43, no. 3, 2002.
