
Detecting User Stress State Using Computer Mouse Movements

Dylan Drein
13344741

Final Year Project – 2017


B.Sc. Single Honours in
Computer Science

Department of Computer Science


Maynooth University
Maynooth, Co. Kildare
Ireland
A thesis submitted in partial fulfilment of the requirements for the B.Sc.
Single Honours in Computer Science
Supervisor: Dr Kevin Casey
Contents
Declaration ........................................................................................................................................... i
Acknowledgements............................................................................................................................. ii
Abstract .............................................................................................................................................. iii
List of Figures ..................................................................................................................................... iv
List of Tables ...................................................................................................................................... iv

Chapter one: Introduction......................................................... 1


Summary ............................................................................................................................................. 1
1.1 Topic addressed in this project ............................................................................................... 1
1.2 Motivation............................................................................................................................... 1
1.3 Problem statement ................................................................................................................. 1
1.4 Approach ................................................................................................................................. 2
1.4.1 Initial Decisions ............................................................................................................... 2
1.4.2 Designed Solution ........................................................................................................... 2
1.4.3 Evaluation........................................................................................................................ 2
1.5 Metrics .................................................................................................................................... 3

Chapter two: Technical Background.......................................... 4


2.1 Topic material ......................................................................................................................... 4
2.2 Technical material ................................................................................................................... 5
2.2.1 Python ............................................................................................................................. 5
2.2.2 JSON and CSV .................................................................................................................. 5
2.2.3 UNIX time ........................................................................................................................ 5
2.2.4 Packages .......................................................................................................................... 5
2.2.5 Graphing .......................................................................................................................... 5
2.2.6 Floating Point Arithmetic ................................................................................................ 6
2.2.7 Error analysis ................................................................................................................... 6
2.2.8 Distributions .................................................................................................................... 6
2.2.9 Hypothesis Testing .......................................................................................................... 7

Chapter three: The Problem...................................................... 8


Summary ............................................................................................................................................. 8
3.1 Technology to use ................................................................................................................... 8
3.2 Preparing the Data .................................................................................................................. 8
3.3 Isolating Relevant Data ........................................................................................................... 9
3.3.1 Isolate Click Sequences ................................................................................................... 9
3.3.2 User Environment ........................................................................................................... 9
3.4 Calculating Metrics.................................................................................................................. 9
3.5 Problem Analysis ................................................................................................... 11

Chapter four: The Solution ...................................... 12

Summary ........................................................................................................................... 12
4.1 Tools ...................................................................................................................................... 12
4.2 Cleaning the Data .................................................................................................................. 12
4.3 File Conversion ...................................................................................................................... 13
4.4 Classification ......................................................................................................................... 13
4.4.1 By User .......................................................................................................................... 13
4.4.2 In lab vs. Out of Lab....................................................................................................... 13
4.5 Metrics .................................................................................................................................. 14
4.5.1 Distance......................................................................................................................... 14
4.5.2 Efficiency ....................................................................................................................... 15
4.5.3 Speed............................................................................................................................. 15
4.5.4 Time .............................................................................................................................. 16
4.5.5 Hover time .................................................................................................................... 16
4.5.6 Overshoot...................................................................................................................... 16
4.5.7 Axis Error ....................................................................................................................... 16
4.6 Hypothesis Testing ................................................................................................................ 16

Chapter five: Evaluation .......................................................... 18


Summary ........................................................................................................................................... 18
5.1 Solution Verification ............................................................................................................. 18
5.1.1 Verification of Analysis Methods .................................................................................. 18
5.2 Validation/Measurements .................................................................................................... 18
5.2.1 Results .......................................................................................................................... 19
5.2.2 Explanation of Results ................................................................................................... 19
5.2.3 Analysis of Results ......................................................................................................... 20

References ............................................................................. 23
Declaration

I hereby certify that this material, which I now submit for assessment on the program of study
as part of BSc. Computer Science qualification, is entirely my own work and has not been
taken from the work of others - save and to the extent that such work has been cited and
acknowledged within the text of my work.

I hereby acknowledge and accept that this thesis may be distributed to future final year
students, as an example of the standard expected of final year projects.

Signed: Date:

i
Acknowledgements

I would like to thank my supervisor Dr Kevin Casey for providing consistent and invaluable assistance
throughout the course of this project.

I would like to thank Dr Patrick Murphy of University College, Dublin for meeting with me and assisting
me greatly with the mathematics and statistics elements of this project.

Many thanks to everybody who spent the past 6-7 months in the final year lab, driving me to continue
and providing moral support when it was needed. Misery loves company.

I extend my gratitude to the entire department of Computer Science and its faculty here at Maynooth
University for the level of support and assistance provided over the past 4 years.

Finally, I would like to thank my family and friends who supported and guided me throughout the
duration of this project and the past 4 years in general. It has been hugely appreciated.

ii
Abstract
Stress has been shown to manifest in many forms and to have debilitating effects on a person’s mental
and physical health, personal productivity, workplace efficiency and quality of life. Symptoms and
indicators of stress are well studied; however, traditional detection methods can be invasive,
cost-prohibitive or too subjective to be reliable.
Here we explore the advantages of using computer mouse movements as a method of stress detection
in 129 college students. Mouse movement data from each individual user was logged across the
semester and analysed within the context of whether it took place in a lab exam environment or
outside of a lab exam environment. Metrics were devised, based on this data, as being potentially
effective in identifying a change in user stress state between environments. These metrics were
calculated for each mouse movement and the difference in results between environments was tested
for significance. Of the nine metrics tested, mouse movement speed, efficiency, time duration, target
overshoot, hover time over the target, error in the x-axis component of movement, and efficiency in
the x-axis and y-axis components of movement were found to be significantly different between the
stressful and less stressful environments. Error in the y-axis component of movement was found to
have no significant difference.
These findings strongly support the merit in using Human Computer Interaction through computer
mouse movements in detecting stress and serve as a potential foundation for further research using
keyboard data as an additional stress detection indicator or in the application of machine learning
principles to the data to be able to classify a user as stressed or non-stressed.

iii
List of Figures
Figure 3-1: Cleaned, reversed JSON data showing a mouse click event. ............................... 8
Figure 3-2 Plot of an individual user’s Click Sequence (note: overshoot) ............................... 9
Figure 3-3: Actual mouse path (left), Approximation of actual mouse path using
intermediate Euclidean distances between mouse events (right) ..................... 10
Figure 3-4: Actual mouse path with overshoot of mouseDown location, showing overshoot
and return path of mouse cursor to the location of the click event. ................. 11
Figure 4-1: (Left) Mouse activity over the week of lab 1 and (Right) mouse activity over the
week of lab 2. ..................................................................................................... 14
Figure 4-2 Distance function using NumPy linear algebra tools ......................................... 15
Figure 4-3 (Left) Histogram of in lab Click Sequence efficiency and (Right) Histogram of
out of lab Click Sequence efficiency (sample mean, standard deviation and
variance incl.) ..................................................................................................... 15
Figure 4-4 Histogram of in lab Click Sequence speed (Left) and out of lab Click Sequence
speed (Right) ...................................................................................................... 15
Figure 4-5 Histogram of in lab Click Sequence hovertime (Left) and out of lab Click
Sequence hovertime (Right) ............................................................................... 16
Figure 5-1: Example of dummy file used to verify code analysis of metrics ........................ 18

List of Tables
Table 2-1 Type errors encountered in Hypothesis Testing ................................................... 7
Table 4-1 UNIX time stamps of lab exam........................................................................... 13
Table 5-1 Table of metric descriptive statistics.................................................................. 19
Table 5-2 Summary of Welch’s t-test results showing the significance of the stress
indicator metrics. Star (∗) indicates significance at α = 0.0167. ........................ 20

iv
Chapter one: Introduction
Summary
Chapter one provides an overview of the existing work in this field and the motivation for carrying out
this project, as well as an outline of the approach and methodology used, from planning, through
implementation, to the evaluation of the results.

1.1 Topic addressed in this project


This project uses computer mouse movement data collected over the 2014/2015 college year from
129 students in and out of a college lab exam environment to investigate behavioural and stress state
patterns between both of these environments.
The aim of the project is to identify useful behavioural characteristics from the data and to use analysis
methods to determine whether or not these characteristics can be used to infer a relationship between
user behaviour, stress levels and the environment in which the user uses the computer.
Just over 19.6 GB of raw mouse data was collected using a JavaScript-based mouse event handler which
stored the data in JSON format. The most useful metrics to analyse were identified, data for each
metric was generated for both in lab and out of lab environments and the results were analysed for
significance.

1.2 Motivation
The motivation for this project was to provide a passive, non-invasive and inexpensive model of
detecting stress indicators in users by monitoring their behaviour through their computer mouse
movements.
Previous research has strongly indicated that stress in the workplace has a negative effect on employee
efficiency and employee performance, negatively affects work ethic and contributes to employee
‘burnout’. [1] Furthermore, in college students stress has been shown to negatively affect physical and
emotional health [2] and strongly correlates with high levels of depression and anxiety. [3]
Although conventional stress indicators are well studied and understood [4], measuring and collecting
data in an unobtrusive yet effective way has not been straightforward. [5] Electroencephalography
(EEG) and electrocardiography (ECG) are proven methods in indicating levels of stress in subjects [6],
though they require specialised and expensive equipment to be in place before, during and after the
stressful situation takes place. Custom sensing hardware encounters similar issues, while self-report
tools can provide biased and unreliable feedback depending on the environment in which testing takes
place. [5]

1.3 Problem statement


The project involved the processing, categorisation and analysis of vast amounts of data collected from
129 students’ mouse movements in and out of a lab environment over a college year. The raw data
required a large amount of cleaning before it could be effectively analysed in any meaningful way. The
data also had to be categorised into separate user files so that analysis could be carried out by
individual user across both the stressful and less stressful environments. This allowed trends to be
identified between users before further, more general analysis was carried out. Following this, the

data was classified into one of two groups based on whether or not Click Sequences occurred in or out
of a lab environment.
Before the data was cleaned, it was necessary to determine what aspects of a Click Sequence would
be most useful to analyse in order to achieve the aims of the project. Determining this early on ensured
that only the most relevant fields were retained during the cleaning process. Relevant analytical
methods that would be appropriate to apply to this data were explored throughout the project as the
distribution and structure of the data became more understood.

1.4 Approach
The three mouse events encountered in the data were mouseMove, mouseDown and mouseUp. A
mouseDown event followed by a mouseUp event constitutes a ‘full’ mouse click; however,
‘mouseDown’ and ‘click’ are used synonymously throughout this report. A full mouse movement
sequence is identified as a succession of mouseMove events ending in a mouseDown event. This full
sequence will hereafter be referred to as a Click Sequence.
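To make the segmentation concrete, the following sketch shows how a chronological stream of mouse events could be split into Click Sequences. The event format used here (a type field plus x, y coordinates and a millisecond timestamp t) is illustrative, not the exact schema produced by the logger.

```python
# Sketch: split a chronological event stream into Click Sequences,
# i.e. runs of mouseMove events that end in a mouseDown event.

def segment_click_sequences(events):
    """Return a list of Click Sequences from a chronological event list."""
    sequences, current = [], []
    for event in events:
        if event["type"] == "mouseMove":
            current.append(event)
        elif event["type"] == "mouseDown":
            if current:                      # only keep sequences with movement
                current.append(event)
                sequences.append(current)
            current = []
        # mouseUp events complete a click but carry no path information here

    return sequences

events = [
    {"type": "mouseMove", "x": 10, "y": 10, "t": 0},
    {"type": "mouseMove", "x": 40, "y": 25, "t": 120},
    {"type": "mouseDown", "x": 42, "y": 26, "t": 300},
    {"type": "mouseUp",   "x": 42, "y": 26, "t": 380},
]
print(len(segment_click_sequences(events)))  # one Click Sequence
```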
1.4.1 Initial Decisions
Python was the programming language used to carry out this project, chosen for its widely supported
and well documented statistical analysis and numerical packages, namely NumPy, SciPy, Pandas, scikit-
learn for machine learning and matplotlib for graph plotting. Initially Project Jupyter’s IPython [7]
command shell was used which offers an interactive browser-based development notebook, though
as time went on the Windows command prompt was used as the predominant Python interpreter.
1.4.2 Designed Solution
Relevant metrics such as the efficiency of the user’s Click Sequence, the speed of the Click Sequence, the
time duration of each Click Sequence among others outlined in this report were initially identified as
promising methods of identifying and measuring stress indicators in users. Data for each of these
metrics was generated from the raw data and categorised at first by user, then by environment.
Descriptive statistics were generated for each of these metrics across both environments, and visual
analysis was carried out using plotting software. These initial analyses served as the
foundation for a final round of testing which determined which metrics, if any, show a significant
difference between in lab vs. out of lab environments thus determining which metrics were strongest
in identifying stress indicators in users.
1.4.3 Evaluation
Data was initially analysed by graphing its spread with histograms, scatter plots and box plots using
matplotlib plotting software. Doing this allowed for further relevant analysis to be identified based on
how the data looked visually and also allowed for tweaks and changes in the previous code to be made
to better suit the data once it had been visualised.
In addition, descriptive statistics such as the mean, median, standard deviation and variance were used
alongside the visual analysis. This information was used to compare data between both environments
and served as the basis for testing for correlation between in lab and out of lab conditions for each of
the metrics outlined in section 1.5.
Hypothesis testing was used to determine if corresponding metrics from in lab and out of lab
environments were significantly different from one another. Welch’s t-test, a modification of the
Student’s t-test, was used to carry this out; it can be less powerful than the Student’s t-test but does
not assume equal variance between the samples [8]. Data transformations such as square or log
transformations can be applied to the data to normalise its spread and make it more suitable for
various tests which rely on the assumption of a normal distribution. These methods of transformation,
although widely used, have been highlighted as potentially problematic [9], so alternative approaches
will be explored to strengthen conclusions.
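As a concrete illustration, SciPy’s ttest_ind function performs Welch’s t-test when equal_var is set to False. The sample values below are synthetic and serve only to show the call, not any result from the project data.

```python
# Sketch: comparing a metric between the two environments with
# Welch's t-test (scipy.stats.ttest_ind with equal_var=False).
from scipy import stats

in_lab     = [1.42, 1.38, 1.55, 1.61, 1.47, 1.52, 1.66, 1.44]
out_of_lab = [1.21, 1.35, 1.28, 1.19, 1.33, 1.25, 1.30, 1.22]

# equal_var=False selects Welch's variant, which does not assume
# that the two samples share a common variance.
t_stat, p_value = stats.ttest_ind(in_lab, out_of_lab, equal_var=False)
print(t_stat, p_value)
```

A small p-value (below the chosen significance level α) indicates that the difference between the two environments is unlikely to be due to chance alone.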

1.5 Metrics
Based on the spread of the distribution for each metric (discussed in detail below), relevant Click
Sequences are limited to those which last at most 1450 ms and end with a mouseDown event.
The relevant metrics that were obtained from the data and used for analysis are:
1. Distance: Both actual distance and optimal distances were calculated for each Click Sequence.
Actual distance measured the real mouse path distance of each Click Sequence. Actual paths were
made up of a sequence of mouseMove events ending in a mouseDown event. The Euclidean
distances between these successive mouseMove events were summed to approximate the distance
travelled. Optimal distance was calculated as the Euclidean distance between the start and end
point of the Click Sequence.

2. Time: The duration of each Click Sequence was calculated in milliseconds from the initial point to
the mouseDown point.

3. Speed: The speed of each Click Sequence was calculated in pixels per millisecond (px/ms) and
pixels per second (px/s).
See Appendix D

4. Click hover time: The duration for which each user ‘hovered’ the cursor over the click point before
clicking the mouse was calculated for each Click Sequence.
See Appendix B, I

5. Efficiency: The efficiency of each Click Sequence was calculated as:

Efficiency = Optimal Distance / Actual Distance

See Appendix H, I

6. Overshoot: The distance for which the user ‘overshot’ the target with the mouse cursor was
calculated for each Click Sequence.
See Appendix C, I

7. X/Y-axis error: The error was calculated for optimal and actual distance for the individual x-y
components of each Click Sequence. This used the same method as (1.) though it measured the
change in distance along each axis separately for each Click Sequence.
See Appendix E, F, G

8. X/Y-axis efficiency: Efficiency was calculated in the same way as above for the individual x-y
components of each Click Sequence.
See Appendix E, F, G
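A minimal sketch of how the distance, efficiency and speed metrics above could be computed for a single Click Sequence follows. The event fields (x and y in pixels, t in milliseconds) are illustrative rather than the exact logged schema.

```python
# Sketch: actual distance, optimal distance, efficiency and speed
# for one Click Sequence (a list of events ending in a mouseDown).
import math

def path_metrics(sequence):
    """Return (actual, optimal, efficiency, speed) for a Click Sequence."""
    # Actual distance: sum of Euclidean distances between successive events.
    actual = sum(
        math.hypot(b["x"] - a["x"], b["y"] - a["y"])
        for a, b in zip(sequence, sequence[1:])
    )
    # Optimal distance: straight line from the start point to the click point.
    optimal = math.hypot(
        sequence[-1]["x"] - sequence[0]["x"],
        sequence[-1]["y"] - sequence[0]["y"],
    )
    duration_ms = sequence[-1]["t"] - sequence[0]["t"]
    efficiency = optimal / actual if actual else 1.0
    speed = actual / duration_ms if duration_ms else 0.0   # px/ms
    return actual, optimal, efficiency, speed

seq = [
    {"x": 0,  "y": 0,  "t": 0},
    {"x": 30, "y": 40, "t": 200},   # 50 px from the start point
    {"x": 60, "y": 80, "t": 500},   # another 50 px along the same line
]
actual, optimal, efficiency, speed = path_metrics(seq)
print(actual, optimal, efficiency, speed)  # 100.0 100.0 1.0 0.2
```

A perfectly straight path gives an efficiency of 1.0; any deviation or overshoot pushes the ratio below 1.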

Chapter two: Technical Background
Summary
This chapter outlines and discusses the previous approaches, methodologies and results of similar
research carried out in this area. It also aims to break down and describe in detail all technical aspects
of this particular project, the approaches taken and the technology used.

2.1 Topic material


Measuring stress and user intent in subjects is a well explored area of research with applications in the
health sector[6], the advertising and marketing industry[10], online profiling [11], education [12],
psychology[13] and many other varied fields.
Sun et al.[5] used a mass-spring damper system as a model of the human arm. In this approach (n=49)
the focus was on measuring muscle stiffness as an indicator of stress through known arm-hand
dynamics. The experimental design involved measuring the accuracy of drag and drop actions, point
and click actions and mouse cursor steering. Stress was induced in the subjects by the researchers in
stages, with an initial calm period followed by a stressor period after which the mouse tasks were
carried out and measures taken, with a final cool down calm period. The stressor event was a series of
mental arithmetic problems posed to the subjects under limited time constraints.
Along with ECG signals recorded concurrently with user reported stress ratings, this research made
strong findings in a novel area of the field: the use of a physiological model of the user’s arm as a
method of measuring stress captured through computer mouse movements.
Hernandez et al. [14] carried out an approach with some similarities to the work of Sun et al. This
experiment (n=24) involved use of a pressure sensitive computer keyboard as well as a capacitive
computer mouse, with subjective user evaluated stress reports also forming part of the feedback in a
within-subjects lab study. Different text transcription exercises were carried out under calm and
stressful situations (time constraints, speed constraints etc.) measuring dynamics such as keystroke
speed and latency between keystrokes, among others. The computer mouse was again used to
investigate whether stress could accurately be measured and identified through mouse movement
behaviour. The mouse tasks in this approach involved the user carrying out repetitive mouse click
actions under various time constraints, and across varying distances on the screen. The mouse
exercises were carried out before and after the induced stress keyboard exercises. Findings of this
study included a correlation between user stress levels and keystroke pressure as well as increased
contact with the computer mouse.
Guo and Agichtein [15] conducted a study (n=860) investigating user intent in the context of web
search queries on a Google search result page. They decided to classify mouse paths in such a way that
a user’s query intent could be deduced, such as navigational versus informational mouse movements.
They concluded that mouse path trajectories can be used to infer a user’s intent and classify mouse
paths that initially appear to have ambiguous intent, though no solid definition is given within the
context of the study as to what makes a mouse path navigational or informational.
Demšar and Çöltekin [16] carried out research (n=11) into the relationship between mouse
movement trajectory and eye movement tracking in a controlled laboratory environment. Users traced
basic geometric shapes on-screen with the mouse cursor as well as visually with their eyes. The users
were given various instructions such as to lead with the mouse cursor when tracing the shapes and
allow their gaze to follow, to trace the shape with their eye gaze and move the mouse cursor
accordingly etc. They concluded that this approach, foregoing expensive and invasive eye tracking

software, is indicative of a strong relationship between mouse cursor movement and eye gaze
movement.
This exploratory work serves as a foundation for further work in this field and, when considered in
conjunction with other work in the fields of augmented reality [17] and virtual reality [18] for example,
the prospective applications of mouse movement analysis are extremely promising.

2.2 Technical material


2.2.1 Python
Python is an interpreted programming language with a dynamic type system and an object-oriented
programming paradigm. It is rapidly growing as a strong tool for statistical analysis alongside R and
MATLAB due to its wide range of data analysis packages and libraries with extensive documentation
and a strong user base of data scientists.

2.2.2 JSON and CSV


JSON (JavaScript Object Notation) is a data storage format which uses attribute-value pairs to store
data within a JSON object. A JSON object can contain one or more of these attribute-value pairs, and
the value of an attribute can itself contain further attribute-value pairs, allowing for nested,
hierarchical data structures. [19]
CSV (comma-separated values) files are well suited to storing tabular data in which each row of data
contains fields which are individually separated by a delimiting character, usually a comma. As such,
CSV files are regularly used in the storage of data in which each row entry has the same consistent set
of fields. The storage of data in CSV format, text-based or numeric, can be likened to that of a matrix
or a 2D array. CSV does not support hierarchical or nested data. [20]
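The difference between the two formats can be illustrated by flattening a nested JSON mouse event into a flat CSV row. The field names below are hypothetical, not the logger’s exact schema; the nested "position" object must be flattened because CSV cannot represent the nesting directly.

```python
# Sketch: converting a nested JSON mouse event into a flat CSV row.
import csv
import io
import json

raw = '{"type": "mouseDown", "time": 1428498000123, "position": {"x": 250, "y": 140}}'
event = json.loads(raw)

# Flatten the nested "position" object into top-level columns.
row = [event["type"], event["time"], event["position"]["x"], event["position"]["y"]]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["type", "time", "x", "y"])  # header row
writer.writerow(row)
print(buffer.getvalue())
```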

2.2.3 UNIX time


UNIX time, also referred to as Epoch time, is a date format which quantifies instants in time as the
number of seconds which have passed since 00:00:00, 1st January 1970 UTC, called the Epoch. [21]
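For example, assuming millisecond-resolution timestamps (as JavaScript event loggers typically produce; the value below is illustrative), a UNIX time value can be converted to a readable UTC date by dividing by 1000 first:

```python
# Sketch: interpreting a millisecond UNIX timestamp as a UTC datetime.
from datetime import datetime, timezone

unix_ms = 1428498000000  # milliseconds since the Epoch (illustrative value)
dt = datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)
print(dt.isoformat())  # 2015-04-08T13:00:00+00:00
```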

2.2.4 Packages
Anaconda is an extensive data science platform for Python. Conda is an open source package and
environment management system which comes preinstalled as part of the Anaconda suite. As an
environment manager, Conda handles switching between Python environments quickly when working
with packages which are version specific. Conda supports the installation of packages through the
command prompt with the ‘conda install [package name]’ command. It also handles the
deletion and updating of packages among other tasks.[22]
NumPy is a scientific computation package for use with Python. It allows for the storage of data in
homogeneous, n-dimensional ‘ndarray’ data objects as well as the ability to carry out a wide range of
mathematical functions and transformations on the data stored in these arrays.
SciPy is a Python library which has features for handling data processing, optimisation, numerical
integration, regression analysis, clustering, statistical analysis and hypothesis testing among data sets
along with many other features. SciPy works with NumPy ndarray objects as part of the NumPy stack.

2.2.5 Graphing
Matplotlib [23] is a Python plotting library which is capable of producing static and interactive graphs
and figures. These graphs can be heavily customised depending on the graph being generated (e.g.
histograms, box plots, error bars, scatter plots etc.), allowing for in-depth analysis of the spread of the
data as well as visual analysis of relationships between data sets.
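A minimal example of examining the spread of a metric with a histogram is sketched below. The efficiency values are synthetic, and the non-interactive Agg backend is used so the figure is written to a file rather than displayed.

```python
# Sketch: plotting the spread of a metric as a histogram with matplotlib.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to file, not screen
import matplotlib.pyplot as plt

efficiencies = [0.91, 0.84, 0.88, 0.95, 0.72, 0.80, 0.99, 0.86, 0.77, 0.90]

fig, ax = plt.subplots()
ax.hist(efficiencies, bins=5, edgecolor="black")
ax.set_xlabel("Click Sequence efficiency")
ax.set_ylabel("Frequency")
fig.savefig("efficiency_hist.png")
```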

2.2.6 Floating Point Arithmetic


Floating point representation is used by computers to store and manipulate numbers. Numbers stored
in floating point format (floats) can only be stored to a finite level of precision. This finite set of
numbers is referred to as the set of Machine Numbers (𝑀).
A number x in floating point representation is written in the form:

x = σ · ν · b^P

where:
σ = the sign, a single bit (1 for negative, 0 for zero or positive)
ν = the mantissa
b = the base
P = the power (exponent)

A number x such that |x| < b^(Pmin − 1) will result in an underflow error, as it is too small a number for
the computer to be able to store in memory. Likewise, a number x such that |x| > b^Pmax · (1 − 1/b^L)
(where L denotes the length of the mantissa) will result in an overflow error, as it is too large a number
for the computer to store in memory. If x ∉ M then rd(x) is used, where rd() is some rounding
function such that rd(x) ∈ M, resulting in rounding errors.
NumPy allows for the specification of 32-bit or 64-bit precision when using floating point numbers;
however, internal Python functions, such as the csv writerow() function used to output data to a
CSV file, may truncate floating point numbers to a lower level of precision. This can make
program-wide consistency in floating point representation difficult and can also introduce truncation
or round-off errors.
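The classic demonstration of this rounding behaviour is that ten additions of 0.1 do not sum exactly to 1.0, because 0.1 has no exact binary representation:

```python
# Demonstration: repeated addition of 0.1 drifts from the true sum,
# since 0.1 cannot be represented exactly in binary floating point.
import math

total = sum([0.1] * 10)
print(total == 1.0)              # False
print(abs(total - 1.0))          # a small but non-zero rounding error

# math.isclose is the usual way to compare floats within a tolerance.
print(math.isclose(total, 1.0))  # True
```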

2.2.7 Error analysis


Data and numerical analysis can be affected by a number of different types of errors:
- Input error: This can rarely be controlled for, and is a result of measurement error with respect
to data gathered from the real world. As it tends to be unavoidable, this type of error is usually
accounted for during the analysis and can be mitigated depending on the analytical
approaches used.
- Rounding error: As stated above, computers can introduce rounding errors when floating point
representation is used as only machine numbers (M) can be represented. Numbers which are
not an element of this finite set must be rounded or truncated so that they become elements
of that set, introducing round-off error.
- Approximation error: When trying to solve a particular problem 𝑃 with a solution 𝑛 and a
method is devised which approximates a solution 𝑛∗, the level of error between the true
solution 𝑛 and the approximate solution 𝑛∗ can be defined using several methods including
absolute error, relative error and percentage error:
o E_Abs = |n* − n|
o E_Rel = |n* − n| / |n|
o E_Per = E_Rel × 100
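A minimal sketch of the three error measures, using a hypothetical true solution n and approximation n*:

```python
# Hypothetical values: the true solution n and an approximation n_star.
n = 3.14159265
n_star = 3.14

e_abs = abs(n_star - n)           # absolute error
e_rel = abs(n_star - n) / abs(n)  # relative error
e_per = e_rel * 100               # percentage error

print(e_abs)   # ~0.00159
print(e_per)   # ~0.0507 (%)
```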

2.2.8 Distributions
Distribution of the data can be visually represented with graphs such as box plots and histograms.
Representing the data this way can show, among other things, the spread of the data, data outliers, the maximum and minimum values as well as the skew of the data in a more intuitive and readable
way. Plotting the data this way can also give a more informed view of how further analysis should be
carried out as, in the case of hypothesis testing for example, certain testing methods are only suited
to specific distributions of data and can give inaccurate, unreliable or less powerful results if applied
incorrectly.
As such, it is important to identify features of the data like data skew to determine if it is positively or
negatively skewed or if it is normally distributed, if it is unimodal or bimodal etc. It is also useful, and
even necessary, to identify more descriptive statistical attributes such as the mean value, the median
value, variance (how far a set of numbers in a set are spread out from their mean) and standard
deviation (measure of the amount of variation among a set of data values) to get a true understanding
of the distribution and ‘shape’ of the data set being worked with.

2.2.9 Hypothesis Testing


The appropriate hypothesis tests that can be applied to the results depend on what kind of data is
being worked with. A common and often reliable test is known as the Student’s t-test which tests if
the means of two sets of data are significantly different in relation to their variance.
The null hypothesis H0 states that there is no statistically significant difference between the means of the data sets, whereas the alternative hypothesis HA states that there is a significant difference. If the test indicates a significant difference at the specified confidence level, the null hypothesis is rejected in favour of the alternative hypothesis; otherwise we fail to reject the null hypothesis, rather than accepting it. The Student's t-test assumes that the data is normally distributed and that the variances of the data sets are equal. As such, it may or may not be applicable to a particular data set depending on its distribution.
Hypothesis testing can result in errors if the data is improperly interpreted, summarised in Table 2-1:
                                        NULL HYPOTHESIS
                              TRUE                      FALSE
DECISION      Reject          Type 1 error              Correct inference
MADE ABOUT                    (False Positive)          (True Positive)
NULL          Fail to         Correct inference         Type 2 error
HYPOTHESIS    reject          (True Negative)           (False Negative)

Table 2-1: Type errors encountered in Hypothesis Testing


Welch’s t-test is a variation of the Student’s t-test in which normality is assumed but equal variance is not. Accordingly, it can be applied to a wider range of data sets, though its power is reduced when the sample sizes are small (n ≤ 10). It has been shown [8] that for sufficiently large sample sizes (n > 15), the assumption of normality can also be discarded with negligible effect on the power of the test; this was demonstrated on a skewed Chi-squared distribution using the MATLAB analytics suite.
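For illustration (synthetic data, not the project's measurements), SciPy implements Welch's t-test via ttest_ind with equal_var=False:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two samples with different means and unequal variances, standing in for a
# metric measured in two environments.
sample_a = rng.normal(loc=280.0, scale=90.0, size=500)
sample_b = rng.normal(loc=250.0, scale=50.0, size=500)

# equal_var=False selects Welch's t-test; equal_var=True would be Student's.
t_stat, p_value = stats.ttest_ind(sample_a, sample_b, equal_var=False)

# Reject H0 (equal means) only if p falls below the chosen significance level.
alpha = 0.05
reject_null = p_value < alpha
print(reject_null)
```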

Chapter three: The Problem
Summary
The aim of the project is to create a system which analyses collected mouse movement data and
identifies potential correlations between the metrics outlined in section 1.5 for in lab and out of lab
environments. The raw data needs to be cleaned and prepared appropriately so that this can be carried
out and the results need to be obtained in such a way that they can be statistically analysed to produce
useful and potentially novel conclusions.

3.1 Technology to use


Python will be used to analyse the data, with its various statistics and data analysis packages and
libraries. Python has modules for reading and writing both JSON and CSV formatted data, which will
prove useful as discussed below.
Once the data is read in using Python’s json.load() and csv.reader() methods, NumPy and SciPy support data storage in NumPy arrays, which allow fast and efficient computation to be performed on each element stored in the array. NumPy supports 64-bit floating point precision, which is used for all numerical values throughout this project.

3.2 Preparing the Data


The JavaScript mouse tracking software compiles all of the mouse data for each user into a single log
file. There are over 88 million mouse events in this log file with each event entry holding 11 fields. It
will be necessary to remove non-relevant fields and delete duplicate event entries to make the analysis
as efficient as possible.

Figure 3-1: Cleaned, reversed JSON data showing a mouse click event.

One of the 11 fields recorded in each mouse event is the UNIX timestamp of when that event took
place. In the raw file, entries were logged from oldest to most recent, with the mouseDown event
being the final event in a Click Sequence. During analysis, the data will be read in row by row, starting
with the final mouseDown event and working backwards in time to the start of the Click Sequence, or
until the mouseDown event of a previous Click Sequence is encountered. Thus, the data needs to be loaded in the reverse of the order in which it was saved. The serialised, timestamp-ordered nature of the data makes it straightforward to reverse the log file ordering and save the data from newest to oldest, making the backwards traversal of the Click Sequences possible.
After the data has been cleaned and reversed it will be necessary to create separate files for each
user’s mouse movement data. This will involve parsing the data based on the name identifier saved in
each mouse event.
Due to the tabular nature of the data, with each column holding the same fields in each row entry, it
may be appropriate and indeed more efficient to convert the JSON files to comma separated value
files using the built-in reader and writer objects of Python’s CSV module.
Associated files: initialcleaning.py, FileReverse.py

3.3 Isolating Relevant Data
3.3.1 Isolate Click Sequences
As the project is concerned primarily with the analysis of user Click Sequences, it will need to be
determined what does and what does not constitute a valid mouse Click Sequence. A generic definition
of what we expect a valid Click Sequence to look like will need to be used as a general rule at first, with
refinements and improvements made a posteriori based on the observed spread of the data and also
based on previous research approaches that are applicable to this data.

Figure 3-2 Plot of an individual user’s Click Sequence (note: overshoot)


3.3.2 User Environment
User mouse activity will need to be classified into two categories: Click Sequences made in a lab
environment and those made outside of a lab environment.
The dates and times of the two lab exams that this project will focus on are known and so their
timestamps will need to be converted to UNIX time. Then, as each mouse event is loaded into the CSV
module row by row, they can be compared to the timestamps of the lab exams and be grouped
according to whether or not they occurred during a lab. Classifying the data in this way is what will
allow us to compare metrics in stressful and less stressful environments.

3.4 Calculating Metrics


The metrics that will be measured as part of this project will be devised based on previous research in this field and what has been shown to be successful [12], as well as on the data itself and what can be computed from it.
Distance, both actual and optimal, will need to be calculated for each Click Sequence. Optimal distance
is calculated as the Euclidean distance between the initial point of the Click Sequence and the final
point, where a mouseDown event is observed (though, as stated, it is calculated in reverse). The actual
distance will be approximated by calculating the Euclidean distance between each mouse move event
along the Click Sequence path from initial point to the mouseDown event. The sum of these straight
line interval distances will then be calculated.

D(x, y) = √((y1 − x1)² + (y2 − x2)² + ⋯ + (yn − xn)²) = √(Σ_{i=1}^{n} (yi − xi)²)

Figure 3-3: Actual mouse path (left), Approximation of actual mouse path using intermediate Euclidean
distances between mouse events (right)

Time duration of each Click Sequence will be calculated using the UNIX timestamps associated with
each mouse event. As the data is read in, the initial mouse event the CSV module will load will be the
most recent, the mouseDown event. The parser will then work backwards to the start of the Click
Sequence, saving the timestamp of the first and last point in the sequence and calculating the duration
of the Click Sequence as follows:

Duration of Click Sequence = Time of mouseDown event − Time of first mouse event in the sequence

Efficiency will be used as a measure of how close the actual path length of the Click Sequence is to the
optimal path length. Using:
E = Optimal Distance / Actual Distance
The efficiency of each Click Sequence can be parameterised as a numerical value such that E ∈ [0, 1], with E = 1 indicating the highest possible efficiency in actual path length. Because the efficiency of a Click Sequence is calculated from distances, it may be necessary to apply a weighting factor such as time or speed to the efficiency values when plotting their spread, so that shorter Click Sequences do not result in a disproportionate skew towards high efficiency.
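A small sketch of the efficiency measure and a weighted view of its spread, on hypothetical distances (the specific weighting choice shown here is illustrative only):

```python
import numpy as np

# Hypothetical optimal and actual path lengths (px) for three Click Sequences.
optimal = np.array([100.0, 80.0, 5.0])
actual = np.array([125.0, 80.0, 5.0])

# E in [0, 1]; E = 1 means the actual path matched the optimal straight line.
efficiency = optimal / actual

# Very short sequences are trivially near-optimal, so a weighting factor such
# as sequence duration can be applied when summarising the spread.
durations_ms = np.array([900.0, 1200.0, 60.0])
weighted_mean = np.average(efficiency, weights=durations_ms)

print(efficiency)      # first sequence: 0.8; the two straight paths: 1.0
print(weighted_mean)
```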
X-Y component efficiency will also be calculated using a similar method, in which the actual and optimal Click Sequence lengths along the individual axes are compared and used to calculate per-axis efficiency.
Speed of mouse Click Sequences will be measured in the traditional way as the rate of change in Actual
distance with respect to time. The average speed of the entire Click Sequence will be measured, though
it may prove useful to calculate differences in speed across specific intervals of the Click Sequence such
as comparing the initial speed of the cursor movement to the speed of the cursor just before the
mouseDown event.
Overshoot will measure how far the user overshoots the target click point before bringing the mouse
cursor back to that point. It should provide a useful measure of how accurate each Click Sequence is
in getting from the initial point to the click point. It is expected that Click Sequences with large
overshoot distances would have a corresponding drop in the measure of efficiency. Based on previous research [24], this measure can also be compared to metrics such as speed and time taken to get a better indication of the user’s stress level during that Click Sequence.

Figure 3-4: Actual mouse path with overshoot of mouseDown location, showing overshoot and return
path of mouse cursor to the location of the click event.

Click Hover Time will be measured to indicate how long the user hovers the mouse cursor over the
mouseDown event point before a mouseDown event is actually recorded. This metric can be compared
with speed, among other metrics, to investigate the relationship between hesitation and efficiency.
X-Y Axis error will be investigated to determine if there is any significant difference in the efficiency or
the level of error between Actual distance and Optimal distance when the x and y axis movements are
considered in isolation rather than together. This may prove a stronger indication of efficiency as it can
better account for the variation in movement of the human arm vertically and horizontally when using
a computer mouse.

3.5 Problem Analysis


Visual analysis will be carried out on the data between both situations using Python’s matplotlib
graphing package. The spread of the data for each metric measured can be represented graphically
using histograms, boxplots, scatter plots etc. It is expected that this element of the analysis will provide
a strong foundation for further hypothesis testing as the spread and skew of the data can be observed.
As stated above, the relevant metrics will be measured within the context of ‘in a lab environment’ vs.
‘out of a lab environment’. The users (n=129) can therefore be treated as the independent variables
(IV) in the analysis, with the metrics measured over the two situations acting as our dependent
variables (DV). The dependent variables may be paired or unpaired, depending on the specific metric
being analysed.
Hypothesis testing can be carried out on this type of data to investigate differences between the
spread of the metrics so that it can be said, with a level of confidence determined by the test, whether
or not the difference in the spread of the data between both situations is statistically significant. The
specific hypothesis tests to be used will be determined once the spread and skew of the data is known.

Chapter four: The Solution
Summary
This chapter outlines in extensive detail the approach taken and the methods used to solve the problems explored in Chapter Three and to achieve the stated aims of the project.

4.1 Tools
When comparing programming languages which are used by data analysts, Python had the shallowest
learning curve compared to software suites such as R or MATLAB. Python has numerous data analysis
packages and plotting libraries which are widely supported by a community of data scientists as well
as extensive reference documentation. Python is also an extremely common language used in the field
of machine learning and neural networks which may be used in future work based on the results and
conclusions of this project.

4.2 Cleaning the Data


The original raw data file comprised over 88 million JSON objects, with each object representing a mouse event from one of 129 users over a college year, both in and outside of a lab environment. Each object held 11 attribute–value pairs as follows:
{version; time; name; sessionID; tabID; event; type; x; y; cX; cY}
Version: Log file version
Time: UNIX epoch time instance of each mouse move event recorded in milliseconds.
Name: Pseudonym identifier for each individual student.
sessionID: Session identifier recorded by the browser as part of its session management
protocol.
tabID: Identifier recorded by the browser to identify individual tabs opened by the user in-
browser.
Event: Mouse event recorded by the JavaScript mouse event handler.
Type: The type of mouse event recorded, e.g. mouseMove, mouseDown, mouseUp. A
mouseDown and mouseUp event in succession represents a full mouse ‘click’.
X: The x-coordinate of the mouse event in relation to the computer screen.
Y: The y-coordinate of the mouse event in relation to the computer screen.
cX: The x-coordinate of the mouse event in relation to the on-screen client.
cY: The y-coordinate of the mouse event in relation to the on-screen client.
With such a vast amount of data it was necessary to remove redundant or unnecessary fields that
would not be required for any parts of the analysis. For the purposes of this project it was decided to
remove version, sessionID, tabID, event, cX and cY. The log file version was irrelevant to the aim of
measuring user stress, as was the sessionID logged as part of session management. During the lab exams the users were only allowed to have one tab open on screen, while outside of lab time the number of tabs open was unknown and bore little relevance to the aims of the project. It was decided to use the
x and y coordinates rather than the cX and cY coordinates as there was no consistent way to determine
if the user had been using the open window in full-screen, split-screen etc. This was noted as an input
error which had to be worked around.

Traditional text editors were unable to open such a large file, so in order to carry out the data cleaning a Python script was developed and executed from the command line which read the JSON file line by line, parsing each line with Python’s json.loads() method. It deleted the unnecessary fields using key-matching before writing the new, cleaned JSON object back out to a file. As part of this script, the data was also parsed into separate user files with the name attribute used as an identifier. This is discussed in section 4.4.
Reversing of each user file was handled separately but in a similar manner to the above. Each user file was read in binary mode using the name key as the identifier; if a user name was encountered that hadn’t been read before, it was saved to a name array. The entire file was converted to a list (with each row as an element), reversed using Python’s reverse() method and subsequently written back out to an output file. Output files were created dynamically for each user by checking whether that name had been encountered before: if it had, the row was appended to the existing file; if not, a new file was created with that name as the filepath.
Associated files: InitialCleaning.py, FileReverse.py
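A simplified sketch of this cleaning step follows. The field names come from the log format described above, but the function name, paths and open-file bookkeeping are illustrative, not the project's InitialCleaning.py:

```python
import json

# Fields removed during cleaning, per the discussion above.
DROP_FIELDS = ("version", "sessionID", "tabID", "event", "cX", "cY")

def clean_log(in_path, out_dir):
    user_files = {}  # name -> open file handle, created on first sighting
    with open(in_path) as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)      # one JSON object per line
            for field in DROP_FIELDS:     # key-matching removal
                event.pop(field, None)
            name = event["name"]          # per-user output file
            if name not in user_files:
                user_files[name] = open(f"{out_dir}/{name}.json", "w")
            user_files[name].write(json.dumps(event) + "\n")
    for handle in user_files.values():
        handle.close()
```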

4.3 File Conversion


Due to the tabular, 2D matrix-style format of the data, it was decided to convert each of the cleaned user files from JSON’s unordered object format to CSV format, with each column of the data holding the same field in each row for consistency. Using the same JSON parsing approach as before, the user files were read in, the value in each key-value pair was appended to a list, and this list was output using Python’s CSV module writer() and writerow() methods, dynamically creating files based on the name key in the same way as outlined above.
Associated files: CSVScript.py
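A sketch of the conversion step; the column order here is an assumption for illustration, not necessarily the order used in CSVScript.py:

```python
import csv
import json

# Assumed column order for the cleaned data; each row gets the same fields.
FIELDS = ["name", "time", "type", "x", "y"]

def json_to_csv(json_path, csv_path):
    with open(json_path) as src, open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            if line.strip():
                obj = json.loads(line)
                # Emit values in a fixed order so every row is consistent.
                writer.writerow([obj[field] for field in FIELDS])
```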

4.4 Classification
4.4.1 By User
As outlined above, the data was categorised into separate user files which held mouse movement data
for each individual user across the college year. This was done to identify potential patterns,
relationships or interesting characteristics in the data for individual users or between users.
Associated files: InitialCleaning.py

4.4.2 In lab vs. Out of Lab


The next step was to categorise the data in a way that allowed for detailed analysis to be carried out
to achieve the objectives of the project. This led to classifying the data into two groups based on Click
Sequences made inside of the specified lab exams and Click Sequences made outside of these lab
exams. The labs took place on the 7th and 28th of November 2014, from 11am-1pm on both days. The
start and end times in UNIX form of both labs are shown in Table 4-1.
          UNIX Start Time    UNIX End Time
Lab 1     1415358000000      1415365200000
Lab 2     1417172400000      1417179600000
Table 4-1: UNIX timestamps of the lab exams
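The conversion can be checked against Table 4-1. Times here are treated as UTC, which coincides with Irish local time in November:

```python
from datetime import datetime, timezone

def to_unix_ms(dt):
    # UNIX epoch time in milliseconds, matching the log's time field.
    return int(dt.timestamp() * 1000)

# Lab exam windows, 11am-1pm on 7 and 28 November 2014.
LAB_WINDOWS = [
    (to_unix_ms(datetime(2014, 11, 7, 11, 0, tzinfo=timezone.utc)),
     to_unix_ms(datetime(2014, 11, 7, 13, 0, tzinfo=timezone.utc))),
    (to_unix_ms(datetime(2014, 11, 28, 11, 0, tzinfo=timezone.utc)),
     to_unix_ms(datetime(2014, 11, 28, 13, 0, tzinfo=timezone.utc))),
]

def in_lab(ts_ms):
    # Classify a mouse event timestamp as in-lab or out-of-lab.
    return any(start <= ts_ms <= end for start, end in LAB_WINDOWS)

print(LAB_WINDOWS[0])  # (1415358000000, 1415365200000), as in Table 4-1
```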
The timestamp of each mouse event was classified as either falling inside one of the two intervals above (in a lab) or outside of them (out of a lab), and the events were partitioned accordingly for each metric, allowing analysis to be carried out on how behaviour and stress are affected by the stress-inducing lab environment versus the calmer out of lab environment.
Associated files: CSVScript.py, inVsOutLabData.py,
Because the out of lab environment was outside the control of our test design, it must be emphasised that this environment cannot accurately be defined as ‘calm’; however, it is assumed that it is by nature at least calmer than the in lab environment. This is supported by research outlined previously relating higher stress levels to mentally intensive tasks carried out under time constraints.
To visually represent the difference in the level of mouse activity during a lab environment versus the
level of mouse activity out of a lab environment, the mouse activity for the entire week of both lab
exams was plotted on a histogram. Note the pronounced spikes in activity during both lab exams.

Figure 4-1: (Left) Mouse activity over the week of lab 1 and (Right) mouse activity over the week of lab 2.

4.5 Metrics
CSV files containing the mouse events for each individual user were stored in a directory called ‘CSVSeparated’. The files were read in using Python’s os.walk() directory tree generator, which treats the given directory as the root node of a tree and recursively walks through its sub-directories and files, returning a tuple (current_directory, directories, files) for each directory that can be used in a for-in loop to do the traversing. Python’s os.path.join() method is used to join the filepath inside the for-in loop, and the open() function is used to open each file in ‘read binary’ mode.
One user has been removed from the test data due to them having made no Click Sequences in or out
of the lab environment in the timeframe being investigated. (n*=128)
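A sketch of the traversal described above; the directory name comes from the text, while the .csv filter and helper name are added assumptions:

```python
import os

def user_csv_paths(root="CSVSeparated"):
    # os.walk yields (current_directory, directories, files) for the root and
    # each sub-directory, so a for-in loop covers the whole tree.
    paths = []
    for current_dir, _dirs, files in os.walk(root):
        for filename in files:
            if filename.endswith(".csv"):  # assumed filter for user files
                paths.append(os.path.join(current_dir, filename))
    return sorted(paths)
```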

4.5.1 Distance
Distance was calculated using NumPy’s core linear algebra tools, which treat the distance between two points as a vector and calculate its 2-norm. When a mouseDown event was encountered, the start x-y coordinates and start time were saved, and the distance between each subsequent pair of mouse events was calculated by passing the current and previous x-y coordinates to the dist() function. This continued while the measuring flag was True: until the 1450 ms time limit for a Click Sequence was reached, until another mouseDown event was encountered or until the end of the file was reached. Optimal distance was calculated as the distance between the last point and the start point. Both actual and optimal values were appended to separate NumPy arrays for analysis.
Associated files: ActualVsOptimalCalculator.py, inVsOutLabData.py, MousePathScript.py,
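A minimal sketch of the actual vs. optimal calculation with NumPy's 2-norm; the path coordinates and the dist() helper shown here are hypothetical:

```python
import numpy as np

def dist(p, q):
    # Distance between two mouse points as the 2-norm of their difference.
    return np.linalg.norm(np.asarray(q, dtype=np.float64) -
                          np.asarray(p, dtype=np.float64))

# Hypothetical Click Sequence path from initial point to mouseDown event.
path = [(0, 0), (3, 4), (6, 0)]

# Actual distance: sum of the straight-line intervals between mouse events.
actual = sum(dist(path[i], path[i + 1]) for i in range(len(path) - 1))

# Optimal distance: single Euclidean distance from first point to last.
optimal = dist(path[0], path[-1])

print(actual, optimal)  # 10.0 6.0 -> efficiency E = 0.6
```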

Figure 4-2 Distance function using NumPy linear algebra tools
4.5.2 Efficiency
Efficiency measures how close the actual mouse path distances are to the optimal Euclidean distances, calculated as optimal distance divided by actual distance, as discussed above.

Figure 4-3 (Left) Histogram of in lab Click Sequence efficiency and (Right) Histogram of out of lab Click
Sequence efficiency (sample mean, standard deviation and variance incl.)

Efficiency was also graphed with weighting factors of both speed and time which aimed to reduce the
effect that the large number of much shorter mouse paths had on the skew of the distribution towards
higher efficiency.
Efficiency in the x-y axes was calculated in the same way with distance limited specifically to the
individual x-y components of the Click Sequence.
Associated files: ActualVsOptimalCalculator.py, ttest.py, EfficiencyWithTimeGraphs.py, GraphScript.py
4.5.3 Speed
Speed of Click Sequences was calculated as a function of distance per unit time. Although the unit of
time was milliseconds, it was converted into seconds to make the plot more intuitive to read. Speed
was plotted with weighting factors including distance of Click Sequence and efficiency of Click
Sequence to investigate whether these weightings had any bearing on the spread of the data.

Figure 4-4 Histogram of in lab Click Sequence speed (Left) and out of lab Click Sequence speed (Right)

Associated files: SpeedvsOptimalScript.py, SpeedwithTimeGraph.py, SpeedvsEfficiencyScript.py

There were constraints placed on the measurement of this metric to yield a more accurate result and also to handle potential edge case errors in the data. These constraints were that the actual distance of the Click Sequence could not be 0 (i.e. the mouse had to actually have been moved) and that the time between the initial point of the Click Sequence and the final mouseDown event could not be 0.
4.5.4 Time
Time was measured for actual Click Sequences by recording the difference between the mouseDown
event time and the initial point’s time in the click sequence.
Associated files: TimeGraphs.py , InVsOutLabData.py
4.5.5 Hover time
Hover time was measured as an indication of user hesitation before executing the mouseDown event
click. This was achieved by measuring how long the user hovered the mouse over the x-y coordinates
of the mouseDown event before a mouseDown event was actually recorded.
Associated files: ClickWaitTime.py, ClickWaitTimes-relevant.py, HovertimeGraphs.py

Figure 4-5 Histogram of in lab Click Sequence hovertime (Left) and out of lab Click Sequence hovertime
(Right)
4.5.6 Overshoot
Overshoot was measured as the distance a user brought the mouse cursor past the x-y coordinates of
the mouseDown event before bringing the mouse back to that point to click. This metric was measured
to determine if the level of inaccuracy could be correlated with stress and was also compared alongside
other metrics such as speed and efficiency.
Associated files: Overshoot.py, Overshoot-relevant.py, OvershootGraph.py
4.5.7 Axis Error
Axis error was measured as a more in depth analysis of distance. Click Sequences were isolated into
their individual x and y axis components and error analysis on the Click Sequence distance was
compared between individual axes rather than as a Click Sequence whole. This aimed to account for
the difference in arm movement in the vertical and horizontal planes while using a computer mouse
and to potentially identify if one axis is a higher source of error than the other.
Associated files: In_Lab_X_Axes_Graphs.py, In_Lab_Y_Axes_Graphs.py, Out_Lab_X_Axes_Graphs.py
4.6 Hypothesis Testing
In order to identify whether or not the difference between the in lab environment and the out of lab
environment was statistically significant, Welch’s t-test was carried out on each metric for their in and
out of lab values. Due to the skewed nature of the data, transformations (logarithmic and squaring)
were considered to make the data follow a normal distribution. However, such transformations are controversial among data scientists [9], and the Student’s t-test for unequal variances (equivalent to Welch’s t-test), which was used, can handle non-normal data distributions if the sample sizes are sufficiently large (n > 15) [8].
Associated files: ttest.py

Chapter five: Evaluation
Summary
Descriptive statistics served as a strong basis for the evaluation of results, alongside the visual analysis discussed previously. These statistics, as well as the plots of spread and distribution, allowed appropriate testing methods to be identified and applied to the data to draw meaningful conclusions about the user’s stress state from each individual metric.

5.1 Solution Verification


5.1.1 Verification of Analysis Methods
A dummy user file was used throughout the duration of the project to determine the accuracy of the
code used. This was achieved by creating a user file with controlled, simplified mouse event data so
that each metric could be computed, where possible, by hand and compared to the output from the
code on the same file. This also helped to identify edge cases and sources of error in the code which
could then be accounted for in the analysis of the real user files.

Figure 5-1: Example of dummy CSV file with {name, time, y, x, mouseEvent} used to verify code
analysis of metrics

The dummy files were created to be simple enough so that the metrics being analysed could be
computed by hand such as optimal distance, actual distance, efficiency, time etc. These hand
computed values were then compared against the output of the Python code after it was run on the
same dummy file to verify its correctness. Time values were changed to reflect in and out of lab times to test the classification of both situations, and invalid mouse event descriptors and negative UNIX times were used to test error handling, for example. This method allowed for a controlled level of error testing and code accuracy verification.

5.2 Validation/Measurements
Due to the way in which the metric data was analysed, it was not treated as a dependent test with repeated measures among users, but rather as an independent test comparing independent Click Sequences in the in lab vs. out of lab environments.

5.2.1 Results
Descriptive statistics for all metrics were calculated (with 0 values masked out) as follows:

Metric                   In Lab                                            Out of Lab
                         Mean      SD       Var        Min     Max        Mean     SD       Var        Min     Max
Speed (px/s)             264.88    914.94   837129.1   0.69    258924.7   251.75   530.11   281024.2   0.11    203456.7
Efficiency               0.71      0.261    0.068      0.001   1          0.725    0.253    0.064      0.001   1
Time (ms)                1010.738  403.72   162986.5   1       1450       1066.2   382.3    146162.3   1       1450
Overshoot (px)           16.691    35.897   1288.6     2       837.6      14.255   53.566   2869.3     1       2714.2
Hovertime (ms)           181.95    180.31   32512.14   34      1443       208.88   193.45   37424.1    258     1450
X-axis REL error (px)    2.695     18.55    344.18     0.001   1452       2.125    16.52    273.015    0.001   3836
Y-axis REL error (px)    2.588     14.41    207.69     0.001   1194       2.61     14.28    203.78     0.002   1952
X-axis efficiency        0.7398    0.5234   0.2739     0.0006  1          0.7653   0.332    0.11       0.0002  1
Y-axis efficiency        0.7117    0.332    0.11       0.0008  1          0.7048   0.512    0.262      0.0005  1

Table 5-1 Table of metric descriptive statistics


The above results are considered alongside the graphs of the distribution of each metric which are
included throughout this report and in the appendix.
5.2.2 Explanation of Results
Speed has a positively skewed distribution with an average speed of 264.88 px per second; however, it is prone to outliers, shown by the maximum observed speed of 258924.7 px per second. Large outliers such as this cannot be immediately explained with confidence from the available data; however, it is possible that they can be attributed to a user rapidly shaking the mouse before mouseDown events in what previous research has described as ‘stress clicks’ [5]. Speeds in the lab environment are shown to be almost twice as spread out from the mean as those in the out of lab environment.
Efficiency is negatively skewed towards a high average efficiency. Weighting this histogram with the
distance metric to account for a high number of shorter paths did not appear to have much of an effect
on this skew. Both in and out of lab exam efficiencies appear to share the same skew with similar
means and standard deviations.
X and Y axis efficiency are similarly skewed towards high efficiency. Notably, in lab y-axis efficiency is
slightly higher on average than the out of lab y-axis mean. Extreme values for both axes efficiencies
are comparably similar.

The time duration of each Click Sequence shares a similar mean and standard deviation in both environments, although, given that time was one of the most controlled variables in the analysis, this is not hugely surprising. Time is also negatively skewed, towards longer Click Sequence times.
Overshoot and Hover time both show a marked difference in spread between environments, with both
exhibiting a positively skewed distribution towards short overshoot lengths and short hover time
durations respectively. Both of these metrics are promising as indicators of stress between
environments.
The average relative error in x-axis movements between environments was considerably larger than
the average relative error in y-axis movements. Maximum values of both metrics show that x-axis error
was more prone to large outliers in comparison to the y-axis errors. Further examination of the
implications of these values would require more extensive and specific analysis to be carried out in a
more controlled environment than was available for this project. It may also require an analysis more
similar to that used in the MouStress [5] paper with a focus on mechanical arm-hand dynamics.
5.2.3 Analysis of Results
An independent-samples 2-tailed Welch’s t-test was conducted to compare behavioural stress
indicators in lab exam and non-lab exam conditions.
Metric                   t         df       p-value (Sig)   Mean diff
Speed (px/s)             4.8675    622836   <0.000001*      13.13
Efficiency               -17.68    622836   <0.000001*      -0.015
Time (ms)                -43.85    622836   <0.000001*      -55.46
Overshoot (px)           2.9005    15839    <0.00003*       2.44
Hovertime (ms)           -46.209   617537   <0.000001*      -26.93
X-axis REL error (px)    9.6       585916   <0.000001*      0.57
Y-axis REL error (px)    -0.635    588578   0.53            -0.022
X-axis efficiency        -15.839   585916   <0.000001*      -0.026
Y-axis efficiency        5.683     588578   <0.000001*      0.0069

Table 5-2 Summary of Welch’s t-test results showing the significance of the stress indicator metrics. Star
(∗) indicates significance at α = 0.0167.
Every metric except relative error in the y-axis component of the Click Sequences differed statistically significantly between the in-lab and out-of-lab environments. The test hypotheses were:

H₀: μ_in = μ_out
H₁: μ_in ≠ μ_out

For speed, efficiency, time, overshoot, hover time, relative error in the x-axis component and efficiency in both the x and y axes, we reject the null hypothesis H₀ and accept the alternative hypothesis H₁. For relative error in the y-axis component, we fail to reject the null hypothesis.
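The statistic itself can be sketched in a few lines. The pure-NumPy helper below is a hypothetical illustration, not the project's actual code (which could equally use `scipy.stats.ttest_ind` with `equal_var=False`); it computes Welch's t and its Welch-Satterthwaite degrees of freedom:

```python
import numpy as np
from math import sqrt

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with unequal variances."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    # Per-sample variance of the mean (unbiased variance / sample size).
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    t = (a.mean() - b.mean()) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical hover-time samples (ms), for illustration only.
t, df = welch_t([300.0, 310.0, 295.0, 305.0], [330.0, 340.0, 325.0, 335.0])
```

Welch's variant was appropriate here because it does not assume equal variances between the two environments, and with sample sizes this large its degrees of freedom are effectively those reported in Table 5-2.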

Chapter six: Conclusion
Summary
The results and their implications are discussed in this chapter, along with a general discussion and
summation of the project success and progress as a whole.

6.1 Contribution to the state-of-the-art


The aim of the project was to identify a method which allowed us to compare data gathered in a stressful environment and a calmer one, using the behavioural metrics generated from this data to make inferences about the change in user stress state between these environments.
This report presents strong, statistically significant results for eight different metrics which can be used to identify stress states in computer users through analysis of their mouse movements. These metrics have been identified with statistical confidence at the 99.98% level and could serve as the basis of robust future research in this and related fields. The ability to identify changes in stress state among users in different environments, using only unobtrusive and inexpensive data logged from a mouse event handler, has widespread potential applications for subsequent research into human-computer interaction (HCI).

6.2 Results discussion


The empirical method used here to identify valid Click Sequences will likely not extend to other approaches. The nature of the mouse data meant that no robust, error-proof method could be formulated for identifying exactly when a Click Sequence with intent began. My time-limit-based identifier was an empirical choice, deemed applicable to this data set after analysis of its various attributes.
The use of these metrics, however, can be extended to many other implementations which seek to identify or explore user stress states through computer mouse movements. Once an appropriate method for isolating valid Click Sequences is identified for the data being examined, these metrics are applicable to any data set which contains mouse x-y coordinates, a timestamp for each mouse event and, if required, an individual user identifier.
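For illustration, the sketch below shows how two of these metrics might be derived from such a data set. The helper is hypothetical, not the project's actual implementation:

```python
import numpy as np

def path_metrics(xs, ys, ts):
    """Speed (px/s) and efficiency for one Click Sequence, given the
    x/y coordinates (px) and timestamps (ms) of its mouse events."""
    xs, ys, ts = (np.asarray(a, dtype=float) for a in (xs, ys, ts))
    # Total distance actually travelled along the recorded path.
    path_len = np.hypot(np.diff(xs), np.diff(ys)).sum()
    # Straight-line distance from the first event to the click.
    direct = np.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    speed = path_len / ((ts[-1] - ts[0]) / 1000.0)
    efficiency = direct / path_len   # 1.0 = perfectly straight path
    return speed, efficiency

# A 3-4-5 detour covered in one second: 7 px of travel for a 5 px target.
speed, eff = path_metrics([0, 3, 3], [0, 0, 4], [0, 500, 1000])
```

Any log with coordinates and timestamps supports this computation directly; the user identifier only matters when sequences must be grouped per participant.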
One interesting result was observed in the y-axis component of movement: its efficiency differed significantly across environments, while the difference in its relative error was not significant. Further research with a more controlled test design is required to explore this result fully.
The results show that, as was the aim of the project, mouse movement data can provide a non-invasive and low-cost method of measuring changes in user stress state across environments. Furthermore, my research has identified eight statistically robust metrics which can be effectively used to carry this out.

6.3 Project Approach


Overall, I am extremely pleased with how this project progressed from the initial research and
experimental stages, through the laborious task of processing and computing relevant metric data
from the original raw log file and on to the analysis of this data to produce meaningful and even novel
results and conclusions.

Difficulties arose throughout the project, including how to handle such a vast amount of data while maintaining its integrity during processing, how to handle edge cases such as duplicate or anomalous entries, and how to deal with metric-specific problems in the data on a case-by-case basis as they arose.
These issues were solved, and confidence in the results improved as a result, by drawing on previous research methods proven effective in this field, along with consistent and productive consultation with my project supervisor through email and regular face-to-face meetings.
Dr Patrick Murphy of University College Dublin (UCD) was consulted for his expertise in the area of
mathematics and statistics early on in the project. Having Dr Murphy’s advice at such an early stage in
the development process allowed me to identify potentially fruitful areas of statistical analysis for the
data I was working with while also identifying approaches which may not be as useful. I was then able
to tailor the identification of relevant metrics and subsequent processing of the data with this in mind.

6.4 Future Work


The approach implemented in this project is just one of several which could have been used to identify
and isolate metrics useful in measuring user stress states through computer mouse movement.
This approach compared individual Click Sequences to the entire sample of Click Sequences across both environments. The results allowed us to make inferences about how these individual Click Sequences compare to the population of Click Sequences in each environment.
Another approach might focus on comparing individual users between the in-lab and out-of-lab environments to determine whether a relationship between stress states can also be found this way. This would treat the sample as a repeated-measures design, with the same users tested in both environments. Such an approach might allow user exam scores, as well as gender, age, etc., to be included in the analysis, supporting more wide-ranging and potentially significant conclusions about the results found.
Analysis of keyboard data is another active area of HCI research which could be integrated into the above approach. Research in this area may explore metrics such as time between keystrokes, key press time and typing speed, much like the metrics employed in our research above, to explore potentially similar differences in stress states between environments.
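A minimal sketch of how such keyboard metrics might be computed follows; the event-log format here is an assumption for illustration, not any particular logger's output:

```python
import numpy as np

# Hypothetical key-event log: (timestamp_ms, event) pairs.
events = [(0, 'down'), (90, 'up'), (250, 'down'), (330, 'up'),
          (600, 'down'), (700, 'up')]

downs = [t for t, e in events if e == 'down']
ups = [t for t, e in events if e == 'up']

# Key press (dwell) time: each key-up minus its matching key-down.
press_times = [u - d for d, u in zip(downs, ups)]

# Time between keystrokes (flight time): gaps between successive key-downs.
flight_times = np.diff(downs)
```

Aggregates of these per-user distributions (means, variances, skew) could then be compared between environments with the same Welch's t-test methodology used for the mouse metrics.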
Should the above approaches produce results of similar significance to this project's, identifying differences in stress state between in-lab and out-of-lab environments could be treated as a classification problem. Machine learning techniques could then be applied to the identified metrics to train a 'stress detection' algorithm able to identify whether a user is in a stressed or non-stressed state based on their mouse and/or keyboard data.
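As a toy sketch of this classification framing, the example below trains a nearest-centroid 'stress detector' on hypothetical metric vectors; a real system would need far more data and a stronger learner (e.g. the random forests used in [12]):

```python
import numpy as np

# Hypothetical training data: one row of [speed, efficiency, hover_time]
# per Click Sequence, labelled 1 = stressed (in-lab), 0 = not stressed.
X = np.array([[320.0, 0.92, 250.0],
              [340.0, 0.90, 240.0],
              [280.0, 0.95, 310.0],
              [270.0, 0.96, 330.0]])
y = np.array([1, 1, 0, 0])

# Standardise each feature so no single metric dominates the distance.
mu, sigma = X.mean(axis=0), X.std(axis=0)
Z = (X - mu) / sigma

# One centroid per class in the standardised feature space.
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])

def predict(sample):
    """Label of the nearest class centroid for a new metric vector."""
    z = (np.asarray(sample, dtype=float) - mu) / sigma
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))
```

The significant metrics from Table 5-2 would form the feature vector; the same pipeline generalises to keyboard features or to per-user repeated-measures designs.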

The above observations explore several possible approaches and potential developments in the area of low-level learning analytics. With the emergence of augmented reality and VR headsets, and the increasing ubiquity of computers in every facet of daily life, the importance of exploring and understanding how people and computers interact cannot be overstated.

References
[1] W. U. Ali, A. R. Raheem, A. Nawaz, and K. Imamuddin, “Impact of Stress on Job Performance:
An Empirical study of the Employees of Private Sector Universities of Karachi, Pakistan,” Res. J.
Manag. Sci. Res. J. Manag. Sci, vol. 3, no. 7, pp. 2319–1171, 2014.
[2] T. Baghurst and B. C. Kelley, “An Examination of Stress in College Students Over the Course of
a Semester,” Health Promot. Pract., vol. 15, no. 3, pp. 438–447, May 2014.
[3] R. Beiter et al., “The prevalence and correlates of depression, anxiety, and stress in a sample
of college students,” J. Affect. Disord., vol. 173, pp. 90–96, 2014.
[4] N. Mucci et al., “Work-related stress assessment in a population of Italian workers. The Stress
Questionnaire,” Sci. Total Environ., vol. 502, pp. 673–679, Jan. 2015.
[5] D. Sun, J. Canny, and P. Paredes, “MouStress: Detecting Stress from Mouse Motion.”
[6] J. F. Alonso, S. Romero, M. R. Ballester, R. M. Antonijoan, and M. A. Mañanas, “Stress
assessment based on EEG univariate features and functional connectivity measures.”
[7] F. Pérez and B. E. Granger, “IPython: A System for Interactive Scientific Computing,” Comput. Sci.
Eng., vol. 9, no. 3, pp. 21–29, 2007.
[8] “Minitab - 2-Sample t-Test,” http://support.minitab.com/en-
us/minitab/17/Assistant_Two_Sample_t.pdf.
[9] C. Feng et al., “Log-transformation and its implications for data analysis.,” Shanghai Arch.
psychiatry, vol. 26, no. 2, pp. 105–9, Apr. 2014.
[10] T. Issa and P. Isaias, “Human Computer Interaction and Usability in the New Participative
Methodology for Marketing Websites,” Issa Isaias Pacific Asia J. Assoc. Inf. Syst., vol. 6, no. 3,
pp. 47–78, 2014.
[11] S. Zahoor, D. Rajput, M. Bedekar, and P. Kosamkar, “Inferring Web Page Relevancy through
Keyboard and Mouse Usage,” in 2015 International Conference on Computing Communication
Control and Automation, 2015, pp. 474–478.
[12] T. Yamauchi, “Mouse trajectories and state anxiety: Feature selection with random forest,” in
Proceedings - 2013 Humaine Association Conference on Affective Computing and Intelligent
Interaction, ACII 2013, 2013, pp. 399–404.
[13] S. Arshad, Y. Wang, and F. Chen, “Analysing mouse activity for cognitive load detection,” in
Proceedings of the 25th Australian Computer-Human Interaction Conference on
Augmentation, Application, Innovation, Collaboration - OzCHI ’13, 2013, pp. 115–118.
[14] J. Hernandez, P. Paredes, A. Roseway, and M. Czerwinski, “Under pressure,” Proc. 32nd Annu.
ACM Conf. Hum. factors Comput. Syst. - CHI ’14, no. April, pp. 51–60, 2014.
[15] Q. Guo and E. Agichtein, “Exploring mouse movements for inferring query intent,” 31st Annu.
Int. ACM SIGIR Conf. Res. Dev. Inf. Retr., pp. 707–708, 2008.
[16] U. Demšar and A. Çöltekin, “Quantifying the interactions between eye and mouse movements
on spatial visual interfaces through trajectory visualisations.”
[17] Z. Lv et al., “Touch-less Interactive Augmented Reality Game on Vision Based
Wearable Device.”
[18] N. Padmanaban, R. Konrad, T. Stramer, E. A. Cooper, and G. Wetzstein, “Optimizing virtual
reality for all users through gaze-contingent and adaptive focus displays.,” Proc. Natl. Acad.
Sci. U. S. A., vol. 114, no. 9, pp. 2183–2188, Feb. 2017.
[19] “The JSON Data Interchange Format.”
[20] “CSV, Comma Separated Values (RFC 4180).” [Online]. Available:
http://www.digitalpreservation.gov/formats/fdd/fdd000323.shtml. [Accessed: 16-Mar-2017].
[21] “General Concepts.” [Online]. Available:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_16.
[Accessed: 16-Mar-2017].
[22] “Presentations & Blog Posts — Conda documentation.” [Online]. Available:
https://conda.io/docs/index.html#. [Accessed: 16-Mar-2017].
[23] J. D. Hunter, “Matplotlib: A 2D graphics environment,” Comput. Sci. Eng., vol. 9, no. 3, pp. 90–95, 2007.
[24] M. Hibbeln, J. L. Jenkins, C. Schneider, J. S. Valacich, and M. Weinmann, “How Is Your User
Feeling? Inferring Emotion Through Human–Computer Interaction Devices,”
MIS Q., vol. 41.
