Using Machine Learning To Predict Temporal Orientation of Search Engine Queries in The Temporalia Challenge

presentation: NTCIR-11 Temporalia
Michele Filannino, Goran Nenadic
filannim@cs.man.ac.uk, g.nenadic@manchester.ac.uk
Tokyo, 11/12/2014
temporal intent of queries (TIQ)

Given a user query and its submission time, can a
system predict its temporal intent?
input: queries & submission date

output: temporal intent
PAST, RECENCY, FUTURE or ATEMPORAL
easy for people

hard for machines
TIQ: recency
Source: https://www.google.co.uk/search?q=google+stock+price
TIQ: future
Source: https://www.google.co.uk/search?q=weather+forecast+manchester
TIQ: past
Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars
TIQ: atemporal
Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars
the data
training set
<query_issue_time>May 1, 2013 GMT+0</query_is

<temporal_class>recency</temporal_class>
</query>
<query>
<id>005</id>
<query_string>i am a gummy bear</query_string>
<temporal_class>atemporal</temporal_class>
</query>
<query>
<id>006</id>
<query_string>madden 2014 release date</query
<temporal_class>future</temporal_class>
</query>
<query>
<id>007</id>
<query_string>comet coming in 2013</query_stri
<temporal_class>past</temporal_class>
</query>
<query>
<id>008</id>
<query_string>french open 2013 official</query_s
<temporal_class>future</temporal_class>
</query>
<query>
<id>009</id>
<query_string>daylight savings time 2010 united
<temporal_class>past</temporal_class>
</query>
<query>
<id>010</id>
<query_string>attack synonym</query_string>
<temporal_class>atemporal</temporal_class>
</query>
<query>
<id>011</id>
<query_string>may 2013 calendar printable</que
80 instances + 20 instances (released as preliminary test set)
test set: 300 instances
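A minimal sketch of reading queries in the XML layout shown above. The `<queries>` wrapper element, the function name `load_queries`, and the pairing of the issue time "May 1, 2013 GMT+0" with these two queries are assumptions made for illustration; only the child tag names and the query strings come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <queries> wrapper and the issue time attached
# to each query are assumptions, not taken from the released data.
SAMPLE = """
<queries>
  <query>
    <id>006</id>
    <query_string>madden 2014 release date</query_string>
    <query_issue_time>May 1, 2013 GMT+0</query_issue_time>
    <temporal_class>future</temporal_class>
  </query>
  <query>
    <id>010</id>
    <query_string>attack synonym</query_string>
    <query_issue_time>May 1, 2013 GMT+0</query_issue_time>
    <temporal_class>atemporal</temporal_class>
  </query>
</queries>
"""


def load_queries(xml_text):
    """Return one dict per <query> element, keyed by child tag name."""
    root = ET.fromstring(xml_text.strip())
    return [{child.tag: (child.text or "").strip() for child in query}
            for query in root.iter("query")]


queries = load_queries(SAMPLE)
for q in queries:
    print(q["id"], q["query_string"], "->", q["temporal_class"])
```

Each query becomes a plain dictionary, so the class label and the issue time can be fed to later attribute extraction without further parsing of the XML.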
proposed approach
data-driven rather than rule-based
low-sparsity attributes
external resources:
TempoWordNet [1], a temporal lexical KB
ManTIME, a temporal expression extraction system
NLTK
[1] G. H. Dias, M. Hasanuzzaman, S. Ferrari, and Y. Mathet. TempoWordNet for sentence time tagging. In Proceedings of the 23rd International Conference on World Wide Web Companion, pages 833-838, Republic and Canton of Geneva, Switzerland, 2014.
ManTIME
usage
an ML-based temporal expression extraction system

madden 2014 release date
madden <TIMEX3 value="2014-XX-XX" type="DATE">2014</TIMEX3> release date

drudge report 2013 september
drudge report <TIMEX3 value="2013-09-XX" type="DATE">2013 september</TIMEX3>
[1] M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge. In Proceedings of SemEval 2013, pages 53-57, Atlanta, USA, June 2013. ACL.
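A small sketch of how the annotated output in the usage examples could be consumed downstream. It does not call ManTIME; it only reads TIMEX3 tags from an already annotated string. The function names and the "NO" value are assumptions (the slides only show "YES" for the temporal-expression attribute).

```python
import re

# Matches the TIMEX3 form used in the examples, with or without quotes
# around the attribute values.
TIMEX3 = re.compile(
    r'<TIMEX3\s+value="?(?P<value>[^"\s>]+)"?\s+type="?(?P<type>[^"\s>]+)"?>'
    r'(?P<text>.*?)</TIMEX3>'
)


def temporal_expressions(annotated):
    """Return (text, type, value) for every TIMEX3 tag in the string."""
    return [(m.group("text"), m.group("type"), m.group("value"))
            for m in TIMEX3.finditer(annotated)]


def contains_temporal_expression(annotated):
    """'YES' if at least one TIMEX3 tag is present, else 'NO' (assumed label)."""
    return "YES" if TIMEX3.search(annotated) else "NO"


annotated = ('drudge report <TIMEX3 value="2013-09-XX" type="DATE">'
             '2013 september</TIMEX3>')
print(temporal_expressions(annotated))
print(contains_temporal_expression("attack synonym"))
```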
trigger classes
Feature selection: RELIEF algorithm
BOW representation
4 dictionaries (1 per class); sample triggers:
PAST: ancient, days, death, did, history, last, months
RECENCY: actual, cost, costs, current, daily, day, direction
FUTURE: agenda, calendar, chance, coming, dates, forecast, forthcoming
ATEMPORAL: chords, lyrics
Dictionary sizes as listed on the slide: 21, 44, 2 and 27 triggers
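A hedged sketch of how the trigger dictionaries might be turned into a "most frequent trigger class" attribute. Only the sample trigger words shown on the slide are used, grouped in the order in which they appear under the four class names; whitespace tokenisation, the function names, and returning `None` when no trigger occurs are assumptions.

```python
from collections import Counter

# Sample trigger words from the slide; the full dictionaries are larger.
TRIGGERS = {
    "PAST": {"ancient", "days", "death", "did", "history", "last", "months"},
    "RECENCY": {"actual", "cost", "costs", "current", "daily", "day",
                "direction"},
    "FUTURE": {"agenda", "calendar", "chance", "coming", "dates", "forecast",
               "forthcoming"},
    "ATEMPORAL": {"chords", "lyrics"},
}


def trigger_classes(query):
    """Class of every trigger found in the query, in token order."""
    found = []
    for token in query.lower().split():
        for label, words in TRIGGERS.items():
            if token in words:
                found.append(label)
    return found


def most_frequent_trigger_class(query):
    """Most frequent trigger class, or None when no trigger occurs."""
    found = trigger_classes(query)
    return Counter(found).most_common(1)[0][0] if found else None


print(most_frequent_trigger_class("weather forecast manchester"))
print(trigger_classes("what was I thinking lyrics"))
```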
attributes
# | Attribute description | Sparsity | Example input (query/time) | Attribute value
1 | Is it a Wikipedia page title? | 2 | New York Times | YES
2 | Does it contain a temporal expression? | 2 | june 2013 movies | YES
3 | Submission's term | 3 | Feb 28, 2013 GMT+0 | B
4 | Submission's trimester | 4 | Aug 26, 2013 GMT+0 | M2
5 | Timing | 4 | Movies 2012, Feb 28, 2013 | past
6 | Most frequent trigger class | 5 | peso dollar exchange rate | present
7 | Wh type | 5 | how did hitler die | how
8 | Most frequent TempoWordNet class | 5 | current stock prices | present
9 | Most frequent POS tag tense | 7 | what is stop kony 2012 | VBZ
10 | Most frequent coarse-grained POS tag | 8 | kony 2012 fake | N
11 | Trigger classes footprint | 11 | what was I thinking lyrics | past-atemporal
12 | Temporal distance between submission and query | 16 | fathers day 2010, Feb 28, 2013 | 36.0
13 | Tenses footprint | 18 | when does fall start | VBZ-VB
14 | Ordered TempoWordNet classes | 18 | the last song | past-future-present
15 | Most frequent fine-grained POS tag | 21 | kony 2012 fake | NN
16 | Coarse-grained POS tag ordered footprint | 119 | when is labour day | N-W-V
17 | Fine-grained POS tag ordered footprint | 202 | when is labour day | NN-WRB-VBZ
18 | Coarse-grained POS tag footprint | 204 | when is labour day | W-V-N-N
19 | Fine-grained POS tag footprint | 265 | when is labour day | WRB-VBZ-NN-NN
Sparsity is measured on the full data set: training + test
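The four POS-tag footprint attributes can be illustrated with the "when is labour day" example. This is a sketch under assumptions: the fine-grained tags are taken as given (no tagger is run), the coarse-grained tag is assumed to be the first letter of the fine-grained one, and the "ordered" footprint is assumed to list distinct tags by decreasing frequency with ties kept in first-seen order. Those assumptions reproduce the four example values in the table, but the slides do not state the exact definitions.

```python
from collections import Counter


def coarse(tag):
    """Coarse-grained tag: first letter of the fine-grained tag (assumption)."""
    return tag[0]


def footprint(tags):
    """Tags joined in query order, e.g. W-V-N-N."""
    return "-".join(tags)


def ordered_footprint(tags):
    """Distinct tags by decreasing frequency; ties keep first-seen order."""
    return "-".join(tag for tag, _ in Counter(tags).most_common())


def most_frequent(tags):
    """Single most frequent tag."""
    return Counter(tags).most_common(1)[0][0]


# Fine-grained tags for "when is labour day", as given in the table.
fine = ["WRB", "VBZ", "NN", "NN"]
coarse_tags = [coarse(t) for t in fine]

print(footprint(fine), footprint(coarse_tags))
print(ordered_footprint(fine), ordered_footprint(coarse_tags))
```

The ordered variants collapse repeated tags, which is consistent with their lower sparsity (119 and 202 distinct values) compared with the plain footprints (204 and 265).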
run 1: minimal
classifier: SVM with polynomial kernel*
* default parameters (C and gamma)
[attribute table (#, attribute, sparsity) as on the "attributes" slide; attribute labels surviving extraction: Submission's term, Timing, Wh type, Tenses footprint]
run 2: intermediate
classifier: SVM with polynomial kernel*
* default parameters (C and gamma)
[attribute table (#, attribute, sparsity) as on the "attributes" slide; attribute labels surviving extraction: Submission's term, Timing, Wh type, Temporal between submission and, Tenses footprint]
run 3: full
classifier: Random Forests
1000 random trees
[attribute table (#, attribute, sparsity) as on the "attributes" slide; attribute labels surviving extraction: Submission's term, Timing, Wh type, Temporal between submission and, Tenses footprint]
results (submitted runs)
[bar chart: accuracy (%) for the Minimal, Intermediate and Full runs and the 1st ranked system; values shown: 55.00%, 61.33%, 66.33%]
results: 5 x 10-fold cross-validation
[bar chart: accuracy (%) for the Minimal, Intermediate and Full runs and the 1st ranked system; values shown: 73.18%, 77.95%, 78.33%]
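A sketch of the evaluation protocol, reading "5 x 10" as five repetitions of 10-fold cross-validation. The splitting scheme, the seed, and the `train_and_predict` callback (a stand-in for any classifier) are assumptions; the code does not reproduce the reported accuracies.

```python
import random


def repeated_kfold(n_items, k=10, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeats x k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)  # a fresh random partition for every repetition
        folds = [order[i::k] for i in range(k)]
        for f, test in enumerate(folds):
            train = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, f, train, test


def cross_validated_accuracy(labels, train_and_predict, k=10, repeats=5):
    """Mean accuracy over all folds of all repetitions.

    train_and_predict(train_idx, test_idx) must return one predicted label
    per test index; it is a hypothetical hook for the classifier under test.
    """
    scores = []
    for _, _, train, test in repeated_kfold(len(labels), k, repeats):
        predicted = train_and_predict(train, test)
        correct = sum(p == labels[i] for p, i in zip(predicted, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Toy usage: 100 labels and a baseline that always predicts "past".
labels = ["past", "future"] * 50
always_past = lambda train, test: ["past"] * len(test)
print(cross_validated_accuracy(labels, always_past))
```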
a posteriori fix
[bar chart: accuracy (%) for the Minimal, Intermediate, Full and "Minimal fixed" runs; values shown: 55.00%, 61.33%, 66.33%, 72.33%]
best combination of attributes
how to reach the peak?

confusion matrix (minimal run)
[confusion matrix over the classes Recency, Past, Future and Atemporal, with columns headed "Classified as"; cell values surviving extraction: 43, 21, 11, 60, 38, 35, 61]
difficult queries
iPhone 5 release date
it can be FUTURE or PAST according to the submission time
keywords don't help here
2061: Odyssey Three
keywords can lie!
season 2 dexter
use of external sources of knowledge
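The first example can be made concrete: once a date is known for the event, the class depends only on the submission time. The sketch below compares a TIMEX3-style date value with the submission date. The function name, the handling of unspecified fields ("XX"), and the returned labels are assumptions; the iPhone 5 release date (21 September 2012) is supplied by hand, standing in for an external source of knowledge.

```python
import re
from datetime import date


def timing(timex_value, submitted):
    """Compare a date value such as '2014-XX-XX' with the submission date.

    Fields given as 'XX' (or missing) are ignored, so the comparison is
    made at year, month or day granularity. Returns 'past', 'present' or
    'future', or None when the value is not a date.
    """
    match = re.match(r"(\d{4})(?:-(\d{2}|XX))?(?:-(\d{2}|XX))?$", timex_value)
    if not match:
        return None
    year, month, day = match.groups()
    expressed, reference = [int(year)], [submitted.year]
    if month and month != "XX":
        expressed.append(int(month))
        reference.append(submitted.month)
        if day and day != "XX":
            expressed.append(int(day))
            reference.append(submitted.day)
    if expressed < reference:
        return "past"
    if expressed > reference:
        return "future"
    return "present"


# Same query, two submission times (release date entered by hand).
print(timing("2012-09-21", date(2012, 5, 1)))
print(timing("2012-09-21", date(2013, 5, 1)))
```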
difficult queries
iPhone 5 release date
it can be FUTURE or PAST
keywords don't help here
Ventura Stern 2016
keywords could possibly lie
season 2 dexter
use of external sources of knowledge
temporal footprint
a continuous period on the time-line that temporally defines the existence of a particular concept.
Source: Filannino, M., and Nenadic, G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text (COLING 2014), Dublin, Ireland, August 2014. ACL.
online material
Source: http://www.cs.man.ac.uk/~filannim/projects/temporalia/
Thank you
QUESTIONS
Contact: filannim@cs.man.ac.uk
Machine Learning, Natural Language Processing, Statistics, Semi-structured data, Text Mining, Linguistics, Parallel computing
the task
Temporal aspects of events provide a natural
mechanism for organising information
source: written texts

goal: a (machine-understandable)
temporal representation of the texts
easy for people

hard for machines
linguistic key concepts

temporal expressions: phrases denoting a temporal entity such as an interval or a time point
01/05/2014, March 15, the next week, Saturday, at that time, yesterday, 5 o'clock, 3 days, every 4 hours
events: phrases denoting eventualities and states
inflected verbs and nouns: spoken, deliver, will be published
links: temporal relations between two phrases
BEFORE, AFTER, INCLUDES, ENDS, DURING, BEGINS
Source: ISO-TimeML (ISO/TC37/SC 4 N412), rev. 12, 2007
example
Yesterday, Deutsche Bank released a note saying

that China's current economic policies would result in an
enormous surge in coal consumption over the next
decade.
Source: CNN news article published on 28th February 2010.
example: temporal expressions
Yesterday(T), Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade(T).
Yesterday: value: 2010-02-27, type: DATE
the next decade: value: P10Y, type: DURATION
example: events
Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).
released: class: OCCURRENCE
saying: class: REPORTING
result: class: OCCURRENCE
surge: class: OCCURRENCE
example: links
Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).
links shown on the slide: is included (three times), after (once)
example: ISO-TimeML output

<TimeML xsi:noNamespaceSchemaLocation="http://timeml.org/timeMLdocs/TimeML_1.2.1.xsd">
<DOCID>nyt_20100228_china_pollution</DOCID>
<DCT><TIMEX3 functionInDocument="CREATION_TIME" tid="t0" type="DATE"
value="2010-02-28">2010-02-28</TIMEX3></DCT>
<TITLE>As Pollution Worsens in China, Solutions Succumb to Infighting</TITLE>
<TEXT>
<TIMEX3 tid="t1" type="DATE" value="2010-02-27">Yesterday</TIMEX3>, Deutsche Bank <EVENT
class="OCCURRENCE" eid="e1">released</EVENT> a note <EVENT class="REPORTING" eid="e2">saying</EVENT>
that China's <TIMEX3 tid="t2" type="DATE" value="PRESENT_REF">current</TIMEX3> economic policies
would <EVENT class="OCCURRENCE" eid="e3">result</EVENT> in an enormous <EVENT class="OCCURRENCE"
eid="e4">surge</EVENT> in coal <EVENT class="OCCURRENCE" eid="e5">consumption</EVENT> over <TIMEX3
tid="t3" type="DURATION" value="P10Y">the next decade</TIMEX3>.
</TEXT>
<TLINK eventInstanceID="ei1" lid="l52" relType="IS_INCLUDED" relatedToTime="t1"/>
<TLINK eventInstanceID="ei4" lid="l53" relType="IS_INCLUDED" relatedToTime="t2"/>
<TLINK eventInstanceID="ei2" lid="l54" relType="IS_INCLUDED" relatedToTime="t1"/>
<TLINK eventInstanceID="ei4" lid="l59" relType="AFTER" relatedToEventInstance="ei1"/>
</TimeML>
visual representation
[timeline figure with labels: released, saying; 27 Feb. 2010; now; surge; 2020]
Utterance time: 28th February 2010.
TempEval-3 results
Research group | Identification Prec. | Identification Rec. | Identification F1 | Normalisation accuracy | Overall score
The University of Heidelberg | 0.93 | 0.88 | 0.9 | 0.86 | 0.776
US Naval Academy | 0.89 | 0.91 | 0.9 | 0.79 | 0.71
The University of Manchester | 0.95 | 0.85 | 0.9 | 0.77 | 0.69
Stanford University | 0.89 | 0.91 | 0.9 | 0.75 | 0.674
AT&T Lab Research | 0.98 | 0.75 | 0.85 | 0.77 | 0.656
University of Colorado Boulder | 0.94 | 0.87 | 0.9 | 0.72 | 0.647
Jadavpur University | 0.93 | 0.8 | 0.86 | 0.74 | 0.638
Katholieke Universiteit Leuven | 0.93 | 0.76 | 0.84 | 0.75 | 0.63
Joint Research Centre European Commission | 0.9 | 0.8 | 0.85 | 0.68 | 0.582
Legend: rule-based, machine learning-based
model selection
93 features, 4 models:
M1: morpho-lexical only
M2: morpho-lexical + syntactic
M3: morpho-lexical + gazetteers
M4: morpho-lexical + gazetteers + WordNet
Source: Filannino, M., and Nenadic, G. ManTIME: Temporal expression extraction with systematic feature type selection and a posteriori label adjustment. Journal of Information Processing and Management: Special Issue on Time and Information Retrieval (2014), Elsevier. (under review)
*5x10-fold cross validation
Better software, better research

temporal footprint
A temporal footprint is a continuous period on the time-line that temporally defines the existence of a particular concept.
Source: Filannino, M., and Nenadic, G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text (COLING 2014), Dublin, Ireland, August 2014. ACL.
evaluation
subjects: people
lived from 1000 AD to 2014
text from Wikipedia web pages
year of birth and death from DBpedia
228,824 people collected

simple definition of temporal footprint
birth and death dates

results
Galileo Galilei (1564-1642), prediction: 1556-1654
Error: 0.204
results
Computer (1940-today), prediction: 1882-1982
Source: http://www.cs.man.ac.uk/~filannim/projects/temporal_footprints/
application?
Source: http://start.csail.mit.edu/answer.php?query=
i2b2 shared task 2012

ADMISSION DATE: 2011-02-06;
DISCHARGE DATE: 2011-02-08;
HISTORY OF PRESENT ILLNESS: Mr. Pohl is a 53 - year-old male with history of alcohol use
and hypertension. Blood alcohol level was 383. Agitated in emergency room requiring 4
leather restraints, received 5 mg of Haldol, 2 mg of Ativan. He became hypotensive in the
emergency room with a systolic blood pressure in the 80 's and had decreased respiratory
rate. He received a normal saline bolus of 2 litres of good blood pressure response. The
patient was then admitted to the medical Intensive Care Unit for observation and then
transferred to our service on medicine when the blood pressures remained stable
overnight...
[timeline figure: dates 06/02/2011, 07/02/2011, 08/02/2011; tracks General, Tests, Treatments, Problems; labels: admission, transfer, discharge, BAL 383, SBP ~80, SBP stable, Haldol 4mg, Ativan 2mg, Saline bolus 2l, blood pressure medications, hypotensive, decreased respiratory rate, stable, hands tremor, improved]
Source: Kovačević, A., Dehghan, A., Filannino, M., Keane, J. A., and Nenadic, G. Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. Journal of the American Medical Informatics Association (2013).
clinical data
disease progression modelling
analysis of the effectiveness of treatments
extraction of patients' clinical pathway
1st year backup
identification techniques
[timeline figure, 2000-2013]
ACE-2004 dev & eval (TERN2004 corpus)
TempEval Task#15 (in SemEval07)
TempEval-2 Task#13 (in SemEval10)
TempEval-3 Task#1 (in SemEval13)
TimeML (standard)
TimeBank (corpus)
Hand grammar approach (rule-based)
SVM (machine learning)
Conditional Random Fields (machine learning)
Maximum Entropy Class. (machine learning)
Markov logic network (machine learning)
scientific interest
[bar chart over the years 2000-2011 for the query "temporal expressions AND clinical"]
Source: Google Scholar (27/02/2012)
conferences & journals

SemEval: Evaluation Exercises on Semantic Evaluation
TempEval: Temporal Evaluation Task
TIME: Time International Symposium Series

JAMIA: Journal of the American Medical Informatics Association
COLING: International Conference on Computational Linguistics
IJHI: International Journal of Health Informatics
ISO-TimeML
DATE: [YYYY-MM-DD]
TIME: [date]T[hh:mm:ss]
DURATION: P[[n][Y/M/D/w/h/m/s]]
SET: R[n][set]
temporal forms
time or date references: 11pm, February 14th
time references that anchor on another time: one hour after midnight
durations: two days, five years
recurring times: twice in the hour
context-dependent times: today, last year
vague references: the near future
times indicated by an event: the day after Silvio Berlusconi resigned
J. Poveda, M. Surdeanu, and J. Turmo, An analysis of Bootstrapping for the Recognition of Temporal Expressions, 2009
temporal binding
fully-qualified: no reference to any other temporal entity
March 15, 2001
deictic: reference to the time of utterance
today, yesterday, three weeks ago, last Thursday
anaphoric: reference to a temporal expression previously evoked in the text
March 15, the next week, Saturday, at that time
D. Ahn, S. Fissaha Adafre, and M. de Rijke, Towards Task-Based Temporal Extraction and Recognition, 2005
NorMA architecture
design, implement and evaluate a novel:
identification architecture
normalisation architecture
investigate the difference between general and clinical domain
investigate the use of the proposed frameworks to the general domain
suggest a more temporally-aware error measure for the normalisation phase
clinical NorMA architecture
[architecture diagram]
clinical NorMA pipeline
[pipeline diagram]
example of clinical rule

pattern = re.findall(r'^(?:the |her |his |their )?([0-9][0-9])(?:st|nd|rd |th)'
                     r'(?:post|post-|day)? ?(?:pod| operative |op| hospital |hsp|day|hd)(?:ly)?'
                     r'(?:day|night|afternoon)?$', raw_expression)
if pattern:
    value = add_date(reference_date, int(pattern[0]))
    # returned tuple: temporal expression, type, ISO-8601 representation (value), rule name
    return expression, 'DATE', value, 'postoperative_literals3'
Rules activation distribution
[chart]
example: raw text

Admission Date :
02/01/2002
Discharge Date :
02/08/2002
HISTORY OF PRESENT ILLNESS :
Saujule Study is a 77-year-old woman with a history of obesity and
hypertension who presents with increased shortness of breath x 5
days. Her shortness of breath has been progressive over the last
2-3 years. On admission , she was diuresed with Lasix and was
negative 1-2 liters per day for several days.
Source: i2b2 2012 clinical corpus
example: identification
Admission Date :
02/01/2002
Discharge Date :
02/08/2002
HISTORY OF PRESENT ILLNESS :
Saujule Study is a 77-year-old woman with a history of obesity and
hypertension who presents with increased shortness of breath x 5
days. Her shortness of breath has been progressive over the last
2-3 years. On admission , she was diuresed with Lasix and was
negative 1-2 liters per day for several days.
example: normalisation
<TIMEX3 type="DATE" val="2002-02-01" mod="NA">02/01/2002</TIMEX3>

<TIMEX3 type="DATE" val="2002-02-08" mod="NA">02/08/2002</TIMEX3>
<TIMEX3 type="DURATION" val="P5D" mod="NA">5 days</TIMEX3>
<TIMEX3 type="DURATION" val="P2.5Y" mod="APPROX">2-3 years</TIMEX3>
<TIMEX3 type="DURATION" val="P3D" mod="APPROX">several days</TIMEX3>
2nd year backup
ml-driven identification phase

Conditional Random Fields
Features: harvested from the literature
Tagging scheme: BIO (beginning, inside, outside)
Factor graph:
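A short sketch of the BIO tagging scheme named above, using a fragment of the clinical example ("shortness of breath x 5 days", where "5 days" is a temporal expression). The function names and the span convention (token indices, end exclusive) are assumptions; the treatment of an "I" that follows an "O" as the start of a new span is one possible repair policy, not necessarily the one used in the system.

```python
def bio_tags(tokens, spans):
    """Label tokens with B/I/O given (start, end) token spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def spans_from_bio(tags):
    """Inverse mapping: recover (start, end) spans from a B/I/O sequence."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))  # close the span before a new B
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


tokens = ["shortness", "of", "breath", "x", "5", "days"]
tags = bio_tags(tokens, [(4, 6)])
print(list(zip(tokens, tags)))
```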
factor graph
... was | discovered | in | 1977 | , | Feynman | immediately ...
w-3 | w-2 | w-1 | w0 | w+1 | w+2 | w+3
Source: Richard P. Feynman's page
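The positions w-3 to w+3 can be read as a context window around the current token w0. The sketch below builds such a window as a feature dictionary; the function name, the padding token and the dictionary layout are assumptions, and the actual factor graph of the CRF may connect these positions differently.

```python
def window_features(tokens, index, size=3, pad="<PAD>"):
    """Map offsets w-3 ... w+3 to the tokens around position `index`."""
    features = {}
    for offset in range(-size, size + 1):
        position = index + offset
        inside = 0 <= position < len(tokens)
        name = "w0" if offset == 0 else f"w{offset:+d}"
        features[name] = tokens[position] if inside else pad
    return features


tokens = ["was", "discovered", "in", "1977", ",", "Feynman", "immediately"]
print(window_features(tokens, 3))
```

Positions that fall outside the sentence receive the padding token, so every token yields the same set of seven feature names.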
unique values per feature
[bar chart; axis ticks from 1200 to 12000; one bar per feature:]
phon_length
phon_last_phoneme
phon_form
phon_first_phoneme
lex_pnp
lex_is_space
lex_chunk
temp_year
temp_weekday
temp_time
temp_temporal_prepositions
temp_temporal_conjunctives
temp_temporal_co-reference
temp_temporal_adverbs
temp_temporal_adjectives
temp_signal
temp_season
temp_present_ref
temp_pod
temp_period
temp_past_ref
temp_ordinal
temp_number
temp_month
temp_modifier
temp_literal_number
temp_fuzzy_quantifier
temp_future_ref
temp_festivity
temp_digit
temp_compound
temp_cardinal
lex_unusual
lex_last_s
lex_is_upper
lex_is_title
lex_is_numeric
lex_is_lower
lex_is_digit
lex_is_decimal
lex_is_alpha
lex_is_alnum
lex_is_all_digits_and_dots
lex_is_all_caps_and_dots
lex_has_symbols
lex_has_digit
lex_first_upper
gaz_stopword
gaz_male_names
gaz_festivities
gaz_female_names
TIMEX3 (class)
lex_polarity
gaz_uscities
gaz_nationalities
gaz_iso_countries
gaz_countries
lex_tense
lex_token_with_no_letters_and_numbers
lex_treetagger_pos
lex_pattern
lex_vocal_pattern
lex_extended_pattern
lex_token_with_no_letters
lex_suffix
lex_lancaster_stem
lex_prefix
lex_treetagger_lemma
lex_porter_stem
lex_lemma
_word_preprocessed
_word
Post-processing analysis
Temporal
Galileo Galilei
(1564-1642)
ManTIME
wikipedia pages
using dates only
Dante
(1265-1321)
gaussian fit
to be improved
ManTIME architecture
feature type selection

93 features
morpho-lexical, syntactic, gazetteers and WordNet
4 models

M4: morpho-lexical + gazetteers + WordNet
model selection
model selection result

That means Unisys must pay about $100 million in interest every
quarter, on top of $27 million in dividends on preferred stock.
Silver + Gold, 5x10-fold cross validation
TempEval-3
temporal information extraction challenge
organised every 3 years in SemEval (ACL)
Corpus | # documents | # words | annotation source | purpose
AQUAINT | 73 | 33,973 | experts | training
TimeBank | 183 | 61,418 | experts | training
TempEval-3 silver | 2,452 | 666,309 | systems | training
TempEval-3 eval | 20 | 6,375 | experts | testing
Source: TempEval-3 challenge; Corpora released in October 2012 (except the eval).
identification post-processing
Probabilistic correction module (PCM)
BIO fixer
Threshold-based label switcher (TbLS)
pipeline: CRFs, PCM, BIO fixer, TbLS, BIO fixer
Silver + Gold; 4x10-fold cross validation
TempEval-3: results (Task A)
Training data (post-processing) | Strict Prec | Strict Rec | Strict F1 | Lenient Prec | Lenient Rec | Lenient F1 | Normalisation Type | Normalisation Value | Overall score
Human&Silver (no) | 0.79 | 0.64 | 0.7 | 0.97 | 0.79 | 0.87 | 0.89 | 0.77 | 0.672
Human&Silver (yes) | 0.8 | 0.66 | 0.72 | 0.97 | 0.8 | 0.88 | 0.87 | 0.76 | 0.667
Human (no) | 0.76 | 0.64 | 0.7 | 0.95 | 0.8 | 0.87 | 0.87 | 0.77 | 0.675
Human (yes) | 0.78 | 0.63 | 0.7 | 0.97 | 0.8 | 0.87 | 0.89 | 0.77 | 0.672
Silver (no) | 0.82 | 0.66 | 0.73 | 0.98 | 0.79 | 0.88 | 0.91 | 0.78 | 0.683
Silver (yes) | 0.79 | 0.7 | 0.74 | 0.95 | 0.85 | 0.9 | 0.86 | 0.77 | 0.69
investigate semi-supervised techniques
approach the normalisation phase in a novel way
investigate the differences between general and clinical domain
investigate the use of the proposed framework to other domains
suggest a more temporally-aware error measure in the normalisation
Source: M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 53-57, Atlanta, Georgia, USA, June 2013. ACL.
TempEval-3: ranking (Task A)
System (best run only) | Strict Prec | Strict Rec | Strict F1 | Lenient Prec | Lenient Rec | Lenient F1 | Normalisation Type | Normalisation Value | Overall score
HeidelTime | 0.84 | 0.79 | 0.81 | 0.93 | 0.88 | 0.9 | 0.91 | 0.86 | 0.776
NavyTime | 0.79 | 0.8 | 0.8 | 0.89 | 0.91 | 0.9 | 0.89 | 0.79 | 0.71
ManTIME | 0.79 | 0.7 | 0.74 | 0.95 | 0.85 | 0.9 | 0.86 | 0.77 | 0.69
SUTime | 0.79 | 0.8 | 0.8 | 0.89 | 0.91 | 0.9 | 0.89 | 0.75 | 0.674
ATT | 0.91 | 0.7 | 0.79 | 0.98 | 0.75 | 0.85 | 0.91 | 0.77 | 0.656
ClearTK | 0.86 | 0.8 | 0.83 | 0.94 | 0.87 | 0.9 | 0.93 | 0.72 | 0.647
JU-CSE | 0.82 | 0.7 | 0.75 | 0.93 | 0.8 | 0.86 | 0.87 | 0.74 | 0.638
KUL | 0.77 | 0.63 | 0.69 | 0.93 | 0.76 | 0.84 | 0.89 | 0.75 | 0.63
FSS-TimEx | 0.52 | 0.46 | 0.49 | 0.9 | 0.8 | 0.85 | 0.81 | 0.68 | 0.582
Source: Naushad UzZaman, Hector Llorens, Leon Derczynski, James Allen, Marc Verhagen, and James Pustejovsky. SemEval-2013 Task 1: TempEval-3: Evaluating time expressions, events, and temporal relations. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 1-9, Atlanta, Georgia, USA, June 2013. ACL.
3rd year backup
why is it challenging?
1. Matt exercised during his lunch break.
2. He stretched, lifted weights, and ran.
3. He showered, got dressed and returned to work.
Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012
linguistic knowledge
1. Matt exercised(E) during his lunch break(E).
2. He stretched(E), lifted(E) weights, and ran(E).
3. He showered(E), got dressed(E) and returned(E) to work.
[diagram, built up over three slides, with labels: lunch break; exercised; stretch, lift, run; shower, dress, return]
common sense knowledge
[diagram, built up over three slides, with labels: lunch break; exercised; stretch, lift, run; shower; dress; return]
domain knowledge
[diagram with labels: lunch break; exercised; stretch; run; stretch; shower; dress; return; lift]
Temporal footprint
results
Robin Williams (1951 - 2014), prediction: 1953-2006
E: 0.159
other types of temporal footprint?
Christopher Columbus will die in 2057 ?!
Prediction: 1366-2057 (1451-1506), E: 0.92
physical existence vs. social coverage
Anne Frank's footprint is shifted into the future
Temporalia
data
Query | Submission date | CLASS
Movies 2012 | Feb 28, 2013 | past
Upcoming Movies in 2013 | Jan 1, 2013 | future
2013 MLB Playoff Schedule | Jan 1, 2013 | future
current price of gold | Feb 28, 2013 | present
Amazon Deal of the Day | Feb 28, 2013 | present
Number of Neck Muscles | Feb 28, 2013 | atemporal
training set: 100 queries
benchmark test set: 300 queries
attributes
[table: attribute ID and description against the submitted runs (Minimal, Intermediate, Full); attribute labels surviving extraction: Does the query contain a temporal expression?, Submission's term, Timing, Wh type, Tenses footprint]
error measure
[figure: gold interval, prediction interval, their union and their overlap]
Fatima De Carvalho. 1996. Histogrammes et indices de proximité en analyse données symboliques. Actes de l'école d'été sur l'analyse des données symboliques. LISE-CEREMADE, Université de Paris IX Dauphine, pages 101-127.
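A sketch of an error measure built from the four labels on this slide (gold, prediction, union, overlap). The formula 1 - overlap / union is an inference, not a quotation: it reproduces the three errors reported on the results slides (Galileo Galilei 0.204, Robin Williams 0.159, Christopher Columbus 0.92), but the definition in the cited work may differ in detail. The function name and the (start_year, end_year) tuple layout are assumptions.

```python
def footprint_error(gold, predicted):
    """1 - overlap / union for two (start_year, end_year) intervals.

    Both intervals are assumed to have positive length, so the union is
    never zero.
    """
    (g_start, g_end), (p_start, p_end) = gold, predicted
    overlap = max(0, min(g_end, p_end) - max(g_start, p_start))
    union = (g_end - g_start) + (p_end - p_start) - overlap
    return 1 - overlap / union


# Gold and predicted footprints quoted on the results slides.
print(round(footprint_error((1564, 1642), (1556, 1654)), 3))  # Galileo Galilei
print(round(footprint_error((1951, 2014), (1953, 2006)), 3))  # Robin Williams
print(round(footprint_error((1451, 1506), (1366, 2057)), 2))  # Christopher Columbus
```

The error is 0 for a perfect prediction and 1 when the two intervals do not overlap at all.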
