Professional Documents
Culture Documents
Overview
What is text categorization?
Why text categorization?
How to do text categorization?
Generative probabilistic models
Discriminative approaches
d1
d2
d3
c3
c2
ck
n(+)
n(+)
n(+)
dN
Classification Accuracy =
count ( y( )) count ( n( ))
kN
Per-Document Evaluation
c1
d1
d2
d3
c2
c3
Human (+)
n(+)
n(+)
n(+)
System (y)
System (n)
True Positives
False Negatives
TP
Human (-)
ck
False Positives
FP
FN
True Negatives
TN
TP
Precision
TP FP
Recall
TP
TP FN
Per-Category Evaluation
c1
d1
d2
d3
c2
c3
Human (+)
n(+)
n(+)
n(- )
System (y)
System (n)
True Positives
False Negatives
TP
Human (-)
ck
False Positives
FP
FN
True Negatives
TN
TP
Precision
TP FP
Recall
TP
TP FN
R
2 PR
F1
PR
2
2 1
1
2 1
( 2 1) P * R
1
2P R
P
P: precision
R: recall
: parameter (often set to 1)