Pavel Laskov
Fraunhofer FIRST.IDA Berlin, Germany
System complexity is our greatest enemy. New attacks can be automated and distributed very fast. Reaction time matters. Hackers are extremely industrious.
An intrusion detection system (IDS) is a hardware or software system monitoring event streams for evidence of attacks.
By operation principles
Signature-based IDS
Sensor: packet capture, defragmentation, TCP stream reassembly.
Feature extractor: application protocol parsing, event stream generation.
Signature engine: matching of data against a signature database.
Logger: storage and administration of alerts.
[Architecture diagram: Sensor, Feature extractor, Signature engine (with Signature database), Logger (with Alert database), Reactive unit]
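The signature-engine stage described above can be sketched as simple pattern matching of reassembled payloads against a signature database. The signature names, patterns, and function below are purely illustrative, not taken from any real IDS.

```python
# A minimal sketch of a signature engine: scan a reassembled payload
# against a database of byte-pattern signatures. Patterns and names
# are hypothetical examples, not real IDS rules.
SIGNATURES = {
    "cmd.exe traversal": b"/..%5c..%5ccmd.exe",
    "nop sled": b"\x90" * 16,
}

def match_signatures(payload: bytes):
    """Return the names of all signatures whose pattern occurs in payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]
```

In a real system this naive substring scan is replaced by multi-pattern matching (e.g. Aho-Corasick) because, as noted later, scanning each packet for ca. 10,000 signatures is costly.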
A flash worm may infect all vulnerable servers on the Internet within 15 minutes!
False alarm rates: after 10 false alarms per day, a system administrator will shut down an IDS. On a high-speed link, e.g. 200 MB/s, a false alarm rate of ca. 0.0001% suffices to produce 10 false alarms per day!
Performance: scanning each packet for ca. 10,000 signatures is costly.
[Slides: generalization error and empirical error R_emp]
Non-vectorial and structured data: embedding of network data in metric spaces.
Difficulty of getting labels:
Malicious noise: a need for robust learning methods.
Huge data volumes:
Sensor
Feature extractor
Detector
Detectors: anomaly detection, classification, clustering. Support infrastructure: synchronisation platform, GUI, logging facility.
Monitor
Synchronisation platform
Packet: Header
Features: raw bytes as flat features; parsed input as flat features
Embedding of sequences
An embedding function defined over the ... language:
frequency count: φ_w(x) := frequency of w in x
binary flag: φ_w(x) := 1 if w occurs in x
[Figure: token frequency histogram, e.g. for 'GET', '%%3']
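The frequency-count and binary-flag embeddings can be sketched over byte n-grams as follows; the function names and the choice n = 2 are illustrative assumptions.

```python
from collections import Counter

def embed(x: bytes, n: int = 2):
    """Frequency-count embedding: phi_w(x) = relative frequency of the
    n-gram w in x. The sparse vector is stored as a dict."""
    grams = [x[i:i + n] for i in range(len(x) - n + 1)]
    return {w: c / len(grams) for w, c in Counter(grams).items()}

def embed_binary(x: bytes, n: int = 2):
    """Binary-flag embedding: phi_w(x) = 1 if the n-gram w occurs in x."""
    return {x[i:i + n]: 1 for i in range(len(x) - n + 1)}
```

Only n-grams that actually occur are stored, which keeps the representation sparse even though the feature space over all byte n-grams is huge.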
Kernels, e.g. linear: k(x, y) = Σ_{w ∈ L} φ_w(x) φ_w(y)
Distances, e.g.
A generic similarity measure m(x, y, w) evaluated per dimension w yields reformulations of specific similarity measures m(p, q):
Kernel functions: Linear: p q
Distances: Euclidean: (p − q)²; Manhattan: |p − q|; Chebyshev: max |p − q|
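The per-dimension reformulation above can be sketched over sparse embeddings: one generic loop over the union of non-zero dimensions, parameterised by m(p, q) and a combination rule. Function names are illustrative.

```python
def similarity(x: dict, y: dict, m, combine=sum):
    """Apply a per-dimension measure m(p, q) over the union of non-zero
    dimensions of two sparse embeddings and combine the results."""
    dims = set(x) | set(y)
    return combine(m(x.get(w, 0.0), y.get(w, 0.0)) for w in dims)

def linear(x, y):
    return similarity(x, y, lambda p, q: p * q)

def euclidean2(x, y):
    return similarity(x, y, lambda p, q: (p - q) ** 2)

def manhattan(x, y):
    return similarity(x, y, lambda p, q: abs(p - q))

def chebyshev(x, y):
    return similarity(x, y, lambda p, q: abs(p - q), combine=max)
```

Because all four measures share one traversal pattern, a single data structure for indexing the sparse dimensions serves every measure.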
Run-time linear in sequence length. Efficient indexing of sparse dimensions. Low run-time and space constants.
Data structures:
Hash table (Rieck, Laskov & Müller, DAGM 2006)
Trie (Rieck, Laskov & Müller, DAGM 2006)
Generalized suffix tree (Rieck, Laskov & Sonnenburg, NIPS 2006)
64-bit array (Rieck & Laskov, submitted to JMLR)
Compacted trie (Rieck & Laskov, submitted to JMLR)
Suffix and LCP array (Rieck & Laskov, submitted to JMLR)
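In the spirit of the sorted-array representations listed above, a sparse n-gram vector can be stored as a sorted list and two vectors compared by a merge-join, so the comparison visits each entry once. This is a simplified sketch, not the data structures of the cited papers.

```python
from collections import Counter

def ngram_counts(x: bytes, n: int = 2):
    """Sorted (n-gram, count) pairs: the sparse feature vector laid out
    as a sorted array."""
    return sorted(Counter(x[i:i + n] for i in range(len(x) - n + 1)).items())

def linear_kernel(cx, cy):
    """Merge-join of two sorted n-gram lists: each entry is visited once,
    so matching is linear in the number of distinct n-grams."""
    i = j = k = 0
    while i < len(cx) and j < len(cy):
        if cx[i][0] == cy[j][0]:
            k += cx[i][1] * cy[j][1]
            i += 1
            j += 1
        elif cx[i][0] < cy[j][0]:
            i += 1
        else:
            j += 1
    return k
```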
[Figure: tries / generalized suffix tree with per-node annotations (# insertions, # leaves)]
[Scatter plots of anomaly scores: Center of mass (QSSVM) and Linkage clustering (Qoppa)]
Linkage-clustering score: R(x_i) = 1 / card(C_i)
Any element K̃_ij can be expressed at the k-th iteration using auxiliary quantities F and G:

K̃_ij^(k) = K_ij − G_i^(k) − G_j^(k) + F^(k)

G_i^(k) = ((k − 1)/k) G_i^(k−1) + (1/k) K_ik
F^(k) = ((k − 1)/k)² F^(k−1) + (2/k²) Σ_{i=1}^{k−1} K_ki + (1/k²) K_kk

[Scatter plot: QSSVM anomaly scores]
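Read this way, the recurrences incrementally center a growing kernel matrix: G_i accumulates the row mean and F the overall mean. A sketch under that assumption (symmetric kernel matrix; function name illustrative), checked against direct batch centering:

```python
import numpy as np

def centered_kernel_online(K):
    """Incrementally center a symmetric kernel matrix as examples arrive
    one by one. After k examples, G[i] = (1/k) * sum_j K[i, j] and
    F = (1/k^2) * sum_{j,l} K[j, l]; the centered element is
    K[i, j] - G[i] - G[j] + F."""
    n = K.shape[0]
    G = np.zeros(n)
    F = 0.0
    for k in range(1, n + 1):
        # Recurrences from the slide, updated with the new column k.
        G = (k - 1) / k * G + K[:, k - 1] / k
        F = (((k - 1) / k) ** 2 * F
             + 2.0 / k ** 2 * K[k - 1, :k - 1].sum()
             + K[k - 1, k - 1] / k ** 2)
    return K - G[:, None] - G[None, :] + F
```

The point of the incremental form is that adding the k-th example touches only one new row/column instead of recomputing the O(k²) sums from scratch.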
Qoppa (online linkage clustering): compare an incoming example with all previously stored examples: O(n). For those within the R-distance from an incoming example, recompute the score: O(n).
[Scatter plot: Qoppa anomaly scores]
R(x_i) = 1 / card(C_i)
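The score R(x_i) = 1 / card(C_i) can be sketched offline with single-linkage clustering at radius R: points in small clusters score high (anomalous). A quadratic-time union-find sketch, not the online algorithm itself:

```python
def linkage_scores(points, R, dist):
    """Single-linkage clustering: link two points if their distance is at
    most R, take connected components as clusters C_i, and score each
    point as 1 / card(C_i), so isolated points score highest."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i):
            if dist(points[i], points[j]) <= R:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1
    return [1.0 / sizes[find(i)] for i in range(len(points))]
```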
Zeta (online k-nearest-neighbour score): find k nearest neighbors of an incoming example: O(nk). For each previous example, compare an incoming example with the end of the heap, insert if closer: O(n log k). If the heap changed, update the score; first term: O(1), second term: O(k).
[Scatter plot: Zeta anomaly scores]
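The heap-based neighbor search can be sketched with a bounded max-heap: each stored example is compared against the heap's worst entry and inserted only if closer. For simplicity the score below is just the mean k-NN distance; the actual Zeta score combines further terms.

```python
import heapq

def knn_score(x, data, k, dist):
    """Keep the k nearest neighbors of x in a size-k max-heap (negated
    distances in Python's min-heap). Each of the n stored examples is
    compared with the heap's worst entry: O(n log k) overall. The score
    returned is the mean distance to the k nearest neighbors."""
    heap = []  # stores -distance, so heap[0] is the worst kept neighbor
    for y in data:
        d = dist(x, y)
        if len(heap) < k:
            heapq.heappush(heap, -d)
        elif d < -heap[0]:
            heapq.heapreplace(heap, -d)
    return sum(-d for d in heap) / len(heap)
```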
Evaluation Objectives
Detection accuracy on unknown attacks. Best features and algorithms. Comparison with a signature-based IDS.
Datasets
PESIM 2005: network traffic collected in a virtual environment at Fraunhofer FIRST; attacks carried out by a penetration testing expert.
DARPA 1999: the only publicly available dataset for IDS evaluation.
Experimental protocol:
Results:
Protocol   Best algorithm       AUC(0.01)
HTTP       Center-of-mass       0.781
FTP        Local density        0.746
SMTP       Linkage clustering   0.756
[ROC curves (true positive rate vs. false positive rate up to 0.01) for HTTP, FTP and SMTP traffic: best combined features, words, and Snort]
Accuracy: Zeta: 80-92% at 0% false positive rate; Snort: 35-90% at 0% false positive rate.
[ROC curves (true positive rate vs. false positive rate up to 0.01) for HTTP, FTP and SMTP traffic: best combined features, words, and Snort]
Accuracy: Zeta: 50-90% at 0% false positive rate; Snort: 30-80% at 0% false positive rate.
Machine learning is a viable tool for intrusion detection: complementary to signature analysis; efficient for detection of unknown attacks; high detection accuracy with low false positives.
Further applications: detection of compromised computers; detection and analysis of novel malware; spam detection for VoIP (SPIT); ???
NIPS 2007 Workshop on Machine Learning in Adversarial Environments for Computer Security
7 or 8 December, 2007, Whistler, British Columbia, Canada
Organizers: Pavel Laskov (Fraunhofer Institute FIRST, Germany), Richard Lippmann (MIT Lincoln Laboratory, USA)
Deadlines: extended abstract submission: October 19, 2007; notification of acceptance: November 2, 2007
Email: mls-nips07@first.fraunhofer.de
Web page: mls-nips07.first.fraunhofer.de

Computer and network security has become an important research area due to the alarming recent increase in hacker activity motivated by profit and both ideological and national conflicts. Increases in spam, botnets, viruses, malware, key loggers, software vulnerabilities, zero-day exploits and other threats contribute to growing concerns about security. In the past few years, many researchers have begun to apply machine learning techniques to these and other security problems. Security, however, is a difficult area because adversaries actively manipulate training data and vary attack techniques to defeat new systems. A main purpose of this workshop is to examine adversarial machine learning problems across different security applications to see if there are common problems, effective solutions, and theoretical results to guide future research, and to determine if machine learning can
Page 67
Thank you!
Questions?
[Diagram: Data source feeds an Anomaly detector; positive and negative examples are turned into positive and negative signatures via frequency difference]
Objectives: extract characteristic features of novel attacks; use available label information for improvement of detection accuracy.
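The frequency-difference step can be sketched as ranking n-grams by how much more frequent they are in attack (positive) payloads than in normal (negative) ones; the top entries are candidate characteristic features. Function name and parameters are illustrative.

```python
from collections import Counter

def frequency_difference(positives, negatives, n=2, top=3):
    """Rank n-grams by the difference of their relative frequencies in
    positive (attack) vs. negative (normal) payloads; the highest-ranked
    n-grams are candidate signature features."""
    def freq(samples):
        c = Counter(g for s in samples
                    for g in (s[i:i + n] for i in range(len(s) - n + 1)))
        total = sum(c.values())
        return {g: v / total for g, v in c.items()}

    fp, fn = freq(positives), freq(negatives)
    diff = {g: fp[g] - fn.get(g, 0.0) for g in fp}
    return sorted(diff, key=diff.get, reverse=True)[:top]
```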
[Byte-frequency histogram]
0x00: SSH padding during initial handshake. 0x02: beginning of a data chunk in an HTTP tunnel.
[Byte-frequency histogram]
%%35c is converted by a vulnerable IIS server to %5c (ASCII code 0x35 corresponds to 5) and finally interpreted as a backslash (ASCII code 0x5c).
[Byte-frequency histogram]
0xfffffff0: jump back into previous heap chunk. 0x9090eb18: NOP sled (2 NOPs + jump 24 bytes), forcing execution of malicious shell code.