
Using Dr. Fuzz, Dr. Memory, and Custom Dynamic Tools for Secure Development

Derek Bruening bruening@google.com


Qin Zhao zhaoqin@google.com
Outline

● Introduction

● Dr. Fuzz Tool

● Dr. Fuzz Framework

● Dr. Memory

● DynamoRIO

● Conclusion

2
Introduction

● DynamoRIO
● A dynamic binary instrumentation framework
■ Windows, Linux, Android
■ IA-32, AMD64, ARM, AArch64
● Custom tools
■ Dr. Memory, drstrace, drltrace, drcachesim, drcov, etc.
● Dr. Memory
● A memory checking tool built on top of DynamoRIO
● Dr. Fuzz
● Dr. Fuzz tool
■ Dr. Memory fuzz testing mode
● Dr. Fuzz framework
■ DynamoRIO fuzz testing extension

3
DynamoRIO

[Figure: DynamoRIO sits between the application code (blocks A, B, C, D in foo() and bar()) and the basic block cache, which holds copies of blocks A, C, and D.]

4
DynamoRIO + Client

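[Figure: the same layout with client code added between the application code and the basic block cache. The client acts at transformation time (e.g., code coverage) and at execution time (e.g., dynamic instruction count). The cache holds instrumented blocks A', C', D'.]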

5
Outline

● Introduction

● Dr. Fuzz Tool

● Dr. Fuzz Framework

● Dr. Memory

● DynamoRIO

● Conclusion

6
Dr. Fuzz Tool

● Fuzzing Test
● Generate (random) input data
● Discover coding errors and security vulnerabilities
● Dr. Fuzz Tool
● Dr. Memory Fuzz Testing Mode
■ Dr. Memory: a memory checking tool
■ In-process function-level fuzzing
■ Feedback guided fuzzing
■ Customizable input generation
● Dr. Fuzz vs LibFuzzer

7
Dr. Memory: A Memory Checking Tool

● Memory Checking Tools


● Instrument applications with error checking code
■ Compiler: Address/Memory/Thread Sanitizer
■ Dynamic Binary Instrumentation: Dr. Memory, Valgrind, Purify
● Discover more bugs than just those leading to a visible crash
● Report errors earlier than a crash
● Dr. Memory
● Detects a variety of memory errors
■ Unaddressable bugs
■ Uninitialized read
■ Double free, handle/memory leaks, etc.
● 10x slowdown for “full mode”, 3x for “light mode”

9
Dr. Memory: A Memory Checking Tool

● Demo

10
In-process Function-level Fuzzing

[Figure: Dr. Fuzz added between the application code and the basic block cache. In the cache, the target function (blocks E, F) is preceded by pre_func() and followed by post_func().]

12
In-process Function-level Fuzzing

● In-process Fuzzing
● Run init/finit code only once
● Amortizing code cache building cost
■ Building code cache is expensive in Dr. Memory
● Context switches, code instrumentation/emission, …
● Worse in testing environment: short run, little code reuse
■ Much better code reuse with Dr. Fuzz

pdfium_test on a one-page PDF file, 100 times

Configuration            Time        Slowdown
Native                   37.04s      -
Dr. Memory, Light Mode   328.69s     8.9x
Dr. Fuzz, Light Mode     75.75s      2.0x
Dr. Memory, Full Mode    1553.74s    41.9x
Dr. Fuzz, Full Mode      268.40s     7.2x
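The slowdown factors in the table above are each run time divided by the native run time. A small check of that arithmetic, using only the timings from the slide (the dictionary keys are shortened labels chosen here):

```python
# Timings copied from the slide: seconds for 100 runs of pdfium_test.
timings = {
    "Native": 37.04,
    "Dr. Memory light": 328.69,
    "Dr. Fuzz light": 75.75,
    "Dr. Memory full": 1553.74,
    "Dr. Fuzz full": 268.40,
}


def slowdown(name):
    """Slowdown relative to native execution, rounded to one decimal."""
    return round(timings[name] / timings["Native"], 1)


slowdowns = {name: slowdown(name) for name in timings if name != "Native"}
for name, factor in slowdowns.items():
    print(f"{name}: {factor}x")
```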

13
In-process Function-level Fuzzing

● Function-level Fuzzing
● Focus On The Relevant Code
● Library API Fuzzing
● Runtime Options
■ -fuzz_module
■ -fuzz_function
■ -fuzz_offset
■ -fuzz_num_iters
■ -fuzz_call_convention
■ -fuzz_num_args
■ -fuzz_data_idx
■ -fuzz_size_idx

14
In-process Function-level Fuzzing

● Demo

15
Feedback Guided Fuzzing

● Execution Feedback
● Code/branch/path coverage
● System/function calls
● Trace
● Guidance
● Quantitatively measure the quality of input data
■ Bias to better input data mutation
■ Coverage guided fuzzing (-fuzz_coverage)
● Flipping each bit of initial input data
● Add the current input as a mutator seed if it discovers new code
● Trace based analysis and symbolic execution
■ Smart input generation
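The coverage-guided loop described above (flip each bit of the input, keep an input as a new seed when it reaches new code) can be sketched in a few lines. This is a toy model under stated assumptions, not Dr. Fuzz's implementation: `target`, `bit_flips`, and `fuzz` are hypothetical names, and "coverage" is simulated by the target recording labels into a set.

```python
def target(data, coverage):
    """Toy function under test: records which 'basic blocks' executed."""
    coverage.add("entry")
    if len(data) > 0 and data[0] == ord("F"):
        coverage.add("saw_F")
        if len(data) > 1 and data[1] == ord("U"):
            coverage.add("saw_FU")


def bit_flips(data):
    """Yield every input that differs from data by exactly one bit."""
    for i in range(len(data) * 8):
        mutated = bytearray(data)
        mutated[i // 8] ^= 1 << (i % 8)
        yield bytes(mutated)


def fuzz(seed):
    """Keep any mutated input that reaches code not seen before."""
    seen = set()
    target(seed, seen)          # baseline coverage of the initial input
    corpus = [seed]
    queue = [seed]
    while queue:
        base = queue.pop(0)
        for candidate in bit_flips(base):
            cov = set()
            target(candidate, cov)
            if not cov <= seen:  # candidate discovered new code
                seen |= cov
                corpus.append(candidate)
                queue.append(candidate)
    return corpus, seen


corpus, seen = fuzz(b"GT")
print(corpus, sorted(seen))
```

Starting from `b"GT"`, no single bit flip reaches the innermost branch, but the feedback step promotes `b"FT"` to a seed, and one more flip from there reaches it.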

17
Basic Block Coverage

[Figure: a coverage client between the application code and the basic block cache. Blocks A, C, and D are copied into the cache and recorded as covered: A C D.]

18
Feedback Guided Fuzzing

● Demo

19
Customized Input Generation

● Initial Input Data


● Current value
● Specific value
● Corpus
● Mutation Algorithm
● Random/ordered
● Bit flip
● Dictionary
● Custom mutator
● Custom third-party mutator library

21
Customized Input Generation

● Demo

22
Dr. Fuzz vs LibFuzzer

● LibFuzzer
● A library for coverage-guided fuzz testing
● http://llvm.org/docs/LibFuzzer.html
                 LibFuzzer + XSan                  Dr. Fuzz
Toolchain        LLVM                              not required
Source code      write dedicated code              not required
Input corpus     required                          not required
Advantages       stack, globals                    libraries, asm, JIT
Performance      = XSan (ASan: ~2x, MSan: ~3x)     < DrM (Light: ~3x, Full: ~10x)
With coverage    adds another ~2x                  negligible overhead

24
Outline

● Introduction

● Dr. Fuzz Tool

● Dr. Fuzz Framework

● Dr. Memory

● DynamoRIO

25
Dr. Fuzz Framework

[Figure: Dr. Fuzz between the application code and the basic block cache. In the cache, the target function (blocks E, F) is preceded by pre_func() and followed by post_func().]

26
Dr. Fuzz API

● Basic API
● drfuzz_fuzz_target(func_pc, ..., pre_func, post_func)
● drfuzz_get_arg()
● drfuzz_set_arg()
● pre_func()
● Set target function’s arguments
● Set other execution context if necessary
● post_func()
● Decide whether to repeat fuzzing or continue execution
● Provide feedback to the mutator
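The division of labor between pre_func() and post_func() can be illustrated with a Python analogy. This is not the C API: `fuzz_target` below is a hypothetical stand-in for the role of drfuzz_fuzz_target, and its signature, the toy `parse` target, and the fixed `inputs` list are all assumptions made for the sketch.

```python
def fuzz_target(func, args, pre_func, post_func):
    """Call func repeatedly in-process.

    pre_func rewrites the arguments before each call;
    post_func returns True to repeat and False to continue execution.
    """
    results = []
    iteration = 0
    while True:
        call_args = list(args)          # start from the original arguments
        pre_func(iteration, call_args)  # set the target's arguments
        result = func(*call_args)
        results.append(result)
        if not post_func(iteration, result):
            break
        iteration += 1
    return results


def parse(buf, size):
    """Toy target function taking a buffer and its size."""
    return sum(buf[:size])


inputs = [b"\x01\x02", b"\x10", b"\xff\xff\xff"]


def pre_func(iteration, call_args):
    call_args[0] = inputs[iteration]
    call_args[1] = len(inputs[iteration])


def post_func(iteration, result):
    return iteration + 1 < len(inputs)  # repeat until inputs run out


results = fuzz_target(parse, [b"", 0], pre_func, post_func)
print(results)
```

The target is entered once by the application and then re-executed in place, which is what lets initialization run only once.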

27
Dr. Fuzz API

● Demo

28
Dr. Fuzz API

● Load Mutator
● drfuzz_mutator_load
● drfuzz_mutator_unload
● Mutator Library API
● drfuzz_mutator_start
● drfuzz_mutator_has_next_value
● drfuzz_mutator_get_current_value
● drfuzz_mutator_get_next_value
● drfuzz_mutator_stop
● drfuzz_mutator_feedback
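A mutator plugged in through this interface has to answer a few questions: is there another value, what is it, and what should change after feedback. The class below is a Python sketch of an ordered bit-flip mutator whose method names mirror the list above; the real library is C, and the argument lists and the feedback behavior shown here are assumptions for illustration.

```python
class BitFlipMutator:
    """Ordered bit-flip mutator; method names echo the drfuzz_mutator_* list."""

    def __init__(self, seed):           # role of drfuzz_mutator_start
        self.seed = bytes(seed)
        self.current = bytes(seed)
        self.index = 0                   # next bit to flip

    def has_next_value(self):
        return self.index < len(self.seed) * 8

    def get_current_value(self):
        return self.current

    def get_next_value(self):
        if not self.has_next_value():
            raise ValueError("mutator exhausted")
        mutated = bytearray(self.seed)
        mutated[self.index // 8] ^= 1 << (self.index % 8)
        self.index += 1
        self.current = bytes(mutated)
        return self.current

    def feedback(self, found_new_coverage):
        # Assumed policy: restart from the current value when it was useful.
        if found_new_coverage:
            self.seed = self.current
            self.index = 0


mutator = BitFlipMutator(b"\x00")
values = []
while mutator.has_next_value():
    values.append(mutator.get_next_value())
print(values)
```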

29
Dr. Fuzz API

● Demo

30
Dr. Fuzz API

● Misc API
● drfuzz_get_target_num_bbs
● drfuzz_get_target_user_data
● drfuzz_set_target_user_data
● ...

31
Outline

● Introduction

● Dr. Fuzz Tool

● Dr. Fuzz Framework

● Dr. Memory

● DynamoRIO

● Conclusion

32
Motivation: Memory Bugs

Memory bugs are challenging to detect and fix
• Memory corruption, reading uninitialized memory, memory leaks

Observable symptoms resulting from memory bugs are often delayed and non-deterministic
• Errors are difficult to discover during regular testing
• Testing usually relies on randomly happening to hit visible symptoms
• The sources of these bugs are painful and time-consuming to track down from observed crashes

Memory bugs often remain in shipped products and can show up in customer usage

33
Meet the Doctor

Detects unaddressable memory accesses
• Use-after-free
• Heap buffer/array overflow/underflow
• Wild access to invalid address
• Read beyond top of stack

Detects uninitialized memory reads
• Read before write, anywhere in memory

Detects memory leaks
• Unreachable heap objects

34
Dr. Memory

Detects invalid heap arguments
• Double free
• Allocate with new[], free with delete

Detects Windows handle leaks
• Kernel, user, GDI

Detects GDI usage errors
• Get/Release vs Create/Delete
• Thread ownership
• Delete while selected, etc.

35
Deployment

Operates purely at runtime
• No source code or object code modification required

Monitors third-party components and libraries
• More challenging to monitor with compiler or link based tools

Runs on Windows, Linux, and Android (native code)

Visual Studio integration
• External Tool output mode

36
Dr. Memory Demo

37
Implementation Strategy

Track the state of application memory using shadow memory
• Track whether allocated and whether defined

Monitor every memory-related action by the application:
• Memory allocation at kernel layer (map file, anonymous map)
• Memory allocation at library layer (new, malloc, etc.)
• Memory read or write
• Stack adjustment

At exit or on request, scan memory to check for leaks

38
Shadow Metadata

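Shadow each byte of memory + registers with 1 of 3 states: unaddressable, uninitialized, defined

• unaddressable → defined on allocate: mmap, calloc
• unaddressable → uninitialized on allocate: malloc, stack
• uninitialized → defined on write
• uninitialized or defined → unaddressable on deallocate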
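The three shadow states and their transitions can be modeled as a small state machine. The sketch below is an illustration only: it shadows each byte with a dictionary entry, which is not how a real shadow memory is laid out, and the class and method names are invented for this example.

```python
UNADDRESSABLE = "unaddressable"
UNINITIALIZED = "uninitialized"
DEFINED = "defined"


class ShadowMemory:
    """Per-byte shadow state; bytes not in the map are unaddressable."""

    def __init__(self):
        self.state = {}
        self.errors = []

    def get(self, addr):
        return self.state.get(addr, UNADDRESSABLE)

    def allocate(self, addr, size, zeroed=False):
        # calloc/mmap style memory starts defined; malloc/stack starts uninitialized.
        new_state = DEFINED if zeroed else UNINITIALIZED
        for a in range(addr, addr + size):
            self.state[a] = new_state

    def deallocate(self, addr, size):
        for a in range(addr, addr + size):
            self.state.pop(a, None)

    def write(self, addr):
        if self.get(addr) == UNADDRESSABLE:
            self.errors.append(("unaddressable write", addr))
        else:
            self.state[addr] = DEFINED

    def read(self, addr):
        if self.get(addr) == UNADDRESSABLE:
            self.errors.append(("unaddressable read", addr))
        return self.get(addr)


shadow = ShadowMemory()
shadow.allocate(0x100, 4)            # malloc(4)
shadow.write(0x100)                  # initialize the first byte
states = [shadow.get(0x100 + i) for i in range(5)]
shadow.write(0x104)                  # one byte past the end: overflow
shadow.deallocate(0x100, 4)          # free
shadow.read(0x100)                   # use-after-free
print(states, shadow.errors)
```

Reads of uninitialized bytes are deliberately not reported here; only unaddressable accesses are flagged at the access itself.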

39
Shadow Memory

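[Figure: the stack and the heap shown next to their shadow memory. In the heap, malloc headers, redzones, padding, and freed chunks are shadowed as unaddr, while the bytes of a live malloc chunk are shadowed as defined or uninit. Stack bytes are shadowed as defined, uninit, or unaddr.]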

40
The Uninitialized Whole Word Problem

Sub-word variables are moved around as whole words
• Sub-word often initialized as sub-word yet copied as whole word
• Reads involved in copying should not raise errors

[Figure: a 32-bit word across three steps: initialize 16 bits, copy 32 bits, compare 16 bits. At every step bytes 0 and 1 are init and bytes 2 and 3 are uninit.]

Solution: report errors on “meaningful” reads only
• Use in compare, conditional branch, address register, or system call

Requires propagating metadata and shadowing registers
• Shadow metadata mirrors application data flow
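A minimal sketch of this policy: copies propagate per-byte shadow bits without complaint, and only a "meaningful" use such as a compare checks them. The class, the register name, and the addresses are invented for the example; real shadow propagation happens in inserted machine code.

```python
class ShadowState:
    """Per-byte 'defined' bits for memory and for 4-byte registers."""

    def __init__(self):
        self.mem = {}       # address -> True if defined
        self.regs = {}      # register name -> list of 4 per-byte bits
        self.errors = []

    def store_immediate(self, addr, nbytes):
        for i in range(nbytes):
            self.mem[addr + i] = True

    def load(self, reg, addr):
        # Copying is not a meaningful read: propagate bits, raise nothing.
        self.regs[reg] = [self.mem.get(addr + i, False) for i in range(4)]

    def store(self, reg, addr):
        for i, bit in enumerate(self.regs[reg]):
            self.mem[addr + i] = bit

    def compare(self, reg, nbytes):
        # A compare is a meaningful read: check only the bytes it uses.
        if not all(self.regs[reg][:nbytes]):
            self.errors.append(f"uninitialized read in {nbytes}-byte compare of {reg}")


s = ShadowState()
s.store_immediate(0x100, 2)   # initialize 16 bits
s.load("eax", 0x100)          # copy 32 bits into a register
s.store("eax", 0x200)         # copy 32 bits back to memory
s.compare("eax", 2)           # compare 16 bits: fine
errors_after_16bit = list(s.errors)
s.compare("eax", 4)           # compare 32 bits: uses uninitialized bytes
print(errors_after_16bit, s.errors)
```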

41
Memory Leaks

Dr. Memory uses reachability-based leak detection
• A leak is memory that is no longer reachable by the application
• Memory that is never freed is not considered a leak
▪ Acceptable to not free memory whose lifetime matches process lifetime

At exit time, or on request, perform leak analysis
• Similar to mark-and-sweep garbage collection

Dr. Memory divides all allocated memory into categories based on how it can be reached by live application pointers
• Any pointer-aligned and initialized pointer-sized word is considered a potential pointer

42
Memory Leak Categories

[Figure: reachable = not a leak. An aligned, initialized pointer refers to the start of the application data in a chunk laid out as: malloc header, requested size for application data, malloc padding.]

[Figure: not reachable, no way to free = a leak. No pointer is found to any part of the data in a chunk with the same layout.]

43
Possibly Reachable Memory

[Figure: an aligned, initialized pointer refers to the middle of the application data in a chunk laid out as: malloc header, requested size for application data, malloc padding.]

Memory that is reachable only via pointers to the middle of the allocation, rather than the head
• These may or may not be legitimate pointers to that allocation
• A "possible leak"
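The three categories (reachable, possible leak, leak) fall out of a mark phase that distinguishes head pointers from mid-allocation pointers. The sketch below is a simplification under stated assumptions: `classify` is a hypothetical helper, the input words are assumed to be already filtered to aligned and initialized values, and the contents of possibly-reachable blocks are not scanned further.

```python
def classify(blocks, roots):
    """blocks: {base: (size, [pointer-sized words stored in the block])}
    roots: pointer-sized words found in registers, stacks, and globals."""

    def find(word):
        for base, (size, _) in blocks.items():
            if base <= word < base + size:
                return base, word == base
        return None

    reachable, mid_targets = set(), []
    work = list(roots)
    while work:                          # mark phase
        hit = find(work.pop())
        if hit is None:
            continue
        base, is_head = hit
        if is_head:
            if base not in reachable:
                reachable.add(base)
                work.extend(blocks[base][1])   # scan the block's contents
        else:
            mid_targets.append(base)

    # Blocks seen only through mid-allocation pointers are possible leaks.
    possible = {base for base in mid_targets if base not in reachable}
    leaks = set(blocks) - reachable - possible
    return reachable, possible, leaks


blocks = {
    0x1000: (16, [0x2000]),   # pointed to by a root, points to 0x2000
    0x2000: (16, []),
    0x3000: (32, []),         # only a pointer into its middle exists
    0x4000: (16, [0x4000]),   # points to itself, but nothing else does
}
reachable, possible, leaks = classify(blocks, roots=[0x1000, 0x3008])
print(reachable, possible, leaks)
```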

44
Layered Heap Routines

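[Figure: layered heap routines beneath the application, from top to bottom: C++ new, C library malloc, Windows API HeapAlloc, Native API RtlAllocateHeap. The layers manage separate heaps, labeled Heap 1, Heap 2, and Heap 3.]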

45
Monitoring on Windows

System calls that allocate memory
• NtMapViewOfSection, NtUnmapViewOfSection, NtAllocateVirtualMemory, NtFreeVirtualMemory, NtMapCMFModule, NtGdiCreateDIBSection

Kernel actions that adjust the stack
• Kernel-mediated control flow: callbacks, exceptions, and APCs, along with corresponding system calls (NtContinue, NtCallbackReturn) and interrupts (int 0x2b)
• Directly set thread stack pointer (NtSetContextThread)

46
Memory Reads and Writes by the Kernel

Must mark memory written by kernel as initialized
• Else will see many false positives

Must check that memory read by kernel is initialized
• Else can have false negatives

Must check that memory written by kernel is addressable
• Else can have false negatives

Thus, must know the size and shape of all in and out parameters to all system calls

47
Dr. Syscall

Database of Windows system calls
• Originally based on Nebbett and Microsoft DDK/SDK headers
• Augmented with our own analysis

Two key operations:
• Top-level parameter iteration
• Deep memory blob iteration

48
Dr. Strace

System call tracer for Windows

Built on top of Dr. Syscall

Pretty-printing all of the types is a work-in-progress
• Contact us if you would like to contribute

49
Dr. Strace

NtOpenKeyEx
arg 0: 0x001fcd0c (type=HANDLE*, size=0x4)
arg 1: 0x109 (type=unsigned int, size=0x4)
arg 2: len=0x18, root=0x3c, name=150/152 "SOFTWARE\Microsoft\Windows NT\
CurrentVersion\LanguagePack\SurrogateFallback", att=0x40,
sd=0x00000000, sqos=0x00000000 (type=OBJECT_ATTRIBUTES*, size=0x4)
arg 3: REG_OPTION_RESERVED or REG_OPTION_NON_VOLATILE
(type=named constant, value=0x0, size=0x4)
succeeded =>
arg 0: 0x001fcd0c => 0x134 (type=HANDLE*, size=0x4)
retval: 0x0 (type=NTSTATUS, size=0x4)
NtQueryKey.KeyCachedInformation
arg 0: 0x134 (type=HANDLE, size=0x4)
arg 1: 0x4 (type=named constant, size=0x4)
arg 2: 0x001fcb5c (type=*, size=0x4)
arg 3: 0xb0 (type=unsigned int, size=0x4)
arg 4: 0x001fca34 (type=unsigned int*, size=0x4)
succeeded =>
arg 2: _KEY_CACHED_INFORMATION {_LARGE_INTEGER {0x1ca043f05a7a595},
int=0x0, int=0x4, int=0x1a, int=0x1, int=0xc, int=0x18, int=0x22}
(type=*, size=0x4)
arg 4: 0x001fca34 => 0x28 (type=unsigned int*, size=0x4)
retval: 0x0 (type=NTSTATUS, size=0x4)

50
Performance Comparison

51
Light Mode

Focus on unaddressable accesses: heap underflow and overflow

Insert redzone around application malloc chunks with special value (pattern) like 0xf1fd in the redzone

[Figure: chunk layout: malloc header | pre-redzone | requested size for application data | post-redzone | malloc padding]

● Instrumentation
○ Check value before every memory access
○ If match, check whether address is in redzone
● ~3x overhead (vs ~10x full mode)
● Only mode available for 64-bit and Android
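The two-step check (compare the value against the pattern first, and consult the redzone bookkeeping only on a match) can be sketched as follows. This is a toy model: memory is a list of 16-bit slots, the redzone width `REDZONE_SLOTS` is an arbitrary assumption, and `LightHeap` is a name invented for the example.

```python
PATTERN = 0xF1FD        # pattern value named on the slide
REDZONE_SLOTS = 2       # assumed redzone width, in 16-bit slots


class LightHeap:
    """Toy heap made of 16-bit slots with pattern-filled redzones."""

    def __init__(self):
        self.memory = []
        self.redzone_slots = set()
        self.errors = []

    def _append_redzone(self):
        for _ in range(REDZONE_SLOTS):
            self.redzone_slots.add(len(self.memory))
            self.memory.append(PATTERN)

    def malloc(self, nslots):
        self._append_redzone()                # pre-redzone
        start = len(self.memory)
        self.memory.extend([0] * nslots)      # application data
        self._append_redzone()                # post-redzone
        return start

    def check_access(self, slot):
        # Fast path: most accesses do not see the pattern value.
        if self.memory[slot] == PATTERN:
            # Slow path: the value matched, so check the address itself.
            if slot in self.redzone_slots:
                self.errors.append(slot)
                return False
        return True


heap = LightHeap()
p = heap.malloc(4)
heap.memory[p] = PATTERN              # application data may equal the pattern
ok_in_bounds = heap.check_access(p)   # matches the pattern, but not a redzone
ok_overflow = heap.check_access(p + 4)
ok_underflow = heap.check_access(p - 1)
print(ok_in_bounds, ok_overflow, ok_underflow, heap.errors)
```

The slow-path address check is what keeps legitimate data that happens to equal the pattern from being reported.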

52
Outline

● Introduction

● Dr. Fuzz Tool

● Dr. Fuzz Framework

● Dr. Memory

● DynamoRIO

● Conclusion

53
Goals

Profile, monitor, or inspect application binaries as they run
• Build customized dynamic program inspectors

Support comprehensive inspection tools
• Not just sampling-based tools

Target applications that include legacy components, third-party libraries, or dynamically-generated code
• Want to inspect whole program even if cannot recompile it all

54
Reach of Toolchain Control Points

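[Figure: toolchain control points; only the "runtime inspector" label survives.]

“DynamoRIO”?!?

[Figure: origin of the name. Dynamo @HP Labs on PA-RISC (late 1990’s) and Dynamo @HP Labs on x86 (2000); RIO (Runtime Introspection and Optimization) @MIT (1999); Dynamo + RIO → DynamoRIO (2001).]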

56
DynamoRIO History

● 2001: DynamoRIO @MIT
● 2002: binary releases
● 2003: Determina security startup
● 2007: VMware acquires Determina
● 2009: open-sourced, BSD license
● 2010: Google sponsors Dr. Memory

57
DynamoRIO Tool Platform Design Goals

Efficient
• Near-native performance

Transparent
• Match native behavior

Comprehensive
• Control every instruction, in any application

Customizable
• Adapt to satisfy disparate tool needs

58
Outline

● Introduction
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO
● Efficient
● Transparent
● Comprehensive
● Customizable
● Conclusion

59
Basic Interpreter

[Figure: an interpreter fetches, decodes, and executes each instruction of the application code (blocks A, B, C in foo() and bar()).]

Slowdown: 300x
60
Improvement #1: Basic Block Cache

[Figure: DynamoRIO copies application basic blocks (A, C, D, E, F) into a basic block cache.]

Slowdown: 300x → 25x


61
Improvement #2: Linking Direct Branches

[Figure: blocks in the basic block cache (A, C, D, E, F) are linked to each other across direct branches.]

Slowdown: 300x → 25x → 3x


62
Improvement #3: Linking Indirect Branches

[Figure: indirect branches in the basic block cache go through an indirect branch lookup.]

Slowdown: 300x → 25x → 3x → 1.2x


63
Improvement #4: Trace Building

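[Figure: a trace cache is added next to the basic block cache. Sequences of blocks (A, C, D and E, F) are placed in traces; at an indirect branch a check asks whether the branch stays on the trace, with the indirect branch lookup as the alternative.]

Slowdown: 300x → 25x → 3x → 1.2x → 1.1x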
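Why the first two improvements help can be seen by counting two costs in a toy model: how often a block is translated, and how often control returns to the dispatcher. The sketch below is not DynamoRIO code; `run`, the three-block loop, and the cost model are assumptions, and the counts illustrate the trend only, not the slowdown figures on the slides.

```python
def run(program, entry, steps, use_cache=True, link_direct=True):
    """Execute `steps` blocks and count translations and dispatcher entries.

    program maps each block name to its single direct-branch successor.
    """
    cache, links = set(), set()
    translations = dispatches = 0
    prev, block = None, entry
    for _ in range(steps):
        if not (link_direct and (prev, block) in links):
            dispatches += 1                  # control returns to the dispatcher
            if block not in cache:
                translations += 1            # decode and copy the block
                if use_cache:
                    cache.add(block)
            if link_direct and use_cache and prev is not None:
                links.add((prev, block))     # patch the exit of prev to jump here
        prev, block = block, program[block]
    return translations, dispatches


loop = {"A": "C", "C": "D", "D": "A"}
interpreted = run(loop, "A", 300, use_cache=False, link_direct=False)
cached = run(loop, "A", 300, use_cache=True, link_direct=False)
linked = run(loop, "A", 300, use_cache=True, link_direct=True)
print(interpreted, cached, linked)
```

With the cache alone, translation cost is paid once per block but every block exit still enters the dispatcher; linking removes almost all of those entries for a loop of direct branches.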


64
Base Performance: SPEC 2006

65
Avoiding Intermediate Layers

[Chart: bar values 36.3, 7.2, 6.1, 5.0, 1.3, 1.0; bar labels not recoverable.]

66
No Intermediate Layer

IR mirrors underlying ISA
• Preserves optimized application code
• Intermediate layers incur significant performance impact
▪ QEMU (user-mode) 6x slower than DR, Valgrind 4x slower than DR
• This is the key to good performance

Still have an abstraction layer
• Block or trace = list of instructions
• Instruction = lists of source and destination operands
• Tool code often still cross-platform
▪ “Does this instruction read memory?”

67
Outline

● Introduction
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO
● Efficient
● Transparent
● Comprehensive
● Customizable
● Conclusion

68
Unavoidably Intrusive

[Figure: several processes, each with multiple threads, running above the operating system. Inside a process, DynamoRIO sits between the app code (blocks A to F) and the cache, with a lookup routine.]

69
Arbitrary Interleaving

[Figure: application block A calls malloc(); execution interleaves arbitrarily between application code in the caches, the indirect branch lookup, and DynamoRIO. Annotations: thread-safe, re-entrant!]
70
Separate Resources

[Figure: software layers. Linux: application → system call gateway → operating system. Windows: application → Win32 API → Win32 DLLs → system call gateway → operating system.]
71
Private Libraries

[Figure: the application uses the Win32 API and Win32 DLLs; the client and DynamoRIO use private DLLs; both sit above the system call gateway and the operating system.]

72
Dynamically Modified Code

[Figure: an application block is modified at runtime (marked X) while copies of it exist in the basic block cache and the trace cache.]

73
Code Cache Consistency

[Figure: hardware I-cache and D-cache, each holding lines A to D, on ARM and on x86. ARM sequence: store B, flush B, jump B. x86 sequence: store B, jump B.]

74
Outline

● Introduction
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO
● Efficient
● Transparent
● Comprehensive
● Customizable
● Conclusion

75
Above the Operating System

[Figure: several processes, each with multiple threads, running above the operating system. Inside a process, DynamoRIO sits between the app code (blocks A to F) and the cache, with a lookup routine.]

76
Intercepting Windows Messages
user mode kernel mode

message pending
modify save user context
shared library
memory image
dispatcher
majority of
executed

time
code in a dispatcher
typical message handler
Windows
application
no message pending
restore context

77
Outline

● Introduction
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO
● Efficient
● Transparent
● Comprehensive
● Customizable
● Conclusion

78
DynamoRIO + Client Custom Inspector

[Figure: client code added alongside DynamoRIO, between the application code and the basic block cache, trace cache, and indirect branch lookup.]

79
Primary Client Events: Code Stream

Client has opportunity to inspect and potentially modify every single application instruction, immediately before it executes

Entire application code stream
• Basic block creation event: can modify the block
• For comprehensive instrumentation tools

Or, focus on hot code only
• Trace creation event: can modify the trace
• Custom trace creation: can determine trace end condition
• For optimization and profiling tools
80
Transformation Time vs Execution Time
[Figure: client code acts at two points. At transformation time, as code is copied into the caches, it can compute properties such as average instruction length. At execution time, inserted code can gather properties such as call instruction execution count.]
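The distinction between transformation time and execution time can be shown with a toy model that uses the two examples from the slide. The client callback runs once per block when the block first enters the cache and may attach hooks that run on every execution. `ToyDBI`, the instruction tuples, and the block contents are invented for this sketch and do not use the DynamoRIO API.

```python
class ToyDBI:
    """Calls the client once per new block, then runs its hooks every time."""

    def __init__(self, on_block):
        self.on_block = on_block
        self.cache = {}

    def run(self, program, path):
        for name in path:
            if name not in self.cache:
                # Transformation time: the client sees the block exactly once.
                self.cache[name] = self.on_block(name, program[name])
            for hook in self.cache[name]:
                hook()  # Execution time: inserted code runs on every execution.


lengths = []             # static data, gathered at transformation time
counts = {"calls": 0}    # dynamic data, gathered at execution time


def count_call():
    counts["calls"] += 1


def on_block(name, instrs):
    hooks = []
    for opcode, length in instrs:
        lengths.append(length)
        if opcode == "call":
            hooks.append(count_call)   # insert a counter before each call
    return hooks


program = {
    "A": [("mov", 2), ("call", 5)],
    "C": [("add", 3)],
    "D": [("call", 5), ("ret", 1)],
}
ToyDBI(on_block).run(program, ["A", "C", "D"] * 10)
average_length = sum(lengths) / len(lengths)
print(average_length, counts["calls"])
```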

81
Secondary Client Events

Application thread creation and deletion

Application library load and unload

Application exception/signal
• Client chooses whether to deliver, suppress, bypass the app handler,
or redirect control

Application pre- and post- system call


• Client can inspect/modify call number, params, or return value

Bookkeeping: init, exit, cache management, etc.

82
DynamoRIO API: General Utilities

Safe utilities for maintaining transparency


• Separate stack, memory allocation, file I/O
• Thread-local storage, synchronization
• Create client-only thread or private itimer

Application control
• Suspend and resume all other threads

Application inspection
• Address space querying
• Module iterator
• Processor feature identification
83
DynamoRIO API: Code Manipulation

Clean calls to C or C++ code


• Automatically inlined for simple callees

Full IA-32/AMD64 instruction representation


• Includes implicit operands, decoding, encoding

State preservation
• Eflags, arith flags, floating-point state, MMX/SSE state
• Spill slots, TLS, CLS

Dynamic instrumentation
• Replace code in the code cache

84
DynamoRIO API: Extension Libraries

Extension libraries shipping with DynamoRIO


• drmgr: multi-instrumentation mediation
• drsyms: symbol table and debug info lookup
• drwrap: function wrapping and replacing
• drutil: memory address calculation, string loop expansion
• drx: multi-process management, misc utilities
• drsyscall: system call names, numbers, parameter types
• drreg: scratch register management
• umbra: shadow memory
• drcallstack: callstack walking (not yet librarified)
• drmalloc: heap allocation interception (not yet librarified)

85
DynamoRIO Client with Extensions

[Figure: client code layered on the extension libraries drx, drutil, and drmgr, alongside DynamoRIO, between the application code and the basic block cache and trace cache.]

86
drdecodelib

Internal decoder exported as an isolated library

Decoder, encoder, disassembler, and general instruction manipulation

Supports x86, AMD64, and AArch32 A32 + T32

Instruction representation includes all operands (including implicit operands) and condition code effects

87
Powerpoint Under Inspector

88
DynamoRIO versus Pin

Pin = insert callout/trampoline only


• Pin tries to inline and optimize
• Client has little control or guarantee over final performance
• Proprietary

DynamoRIO = arbitrary code stream modifications


• Callout/trampoline with inlining, plus…
• Client has full control over all inserted instrumentation
▪ Result can be significant performance difference:
PiPA memory profiler + cache simulator 25% faster with DR

• Supports ARM and (coming soon) AArch64


• Open-source
89
Outline

● Introduction
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO
● Implementation
■ Efficient
■ Transparent
■ Comprehensive
■ Customizable
● Security applications
■ Program shepherding
● Conclusion
90
Anatomy of a Memory-Based Attack

[Figure: stages of an attack arriving over the network: ENTER → CORRUPT DATA (system and application memory) → HIJACK PROGRAM COUNTER → COMPROMISE (kernel).]

91
Critical Data: Control Flow Indirection

• Subroutine calls
– Return address and activation records on visible stack
• Dynamic library linking
– Function exports and imports
• Object oriented polymorphism: dynamic dispatch
– Vtables
• Callbacks – registered function pointers
– Event dispatch, atexit
• Exception handling

Any problem in computer science can be solved with another layer of indirection.
- David Wheeler

92
Critical Data: Control Flow Exploits

• Return address overwrite


– Classic buffer overflow
• GOT overwrite
• Object pointer overwrite or uninitialized use
• Function pointer overwrite
– Heap, stack, data, PEB
• Exception handler overwrites
– SEH exploits

Any problem in computer science can be solved with another layer of indirection. But that usually will create another problem.
- David Wheeler

93
Preventing Data Corruption Is Difficult

• Stored program addresses legitimately manipulated by many different entities
– Dynamic linker, language runtime
• Intermingled with regular data
– Return addresses on stack
– Vtables in heap
• Even if could distinguish a good write from a bad write, too expensive to monitor all data writes

94
Insight: Hijack Violates Execution Model

[Figure: regions labeled Hardware Interface, Typical Application Execution Model, and Security Attack.]

95
Goal: Shrink Hardware Interface

[Figure: regions labeled Constrained Hardware Interface, Typical Application Execution Model, and Security Attack.]

96
Program Shepherding

• Monitor all control-flow transfers during program execution
– DynamoRIO is in perfect position to do this
• Validate that each transfer satisfies security policy based on execution model
– Application Binary Interface (ABI): calling convention, library invocation
• The application may be damaged by data corruption, but the system will not be compromised by hijacking control flow
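As an illustration of validating transfers against an execution model, the sketch below checks two example rules: a return must land immediately after some call instruction, and an indirect call must land on a function entry. The rule set, the `make_policy` helper, and all addresses are assumptions for this example, not the policies of any shipped product.

```python
def make_policy(call_sites, function_entries):
    """call_sites: (address, instruction length) of each known call.
    function_entries: set of known function entry addresses."""
    # A return is valid only if it lands right after some call instruction.
    after_call = {addr + length for addr, length in call_sites}

    def check(kind, target):
        if kind == "return":
            return target in after_call
        if kind == "indirect_call":
            return target in function_entries
        raise ValueError(f"unknown transfer kind: {kind}")

    return check


check = make_policy(
    call_sites=[(0x1000, 5), (0x1040, 5)],
    function_entries={0x2000, 0x3000},
)
verdicts = {
    "normal return": check("return", 0x1005),
    "return into injected buffer": check("return", 0x7FFF0000),
    "call to function entry": check("indirect_call", 0x2000),
    "call into mid-function": check("indirect_call", 0x2010),
}
print(verdicts)
```

Corrupted data can still change which valid target is chosen, which matches the slide's point that the application may be damaged even though control flow stays within the policy.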

97
Technique 1: Restricted Code Origins

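[Figure: at transformation time, program shepherding sits between the application code and the basic block cache and trace cache. Application code is labeled as unmodified code or modified code; blocks A, C, D, and E and the indirect branch lookup are shown.]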

98
Technique 2: Restricted Control Transfers

[Figure: at transformation time, program shepherding sits between the application code (blocks A to F in foo() and bar()) and the basic block cache and trace cache. Transfers labeled call, return, and jump are checked.]

99
Technique 3: Un-circumventable Sandboxing

[Figure: in the basic block cache, each system call in the application code is surrounded by a pre-check and a post-check; a jump targeting the system call is also shown.]

10
Security Policies: Restricted Code Origins

[Figure: spectrum of code origins policies from less restrictive to more restrictive. Labeled points: Any code; Code in executable memory regions; Self-contained dynamically generated code with no system calls (JIT); LoadLibrary(), dlopen() (Plug-in); Only code from disk, originally loaded.]

10
Restricted Code Origins In Practice

• Tradeoffs between compatibility and security


• Allowed modified code:
– Libraries: relocation, rebinding
– Hooks: let them write once
• Allowed generated code:
– Stylized code fragments:
• Closures on stack/heap used by Visual Basic and gcc nested
functions
– JITs:
• Difficult if mix code and data and do not mark code regions
• Fallback: native execution, rely on language VM for security
– Injected threads, APC’s
• All other non-unmodified-image code disallowed

102
Security Policy: Function Returns

[Figure: spectrum of function return policies from less restrictive to more restrictive (and more expensive!). Labeled points: Unrestricted; Only to after calls (nearly all programs); Direct call targeted by only one return; StackGuard canary; StackGhost transparent xor; Only to caller.]

103
Security Policy: Inter-Module Calls + Jumps

[Figure: spectrum of policies for call/jump between modules, from less restrictive to more restrictive. Labeled points: Unrestricted; Only to export of target module; Only to import of source module; Only to bindings given in an interface list.]

104
Security Policy: Intra-module Calls + Jumps

[Figure: spectrum of policies for call/jump within a module, from less restrictive to more restrictive. Labeled points: If no symbol table: unrestricted (program w/o debug info); If have symbol table: only within same function or to function entry points (program with debug info); Only to bindings given in an interface list.]

105
DynamoRIO for Malware Analysis

Advantages versus VM-based instrumentation


• Semantic gap
▪ Inside guest user mode, can use clean interfaces and more easily gather
more and finer-grained information

• Performance
▪ QEMU software virtualization 6x+ slower than DR

• Ease of use
▪ Tool API

106
Conclusion

● Summary
● Dr. Fuzz Tool
● Dr. Fuzz Framework
● Dr. Memory
● DynamoRIO

● More information:
● http://dynamorio.org
● http://drmemory.org
● http://groups.google.com/group/dynamorio-users
● http://groups.google.com/group/drmemory-users

107
Optional Slides
Direct Code Modification

e9 37 6f 48 92 jmp <callout>

Kernel32!TerminateProcess:
7d4d1028 7c 05 jl 7d4d102f
7d4d102a 33 c0 xor %eax,%eax
7d4d102c 40 inc %eax
7d4d102d eb 08 jmp 7d4d1037
7d4d102f 50 push %eax
7d4d1030 e8 ed 7c 00 00 call 7d4d8d22
Debugger Trap Too Expensive

cc int3 (breakpoint)

Kernel32!TerminateProcess:
7d4d1028 7c 05 jl 7d4d102f
7d4d102a 33 c0 xor %eax,%eax
7d4d102c 40 inc %eax
7d4d102d eb 08 jmp 7d4d1037
7d4d102f 50 push %eax
7d4d1030 e8 ed 7c 00 00 call 7d4d8d22
Base Performance

SPEC CPU2000 Server Desktop


Transparency Landscape

Principle 1: Principle 2: Principle 3:


As few changes Hide necessary Separate
as possible changes resources

application code, machine context,


Code
stored addresses cache consistency
stack, heap,
separate stack,
Data registers,
heap, context, i/o
condition flags
threads,
Concurrency disjoint locks
memory ordering

Other preserve errors


Base Performance Comparison (No Tool)
BBCount Performance Comparison
Memory-Based Security Exploits

Microsoft Security Bulletins 2003-2005

11
Results for Restricted Code Origins

Technique 1
Technique 2 Technique 3
Attack Type Restricted Code
Origins
Injected code STOPPED

Existing code

• Injected code attack: introduce code masquerading as


data
• Code reuse attack: execute existing code in novel order
– “Return-to-libc”
• Often combined
– For reliable buffer address – find “jmp stack pointer”
– For bypassing NX
Results for Restricted Control Transfers

Restricted
Restricted Code
Attack Type Origins
Control Technique 3
Transfers

Injected code STOPPED

Chained calls STOPPED

Return HINDERED
Existing code
Other transfer

Not imported STOPPED


Inter-module
Call or jump

Imported

Have Mid-func STOPPED


entry
Intra-module info Func entry

No info

117
Results for Un-circumventable Sandboxing

Restricted Code Restricted Control Un-circumventable


Attack Type Origins Transfers Sandboxing

Injected code STOPPED

Chained calls STOPPED

Return HINDERED HINDERED


Existing code
Other transfer

Not imported STOPPED


Inter-module
Call or jump

Imported HINDERED

Have entry Mid-func STOPPED

Intra-module info
Func entry HINDERED

No info HINDERED

118
Determina, Inc.

• Startup company in host intrusion prevention market


• “Memory Firewall” product = program shepherding
• “Liveshield” product = patching without rebooting using
the code cache
• Two successful rounds of funding, 40+ employees, and
~50 customers including large financials
• Acquired by VMware in 2007
Minimal False Positives

• Carefully crafted security policies


• Automated exemption generation: ‘staging mode’
• Determina: 50 customers, 10,000 machines
– No false positives in MSFT apps
– <50 unique false positives in 3rd party libraries
• We treated these false positives as bugs rather than
customer driven policies
– Radically different from other security products

Kiriansky, Bruening, Amarasinghe. “Secure Execution via Program Shepherding” SEC’02
Self-Protection

ASLR of code cache, heap, and stacks

Guard pages around every cache, heap, and stack block


• Stops sequential loops from hitting any DR memory

Zero state on DR stack while in the cache

Write-protected memory
• Data sections in DR library
• DR’s own generated code (indirect branch lookup, etc.)
• Code cache and heap is option-controlled: perf tradeoff

Hiding from library enumeration and memory queries
