Ning Chen
Advisor: Sunghun Kim
November 05, 2013
Outline
1. Motivation & Related Work
2. Approaches of STAR
1) Crash Precondition Computation
2) Input Model Generation
3) Test Input Generation
3. Evaluation Study
4. Challenges & Future Work
5. Contributions
Motivation
Failure reproduction is a difficult and time-consuming task.
Problem Statement
This research proposes a stack-trace-based automatic crash reproduction framework.
Contributions
Study the scalability challenge of automatic crash reproduction, and propose STAR, an automatic crash reproduction framework.
Related Work
Record-and-replay approaches:
Jrapture, 2000
BugNet, 2005
ReCrash/ReCrashJ, 2008
LEAP/LEAN, 2010
Post-failure-process approaches:
Microsoft PSE, 2004
IBM SnuggleBug, 2009
XyLem, 2009
ESD, 2010
BugRedux, 2012
Record-and-replay Approaches
Approach:
Monitoring Phase: Captures and stores runtime heap and stack objects.
Test Generation Phase: Generates tests that load the recorded objects to reproduce the crash.
Record-and-replay Approaches
Framework | Instrumentation | Data Collected | Memory Overhead | Performance Overhead
Jrapture (2000) | Required | All interactions | N/A | N/A
BugNet (2005) | Required / Hardware | All inputs / executed code | N/A | N/A
ReCrash (2008) | Required | Stack objects | 7% - 90% | 31% - 60%
LEAP (2010) | Required | SPE access / thread info | N/A | 7% - 600%
Limitations:
Require up-front instrumentation or special hardware deployment.
Collect client-side data, which may raise privacy concerns.
Post-failure-process Approaches
Perform analyses on crashes only after they have occurred.
Advantages
Usually do not record runtime data.
Incur no or very little performance overhead.
Post-failure-process Approaches
Crash Explanation Approaches
Microsoft PSE [Manevich et al., 2004]
IBM SnuggleBug [Chandra et al., 2009]
XyLem [Nanda et al., 2009]
These approaches explain crashes by computing:
Potential crash traces
Potential crash conditions
Post-failure-process Approaches
Crash Reproduction Approaches
Core dump-based Approaches
Cdd [Leitner et al., 2009]
RECORE [Rößler et al., 2013]
Reproduce crashes using post-failure data such as:
Crash stack traces
Memory core dump at the time of the crash
Advantage
Higher chance of reproducing a crash as more data is provided.
Limitations
Requires not just the stack trace, but the entire memory core dump at the time of the crash.
Symbolic-execution-based Approaches
Limitations:
Existing approaches rely on forward symbolic execution to recreate the crashing execution.
Could not reproduce non-trivial crashes from object-oriented programs.
Limitations | Advantages of STAR
Record-and-replay: data collection | No data collection
Record-and-replay: performance overhead | No performance overhead
Core dump-based / symbolic-exec.-based: lack of optimizations | Optimized symbolic execution
Overview of STAR
[Pipeline: (1) crash stack trace + program → Crash Preconditions → (2) Crash Models → (3) test cases]
The crash is reproduced by the generated test cases.
Crash Precondition Computation
A crash precondition characterizes the inputs that trigger the crash, and can be solved for concrete values.
Limitations of forward symbolic execution:
Non-demand-driven: Need to execute many paths not related to the crash.
Limited optimization: Difficult to perform optimizations using the crash information.
Backward Symbolic Execution
STAR performs backward symbolic execution to compute the crash precondition:
The program is executed from the crash location back to the method entry.
Along the way, STAR collects and propagates path condition information.
The conditions collected at the method entry form the crash preconditions.
Example: backward symbolic execution
    int i = this.last;
    if (i < buffer.length)   // T
        buffer[i] = 0;       // AIOBE raised here
Conditions propagated backward from the crash:
    at the crash (AIOBE): TRUE becomes {buffer != null}, {i < 0 or i >= buffer.length}
    through the true branch: plus {i < buffer.length}
    at method entry (after i = this.last): {buffer != null}, {last < 0 or last >= buffer.length}, {last < buffer.length}
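The example above can be made concrete in a small Java sketch. Class and field names are assumptions for illustration, not STAR's subjects; the point is that the computed precondition reduces to last < 0:

```java
// Minimal self-contained version of the slide's example: the guard
// i < buffer.length does not rule out negative indices, so the crash
// precondition simplifies to last < 0.
public class BufferExample {
    private final int[] buffer = new int[16];
    private final int last;

    public BufferExample(int last) {
        this.last = last;
    }

    // Raises ArrayIndexOutOfBoundsException exactly when last < 0.
    public void reset() {
        int i = this.last;
        if (i < buffer.length) {
            buffer[i] = 0; // AIOBE here when i < 0
        }
    }

    // Helper to observe whether a given field value triggers the crash.
    public static boolean crashes(int last) {
        try {
            new BufferExample(last).reset();
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

Note that a large value such as 20 does not crash, because the guard already filters it out; only the negative case survives the branch condition.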
[Figure: example control flow — print() and debugLog() calls, branches i = 0 and i = index, then buffer[i] = 0 raising AIOBE]
Optimizations
STAR introduces three different approaches to improve the efficiency of backward symbolic execution.
Skipping Irrelevant Code
Observation:
Branches and method calls that are irrelevant to the target crash can be safely skipped.
Optimization:
STAR detects and skips branches or method calls that do not contribute to the target crash during symbolic execution.
[Figure: on the example control flow, STAR skips the irrelevant print() and debugLog() calls and the i = 0 branch, exploring only i = index → buffer[i] = 0 → AIOBE]
A branch or method call is considered relevant to the crash if:
It can modify any stack location referenced in the current crash precondition formula, or
It can modify any heap location referenced in the current crash precondition formula.
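The relevance rule above can be sketched as a simple set-intersection check. This is an illustrative helper, not STAR's actual API; locations are represented as plain strings for clarity:

```java
// Sketch of the relevance check: a statement is relevant to the crash iff
// it may write a location that the current precondition formula references.
public class RelevanceCheck {
    public static boolean isRelevant(String[] writtenLocations,
                                     String[] preconditionLocations) {
        for (String written : writtenLocations) {
            for (String referenced : preconditionLocations) {
                if (written.equals(referenced)) {
                    return true; // may affect the precondition: keep it
                }
            }
        }
        return false; // touches nothing in the formula: safe to skip
    }
}
```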
Heuristic Backtracking
Observation:
Backtracking execution to the most recent branching point is likely
inefficient, as the contradictions are usually introduced much earlier.
Optimization:
STAR can efficiently backtrack to the most relevant branches where
contradictions may still be avoided.
Heuristic Backtracking
An executed path is not satisfiable according to the SMT solver.
Typical backtracking to the most recent branching point is not efficient.
STAR can quickly backtrack to the most relevant branches.
[Figure: control flow with isDebugging(), print(), debugLog(), branches i = 0 / i = index, buffer[i] = 0, AIOBE]
Heuristic Backtracking
STAR computes the unsatisfiable core of the last unsatisfied path conditions:
A subset of the path conditions which are still unsatisfiable by themselves.
STAR backtracks to a branch if:
A path condition in the unsatisfiable core was introduced at this branch, or
A variable's actual heap location in the unsatisfiable core was modified at this branch.
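The core-guided jump can be sketched as follows. This is an illustration of the idea, not STAR's engine: given which branch introduced each path condition, backtrack to the deepest branch that contributed a condition to the unsatisfiable core, rather than to the most recent branch overall:

```java
// Sketch of unsat-core-guided backtracking. branchOf[k] records which
// branch introduced path condition k; given the indices of the conditions
// in the unsatisfiable core, jump to the deepest branch that contributed
// to the contradiction.
public class Backtracker {
    public static int backtrackTarget(int[] branchOf, int[] unsatCoreIndices) {
        int target = -1; // -1: empty core, no branch can avoid the contradiction
        for (int k : unsatCoreIndices) {
            target = Math.max(target, branchOf[k]);
        }
        return target;
    }
}
```

With conditions introduced at branches 0, 1 and 5 and a core containing only the first two, the engine jumps straight back to branch 1, skipping branches 2 through 5.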
[Figure: backward execution along i = index → buffer[i] = 0 → AIOBE]
Crash Precondition:
(index < 0 or index >= 16), combined with the branch condition index < 16, i.e., index < 0.
Other Details
Loops and recursive calls
Options bound the maximum loop unrolling and the maximum recursive call depth.
String operations
Strings are treated as arrays of characters.
Complex string operations and regular expressions are not supported: they require a specialized string solver.
Input Model Generation
[Pipeline: crash stack trace + program → Crash Preconditions → Crash Models → test cases]
Model 2:
ArrayList.size == 1
Class Information
STAR extracts and uses class semantic information to generate input models that are both feasible and practical.
Class Information
[Figure: extracted class semantics — Value Range: ArrayList.size >= 0; Initial Value: ArrayList.size starts from 0 — are fed with the crash precondition into the SMT Solver, which yields the model ArrayList.size == 1]
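The solver step can be mimicked with a toy enumeration. The predicates and bound below are assumptions for illustration: the class invariant (ArrayList.size >= 0) is conjoined with the crash precondition before a concrete model is searched for:

```java
// Toy stand-in for the solver query: only values satisfying both the
// class semantics and the crash precondition are acceptable models.
public class ModelFilter {
    // Class invariant extracted from ArrayList semantics.
    static boolean invariant(int size) {
        return size >= 0;
    }

    // Illustrative crash precondition from the slide.
    static boolean precondition(int size) {
        return size == 1;
    }

    // Enumerates candidate sizes up to a bound, mimicking a solver query;
    // returns -1 when no feasible model exists within the bound.
    public static int smallestFeasibleSize(int bound) {
        for (int size = 0; size <= bound; size++) {
            if (invariant(size) && precondition(size)) {
                return size;
            }
        }
        return -1;
    }
}
```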
Test Input Generation
[Pipeline: crash stack trace + program → Crash Preconditions → Crash Models → test cases]
Existing method sequence generation approaches:
Dynamic analysis
Palulu [Artzi et al., 2009]
Palus [Zhang et al., 2011]
Codebase mining
MSeqGen [Thummalapenta et al., 2009]
STAR proposes a new test input generation approach.
Summary Extraction
STAR statically extracts a summary for each method.
Summary Extraction
Summary of a method:
The collection of the summaries of its individual paths.
Summary of a method path: a pair (path condition, path effect), where
Path condition: the path conditions represented as a conjunction of constraints over the method inputs (heap locations read by the method).
Path effect: the postcondition of the path represented as a conjunction of constraints over the method outputs (heap locations written by the method).
Summary Extraction — example (an add-like method):
Path 1: Path Condition: obj != null; Path Effect: list[size] = obj; size += 1
Path 2: Path Condition: obj == null; Path Effect: e = new Exception(); throw e
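A minimal data representation of such summaries might look as follows. This is illustrative only, not STAR's internal format; conditions and effects are kept as strings for readability:

```java
// Each path summary pairs a condition over the method inputs with an
// effect over its outputs.
public class PathSummary {
    public final String pathCondition;
    public final String pathEffect;

    public PathSummary(String pathCondition, String pathEffect) {
        this.pathCondition = pathCondition;
        this.pathEffect = pathEffect;
    }

    // Summary of the add-like method from the example: two paths.
    public static PathSummary[] addMethodSummary() {
        return new PathSummary[] {
            new PathSummary("obj != null", "list[size] = obj; size += 1"),
            new PathSummary("obj == null", "throw new Exception()")
        };
    }
}
```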
[Figure: the Deductive Engine selects a candidate method path whose path effect satisfies the desired object state; the Constraint Solver then derives the required input parameters and prior object states]
Example
public class Container {
    public Container();
    public void add(Object);
    public void remove(Object);
    public void clear();
}
Desired object state (input model): Container.size == 10
Method summaries of Container:
clear()
  Path 1: TRUE → remove all in list; size = 0
add(obj)
  Path 1: obj != null → list[size] = obj; size += 1
  Path 2: obj == null → throw an exception
remove(obj)
  Path 1: obj in list → remove obj from list
  Path 2: otherwise → no effect
Deduction
Can add() produce the target state (Container.size == 10)?
The Deductive Engine selects add(obj), then recursively asks which call can produce the required prior state — selecting add(obj) again — until the initial state is established by the constructor Container() (or by clear()).
Generated method sequence:
void sequence() {
    Container container = new Container();
    Object o1 = new Object();
    container.add(o1);
    // ... add() called 10 times in total
}
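The backward deduction that produces this sequence can be sketched as follows. This is an illustration under the summaries above, not STAR's engine: add(obj) has the effect size' = size + 1 (for non-null obj) and Container() establishes size == 0, so the engine undoes add()'s effect from the target state until the constructor applies:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy backward deduction: walk from the target size down to the initial
// state, recording one call per step, then reverse into execution order.
public class Deduction {
    public static List<String> sequenceFor(int targetSize) {
        List<String> calls = new ArrayList<>();
        int size = targetSize;
        while (size > 0) {            // undo add(obj)'s effect size' = size + 1
            calls.add("container.add(obj)");
            size -= 1;
        }
        calls.add("new Container()"); // initial state: size == 0
        Collections.reverse(calls);   // emit the constructor first
        return calls;
    }
}
```

For a target of size == 10 this yields the constructor followed by ten add() calls, matching the generated sequence above.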
Other Details
The forward symbolic execution in method summary extraction is likewise bounded by the maximum loop unrolling and recursive call depth.
Evaluation
Research Questions
Research Question 1
For how many crashes can STAR compute the crash-triggering preconditions?
Research Question 2
How many crashes can STAR reproduce based on the crash-triggering preconditions?
Research Question 3
How many of STAR's crash reproductions are useful for revealing the actual cause of the crashes?
Evaluation Setup
Subjects:
Apache Commons Collections (ACC)
Apache Ant (ANT): Java build tool that supports a number of built-in and extension tasks such as compiling, testing and running Java applications. 100kLOC.
Log4j (LOG): logging package for printing log output to different local and remote destinations. 20kLOC.
Evaluation Setup
Crash Report Collection:
Collected from the issue tracking system of each subject.
Only confirmed and fixed crashes were collected.
Crashes with no or incorrect stack trace information were discarded.
Three major types of crashes: custom thrown exceptions, NPE and AIOBE.

Subject | # of Crashes | Versions | Avg. Fix Time | Report Period
ACC | 12 | 2.0 - 4.0 | 42 days | Oct. 03 - Jun. 12
ANT | 21 | 1.6.1 - 1.8.3 | 25 days | Apr. 04 - Aug. 12
LOG | 19 | 1.0.0 - 1.2.16 | 77 days | Jan. 01 - Oct. 09
Evaluation Setup
Our evaluation study has the largest number of crashes:
Framework | Number of Crashes
RECRASH | 11
ESD | -
BugRedux | 17
RECORE | -
STAR | 52
Research Question 1
For how many crashes can STAR compute the crash preconditions?
With and without the optimization approaches.
Average time spent per crash.
Research Question 1
Percentage of crashes whose preconditions were computed by STAR:
Subject | Without Optimizations | With Optimizations
ACC | 66.7% | 75.0%
ANT | 14.3% | 71.4% (+57.1)
LOG | 36.8% | 73.7% (+36.9)
Overall | 34.6% | 73.1% (+38.5)
Research Question 1
Average time to compute the crash preconditions (seconds; lower is better):
Subject | Without Optimizations | With Optimizations
ACC | 18.5 | 2.1
ANT | 90.4 | 4.9
LOG | 59.3 | 3.3
Overall | 55.1 | 2.4
Research Question 1
Percentage of crashes whose preconditions were computed by STAR, broken down by each optimization (including Heuristic Backtracking):
Subject | No Optimization | Individual Optimizations (each alone) | All Optimizations
ACC | 66.7% | 66.7 / 66.7 / 75.0 | 75.0%
ANT | 14.3% | 14.3 / 23.8 / 23.8 | 71.4%
LOG | 36.8% | 42.1 / 36.8 / 47.4 | 73.7%
Overall | 34.6% | 36.5 / 38.5 / 44.2 | 73.1%
Research Question 1
STAR successfully computed crash preconditions for 38 of the 52 crashes (73.1%).
Research Question 2
How many crashes can STAR reproduce based on the crash preconditions?
Criterion of Reproduction [ReCrash, 2008]:
A crash is considered reproduced if the generated test case can trigger the same type of exception at the same crash line.
We applied STAR to generate crash-reproducing test cases from the computed preconditions.
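The Criterion of Reproduction can be encoded as a small check on the thrown exceptions. This helper is illustrative, not part of STAR; it compares the exception type and the top stack frame's class and line:

```java
// Same exception type, thrown at the same class and crash line.
public class ReproductionCheck {
    public static boolean sameCrash(Throwable original, Throwable reproduced) {
        if (!original.getClass().equals(reproduced.getClass())) {
            return false; // different exception types
        }
        StackTraceElement a = original.getStackTrace()[0];
        StackTraceElement b = reproduced.getStackTrace()[0];
        return a.getClassName().equals(b.getClassName())
            && a.getLineNumber() == b.getLineNumber();
    }
}
```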
Research Question 2
Overall crash reproductions achieved by STAR for each subject:
Subject | # of Crashes | # of Precondition | # of Reproduced | Ratio
ACC | 12 | 9 | 8 | 66.7% (88.9%)
ANT | 21 | 15 | 12 | 57.1% (80.0%)
LOG | 19 | 14 | 11 | 57.9% (78.6%)
Total | 52 | 38 | 31 | 59.6% (81.6%)
Research Question 2
More statistics for the test case generation process by STAR:
Subject | Avg. # of Objects | Avg. Candidate Methods | Min-Max Sequence Length | Avg. Sequence Length
ACC | 1.5 | 35.5 | 2 - 19 | 9.4
ANT | 1.4 | 11.7 | 2 - 14 | 6.2
LOG | 1.5 | 21.8 | 2 - 17 | 8.1
Total | 1.5 | 21.4 | 2 - 19 | 7.7
Research Question 3
The Criterion of Reproduction does not require a crash reproduction to reveal the original bug.
Research Question 3
Drawbacks of the Criterion of Reproduction:
The crash reproduction may not be the same crash.
The crash reproduction may not be useful for revealing the crash-triggering bug.
[Figure: a reproduced stack trace may not reach the buggy frame]
Research Question 3
How many crash reproductions by STAR are useful for revealing the actual cause of the crashes?
Research Question 3
Overall useful crash reproductions achieved by STAR for each subject:
Subject | # of Reproduced | # of Useful | Ratio (Total)
ACC | 8 | 7 | 87.5% (58.3%)
ANT | 12 | 7 | 58.3% (33.3%)
LOG | 11 | 8 | 72.7% (42.1%)
Total | 31 | 22 | 71.0% (42.3%)
Comparison Study
We compared STAR with two different crash reproduction frameworks:
Randoop: a feedback-directed random test input generation framework.
BugRedux: a forward-symbolic-execution-based crash reproduction framework.
Comparison Study
The number of crashes handled by the three approaches:
[Chart: crashes by stage (Precondition / Reproduction / Usefulness) — STAR: 38 / 31 / 22; Randoop and BugRedux achieved far fewer (recovered values: 18, 12, 10, 0).]
Comparison Study
[Venn diagram: overlap of reproduced crashes among STAR, Randoop and BugRedux — regions of 12, 5 and 10 crashes]
Comparison Study
STAR outperformed Randoop because:
Randoop uses a randomized search technique to generate method sequences: it can generate many method sequences, but the search is not guided.
Due to the large search space of real-world programs, an unguided search rarely hits a crash-triggering sequence.
Case Study
https://issues.apache.org/jira/browse/collections-411
An IndexOutOfBoundsException could be raised in one of the collection's methods.
Case Study
STAR was applied to generate a crash-reproducing test case, which we reported to the project developers:
https://issues.apache.org/jira/browse/collections-474
We also attached the auto-generated test case by STAR in our bug report.
Case Study
Developers quickly confirmed:
The original patch for bug ACC-411 was actually incomplete; it did not fully fix the crash.
Case Study
STAR is capable of identifying and reproducing crashes that remain even after a fix was attempted.
Challenges
We manually examined each non-reproduced crash to identify the challenges for precondition computation and reproduction, including:
Path explosion (6.7%)
Future Work
Improving reproducibility:
Support for environment simulation, e.g., file inputs.
Incorporate specialized SMT solvers: string solvers like Z3-str.
Integrate the crash reproductions into the fault localization process.
Conclusions
We proposed STAR, an automatic crash reproduction framework based on crash stack traces.
Thank You!
Appendix
Subject Sizes
Our evaluation study has one of the largest subject sizes:
Framework | Subject Size Range (LOC) | Avg. (LOC)
RECRASH | 200 - 86,000 | 47,000
ESD | 100 - 100,000 | N/A
BugRedux | 500 - 241,000 | 27,000
RECORE | 68 - 62,000 | 35,000
STAR | 20,000 - 100,000 | 60,000
Research Question 1
Average time to compute the crash preconditions, broken down by each optimization (seconds; lower is better):
[Chart: ACC — 18.5 s with no optimization, 11.8 / 15.9 / 13.8 s with individual optimizations, 2.1 s with all; ANT — 90.4 s down to 4.9 s; LOG — 59.3 s down to 3.3 s; Overall — 55.1 s down to 2.4 s. Intermediate values recovered: 86.8, 74.8, 67.5, 48.2, 47.8, 54.3, 50, 39.2, 28.3.]
Comparison Study
Average time to reproduce crashes, over the common reproductions only (seconds; lower is better):
[Chart: BugRedux vs. STAR across ACC, ANT, LOG and Overall; recovered values: 29.9, 10.8, 8.7, 4.6, 4.28, 3.75, 2.4, 2.3.]
User Survey
Survey Sent | Responses | Confirmed Correctness | Confirmed Usefulness
31 | 6 (19%) | - | -
ACC-53
Comparison Study
Branch coverage achieved by different test case generation approaches:
[Chart: branch coverage (%) of Sample Execution, RecGen, Randoop, Palus, Palulu and STAR on ACC, JSAP and SAT4J; recovered values range from 0% to 74%.]