
Randomized State Space Tracking

Bart Massey
Assoc. Prof. Computer Science
Portland State University
bart@cs.pdx.edu
First, Let's Talk About Me
● PSU Assoc. Prof. for 10 years; PhD (AI) and MS (PL implementation), U. Oregon; BA (Physics), Reed
● Interested in technical tools for cross-
disciplinary problem solving
● ISO bright, motivated partners
– Check out http://psas.pdx.edu
– Check out http://code.google.com/soc for
Google Summer of Code info
– State-space search course, Spring 2010
What I'll Talk About
● Brief history of navigation!
● The state space tracking problem
● Brief mention of Kalman Filters
● Intro to Bayesian Particle Filters
● The weighted resampling problem
● Efficient weighted resampling
● Sketch correctness (?) and optimality
of my weighted resampling method
● The Ziggurat Method
● Efficient variate generation for my weighted resampling
method
● 1M particle updates per second!
History of Navigation
In One Slide
● Whenever people traveled, they used
– “Dead reckoning”: Navigation using a
model of where you are
– Observation: Navigation using sensory
evidence of where you are
● Accurate clocks help with both
● Observation has come to dominate
– In particular, single “reliable” sensor
– Skilled navigators are better than this
– An “AI-complete” problem?
State Space Tracking
● Not just for navigation
● Sensors always lie (except clocks)
● Filtering vs. smoothing
“Formalizing” the problem
● Given
– g(t₀..ₙ₋₁) = estimated state values
– hᵢ(t₀..ₙ) = sensed state values
– Noise models for g and h
● Estimate
– g(tₙ) = new state estimate
● AKA “sensor fusion”
State Space Filtering
● Strategy: step the model forward in
time from last estimate, then
compare with the sensors
● Kalman Filtering
– Fine linear method
– Requires “only” O(s³) multiplies per step (s = state dimension)
– Various “fixes” exist
● Bayesian Particle Filtering (BPF)
– Nonlinear, but computationally expensive
BPF
● Recall Bayes' rule:
Pr(H|E) Pr(E) = Pr(H∧E) = Pr(E|H) Pr(H)
Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)
● Here, the “evidence” comes from the
sensors, and the “hypothesis” from
the model
● But... we don't just want to know how likely it is that our model gave the right position; we want to know what the most likely position is.
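A tiny illustration (my numbers, not the talk's): with an assumed Gaussian sensor-noise model, Pr(E|H) scores each hypothesized position against one reading; under a uniform prior, Pr(H) and Pr(E) drop out of the comparison.

import math

def likelihood(sensed, hypothesized, sigma=1.0):
    # Pr(E|H): assumed Gaussian sensor-noise model
    d = sensed - hypothesized
    return math.exp(-d * d / (2 * sigma * sigma))

# Relative posterior support for three hypothesized positions
reading = 4.2
for h in (3.0, 4.0, 5.0):
    print(h, likelihood(reading, h))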
“Particles”!
[Figure: scatter of particles]
● Track lots of models!
● At each step, replicate likely models, discard unlikely ones
● Particles “map” likely position
● Curse of dimensionality: need lots of particles
● Remember: I want 1Mu/s
“Bootstrap” BPF
● for each particle i
– gᵢ(tₙ) = xᵢ ← (m + Δ)(gᵢ(tₙ₋₁))
– hᵢ(tₙ) = wᵢ = Pr(xᵢ|E₁..ₙ) = Pr(xᵢ|E₁) Pr(xᵢ|E₂) ...
● Estimate state from ensemble
● Resample (m → n, usually n = m)
– for i in {1..n}
– select p′ᵢ weighted-randomly (one full step sketched below)
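A minimal one-dimensional sketch of one bootstrap step, under assumed stand-ins: an identity motion model with additive Gaussian process noise, a Gaussian sensor likelihood, and Python's built-in weighted choice for resampling (the fast resamplers below replace it).

import math
import random

def bpf_step(particles, sensed, sigma_proc=0.1, sigma_obs=0.5):
    # 1. Step each model forward: x_i <- m(g_i) + noise (m = identity here)
    moved = [x + random.gauss(0.0, sigma_proc) for x in particles]
    # 2. Weight by sensor likelihood: w_i = Pr(sensed | x_i)
    weights = [math.exp(-(sensed - x) ** 2 / (2 * sigma_obs ** 2)) for x in moved]
    # 3. Estimate state from the ensemble (weighted mean)
    estimate = sum(w * x for w, x in zip(weights, moved)) / sum(weights)
    # 4. Resample n new particles, weighted-randomly (naive version)
    resampled = random.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate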
Weighted Random Resampling
● The naïve single-sample algorithm (Python sketch below)
– normalize wᵢ ; μ ← u[0..1]
– w ← 0 ; i ← 0
– while w < μ
– i ← i + 1 ; w ← w + wᵢ
– return i
● Can iterate to get n samples
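The naïve sampler, directly in Python (a sketch; weights assumed normalized, indices 0-based):

import random

def naive_sample(w):
    # Return index i with probability w[i]; w must sum to 1
    mu = random.random()
    acc = 0.0
    for i, wi in enumerate(w):
        acc += wi
        if acc >= mu:
            return i
    return len(w) - 1  # guard against floating-point shortfall

# Iterating gives n samples in O(m n) total
samples = [naive_sample([0.1, 0.2, 0.3, 0.4]) for _ in range(5)]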
Faster Resampling
● Definitive method: O(mn)
● “Obvious” improvements:
– Binary search or treeify: O(m + n lg m)
– Sort variates and merge: O(n lg n + m)
● Could give up correctness
– Regular resampling: O(m)
– Regular with shuffle: O(m)
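A sketch of the binary-search variant (names here are illustrative, not from the paper): build the cumulative weight table once in O(m), then each draw is a single bisect, O(lg m).

import bisect
import itertools
import random

def make_sampler(w):
    # Cumulative weights: cum[k] = w[0] + ... + w[k], built once in O(m)
    cum = list(itertools.accumulate(w))
    def sample():
        # First index whose cumulative weight reaches the variate
        return bisect.bisect_left(cum, random.random() * cum[-1])
    return sample

sample = make_sampler([0.1, 0.2, 0.3, 0.4])
indices = [sample() for _ in range(5)]  # O(n lg m) for n samples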
Optimal Resampling
● Variate merge is O(n lg n + m) because it requires the variates in sorted order
● Just generate them that way?
● Distribution of the leftmost of n variates μ ← u[0..1] is p(μ₀) = n (1 − μ₀)^(n−1)
● Independence also implies recursion
● So we just generate variates in
increasing order and merge!
Sampling A Distribution
● Brute force: integrate, take ratio, invert function
● Leftmost variate at μ₀ = 1 − μ^(1/n)
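A quick sanity check of that inversion (illustration only): 1 − μ^(1/n) should behave like the minimum of n uniforms; both means come out near 1/(n+1).

import random

n, trials = 10, 100_000
by_inversion = sum(1.0 - random.random() ** (1.0 / n) for _ in range(trials)) / trials
by_minimum = sum(min(random.random() for _ in range(n)) for _ in range(trials)) / trials
print(by_inversion, by_minimum)  # both near 1/(n+1) ≈ 0.0909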
Optimal, But Not Fast Enough
● The optimal algorithm (Python sketch below)
– normalize wⱼ
– w ← w₁ ; j ← 1 ; μ₀ ← 0
– for i in 0..n−1
– μ₀ ← μ₀ + (1 − μ₀)(1 − μ^(1/(n−i))), fresh μ ← u[0..1]
– while w < μ₀ : j ← j + 1 ; w ← w + wⱼ
– select particle j
● O(m + n), but the variate-update line is expensive (one pow per sample)
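A sketch of the algorithm in Python (my variable names; weights assumed normalized): the order statistics are generated smallest-first, so one sweep over the cumulative weights suffices. The pow in the variate line is the cost the next slides attack.

import random

def merge_resample(w, n):
    # Draw n indices weighted by w in O(m + n); output comes out sorted
    out = []
    mu = 0.0
    acc, j = w[0], 0
    for i in range(n):
        # Next-smallest of the remaining n - i uniforms, given all lie above mu
        mu += (1.0 - mu) * (1.0 - random.random() ** (1.0 / (n - i)))
        while acc < mu and j < len(w) - 1:
            j += 1
            acc += w[j]
        out.append(j)
    return out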
The “Ziggurat Method”
● Marsaglia and Tsang 2000; accelerated “rejection method”
● Works directly on the PDF; no integration or inversion required
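For intuition only, a plain (unaccelerated) rejection sampler; the ziggurat gets its speed by covering the PDF with precomputed equal-area layers so most draws skip the PDF evaluation entirely. This sketch is not the ziggurat itself.

import random

def rejection_sample(pdf, lo, hi, pdf_max):
    # Draw uniformly from the bounding box until a point lands under the PDF
    while True:
        x = random.uniform(lo, hi)
        if random.uniform(0.0, pdf_max) <= pdf(x):
            return x

# Example target: the leftmost-variate density p(x) = n(1 - x)^(n-1) on [0, 1],
# whose maximum is n (at x = 0)
n = 10
x = rejection_sample(lambda t: n * (1.0 - t) ** (n - 1), 0.0, 1.0, float(n))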
Ziggurat for the Whole Family
● Problem: (1 − μ)^n is parameterized on n
– We don't want to build n ziggurats
● Solution: Squint
– For large enough n, the curves are similar
– In fact, they look a lot like the compounding curve: rescale μ and substitute to get (1 − μ/a)^(an)
● Notice: lim a→∞ (1 − μ/a)^(an) = e^(−μn)
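A quick numeric illustration of that limit (my check, not from the original slides):

import math

# (1 - mu/a)^(a n) approaches e^(-mu n) as a grows
mu, n = 0.3, 10
for a in (10, 100, 1000):
    print(a, (1.0 - mu / a) ** (a * n), math.exp(-mu * n))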
A Slight Expansion
● For small i, use the direct method
● For larger i, use the expanded Ziggurat
My Trick Doesn't Work
● Need equal area under the curve
● Two years, a published paper, technical talks: no one caught me
The Standard Trick
● Give up on simulating naïve sampling as too hard and unnecessary
● Just sample at regular intervals (sketch below)
– normalize wⱼ ; w ← w₁ ; j ← 1 ; μ₀ ← μ / n for one μ ← u[0..1]
– while μ₀ ≤ 1
– while w < μ₀ : j ← j + 1 ; w ← w + wⱼ
– select particle j
– μ₀ ← μ₀ + 1/n
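The same trick as a Python sketch (my names; weights assumed normalized): one uniform draw total, O(m + n), no pow.

import random

def systematic_resample(w, n):
    # Draw n indices weighted by w using one random offset and regular steps
    out = []
    mu = random.random() / n
    acc, j = w[0], 0
    for _ in range(n):
        while acc < mu and j < len(w) - 1:
            j += 1
            acc += w[j]
        out.append(j)
        mu += 1.0 / n
    return out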
Performance Results
● Fast general-purpose BPF: 2.5Mu/s!
● Good results on vehicles
Machine Learning??
● “It's a lovely talk, but why this class?”
● Clearly AI: Attempt to infer from
incomplete, incorrect data and models
● Opportunities to apply ML to BPF
– Infer sensor model from measurements
– Discover system failures
● Opportunities to apply BPF to ML
– Condition model data
– Find underlying probabilities
Acknowledgments and Availability
● Thanks to Prof. James McNames and Prof. Gerardo Lafferriere
● Thanks to Jules Kongslie, Jamey Sharp, and Josh Triplett
● Thanks to BSP and PSAS students
● Thanks for the chance to talk!
● Paper is available w/ GPL code at http://wiki.cs.pdx.edu/bartforge/bmpf