CSE291-D
Latent Variable Models
Instructor: Dr. Jimmy Foulds
Email: jfoulds@ucsd.edu
Office Hours: TuTh 5-6pm Atkinson Hall 4401
• Core principles:
– Dimensionality reduction
– Probabilistic graphical models
– Statistical inference, especially Bayesian inference
Probabilistic latent variable modeling
[Diagram: complicated, noisy, high-dimensional data goes into a latent
variable model, which produces low-dimensional, semantically meaningful
representations that we use to understand, explore, and predict.]
Underlying principle:
Dimensionality reduction
The quick brown fox jumps over the sly lazy dog
→ [5 6 37 1 4 30 5 22 570 12]
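A minimal sketch of one way to read this example, assuming the numbers are
integer word IDs under a hypothetical vocabulary (the IDs below are chosen
only to reproduce the slide):

    # Hypothetical vocabulary; the IDs are illustrative, not from a real corpus.
    vocab = {"the": 5, "quick": 6, "brown": 37, "fox": 1, "jumps": 4,
             "over": 30, "sly": 22, "lazy": 570, "dog": 12}

    sentence = "The quick brown fox jumps over the sly lazy dog"
    ids = [vocab[w.lower()] for w in sentence.split()]
    print(ids)  # [5, 6, 37, 1, 4, 30, 5, 22, 570, 12]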
Latent variable models
[Graphical model: node Z denotes the latent variables.]
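As a minimal sketch of the idea (an illustration, not from the slides): in
one of the simplest latent variable models, a Gaussian mixture, each data
point is generated by first sampling a hidden z and then an observation x
given z. The parameters below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    weights = [0.4, 0.6]                      # P(z): mixing proportions
    means, stds = [-2.0, 3.0], [1.0, 0.5]     # per-component parameters

    def sample():
        z = rng.choice(2, p=weights)          # latent variable (unobserved)
        x = rng.normal(means[z], stds[z])     # observed variable
        return z, x

    # We only ever see x; inference recovers a posterior over z.
    data = [sample()[1] for _ in range(1000)]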
Motivating Applications
• Computational biology:
– Sequence alignment, phylogeny
Motivating Applications
• Computational social science:
– Cognitive psychology, digital humanities, …
The digital humanities

• Social network: the model that only captures homophily performs the best
• Protein-protein interaction network: the model that captures both
homophily and stochastic equivalence performs the best
Hoff, P. (2008). Modeling homophily and stochastic equivalence in symmetric relational data. NIPS 19
Latent Feature Models for Social Networks
[Diagram: a social network of Alice, Bob, and Claire, annotated with latent
features. Alice: Cycling, Fishing, Running. Bob: Tango, Salsa.
Claire: Waltz, Running.]

Latent Feature Relational Model
Miller, Griffiths, Jordan (2009)
[Same network diagram as above.]
[Diagram: the data generating process produces the observed data; inference
runs in the opposite direction, from observed data back to the process.]
Figure based on one by Larry Wasserman, "All of Statistics"
Inference Algorithms
• Exact inference
  – Belief propagation on polytrees, junction tree
• Approximate inference
  – Optimization approaches
    • EM
    • Variational inference
      – Variational Bayes, mean field
      – Message passing: loopy BP, TRW, expectation propagation
  – Simulation approaches
    • Importance sampling, particle filtering
    • Markov chain Monte Carlo (see the sketch after this list)
      – Gibbs sampling, Metropolis-Hastings, Hamiltonian Monte Carlo…
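A minimal sketch of the simplest MCMC method above, random-walk
Metropolis-Hastings on a 1-D target; the standard-normal target and unit
step size are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def log_target(x):
        # Unnormalized log-density of a standard normal (arbitrary example).
        return -0.5 * x**2

    x, samples = 0.0, []
    for _ in range(10_000):
        proposal = x + rng.normal(0.0, 1.0)   # symmetric random-walk proposal
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)

    print(np.mean(samples), np.var(samples))  # ≈ 0 and ≈ 1 for this target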
The art of latent variable modeling:
Box’s loop
[Diagram: complicated, noisy, high-dimensional data goes into a latent
variable model, producing low-dimensional, semantically meaningful
representations used to understand, explore, and predict; we then evaluate
and iterate, feeding what we learn back into the model. The
(algorithm, model) pair is carefully co-designed for tractability.]
This course: CSE291-D
Latent Variable Models
• Three themes:
– Models
– Inference
– Evaluation
Learning Goals
• By the end of the course you will be able to:
Weeks 1-2
• Foundations
– Bayesian inference
– Generative models for discrete data
– Exponential family models
Week 3
• Monte Carlo Methods
– Importance sampling, rejection sampling, and why they fail in high
dimensions (see the sketch below)
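A minimal sketch of self-normalized importance sampling (an illustration,
not from the slides): estimate E_p[f(x)] by sampling from a proposal q and
reweighting by p/q. The effective sample size hints at the failure mode: as
dimension grows, the weights degenerate and a handful of samples carry all
the weight. The target, proposal, and f below are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)

    # Target p = N(0, 1), proposal q = N(0, 2^2), f(x) = x^2, so E_p[f] = 1.
    n = 100_000
    x = rng.normal(0.0, 2.0, size=n)                     # draws from q
    log_w = -0.5 * x**2 - (-0.5 * (x / 2.0)**2 - np.log(2.0))
    w = np.exp(log_w)                                    # weights p(x)/q(x)

    estimate = np.sum(w * x**2) / np.sum(w)              # self-normalized
    ess = np.sum(w)**2 / np.sum(w**2)                    # effective sample size
    print(estimate, ess / n)  # estimate ≈ 1.0; the ESS fraction collapses in high dimensions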
Week 4
• Modeling
– Mixture models (revisited)
Week 5
• Hidden Markov models (revisited)
– Applications, extensions, Bayesian inference
Week 6
• Markov random fields
Week 7
• Variational inference
Week 8
• Topic models and mixed membership models
– LSA/PLSA, Genetic admixtures, LDA
Week 9
• Social network models
– Exponential family random graph models
Week 10
• Models for computational biology
– Profile HMMs, phylogenetic models, the coalescent
Pedagogy: Active learning
• Student-centered instruction
“If the experiments [were] medical interventions, … the control condition
might be discontinued because the treatment being tested was clearly more
beneficial.”
Pedagogy: Active learning
• Everyone gets to participate, not just students in
the front row
Peer instruction
• You can respond to Poll Everywhere polls with your
laptop, tablet, or smartphone (bring it to class!)
PollEv.com/jamesfoulds656
Course Readings
• Required textbook:
Reading for Today’s Lecture
• Blei, David M. (2014). Build, compute, critique, repeat: Data analysis with
latent variable models. Annual Review of Statistics and Its Application,
Sections 1-3.
Assessment
• Final 35%
• Participation 5%
Group Project
• Groups of 2-4
Group Project
• Milestones / deliverables
– Project proposal, 4/19/2016
– Midterm progress report, 5/10/2016
– Project report, 6/9/2016
Piazza
• Use Piazza to ask questions about the course, instead of emailing me or
the TA directly, so that everyone in the class can benefit from the answer
How to Succeed in CSE291-D
• While the course will be challenging in the sense that we have a lot
of material to cover in 10 weeks, the course is designed so that
everyone has the opportunity to succeed. I do not grade on a curve.
• Learning goals for each lesson will be clearly stated – if you achieve
these you will be well prepared for the exam.
Required Knowledge
• CSE250A is the only prerequisite
– Basics of directed graphical models
– d-separation, explaining away, Markov blanket
– Maximum likelihood estimation
– Expectation maximization
– A little exposure to mixture models, HMMs,…
– Prerequisites of CSE250A also apply
(elementary math, programming…)
Recap: Markov blanket
• A node's Markov blanket is its parents, its children, and its children's
other parents (co-parents)
[Quiz figure]
Answer
• Turns out none of them are!
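A small side illustration (not from the slides): computing a node's Markov
blanket from a DAG's parent sets, on a hypothetical toy graph.

    # Toy DAG, given as node -> set of parents (hypothetical example).
    parents = {
        "A": set(), "B": set(),
        "C": {"A", "B"},
        "D": {"C"},
        "E": {"C", "B"},
    }

    def markov_blanket(node):
        children = {c for c, ps in parents.items() if node in ps}
        coparents = set().union(*(parents[c] for c in children)) if children else set()
        return (parents[node] | children | coparents) - {node}

    print(markov_blanket("C"))  # {'A', 'B', 'D', 'E'}: parents, children, co-parents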
Recap: d-separation
• “Bayes ball” is blocked by:
– an observed node in a chain (head-to-tail) or fork (tail-to-tail)
– an unobserved collider (head-to-head) whose descendants are all unobserved
Recap: d-separation
• “Bayes ball” passes through:
– an unobserved node in a chain or fork
– an observed collider (head-to-head)
Recap: Explaining Away (Example)
• X, Z independent coin flips encoded as 0 or 1
• Y=X+Z
• If we know Y, then X and Z become coupled
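A quick brute-force check of this example (a sketch, not from the slides):
enumerate the four equally likely (X, Z) outcomes and condition on Y.

    from itertools import product

    # X, Z are fair coin flips in {0, 1}; Y = X + Z.
    joint = [(x, z, x + z) for x, z in product([0, 1], repeat=2)]

    # Condition on Y = 1: the remaining outcomes are (X=0, Z=1) and (X=1, Z=0).
    given_y1 = [(x, z) for x, z, y in joint if y == 1]
    p_x1 = sum(x for x, z in given_y1) / len(given_y1)   # P(X=1 | Y=1) = 0.5
    p_x1_z1 = [x for x, z in given_y1 if z == 1][0]      # P(X=1 | Y=1, Z=1) = 0
    print(p_x1, p_x1_z1)  # 0.5 vs 0: once Y is known, Z is informative about X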
Recap: d-separation
• Boundary cases:
– a collider with an observed descendant lets the ball pass, as if the
collider itself were observed
Recap: d-separation
• Why we need the boundary cases:
[Quiz figure]
Answer
• Turns out none of them are! Even with node 1 observed, you can still pass
through node 5 via explaining away.