CSE291-D
Latent Variable Models
Instructor: Dr. Jimmy Foulds
Email: jfoulds@ucsd.edu
Office Hours: TuTh 5-6pm Atkinson Hall 4401

TA: Long Jin


Email: longjin@eng.ucsd.edu
Office Hours:

Course website: cseweb.ucsd.edu/classes/sp16/cse291-d/


Piazza: piazza.com/ucsd/spring2016/cse291d
Poll everywhere: PollEv.com/jamesfoulds656
Latent Variable Models
• Latent variable modeling is a general, principled
approach for making sense of complex data sets

• Core principles:
– Dimensionality reduction
– Probabilistic graphical models
– Statistical inference, especially Bayesian inference

Latent variable models are, basically, PCA on steroids!
Probabilistic latent variable modeling

[Figure: complicated, noisy, high-dimensional Data feeds into a Latent variable model, which yields low-dimensional, semantically meaningful representations used to understand, explore, and predict.]
Underlying principle:
Dimensionality reduction

The quick brown fox jumps over the sly lazy dog
[5 6 37 1 4 30 5 22 570 12]

Foxes  Dogs  Jumping
[40%   40%   20%]
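The compression on this slide can be sketched concretely. The word-to-topic assignments below are invented purely to reproduce the slide's numbers, not learned from data:

```python
# Toy sketch of the slide's compression: a 10-word sentence reduced to
# proportions over three topics. The word-to-topic mapping is hypothetical,
# chosen only to reproduce the [40% 40% 20%] figure on the slide.
sentence = "the quick brown fox jumps over the sly lazy dog".split()

topic_of = {  # invented assignments for some content words
    "quick": "Foxes", "fox": "Foxes",
    "lazy": "Dogs", "dog": "Dogs",
    "jumps": "Jumping",
}

counts = {"Foxes": 0, "Dogs": 0, "Jumping": 0}
for word in sentence:
    if word in topic_of:
        counts[topic_of[word]] += 1

total = sum(counts.values())
proportions = {topic: count / total for topic, count in counts.items()}
print(proportions)  # {'Foxes': 0.4, 'Dogs': 0.4, 'Jumping': 0.2}
```

A real topic model would infer both the word-to-topic assignments and the proportions; this sketch only shows the shape of the resulting low-dimensional representation.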
Latent variable models

[Diagram: for each data point, latent variables Z generate observed data X, governed by parameters Φ.]

dimensionality(X) >> dimensionality(Z)

Z is a bottleneck, which finds a compressed, low-dimensional representation of X.
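Since the slide bills these models as "PCA on steroids", a minimal PCA sketch makes the bottleneck concrete. The data, dimensions, and noise level below are made up for illustration:

```python
import numpy as np

# A minimal sketch of the bottleneck idea using plain PCA: project
# high-dimensional X down to a low-dimensional Z via the top principal
# components. All sizes here are arbitrary choices for illustration.
rng = np.random.default_rng(0)
n, d, k = 100, 50, 2          # 100 points, 50 observed dims, 2 latent dims

# Generate data that truly lies near a 2-dimensional subspace, plus noise.
Z_true = rng.normal(size=(n, k))
W = rng.normal(size=(k, d))
X = Z_true @ W + 0.1 * rng.normal(size=(n, d))

# PCA: center, then project onto the top-k right singular vectors.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
Z = X_centered @ Vt[:k].T     # the compressed, low-dimensional representation

print(X.shape, Z.shape)       # dimensionality(X) >> dimensionality(Z)
```

Latent variable models generalize this picture: instead of a linear projection, Z and X are related through a full probabilistic model.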
Motivating Applications
• Industry:
– recommender systems, user modeling and
personalization, text analysis …

Motivating Applications
• Computational biology:
– Sequence alignment, phylogeny
Motivating Applications
• Computational social science:
– Cognitive psychology, digital humanities, …

The digital humanities

Mimno, D. (2012). Computational historiography: Data mining in a century of classics journals. ACM Journal on Computing and Cultural Heritage, Vol. 5, No. 1, Article 3.
Latent space models for social network analysis

Social network: a model that only captures homophily performs the best.

Protein–protein interaction network: a model that captures both homophily and stochastic equivalence performs the best.

Hoff, P. (2008). Modeling homophily and stochastic equivalence in symmetric relational data. NIPS.
Latent Feature Models for Social Networks

[Figure: a social network linking Alice, Bob, and Claire, annotated with latent features: Alice – Cycling, Fishing, Running; Bob – Running, Tango, Salsa; Claire – Running, Waltz.]
Latent Feature Relational Model
Miller, Griffiths, Jordan (2009)

[Figure: the Alice–Bob–Claire network again, together with a binary feature matrix Z whose rows are Alice, Bob, and Claire and whose columns are Cycling, Fishing, Running, Tango, Salsa, and Waltz.]
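A rough sketch of how the feature matrix Z drives link probabilities in this model: in the latent feature relational model, the probability of a link between i and j is sigmoid(z_i W z_jᵀ). The weight matrix below (the identity) is a made-up choice so that each shared feature raises the log-odds of a link by 1:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Sketch of the latent feature relational model: each person has a binary
# feature vector (a row of Z), and P(link i-j) = sigmoid(z_i @ W @ z_j).
# The feature assignments and the identity W are illustrative choices.
features = ["Cycling", "Fishing", "Running", "Tango", "Salsa", "Waltz"]
Z = np.array([
    [1, 1, 1, 0, 0, 0],   # Alice:  Cycling, Fishing, Running
    [0, 0, 1, 1, 1, 0],   # Bob:    Running, Tango, Salsa
    [0, 0, 1, 0, 0, 1],   # Claire: Running, Waltz
])

W = np.eye(len(features))   # hypothetical weights: only shared features count
P = sigmoid(Z @ W @ Z.T)    # P[i, j] = link probability between persons i, j

print(np.round(P, 2))
```

With the identity W, every pair shares only "Running", so all off-diagonal link probabilities equal sigmoid(1); in the full model, W is learned and can encode interactions between different features.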
Automatically illustrating a guacamole recipe from https://www.youtube.com/watch?v=H7Ne3s202lU
Probability and Inference

[Figure: probability maps the data generating process forward to the observed data; inference maps the observed data back to the data generating process.]

Figure based on one by Larry Wasserman, "All of Statistics".
Inference Algorithms
• Exact inference
– Belief propagation on polytrees, junction tree

• Approximate inference
– Optimization approaches
• EM
• Variational inference
– Variational Bayes, mean field
– Message passing: loopy BP, TRW, expectation propagation
– Simulation approaches
• Importance sampling, particle filtering
• Markov chain Monte Carlo
– Gibbs sampling, Metropolis-Hastings, Hamiltonian Monte Carlo…
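As a taste of the approximate-inference toolbox above, here is a minimal importance-sampling sketch. The target, proposal, and test function are chosen arbitrarily for illustration:

```python
import numpy as np

# Importance sampling: estimate E_p[f(x)] under a target p using samples
# from an easier proposal q, reweighted by p(x)/q(x).
rng = np.random.default_rng(0)

# Target p: standard normal. Proposal q: wider normal N(0, 2^2).
def p(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def q(x):
    return np.exp(-0.5 * (x / 2.0)**2) / (2.0 * np.sqrt(2 * np.pi))

xs = 2.0 * rng.normal(size=100_000)   # draw from the proposal q
weights = p(xs) / q(xs)               # importance weights

# Self-normalized estimate of E_p[x^2] (true value: 1).
estimate = np.sum(weights * xs**2) / np.sum(weights)
print(estimate)
```

The lecture's point about high dimensions: as the dimensionality grows, the weights become dominated by a few samples and the estimator's variance blows up, which is why MCMC takes over.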
The art of latent variable modeling: Box’s loop

[Figure: complicated, noisy, high-dimensional Data feeds into a Latent variable model, with the (algorithm, model) pair carefully co-designed for tractability; the model yields low-dimensional, semantically meaningful representations used to understand, explore, and predict; results are evaluated, and the loop iterates.]
This course: CSE291-D
Latent Variable Models
• Three themes:
– Models

– Inference

– Evaluation

• Themes correspond to the steps in Box’s loop

Learning Goals
• By the end of the course you will be able to:

– Apply a variety of probabilistic models

– Formulate new probabilistic models to solve the data science tasks you care about

– Derive inference algorithms for these models using Bayesian inference techniques

– Evaluate their performance in order to critique and improve them.
Weeks 1-2
• Foundations
– Bayesian inference
– Generative models for discrete data
– Exponential family models

Week 3
• Monte Carlo Methods
– Importance sampling, rejection sampling, why
they fail in high dimensions

– Markov chain Monte Carlo (MCMC)

Week 4
• Modeling
– Mixture models (revisited)

– Latent linear models


• Factor analysis, probabilistic PCA, ICA

Week 5
• Hidden Markov models (revisited)
– Applications, extensions, Bayesian inference

• Evaluating unsupervised models

Week 6
• Markov random fields

• Statistical relational learning and probabilistic programming
– Markov logic networks, probabilistic soft logic, Stan, WinBUGS
Week 7
• Variational inference
– Foundations, mean field updates
– Examples: Gaussian models, linear regression. Variational Bayes EM
Week 8
• Topic models and mixed membership models
– LSA/PLSA, Genetic admixtures, LDA

– Inference: MCMC, variational inference

Week 9
• Social network models
– Exponential family random graph models
– Stochastic blockmodels, mixed membership stochastic blockmodels, latent space models
Week 10
• Models for computational biology
– Profile HMMs, phylogenetic models, coalescent

• Nonparametric Bayesian models
– Chinese restaurant process / Dirichlet process, Indian buffet process
Pedagogy: Active learning

• Student-centered instruction

• Actively engage with the material in class

• In-class quizzes and polls, peer instruction, discussion with peers
“If the experiments [were] medical interventions, … the control condition might be discontinued because the treatment being tested was clearly more beneficial.”
Pedagogy: Active learning
• Everyone gets to participate, not just students in the front row

• With traditional lectures only, course content is “transmitted” in class, and you have to do the hard yards of learning on your own

• With active learning to augment lectures, learning, synthesis, and integration with prior knowledge occur in class, with support from instructor and peers
Peer instruction
• No need to buy a clicker: polleverywhere.com

Peer instruction
• You can respond to Poll Everywhere polls with your laptop, tablet, or smartphone (bring it to class!)
PollEv.com/jamesfoulds656

• I recommend using the Poll Everywhere app, for Android and iPhone. Find it in the app store.

• If you do not have a smartphone or laptop, you can use colored voting cards.

The most important thing is that you vote!
Course Readings
• Required textbook:
Machine Learning: A Probabilistic Perspective. Murphy (2012)

• Readings need to be completed before each class. From Murphy, and/or other articles. We will do reading quizzes at the start of each class.

• It is very important that you do the readings so that we can make effective use of our limited lecture time together (a “flipped classroom” approach).
Reading for Today’s Lecture

• Blei, David M. (2014). Build, compute, critique, repeat: Data analysis with latent variable models. Annual Review of Statistics and Its Application, Sections 1–3.

• This article is a good overview of what CSE291-D is all about, if you’re still deciding whether this course is for you.

• Note how discussion is framed around Box’s loop.
Assessment

• Homeworks 25% (5 of them, 5% each)

• Group Project 35%

• Final 35%

• Participation 5%

Group Project
• Groups of 2–4

• An open-ended research project, to give you an opportunity to explore the techniques and principles covered in the course

• Must involve one or more of the themes of the course: models, inference, evaluation (ideally all three)

• May overlap with your other research, but not any other class project
Group Project
• Milestones / deliverables
– Project proposal, 4/19/2016
– Midterm progress report, 5/10/2016
– Project report, 6/9/2016

• Note that you may have to read ahead to start your project. All readings are listed on the syllabus, on the course webpage.
Piazza
• Use Piazza to ask questions about the course, instead of emailing me or the TA directly, so that everyone in the class can benefit from the answer

• Piazza will also be used for announcements, including information on the readings

• Please sign up at: https://piazza.com/ucsd/spring2016/cse291d
How to Succeed in CSE291-D
• While the course will be challenging in the sense that we have a lot of material to cover in 10 weeks, the course is designed so that everyone has the opportunity to succeed. I do not grade on a curve.

• Learning goals for each lesson will be clearly stated – if you achieve these you will be well prepared for the exam.

• Homeworks are designed to give you practice and feedback on the learning goals.

• The 5% participation marks are there for the taking
– Participate in peer instruction, class discussions, etc., and on Piazza
Required Knowledge
• CSE250A is the only prerequisite
– Basics of directed graphical models
– d-separation, explaining away, Markov blanket
– Maximum likelihood estimation
– Expectation maximization
– A little exposure to mixture models, HMMs,…
– Prerequisites of CSE250A also apply
(elementary math, programming…)

• If you’re rusty, please read Murphy Ch. 10.


[In-class quiz: figure not shown]

Answer
• D, 5. {2,3,6,7,4}
Recap: Markov blanket

• The Markov blanket of a node is the union of its
– Parents
– Children
– Co-parents (other parents of its children)

• The joint distribution is a product of Pr(x|parents(x)) factors. The Markov blanket of x is the set of nodes it co-occurs with in these factors.
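The recap above translates directly into code. A small sketch on a made-up DAG, with the graph represented by each node's parent list:

```python
# Markov blanket of a node x in a DAG: parents(x) ∪ children(x) ∪
# co-parents (other parents of x's children). The graph below is a
# made-up example: 1 -> 3, 2 -> 4, 3 -> 4, 3 -> 5.
parents = {
    1: [], 2: [], 3: [1], 4: [2, 3], 5: [3],
}

def markov_blanket(x, parents):
    children = [c for c, ps in parents.items() if x in ps]
    co_parents = {p for c in children for p in parents[c] if p != x}
    return set(parents[x]) | set(children) | co_parents

print(markov_blanket(3, parents))  # {1, 2, 4, 5}
```

For node 3 the blanket is its parent 1, its children 4 and 5, and the co-parent 2 (the other parent of child 4), matching the union rule above.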
[In-class quiz: figure not shown]

Answer
• C, 4. {3,5,7,4}
[In-class quiz: figure not shown]

Answer
• Turns out none of them are!
Recap: d-separation
• “Bayes ball” is blocked by: [figure not shown]
Recap: d-separation
• “Bayes ball” passes through: [figure not shown]
Recap: Explaining Away (Example)
• X, Z independent coin flips encoded as 0 or 1
• Y = X + Z
• If we know Y, then X and Z become coupled
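The coupling can be verified by brute-force enumeration of the four equally likely (X, Z) outcomes:

```python
from itertools import product

# X and Z are independent fair coin flips; Y = X + Z. Marginally X tells
# us nothing about Z, but conditioning on Y couples them.
joint = {(x, z): 0.25 for x, z in product([0, 1], repeat=2)}

# P(Z = 1) with no evidence.
p_z1 = sum(p for (x, z), p in joint.items() if z == 1)

# Condition on Y = X + Z = 1: only (0,1) and (1,0) survive.
evidence = {xz: p for xz, p in joint.items() if sum(xz) == 1}
norm = sum(evidence.values())

# Given Y = 1, learning X = 1 forces Z = 0: explaining away.
p_z1_given_y1 = sum(p for (x, z), p in evidence.items() if z == 1) / norm
p_z1_given_y1_x1 = (
    sum(p for (x, z), p in evidence.items() if z == 1 and x == 1)
    / sum(p for (x, z), p in evidence.items() if x == 1)
)
print(p_z1, p_z1_given_y1, p_z1_given_y1_x1)  # 0.5 0.5 0.0
```

So P(Z=1) = P(Z=1 | Y=1) = 0.5, yet P(Z=1 | Y=1, X=1) = 0: X and Z are independent until their common child Y is observed.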
Recap: d-separation
• Boundary cases: [figure not shown]
Recap: d-separation
• Why we need the boundary cases: [figure not shown]
• If y’ is a copy of y, this reduces to explaining away
[In-class quiz: figure not shown]

Answer
• Turns out none of them are! Even with 1 observed, you can still pass through 5 via explaining away.