CSF 469 L20 L22 Recommender Systems SVD

BITS Pilani, Hyderabad Campus
Dr. Aruna Malapati
Asst Professor, Department of CSIS
Recommender Systems using dimension reduction
Today's learning objective
Understand how singular value decomposition can be used for recommender systems.
BITS Pilani, Hyderabad Campus

Dimension Reduction
Why reduce dimensions?
Singular Value Decomposition
The key issue in an SVD decomposition is to find a lower-dimensional feature space where the new features represent concepts and the strength of each concept in the context of the collection is computable.
The core of the SVD algorithm lies in the following theorem:
It is always possible to decompose a given matrix A into A = U Σ V^T.

SVD - Definition
A[m x n] = U[m x r] Σ[r x r] (V[n x r])^T

A: Input data matrix
m x n matrix (e.g., m users, n movies)
U: Left singular vectors
m x r matrix (m users, r concepts)
Σ: Singular values
r x r diagonal matrix (strength of each concept)
(r : rank of the matrix A)
V: Right singular vectors
n x r matrix (n movies, r concepts)
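
As a rough check of these shapes, the sketch below runs NumPy's `numpy.linalg.svd` on the 7 x 5 users-to-movies matrix used in the example later in these slides. Trimming the result to r = rank(A) with a tolerance of 1e-10 is an assumption made for illustration; NumPy itself returns min(m, n) singular values.

```python
import numpy as np

# Users-to-movies ratings from the slides (7 users x 5 movies).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# Thin SVD: U is m x k, s holds k singular values, Vt is k x n, k = min(m, n).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r nonzero singular values (tolerance is an assumption).
r = int(np.sum(s > 1e-10))
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

print("r =", r)
print("U:", U.shape, "Sigma:", (r, r), "V^T:", Vt.shape)
```

The trimmed factors still multiply back to A, since the discarded singular values are numerically zero.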

SVD
[Figure: the m x n matrix A drawn as the product U Σ V^T]

SVD
A = σ1 u1 v1^T + σ2 u2 v2^T + ...
σi: scalar
ui: vector
vi: vector
Matrix A is the sum of different matrices, each of which is the outer product of two vectors ui and vi scaled by σi.
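
The sum-of-outer-products view can be sketched numerically. The code below is an illustration only: it uses the users-to-movies matrix from the example later in the slides and adds the terms σi ui vi^T one at a time, recording how far the partial sum still is from A. The variable names are chosen here and do not come from the slides.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rows of U.T are the vectors u_i; rows of Vt are the vectors v_i.
partial = np.zeros_like(A)
errors = []
for sigma_i, u_i, v_i in zip(s, U.T, Vt):
    partial += sigma_i * np.outer(u_i, v_i)      # add one rank-one matrix
    errors.append(np.linalg.norm(A - partial))   # Frobenius distance to A

print(np.allclose(partial, A))
print(np.round(errors, 3))
```

Each added term can only shrink the remaining error, and after all terms the partial sum equals A up to floating-point error.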
SVD - Properties
It is always possible to decompose a real matrix A into A = U Σ V^T, where
U, Σ, V: unique
U, V: column orthonormal
U^T U = I; V^T V = I (I: identity matrix)
(Columns are orthogonal unit vectors)
Σ: diagonal
Entries (singular values) are positive, and sorted in decreasing order (σ1 ≥ σ2 ≥ ... ≥ 0)
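
These properties are easy to check numerically. The following is a minimal sketch on an arbitrary random 6 x 4 matrix (the matrix, its size and the seed are assumptions for illustration); it tests column orthonormality of U and V and the ordering of the singular values as returned by `numpy.linalg.svd`.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))          # any real matrix will do

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T
k = s.size

# U^T U = I and V^T V = I: columns are orthogonal unit vectors.
cols_orthonormal = bool(
    np.allclose(U.T @ U, np.eye(k)) and np.allclose(V.T @ V, np.eye(k))
)

# Singular values are non-negative and sorted in decreasing order.
sorted_nonneg = bool(np.all(s[:-1] >= s[1:]) and np.all(s >= 0))

print(cols_orthonormal, sorted_nonneg)
```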
SVD Example: Users-to-Movies
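A = U Σ V^T - example: Users to Movies

A is m x n with m = 7 users (rows) and n = 5 movies (columns: Matrix, Alien, Serenity, Casablanca, Amelie):

1 1 1 0 0
3 3 3 0 0
4 4 4 0 0
5 5 5 0 0
0 2 0 4 4
0 0 0 5 5
0 1 0 2 2

The columns of U and the rows of V^T correspond to "concepts", AKA latent dimensions, AKA latent factors.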

SVD Example: Users-to-Movies
Columns of A: Matrix, Alien, Serenity, Casablanca, Amelie. Rows 1-4 are the SciFi users, rows 5-7 the Romance users.

A = U x Σ x V^T, with

A =
1 1 1 0 0
3 3 3 0 0
4 4 4 0 0
5 5 5 0 0
0 2 0 4 4
0 0 0 5 5
0 1 0 2 2

U =
0.13 0.02 -0.01
0.41 0.07 -0.03
0.55 0.09 -0.04
0.68 0.11 -0.05
0.15 -0.59 0.65
0.07 -0.73 -0.67
0.07 -0.29 0.32

Σ =
12.4 0 0
0 9.5 0
0 0 1.3

V^T =
0.56 0.59 0.56 0.09 0.09
0.12 -0.02 0.12 -0.69 -0.69
0.40 -0.80 0.40 0.09 0.09
SVD Example: Users-to-Movies
In the decomposition A = U Σ V^T above, the first concept (first column of U, first row of V^T) is the SciFi-concept and the second is the comedy-concept. Rows 1-4 of A are the SciFi users and rows 5-7 are the Romance users.
A = U Σ V^T - example: U is the user-to-concept similarity matrix
In the Users-to-Movies decomposition above, column 1 of U is the SciFi-concept and column 2 is the comedy-concept. Each row of U gives one user's similarity to each concept.
A = U Σ V^T - example: Σ gives the strength of each concept
In the Users-to-Movies decomposition above, the first diagonal entry of Σ, 12.4, is the strength of the SciFi-concept.
A = U Σ V^T - example: V is the movie-to-concept similarity matrix
In the Users-to-Movies decomposition above, the first row of V^T (0.56 0.59 0.56 0.09 0.09) is the SciFi-concept: it weights Matrix, Alien and Serenity heavily and Casablanca and Amelie only slightly.
SVD - Interpretation #1
"movies", "users" and "concepts":
U: user-to-concept similarity matrix
V: movie-to-concept similarity matrix
Σ: its diagonal elements give the strength of each concept

SVD - Dimensionality Reduction
[Figure: scatter plot with axes "Movie 1 rating" and "Movie 2 rating", with the first right singular vector v1 marked]

A = U Σ V^T - example:
V: movie-to-concept matrix
U: user-to-concept matrix
[Figure: scatter plot with axes "Movie 1 rating" and "Movie 2 rating" and the first right singular vector v1, shown next to the Users-to-Movies decomposition above]
A = U Σ V^T - example:
Σ: the first singular value (12.4) reflects the variance ("spread") of the points on the v1 axis
[Figure: scatter plot with axes "Movie 1 rating" and "Movie 2 rating" and the first right singular vector v1]
A = U Σ V^T - example:
U Σ: gives the coordinates of the points in the projection axis
[Figure: scatter plot with axes "Movie 1 rating" and "Movie 2 rating" and the first right singular vector v1]
Projection of users on the "Sci-Fi" axis: first column of U Σ

U Σ =
1.61 0.19 -0.01
5.08 0.66 -0.03
6.82 0.85 -0.05
8.43 1.04 -0.06
1.86 -5.60 0.84
0.86 -6.93 -0.87
0.86 -2.75 0.41
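
The coordinates above can be reproduced from the rounded U and Σ printed on the slides. This is a small sketch, not the lecturer's code: because Σ is diagonal, U Σ just scales column j of U by the j-th singular value. Results agree with the slide to about two decimals, since the slide values appear to be truncated rather than rounded.

```python
import numpy as np

# U and the singular values as printed (rounded) on the slides.
U = np.array([
    [0.13,  0.02, -0.01],
    [0.41,  0.07, -0.03],
    [0.55,  0.09, -0.04],
    [0.68,  0.11, -0.05],
    [0.15, -0.59,  0.65],
    [0.07, -0.73, -0.67],
    [0.07, -0.29,  0.32],
])
sigma = np.array([12.4, 9.5, 1.3])

# Broadcasting scales column j of U by sigma[j]; same as U @ np.diag(sigma).
coords = U * sigma

# Each user's coordinate on the first concept ("Sci-Fi") axis.
scifi_axis = coords[:, 0]

print(np.round(coords, 2))
print(np.round(scifi_axis, 2))
```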
More details
Q: How exactly is dim. reduction done?
A: Set smallest singular values to zero
In the Users-to-Movies decomposition above, the smallest singular value is 1.3, so Σ = diag(12.4, 9.5, 1.3) becomes diag(12.4, 9.5, 0). The third column of U and the third row of V^T are then multiplied by zero and no longer contribute to the product.
More details
With the smallest singular value set to zero, the third column of U, the third row and column of Σ, and the third row of V^T can be dropped:

U =
0.13 0.02
0.41 0.07
0.55 0.09
0.68 0.11
0.15 -0.59
0.07 -0.73
0.07 -0.29

Σ =
12.4 0
0 9.5

V^T =
0.56 0.59 0.56 0.09 0.09
0.12 -0.02 0.12 -0.69 -0.69

More details
The resulting matrix B is close to A:

A            B
1 1 1 0 0    0.92 0.95 0.92 0.01 0.01
3 3 3 0 0    2.91 3.01 2.91 -0.01 -0.01
4 4 4 0 0    3.90 4.04 3.90 0.01 0.01
5 5 5 0 0    4.82 5.00 4.82 0.03 0.03
0 2 0 4 4    0.70 0.53 0.70 4.11 4.11
0 0 0 5 5    -0.69 1.34 -0.69 4.78 4.78
0 1 0 2 2    0.32 0.23 0.32 2.01 2.01

Frobenius norm:
‖M‖_F = sqrt(Σ_ij (M_ij)^2)
‖A - B‖_F = sqrt(Σ_ij (A_ij - B_ij)^2) is small
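
A minimal sketch of this truncation step, assuming NumPy and k = 2 retained concepts: it recomputes the SVD of the users-to-movies matrix, zeroes out everything past the first k singular values, and measures the Frobenius distance between A and the result. Values computed this way may differ from the rounded figures printed on the slides.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # number of concepts to keep (an illustrative choice)

# Dropping columns/rows past k is the same as setting those singular values to 0.
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius norm of the difference: sqrt of the sum of squared entries.
err = np.linalg.norm(A - B, ord="fro")

# The error equals the root of the sum of the squared dropped singular values.
dropped = np.sqrt(np.sum(s[k:] ** 2))

print(np.round(B, 2))
print(round(float(err), 4), round(float(dropped), 4))
```

The last line illustrates why small singular values are the ones to drop: the approximation error is determined entirely by the singular values that were discarded.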
SVD Best Low Rank Approx
[Figure: A = U Σ V^T, and B = U Σ V^T with the smallest singular values in Σ set to zero]
B is the best approximation of A

SVD - Conclusions so far
SVD: A = U Σ V^T: unique
U: user-to-concept similarities
V: movie-to-concept similarities
Σ: strength of each concept
Dimensionality reduction:
keep the few largest singular values (80-90% of energy)
SVD: picks up linear correlations
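
One way to turn the 80-90% rule into a choice of k is sketched below. It assumes that "energy" means the sum of the squared singular values, which is a common convention but is not defined on the slide, and it uses the rounded singular values from the users-to-movies example. The 0.9 threshold is likewise an illustrative choice.

```python
import numpy as np

# Singular values printed in the users-to-movies example.
s = np.array([12.4, 9.5, 1.3])

# ASSUMPTION: energy of a singular value is its square.
energy = s ** 2
retained = np.cumsum(energy) / np.sum(energy)   # fraction kept by the first k values

threshold = 0.9
# Smallest k whose retained fraction reaches the threshold.
k = int(np.searchsorted(retained, threshold) + 1)

print(np.round(retained, 3))
print("keep k =", k)
```

With these values, keeping two singular values retains about 99% of the energy, while keeping one retains only about 63%.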

Case study: How to query?
Q: Find users that like "Matrix"
A: Map query into a "concept space" - how?
A: Map query into a "concept space" - how?
q = 5 0 0 0 0 (ratings for Matrix, Alien, Serenity, Casablanca, Amelie)
Project into concept space: inner product with each concept vector vi
[Figure: q plotted on the "Matrix" and "Alien" axes together with the concept vectors v1 and v2; the projection of q onto v1 is q*v1]
Compactly, we have:
q_concept = q V
E.g.:
q = 5 0 0 0 0 (ratings for Matrix, Alien, Serenity, Casablanca, Amelie)

V (movie-to-concept similarities; the first column is the SciFi-concept) =
0.56 0.12
0.59 -0.02
0.56 0.12
0.09 -0.69
0.09 -0.69

q_concept = q V = 2.8 0.6

How would the user d that rated ("Alien", "Serenity") be handled?
d_concept = d V
E.g.:
d = 0 4 5 0 0 (ratings for Matrix, Alien, Serenity, Casablanca, Amelie)

V (movie-to-concept similarities; the first column is the SciFi-concept) =
0.56 0.12
0.59 -0.02
0.56 0.12
0.09 -0.69
0.09 -0.69

d_concept = d V = 5.2 0.4

Observation: User d that rated ("Alien", "Serenity") will be similar to user q that rated ("Matrix"), although d and q have zero ratings in common!

Ratings (Matrix, Alien, Serenity, Casablanca, Amelie) -> concept space (first coordinate: SciFi-concept)
d = 0 4 5 0 0 -> 5.2 0.4
q = 5 0 0 0 0 -> 2.8 0.6

Zero ratings in common, yet similarity ≠ 0 in concept space.
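
The observation can be checked with a short sketch. Cosine similarity is used here as the similarity measure, which is an assumption (the slides do not name one), and `cosine` is a helper written for this illustration. V holds the two concept columns as printed on the slides; with those rounded entries, d V comes out as roughly (5.16, 0.52), so the second coordinate differs slightly from the 0.4 shown on the slide.

```python
import numpy as np

# Movie-to-concept matrix V (rows: Matrix, Alien, Serenity, Casablanca, Amelie).
V = np.array([
    [0.56,  0.12],
    [0.59, -0.02],
    [0.56,  0.12],
    [0.09, -0.69],
    [0.09, -0.69],
])

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])   # rated only Matrix
d = np.array([0.0, 4.0, 5.0, 0.0, 0.0])   # rated Alien and Serenity

# Project both users into concept space: inner product with each concept vector.
q_concept = q @ V
d_concept = d @ V

sim_ratings = cosine(q, d)                    # no movie in common
sim_concepts = cosine(q_concept, d_concept)   # both load on the SciFi-concept

print(np.round(q_concept, 2), np.round(d_concept, 2))
print(sim_ratings, round(sim_concepts, 3))
```

In rating space the two users are orthogonal, while in concept space they point in nearly the same direction.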
SVD: Drawbacks
+ Optimal low-rank approximation
in terms of Frobenius norm
- Interpretability problem:
A singular vector specifies a linear combination
of all input columns or rows
- Lack of sparsity:
Singular vectors are dense!
[Figure: A = U Σ V^T]

Example
Consider the following matrix
In order to find U, we have to start with AA^T.
The transpose of A is

Example (Contd..)
Compute AA^T
Next, we have to find the eigenvalues and corresponding eigenvectors of AA^T.
We know that the eigenvectors x of a square matrix M are defined by the equation M x = λ x.

Example (Contd..)
And applying this to AA^T gives us (AA^T) x = λ x, where x = [x1; x2].
We rewrite this as the set of equations
And rearrange to get

Example (Contd..)
Solve for λ by setting the determinant of the coefficient matrix to zero.
This works out as
This gives us two eigenvalues λ = 10; λ = 12.

Example (Contd..)
Substituting the values of λ back into the original equations gives us our eigenvectors.
For λ = 10, we get x1 + x2 = 0,
which is true for lots of values, so we'll choose x1 = 1 and x2 = -1 since those are small and easier to work with.
Thus, we have the eigenvector [1; -1] corresponding to the eigenvalue λ = 10.

Example (Contd..)
For λ = 12, we get x1 - x2 = 0, i.e. x1 = x2.
Let's take x1 = 1 and x2 = 1.
For λ = 12 we have the eigenvector [1; 1].

Example (Contd..)
These eigenvectors become column vectors in a matrix ordered by the size of the corresponding eigenvalue.
The eigenvector for λ = 12 is column one, and the eigenvector for λ = 10 is column two.

Example (Contd..)
Finally, we have to convert this matrix into an orthogonal matrix, which we do by applying the Gram-Schmidt orthonormalization process to the column vectors.
Begin by normalizing v1.

Example (Contd..)
The calculation of V is similar. V is based on A^T A, so we have
Find the eigenvalues of A^T A by
Solving the equations and converting the result to an orthonormal matrix, as we did for U, gives V; taking its transpose gives V^T.
Example (Contd..)
For Σ we take the square roots of the non-zero eigenvalues and populate the diagonal with them, putting the largest in σ11, the next largest in σ22 and so on until the smallest value ends up in σmm.

Example (Contd..)
A = U Σ V^T
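
The steps of this example can be sketched in NumPy. The matrix from the slide did not survive in the text, so the 2 x 3 matrix below is an assumption: it was picked because its AA^T has the eigenvalues 12 and 10 and the eigenvectors [1; 1] and [1; -1] quoted above. The sketch also departs from the slides in one respect: instead of running a second eigen-decomposition on A^T A, it derives V^T from U and the singular values, which keeps the signs of the two factors consistent.

```python
import numpy as np

# ASSUMPTION: stand-in matrix consistent with the eigenvalues quoted in the example.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

AAt = A @ A.T                             # 2 x 2 symmetric matrix

# eigh is meant for symmetric matrices; it returns eigenvalues in ascending
# order and unit-length eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(AAt)

# Put the largest eigenvalue first, as the example does.
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], eigvecs[:, order]

# Singular values are the square roots of the eigenvalues.
sigma = np.sqrt(eigvals)

# Row i of V^T is u_i^T A / sigma_i (valid because all sigma_i are nonzero).
Vt = (U.T @ A) / sigma[:, None]

print(np.round(eigvals, 6))
print(np.allclose(U @ np.diag(sigma) @ Vt, A))
```

Because `eigh` already returns unit-length, mutually orthogonal eigenvectors, no separate Gram-Schmidt step is needed in this sketch.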

Summary
The singular-value decomposition of a matrix consists of three matrices, U, Σ, and V^T.
The matrices U and V are column-orthonormal, meaning that as vectors, the columns are orthogonal, and their lengths are 1.
The matrix Σ is a diagonal matrix, and the values along its diagonal are called singular values.
SVD is useful when there are a small number of concepts that connect the rows and columns of the original matrix.
The matrix U connects rows to concepts, Σ represents the strengths of the concepts, and V connects the concepts to columns.
