BITS Pilani
Hyderabad Campus

Dr. Aruna Malapati
Asst Professor
Department of CSIS

Recommender Systems using dimension reduction

Today's learning objective

Understand how singular value decomposition can be used for recommender systems.

BITS Pilani, Hyderabad Campus


Dimension Reduction
Why reduce dimensions?
Singular Value Decomposition

The key issue in an SVD decomposition is to find a lower-dimensional feature space where the new features represent concepts and the strength of each concept in the context of the collection is computable.

The core of the SVD algorithm lies in the following theorem:

It is always possible to decompose a given matrix A into $A = U \Sigma V^T$.


SVD - Definition

$A_{[m \times n]} = U_{[m \times r]} \, \Sigma_{[r \times r]} \, (V_{[n \times r]})^T$

A: Input data matrix
m x n matrix (e.g., m users, n movies)
U: Left singular vectors
m x r matrix (m users, r concepts)
$\Sigma$: Singular values
r x r diagonal matrix (strength of each concept)
(r: rank of the matrix A)
V: Right singular vectors
n x r matrix (n movies, r concepts)
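The shapes in the definition can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using the 7 x 5 Users-to-Movies ratings matrix that appears later in these slides; the variable names are illustrative only.

```python
import numpy as np

# Ratings matrix from the Users-to-Movies example (7 users x 5 movies).
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# full_matrices=False returns the "thin" SVD: U is m x k, s has k entries,
# Vt is k x n, with k = min(m, n).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r = rank(A) leading singular values, as in the definition.
r = int(np.linalg.matrix_rank(A))
U_r, S_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

print("r =", r)
print("U:", U_r.shape, "Sigma:", S_r.shape, "V^T:", Vt_r.shape)

# The three factors multiply back to A (up to floating-point error).
A_rebuilt = U_r @ S_r @ Vt_r
print("reconstruction ok:", np.allclose(A_rebuilt, A))
```

Note that NumPy returns $V^T$ directly and returns the singular values as a 1-D array rather than as a diagonal matrix.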


SVD

[Figure: the m x n matrix A drawn as the product of U, $\Sigma$, and $V^T$.]


SVD

$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \cdots$

$\sigma_i$: scalar
$u_i$: vector
$v_i$: vector

The matrix A is a sum of matrices, each of which is the outer product of a left singular vector $u_i$ and a right singular vector $v_i$, scaled by $\sigma_i$.
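The sum-of-outer-products view can be verified directly. This is a small sketch, assuming NumPy and reusing the Users-to-Movies ratings matrix from later in the slides; it adds up one rank-1 matrix per singular value and compares the total with A.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Accumulate sigma_i * u_i * v_i^T. Iterating over U.T yields the columns
# of U; iterating over Vt yields the rows of V^T (the vectors v_i).
A_sum = np.zeros_like(A)
for sigma_i, u_i, v_i in zip(s, U.T, Vt):
    A_sum += sigma_i * np.outer(u_i, v_i)

print("sum of outer products equals A:", np.allclose(A_sum, A))
```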
SVD - Properties
It is always possible to decompose a real matrix A into $A = U \Sigma V^T$, where
U, $\Sigma$, V: unique ($\Sigma$ always; U and V up to the signs of paired columns when the singular values are distinct)
U, V: column orthonormal
$U^T U = I$; $V^T V = I$ (I: identity matrix)
(Columns are orthogonal unit vectors)
$\Sigma$: diagonal
Entries (singular values) are non-negative, and sorted in decreasing order ($\sigma_1 \ge \sigma_2 \ge \dots \ge 0$)
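These properties are easy to test on a concrete matrix. The sketch below assumes NumPy and again uses the Users-to-Movies ratings matrix from later in the slides; the flag names are made up for illustration.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Column orthonormality: U^T U = I and V^T V = I.
u_orthonormal = np.allclose(U.T @ U, np.eye(U.shape[1]))
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))

# Singular values are non-negative and sorted in decreasing order.
s_nonnegative = bool(np.all(s >= 0))
s_sorted = bool(np.all(np.diff(s) <= 0))

print(u_orthonormal, v_orthonormal, s_nonnegative, s_sorted)
```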
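SVD Example: Users-to-Movies

$A = U \Sigma V^T$ - example: Users to Movies

Rows of A are the m users; columns are the n movies, in the order Matrix, Alien, Serenity, Casablanca, Amelie.

$$
A = \begin{bmatrix}
1 & 1 & 1 & 0 & 0 \\
3 & 3 & 3 & 0 & 0 \\
4 & 4 & 4 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 2 & 0 & 4 & 4 \\
0 & 0 & 0 & 5 & 5 \\
0 & 1 & 0 & 2 & 2
\end{bmatrix}
= U \, \Sigma \, V^T
$$

The columns of U and V correspond to "concepts", AKA latent dimensions, AKA latent factors.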


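SVD Example: Users-to-Movies

$A = U \Sigma V^T$ - example: Users to Movies

Columns of A: Matrix, Alien, Serenity, Casablanca, Amelie. Rows 1-4 are SciFi users; rows 5-7 are Romance users.

$$
\begin{bmatrix}
1 & 1 & 1 & 0 & 0 \\
3 & 3 & 3 & 0 & 0 \\
4 & 4 & 4 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 2 & 0 & 4 & 4 \\
0 & 0 & 0 & 5 & 5 \\
0 & 1 & 0 & 2 & 2
\end{bmatrix}
=
\begin{bmatrix}
0.13 & 0.02 & -0.01 \\
0.41 & 0.07 & -0.03 \\
0.55 & 0.09 & -0.04 \\
0.68 & 0.11 & -0.05 \\
0.15 & -0.59 & 0.65 \\
0.07 & -0.73 & -0.67 \\
0.07 & -0.29 & 0.32
\end{bmatrix}
\times
\begin{bmatrix}
12.4 & 0 & 0 \\
0 & 9.5 & 0 \\
0 & 0 & 1.3
\end{bmatrix}
\times
\begin{bmatrix}
0.56 & 0.59 & 0.56 & 0.09 & 0.09 \\
0.12 & -0.02 & 0.12 & -0.69 & -0.69 \\
0.40 & -0.80 & 0.40 & 0.09 & 0.09
\end{bmatrix}
$$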
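The factors on this slide can be reproduced with a numerical library. The sketch below assumes NumPy; a library is free to flip the sign of a singular vector pair, and the slide's entries are shown to one or two decimals, so only approximate agreement in magnitude should be expected.

```python
import numpy as np

# Columns: Matrix, Alien, Serenity, Casablanca, Amelie.
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A has rank 3, so only the first three singular values are non-zero.
np.set_printoptions(precision=2, suppress=True)
print("singular values:", s[:3])
print("U (first 3 columns):")
print(U[:, :3])
print("V^T (first 3 rows):")
print(Vt[:3, :])
```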
SVD Example: Users-to-Movies

Reading the three factors of $A = U \Sigma V^T$ in this example:

U is the user-to-concept similarity matrix. Its first column is the SciFi-concept and its second column is the Romance-concept.

$\Sigma$ holds the strength of each concept on its diagonal; 12.4 is the strength of the SciFi-concept.

V is the movie-to-concept similarity matrix. The first row of $V^T$ is the SciFi-concept.
SVD - Interpretation #1

movies, users and concepts:

U: user-to-concept similarity matrix

V: movie-to-concept similarity matrix

$\Sigma$: its diagonal elements give the strength of each concept


SVD Dimensionality Reduction

[Figure: scatter plot of users with axes "Movie 1 rating" and "Movie 2 rating", with the first right singular vector v1 drawn through the points.]


SVD - Interpretation #2

$A = U \Sigma V^T$ - example:

V: movie-to-concept matrix
U: user-to-concept matrix
$\Sigma$: its first diagonal entry (12.4) reflects the variance (spread) on the v1 axis
$U \Sigma$: gives the coordinates of the points in the projection axis

[Figure: scatter plot with axes "Movie 1 rating" and "Movie 2 rating" showing the first right singular vector v1.]

Projection of users on the Sci-Fi axis (first column of $U \Sigma$):

$$
U \Sigma = \begin{bmatrix}
1.61 & 0.19 & -0.01 \\
5.08 & 0.66 & -0.03 \\
6.82 & 0.85 & -0.05 \\
8.43 & 1.04 & -0.06 \\
1.86 & -5.60 & 0.84 \\
0.86 & -6.93 & -0.87 \\
0.86 & -2.75 & 0.41
\end{bmatrix}
$$
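The coordinates $U \Sigma$ can be obtained in two equivalent ways, which the sketch below compares. It assumes NumPy and uses the example ratings matrix; because the slide's numbers come from rounded factors, the computed values may differ in the last digit or in sign.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Scale column i of U by sigma_i; broadcasting makes this equal to U @ diag(s).
coords = U * s

# Projecting each user's ratings onto the right singular vectors gives the
# same coordinates, because A V = U Sigma V^T V = U Sigma.
projected = A @ Vt.T

print("U Sigma equals A V:", np.allclose(coords, projected))
print("coordinate of each user on the first concept axis:")
print(np.round(coords[:, 0], 2))
```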
SVD - Interpretation #2
More details
Q: How exactly is dim. reduction done?
A: Set smallest singular values to zero

In the example the smallest singular value is 1.3. Setting it to zero removes the third column of U and the third row of $V^T$, leaving an approximation of A:

$$
\begin{bmatrix}
1 & 1 & 1 & 0 & 0 \\
3 & 3 & 3 & 0 & 0 \\
4 & 4 & 4 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 2 & 0 & 4 & 4 \\
0 & 0 & 0 & 5 & 5 \\
0 & 1 & 0 & 2 & 2
\end{bmatrix}
\approx
\begin{bmatrix}
0.13 & 0.02 \\
0.41 & 0.07 \\
0.55 & 0.09 \\
0.68 & 0.11 \\
0.15 & -0.59 \\
0.07 & -0.73 \\
0.07 & -0.29
\end{bmatrix}
\times
\begin{bmatrix}
12.4 & 0 \\
0 & 9.5
\end{bmatrix}
\times
\begin{bmatrix}
0.56 & 0.59 & 0.56 & 0.09 & 0.09 \\
0.12 & -0.02 & 0.12 & -0.69 & -0.69
\end{bmatrix}
$$


The resulting approximation B is close to A:

$$
A = \begin{bmatrix}
1 & 1 & 1 & 0 & 0 \\
3 & 3 & 3 & 0 & 0 \\
4 & 4 & 4 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 2 & 0 & 4 & 4 \\
0 & 0 & 0 & 5 & 5 \\
0 & 1 & 0 & 2 & 2
\end{bmatrix}
\approx
B = \begin{bmatrix}
0.92 & 0.95 & 0.92 & 0.01 & 0.01 \\
2.91 & 3.01 & 2.91 & -0.01 & -0.01 \\
3.90 & 4.04 & 3.90 & 0.01 & 0.01 \\
4.82 & 5.00 & 4.82 & 0.03 & 0.03 \\
0.70 & 0.53 & 0.70 & 4.11 & 4.11 \\
-0.69 & 1.34 & -0.69 & 4.78 & 4.78 \\
0.32 & 0.23 & 0.32 & 2.01 & 2.01
\end{bmatrix}
$$

Frobenius norm:
$\|M\|_F = \sqrt{\sum_{ij} M_{ij}^2}$
$\|A - B\|_F = \sqrt{\sum_{ij} (A_{ij} - B_{ij})^2}$ is small
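Zeroing the smallest singular values and measuring the resulting error can be sketched as follows. The example assumes NumPy and keeps k = 2 concepts, as on the slide; it also checks the standard fact that the Frobenius error equals the square root of the sum of the squared singular values that were dropped.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k largest singular values; dropping the rest is the same as
# setting them to zero in Sigma.
k = 2
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius norm of the difference between A and its rank-k approximation.
err = np.linalg.norm(A - B, ord="fro")

# The same number, computed from the discarded singular values alone.
dropped = np.sqrt(np.sum(s[k:] ** 2))

print("rank of B:", np.linalg.matrix_rank(B))
print("||A - B||_F:", round(float(err), 3))
print("matches dropped singular values:", np.isclose(err, dropped))
```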
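SVD Best Low Rank Approx

[Figure: $A = U \Sigma V^T$ drawn as three blocks, and below it $B = U \Sigma V^T$ with the smallest singular values in $\Sigma$ set to zero.]

B is the best approximation of A among matrices of the same rank as B, measured in the Frobenius norm.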


SVD - Conclusions so far
SVD: $A = U \Sigma V^T$: unique

U: user-to-concept similarities

V: movie-to-concept similarities

$\Sigma$: strength of each concept

Dimensionality reduction:

keep the few largest singular values (80-90% of energy)

SVD: picks up linear correlations
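The "80-90% of energy" rule can be turned into a short procedure. The sketch below assumes NumPy and assumes that "energy" means the sum of the squared singular values, which is the usual convention; the 0.90 threshold and the variable names are illustrative choices.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# compute_uv=False returns only the singular values.
s = np.linalg.svd(A, compute_uv=False)

# Assumed definition: energy of a singular value is its square.
energy = s ** 2
cumulative = np.cumsum(energy) / np.sum(energy)

# Smallest k whose leading singular values retain at least 90% of the energy.
threshold = 0.90
k = int(np.searchsorted(cumulative, threshold) + 1)

print("cumulative energy:", np.round(cumulative, 3))
print("number of concepts to keep:", k)
```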


Case study: How to query?
Q: Find users that like Matrix
A: Map query into a concept space - how?

q = [5 0 0 0 0] (columns: Matrix, Alien, Serenity, Casablanca, Amelie)

Project into concept space: inner product with each concept vector $v_i$

[Figure: q drawn on axes "Matrix" and "Alien" together with the concept vectors v1 and v2; the projection of q onto v1 is $q \cdot v_1$.]


Compactly, we have:
$q_{concept} = q V$

E.g.:

$$
q V = \begin{bmatrix} 5 & 0 & 0 & 0 & 0 \end{bmatrix}
\times
\begin{bmatrix}
0.56 & 0.12 \\
0.59 & -0.02 \\
0.56 & 0.12 \\
0.09 & -0.69 \\
0.09 & -0.69
\end{bmatrix}
= \begin{bmatrix} 2.8 & 0.6 \end{bmatrix}
$$

V holds the movie-to-concept similarities; its first column is the SciFi-concept.
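The mapping $q_{concept} = q V$ is a single matrix product. The sketch below assumes NumPy and uses the two-decimal entries of V exactly as printed on the slide, so it reproduces the slide's arithmetic rather than a full-precision SVD.

```python
import numpy as np

# Movie-to-concept matrix V (first two concepts), as printed on the slide.
# Rows: Matrix, Alien, Serenity, Casablanca, Amelie.
V = np.array([
    [0.56, 0.12],
    [0.59, -0.02],
    [0.56, 0.12],
    [0.09, -0.69],
    [0.09, -0.69],
])

# Query user: rated only Matrix, with a 5.
q = np.array([5, 0, 0, 0, 0], dtype=float)

# Inner product of q with each concept vector (each column of V).
q_concept = q @ V

print(np.round(q_concept, 2))
```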


How would the user d that rated (Alien, Serenity) be handled?
$d_{concept} = d V$

E.g.:

$$
d V = \begin{bmatrix} 0 & 4 & 5 & 0 & 0 \end{bmatrix}
\times
\begin{bmatrix}
0.56 & 0.12 \\
0.59 & -0.02 \\
0.56 & 0.12 \\
0.09 & -0.69 \\
0.09 & -0.69
\end{bmatrix}
= \begin{bmatrix} 5.2 & 0.4 \end{bmatrix}
$$

V holds the movie-to-concept similarities; its first column is the SciFi-concept.


Observation: User d that rated (Alien, Serenity) will be similar to user q that rated (Matrix), although d and q have zero ratings in common!

d = [0 4 5 0 0], $d_{concept}$ = [5.2 0.4]
q = [5 0 0 0 0], $q_{concept}$ = [2.8 0.6]

The first coordinate in concept space is the SciFi-concept. Zero ratings in common, yet similarity $\neq 0$.
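The observation can be made concrete by comparing the two users before and after the mapping. The slides do not name a similarity measure, so the sketch below assumes cosine similarity; it also uses the two-decimal V printed on the slide, so the concept-space coordinates can differ slightly from the slide's figures.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors (assumed measure)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Movie-to-concept matrix V (first two concepts), as printed on the slide.
V = np.array([
    [0.56, 0.12],
    [0.59, -0.02],
    [0.56, 0.12],
    [0.09, -0.69],
    [0.09, -0.69],
])

q = np.array([5, 0, 0, 0, 0], dtype=float)  # rated Matrix only
d = np.array([0, 4, 5, 0, 0], dtype=float)  # rated Alien and Serenity

# In the original rating space the users share no rated movie.
sim_ratings = cosine(q, d)

# In concept space both users load mainly on the SciFi-concept.
sim_concepts = cosine(q @ V, d @ V)

print("similarity on raw ratings:", sim_ratings)
print("similarity in concept space:", round(sim_concepts, 3))
```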
SVD: Drawbacks
+ Optimal low-rank approximation
in terms of Frobenius norm
- Interpretability problem:
A singular vector specifies a linear combination
of all input columns or rows
- Lack of sparsity:
Singular vectors are dense!


Example

Consider the following matrix

In order to find U, we have to start with $AA^T$.

The transpose of A is


Example (Contd..)

Compute $AA^T$

Next, we have to find the eigenvalues and corresponding eigenvectors of $AA^T$.
We know that eigenvectors are defined by the equation $M\vec{x} = \lambda\vec{x}$ for a square matrix $M$.


Example (Contd..)

And applying this to $AA^T$ gives us

We rewrite this as the set of equations

And rearrange to get


Example (Contd..)

Solve for $\lambda$ by setting the determinant of the coefficient matrix to zero.

This works out as

This gives us two eigenvalues $\lambda = 10$; $\lambda = 12$.


Example (Contd..)

Substituting these values back into the original equations gives us our eigenvectors.
For $\lambda = 10$, we get

This is true for many values, so we choose $x_1 = 1$ and $x_2 = -1$ since those are small and easy to work with.
Thus, we have the eigenvector [1; -1] corresponding to the eigenvalue $\lambda = 10$.


Example (Contd..)

For $\lambda = 12$, we get

Let us take $x_1 = 1$ and $x_2 = 1$.

For $\lambda = 12$ we have the eigenvector [1; 1].
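The two eigenpairs stated above are enough to work backwards and check each step. As a sketch, assuming only that $\lambda = 12$ has eigenvector [1; 1], that $\lambda = 10$ has eigenvector [1; -1], and that $AA^T$ is symmetric, the matrix and its characteristic equation would be:

```latex
AA^T = 12 \cdot \tfrac{1}{2}
       \begin{bmatrix} 1 \\ 1 \end{bmatrix}
       \begin{bmatrix} 1 & 1 \end{bmatrix}
     + 10 \cdot \tfrac{1}{2}
       \begin{bmatrix} 1 \\ -1 \end{bmatrix}
       \begin{bmatrix} 1 & -1 \end{bmatrix}
     = \begin{bmatrix} 11 & 1 \\ 1 & 11 \end{bmatrix}

\det \begin{bmatrix} 11 - \lambda & 1 \\ 1 & 11 - \lambda \end{bmatrix}
     = (11 - \lambda)^2 - 1
     = \lambda^2 - 22\lambda + 120
     = (\lambda - 10)(\lambda - 12) = 0

\lambda = 10: \quad x_1 + x_2 = 0 \;\Rightarrow\; (x_1, x_2) = (1, -1)

\lambda = 12: \quad -x_1 + x_2 = 0 \;\Rightarrow\; (x_1, x_2) = (1, 1)
```

The factor 1/2 normalizes each eigenvector to unit length before forming its outer product.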


Example (Contd..)

These eigenvectors become column vectors in a matrix, ordered by the size of the corresponding eigenvalue.

The eigenvector for $\lambda = 12$ is column one, and the eigenvector for $\lambda = 10$ is column two:

$$
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
$$


Example (Contd..)

Finally, we have to convert this matrix into an orthogonal matrix, which we do by applying the Gram-Schmidt orthonormalization process to the column vectors.

Begin by normalizing $v_1$.
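The orthonormalization step can be sketched in a few lines. The example assumes NumPy and applies classical Gram-Schmidt to the matrix whose columns are the eigenvectors [1; 1] and [1; -1] stated above; the helper name `gram_schmidt` is made up for this illustration.

```python
import numpy as np

def gram_schmidt(M):
    """Orthonormalize the columns of M, left to right (classical Gram-Schmidt)."""
    Q = np.zeros_like(M, dtype=float)
    for j in range(M.shape[1]):
        w = M[:, j].astype(float)
        # Remove the components along the columns already processed.
        for i in range(j):
            w = w - (Q[:, i] @ M[:, j]) * Q[:, i]
        # Normalize what is left to unit length.
        Q[:, j] = w / np.linalg.norm(w)
    return Q

# Columns: eigenvector for lambda = 12, then eigenvector for lambda = 10.
E = np.array([[1, 1],
              [1, -1]])

U = gram_schmidt(E)

print(np.round(U, 4))
print("columns orthonormal:", np.allclose(U.T @ U, np.eye(2)))
```

Because these two eigenvectors are already orthogonal, Gram-Schmidt only has to rescale each column by $1/\sqrt{2}$.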


Example (Contd..)
The calculation of V is similar. V is based on $A^TA$, so we have

Find the eigenvalues of $A^TA$ by

Solving the equations and converting the result to an orthonormal matrix, as we did for U, gives V; taking the transpose gives $V^T$.
Example (Contd..)

For $\Sigma$ we take the square roots of the non-zero eigenvalues and populate the diagonal with them, putting the largest in $\Sigma_{11}$, the next largest in $\Sigma_{22}$ and so on until the smallest value ends up in $\Sigma_{mm}$.


Example (Contd..)

$A = U \Sigma V^T$
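The whole procedure can be run end to end on a small matrix. The matrix below is an assumption: it is a stand-in chosen so that $AA^T$ has the eigenvalues 12 and 10 used in this example, and it is not taken from the slides. The sketch also swaps one step: rather than solving a second eigenproblem on $A^TA$ as the slides do, it derives each $v_i$ as $A^T u_i / \sigma_i$, which keeps the signs of U and V consistent with each other.

```python
import numpy as np

# ASSUMPTION: stand-in matrix, chosen so that A @ A.T = [[11, 1], [1, 11]],
# whose eigenvalues are 12 and 10. It is not taken from the slides.
A = np.array([[3, 1, 1],
              [-1, 3, 1]], dtype=float)

# U: unit eigenvectors of A A^T. eigh returns eigenvalues in ascending
# order, so reorder them from largest to smallest.
evals, evecs = np.linalg.eigh(A @ A.T)
order = np.argsort(evals)[::-1]
U = evecs[:, order]

# Sigma: square roots of the eigenvalues, largest first.
sigma = np.sqrt(evals[order])

# V: v_i = A^T u_i / sigma_i (swapped in for the eigen-solve on A^T A).
V = (A.T @ U) / sigma

# Check that the three factors multiply back to A.
A_rebuilt = U @ np.diag(sigma) @ V.T

print("eigenvalues of A A^T:", np.round(evals[order], 6))
print("singular values:", np.round(sigma, 4))
print("A = U Sigma V^T:", np.allclose(A_rebuilt, A))
```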


Summary
The singular-value decomposition of a matrix consists of three matrices, U, $\Sigma$, and $V^T$.
The matrices U and V are column-orthonormal, meaning that as vectors, the columns are orthogonal, and their lengths are 1.
The matrix $\Sigma$ is a diagonal matrix, and the values along its diagonal are called singular values.
SVD is useful when there are a small number of concepts that connect the rows and columns of the original matrix.
The matrix U connects rows to concepts, $\Sigma$ represents the strengths of the concepts, and V connects the concepts to columns.