
PRESENTED BY:

SADIA KHAN
HINA SALEEM
SIDRA KHAN
MADIHA BIBI


FACES
Faces are integral to human interaction
Manual facial recognition is already used
in everyday authentication applications
ID card systems (passports, health cards, and
driver's licenses)
Booking stations
Surveillance operations


Facial recognition technology automates the recognition
of faces using one of two modeling approaches:
Face appearance
2D Eigenfaces
3D Morphable Model
Face geometry
3D Expression Invariant Recognition
2D Eigenface
Principal Component Analysis (PCA)
3D Face Recognition
3D Expression Invariant Recognition
3D Morphable Model


Facial Recognition
ICA finds the directions of
maximum independence
Facial Recognition: Eigenface
Decompose face
images into a small
set of characteristic
feature images.
A new face is
compared to these
stored images.
A match is found if
the new face is close
to one of these
images.


Create training set of faces and calculate the
eigenfaces
Project the new image onto the eigenfaces.
Check if image is close to face space.
Check closeness to one of the known faces.
Add unknown faces to the training set and
recalculate
Facial Recognition: PCA - Overview
Facial Recognition: PCA Training Set
Facial Recognition: PCA Training
Find average of
training images.
Subtract average face
from each image.
Create covariance
matrix
Generate eigenfaces
Each original image
can be expressed as a
linear combination of
the eigenfaces, which
span the face space
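The training steps above can be sketched in NumPy as follows. This is a minimal illustration, not the presenters' implementation: the function name `train_eigenfaces` and the random 16-pixel "faces" are assumptions made for the example, and the eigenvectors of the covariance matrix are obtained from an SVD of the mean-subtracted images rather than by forming the covariance matrix explicitly (the two give the same eigenfaces).

```python
import numpy as np

def train_eigenfaces(images, num_components):
    """images: (n_images, n_pixels) array, one flattened face per row."""
    # Find the average of the training images and subtract it from each image.
    mean_face = images.mean(axis=0)
    centered = images - mean_face
    # The rows of vt are the eigenvectors of the covariance matrix of the
    # centered images, ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:num_components]
    # Weights of each training image in the face space.
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights

# Hypothetical stand-in data: 6 random "images" of 16 pixels each.
rng = np.random.default_rng(0)
faces = rng.random((6, 16))
mean_face, eigenfaces, weights = train_eigenfaces(faces, num_components=5)

# Each original image is the mean face plus a linear combination of eigenfaces.
reconstructed = mean_face + weights @ eigenfaces
```

With six images, the mean-subtracted data has rank at most five, so five eigenfaces are enough to reconstruct every training image in this toy setting.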

A new image is projected into the face space.
Create a vector of weights that describes this image.
The distance between these weights and the weights of
each known face is computed.
If it is within a certain threshold, then it is a recognized
face.
Facial Recognition: PCA Recognition
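A minimal sketch of the recognition step, assuming Euclidean distances and two hand-picked thresholds. The function `recognize`, the threshold values, and the synthetic data are illustrative assumptions, not taken from the slides. It first checks whether the image is close to the face space, then whether its weights are close to those of a known face.

```python
import numpy as np

def recognize(image, mean_face, eigenfaces, known_weights,
              face_threshold, match_threshold):
    """Classify a flattened image as a known face, an unknown face, or not a face."""
    centered = image - mean_face
    # Project the new image onto the eigenfaces to get its weight vector.
    weights = eigenfaces @ centered
    # Distance between the image and its reconstruction from the face space.
    reconstruction = eigenfaces.T @ weights
    if np.linalg.norm(centered - reconstruction) > face_threshold:
        return "not a face", None
    # Distance between the weight vector and each known face's weights.
    distances = np.linalg.norm(known_weights - weights, axis=1)
    best = int(np.argmin(distances))
    if distances[best] > match_threshold:
        return "unknown face", None
    return "known face", best

# Hypothetical training data: 5 random "images" of 16 pixels each.
rng = np.random.default_rng(0)
faces = rng.random((5, 16))
mean_face = faces.mean(axis=0)
centered_faces = faces - mean_face
_, _, vt = np.linalg.svd(centered_faces, full_matrices=False)
eigenfaces = vt[:4]                      # rank of 5 centered images is at most 4
known_weights = centered_faces @ eigenfaces.T

result_known = recognize(faces[2], mean_face, eigenfaces, known_weights, 1.0, 0.5)
result_unknown = recognize(mean_face + 5.0 * eigenfaces[0], mean_face,
                           eigenfaces, known_weights, 1.0, 0.5)
result_not_face = recognize(np.full(16, 5.0), mean_face, eigenfaces,
                            known_weights, 1.0, 0.5)
```

In practice the two thresholds would be tuned on held-out images; the values here only serve the toy data.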

Independent component analysis (ICA) is a method for
finding underlying factors or components from multivariate
(multi-dimensional) statistical data. What distinguishes ICA
from other methods is that it looks for components that are
both statistically independent and non-Gaussian.
A. Hyvärinen, J. Karhunen, E. Oja
Independent Component Analysis
What is ICA?

Blind Signal Separation (BSS) or Independent Component Analysis (ICA) is the
identification & separation of mixtures of sources with little prior
information.
Applications include:

Audio Processing
Medical data
Finance
Array processing (beamforming)
Coding
and most applications where Factor Analysis and PCA are currently used.
While PCA seeks directions that represent the data best in a $\|x_0 - x\|^2$ (least-squares) sense,
ICA seeks directions that are most independent from each other.
Often used on Time Series separation of Multiple Targets
ICA

Principle 1: Nonlinear decorrelation. Find the
matrix $W$ so that for any $i \neq j$, the components $y_i$ and $y_j$
are uncorrelated, and the transformed components
$g(y_i)$ and $h(y_j)$ are uncorrelated, where $g$ and $h$ are
some suitable nonlinear functions.
Principle 2: Maximum nongaussianity. Find the
local maxima of nongaussianity of a linear
combination $y = Wx$ under the constraint that the
variance of $y$ is constant.
Each local maximum gives one independent
component.
ICA estimation principles


Given a set of observations of random variables $x_1(t), x_2(t), \ldots, x_n(t)$,
where $t$ is the time or sample index, assume
that they are generated as a linear mixture of independent
components $y_i(t)$, so that $y = Wx$, where $W$ is some unknown matrix.
Independent component analysis now consists of
estimating both the matrix $W$ and the $y_i(t)$, when we only
observe the $x_i(t)$.
ICA mathematical approach

ICA Principle (Non-Gaussian is Independent)
Key to estimating A is non-Gaussianity
The distribution of a sum of independent random variables tends toward a Gaussian
distribution (by the central limit theorem).





[Figure: densities $f(s_1)$ and $f(s_2)$ of two independent sources, and the density of their sum, $f(x_1) = f(s_1 + s_2)$]
$y = w^T x = w^T A s = z^T s$
where $w$ is one of the rows of the matrix $W$, and $z = A^T w$.
$y$ is a linear combination of the $s_i$, with weights given by $z_i$.
Since a sum of two independent random variables is more Gaussian than either variable alone, $z^T s$ is more Gaussian
than any of the $s_i$, and becomes least Gaussian when it equals one of the $s_i$.
So we can take $w$ as a vector that maximizes the non-Gaussianity of $w^T x$.
Such a $w$ corresponds to a $z$ with only one nonzero component, so we get back one of the $s_i$.
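The argument above can be checked numerically. The sketch below is an illustration under stated assumptions: two independent uniform sources, the absolute excess kurtosis as the non-Gaussianity measure, and a unit-norm weight vector $z = (\cos\theta, \sin\theta)$. It scans the angle $\theta$ and records where $z^T s$ is least and most Gaussian.

```python
import numpy as np

def excess_kurtosis(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2 for a centered signal; zero for a Gaussian."""
    y = y - y.mean()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

rng = np.random.default_rng(0)
n = 200_000
# Two independent, zero-mean, unit-variance uniform sources (assumed for the demo).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))

# y = z^T s with z = (cos(theta), sin(theta)), scanned from 0 to 90 degrees.
angles = np.linspace(0.0, np.pi / 2, 91)
kurt = np.array([excess_kurtosis(np.cos(a) * s[0] + np.sin(a) * s[1])
                 for a in angles])

least_gaussian_deg = np.degrees(angles[np.argmax(np.abs(kurt))])
most_gaussian_deg = np.degrees(angles[np.argmin(np.abs(kurt))])
```

The least Gaussian combination should sit near 0 or 90 degrees, where $z$ has a single nonzero component, and the most Gaussian one near 45 degrees, where both sources contribute equally.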
We need a quantitative measure of non-Gaussianity for ICA estimation.
Kurtosis: Gaussian = 0 (sensitive to outliers)
$\mathrm{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2$

Entropy: Gaussian = largest
$H(y) = -\int f(y) \log f(y)\, dy$

Negentropy: Gaussian = 0 (difficult to estimate)
$J(y) = H(y_{gauss}) - H(y)$

Approximations
$J(y) \approx \frac{1}{12} E\{y^3\}^2 + \frac{1}{48} \mathrm{kurt}(y)^2$
$J(y) \propto \left[ E\{G(y)\} - E\{G(v)\} \right]^2$
where $v$ is a standard Gaussian random variable and:
$G(y) = \frac{1}{a} \log \cosh(a y)$
$G(y) = -\exp(-a y^2 / 2)$
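A small numerical sketch of two of these measures on synthetic signals. The assumptions are: unit-variance test signals, the log cosh contrast with $a = 1$, and a Monte Carlo estimate of $E\{G(v)\}$ from a standard Gaussian sample (the function names are illustrative). The negentropy value is only defined up to a positive constant factor.

```python
import numpy as np

def kurtosis(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2, assuming y is zero-mean."""
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

def negentropy_logcosh(y, reference, a=1.0):
    """[E{G(y)} - E{G(v)}]^2 with G(u) = (1/a) log cosh(a u).

    `reference` is a sample of the standard Gaussian variable v.
    """
    def G(u):
        return np.log(np.cosh(a * u)) / a
    return (np.mean(G(y)) - np.mean(G(reference))) ** 2

rng = np.random.default_rng(1)
n = 200_000
v = rng.standard_normal(n)                            # Gaussian reference sample
gaussian = rng.standard_normal(n)
laplace = rng.laplace(scale=1 / np.sqrt(2), size=n)   # unit variance, heavy tails
uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), n)     # unit variance, flat

for name, y in [("gaussian", gaussian), ("laplace", laplace), ("uniform", uniform)]:
    print(name, kurtosis(y), negentropy_logcosh(y, v))
```

Both measures should come out close to zero for the Gaussian sample and clearly away from zero for the other two, with positive kurtosis for the Laplace signal and negative kurtosis for the uniform one.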

Centering
$x \leftarrow x - E\{x\}$
This does not mean that ICA cannot estimate the mean; it just simplifies
the algorithm.
The ICs are then also zero-mean, because
$E\{s\} = W E\{x\}$
After ICA, add $W E\{x\}$ back to the zero-mean ICs.
Whitening
We transform the $x$ linearly so that the $\tilde{x}$ are white. This is done by eigenvalue decomposition (EVD):
$\tilde{x} = (E D^{-1/2} E^T) x = E D^{-1/2} E^T A s = \tilde{A} s$
where $E\{x x^T\} = E D E^T$
So we only have to estimate the orthonormal matrix $\tilde{A}$.
An orthonormal matrix has $n(n-1)/2$ degrees of freedom, so for large dimensions we
have to estimate only about half as many parameters as for an arbitrary $n \times n$ matrix $A$. This greatly simplifies ICA.
Reducing the dimension of the data (keeping only the dominant eigenvalues) while whitening also
helps.
Data Centering & Whitening
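A minimal NumPy sketch of centering and EVD-based whitening, assuming the signals are stored one per row. The function name `center_and_whiten`, the mixing matrix, and the offsets are made up for the illustration.

```python
import numpy as np

def center_and_whiten(x):
    """x: (n_signals, n_samples). Returns whitened data, the transform V, and the mean."""
    # Centering: subtract the mean of each signal.
    mean = x.mean(axis=1, keepdims=True)
    centered = x - mean
    # Eigenvalue decomposition of the covariance matrix: E{x x^T} = E D E^T.
    cov = centered @ centered.T / centered.shape[1]
    eigvals, E = np.linalg.eigh(cov)
    # Whitening transform V = E D^{-1/2} E^T.
    V = E @ np.diag(eigvals ** -0.5) @ E.T
    return V @ centered, V, mean

# Hypothetical mixture: two independent uniform sources, mixed and offset.
rng = np.random.default_rng(2)
s = rng.uniform(-1.0, 1.0, size=(2, 10_000))
A = np.array([[2.0, 1.0],
              [1.0, 1.5]])
x = A @ s + np.array([[3.0], [-1.0]])

x_white, V, mean = center_and_whiten(x)
cov_white = x_white @ x_white.T / x_white.shape[1]   # should be the identity
```

After this step the remaining unknown is only a rotation, which is what the later ICA iterations search for.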

0) Centring = make the signals centred at zero
$x_i \leftarrow x_i - E[x_i]$ for each $i$

1) Sphering = make the signals uncorrelated, i.e. apply a transform $V$ to $x$
such that $\mathrm{Cov}(Vx) = I$, where $\mathrm{Cov}(y) = E[y y^T]$ denotes the covariance matrix
$V = E[x x^T]^{-1/2}$ (can be computed using the sqrtm function in MATLAB)
$x \leftarrow Vx$ for all $t$ (indexes $t$ dropped here)
Bold lowercase refers to a column vector; bold uppercase to a matrix.

Scope: to make the remaining computations simpler. It is known that
independent variables must be uncorrelated so this can be fulfilled
before proceeding to the full ICA
Computing the pre-processing steps for ICA
Computing the rotation step
Fixed Point Algorithm
Input: X
Random init of W
Iterate until convergence:
$S = W^T X$
$W = X g(S)^T$
$W = W (W^T W)^{-1/2}$
Output: W, S

This is based on the maximisation of an
objective function G(.) which contains an
approximate non-Gaussianity measure:
$Obj(W) = \sum_{t=1}^{T} G(W^T x_t) - \Lambda (W^T W - I)$
$\frac{\partial Obj}{\partial W} = X g(W^T X)^T - \Lambda W = 0$
where g(.) is the derivative of G(.),
W is the rotation transform sought, and
$\Lambda$ is the Lagrange multiplier that enforces that
W is an orthogonal transform, i.e. a rotation.
Solve by fixed point iterations.
The effect of $\Lambda$ is an orthogonal de-correlation.
The overall transform
to take X back to S is then $(W^T V)$.
There are several g(.)
options; each works best
in special cases. See the FastICA
software / tutorial for details.
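The rotation step can be sketched as a symmetric fixed-point iteration. This is an illustration under stated assumptions rather than the slides' exact procedure: it uses $g = \tanh$ (the derivative of $G(u) = \log\cosh u$), and it uses the standard FastICA update $w \leftarrow E\{x\, g(w^T x)\} - E\{g'(w^T x)\}\, w$, which adds a correction term to the abbreviated update $W = X g(S)^T$. The names `fast_ica` and `sym_decorrelate` and the test mixture are made up for the example.

```python
import numpy as np

def sym_decorrelate(W):
    """Orthogonalize the columns: W <- W (W^T W)^{-1/2}."""
    eigvals, E = np.linalg.eigh(W.T @ W)
    return W @ E @ np.diag(eigvals ** -0.5) @ E.T

def fast_ica(x_white, max_iter=200, tol=1e-8, seed=0):
    """x_white: whitened data (n, T). Columns of W are the unmixing vectors."""
    n, T = x_white.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))   # random orthogonal init
    for _ in range(max_iter):
        S = W.T @ x_white
        g = np.tanh(S)
        g_prime = 1.0 - g**2
        # Column i: E{x g(w_i^T x)} - E{g'(w_i^T x)} w_i
        W_new = x_white @ g.T / T - W @ np.diag(g_prime.mean(axis=1))
        W_new = sym_decorrelate(W_new)
        # Converged when every column is unchanged up to sign.
        done = np.max(np.abs(np.abs(np.sum(W_new * W, axis=0)) - 1.0)) < tol
        W = W_new
        if done:
            break
    return W, W.T @ x_white

# Hypothetical test mixture: one uniform and one Laplace source.
rng = np.random.default_rng(0)
T = 10_000
s = np.vstack([rng.uniform(-1.0, 1.0, T), rng.laplace(size=T)])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
x = A @ s
x = x - x.mean(axis=1, keepdims=True)                  # centering
d, E = np.linalg.eigh(x @ x.T / T)
V = E @ np.diag(d ** -0.5) @ E.T                       # sphering transform
W, S = fast_ica(V @ x)

# Absolute correlation between each estimated component and each true source.
corr = np.abs(np.corrcoef(S, s)[:2, 2:])
```

The sources are recovered only up to order and sign, which is why the check uses absolute correlations. The overall unmixing transform in this sketch is `W.T @ V`.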
Two architectures for performing ICA on images. (a) Architecture I for
finding statistically independent basis images. Performing source
separation on the face images produced IC images in the rows of U. (b)
The gray values at pixel location i are plotted for each face image. ICA in
architecture I finds weight vectors in the directions of statistical
dependencies among the pixel locations. (c) Architecture II for finding a
factorial code. Performing source separation on the pixels produced a
factorial code in the columns of the output matrix, U. (d) Each face
image is plotted according to the gray values taken on at each pixel
location. ICA in architecture II finds weight vectors in the directions of
statistical dependencies among the face images