
!"#!

%&'&
()*+,-.)/- 01 !0.*2-), #34)/3)
5/46),74-8 01 9027-0/
:774;/.)/- <= >,0?+?4@47-43 A.+;) #);.)/-+-40/
(2)= %B&CB<'

The goal of this assignment is to apply statistical decision theory to segment an image into two components, foreground and background. Specifically, you will segment the cheetah image into two components: the cheetah (foreground) and the grass (background).



1.) Any descriptor that we could use to describe the observation space would be a function of the image intensity. For this image, a good descriptor is a measure of texture. We will consider a simple texture measure based on the discrete cosine transform (DCT). We take our observation space to be 8x8 image blocks, i.e. we view each image as a collection of 8x8 blocks. For each block we compute the discrete cosine transform (function dct2 in MATLAB) and obtain an array of 8x8 frequency coefficients. For computational ease, we can convert the 2D array of coefficients into a 1D vector of 64 coefficients. The file CoefficientPattern.txt provides the position of each coefficient in the 8x8 array. Further, the file TrainingSamplesPart1.mat contains features for each class computed from a similar image. The features for the classes are stored in two separate matrices, TrainsampleDCT_BG and TrainsampleDCT_FG, for background and foreground respectively.
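As a concrete sketch of this extraction step (in MATLAB, assuming non-overlapping blocks and that CoefficientPattern.txt stores, for each position of the 8x8 array, its zig-zag index from 0 to 63):

    % Sketch: one 64-dimensional zig-zag DCT vector per 8x8 block.
    img = im2double(imread('cheetah.bmp'));   % grayscale, values in [0,1]
    zz  = load('CoefficientPattern.txt') + 1; % zig-zag indices, now 1-based
    [rows, cols] = size(img);
    nR = floor(rows/8); nC = floor(cols/8);
    features = zeros(nR*nC, 64);              % one row per block
    idx = 0;
    for r = 1:8:8*nR
        for c = 1:8:8*nC
            D = dct2(img(r:r+7, c:c+7));      % 8x8 DCT coefficients
            v = zeros(1, 64);
            v(zz(:)) = D(:);                  % reorder into zig-zag order
            idx = idx + 1;
            features(idx, :) = v;
        end
    end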

The next step is to estimate the class-conditional densities. To make this task easier, we are going to reduce the feature vector to a single scalar. To do so, compute the position of the coefficient with the 2nd largest energy value (absolute value) in the vector. This will serve as our observation x. Given the training samples, build a histogram of these indexes to obtain the class conditionals for the two classes, P_X|Y(x|cheetah) and P_X|Y(x|grass). The priors P_Y(cheetah) and P_Y(grass) should also be estimated from the training set.
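A minimal sketch of the feature computation for one such 64-dimensional vector v (the helper name is mine, not part of the handout); since the DC coefficient usually carries the largest energy, the feature is effectively the runner-up index:

    % Sketch: index (1..64) of the coefficient with the 2nd largest
    % absolute value. Save as secondLargestIndex.m or place at the
    % end of a script.
    function x = secondLargestIndex(v)
        [~, order] = sort(abs(v), 'descend');
        x = order(2);
    end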

a) Using the training data in TrainingSamplesPart1.mat, what are reasonable estimates for the prior probabilities?
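One reasonable estimate, sketched below, is the relative frequency of each class among the training samples (assuming the rows of the two matrices are representative draws):

    load('TrainingSamplesPart1.mat');    % TrainsampleDCT_FG, TrainsampleDCT_BG
    nFG = size(TrainsampleDCT_FG, 1);
    nBG = size(TrainsampleDCT_BG, 1);
    priorFG = nFG / (nFG + nBG);         % estimate of P_Y(cheetah)
    priorBG = nBG / (nFG + nBG);         % estimate of P_Y(grass)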

b) Using the training data in TrainingSamplesPart1.mat, compute and plot the histograms P_X|Y(x|cheetah) and P_X|Y(x|grass).
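A sketch of the histogram estimate, reusing the secondLargestIndex helper from above (shown for the foreground; the background is analogous):

    % Sketch: class-conditional histogram over the 64 possible indexes.
    histFG = zeros(1, 64);
    for i = 1:size(TrainsampleDCT_FG, 1)
        x = secondLargestIndex(TrainsampleDCT_FG(i, :));
        histFG(x) = histFG(x) + 1;
    end
    histFG = histFG / sum(histFG);       % normalized: P_X|Y(x|cheetah)
    bar(1:64, histFG);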

c) For each block in the image cheetah.bmp, compute the feature x. Compute the posterior probability P(Y|x) and assign a class label to the state variable Y using the minimum probability of error rule, based on the probabilities obtained in a) and b). Store the state in an array A. Using the commands imagesc and colormap(gray(255)), create a picture of that array.
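Putting the pieces above together, a sketch of the decision step (histBG is the background histogram from b); the block layout follows the extraction sketch in part 1):

    % Sketch: minimum-probability-of-error (MAP) rule per 8x8 block:
    % decide cheetah when P_X|Y(x|cheetah)*P_Y(cheetah) exceeds
    % P_X|Y(x|grass)*P_Y(grass).
    A = zeros(nR, nC);
    for r = 1:8:8*nR
        for c = 1:8:8*nC
            D = dct2(img(r:r+7, c:c+7));
            v = zeros(1, 64);  v(zz(:)) = D(:);
            x = secondLargestIndex(v);
            if histFG(x) * priorFG > histBG(x) * priorBG
                A((r+7)/8, (c+7)/8) = 1;   % cheetah
            end
        end
    end
    imagesc(A); colormap(gray(255));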

d) The array A contains a mask that indicates which blocks contain grass and which contain the cheetah. Compare it with the ground truth provided in the image cheetah_mask.bmp and compute the probability of error of your algorithm.
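A sketch of the comparison; since A is block-level while the mask is pixel-level, one plausible choice (an assumption, not specified in the handout) is to take the majority label of each 8x8 mask block as the block-level ground truth:

    % Sketch: probability of error against the ground-truth mask.
    truth = im2double(imread('cheetah_mask.bmp'));   % values 0 or 1
    T = zeros(nR, nC);
    for r = 1:nR
        for c = 1:nC
            blk = truth(8*r-7:8*r, 8*c-7:8*c);
            T(r, c) = mean(blk(:)) > 0.5;            % majority vote
        end
    end
    pError = mean(A(:) ~= T(:));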

Notes:

in TrainingSamplesPart1.mat each matrix row contains the zig-zag scanned vector of the absolute values of the coefficients. So, for a) and b) you do not need to compute the DCT, etc.

in CoefficientPattern.txt the zig-zag pattern goes from 0 to 63, not 1 to 64. Please keep this in mind, as MATLAB indexing starts at 1. (You can simply add 1 to the numbers in the file to get the MATLAB coordinates.)



2.) In part 2, we are going to assume that the class-conditional densities are multivariate
Gaussians of 64 dimensions.

Note: The training examples we used in part 1 contained the absolute value of the DCT
coefficients instead of the coefficients themselves. Please use the file
TrainingSamplesPart2.mat in this part of the assignment.

a) Using the training data in TrainingSamplesPart2.mat, compute the histogram estimate of the prior P_Y(i), i ∈ {cheetah, grass}. Compute the maximum likelihood estimate for the prior probabilities. Compare the result with the estimates that you obtained in part 1. If they are the same, interpret what you did in part 1. If they are different, explain the differences.
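For reference, the ML estimate of the prior under the natural Bernoulli model of the class labels (a standard derivation, not something supplied by the handout) is the sample frequency, which coincides with the histogram estimate:

    L(\pi) = \pi^{n_1} (1 - \pi)^{n_2}, \qquad
    \frac{d}{d\pi} \log L(\pi) = \frac{n_1}{\pi} - \frac{n_2}{1 - \pi} = 0
    \quad \Rightarrow \quad \hat{\pi}_{ML} = \frac{n_1}{n_1 + n_2}

where n_1 and n_2 are the numbers of cheetah and grass training samples.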

b) Using the training data in TrainingSamplesPart2.mat, compute the maximum likelihood estimates for the parameters of the class-conditional densities P_X|Y(x|cheetah) and P_X|Y(x|grass) under the Gaussian assumption. Denoting by x = {X1, . . . , X64} the vector of DCT coefficients, create 64 plots with the marginal densities for the two classes - P_Xk|Y(xk|cheetah) and P_Xk|Y(xk|grass), k = 1, . . . , 64 - on each. Use different line styles for each marginal. Select, by visual inspection, what you think are the best 8 features for classification purposes and what you think are the worst 8 features (you can use the subplot command to compare several plots at a time). Hand in the plots of the marginal densities for the best 8 and worst 8 features (once again you can use subplot; this should not require more than two sheets of paper). In each subplot, indicate the feature that it refers to.
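A sketch of the estimation and of one marginal plot (foreground shown; overlay the grass marginal with a different line style). Note that cov(X, 1) normalizes by N rather than N-1, which is the ML estimate:

    % Sketch: ML parameter estimates and the k-th marginal density.
    muFG    = mean(TrainsampleDCT_FG);         % 1x64 sample mean
    SigmaFG = cov(TrainsampleDCT_FG, 1);       % 64x64 ML covariance
    k = 1;                                     % feature index, 1..64
    s = sqrt(SigmaFG(k, k));
    t = linspace(muFG(k) - 4*s, muFG(k) + 4*s, 200);
    p = exp(-(t - muFG(k)).^2 / (2*s^2)) / (s*sqrt(2*pi));  % Gaussian pdf
    plot(t, p, '-'); hold on;                  % add grass marginal with '--'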

c) Compute the Bayesian decision rule and classify the locations of the cheetah image using i)
the 64-dimensional Gaussians, and ii) the 8-dimensional Gaussians associated with the best 8
features. For the two cases, plot the classification masks and compute the probability of error by
comparing with cheetah_mask.bmp. Can you explain the results?
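A sketch of the per-class discriminant behind the decision rule; the class with the larger g wins. The constant -(n/2)log(2*pi) is dropped since it is common to both classes, and the Cholesky factorization keeps the 64-dimensional case numerically stable:

    % Sketch: Gaussian log-discriminant g_i(x) = log P_X|Y(x|i) + log P_Y(i).
    function g = gaussDiscriminant(x, mu, Sigma, prior)
        R = chol(Sigma);                 % Sigma = R' * R
        z = R' \ (x(:) - mu(:));         % whitened difference
        g = -0.5 * (z' * z) ...          % Mahalanobis distance term
            - sum(log(diag(R))) ...      % equals -0.5 * log det(Sigma)
            + log(prior);
    end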

d) Compute the Bayesian decision rule and classify the locations of the cheetah image assuming i) the 64 features are independent and can each be modeled by a univariate Gaussian, ii) the 8 best features are independent and can each be modeled by a univariate Gaussian, and iii) the 8 worst features are independent and can each be modeled by a univariate Gaussian. For the three cases, plot the classification masks and compute the probability of error by comparing with cheetah_mask.bmp. Can you explain the results?
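Under the independence assumption the joint density factors into univariate Gaussians, so the discriminant reduces to a sum of per-feature terms (equivalently, the previous sketch with a diagonal covariance):

    % Sketch: naive (independent-feature) Gaussian log-discriminant.
    % vars holds the per-feature ML variances for the class.
    function g = naiveDiscriminant(x, mu, vars, prior)
        d = x(:) - mu(:);
        g = sum(-0.5 * d.^2 ./ vars(:) - 0.5 * log(2*pi*vars(:))) ...
            + log(prior);
    end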
