IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 11, NOVEMBER 2015

Dictionary Pair Learning on Grassmann Manifolds for Image Denoising

Xianhua Zeng, Wei Bian, Member, IEEE, Wei Liu, Member, IEEE, Jialie Shen, and Dacheng Tao, Fellow, IEEE

Abstract: Image denoising is a fundamental problem in computer vision and image processing that holds considerable practical importance for real-world applications. The traditional patch-based and sparse coding-driven image denoising methods convert 2D image patches into 1D vectors for further processing. Thus, these methods inevitably break down the inherent 2D geometric structure of natural images. To overcome this limitation of the previous image denoising methods, we propose a 2D image denoising model, namely, the dictionary pair learning (DPL) model, and we design a corresponding algorithm called the DPL on the Grassmann-manifold (DPLG) algorithm. The DPLG algorithm first learns an initial dictionary pair (i.e., the left and right dictionaries) by employing a subspace partition technique on the Grassmann manifold, wherein the refined dictionary pair is obtained through sub-dictionary pair merging. The DPLG obtains a sparse representation by encoding each image patch only with the selected sub-dictionary pair. The non-zero elements of the sparse representation are further smoothed by the graph Laplacian operator to remove the noise. Consequently, the DPLG algorithm not only preserves the inherent 2D geometric structure of natural images but also performs manifold smoothing in the 2D sparse coding space. We demonstrate that the DPLG algorithm improves the Structural SIMilarity values, reflecting the perceptual visual quality of denoised images, in experimental evaluations on the benchmark images and the Berkeley segmentation data sets. Moreover, the DPLG also produces peak signal-to-noise ratio values that are competitive with popular image denoising algorithms.

Index Terms: Image denoising, dictionary pair, 2D sparse coding, Grassmann manifold, smoothing, graph Laplacian operator.

Manuscript received December 2, 2014; revised May 18, 2015 and July 3, 2015; accepted August 4, 2015. Date of publication August 13, 2015; date of current version August 31, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 61075019 and Grant 61379114, in part by the State Key Program of the National Natural Science Foundation of China under Grant U1401252, in part by the Chongqing Natural Science Foundation under Grant cstc2015jcyjA40036, and in part by the Australian Research Council under Project FT-130101457, Project DP-140102164, and Project LP-140100569. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Nilanjan Ray.

X. Zeng is with the Chongqing Key Laboratory of Computational Intelligence, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China (e-mail: xianhuazeng@gmail.com).

W. Bian and D. Tao are with the Centre for Quantum Computation and Intelligent Systems, and the Faculty of Engineering and Information Technology, University of Technology, Sydney, 81 Broadway Street, Ultimo, NSW 2007, Australia (e-mail: wei.bian@uts.edu.au; dacheng.tao@uts.edu.au).

W. Liu is with the School of Electronic Engineering, Xidian University, Xi'an 710071, China (e-mail: wliu.cu@gmail.com).

J. Shen is with the School of Information Systems, Singapore Management University, Singapore 178902 (e-mail: jlshen@smu.edu.sg).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2015.2468172

1057-7149 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

An image is usually corrupted by noise during the processes of being captured, recorded and transmitted. One general assumption is that an observed noisy image x is generated by adding a Gaussian noise corruption to the original clear image y, that is,

$$x = y + v \qquad (1)$$

where v is additive white Gaussian noise with a mean of zero and a standard deviation σ.
Image denoising plays an important role in the fields of computer vision [1], [2] and image processing [3], [4]. Its goal is to restore the original clear image y from the observed noisy image x, which amounts to finding an inverse transformation from the noisy image to the original clear image. Over the past decades, many denoising methods have been proposed for reconstructing the original image from the observed noisy image by exploiting the inherent spatial correlations [5]–[12]. Image denoising methods are generally divided into three categories: (i) internal denoising methods (e.g., BM3D [10], K-SVD [11], NCSR [12]), which use only the noisy image patches from a single noisy image; (ii) external denoising methods (e.g., SSDA [13], SDAE [14]), which train the mapping from noisy images to clean images using only external clean image patches; and (iii) internal-external denoising methods (e.g., SCLW [15], NSCDL [16]), which jointly use the external statistics information from a clean training image set and the internal statistics information from the observed noisy image. To the best of our knowledge, among these methods, BM3D [10] is considered to be the current state of the art in the image denoising area over the past several years. BM3D combines two classical techniques, non-local similarity and domain transformation. However, BM3D is a complex engineering method and has many tunable parameters, such as the choices of bases, patch size, transformation thresholds, and similarity measures.

In recent years, machine learning techniques based on domain transformation have gained popularity and success in terms of good denoising performance [11], [12], [14]–[16]. For example, K-SVD [11] is one of the most well-known and effective denoising methods that apply machine learning techniques. This method assumes that a clear image patch can be represented as a sparse linear combination of the atoms from an over-complete dictionary. Hence, the K-SVD method denoises a noisy image by approximating the noisy patch using a sparse linear combination of atoms, which is formulated as
minimizing the following objective function:

$$\arg\min_{D,\alpha_i} \sum_i \big\{ \|D\alpha_i - X_i\|_2 + \lambda \|\alpha_i\|_1 \big\} \qquad (2)$$

where D is an over-complete dictionary in which each column corresponds to an atom, and α_i is the sparse coding coefficient combination of all atoms for reconstructing the clean image patch from the noisy image patch X_i under the convex sparse prior regularization constraint ‖·‖_1.

However, the above dictionary D is not easy to learn, and the corresponding denoising model uses a 1D vector, rather than the original 2D matrix, to represent each image patch. Additionally, on the K-SVD basis, several effective, adaptive denoising methods, such as [11], [12], and [17]–[19], were also proposed in the theme of converting image patches into 1D vectors and clustering noisy image patches into regions with similar geometric structures. Taking the NCSR algorithm [12] as a classical example, it unifies both priors in image local sparsity and non-local similarity via a clustering-based sparse representation. The NCSR algorithm incorporates considerable prior information to improve the denoising performance through introducing sparse coding noise (i.e., the third regularization term of the following model, which is an extension of the model in Eq. (2)):

$$\arg\min_{D,\alpha_i} \sum_i \big\{ \|D\alpha_i - X_i\|_2 + \lambda \|\alpha_i\|_1 + \gamma \|\alpha_i - \beta_i\|_1 \big\} \qquad (3)$$

where β_i is a good estimation of the sparse codes α_i, and λ and γ are the balance factors of the two regularization terms (i.e., the convex sparse regularization term and the sparse coding noise term).

In the NCSR model, while enforcing the sparsity of coding coefficients, the sparse codes α_i are also centralized to attain good estimations β_i. Dictionary D is acquired by adopting an adaptive sparse domain selection strategy, which executes K-means clustering and then learns a PCA sub-dictionary for each cluster. Nevertheless, this strategy still needs to convert the noisy image patches into 1D vectors, so good estimations β_i are difficult to obtain.

To summarize, almost all patch-based and sparse coding-driven image denoising methods convert raw 2D matrix representations of image patches into 1D vectors for further processing, and thereby break down the inherent 2D geometric structure of natural images. Moreover, the learned dictionary and sparse coding representations cannot capture the intrinsic position correlations between the pixels within each image patch. On the one hand, to preserve the 2D geometric structure of image patches in the transformation domain, a bilinear transformation is particularly appropriate (for image patches in the matrix representation) for extracting the semantic features of the rows and columns from the image matrixes [20]; this is similar to 2DPCA [21] along two directions and can also be viewed as a special case of some existing tensor feature extraction methods such as TDCS [22], STDCS [23] and HOSVD [24]. On the other hand, we assume that image patches sampled from a denoised image lie on an intrinsic smooth manifold. However, the noisy image patches almost never exactly lie on the same manifold due to noise. A related work [26] shows that manifold smoothing is a usual trick for effectively removing the noise. The weighted neighborhood graph, constructed from image patches, can approximate the intrinsic manifold structure, and the graph Laplacian operator is the generator of the smoothing process on the neighborhood graph [25]. Therefore, the recently promising graph Laplacian operator for approximating the manifold structure, in [26]–[29] and [31], is leveraged as a generic smooth regularizer while removing the noise of 2D image patches based on the sparse coding model.

With the above considerations, we propose a Dictionary Pair Learning model (DPL model) for image denoising. In the DPL model, the dictionary pair is used to capture the semantic features of 2D image patches, and the graph Laplacian operator guarantees a disciplined smoothing according to the geometric distribution of image patches in the 2D sparse coding space. However, we face the NP-hardness of directly solving for the dictionary pair and the 2D sparse coding matrixes for image denoising. In the NCSR model, the vectorized image patches are clustered into K subsets by K-means, and then one compact PCA sub-dictionary is used for each cluster. So, in our DPL model, 2D image patches can, of course, be clustered into subsets with nonlocal similarities, where the 2D patches in a subset are very similar to each other. Obviously, one needs only to extend the PCA sub-dictionary to a 2DPCA sub-dictionary for each cluster. However, the 2D image patches sampled from the noisy image with a multi-resolution sliding window in our DPL model are of a high quantity and have a non-linear distribution, such that clustering faces a serious computational challenge. Fortunately, the literature [30] proposed a Subspace Indexing Model on Grassmann Manifold (SIM-GM) that can top-to-bottom partition the non-linear space into local subspaces with a hierarchical tree structure. Mathematically, a Grassmann manifold is the set of all linear subspaces with a fixed dimension [32], [33], and so an extracted PCA subspace in each leaf node of the SIM-GM model corresponds to a point on a Grassmann manifold. To obtain the most effective local space, introducing the Grassmann manifold distances (i.e., the angles between linear subspaces [34]), the SIM-GM is able to automatically manipulate the leaf nodes in the data partition tree and build the most effective local subspace by using a bottom-up merging strategy. Thus, by extending this kind of PCA subspace partitioning on a Grassmann manifold to a 2DPCA subspace pair partitioning on two Grassmann manifolds, we propose a Dictionary Pair Learning algorithm on Grassmann-manifolds (DPLG algorithm for short). Experimental results on benchmark images and Berkeley segmentation datasets show that the proposed DPLG algorithm is more competitive than the state-of-the-art image denoising methods, including the internal denoising methods and the external denoising methods.

The rest of this paper is organized as follows. In Section II, we build a novel dictionary pair learning model for 2D image denoising. Section III first analyzes the learning
methods of the dictionary pair and the sparse coding matrixes, and then summarizes the dictionary pair learning algorithm on Grassmann-manifolds for image denoising. In Section IV, a series of experimental results are shown, and we present the concluding remarks and future work in Section V.

II. DICTIONARY PAIR LEARNING MODEL

According to the above discussion and analysis, to preserve the original 2D geometric structure and to construct a sparse coding model for image denoising, the 2D noisy image patches are encoded by projections on a dictionary pair, which corresponds to left multiplying by one matrix and right multiplying by another. Then, by exploiting sparse coding and graph Laplacian operator smoothing to remove noise, we design a Dictionary Pair Learning model (DPL model) for image denoising in this section.

A. Dictionary Pair Learning Model for 2D Sparse Coding

To preserve the 2D geometrical structure with sparse sensing in the transformation domain, we need only to find two linear transformations for simultaneously mapping the columns and rows of image patches under the sparse constraint. Let the image patch set be {X_1, X_2, ..., X_i, ..., X_n}, X_i ∈ R^{M×N}; our method computes the left and right 2D linear transformations to map the image patches into the 2D sparse matrix space. Thus, the corresponding objective function may be defined as follows:

$$\arg\min_{A,B,S} \sum_i \big\{ \|A^T X_i B - S_i\|_F + \lambda \|S_i\|_{F,1} \big\} \qquad (4)$$

where A ∈ R^{M×M1} and B ∈ R^{N×N1} are respectively called the left coding dictionary and the right coding dictionary, S = {S_i}, S_i ∈ R^{M1×N1}, is the set of sparse coefficient matrixes, λ is the regularization parameter, ‖·‖_F denotes the matrix Frobenius norm, and ‖·‖_{F,1} denotes the matrix L1-norm, which is defined as the sum of the absolute values of all its entries.

In this paper, the left and right coding dictionaries are combined and called the dictionary pair ⟨A, B⟩. Once the dictionary pair and the sparse representations are learned, especially with the left and right dictionaries constrained by block orthogonality, each patch X_i can be reconstructed by multiplying the selected sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩ with its sparse representation, that is:

$$X_i \approx A_{k_i} S_i B_{k_i}^T \qquad (5)$$

where the orthogonal sub-dictionaries A_{k_i}, B_{k_i} are selected to code the image patch X_i, and k_i is the index of the selected sub-dictionary pair. Note that the selection method for the k_i-th dictionary pair is described in Section III-B.
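To make the bilinear coding of Eqs. (4) and (5) concrete, the following is a minimal sketch of the forward and inverse maps for one patch, assuming a sub-dictionary pair with orthonormal columns (the helper names are ours):

```python
import numpy as np

def code_patch(X, A, B):
    """2D code of an M x N patch X on a left/right dictionary pair:
    S = A^T X B, the data term of Eq. (4)."""
    return A.T @ X @ B

def reconstruct_patch(S, A, B):
    """Inverse map of Eq. (5): X ~ A S B^T when A and B are block orthogonal."""
    return A @ S @ B.T
```

When A and B are square orthogonal matrices, the two maps are exact inverses of each other; sparsity then comes from thresholding the entries of S.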
B. Graph Laplacian Operator Smoothing

Nonlocal smoothing and co-sparsity are prevailing techniques for removing noise. Clearly, a natural assumption is that the coding matrixes of similar patches should be similar. If similar image patches are encoded only on a sub-dictionary pair of the learned dictionary pair, then, exploiting the graph Laplacian as a smoothing operator, both smoothing and co-sparsity can be simultaneously guaranteed while minimizing a penalty term on the weighted L1-norm divergence between the coding matrix of a given image patch and the coding matrixes of its nonlocal neighborhood patches, as in:

$$\sum_{i,j} w_{ij} \|S_i - S_j\|_{F,1} \qquad (6)$$

where w_{ij} is the similarity between the i-th patch and its j-th neighbor.

According to our previous research in manifold learning, a patch similarity metric is selected to apply the generalized Gaussian kernel function in the literature [31]:

$$w_{ij} = \begin{cases} \dfrac{1}{\Omega}\exp\!\big(-\|X_i - X_j\|_F^{\alpha} / 2\sigma_i^{\alpha}\big), & \text{if } X_j \text{ is one of the } k \text{ nearest neighbors of } X_i \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

where Ω is the normalization factor, σ_i is the variance of the neighborhood distribution and α is the generalized Gaussian exponent. In this paper, the neighborhood similarity is assumed to obey the super-Gaussian distribution:

$$w_{ij} = \frac{1}{\Omega}\exp\!\big(-\|X_i - X_j\|_{F,1} / 2\sigma_i\big) \qquad (8)$$
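A small sketch of how the super-Gaussian weights of Eq. (8) might be computed over the nearest neighbors, with the normalization constraint that the weights of each patch sum to one, as required in Eq. (9) below (the function name is ours):

```python
import numpy as np

def neighbor_weights(Xi, neighbors, sigma_i):
    """Super-Gaussian similarity of Eq. (8) for the nearest neighbors of
    patch Xi, normalized to sum to one (the role of the factor 1/Omega)."""
    d = np.array([np.abs(Xi - Xj).sum() for Xj in neighbors])  # ||.||_{F,1}
    w = np.exp(-d / (2.0 * sigma_i))
    return w / w.sum()
```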
C. The Final Objective Function

Combining the sparse coding term in Eq. (4) and the smoothing term in Eq. (6), the final objective function of the DPL model is defined as follows:

$$\arg\min_{A,B,S} \Big\{ \sum_i \big( \|A^T X_i B - S_i\|_F + \lambda \|S_i\|_{F,1} \big) + \gamma \sum_{i,j} w_{ij} \|S_i - S_j\|_{F,1} \Big\} \qquad (9)$$
$$\text{s.t.} \quad (1)\ \sum_j w_{ij} = 1; \qquad (2)\ A_k^T A_k = I,\ B_k^T B_k = I,\ k = 1, \ldots, K$$

where ‖·‖_{F,1} denotes the matrix L1-norm, which is defined as the sum of the absolute values of all matrix elements, and A and B are constrained to be block orthogonal matrices in the following learning algorithm.
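As a sanity check during optimization, one can evaluate the three terms of Eq. (9) for a fixed coding; the sketch below assumes the one-sub-dictionary-per-patch convention introduced later in Section III (all names are ours):

```python
import numpy as np

def dpl_objective(patches, codes, pair_idx, dict_pairs, W, lam, gamma):
    """Value of Eq. (9): data fit + L1 sparsity + weighted smoothing term.
    W maps an index pair (i, j) to the neighborhood weight w_ij."""
    data = sum(np.linalg.norm(dict_pairs[pair_idx[i]][0].T @ X
                              @ dict_pairs[pair_idx[i]][1] - codes[i], 'fro')
               for i, X in enumerate(patches))
    sparsity = lam * sum(np.abs(S).sum() for S in codes)
    smoothing = gamma * sum(w * np.abs(codes[i] - codes[j]).sum()
                            for (i, j), w in W.items())
    return data + sparsity + smoothing
```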
The above Eq. (9) is an accurate description of the Dictionary Pair Learning model (DPL model), and Fig. 1 shows an illustration of the DPL model. In the DPL model, two similar 2D image patches, X_i and X_j, extracted from the given noisy image are encoded on two dictionaries (i.e., the left dictionary A and the right dictionary B), which respectively consist of the sub-dictionary sets A = {A_1, ..., A_k, ..., A_K} and B = {B_1, ..., B_k, ..., B_K} for computational simplicity, as analyzed in Section III-A. The left coding dictionary A is used to extract the features of the column vectors of the image patches, and the right coding dictionary B is used to extract the features of the row vectors of the image patches. For sparse response characteristics, the two learned dictionaries are usually required to be
redundant such that they can represent the various local structures of 2D images.

Fig. 1. Similar image patches encoded by the dictionary pair ⟨A, B⟩. [Figure not reproduced.]

Unlike traditional sparse coding, the sparse coding of each image patch in our DPL model is a 2D sparse matrix. For sparsely coding each 2D image patch, a simple method is finding the most appropriate sub-dictionary pair from the learned dictionary pair ⟨A, B⟩ to carry out compact coding on it while constraining the coding coefficients on the un-selected sub-dictionary pairs to zero. This method can ensure the attainment of a global sparse coding representation. As for the third term in Eq. (9), corresponding to the right of Fig. 1, it is expected to make the 2D sparse representations of nonlocal similar image patches as close and co-sparse as possible (that is, the constraints of smoothing and nonlocal co-sparsity). Thus, the 2D sparse coding matrixes corresponding to nonlocal similar image patches are regularized under the manifold smoothing assumption with an L1-norm metric.

III. DICTIONARY PAIR LEARNING ALGORITHM ON GRASSMANN-MANIFOLD

In the DPL model (i.e., Eq. (9)), the dictionary pair ⟨A, B⟩ and the sparse coding matrixes S_i are all unknown, and their simultaneous solution is an NP problem. Therefore, our learning strategy is to decompose the problem into three subtasks: (1) learning the dictionary pair ⟨A, B⟩ from the 2D noisy image patches by eigen-decomposition, as shown in Section III-A; (2) fixing the dictionary pair ⟨A, B⟩ and then updating the 2D sparse coding matrixes with smoothing, as shown in Section III-B; and (3) reconstructing the denoised image, as shown in Section III-C. Thus, the so-called Dictionary Pair Learning algorithm on Grassmann-manifold (DPLG) is analyzed and summarized as follows.

A. Learning the Dictionary Pair

For solving Eq. (9), one important issue centers on how to learn the dictionary pair ⟨A, B⟩ for sparsely and smoothly coding the 2D image patches. Due to the difficulty and instability of learning a dictionary by directly optimizing the sparse coding model, the dictionaries can also be directly selected, as in conventional sparsity-based coding models (i.e., analytically designed dictionaries). Thus, we design the 2DPCA subspace pair partition on two Grassmann manifolds to implement the clustering-based sub-dictionary pair learning. Two sub-dictionaries are computed for each cluster, corresponding to decomposing the covariance matrix and its transposed matrix from the 2D image patches (i.e., the sub-dictionary pair). All such sub-dictionary pairs construct two large over-complete dictionaries to characterize all the possible local structures of a given observed image. It is assumed that the k-th subset is extracted to obtain the k-th sub-dictionary pair ⟨A_k, B_k⟩, where k = 1, ..., K. Then, in the dictionary pair ⟨A, B⟩ = {A_k, B_k}_{k=1}^K, the left dictionary A = {A_1, ..., A_k, ..., A_K} is viewed as a point set on one Grassmann manifold, and the right dictionary B = {B_1, ..., B_k, ..., B_K} is viewed as a point set on another Grassmann manifold, because a Grassmann manifold is the set of all linear subspaces with a fixed dimension [32]. In this paper, obtaining the dictionary pair ⟨A, B⟩ includes two basic stages: the initial dictionary pair is obtained by the following Top-bottom 2D Subspace Partition algorithm (TTSP algorithm); next, the refined dictionary pair is obtained by the Sub-dictionary Merging algorithm (SM algorithm).

1) Obtaining the Initial Dictionary Pair by the TTSP Algorithm: To overcome the difficulty in directly learning an effective dictionary pair ⟨A, B⟩ under the nonlinear distribution characteristic of all of the 2D image patches, the entire training image patch set is divided into non-overlapping subsets with linear structures suited to a classical linear method such as 2DPCA, and the sub-dictionary pair on each subset is easily learned by the eigen-decompositions of two covariance matrixes.¹ The literature [30] constructed a kind of data partition tree for subspace indexing based on the global PCA, but it is not suitable for our 2D subspace partition for learning the dictionary pair ⟨A, B⟩. We propose a Top-bottom 2D Subspace Partition algorithm (TTSP algorithm) for obtaining the initial dictionary pair ⟨A, B⟩. The TTSP algorithm recursively generates a binary tree, and each leaf node is used in learning a sub-dictionary pair by using an extended 2DPCA technique. The detailed steps of the TTSP algorithm are described in Algorithm 1.

¹Two non-symmetrical covariance matrixes [21] of a matrix dataset {X_1, X_2, ..., X_L}: $L_{cov} = \frac{1}{L}\sum_{i=1}^{L}(X_i - C_k)(X_i - C_k)^T$ and $R_{cov} = \frac{1}{L}\sum_{i=1}^{L}(X_i - C_k)^T(X_i - C_k)$, where $C_k = \frac{1}{L}\sum_{i=1}^{L} X_i$.
similarity and local distribution because each leaf node may
Pair Learning algorithm on Grassmann-manifold (DPLG) is
contain an insufficient number of samples. One reasonable
analyzed and summarized as follows.
method is to merge the leaf nodes that span almost the
same left sub-dictionaries, and almost the same right
A. Learning the Dictionary Pair sub-dictionaries. Because a Grassmann manifold is the set of
For solving Eq. (9), one important issue centers on how all linear subspaces with a fixed dimension and any two points
to learn the dictionary pair A, B for sparsely and smoothly on a Grassmann manifold correspond to two subspaces.
coding the 2D image patches. Due to the difficulty and Therefore, to merge the very similar leaf nodes, we assume
instability in the learned dictionary by directly optimizing that all left sub-dictionaries from all leaf nodes lie on
the sparse coding model, the dictionaries can also be directly one Grassmann manifold and that all right sub-
selected in conventional sparsity-based coding models
(i.e., analytically designed dictionaries). Thus, we design the 1 Two non-symmetrical covariance matrixes [21] of a matrix dataset
L
2DPCA subspace pair partition on two Grassmann manifolds {X 1 , X 2 , . . . , X L }, L cov = L1 i=1 (X i Ck )(X i Ck )T and Rcov =
1 L (X C )T (X C ) where C = 1 L X .
to implement the clustering-based sub-dictionary pair learning. L i=1 i k i k k L i=1 i
sub-dictionaries from all leaf nodes lie on another Grassmann manifold.

Algorithm 1 (TTSP Algorithm): Top-Bottom 2D Subspace Partition. [Algorithm listing not reproduced.]

The angles between linear subspaces have intuitively become a reasonable measure for describing the divergence between subspaces on a Grassmann manifold [32]. Thus, for computational convenience, the similarity metric between two subspaces is typically defined by taking the cosines of the principal angles. Taking the left sub-dictionaries as an example, the cosines of the principal angles are defined as follows:

Definition 1: Let A_1 and A_2 be two m-dimensional subspaces corresponding to two left sub-dictionaries. The cosine of the t-th principal angle between the two subspaces span(A_1) and span(A_2) is defined by:

$$\cos(\theta_t) = \max_{u_t \in \mathrm{span}(A_1)} \ \max_{v_t \in \mathrm{span}(A_2)} u_t^T v_t \qquad (10)$$
$$\text{s.t.} \quad u_t^T u_t = v_t^T v_t = 1, \qquad u_t^T u_r = v_t^T v_r = 0 \ (t \neq r)$$

where 0 ≤ θ_t ≤ π/2, t, r = 1, ..., m, and u_t and v_t are basis vectors from the two subspaces, respectively.

In Eq. (10), the first principal angle θ_1 is the smallest angle among those between all pairs of unit basis vectors taken respectively from the two subspaces. The rest of the principal angles can be obtained from the other basis vectors in each subspace, as shown in Fig. 2. The smaller the principal angles are, the more similar the two subspaces are (i.e., the closer they are on the Grassmann manifold). In fact, the cosines of all principal angles can be computed by a more numerically stable method, the Singular Value Decomposition (SVD) [34], as described in Theorem 1, for which we provide a simple proof in Appendix A.

Fig. 2. Principal angles between sub-dictionaries. [Figure not reproduced.]

Let A_1 and A_2 be two m-dimensional column-orthogonal matrixes that respectively consist of orthogonal bases of two left sub-dictionaries. Then, the cosines of all principal angles between the two subspaces (i.e., the two sub-dictionaries) are computed by the following SVD equation:

Theorem 1: If A_1 and A_2 are two m-dimensional subspaces, then

$$A_1^T A_2 = U \Lambda V^T \qquad (11)$$

where the diagonal matrix Λ = diag(cos θ_1, ..., cos θ_m), U U^T = I_m and V V^T = I_m.

In the following subspace merging algorithm, the similarity Sim(A_1, A_2) between the two subspaces A_1 and A_2 is defined as the average of all principal angle cosine values:

$$\mathrm{Sim}(A_1, A_2) = \frac{1}{m} \sum_{l=1}^{m} \cos\theta_l \qquad (12)$$

Therefore, the larger Sim(A_i, A_j) is, the more similar the two subspaces are (i.e., the closer they are on the Grassmann manifold). Almost identical subspaces should be merged into a single subspace. The same consideration applies to the right sub-dictionaries B_i, i = 1, ..., K, whose similarity metric is defined in the same manner. Therefore, simultaneously taking the left sub-dictionaries and the right sub-dictionaries into account, our Sub-dictionary Merging algorithm (SM algorithm) is described in Algorithm 2.
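In code, the principal-angle computation of Theorem 1 and the similarity of Eq. (12) reduce to one SVD; a minimal sketch (the function name is ours):

```python
import numpy as np

def subspace_similarity(A1, A2):
    """Sim(A1, A2) of Eq. (12): the singular values of A1^T A2 are the
    principal-angle cosines when A1 and A2 have orthonormal columns."""
    cosines = np.linalg.svd(A1.T @ A2, compute_uv=False)
    return cosines.mean()
```

In the SM algorithm, two leaf nodes would then be merged when this similarity, for both the left and the right sub-dictionaries, is sufficiently large.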
Algorithm 2 (SM Algorithm): Sub-Dictionary Merging Algorithm. [Algorithm listing not reproduced.]

B. Updating Sparse Coding Matrixes

Section III-A describes a method to rapidly learn the dictionary pair ⟨A, B⟩, where A = {A_1, ..., A_k, ..., A_K} and B = {B_1, ..., B_k, ..., B_K}. For sparsely coding each 2D noisy image patch and deleting noise, we need only to find the most appropriate sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩ from the learned dictionary pair ⟨A, B⟩ to represent the patch, and then denoise the image patch by smoothing its sparse representation.

For the i-th noisy image patch, we assume that the most appropriate sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩ is used to encode it and that the other sub-dictionary pairs are constrained to provide zero-coefficient coding. According to the nearest center, the most appropriate sub-dictionary pair for the i-th noisy image patch X_i can be selected by the smallest L1-norm coding, that is:

$$k_i = \arg\min_k \big\{ \|A_k^T (X_i - C_k) B_k\|_{F,1} \big\}, \quad k = 1, \ldots, K \qquad (13)$$

where K is the total number of sub-dictionary pairs, C_k denotes the center of the k-th leaf node, and ‖·‖_{F,1} denotes the matrix L1-norm, which is defined as the sum of the absolute values of all matrix elements.
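The selection rule of Eq. (13) is a direct argmin over the K sub-dictionary pairs; a minimal sketch (the names are ours):

```python
import numpy as np

def select_subdictionary(Xi, dict_pairs, centers):
    """Index k_i of Eq. (13): the pair giving the smallest L1-norm
    coding of the centered patch."""
    costs = [np.abs(A.T @ (Xi - C) @ B).sum()
             for (A, B), C in zip(dict_pairs, centers)]
    return int(np.argmin(costs))
```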
For obtaining sparse representations, we assume that any noisy image patch is encoded by only one sub-dictionary pair and that the coding coefficients on the other sub-dictionary pairs are constrained to zero. Therefore, for any noisy image patch X_i, we can simplify Eq. (9) to obtain the following objective function definition:

Definition 2: For image patch X_i, let the selected nearest sub-dictionary pair be ⟨A_{k_i}, B_{k_i}⟩ from Eq. (13). Then, the smoothing sparse coding is computed by the following formula:

$$\arg\min_{S_i} \Big\{ \|A_{k_i}^T X_i B_{k_i} - S_i\|_F + \gamma \sum_j w_{ij} \|S_i - S_j\|_{F,1} \Big\}, \quad \text{s.t.} \ \sum_j w_{ij} = 1 \qquad (14)$$

where S_j is the sparse coding matrix of the j-th nearest image patch on the sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩, w_{ij} is the non-local neighborhood similarity, and γ is the balance factor.

As for the balance factor γ, when the two terms of Eq. (14) are simultaneously optimized, we reach the following conclusion (the proof is given in Appendix B).

Theorem 2: If X_i is an image patch corrupted by noise N(0, σ), and the non-local similarity obeys the Laplacian distribution with parameter σ_i, then the balance factor γ = σ²/(2σ_i).

Clearly, the objective function of S_i in Eq. (14) is convex and can be efficiently solved. The first term minimizes the reconstruction error on the sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩, and the second term ensures the smoothing and co-sparsity in the coefficient matrix space. We initialize the coding matrixes S_i and S_j by the projections of the image patch X_i and its neighbors X_j on the selected sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩, that is:

$$S_i(t) = A_{k_i}^T X_i B_{k_i} \qquad (15)$$
$$S_j(t) = A_{k_i}^T X_j B_{k_i}, \quad j = 1, \ldots, k1 \qquad (16)$$

where image patch X_j is one of the k1-nearest neighbors of image patch X_i.

Additionally, for computational convenience, we can reformat and relax Eq. (14) into the following objective function:

$$\arg\min_{S_i} \Big\{ \|A_{k_i}^T X_i B_{k_i} - S_i\|_F + \gamma \big\|S_i - \sum_j w_{ij} S_j\big\|_{F,1} \Big\}, \quad \text{s.t.} \ \sum_j w_{ij} = 1 \qquad (17)$$

Following the literature [35], a threshold-shrinkage algorithm is adopted to solve Eq. (17) (i.e., using the gradient descent method with a threshold-shrinkage strategy). Therefore, the sparse coding matrix S_i on the sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩ is updated by the following formula:

$$S_i(t+1) = f\Big(S_i(t) - \sum_j w_{ij} S_j(t),\ \delta\gamma\Big) + \sum_j w_{ij} S_j(t), \quad \text{s.t.} \ \|X_i - A_{k_i} S_i B_{k_i}^T\|_F^2 < c N \sigma^2 \qquad (18)$$

where σ is the noise standard deviation, N is the number of image patch pixels, δ is the gradient descent step, c is a scaling factor, which is empirically set to 1.15, and f(·, ·) is the soft threshold-shrinkage function, that is:

$$f(z, \tau) = \begin{cases} 0, & \text{if } |z| < \tau \\ z - \tau\,\mathrm{sgn}(z), & \text{otherwise} \end{cases} \qquad (19)$$

where sgn(z) is the sign function.
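A minimal sketch of the update in Eqs. (18) and (19): shrink the deviation of the current code from the weighted neighbor average, then add the average back (the function names and the generic threshold argument tau are ours; in the paper the threshold is governed by the balance factor γ of Theorem 2 and the gradient descent step):

```python
import numpy as np

def soft_shrink(Z, tau):
    """Soft threshold-shrinkage function f of Eq. (19), applied elementwise."""
    return np.where(np.abs(Z) < tau, 0.0, Z - tau * np.sign(Z))

def update_code(Si, neighbor_codes, w, tau):
    """One smoothing iteration of Eq. (18) for a single patch code S_i.
    `neighbor_codes` is a list of S_j arrays and `w` their weights."""
    S_bar = sum(wj * Sj for wj, Sj in zip(w, neighbor_codes))
    return soft_shrink(Si - S_bar, tau) + S_bar
```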
C. Reconstructing the Denoised Image

As a type of non-local similarity and transformation-domain approach, a given noisy image needs to be divided into many overlapping small image patches, and the corresponding denoised image is obtained by combining all of the denoised image patches. Let x denote a noisy image, and let the binary
matrix R_i be used for extracting the i-th image patch at position i, that is:

$$X_i = R_i x, \quad i = 1, 2, \ldots, n \qquad (20)$$

where n denotes the number of possible image patches. If we let S_i be the coding matrix, with smoothing and co-sparsity, obtained by using the sub-dictionary pair ⟨A_{k_i}, B_{k_i}⟩, then the denoised image x̂ is reconstructed by:

$$\hat{x} = \Big( \sum_i R_i^T A_{k_i} S_i B_{k_i}^T \Big) \oslash \Big( \sum_i R_i^T R_i \mathbf{1} \Big) \qquad (21)$$

where ⊘ denotes element-wise division and 1 denotes a matrix of ones. That is, Eq. (21) puts all denoised patches together as the denoised image x̂ (the overlapped pixels between neighboring patches are averaged).
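The aggregation in Eq. (21) is the usual overlap-average; a minimal sketch that works with patch positions instead of the binary extraction matrixes R_i (the names are ours):

```python
import numpy as np

def reconstruct_image(shape, positions, denoised_patches):
    """Eq. (21) as accumulate-and-divide: sum each denoised patch into its
    location, then divide by the per-pixel overlap count.
    `positions` holds the top-left (r, c) of each patch."""
    num = np.zeros(shape)
    den = np.zeros(shape)
    for (r, c), P in zip(positions, denoised_patches):
        h, w = P.shape
        num[r:r + h, c:c + w] += P
        den[r:r + h, c:c + w] += 1.0
    return num / den
```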
D. Summary of the DPLG Algorithm

1) The Description of the DPLG Algorithm: Summarizing the above analysis, for adaptively learning and denoising from a given noisy image itself, we put forward the Dictionary Pair Learning algorithm on Grassmann-manifold (DPLG). The DPLG algorithm allows the dictionary pair to be updated according to the last denoised result, which then yields better representations of the noisy patches. Thus, the DPLG algorithm is designed as an iterative image denoising method. Each iteration includes three basic tasks, namely, learning the dictionary pair ⟨A, B⟩ from the noisy image patches sampled from the current noisy image at multiple resolutions, updating the 2D sparse representations of the image patches from the current noisy image, and reconstructing the denoised image, where the current noisy image is a slight translation from the current denoised image toward the original noisy image. Fig. 3 shows the basic working flowchart of the DPLG algorithm, and the detailed procedures of the DPLG algorithm are described in Algorithm 3.

Algorithm 3 (DPLG Algorithm): Dictionary Pair Learning on Grassmann-Manifold. [Algorithm listing not reproduced.]

Fig. 3. The working flowchart of the DPLG algorithm. [Figure not reproduced.]

2) Time Complexity Analysis: Our DPLG method keeps the original 2D structure of each image patch unchanged. If the size of the sampled image patches is b×b, and the sub-dictionary pair ⟨A_k, B_k⟩ is computed by using 2DPCA on each image patch subset, then A_k and B_k are two b×b orthogonal matrices. Comparatively, NCSR needs to compute a more complex b²×b² orthogonal matrix as the dictionary by using PCA on the 1D representations of image patches. For example, in the NCSR method, the matrix size is 64 times larger than in our method when b = 8. Therefore, the DPLG requires less time to compute the eigenvectors.

Moreover, comparing our DPLG method with the NCSR method, the former rapidly divides each leaf node top-bottom into a left child and a right child by the first principal component projection on the current sub-dictionary pair
(i.e., the two-way partition of 1D real numbers). The latter divides the whole training set (i.e., b²-dimensional vectors) into the specified clusters by applying K-means, with more time complexity. Compared with the K-SVD method, each atom of its single dictionary D needs to be updated by SVD decomposition. If the number of dictionary atoms in K-SVD is equal to the number of all sub-dictionary atoms in the DPLG or NCSR, then the computational complexity of K-SVD is the largest. However, the dictionary D of K-SVD in real-world applications is only ever empirically set to a smaller over-complete dictionary atom number than in the DPLG and NCSR methods, so that K-SVD has a faster computing speed. Additionally, in the sparse coding step, the three internal denoising methods DPLG, NCSR and K-SVD have slight differences in time complexity, as shown in Table I. Without loss of generality, letting the number of clusters equal K, the number of image patches equal n, the size of each image patch equal b×b, the number of K-means clustering iterations equal l, the number of nearest neighbors equal k1, the number of dictionary atoms in K-SVD equal H, and the max number of nonzero codes for each image patch in K-SVD equal M, we compare the computational complexity of the dictionary learning step and the sparse coding step in the three iterative dictionary learning methods (internal denoising methods), namely, DPLG, NCSR and K-SVD, as shown in Table I. Because the non-local neighborhood similarity is computed within each cluster in our manifold smoothing strategy, computing the Laplacian similarity needs only linear computational time. Finally, the total time complexity of the DPLG is less than that of the NCSR and K-SVD algorithms with the same size of their dictionaries (that is, when H = Kb).

TABLE I. Time complexity for one update of the two basic steps in three dictionary learning algorithms: DPLG, NCSR and K-SVD. [Table not reproduced.]

IV. EXPERIMENTS

In this section, we verify the image denoising performance of the proposed DPLG method. We test the performance of the DPLG method on benchmark images [38], [39] and on 100 test images from the Berkeley Segmentation Dataset [40]. Moreover, the experimental results of the proposed DPLG method are compared with seven developed state-of-the-art denoising methods, including three internal denoising methods and four denoising methods using external information from clean natural images.

A. Quantitative Assessment of Denoised Images

An objective image quality metric plays an important role in image denoising applications. Currently, three classical image quality assessment metrics are typically used: the Root Mean Square Error (RMSE), the Peak Signal-to-Noise Ratio (PSNR), and the measure of Structural SIMilarity (SSIM) [36]. The PSNR and RMSE are the simplest and most widely used image quality metrics. Common knowledge holds that the smaller the RMSE is, the better the denoising is; equivalently, the larger the PSNR is, the better the denoising is. The RMSE and PSNR have the same assessment ability, although they are not very well matched to the perceptual visual quality of denoised images. The third quantitative evaluation method, the Structural SIMilarity (SSIM), focuses on a perceptual quality metric, which compares normalized local patterns of pixel intensities. In our experiments, the PSNR and SSIM are used as objective assessments.
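For reference, the PSNR used throughout the experiments can be computed as follows for 8-bit images (a standard definition; the function name is ours):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```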
B. Experiments on Benchmark Images

To evaluate the performance of the proposed model, we apply the proposed DPLG algorithm to denoising ten noisy benchmark images [38] and another difficult-to-denoise noisy image (named the ChangE-3 image [39]), which is significant. Several state-of-the-art denoising methods with default parameters are used for comparison with the proposed DPLG algorithm, including the internal denoising methods BM3D [10], K-SVD [11] and NCSR [12], the external denoising methods SSDA [13] and SDAE [14], and the internal-external methods SCLW [15] and NSCDL [16]. As for the parameter setting of our DPLG algorithm, the k1-nearest-neighbor parameter, the maximum depth of the leaf nodes and the number of iterations of the DPLG are empirically set to 6, 7 and 18, respectively, from a series of tentative tests. Taking the k1-nearest-neighbor parameter as an example, we analyze the performance of our method at different k1-nearest-neighbor values, as shown in Fig. 4. Accordingly, when the number of neighbors is not overly large (for example, k1 in [6, 60]), the performance of our DPLG method does not change significantly. However, the DPLG obtains the largest SSIM value when the k1-nearest-neighbor parameter is set to 6.

Fig. 4. The denoising performance of the DPLG at different k1-nearest neighbors. [Figure not reproduced.]

1) Comparing With Internal Denoising Methods: The 20 different noisy versions of the 11 benchmark images, corresponding to 220 noisy images, are denoised respectively by the previously mentioned four internal denoising methods: DPLG, NCSR, BM3D and K-SVD. The SSIM results of the four tested methods are reported in Table II, with the highest SSIM values displayed in bold. The PSNR results are reported in Table III, with the highest PSNR values displayed in bold.

TABLE II. The SSIM values obtained by denoising 11 images at different noise variance σ. [Table not reproduced.]

TABLE III. The PSNR values obtained by denoising 11 images at different noise variance σ. [Table not reproduced.]

It is worth noting that our DPLG method preserves the 2D geometrical structure of the image patches and thus can significantly achieve the best visual quality, as shown

in columns 5-17 of Table II. From Table III, we can see that when the noise level is not very high (namely, noise variance σ < 30), all four methods can achieve very good denoised images. When the noise level is high (namely, noise variance 30 < σ ≤ 80), our DPLG method obtains basically the best denoising performance, corresponding to columns 9-13 of Table III. Moreover, Fig. 5 shows the plots of the average PSNR and SSIM of the 11 images at different noise corruption levels.

Fig. 5. The average SSIM values and average PSNR values of 11 denoised images at different noise variance σ. (a) Average SSIM values of 11 denoised images. (b) Average PSNR values of 11 denoised images. [Figure not reproduced.]

Regarding the structural similarity (SSIM) assessment of restored images, our DPLG algorithm obtains the best denoising results for 87 noisy images, the NCSR method is best for 60 noisy images, the BM3D method is best for 71 noisy images, and the K-SVD method is best for 2 noisy images. The experiments show that the proposed DPLG algorithm has the best average performance for restoring the perceptual visual effect, as shown at the bottom of Table II and in Fig. 5(a). Under the PSNR assessment, our DPLG method obtains the best denoising results for 67 noisy images, while the NCSR method is best for 31 noisy images, the BM3D method is best for 105 noisy images, and the K-SVD method is best for 18 noisy images. The DPLG also has a competitive performance in reconstructing the pixel intensity, as shown in Table III and Fig. 5(b).

2) Comparing With External Denoising Methods: In this experiment, we compare with several denoising methods that exploit the statistics information of external, noise-free natural images. Our DPLG method exploits only the internal statistics information of the tested noisy image itself.

The SCLW and NSCDL denoising methods both exploit external statistics information from a clean training image set and the internal statistics from the observed noisy image. The SCLW learns the dictionary from external and internal examples, and the NSCDL learns coupled dictionaries from clean natural images and exploits the non-local similarity from the test noisy images. The SSDA and SDAE adopt the same denoising technique (i.e., learning a denoising mapping using a stacked denoising auto-encoder algorithm with sparse coding characteristics and a deep neural network structure [37]). Their aim is to find the mapping relations from noisy image patches to noise-free image patches by training on a large-scale external natural image set. Table IV shows the comparison of the DPLG with several internal-external denoising methods and external denoising methods, in terms of characteristics and the denoising performance on benchmark images. Our experiments show that the joint utilization of external and internal examples generally outperforms either stand-alone method, but no method is the best for all images. For example, our DPLG can obtain the best denoising result on the House benchmark image by using only the smoothing, sparseness and non-local self-similarity of the noisy image. Furthermore, our DPLG still maintains a better performance than the two external denoising methods SSDA and SDAE.

TABLE IV. Comparison of DPLG with several denoising methods using external training images. [Table not reproduced.]

Fig. 6. (a) The PSNR values versus iterations by using DPLG, NCSR and K-SVD when σ = 50; (b) The SSIM values versus iterations by using DPLG, NCSR and K-SVD when σ = 50. [Figure not reproduced.]
3) Comparing With Iterative Denoising Methods: Our DPLG method is an iterative method that allows the dictionary pair to be updated using the last denoised result, which then yields better 2D representations of the noisy patches from the noisy image. Fig. 6 and Fig. 7 show the denoising results on two typical noisy images (House and ChangE-3) with strong noise corruption (noise variance σ = 50) after 60 iterations. The experimental results empirically demonstrate the convergence of the DPLG, as shown in Fig. 6: as the number of iterations increases, the denoised results get better. Fig. 6(a)-(b) display the plots of the PSNR values and SSIM values versus iterations, respectively. Compared with the two known iterative methods K-SVD and NCSR, Fig. 6 shows that our DPLG increases the PSNR and SSIM more rapidly across the iterations, which indicates that our algorithm achieves the best denoising performance among the several iterative methods. The DPLG also has competitive performance for reconstructing the smooth, texture and edge regions, as shown in the second row of Fig. 7.

Fig. 7. The performance on the denoised House image and ChangE-3 image at noise variance σ = 50 by using three iterative algorithms, DPLG, NCSR and K-SVD, each iterated 60 times. Our DPLG method achieves the best denoised results of the several methods (corresponding to the second row: the denoised House image, PSNR: 30.042, SSIM: 0.81602, and the denoised ChangE-3 image, PSNR: 24.2365, SSIM: 0.67538). [Figure not reproduced.]

C. Experiments on BSD Test Images

To further demonstrate the performance of the proposed DPLG method, image denoising experiments were also conducted on 100 test images from the public benchmark Berkeley Segmentation Dataset [40]. For 10 different noisy versions of these images, corresponding to a total of 1000 noisy images, the comparison experiments were completed by respectively running the NCSR, BM3D, K-SVD and our DPLG method. In the experiments, the parameter settings for the DPLG are the same as in the above experiments. Under the different Gaussian noise corruptions, the average PSNR and SSIM values over the 100 noisy images are shown in Fig. 8. Our DPLG method achieves the best overall performance for restoring the perceptual visual effect, as shown in the distribution of the SSIM values of the 100 denoised images by the four methods in Fig. 8(a), and has competitive performance for reconstructing pixel intensity, as shown in the distribution of the PSNR values of the 100 denoised images in Fig. 8(b).

Fig. 8. The distribution of the SSIM and PSNR values of denoising 100 noisy images corrupted by different noises by using NCSR, BM3D, K-SVD and our DPLG algorithm. (a) The distribution of SSIM values. (b) The distribution of PSNR values. [Figure not reproduced.]
V. CONCLUSION

In this paper, we proposed the DPLG algorithm, a novel 2D image denoising method working on Grassmann manifolds and leading to state-of-the-art performance. The DPLG algorithm has three primary advantages: 1) the adaptive dictionary pairs are rapidly learned via subspace partitioning and sub-dictionary pair merging on the Grassmann manifolds; 2) 2D sparse representations are notably easy to obtain; and 3) a graph Laplacian operator makes the 2D sparse code representations vary smoothly for denoising. Moreover, extensive experimental results on the benchmark images and the Berkeley segmentation datasets demonstrated that our DPLG algorithm obtains better average performance for restoring the perceptual visual effect than the state-of-the-art internal denoising methods. In the future, we will consider several potential problems, such as learning 3D multiple dictionaries for video denoising and exploring the fusion of manifold denoising and multi-dimensional sparse coding techniques.

APPENDIX A
THE PROOF OF THEOREM 1 IN SECTION III-A

We give a simple proof of Theorem 1.

Proof: A_1 and A_2 are two D×m column-orthogonal matrixes, so A_1^T A_2 is an m×m matrix. According to the SVD decomposition of the matrix A_1^T A_2, if λ_1, λ_2, ..., λ_m are its m singular values from the largest to the smallest, and U, V are the two m-dimensional orthogonal matrixes corresponding to λ_1, λ_2, ..., λ_m, then

$$U^T A_1^T A_2 V = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m),$$

and V is an m×m orthogonal matrix. A_2 V is a rotation transformation of the subspace A_2 by the rotation matrix V, that is, span(A_2) = span(A_2 V); similarly, span(A_1) = span(A_1 U). Let u_k and v_k of Eq. (11) respectively be the k-th columns of the matrixes A_1 U and A_2 V, that is:

$$[u_1, u_2, \ldots, u_k, \ldots, u_m] = A_1 U, \qquad [v_1, v_2, \ldots, v_k, \ldots, v_m] = A_2 V.$$

Then,

$$\mathrm{diag}(\cos\theta_1, \ldots, \cos\theta_k, \ldots, \cos\theta_m) = [u_1, \ldots, u_m]^T [v_1, \ldots, v_m] = U^T A_1^T A_2 V = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m). \ \blacksquare$$

APPENDIX B
THE PROOF OF THEOREM 2 IN SECTION III-B

Proof: Consider the first term of Eq. (14). X_i is the image patch corrupted by noise N(0, σ), and S_i is the code of the corresponding clear patch. Then

$$E\big(\|A_{k_i}^T X_i B_{k_i} - S_i\|_F\big) = E\big(\|A_{k_i}^T (X_i - A_{k_i} S_i B_{k_i}^T) B_{k_i}\|_F\big),$$

where, according to Eq. (18), A_{k_i} S_i B_{k_i}^T is the reconstruction of the clear image patch and {A_{k_i}, B_{k_i}} is the orthogonal dictionary pair. Hence

$$E\big(\|A_{k_i}^T X_i B_{k_i} - S_i\|_F\big) = \sigma^2.$$

As for the second term of Eq. (14), according to the assumption that ‖S_i − S_j‖_{F,1} obeys the Laplacian distribution induced by Eq. (6),

$$E\Big(\gamma \sum_j w_{ij} \|S_i - S_j\|_{F,1}\Big) = \gamma\, E\Big(\sum_j w_{ij}\, E(\|S_i - S_j\|_{F,1})\Big) = \gamma\, E\Big(\sum_j w_{ij} \cdot 2\sigma_i\Big) = \gamma\, E\Big(2\sigma_i \sum_j w_{ij}\Big) = \gamma\, E(2\sigma_i) = 2\gamma\sigma_i,$$

since Σ_j w_{ij} = 1. For preserving the scaling consistency, the ratio of the two terms should be equal to 1:

$$\frac{\sigma^2}{2\gamma\sigma_i} = 1 \;\Longrightarrow\; \gamma = \frac{\sigma^2}{2\sigma_i}. \ \blacksquare$$

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their help.

REFERENCES

[1] P. Chatterjee and P. Milanfar, "Is denoising dead?" IEEE Trans. Image Process., vol. 19, no. 4, pp. 895–911, Apr. 2010.
[2] C. Sutour, C.-A. Deledalle, and J.-F. Aujol, "Adaptive regularization of the NL-means: Application to image and video denoising," IEEE Trans. Image Process., vol. 23, no. 8, pp. 3506–3521, Aug. 2014.
[3] S. G. Chang, B. Yu, and M. Vetterli, "Adaptive wavelet thresholding for image denoising and compression," IEEE Trans. Image Process., vol. 9, no. 9, pp. 1532–1546, Sep. 2000.
[4] C.-H. Xie, J.-Y. Chang, and W.-B. Xu, "Medical image denoising by generalised Gaussian mixture modelling with edge information," IET Image Process., vol. 8, no. 8, pp. 464–476, Aug. 2014.
[5] J.-L. Starck, E. J. Candès, and D. L. Donoho, "The curvelet transform for image denoising," IEEE Trans. Image Process., vol. 11, no. 6, pp. 670–684, Jun. 2002.
[6] A. Buades, B. Coll, and J. M. Morel, "A review of image denoising algorithms, with a new one," Multiscale Model. Simul., vol. 4, no. 2, pp. 490–530, Feb. 2005.
[7] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, "Image denoising using scale mixtures of Gaussians in the wavelet domain," IEEE Trans. Image Process., vol. 12, no. 11, pp. 1338–1351, Nov. 2003.
[8] A. Buades, B. Coll, and J.-M. Morel, "A non-local algorithm for image denoising," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2, Jun. 2005, pp. 60–65.
[9] A. Rajwade, A. Rangarajan, and A. Banerjee, "Image denoising using the higher order singular value decomposition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 4, pp. 849–862, Apr. 2013.
[10] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.
[11] M. Elad and M. Aharon, "Image denoising via sparse and redundant representations over learned dictionaries," IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, Dec. 2006.
[12] W. Dong, L. Zhang, G. Shi, and X. Li, "Nonlocally centralized sparse representation for image restoration," IEEE Trans. Image Process., vol. 22, no. 4, pp. 1620–1630, Apr. 2013.
[13] J. Xie, L. Xu, and E. Chen, "Image denoising and inpainting with deep neural networks," in Proc. Adv. NIPS, Dec. 2012, pp. 350–358.
[14] H. Li, "Deep learning for image denoising," Int. J. Signal Process., Image Process. Pattern Recognit., vol. 7, no. 3, pp. 171–180, Mar. 2014.
[15] Z. Y. Wang, Y. Z. Yang, J. C. Yang, and T. S. Huang, "Designing a composite dictionary adaptively from joint examples," in Proc. IEEE Comput. Soc. Conf. CVPR, Jun. 2015. [Online]. Available: http://arxiv.org/pdf/1503.03621.pdf
[16] L. Chen and X. Liu, "Nonlocal similarity based coupled dictionary learning for image denoising," J. Comput. Inf. Syst., vol. 9, no. 11, pp. 4451–4458, Nov. 2013.
[17] S. Hawe, M. Kleinsteuber, and K. Diepold, "Analysis operator learning and its application to image reconstruction," IEEE Trans. Image Process., vol. 22, no. 6, pp. 2138–2150, Jun. 2013.
[18] P. Chatterjee and P. Milanfar, "Clustering-based denoising with locally learned dictionaries," IEEE Trans. Image Process., vol. 18, no. 7, pp. 1438–1451, Jul. 2009.
[19] W. Zuo, L. Zhang, C. Song, and D. Zhang, "Texture enhanced image denoising via gradient histogram preservation," in Proc. IEEE Comput. Soc. Conf. CVPR, Jun. 2013, pp. 1203–1210.
[20] J. Ye, "Generalized low rank approximations of matrices," Mach. Learn., vol. 61, nos. 1–3, pp. 167–191, Nov. 2005.
[21] J. Yang, D. Zhang, A. F. Frangi, and J.-Y. Yang, "Two-dimensional PCA: A new approach to appearance-based face representation and recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 1, pp. 131–137, Jan. 2004.
[22] S.-J. Wang, J. Yang, M.-F. Sun, X.-J. Peng, M.-M. Sun, and C.-G. Zhou, "Sparse tensor discriminant color space for face verification," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 6, pp. 876–888, Jun. 2012.
[23] S.-J. Wang, J. Yang, N. Zhang, and C.-G. Zhou, "Tensor discriminant color space for face recognition," IEEE Trans. Image Process., vol. 20, no. 9, pp. 2490–2501, Sep. 2011.
[24] J. Liang, Y. He, D. Liu, and X. Zeng, "Image fusion using higher order singular value decomposition," IEEE Trans. Image Process., vol. 21, no. 5, pp. 2898–2909, May 2012.
[25] A. Elmoataz, O. Lezoray, and S. Bougleux, "Nonlocal discrete regularization on weighted graphs: A framework for image and manifold processing," IEEE Trans. Image Process., vol. 17, no. 7, pp. 1047–1060, Jul. 2008.
[26] M. Hein and M. Maier, "Manifold denoising," in Proc. Adv. NIPS, Jun. 2006, pp. 561–568.
[27] M. Zheng et al., "Graph regularized sparse coding for image representation," IEEE Trans. Image Process., vol. 20, no. 5, pp. 1327–1336, May 2011.
[28] S.-J. Wang, S. Yan, J. Yang, C.-G. Zhou, and X. Fu, "A general exponential framework for dimensionality reduction," IEEE Trans. Image Process., vol. 23, no. 2, pp. 920–930, Feb. 2014.
[29] M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Comput., vol. 15, no. 6, pp. 1373–1396, Jun. 2003.
[30] X. Wang, Z. Li, and D. Tao, "Subspaces indexing model on Grassmann manifold for image search," IEEE Trans. Image Process., vol. 20, no. 9, pp. 2627–2635, Sep. 2011.
[31] X. H. Zeng, S. W. Luo, J. Wang, and J. L. Zhao, "Geodesic distance-based generalized Gaussian Laplacian eigenmap," Ruan Jian Xue Bao/J. Softw., vol. 20, no. 4, pp. 815–824, Apr. 2009.
[32] J. Hamm, "Subspace-based learning with Grassmann manifolds," Ph.D. dissertation, Dept. Elect. Syst. Eng., Univ. Pennsylvania, Philadelphia, PA, USA, 2008.
[33] P. Turaga, A. Veeraraghavan, A. Srivastava, and R. Chellappa, "Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 11, pp. 2273–2286, Nov. 2011.
[34] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD, USA: The Johns Hopkins Univ. Press, 1996.
[35] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Commun. Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, Nov. 2004.
[36] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.
[37] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," J. Mach. Learn. Res., vol. 11, pp. 3371–3408, Dec. 2010.
[38] K-SVD. [Online]. Available: http://www.cs.technion.ac.il/~elad/software/, accessed Mar. 14, 2014.
[39] Lander and Rover Capture Photos of Each Other. [Online]. Available: http://english.cntv.cn/20131215/104015_1.shtml, accessed May 20, 2014.
[40] Berkeley Segmentation Dataset (BSDS). [Online]. Available: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/fg/, accessed Jun. 13, 2014.

Xianhua Zeng received the Ph.D. degree in computer software and theory from Beijing Jiaotong University in 2009. He is currently an Associate Professor with the Chongqing Key Laboratory of Computational Intelligence, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China. He was a Visiting Scholar with the University of Technology, Sydney, from 2013 to 2014. His main research interests include image processing, machine learning, and data mining.

Wei Bian (M'14) received the B.Eng. degree in electronic engineering, the B.Sc. degree in applied mathematics, and the M.Eng. degree in electronic engineering from the Harbin Institute of Technology, China, in 2005 and 2007, respectively, and the Ph.D. degree in computer science from the University of Technology, Sydney, Australia, in 2012. His research interests include pattern recognition and machine learning.

Wei Liu (M'14) is a Professor with the School of Electronic Engineering, Xidian University, China, and was a Research Scientist with the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA, from 2012 to 2015. He received the M.Phil. and Ph.D. degrees in electrical engineering from Columbia University, New York, NY, USA, in 2012. Dr. Liu is the recipient of the 2011-2012 Facebook Fellowship and the 2013 Jury Award for best thesis of the Department of Electrical Engineering, Columbia University. He holds adjunct faculty positions at the Rensselaer Polytechnic Institute (RPI) and the Stevens Institute of Technology. His research interests include machine learning, data mining, information retrieval, computer vision, pattern recognition, and image processing. Dr. Liu has published more than 80 peer-reviewed journal and conference papers, including in the Proceedings of the IEEE, IEEE TIP, IEEE TMI, IEEE TCSVT, ACM TIST, NIPS, ICML, KDD, CVPR, ICCV, ECCV, IJCAI, AAAI, MICCAI, SIGIR, SIGCHI, ACL, and DCC. His current research work is geared toward large-scale machine learning, big data analytics, multimedia search engines, mobile computing, and parallel and distributed computing.

Jialie Shen received the Ph.D. degree in computer science from the University of New South Wales (UNSW), Australia, in the area of large-scale media retrieval and database access methods. He was a Faculty Member at UNSW, Sydney, and a Researcher with the Information Retrieval Research Group, University of Glasgow, for a few years. He is currently a Faculty Member of the School of Information Systems, Singapore Management University, Singapore.

Dacheng Tao (F'15) is a Professor of Computer Science with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology at the University of Technology, Sydney. He mainly applies statistics and mathematics to data analytics problems, and his research interests spread across computer vision, data science, image processing, machine learning, and video surveillance. His research results have been expounded in one monograph and 100+ publications in prestigious journals and at prominent conferences, such as IEEE T-PAMI, T-NNLS, T-IP, JMLR, IJCV, NIPS, ICML, CVPR, ICCV, ECCV, AISTATS, ICDM, and ACM SIGKDD, with several best paper awards, such as the best theory/algorithm paper runner-up award at IEEE ICDM'07, the best student paper award at IEEE ICDM'13, and the 2014 ICDM 10-Year Highest-Impact Paper Award.
