
Pattern Recognition 37 (2004) 201–209

www.elsevier.com/locate/patcog
Using orientation codes for rotation-invariant template
matching
Farhan Ullah*, Shunichi Kaneko
Graduate School of Engineering, Hokkaido University, Kita-13 Nishi 8, Kita-ku, Sapporo 060-8628, Japan
*Corresponding author. Tel.: +81-11-706-6436; fax: +81-11-706-6436. E-mail address: kanekos@coin.eng.hokudai.ac.jp (F. Ullah). URL: http://www.mee.coin.eng.hokudai.ac.jp
Received 5 August 2002; accepted 28 January 2003
Abstract
A new method for rotation-invariant template matching in gray scale images is proposed. It is based on the utilization of
gradient information in the form of orientation codes as the feature for approximating the rotation angle as well as for matching.
Orientation codes-based matching is robust for searching objects in cluttered environments even in the cases of illumination
fluctuations resulting from shadowing or highlighting, etc. We use a two-stage framework for realizing the rotation-invariant
template matching; in the first stage, histograms of orientation codes are employed for approximating the rotation angle of the
object and then in the second stage, matching is performed by rotating the object template by the estimated angle. Matching
in the second stage is performed only for the positions which have higher similarity results in the first stage, thereby pruning
out insignificant locations to speed up the search. Experiments with real world scenes demonstrate the rotation- and brightness
invariance of the proposed method for performing object search.
© 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Orientation code; Rotation invariance; Object search; Matching
1. Introduction
Rotation invariance has been one of the desirable features
in template matching or registration and a basis for many
applications, such as visual control [1,2], medical imag-
ing [3] and computation of visual motion [4]. Various
approaches have been used for achieving the invari-
ance, such as generalized Hough transform (GHT) [5],
geometric hashing (GH) [6], graph matching [7], and
geometric moment-based matching [8,9]. GHT is a pow-
erful tool for rotation- and scale-invariant matching that
utilizes voting through local evidences like edges, but
it needs enormous memory to hold the voting space from which the largest score is selected for discrimination, so the memory size must be kept reasonable, for example by restricting the degrees of freedom. GH is also
a good scheme to deal with the distribution of local features, such as points or edges; however, sensitivity to additive noise in the feature positions can cause larger error rates in registration. The graph matching approach is rather good at making qualitative correspondences between models and objects, but it has to be used with proper pre-processing for feature detection, which is rather sensitive to threshold parameters. The moment-based matching methods have been effective especially for rotated binary patterns, but they require long computation times because they are based on many integration and multiplication operations.
On the other hand, Swain [10] has proposed an efficient method for locating an object based on color information only, thereby eliminating the necessity of registration for matching purposes; however, illumination variations can limit the effectiveness of this method, especially when the objective is to search for the object in a cluttered scene.
Some methods have been proposed in order to introduce
robustness by incorporating geometric information which
may also be utilized for rotation-invariant object recognition
[11,12]. Application of gradient information has been noted
in tasks like finding dominant orientation [13] and gesture
recognition [14]. In these methods histograms of gradient
magnitude are used to recognize either the dominant texture
orientation within an image or to match pre-trained gesture
patterns from the scene. Gradient information, along with relative distance information, has been demonstrated to
have good results in rotation-invariant object recognition
by constituting orientation tokens which are then used in
the form of co-occurrence matrices [15].
Histograms of orientation codes have been used for illumination- and rotation-invariant template matching in object search [16], but histograms alone are usually insufficient for precisely locating the template position within a scene. Matching-based methods, on the other hand, are effective for aligning two similar images, but they are not efficient for rotation-invariant search without some ad hoc treatment of the template.
For example, the template has to be rotated by all likely an-
gles and then matched for the similarity/dissimilarity of each
probable orientation at each image location [17]. Integra-
tion of various similarity measures has also been successful
in database search [18,19] where similarity measures based
on gradient angle histograms were combined with color his-
tograms or moment-based measures; however, similarities
based on color information or image brightness directly are
known to be vulnerable in cases of illumination fluctuations.
In this paper, we propose a two-stage framework for
realizing rotation-invariant template matching. Our method efficiently exploits the strengths of both histogram- and matching-based methods in order to realize the rotation
invariance. The motivation for this work was to intro-
duce the property of rotation invariance to the previously
proposed orientation code matching (OCM) based image
registration technique [20] by using the same features, i.e.
orientation codes to handle the problem of rotation-invariant
template matching. OCM has proved to be robust for
template matching in the presence of many real world
irregularities, such as illumination variation due to shad-
ing, highlighting, and in the presence of partial occlusion. The first stage estimates the approximate rotation angle of the object image by using histograms of the orientation codes, and the second stage searches for the true position by performing matching only on the candidates filtered in the first stage, rotating the template by the closest angle of rotation, which is also estimated in the first stage. The proposed method is invariant to illumination variations due to
utilization of geometrical information in the form of gradi-
ent angles rather than pixel brightness directly. We do not
expect much robustness against occlusion by the proposed
method due to reliance on the histograms in the first stage;
however, some cases of minor occlusion have been handled
by some manipulation of parameters.
The rest of this paper is organized as follows: Section
2 details the proposed framework along with the definition
of orientation codes and the measures of dissimilarity em-
ployed in the framework. Section 3 describes the experimen-
tal results obtained by using the proposed algorithm. The
paper is concluded with some comments in Section 4.
2. Matching for rotation invariance
We propose the use of orientation codes as the feature for
approximating the rotation angle as well as for pixel-based
matching. First, we construct the histograms of orientation
codes for the template and a subimage of the same size and
compute the similarity between the two histograms for all
the possible orientations by shifting the subimage histogram
bins relative to the template histogram. The similarity eval-
uation values corresponding to the shift which maximizes
this similarity measure are noted for every location along
with the information of shift in order to provide an estimate
of similarity between the template and the subimage as well
as the expected angle of rotation. In the second stage, the
orientation code image of the template is matched against
the locations whose similarity in the first stage exceeds some specified threshold level. The estimate of the rotation angle in the first stage eliminates the need for matching in every possible orientation at all candidate positions. Also, insignificant locations are pruned out based on the similarity function in the first stage.
For the matching stage, we select the candidate positions
based on a pruning threshold level j on the dissimilarity
function obtained for the whole scene in the first stage. For our experiments, we used a criterion level corresponding to the 90th percentile for a subimage position to be considered in the second stage (i.e. the best 10% of candidate positions are used for the matching stage). Fig. 1 shows the flow of the proposed framework as a block diagram.
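As a small, purely illustrative sketch (not code from the paper), the percentile-based pruning just described can be written as follows; the array name d1, the function name select_candidates, and the use of NumPy are our own assumptions, with d1 holding the first-stage dissimilarity for every subimage position.

```python
import numpy as np

def select_candidates(d1, prune=0.9):
    """Keep the best (1 - prune) fraction of positions, e.g. the best 10% for prune = 0.9."""
    cutoff = np.quantile(d1, 1.0 - prune)   # e.g. the 10th percentile of the D1 values
    ys, xs = np.where(d1 <= cutoff)         # low dissimilarity = strong first-stage candidate
    return list(zip(ys.tolist(), xs.tolist()))
```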
2.1. Orientation codes
For discrete images, the orientation codes are obtained as quantized values of the gradient angle around each pixel by applying some differential operator, such as the Sobel operator, for computing the horizontal and vertical derivatives, and then applying the arctangent function to their ratio to obtain the gradient angle, such as $\theta = \tan^{-1}\bigl((\partial I/\partial y)/(\partial I/\partial x)\bigr)$. The orientation code for a pixel location $(i, j)$, letting $\theta_{ij}$ be the gradient angle, for a preset sector width $\Delta\theta$ is given as

$$c_{ij} = \begin{cases} \left[\theta_{ij}/\Delta\theta\right], & |I_x| + |I_y| \ge \Gamma,\\ N, & \text{otherwise}, \end{cases} \qquad (1)$$

where $[\,\cdot\,]$ denotes the Gauss (integer rounding) operation. If there are $N$ orientation codes, then $c_{ij}$ is assigned values $\{0, 1, \ldots, N-1\}$. We assign the particular code $N$ to low contrast regions (defined by the threshold $\Gamma$) for which it is not possible to stably compute the gradient angles.
Fig. 1. Block diagram for the proposed framework (orientation code (OC) images are computed for the template and the scene; for all subimages of the same size as the template, the subimage OC histogram is shifted successively and compared with the template histogram to give the dissimilarity values D_1 and the estimated rotation profiles; after candidate selection by pruning, the OC template is rotated by the estimated angle and the OCM dissimilarity D_2 is computed for the selected subimages; D_1 and D_2 are integrated and the minimum is searched).

Fig. 2. Illustration of orientation codes.
For all of the experiments, we used 16 orientation codes corresponding to a sector width Δθ of π/8 radians. An illustration of the orientation codes is shown in Fig. 2. The orientation codes for all pixel locations are computed as a separate image O = {c_ij} (referred to as an orientation code image hereafter). The threshold Γ is important for suppressing the effects of noise and has to be selected according to the problem at hand; very large values can cause suppression of the texture information. For most of our experiments, we used a small threshold value of 10, but for noisy images or images involving occurrences of occlusion, larger values are recommended.
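As a concrete illustration of Eq. (1), the following minimal sketch (not taken from the paper) computes an orientation code image with NumPy, using a 3 × 3 Sobel operator for the derivatives and the values N = 16 and Γ = 10 mentioned above; the function name orientation_code_image and the exact rounding, scaling, and thresholding conventions are our assumptions.

```python
import numpy as np

def orientation_code_image(gray, N=16, gamma=10.0):
    """Orientation code image per Eq. (1): code in {0..N-1}, or N for low-contrast pixels."""
    g = np.asarray(gray, dtype=np.float64)
    p = np.pad(g, 1, mode='edge')
    # 3x3 Sobel derivatives (one of several possible differential operators)
    ix = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    iy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    theta = np.arctan2(iy, ix) % (2 * np.pi)        # gradient angle in [0, 2*pi)
    dtheta = 2 * np.pi / N                          # sector width
    codes = np.floor(theta / dtheta).astype(np.int32)
    codes = np.clip(codes, 0, N - 1)
    # Low-contrast pixels get the extra code N; note that gamma is measured on the
    # Sobel-scaled derivative sum here, which may differ from the paper's scaling.
    codes[np.abs(ix) + np.abs(iy) < gamma] = N
    return codes
```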
2.2. Orientation code histograms
We compute the similarity between a subimage and the template based on the difference between their orientation code histograms. The $i$th bin of the orientation code histogram for an object at a subimage $I_{mn}$ at a position $(m, n)$ can be expressed as

$$h_{mn}(i) = \sum_{(x,y)\in I_{mn}} \delta(i - c_{xy}), \qquad (2)$$

where $\delta(\cdot)$ represents Kronecker's delta. The bins corresponding to $i = 0, 1, \ldots, N-1$ represent the frequency of occurrence of the orientation codes computed by the gradient operation, and the last bin ($i = N$) is the count of the codes corresponding to low contrast regions. The histograms for the subimage, $h_{mn}$, and for the template, $h_T$, can be written compactly as ordered lists:

$$h_{mn} = \{h_{mn}(i)\}_{i=0}^{N}, \qquad h_T = \{h_T(i)\}_{i=0}^{N}.$$
There are different approaches for checking the similarity (or dissimilarity) between two histograms, such as the chi-square statistic, Euclidean distance or city-block distance. We use the city-block metric (sum of absolute differences), which is equivalent to the histogram intersection technique based on the max–min strategy for the cases when the subimage and the template histograms are of the same size [10].
The dissimilarity function $D_1$ between the template and the subimage histograms can be written as

$$D_1 = 1 - \max_k S_k. \qquad (3)$$

The second term in the above expression is the normalized area under the curve obtained by the intersection between the template histogram and the subimage histogram shifted left by $k$ bins (symbolized by the superscript $k$). This intersection area is maximized to find the closest approximation of the angle by which the object may appear rotated in the scene. Since the intersection and city-block differences are complementary, the maximization of $S_k$ is equivalent to minimization of the dissimilarity in Eq. (3). $S_k$ is given by

$$S_k = \frac{1}{M}\left[\sum_{i=0}^{N-1} \min\{h^k_{mn}(i),\, h_T(i)\} + \min\{h_{mn}(N),\, h_T(N)\}\right], \quad k = 0, 1, \ldots, N-1, \qquad (4)$$

where $M$ is the template size and $h^k$ represents the histogram $h$ shifted by $k$ bins, computed as

$$h^k(i) = h((i + k) \bmod N). \qquad (5)$$
The last bin of the histogram, corresponding to the low
contrast pixels, is not shifted and its intersection with the
corresponding bin is added separately as shown in the above
expression. The orientation maximizing the intersection
evaluation of Eq. (4) is stored along with the dissimilarity
function values for reference at the matching stage.
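Building on the sketch in Section 2.1 (and again as illustrative code rather than the authors' implementation), the histogram of Eq. (2) and the shift-and-intersect evaluation of Eqs. (3)–(5) could be written as follows; the helper names and the use of the template histogram sum as the template size M are assumptions.

```python
import numpy as np

def oc_histogram(codes, N=16, mask=None):
    """Orientation code histogram h(0..N); bin N counts the low-contrast pixels (Eq. (2))."""
    vals = codes if mask is None else codes[mask]
    return np.bincount(vals.ravel(), minlength=N + 1).astype(np.float64)

def histogram_dissimilarity(h_sub, h_tpl, N=16):
    """Return (D1, k_best) following Eqs. (3)-(5).

    The subimage histogram is cyclically shifted by k bins (the low-contrast bin N is
    never shifted) and the normalized intersection S_k is maximized over k.
    """
    M = h_tpl.sum()                       # template size (total pixel count)
    best_s, best_k = -1.0, 0
    for k in range(N):
        shifted = h_sub[(np.arange(N) + k) % N]       # h^k(i) = h((i + k) mod N)
        s = (np.minimum(shifted, h_tpl[:N]).sum()
             + min(h_sub[N], h_tpl[N])) / M
        if s > best_s:
            best_s, best_k = s, k
    return 1.0 - best_s, best_k           # D1 = 1 - max_k S_k
```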
A typical example of orientation code histograms is shown in Fig. 3, where (a) and (b) show a template and the corresponding object from the scene, which appears rotated counterclockwise. We use a circular mask so that only the pixels lying within the circles shown on the images are employed. The plot in Fig. 3(c) shows the orientation code histograms for the template and the subimage along with the intersection curve; the plot in (d) shows the same histograms and intersection curve, with the difference that the subimage histogram is shifted cyclically by 3 bins. As can be seen, the shifted histogram closely resembles the template histogram, with a larger area under the intersection curve. The radar plot in Fig. 3(e) shows the values of the areas under the intersection curve for all possible shifts. The maximum value corresponds to code 3, indicating the possibility of the subimage being rotated by about 3 × 22.5° counterclockwise relative to the template. In this example 22.5° is the value of Δθ, implying 16 orientation codes spanning the whole circle, as mentioned earlier in Section 2.1. Note that the last bin, reserved for the cases of low-contrast pixel codes, is excluded from the shifting operation and is used directly with the corresponding bin of the other histogram.
2.3. Orientation code matching
The dissimilarity measure for matching in the second stage is defined as the summation of the differences between the orientation codes of the corresponding pixels. The cyclic property of orientation codes is taken into account for finding the difference. If $O_T$ represents the orientation code image of the template, and $O$ the orientation code image for a subimage position, then the dissimilarity function between them is given by

$$D_2 = \frac{1}{M\,E} \sum_{(i,j)} d\bigl(O(i,j),\, O_T(i,j)\bigr), \qquad (6)$$
Fig. 3. Template and subimage. (a) Template (55 × 55 pixels). (b) Subimage. (c) Original histograms. (d) Shifted histograms for subimage. (e) Radar plot for all shifts.
where $M$ is the total number of pixels used in the match and $E$ is the maximum possible error between any two orientation codes. The error function $d(\cdot)$ is based on the cyclic difference criterion and can be written as

$$d(a, b) = \begin{cases} \min\{|a - b|,\; N - |a - b|\}, & 0 \le a, b \le N-1,\\ N/4, & (a = N \text{ and } b \ne N) \text{ or } (a \ne N \text{ and } b = N),\\ 0, & a = b = N. \end{cases} \qquad (7)$$

In the above evaluation of the error function, the folding property of orientation codes is utilized in order to ensure stability in cases of minor differences between the estimated and the actual orientation of the object being searched. For example, the cyclic difference between the codes 0 and N−1 is 1. It is clear that the maximum error between any two codes is never more than N/2, which is assigned to E. Before performing the registration step, the template orientation code image is rotated by the angle corresponding to the orientation estimated in the first stage.
The OCM-based computation can be efficiently performed by constructing a lookup table, and the above expression for evaluation of the error function d(·) can then be reduced simply to a reference into the table.
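The cyclic error of Eq. (7), its lookup table, and the OCM dissimilarity of Eq. (6) (with E = N/2, as stated above) could be sketched as follows; this is illustrative code under our own naming, not the authors' implementation.

```python
import numpy as np

def ocm_error_table(N=16):
    """Precompute d(a, b) of Eq. (7) for all code pairs, including the low-contrast code N."""
    table = np.empty((N + 1, N + 1), dtype=np.float64)
    for a in range(N + 1):
        for b in range(N + 1):
            if a < N and b < N:
                diff = abs(a - b)
                table[a, b] = min(diff, N - diff)   # cyclic (folding) difference
            elif a == N and b == N:
                table[a, b] = 0.0                   # both pixels are low-contrast
            else:
                table[a, b] = N / 4.0               # exactly one pixel is low-contrast
    return table

def ocm_dissimilarity(oc_sub, oc_tpl, table, N=16, mask=None):
    """D2 of Eq. (6): mean pairwise code error, normalized by E = N/2."""
    err = table[oc_sub, oc_tpl]        # table lookup replaces evaluating d() per pixel
    if mask is not None:
        err = err[mask]
    return err.mean() / (N / 2.0)
```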
2.4. Integrated dissimilarity
The overall dissimilarity is a weighted sum of the dissimilarity evaluations at the histogram matching stage and the OCM stage. It is to be minimized for finding the best match for the given template by evaluating

$$D = \alpha\, D_1 + (1 - \alpha)\, D_2. \qquad (8)$$

The weighting factor $\alpha$ may be used to attach a bias to a particular stage; we used equal weighting for most of our experiments ($\alpha = 0.5$). Since the ranges of the dissimilarity functions in Eqs. (3) and (6) are between 0 and 1, the overall dissimilarity $D$ also varies between 0 and 1.
The significance of integrating the dissimilarity measures can be seen from an example in Fig. 4, where the results of searching for the template are shown for the situations when the orientation code histogram alone was used (labeled OH) and when the integrated evaluation was employed (labeled OH + OCM). In the former case, the search reported an erroneous object (containing similar features to the template) as the best match, while the ground truth location ranked 181st in the order of best matches (equivalent to the 99.39th percentile). The integrated measure was successful in finding the correct location as the best match.
Fig. 4. Searches based on D_1 and D evaluations. (a) OH, OH + OCM. (b) Template.
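Putting the pieces together, a minimal, unoptimized sketch of the two-stage search with the integrated dissimilarity of Eq. (8) might look as follows. It reuses the helper functions sketched in the previous sections, uses scipy.ndimage.rotate with nearest-neighbour interpolation to rotate the orientation code template, and assumes a particular sign convention for the estimated shift k*, so it should be read as a sketch of the idea rather than the authors' implementation.

```python
import numpy as np
from scipy.ndimage import rotate

def two_stage_search(scene_codes, tpl_codes, N=16, alpha=0.5, prune=0.9):
    """Two-stage search: histogram stage (D1, k*), pruning, OCM stage (D2), Eq. (8)."""
    th, tw = tpl_codes.shape
    table = ocm_error_table(N)              # from the OCM sketch above
    h_tpl = oc_histogram(tpl_codes, N)      # from the histogram sketch above

    # Stage 1: histogram dissimilarity D1 and estimated shift k* at every position.
    rows, cols = scene_codes.shape[0] - th + 1, scene_codes.shape[1] - tw + 1
    d1 = np.empty((rows, cols))
    kmap = np.empty((rows, cols), dtype=int)
    for y in range(rows):
        for x in range(cols):
            h_sub = oc_histogram(scene_codes[y:y + th, x:x + tw], N)
            d1[y, x], kmap[y, x] = histogram_dissimilarity(h_sub, h_tpl, N)

    # Stage 2: OCM only for the best (1 - prune) fraction of first-stage candidates.
    cutoff = np.quantile(d1, 1.0 - prune)
    best = (np.inf, None, 0)                # (D, position, k*)
    for y, x in zip(*np.where(d1 <= cutoff)):
        k = int(kmap[y, x])
        # Rotate the OC template by k sectors; the code values themselves rotate too.
        # (The sign convention must be kept consistent with the estimation stage.)
        rot = rotate(tpl_codes, angle=k * 360.0 / N, reshape=False,
                     order=0, mode='constant', cval=N)
        rot[rot < N] = (rot[rot < N] + k) % N
        d2 = ocm_dissimilarity(scene_codes[y:y + th, x:x + tw], rot, table, N)
        d = alpha * d1[y, x] + (1.0 - alpha) * d2       # Eq. (8)
        if d < best[0]:
            best = (d, (int(y), int(x)), k)
    return best
```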
2.5. Quantizing angle
In the case of pixel matching-based search, an object
can match the template if the relative difference between its angle in the template and the angle of rotation in the scene is not significant. This is due to the quantized nature of image pixels and the fact that gray values or their gradient information have some similarity with their neighbors. For example, if we assume a template containing a circular object, then the pixel corresponding to the center of the rotated object should have the same gray value as that in the original template regardless of its angle of rotation in the scene. However, as we move farther from the center to the periphery, the effect of the rotation angle increases with the radial distance. An object rotated by a small angle will still contain a large area in the middle that has some similarity with the unrotated template and can be matched directly. These angles of tolerance for the object (which allow a direct match with the unrotated template) may depend on the size and contents of the template used. For OCM-based search, matching the template object with the same object rotated by an arbitrary angle requires that the tolerance angle be at least Δθ (to accommodate ±Δθ/2) when the template is rotated by the nearest quantum of Δθ for realizing a successful search.
The quantizing angle width Δθ and the number of orientation codes have an inverse relationship, as described in Section 2.1. Increasing the number of codes will increase the number of bins required for histogram construction, improving the accuracy of the dissimilarity evaluation in the first stage. However, since all bins of the histogram are shifted one by one for finding the approximate angle of rotation, an increase of one bin will result in the addition of another shifted histogram matching operation, as required in Eq. (4). Since the objective of this study is to locate the object by only approximating the angle of rotation, we can allow some error between the actual angle of rotation and the one used for rotating the template as long as this error is within the acceptable tolerance range.
After experimenting with different numbers of orientation codes on artificially rotated scene images (with various degrees of rotation) and with added Gaussian noise over a wide range of variances, we found 16 or 20 codes to be reasonable choices in terms of computation time and search efficiency; we used 16 orientation codes for all the experiments reported here.
2.6. Computation time
The advantage of using the proposed method can be understood from the fact that it alleviates the need for rotating the template by many angles before finding the closest similarity. The usual approach involves rotating the template by at least the number of orientations found suitable for successful matching within the margin of the tolerance angle. As mentioned earlier, for our experiments we used 16 orientations for matching the template with an arbitrarily rotated object, so conventional matching would require at least 16 times the computation time needed for the search of one template. For the computer system used, the average computation time required for searching one template of size 75 × 75 pixels (effectively 4053 pixels due to the use of the circular mask) in a scene of size 256 × 256 is about 5547 ms using OCM alone, which makes the time required for searching all 16 possibilities of the rotated templates 16 × 5547 = 88752 ms.
For histogram-based matching in the first stage of the two-stage framework proposed here, the computation time depends on the number of orientation codes; with 16 codes employed on the same template–scene pair mentioned above, it was about 7046 ms. The pruning threshold level j is the determining factor for processing in the second stage; it varies from 0 (matching for all positions) to 0.99 (matching for only the top 1% of the candidate positions). For the threshold level of 0.9 (used for most of the experiments), the total computation time of the second stage was about 563 ms (approximately 10% of the matching time of OCM alone), resulting in a total search time for the proposed algorithm of about 7609 ms.
3. Experiments
For the experimental verification of the search algorithm
proposed above, we took many images of a magazine cover
by rotating it by various arbitrary angles. The experimental
setup for image capture is detailed in Table 1. A template
was extracted from one image (encircled in Fig. 5(a)) which
was searched in all subsequent images. The proposed algo-
rithm was successful in determining the true position of the
template and estimating the approximate angle of rotation
for all the images regardless of the angle of rotation. Sam-
ples of the results are shown in Fig. 5 where circle marks
on images enclose the region returned by the algorithm as
Table 1
Experimental setup
Camera: JAI CV-M10BX CCD, progressive scan, monochrome
Lens: focal length = 16 mm
Capture board: Leutron PicPort framegrabber
Image type: 8-bit gray scale
CPU: Pentium-III, 1.0 GHz
OS: Windows 2000
the closest match of the template object after searching the
whole image. Orientation shifts (k*) estimated in the first stage of the framework, and later used for approximating the
angle of rotation, have been noted below each result along
with the overall dissimilarity (D).
Another set was prepared by rotating an object under the
conditions of varying illumination. Some of the results are
shown in Fig. 6(a)–(f), with the template in the upper left corner of (a). As can be seen from the results, the variations in lighting conditions were handled effectively by the proposed algorithm.
The weighting factor α was set to 0.5 (equal weighting for both dissimilarity measures) for all the above cases. The pruning threshold level j for eliminating the unlikely candidates was set to 0.9 for the images in Figs. 5 and 6, except for Figs. 6(c) and (f), which were successful when it was lowered to 0.6 and 0.85, respectively.
Some cases of parameter manipulation for searching objects involving minor occlusions are shown in Fig. 7(a)–(c) for the template shown in the upper left corner of (a). The histogram-based computation of the first stage is not robust for searching effectively with the occurrence of occlusion; however, for the reported results, the low-contrast threshold level Γ was increased from 10 to 40 for improving the noise margin, the weighting factor α corresponding to the dissimilarity D_1 was reduced to 0.4, and the pruning threshold j was set to 0.8 for accommodating more candidate positions at the matching stage.
Similarity profiles for a test image (Fig. 5(j)) are shown in Fig. 8 around the position found by the algorithm as the best match (profiles are shown inverted for better visibility). Plots corresponding to the two individual dissimilarity measures (D_1 and D_2) are given in Figs. 8(a) and (b), while the integrated dissimilarity (D) is shown in (c). The noise margin corresponding to D_1 is low, and the true position ranked 117th (99.63rd percentile) in the order of smallest evaluation value. For the OCM-based dissimilarity D_2 and the overall evaluation D, the peak values correspond to the correct position.
4. Conclusions
A new method based on a two-stage framework for real-
izing orientation-invariant template matching is presented.
The method employs the orientation detection property of
Fig. 5. Sample images matched against the template. Image size = 256 × 256 pixels, template size = 75 × 75 pixels. (a) Image for template extraction. (b) k* = 1, D = 0.196. (c) k* = 2, D = 0.152. (d) k* = 3, D = 0.081. (e) k* = 4, D = 0.214. (f) k* = 5, D = 0.234. (g) k* = 6, D = 0.238. (h) k* = 7, D = 0.218. (i) k* = 8, D = 0.241. (j) k* = 9, D = 0.166. (k) k* = 10, D = 0.209. (l) k* = 11, D = 0.199. (m) k* = 12, D = 0.209. (n) k* = 13, D = 0.232. (o) k* = 14, D = 0.234. (p) k* = 14, D = 0.17.
orientation histograms and the robustness of the previously proposed orientation code matching-based image registration algorithm for correctly aligning the template with the closest match position. Experiments on real world images show the effectiveness of the proposed technique in searching for objects rotated by arbitrary angles. This method is
robust even in the cases of illumination variations in the
scene since it does not rely on the image brightness directly.
Selection of the pruning threshold level j has some influence on the efficiency of the proposed method; too small a value means that the search has to be carried out at more locations in the second stage at the expense of search time, while too large a value can leave out some desirable candidates which may have smaller similarity evaluations in the first stage due to partial occlusion or noise. For noisy images, j should be reduced to lower levels with the increase in noise variance. For such cases, the weighting factor α should also be changed to give more importance to the OCM-based dissimilarity.
In general, this method is not robust for searching objects under occlusion and may require some segmentation pre-processing; however, some ad hoc variations in parameters such as the low-contrast threshold Γ, the weighting factor α, or the pruning threshold level j have proved to be effective for tuning the search algorithm to handle cases of partial occlusion as well, as reported in Section 3.
There is still some room for improving the search efficiency by reducing the computation time of the histogram-based computation of the first stage. The search time reported in this paper is without optimization, which can be achieved by eliminating redundant operations, taking into account the overlapping regions of adjacent locations instead of performing a fresh computation for the whole subimage at each location.
Fig. 6. Example of matching under illumination changes. Image size = 320 × 240 pixels, template size = 59 × 59 pixels. (a) k* = 15, D = 0.135. (b) k* = 10, D = 0.159. (c) k* = 7, D = 0.248. (d) k* = 8, D = 0.211. (e) k* = 8, D = 0.149. (f) k* = 4, D = 0.185.
Fig. 7. Example of matching with occurrence of occlusion. Image size = 256 × 256 pixels, template size = 61 × 61 pixels. (a) k* = 13, D = 0.216. (b) k* = 13, D = 0.214. (c) k* = 13, D = 0.228.
Fig. 8. Similarity profiles for Fig. 5(j) around (90, 79). (a) Similarity surface for D_1. (b) Similarity surface for D_2. (c) Similarity surface for the integrated dissimilarity D. (Axes: horizontal position, vertical position, dissimilarity function.)
5. Summary
A new method for rotation-invariant template matching
in gray scale images is proposed. It is based on the utiliza-
tion of gradient information in the form of orientation codes
as the feature for approximating the rotation angle as well
as for matching. Orientation codes-based matching is
robust for searching objects in cluttered environments even
in the cases of illumination fluctuations resulting from shadowing or highlighting, etc. We use a two-stage framework for realizing the rotation-invariant template matching; in the first stage, histograms of orientation codes are employed for approximating the rotation angle of the object and then in the second stage, matching is performed by rotating the object template by the estimated angle. Matching in the second stage is performed only for the positions which have higher similarity results in the first stage, thereby pruning out insignificant locations to speed up the search. Experiments
with real world scenes demonstrate the rotation- and bright-
ness invariance of the proposed method for performing
object search.
References
[1] S. Hutchinson, G. Hager, P. Corke, A tutorial introduction to visual servo control, IEEE Trans. Robotics Automation 12 (5) (1996) 651–670.
[2] N. Papanikolopoulos, P. Khosla, T. Kanade, Visual tracking of a moving target by a camera mounted on a robot: a combination of control and vision, IEEE Trans. Robotics Automation 9 (1) (1993) 14–35.
[3] E. Bardinet, L. Cohen, N. Ayache, Tracking medical 3D data with a deformable parametric model, Proc. European Conference on Computer Vision 2 (1996) 09118.
[4] P. Anandan, A computational framework and an algorithm for the measurement of visual motion, Int. J. Comput. Vision 2 (3) (1989) 283–310.
[5] D. Ballard, Generalizing the Hough transform to detect arbitrary shapes, Pattern Recognition 13 (2) (1981) 111–122.
[6] H.J. Wolfson, I. Rigoutsos, Geometric hashing: an overview, IEEE Comput. Sci. Eng. 1 (1997) 10–21.
[7] T. Leung, M. Burl, P. Perona, Finding faces in cluttered scenes using random labeled graph matching, Proceedings of the Fifth ICCV, Cambridge, 1995, pp. 637–644.
[8] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Machine Intell. 10 (4) (1988) 496–513.
[9] S.X. Liao, M. Pawlak, On image analysis by moments, IEEE Trans. Pattern Anal. Machine Intell. 18 (3) (1996) 254–266.
[10] M. Swain, D. Ballard, Color indexing, Int. J. Comput. Vision 7 (1) (1991) 11–32.
[11] J. Huang, S. Kumar, M. Mitra, W. Zhu, R. Zabih, Image indexing using color correlograms, Proceedings of IEEE CVPR, San Juan, 1997, pp. 762–768.
[12] P. Chang, J. Krumm, Object recognition with color cooccurrence histograms, Proceedings of IEEE CVPR, San Juan, pp. 1063–1069.
[13] M. Gorkani, R. Picard, Texture orientation for sorting photos at a glance, Proceedings of the 12th ICPR, Jerusalem, Vol. 1, 1994, pp. 459–464.
[14] W. Freeman, M. Roth, Orientation histograms for hand gesture recognition, IEEE International Workshop on Automatic Face and Gesture Recognition, 1995.
[15] V. Kovalev, M. Petrou, Y. Bondar, Using orientation tokens for object recognition, Pattern Recognition Lett. 19 (12) (1998) 1125–1132.
[16] F. Ullah, S. Kaneko, S. Igarashi, Object search using orientation histogram intersection, Proceedings of Japan–Korea Joint Workshop on Frontiers of Computer Vision, Seoul, 2000, pp. 110–115.
[17] L. Brown, A survey of image registration techniques, ACM Comput. Surveys 24 (4) (1992) 325–376.
[18] A. Jain, A. Vailaya, Image retrieval using color and shape, Pattern Recognition 29 (8) (1996) 1233–1244.
[19] A. Jain, A. Vailaya, Shape-based retrieval: a case study with trademark image databases, Pattern Recognition 31 (9) (1998) 1369–1399.
[20] F. Ullah, S. Kaneko, S. Igarashi, Orientation code matching for robust object search, IEICE Trans. Inform. Systems E84-D (8) (2001) 999–1006.
About the Author: FARHAN ULLAH received the B.E. in Computer Systems Engineering from N.E.D. University of Engineering and Technology, Karachi, Pakistan, and the M.Sc. in Systems Engineering from Quaid-i-Azam University, Islamabad, in 1990 and 1992, respectively, and the Ph.D. in Systems Engineering from Hokkaido University, Sapporo, Japan, in 2002. He is a Senior Engineer at Informatics Complex, Islamabad, Pakistan. He is interested in computer vision, pattern recognition, and robotic intelligence. He is a member of the IEEE Computer Society and ACM.
About the Author: SHUNICHI KANEKO received the B.S. degree in precision engineering and the M.S. degree in information engineering from Hokkaido University, Japan, in 1978 and 1980, respectively, and the Ph.D. degree in systems engineering from the University of Tokyo, Japan, in 1990. He was a Research Assistant in the Department of Computer Science from 1980 to 1991, an Associate Professor in the Department of Electronic Engineering from 1991 to 1995, and an Associate Professor in the Department of Bio-application and Systems Engineering in 1996, at Tokyo University of Agriculture and Technology, Japan. He has been an Associate Professor in the Department of Control and Information Engineering at Hokkaido University since 1996. He received the Best Paper Award in 1990 and the Society Award in 1998 from the Japan Society for Precision Engineering. His research interests include machine vision, image sensing and understanding, and robust image registration. He is a member of IEICE, JSPE, IEEJ, SICE and the IEEE Computer Society.
