www.elsevier.com/locate/patcog
Using orientation codes for rotation-invariant template matching
Farhan Ullah, Shunichi Kaneko
Graduate School of Engineering, Hokkaido University, Kita-13 Nishi 8, Kitaku, Sapporo 060-8628, Japan
Received 5 August 2002; accepted 28 January 2003
Abstract
A new method for rotation-invariant template matching in gray scale images is proposed. It is based on the utilization of gradient information in the form of orientation codes as the feature for approximating the rotation angle as well as for matching. Orientation code-based matching is robust for searching objects in cluttered environments even in the cases of illumination fluctuations resulting from shadowing or highlighting, etc. We use a two-stage framework for realizing the rotation-invariant template matching; in the first stage, histograms of orientation codes are employed for approximating the rotation angle of the object, and then in the second stage, matching is performed by rotating the object template by the estimated angle. Matching in the second stage is performed only for the positions which have higher similarity results in the first stage, thereby pruning out insignificant locations to speed up the search. Experiments with real-world scenes demonstrate the rotation and brightness invariance of the proposed method for performing object search.
© 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Orientation code; Rotation invariance; Object search; Matching
1. Introduction
Rotation invariance has been one of the desirable features in template matching or registration and a basis for many applications, such as visual control [1,2], medical imaging [3] and computation of visual motion [4]. Various approaches have been used for achieving the invariance, such as the generalized Hough transform (GHT) [5], geometric hashing (GH) [6], graph matching [7], and geometric moment-based matching [8,9]. GHT is a powerful tool for rotation- and scale-invariant matching that utilizes voting through local evidences like edges, but it needs enormous memory to keep the voting space for selecting the largest score for discrimination, so measures for keeping the memory requirement reasonable have to be devised.
The orientation code for a pixel is obtained by quantizing the gradient angle $\theta_{x,y}$:

$$c_{x,y} = \begin{cases} \left[\theta_{x,y}/\Delta\theta\right] & \text{if } |\nabla I_x| + |\nabla I_y| \geq \Gamma, \\ N & \text{otherwise,} \end{cases} \qquad (1)$$

where $[\,\cdot\,]$ is the Gauss (rounding) operation.
If there are N orientation codes, then $c_{x,y}$ is assigned values $\{0, 1, \ldots, N-1\}$. We assign the particular code $N$ to low contrast regions (defined by the threshold $\Gamma$) for which it is not possible to stably compute the gradient angles. For
F. Ullah, S. Kaneko / Pattern Recognition 37 (2004) 201–209
[Figure 1: block diagram. A gray template and gray image are converted to an orientation code (OC) template and OC image. For all subimages of the same size as the template, the OC histograms of each subimage are shifted successively and their similarity with the template histogram is computed, yielding estimated rotation profiles and dissimilarity values (D1). After candidate selection (pruning), the OC template is rotated by the estimated angle for each selected subimage and the OCM dissimilarity (D2) is computed; D1 and D2 are integrated and the minimum is searched.]
Fig. 1. Block diagram for the proposed framework.
[Figure 2: a circle divided into 16 sectors of width π/8, labeled with orientation codes 0–15 at angles 0, π/8, π/4, 3π/8, π/2, ..., 15π/8.]
Fig. 2. Illustration of orientation codes.
all of the experiments, we used 16 orientation codes corresponding to a sector width $\Delta\theta$ of $\pi/8$ radians. An illustration of orientation codes is shown in Fig. 2. The orientation codes for all pixel locations are computed as a separate image $O = \{c_{x,y}\}$ (referred to as an orientation code image hereafter). The threshold $\Gamma$ is important for suppressing the effects of noise and has to be selected according to the problem at hand; very large values can cause suppression of the texture information. For most of our experiments, we used a small threshold value of 10, but for noisy images or images involving occurrences of occlusion, larger values are recommended.
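As a concrete sketch of Eq. (1), the orientation code image can be computed with a few lines of NumPy. The central-difference gradient operator, the flooring convention, and the function name here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def orientation_code_image(img, N=16, gamma=10.0):
    """Quantize gradient orientations into N codes; code N marks low contrast.

    img: 2-D gray-level array. gamma: low-contrast threshold on
    |dI/dx| + |dI/dy|, as in Eq. (1).
    """
    img = np.asarray(img, dtype=float)
    # np.gradient returns derivatives in axis order: rows (y) first, columns (x) second.
    gy, gx = np.gradient(img)
    theta = np.arctan2(gy, gx) % (2 * np.pi)   # gradient angle in [0, 2*pi)
    dtheta = 2 * np.pi / N                     # sector width (pi/8 for N = 16)
    codes = np.floor(theta / dtheta).astype(int) % N
    low_contrast = (np.abs(gx) + np.abs(gy)) < gamma
    codes[low_contrast] = N                    # reserved code for flat regions
    return codes
```

A horizontal intensity ramp, for example, has gradient angle 0 everywhere and so maps to code 0, while a flat image maps entirely to the reserved code N.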
2.2. Orientation code histograms
We compute the similarity between a subimage and the template based on the difference between their orientation code histograms. The $i$th bin of the orientation code histogram for an object at a subimage $I_{mn}$ at position $(m, n)$ can be expressed as

$$h_{mn}(i) = \sum_{(x,y) \in I_{mn}} \delta(i - c_{x,y}), \qquad (2)$$
where $\delta(\cdot)$ represents the Kronecker delta. The bins corresponding to $i = 0, 1, \ldots, N-1$ represent the frequency of occurrence of the orientation codes computed by the gradient operation, and the last bin ($i = N$) is the count of the codes corresponding to low contrast regions. The histograms for the subimage, $h_{mn}$, and for the template, $h_T$, can be written compactly as ordered lists:

$$h_{mn} = \{h_{mn}(i)\}_{i=0}^{N}, \qquad h_T = \{h_T(i)\}_{i=0}^{N}.$$
There are different approaches for checking the similarity (or dissimilarity) between two histograms, such as the chi-square statistic, Euclidean distance or city-block distance. We use the city-block metric (sum of absolute differences), which is equivalent to the histogram intersection technique based on the max-min strategy for the cases when the subimage and the template histograms are of the same size [10].
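The equivalence of the city-block distance and histogram intersection for two histograms of equal total mass $M$ follows from the identity $|a - b| = a + b - 2\min(a, b)$:

```latex
\sum_i |h_1(i) - h_2(i)|
  = \sum_i \bigl(h_1(i) + h_2(i)\bigr) - 2\sum_i \min\{h_1(i), h_2(i)\}
  = 2M - 2\sum_i \min\{h_1(i), h_2(i)\},
```

so minimizing the city-block distance is the same as maximizing the intersection area when both histograms sum to $M$.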
The dissimilarity function $D_1$ between the template and the subimage histograms can be written as

$$D_1 = 1 - \max_{k} S^{k}. \qquad (3)$$
The second term in the above expression is the normalized area under the curve obtained by the intersection between the template histogram and the subimage histogram shifted left by $k$ bins (symbolized by the superscript $k$). This intersection area is maximized to find the closest approximation of the angle by which the object may appear rotated in the scene. Since the intersection and city-block differences are complementary, the maximization of $S^{k}$ is equivalent to minimization of the dissimilarity in Eq. (3). $S^{k}$ is given by
$$S^{k} = \frac{1}{M}\left[\,\sum_{i=0}^{N-1} \min\{h_{mn}^{k}(i),\, h_T(i)\} + \min\{h_{mn}(N),\, h_T(N)\}\right], \quad k = 0, 1, \ldots, N-1, \qquad (4)$$

where $M$ is the template size and $h^{k}$ represents the histogram $h$ shifted by $k$ bins, computed as

$$h^{k}(i) = h((i + k) \bmod N). \qquad (5)$$
The last bin of the histogram, corresponding to the low
contrast pixels, is not shifted and its intersection with the
corresponding bin is added separately as shown in the above
expression. The orientation maximizing the intersection
evaluation of Eq. (4) is stored along with the dissimilarity
function values for reference at the matching stage.
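Eqs. (2)–(5) can be sketched in NumPy as follows. The helper names are hypothetical, and using `np.roll` with a negative shift to realize $h^k(i) = h((i+k) \bmod N)$ is an implementation choice:

```python
import numpy as np

def oc_histogram(codes, N=16):
    """Eq. (2): bins 0..N-1 count orientation codes; bin N counts low-contrast pixels."""
    return np.bincount(codes.ravel(), minlength=N + 1)

def best_shift_and_d1(h_sub, h_tpl, N=16):
    """Maximize the shifted intersection S^k of Eq. (4); return (k*, D1) per Eq. (3)."""
    M = h_sub.sum()                            # number of pixels in the (sub)image
    best_k, best_s = 0, -1.0
    for k in range(N):
        shifted = np.roll(h_sub[:N], -k)       # shifted[i] = h_sub[(i + k) % N], Eq. (5)
        # Last bin (low contrast) is intersected separately, without shifting.
        s = (np.minimum(shifted, h_tpl[:N]).sum()
             + min(h_sub[N], h_tpl[N])) / M
        if s > best_s:
            best_k, best_s = k, s
    return best_k, 1.0 - best_s
```

For a subimage histogram that is an exact 3-bin cyclic shift of the template histogram, the routine recovers $k^{*} = 3$ with $D_1 = 0$.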
A typical example of orientation code histograms is shown in Fig. 3, where (a) and (b) show a template and the corresponding object from the scene, which appears rotated counterclockwise. We use a circular mask for employing only the pixels lying within the circles shown on the images. The plot in Fig. 3(c) shows the orientation code histograms for the template and the subimage along with the intersection curve; the plot in (d) shows the same histograms and intersection curve with the difference that the subimage histogram is shifted cyclically by 3 bins. As can be seen, the shifted histogram closely resembles the template histogram, with a larger area under the intersection curve. The radar plot in Fig. 3(e) shows the values of the areas under the intersection curve for all possible shifts. The maximum value corresponds to code 3, indicating the possibility of the subimage being rotated by about 3 × 22.5° counterclockwise relative to the template. In this example, 22.5° is the value of $\Delta\theta$, implying 16 orientation codes spanning the whole circle as mentioned earlier in Section 2.1. Note that the last bin, reserved for the cases of low contrast pixel codes, is excluded from the shifting operation and is used directly with the corresponding bin of the other histogram.
2.3. Orientation code matching

The dissimilarity measure for matching in the second stage is defined as the summation of the differences between the orientation codes of the corresponding pixels. The cyclic property of orientation codes is taken into account for finding the difference. If $O_T$ represents the orientation code image of the template, and $O$ the orientation code image for a subimage position, then the dissimilarity function between them is given by

$$D_2 = \frac{1}{ME} \sum_{(i,j)} d\bigl(O(i,j),\, O_T(i,j)\bigr), \qquad (6)$$
[Figure 3: panels (a), (b) show the template and subimage; (c), (d) show the orientation code histograms (codes 0–16, counts up to about 250) with the intersection curve, before and after shifting the subimage histogram left by 3 bins; (e) is a radar plot of the intersection areas for shifts 0–15.]
Fig. 3. Template and subimage. (a) Template (55 × 55 pixels). (b) Subimage. (c) Original histograms. (d) Shifted histograms for subimage. (e) Radar plot for all shifts.
where $M$ is the total number of pixels used in the match and $E$ is the maximum possible error between any two orientation codes. The error function $d(\cdot)$ is based on the cyclic difference criterion and can be written as

$$d(a, b) = \min\{|a - b|,\, N - |a - b|\}. \qquad (7)$$

The estimated angle can differ from the true rotation by up to $\Delta\theta/2$ when the template is rotated by the nearest quantum of $\Delta\theta$ for realizing a successful search.
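The cyclic difference and the normalized dissimilarity of Eq. (6) can be sketched as below. Treating $E$ as $N/2$ (the largest cyclic difference) and simply excluding low-contrast pixels (code $N$) from the sum are assumptions of this sketch; the paper's full error function may handle code $N$ differently:

```python
import numpy as np

def cyclic_diff(a, b, N=16):
    """Eq. (7): cyclic code difference min(|a - b|, N - |a - b|)."""
    d = np.abs(a - b)
    return np.minimum(d, N - d)

def ocm_dissimilarity(oc_sub, oc_tpl, N=16):
    """D2 of Eq. (6): mean cyclic difference normalized by the maximum error E = N/2.

    Low-contrast pixels (code N) are excluded here for simplicity.
    """
    valid = (oc_sub < N) & (oc_tpl < N)
    if not valid.any():
        return 1.0
    E = N // 2
    return cyclic_diff(oc_sub[valid], oc_tpl[valid], N).mean() / E
```

Identical code images give $D_2 = 0$, and images whose codes are uniformly offset by $N/2$ (a 180° rotation of every gradient) give the maximum $D_2 = 1$.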
The quantizing angle width $\Delta\theta$ and the number of orientation codes have an inverse relationship, as reported in Section 2.1. Increasing the number of codes will increase the number of bins required for histogram construction for the evaluation of dissimilarity in the first stage, with better accuracy. However, since all bins of the histogram are shifted one by one for finding the approximate angle of rotation, an increase of one bin will result in the addition of another shifted histogram matching operation, as required in Eq. (4). Since the objective of this study is to locate the object by only approximating the angle of rotation, we can allow some error between the actual angle of rotation and the one used for rotating the template, as long as this error is within the acceptable tolerance range.
After experimenting with different numbers of orientation codes on artificially rotated scene images (with various degrees of rotation) and the addition of Gaussian noise with a wide range of variances, we found 16 or 20 codes to be reasonable choices in terms of computation time and search efficiency; we used 16 orientation codes for all the experiments reported here.
2.6. Computation time

The advantage of the proposed method can be understood from the fact that it alleviates the need for rotating the template by many angles before finding the closest similarity. The usual method would involve rotating the template by at least the number of orientation codes we find suitable for successful matching within the margin of the tolerance angle. As mentioned earlier, since our experiments used 16 orientations for matching the template with an arbitrarily rotated object, the conventional matching would require at least 16 times the computation time required for the search of one template. For the computer system used, the average computation time required for searching one template of size 75 × 75 pixels (effectively 4053 pixels due to the use of a circular mask) in a scene of size 256 × 256 is about 5547 ms by using OCM alone, which makes the required time for searching all 16 possibilities of the rotated templates 16 × 5547 = 88752 ms.
For histogram-based matching in the first stage of the two-stage framework proposed here, the computation time depends on the number of orientation codes; with 16 codes employed on the same template and scene pair mentioned above, it was about 7046 ms. The pruning threshold level is the determining factor for processing in the second stage; it varies from 0 (matching for all positions) to 0.99 (matching for only the top 1% of the candidate positions). For the threshold level of 0.9 (used for most of the experiments), the computation time of the second stage was about 563 ms (approximately 10% of the matching time of OCM alone), resulting in a total search time for the proposed algorithm of about 7609 ms.
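The two-stage search with pruning can be sketched as follows. The quantile-based cutoff, the weighting factor `alpha`, and the linear integration of $D_1$ and $D_2$ are assumptions for illustration; the paper's exact integration rule is not reproduced here:

```python
import numpy as np

def two_stage_search(d1_map, d2_fn, prune=0.9, alpha=0.5):
    """Two-stage search sketch.

    First stage: keep only positions whose dissimilarity D1 is in the best
    (1 - prune) fraction. Second stage: evaluate the costly OCM dissimilarity
    D2 only there, integrate, and return the minimizing position.

    d1_map: 2-D array of first-stage dissimilarities for all positions.
    d2_fn(pos): callable computing D2 at one position.
    """
    cutoff = np.quantile(d1_map, 1.0 - prune)   # prune=0.9 keeps the best 10%
    candidates = np.argwhere(d1_map <= cutoff)  # smaller D1 means more similar
    best_pos, best_d = None, np.inf
    for pos in map(tuple, candidates):
        d = alpha * d1_map[pos] + (1 - alpha) * d2_fn(pos)
        if d < best_d:
            best_pos, best_d = pos, d
    return best_pos, best_d
```

With a constant second-stage cost, the minimum of the integrated dissimilarity coincides with the first-stage minimum, as expected.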
3. Experiments

For the experimental verification of the search algorithm proposed above, we took many images of a magazine cover by rotating it by various arbitrary angles. The experimental setup for image capture is detailed in Table 1. A template was extracted from one image (encircled in Fig. 5(a)) and was searched in all subsequent images. The proposed algorithm was successful in determining the true position of the template and estimating the approximate angle of rotation for all the images, regardless of the angle of rotation. Samples of the results are shown in Fig. 5, where circle marks on the images enclose the region returned by the algorithm as
Table 1
Experimental setup
Camera JAI CV-M10BX CCD
Progressive scan monochrome
Lens Focal length = 16 mm
Capture board Leutron Picport Framegrabber
Image type 8-bit gray scale
CPU Pentium-III, 1.0 GHz
OS Windows 2000
the closest match of the template object after searching the whole image. The orientation shifts $k^*$ and dissimilarity values $D$ for the samples are: (b) $k^* = 1$, $D = 0.196$; (c) $k^* = 2$, $D = 0.152$; (d) $k^* = 3$, $D = 0.081$; (e) $k^* = 4$, $D = 0.214$; (f) $k^* = 5$, $D = 0.234$; (g) $k^* = 6$, $D = 0.238$; (h) $k^* = 7$, $D = 0.218$; (i) $k^* = 8$, $D = 0.241$; (j) $k^* = 9$, $D = 0.166$; (k) $k^* = 11$, $D = 0.199$; (m) $k^* = 14$, $D = 0.17$.
orientation histograms and the robustness of the previously proposed orientation code matching-based image registration algorithm for correctly aligning the template with the closest match position. Experiments on real-world images show the effectiveness of the proposed technique in searching for objects rotated by arbitrary angles. This method is robust even in the cases of illumination variations in the scene, since it does not rely on the image brightness directly.
The selection of the pruning threshold level has some influence on the efficiency of the proposed method; too small a value means that the search has to be carried out at more locations in the second stage, at the expense of search time, while too large a value can leave out some desirable candidates which may have smaller similarity evaluations in the first stage due to partial occlusion or noise. For noisy images, the pruning threshold should be reduced to lower levels with the increase in noise variance. For such cases, the weighting factor should also be changed to give more importance to the OCM-based dissimilarity.
In general, this method is not robust for searching objects under occlusion and may require some segmentation pre-processing; however, some ad hoc variations in parameters such as the low-contrast threshold $\Gamma$, the weighting factor, or the pruning threshold level have proved to be effective for tuning the search algorithm to handle cases of partial occlusion as well, as reported in Section 3.
There is still some room for improving the search efficiency by reducing the computation time of the histogram-based first stage. The search time reported in this paper is without optimization, which can be achieved by eliminating redundant operations: the overlapping regions of adjacent locations can be taken into account instead of computing the histogram afresh over the whole subimage at each location.
[Figure 6: six panels (a)–(f).]
Fig. 6. Example of matching under illumination changes. Image size = 320 × 240 pixels, template size = 59 × 59 pixels. (a) $k^* = 15$, $D = 0.135$. (b) $k^* = 7$, $D = 0.248$. (d) $k^* = 8$, $D = 0.211$. (e) $k^* = 8$, $D = 0.149$. (f) $k^* = 4$, $D = 0.185$.
[Figure 7: three panels (a)–(c).]
Fig. 7. Example of matching with occurrence of occlusion. Image size = 256 × 256 pixels, template size = 61 × 61 pixels. (a) $k^* = 13$, $D = 0.216$. (b) $k^* = 13$, $D = 0.228$.
[Figure 8: three surface plots of dissimilarity versus horizontal and vertical position.]
Fig. 8. Similarity profiles for Fig. 5(j) around (90, 79). (a) Similarity surface for $D_1$. (b) Similarity surface for $D_2$. (c) Similarity surface for the integrated dissimilarity $D$.
5. Summary

A new method for rotation-invariant template matching in gray scale images is proposed. It is based on the utilization of gradient information in the form of orientation codes as the feature for approximating the rotation angle as well as for matching. Orientation code-based matching is robust for searching objects in cluttered environments even in the cases of illumination fluctuations resulting from shadowing or highlighting, etc. We use a two-stage framework for realizing the rotation-invariant template matching; in the first stage, histograms of orientation codes are employed for approximating the rotation angle of the object, and then in the second stage, matching is performed by rotating the object template by the estimated angle. Matching in the second stage is performed only for the positions which have higher similarity results in the first stage, thereby pruning out insignificant locations to speed up the search. Experiments with real-world scenes demonstrate the rotation and brightness invariance of the proposed method for performing object search.
References
[1] S. Hutchinson, G. Hager, P. Corke, A tutorial introduction to visual servo control, IEEE Trans. Robotics Automation 12 (5) (1996) 651–670.
[2] N. Papanikolopoulos, P. Khosla, T. Kanade, Visual tracking of a moving target by a camera mounted on a robot: a combination of control and vision, IEEE Trans. Robotics Automation 9 (1) (1993) 14–35.
[3] E. Bardinet, L. Cohen, N. Ayache, Tracking medical 3D data with a deformable parametric model, Proc. European Conference on Computer Vision 2 (1996) 09118.
[4] P. Anandan, A computational framework and an algorithm for the measurement of visual motion, Int. J. Comput. Vision 2 (3) (1989) 283–310.
[5] D. Ballard, Generalizing the Hough transform to detect arbitrary shapes, IEEE Trans. Pattern Anal. Machine Intell. 13 (2) (1981) 111–122.
[6] H.J. Wolfson, I. Rigoutsos, Geometric hashing: an overview, IEEE Comput. Sci. Eng. 1 (1997) 10–21.
[7] T. Leung, M. Burl, P. Perona, Finding faces in cluttered scenes using random labeled graph matching, Proceedings of the Fifth ICCV, Cambridge, 1995, pp. 637–644.
[8] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Machine Intell. 10 (4) (1988) 496–513.
[9] S.X. Liao, M. Pawlak, On image analysis by moments, IEEE Trans. Pattern Anal. Machine Intell. 18 (3) (1996) 254–266.
[10] M. Swain, D. Ballard, Color indexing, Int. J. Comput. Vision 7 (1) (1991) 11–32.
[11] J. Huang, S. Kumar, M. Mitra, W. Zhu, R. Zabih, Image indexing using color correlograms, Proceedings of IEEE CVPR, San Juan, 1997, pp. 762–768.
[12] P. Chang, J. Krumm, Object recognition with color cooccurrence histograms, Proceedings of IEEE CVPR, San Juan, pp. 1063–1069.
[13] M. Gorkani, R. Picard, Texture orientation for sorting photos at a glance, Proceedings of the 12th ICPR, Jerusalem, Vol. 1, 1994, pp. 459–464.
[14] W. Freeman, M. Roth, Orientation histograms for hand gesture recognition, IEEE International Workshop on Automatic Face and Gesture Recognition, 1995.
[15] V. Kovalev, M. Petrou, Y. Bondar, Using orientation tokens for object recognition, Pattern Recognition Lett. 19 (12) (1998) 1125–1132.
[16] F. Ullah, S. Kaneko, S. Igarahi, Object search using orientation histogram intersection, Proceedings of Japan-Korea Joint Workshop on Frontiers of Computer Vision, Seoul, 2000, pp. 110–115.
[17] L. Brown, A survey of image registration techniques, ACM Comput. Surveys 24 (4) (1992) 325–376.
[18] A. Jain, A. Vailaya, Image retrieval using color and shape, Pattern Recognition 29 (8) (1996) 1233–1244.
[19] A. Jain, A. Vailaya, Shape-based retrieval: a case study with trademark image databases, Pattern Recognition 31 (9) (1998) 1369–1399.
[20] F. Ullah, S. Kaneko, S. Igarahi, Orientation code matching for robust object search, IEICE Trans. Inform. Systems E84-D (8) (2001) 999–1006.
About the Author: FARHAN ULLAH received the B.E. in Computer Systems Engineering from N.E.D. University of Engineering and Technology, Karachi, Pakistan, and the M.Sc. in Systems Engineering from Quaid-i-Azam University, Islamabad, in 1990 and 1992, respectively, and the Ph.D. in Systems Engineering from Hokkaido University, Sapporo, Japan, in 2002. He is a Senior Engineer at Informatics Complex, Islamabad, Pakistan. He is interested in computer vision, pattern recognition, and robotic intelligence. He is a member of the IEEE Computer Society and ACM.
About the Author: SHUNICHI KANEKO received the B.S. degree in Precision Engineering and the M.S. degree in Information Engineering from Hokkaido University, Japan, in 1978 and 1980, respectively, and the Ph.D. degree in Systems Engineering from the University of Tokyo, Japan, in 1990. He was a Research Assistant in the Department of Computer Science from 1980 to 1991, an Associate Professor in the Department of Electronic Engineering from 1991 to 1995, and an Associate Professor in the Department of Bio-application and Systems Engineering until 1996, at Tokyo University of Agriculture and Technology, Japan. He has been an Associate Professor in the Department of Control and Information Engineering at Hokkaido University since 1996. He received the Best Paper Award in 1990 and the Society Award in 1998 from the Japan Society of Precision Engineering. His research interests include machine vision, image sensing and understanding, and robust image registration. He is a member of IEICE, JSPE, IEEJ, SICE and the IEEE Computer Society.