

Interactive E-Learning System Using Pattern Recognition and Augmented Reality
Sang Hwa Lee, Junyeong Choi, and Jong-Il Park, Member, IEEE
Abstract—This paper proposes an interactive e-learning system using pattern recognition and augmented reality. The goal of the proposed system is to provide students with realistic audio-visual contents while they are learning. The proposed e-learning system consists of image recognition, color-band and polka-dot pattern recognition, and an augmented reality engine with audio-visual contents. When the web camera on a PC captures the current page of the textbook, the e-learning system first identifies the images on the page and augments some audio-visual contents on the monitor. For interactive learning, the proposed e-learning system exploits color-band or polka-dot markers which are worn on the end of a finger. The color-band and polka-dot markers act like a mouse cursor to indicate a position on the textbook image. Appropriate interactive audio-visual contents are augmented when the marker is located on predefined image objects in the textbook. The proposed e-learning system was applied to educational courses in an elementary school, and we obtained satisfactory results in real applications. We expect that the proposed e-learning system will become popular as sufficient educational contents and scenarios are provided.

Index Terms—E-learning system, interactive learning, augmented reality, pattern recognition

I. INTRODUCTION

Recently, new media and systems for education are appearing in the form of portable dictionaries, e-books, distant/virtual classrooms, and so on. The main concept of these new educational systems is to combine educational contents with information technologies [5], [6], [7], [8]. The students study their textbooks with auxiliary audio-visual contents which are played on personal computers or specific terminals. In the distant classroom, remote education is performed over communication networks, and the students can experience remote or imaginary places in virtual classrooms [7], [8]. The virtual places are rendered by high definition projectors, and the students may act as if they were really in the remote places. These educational systems usually exploit various information technologies, such as sensor networks, computer graphics, view synthesis, geometry analysis, and communication systems [10], [11], [13], [14], [15], [16], [17].

Sang Hwa Lee is with the Department of Electrical Engineering and Computer Science, INMC, Seoul National University, Kwanak-gu, Seoul, 151-742, South Korea (e-mail: lsh@ipl.snu.ac.kr).
Junyeong Choi and Jong-Il Park are with the Department of Electronics and Computer Engineering, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul, 133-791, Korea (e-mail: hooeh@mr.hanyang.ac.kr, jipark@hanyang.ac.kr). Jong-Il Park is the corresponding author.

Contributed Paper
Manuscript received April 15, 2009
In this paper, we propose a new interactive e-learning system. The proposed system exploits pattern recognition techniques for object-based interactive learning. Our goal is to design a mentoring system for self-study, which lets the students learn from the audio-visual contents interactively. The proposed e-learning system augments the audio-visual contents as the students interact with the objects in the textbook. Given educational materials such as a textbook, auxiliary audio-visual contents, 3-dimensional (3-D) graphics, and educational scenarios, our interactive e-learning system determines how to interact with and augment the contents based on pattern recognition. When the images and objects on the text pages are recognized, the related contents are played or augmented on the display. The contents are also displayed according to the pattern marker, which acts as a kind of computer mouse. We implement the recognition algorithms for images and objects using texture-based features. We design color-band and polka-dot patterns for object-based user interaction. And we define some human-computer interfaces using the recognition results according to the educational scenarios. Thus, the proposed e-learning system combines education with various information technologies. It should be noted that the proposed e-learning system is intended for ordinary public educational courses. We tested the proposed e-learning system with real elementary education courses, and obtained successful results as a mentoring system.
The rest of this paper is organized as follows. We briefly
introduce how the proposed e-learning system works in
Section II. We describe the polka-dot pattern and color-band
markers for interaction in Section III. We explain the
recognition algorithm of images or objects in Section IV. We
report the results of applying the proposed e-learning system
to the elementary educational courses in Section V. Finally,
we conclude the paper in Section VI.
II. OVERVIEW OF PROPOSED E-LEARNING SYSTEM
The proposed e-learning system consists of image/object
recognition, polka-dot pattern recognition, color-band marker
recognition, augmented reality engine, audio-visual contents,
and some learning scenarios for textbooks. The learning scenarios are predefined processes that specify when and where to augment the contents. The scenarios combine the educational contents with information technologies to maximize the learning efficiency, and the augmented reality engine realizes the scenarios. Fig. 1 shows the structure of the proposed e-learning system. A web camera connected to the computer


focuses on the textbook. The students study while watching the textbook and the captured video frames on which some audio-visual contents are augmented. When video frames from the web camera are given, the recognition modules identify the images and objects on the textbook, and the polka-dot or color-band marker. We have a database of the images and objects in the textbook in advance. The image/object recognition module identifies the current text page and the objects that the student is studying. Using the identified pages and objects from the recognition module, the system knows where the objects are located in the video frame. Then, some audio-visual contents are augmented on the computer monitor according to the predefined educational scenarios. The augmented reality engine matches the scenarios to the information from the recognition modules, and plays the audio-visual contents automatically.
Some interactive learning actions are possible with the polka-dot or color-band marker. The marker is a kind of computer mouse, and indicates a location in the video frame. If the marker is located on specific objects or menu bars, object-based interactions are performed based on the educational scenarios and contents. The related visual contents are displayed on the marker even while the marker is moving. Some interactive actions, such as dragging a virtual object, scrubbing-based reaction, and menu selection, are also defined in the proposed e-learning system.
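To make the scenario mechanism concrete, the following Python sketch illustrates one way such a scenario table could be wired to the recognition results. All names here (Scenario, dispatch, the region layout) are hypothetical; the paper does not publish its engine code.

```python
# Minimal sketch of scenario dispatch; names and structure are assumptions,
# not the paper's implementation.
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Scenario:
    page_id: int                       # recognized textbook page
    region: Tuple[int, int, int, int]  # object bounding box (x, y, w, h)
    action: Callable[[], None]         # e.g., play audio, augment 3-D model

def dispatch(scenarios: Dict[int, List[Scenario]],
             page_id: int, marker_xy: Tuple[int, int]) -> None:
    """Trigger the scenario whose object region contains the marker."""
    x, y = marker_xy
    for s in scenarios.get(page_id, []):
        rx, ry, rw, rh = s.region
        if rx <= x < rx + rw and ry <= y < ry + rh:
            s.action()  # augment the predefined audio-visual content
```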

Fig. 1. Structure of the proposed e-learning system. The image and marker recognition enables students to learn interactively according to the predefined learning scenarios and audio-visual contents.

To make the proposed e-learning system useful, we have produced many educational contents and scenarios for real school courses. In addition, we have developed an authoring tool to produce the educational scenarios and interactions easily, since the proposed system is designed for general purposes. Thus, any contents provider or educational organization can exploit the proposed e-learning system for its interactive learning courses. Since this paper focuses on the recognition algorithms and the augmented reality engine rather than on contents production, we have developed this e-learning system in tight collaboration with contents providers and teachers.
III. DESIGN OF INTERACTIVE MARKERS
For engaging interactive learning, we need a natural human-computer interface. We design two kinds of markers using polka-dot patterns and color bands. The markers are worn on the fingers as bands, and act like a computer mouse. The markers indicate their locations in the video frame, which enables the students to interact with the objects in the textbook. When a marker is located at a specific object or menu in the textbook, the corresponding audio-visual contents are augmented on the computer, or the predefined menu function is performed. And some interactive functions such as dragging and scrubbing objects are defined to support various learning actions.
A. Polka-dot Pattern Recognition
Polka-dot patterns are rare in ordinary textbooks, and are well recognized in both grayscale and color images. The polka-dot band on a finger is used as a computer mouse. We exploit the polka-dot marker for interactive augmentation of contents and for menu selection. To detect the polka-dot pattern exactly in real time, we propose fast filters with integer operations, hierarchical searching, and edge information.

Fig. 2 shows two array patterns of polka-dot markers. The array patterns were empirically selected using the polka-dot recognition algorithm. Since the marker on a finger is subject to rotation and slant with respect to the camera viewpoint, the array pattern of dots should be invariant to perspective variations. According to the proposed recognition algorithm, the optimal array pattern was selected as shown in Fig. 2. The hexagonal array is the pattern most invariant to the perspective distortions of camera viewpoints.


Fig. 2. Polka-dot patterns for interactive learning. The best arrays of dots were empirically selected. The hexagonal array patterns are generally robust to perspective distortions caused by different camera viewpoints.


The basic algorithm of polka-dot pattern recognition is a high pass filter in the horizontal and vertical directions [1]. The high pass filter first finds areas where the grayscale pixel values vary regularly in a black-and-white pattern, as below,

$$f_h(x, y) = \sum_{(i,j) \in W} \left| I(i,j) - I(i+D,\, j) \right|, \qquad
f_v(x, y) = \sum_{(i,j) \in W} \left| I(i,j) - I(i,\, j+D) \right|, \qquad (1)$$


where $f_h(x, y)$ and $f_v(x, y)$ are the high pass filter responses in the horizontal and vertical directions at image coordinate $(x, y)$. A grayscale value of the image $I(x, y)$ contributes only if it satisfies the following condition,

$$I(i, j) \le B \quad \text{or} \quad W \le I(i, j), \qquad (2)$$

where the thresholds B and W correspond to black (dark) and white (bright) values, respectively. Since the polka-dot pattern contains only black and white values, we use only the pixels that satisfy condition (2) when applying the high pass filters. This reduces both the processing time of the high pass filters and the false positive errors in complex textures. In (1), the parameter D is related to the diameter and interval of the dots. The high pass filters are calculated on a regular grid, which reflects the periodic array of dots in the polka-dot patterns. Finally, we examine the values of the high pass filters and the number of pixels involved in (1) within the window W. We select a candidate position of the polka-dot marker based on the high pass filter responses and on the count indicating how many dot-like patterns exist in the window.
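As a rough illustration of (1) and (2), the following NumPy sketch scores windows on a coarse grid. All parameter values (D, B, W, window size, step) are illustrative guesses, not the paper's settings, and the real system uses integer operations for speed.

```python
import numpy as np

def polka_dot_candidate(gray, D=4, B=60, W_thr=200, win=24, step=8):
    """Score windows by the high pass filters of (1), using only
    near-black/near-white pixels as required by condition (2).
    All parameter values here are illustrative, not the paper's."""
    g = gray.astype(np.int32)
    mask = (g <= B) | (g >= W_thr)             # condition (2)
    fh = np.abs(g[:, :-D] - g[:, D:])          # horizontal differences
    fv = np.abs(g[:-D, :] - g[D:, :])          # vertical differences
    best_score, best_xy = -1, None
    H, Wd = g.shape
    for y in range(0, H - win, step):          # coarse-grid search
        for x in range(0, Wd - win, step):
            m = mask[y:y + win, x:x + win]
            if m.sum() < win * win // 4:       # too few dot-like pixels
                continue
            score = fh[y:y + win, x:x + win - D][m[:, :win - D]].sum() + \
                    fv[y:y + win - D, x:x + win][m[:win - D, :]].sum()
            if score > best_score:
                best_score = score
                best_xy = (x + win // 2, y + win // 2)
    return best_xy, best_score
```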
When we detect a candidate position of the polka-dot marker, we refine the exact center position of the marker hierarchically. The candidate position is first searched on a coarse grid in the original image. When a position satisfies the marker conditions of (1) and (2), we search for marker positions near that position on a fine grid. Multiple positions may qualify as the polka-dot marker, so we average the multiple positions found on the fine grid. For fast marker detection, we restrict the search range based on the motion vector of the previously detected polka-dot marker. The motion vector of the polka-dot marker enables us to predict its next location. We predict the next position of the marker, and first try to detect the marker in the restricted search range. If the polka-dot marker is not detected in the restricted search range, we expand the search range and look for the marker again.
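A minimal sketch of the motion-predicted search range, assuming a detector like the one above that returns (position, score) for a sub-image; the window radius and fallback policy are illustrative choices.

```python
def track_polka_dot(frame, prev_xy, motion, detect, radius=48):
    """Search a window around the motion-predicted position first;
    if the marker is not found there, fall back to the full frame.
    radius and the fallback policy are illustrative, not the paper's."""
    h, w = frame.shape[:2]
    px = int(prev_xy[0] + motion[0])           # predicted next position
    py = int(prev_xy[1] + motion[1])
    x0, y0 = max(0, px - radius), max(0, py - radius)
    x1, y1 = min(w, px + radius), min(h, py + radius)
    xy, score = detect(frame[y0:y1, x0:x1])    # restricted search range
    if xy is not None:
        return (x0 + xy[0], y0 + xy[1]), score
    return detect(frame)                       # expanded search range
```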
Finally, we verify the detected marker using edge information. Since the high pass filters can mistake complex textures or characters in the textbook for polka-dot markers, we exploit edge information to reduce the false positive errors. Because characters and other complex textures usually have line-edge properties, unlike the polka-dot patterns, the false positive errors are decreased by the edge information.

Fig. 3 shows some experimental results of polka-dot pattern recognition. One or two independent polka-dot markers are detected in the video frame. In a typical personal computer environment, the recognition is performed at more than 25 frames per second for 640x480 resolution.

Fig. 3. Recognition results of polka-dot patterns. Multiple polka-dot markers are independently detected in the video frame. The light green squares mark the central positions of the polka-dot markers. (a) Single marker detection, (b) two markers detected.

B. Color-band Recognition
Some interactions in the educational scenarios require two or more markers simultaneously to manipulate multiple objects. Since the polka-dot patterns are hardly distinguishable from one another, it is difficult to operate multiple polka-dot markers independently. We therefore need multiple markers that can be individually discriminated. This paper designs two color-band markers, each consisting of three colors as shown in Fig. 4. The color-band markers are discriminated from each other and from the polka-dot marker; thus, we can use three markers simultaneously according to the educational scenarios and interactions.

Fig. 4. Color-band markers using three colors. The combinations of three colors were selected from various experiments. Each color-band marker is discriminated from the other in a video frame.


The colors of the markers were selected from various experiments. Blue is usually the best recognized color and the most stable under lighting variation [18]. The blue band is located at the center of each color-band marker, and is searched for first. The other colors were chosen because they are well discriminated from each other and from blue. We design two color-band markers with different combinations of colors, as shown in Fig. 4.

The color-band markers are detected by finding the blue color first. The hue components in HSV color space are used for robust detection in various lighting conditions. When blue pixels are detected, we examine the shape and area of the blue region to check whether it satisfies the marker conditions. Then, we search for the other colors (green and red, or yellow and purple) around the blue region. We consider the color range and the area of each color region to confirm the color-band pattern. The order of the colors and the ratios of the color areas are compared with predefined criteria. Fig. 5 shows that two color-band markers are independently detected in a video frame. The color ranges of the color-band markers are optimized for the lighting environment. Note that the color ranges should be changed with respect to the lighting conditions. Thus, we devise a method to adjust the color ranges of the markers automatically when the proposed e-learning system is set up.
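As a hedged sketch of the blue-first search in HSV space, assuming OpenCV (4.x); the hue thresholds and area limits are illustrative and would be replaced by the automatically calibrated ranges mentioned above.

```python
import cv2
import numpy as np

def find_color_band(bgr, blue_lo=(100, 80, 80), blue_hi=(130, 255, 255)):
    """Detect the blue center band in HSV first, as described above.
    Thresholds and area limits are illustrative, to be calibrated
    to the lighting at setup time."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(blue_lo), np.array(blue_hi))
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    for c in sorted(cnts, key=cv2.contourArea, reverse=True):
        if not 50 < cv2.contourArea(c) < 5000:  # plausible band size
            continue
        x, y, w, h = cv2.boundingRect(c)
        # Next step (omitted): check the neighboring regions for the
        # other two colors, their order, and their area ratios.
        return (x + w // 2, y + h // 2)
    return None
```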

Fig. 5. Recognition results of color-band markers. Two markers are consistently detected while they are moving.

IV. RECOGNITION OF IMAGE AND OBJECT

Image recognition is designed to identify the current text page or objects. When the text page or objects are identified, the related audio-visual contents are automatically played on the PC. Since we can obtain the pose information of the objects in the captured image, we augment the visual 3-D contents according to the poses of the objects.

In previous related work, augmented reality (AR) toolkits have used geometric markers to be recognized in the images [19]. The AR markers consist of black and white geometric shapes in a square. The AR markers are well recognized under various image distortions, and they have been popular for interaction with virtual systems. However, since the AR markers are printed directly on the textbook pages, they do not fit the text design well. Our goal is to replace the AR geometric markers with image objects and to design a natural interface using the image objects.

A. Feature Extraction
Since the images are subject to rotation, perspective distortion, and scale changes, we have to extract robust features invariant to these image variations. Recently, scale-invariant features have been widely researched, and several feature extraction algorithms have been developed for image and object recognition [2], [3], [4]. We exploit the robust features called speeded up robust features (SURF) [3], which show good recognition results and fast operation compared with SIFT [2]. Since the proposed e-learning system is also applied to mobile devices such as PDAs and mobile phones, we implement the SURF algorithm with integer programming and optimized lookup tables.
The first step of feature extraction is to detect distinct points which are invariant to image variations. The distinct feature points are determined by the Hessian matrix at image point $\mathbf{x} = (x, y)$ and scale parameter $\sigma$,

$$H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}. \qquad (3)$$

In (3), $L_{xx}(\mathbf{x}, \sigma)$ is the second derivative of the Gaussian-filtered image in the x-direction,

$$L_{xx}(\mathbf{x}, \sigma) = \frac{\partial^2}{\partial x^2} \big( I(\mathbf{x}) * G(\sigma) \big), \qquad (4)$$

where $I(\mathbf{x}) * G(\sigma)$ denotes the convolution of the image with a Gaussian filter of standard deviation $\sigma$. The Gaussian filter blurs the image as the scale parameter increases. We construct a pyramid structure of Gaussian-filtered images, which accounts for the variation of image scales and resolutions. The sizes of the Gaussian filters and the scale parameter are increased to make higher scale (lower resolution) images, and the images are sub-sampled as the scale increases. Finally, the distinct points are detected by the determinant of the Hessian matrix in (3),

$$\det(H(\mathbf{x}, \sigma)) = L_{xx} L_{yy} - L_{xy}^2. \qquad (5)$$

In the scale space structure, the point at $\mathbf{x}$ and $\sigma$ is a distinct feature point if its determinant value is the largest among its 26 neighbors. Fig. 6 shows the 26 neighbors used to decide the distinct feature points. The neighboring scale images are considered when determining the distinct feature points.
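The detection step can be sketched directly from (3)-(5). The following Python sketch uses SciPy's Gaussian derivatives instead of the box-filter approximation that real SURF (and the paper's integer implementation) employs for speed; the scales and threshold are illustrative values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_determinant(gray, sigma):
    """det(H) of (5) from the Gaussian second derivatives of (3)-(4).
    A direct (non-box-filter) sketch, slower than real SURF."""
    g = gray.astype(np.float64)
    Lxx = gaussian_filter(g, sigma, order=(0, 2))  # d^2/dx^2
    Lyy = gaussian_filter(g, sigma, order=(2, 0))  # d^2/dy^2
    Lxy = gaussian_filter(g, sigma, order=(1, 1))
    return Lxx * Lyy - Lxy ** 2                    # equation (5)

def detect_keypoints(gray, sigmas=(1.2, 1.6, 2.0, 2.4), threshold=100.0):
    """Keep points whose response is the maximum among the 26 neighbors
    in the 3x3x3 scale-space neighborhood (illustrative threshold)."""
    stack = np.stack([hessian_determinant(gray, s) for s in sigmas])
    pts = []
    for k in range(1, len(sigmas) - 1):
        cube = stack[k - 1:k + 2]
        for y in range(1, gray.shape[0] - 1):
            for x in range(1, gray.shape[1] - 1):
                v = stack[k, y, x]
                if v > threshold and v == cube[:, y-1:y+2, x-1:x+2].max():
                    pts.append((x, y, sigmas[k]))
    return pts
```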


Fig. 6. The 26 neighbors used to decide the feature points. The red pixel is the current point at $\mathbf{x}$ and $\sigma$. The neighboring scale images are considered when deciding the feature points.

The second step of feature extraction is to find a dominant orientation around each feature point. The orientation information normalizes rotated images and objects, so the images or objects are recognized in spite of rotational distortions. The responses of Haar filters in the x and y directions are calculated for a circular neighboring region. The orientation angle of each pixel is calculated from the x and y responses of the Haar filters, and each orientation angle is weighted inversely by the distance from the central feature point. All the orientation angles in the circular region are accumulated into a 6-bin histogram. We define the dominant orientation of the feature point as the most frequent angle. Fig. 7 shows how the orientations around the feature point are found. The grayscales in the left circle represent the Gaussian weights according to the distance from the central feature point. The arrows represent the magnitudes of the x and y responses of the Haar filters.
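A minimal NumPy sketch of the 6-bin orientation histogram described above, assuming dx and dy hold the Haar-filter responses in the circular neighborhood and weights holds the distance-based weights; standard SURF uses a sliding orientation window instead, so this follows the paper's histogram description rather than the original SURF.

```python
import numpy as np

def dominant_orientation(dx, dy, weights, bins=6):
    """Accumulate response angles into a 6-bin weighted histogram and
    return the center of the most frequent bin as the dominant angle."""
    angles = np.arctan2(dy, dx)                      # per-pixel angles
    hist, edges = np.histogram(angles, bins=bins,
                               range=(-np.pi, np.pi), weights=weights)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])           # bin-center angle
```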

Fig. 7. Orientation assignment. The grayscale values in the left circle represent the Gaussian weights for the histogram of orientations. The dominant orientation angle is decided from the 6-bin histogram.

The last step of feature extraction is to describe the feature points as vectors. This descriptor discriminates the feature points. A square region around the feature point is selected for the descriptor. Note that the square is rotated by the dominant orientation before computing the descriptor, and the size of the square is related to the scale parameter. The square region is divided into 16 subregions, and 25 pixels (5x5) are sampled in each subregion. We then calculate a 4-D vector for every subregion,

$$V = \Big( \sum d_x, \; \sum d_y, \; \sum |d_x|, \; \sum |d_y| \Big), \qquad (6)$$

where $d_x$ is the difference between adjacent pixel samples in the x direction, and $|d_x|$ is the absolute value of $d_x$. Based on the 4-D vectors of the 16 subregions, the descriptor of a feature point becomes a 64-D (4x16) vector. This 64-D descriptor vector serves as an ID number of each feature point.
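The descriptor of (6) can be sketched as follows. np.gradient stands in for the Haar-wavelet responses of real SURF, and the 20x20 patch size is the conventional SURF choice, assumed here rather than stated in the paper.

```python
import numpy as np

def surf_like_descriptor(patch):
    """Sketch of the 64-D descriptor of (6): a 20x20 patch (already
    rotated to the dominant orientation and scale-normalized) is split
    into 4x4 subregions of 5x5 samples; each subregion contributes
    (sum dx, sum dy, sum |dx|, sum |dy|)."""
    assert patch.shape == (20, 20)
    dy, dx = np.gradient(patch.astype(np.float64))
    desc = []
    for by in range(4):
        for bx in range(4):
            sub_dx = dx[5*by:5*by + 5, 5*bx:5*bx + 5]
            sub_dy = dy[5*by:5*by + 5, 5*bx:5*bx + 5]
            desc += [sub_dx.sum(), sub_dy.sum(),
                     np.abs(sub_dx).sum(), np.abs(sub_dy).sum()]
    v = np.array(desc)                       # 4 x 16 = 64 dimensions
    return v / (np.linalg.norm(v) + 1e-12)   # unit length for matching
```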
B. Feature Matching
Corresponding features are found via the vector distance between descriptors. When features have been extracted for image and object recognition, all pairs of features are examined by their vector distances. Then, the nearest and second nearest features ($f_1$ and $f_2$) are selected. The nearest feature $f_1$ is matched to the feature $f$ if the following criterion is satisfied,

$$\| f - f_1 \| \le \lambda \, \| f - f_2 \|, \qquad (7)$$

where $\lambda$ $(0 < \lambda < 1)$ adjusts how distinctively the features are matched. As $\lambda$ approaches zero, $f$ is matched only to a clearly nearest $f_1$. For robust and unique matches, we set $\lambda$ to less than 0.5 in the real system.
system. And we exploit the sign of Laplacian operation for


fast feature matching. The sign of Laplacian operation is
derived by trace of Hessian matrix. We first search for features
by examining the sign of trace of Hessian matrix, and then
calculate the distance between feature descriptors.
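A compact sketch of the matching rule (7) with the Laplacian-sign pre-filter, in plain NumPy; brute-force search is used for clarity, whereas a real system would index the descriptors.

```python
import numpy as np

def match_features(desc_a, desc_b, sign_a, sign_b, lam=0.5):
    """Nearest/second-nearest matching with the ratio criterion (7).
    desc_a, desc_b: (N, 64) descriptor arrays; sign_a, sign_b: (N,)
    arrays holding the sign of the Laplacian (trace of the Hessian)."""
    matches = []
    for i, (f, s) in enumerate(zip(desc_a, sign_a)):
        cand = np.where(sign_b == s)[0]       # same Laplacian sign only
        if len(cand) < 2:
            continue
        d = np.linalg.norm(desc_b[cand] - f, axis=1)
        o = np.argsort(d)
        if d[o[0]] <= lam * d[o[1]]:          # criterion (7)
            matches.append((i, int(cand[o[0]])))
    return matches
```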

Fig. 8. Feature matching results. (a) The nearest feature matching using
(7), (b) Feature matching by homography. The features on the same
object surface are correctly matched by homography.


Fig. 8 shows feature matching results. Fig. 8 (a) is the result of using (7) only, and it contains some mismatches. Therefore, we introduce homography and RANSAC [12] optimization to reduce the errors. Only the features that obey the same geometric relation are matched by the homography, which is optimally estimated in the RANSAC process. Fig. 8 (b) shows that the mismatched features are eliminated by the homography relation. It should also be noted that the geometric markers at the top-left position in Fig. 8 are not matched by the features; the proposed recognition method is not confused by the AR toolkit markers. As can be seen in Fig. 8, the geometric markers do not fit the page design well, so we replace the AR markers with the proposed feature matching.
C. Image and Object Recognition
When we have all pairs of matched features, we can recognize the images or objects. The simplest method is to count the number of matched features: in general, the image pair that has the largest number of matched features is considered the same. However, as shown in Fig. 8 (a), there are some matching errors if similar image patches or repetitive patterns exist in the images. As described before, we use the homography to reduce matching errors. Since the homography reflects the geometric relations of the features, it removes mismatched features that satisfy the matching criterion (7) without geometric correlation.
The homography H is a 3x3 perspective matrix that transforms a 2-dimensional (2-D) plane point $(x, y)$ into $(x', y')$ [10], [11],

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad (8)$$

where the 2-D points are represented in homogeneous coordinates. Since the homography H has 8 degrees of freedom, we need at least 4 matched features to compute it. From the coordinates of corresponding feature points, we obtain the following system of equations,

$$\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\ 0 & 0 & 0 & x & y & 1 & -x y' & -y y' \\ \vdots & & & & & & & \vdots \end{bmatrix} \begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix} = \begin{bmatrix} x' \\ y' \\ \vdots \end{bmatrix}.$$

We can find an optimal homography by the least squares method and RANSAC [12]. Matched features are randomly selected, and a homography is estimated from them by least squares. This process is performed iteratively until the estimated homography is optimal.
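Since the homography estimation follows the standard RANSAC pipeline, it can be sketched with OpenCV's built-in routine; the reprojection threshold here is an illustrative value.

```python
import cv2
import numpy as np

def geometric_verify(pts_db, pts_cam, ransac_thresh=3.0):
    """Estimate the homography of (8) with RANSAC [12] and keep only
    geometrically consistent matches; the inlier count can then decide
    whether the database image is recognized."""
    src = np.float32(pts_db).reshape(-1, 1, 2)
    dst = np.float32(pts_cam).reshape(-1, 1, 2)
    if len(src) < 4:                  # 8 DOF -> at least 4 point pairs
        return None, 0
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)
    n_inliers = int(inliers.sum()) if inliers is not None else 0
    return H, n_inliers
```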
Fig. 9 shows two examples of image recognition. To reflect real situations, we partially occlude the images or objects with a hand. As can be seen in Fig. 9, the images are well recognized under various image distortions, such as perspective distortion, luminance difference, scale difference, and occlusion. In Fig. 9, the left images are the database images, and the right images were captured by the web camera. The images are well identified regardless of the AR markers, as shown in Fig. 8 (a). Fig. 10 shows another application of image recognition: we tested the image recognition module in an image retrieval system. In Fig. 10 (a) and (c), the left images are query images captured by the web camera, and the right images show the database. The right images in Fig. 10 (b) and (d) show the recognition results from the database. For clarity, the features matched for recognition are shown as yellow dots.



Fig. 9. Image recognition results. (a) and (c) Original images recognized in the database regardless of the AR markers, (b) and (d) Images captured by the web camera.

Note that not all images in the textbook can be recognized by the proposed method; some images do not contain sufficient features. Thus, we have to evaluate how well the images are recognized before constructing the database. Images that have sufficient feature points are selected for the image recognition database. The selection of database images is also related to the production of the educational scenarios and contents.

Fig. 10. Image retrieval results. The features matched for image
recognition are indicated in yellow dots in (b) and (d).



V. APPLICATION TO PUBLIC EDUCATION SYSTEM


The proposed e-learning system was applied to English and science courses in a public elementary school. The educational contents providers designed the learning scenarios and audio-visual contents, and we adapted the educational scenarios to the proposed e-learning system. Some interfaces, such as object/menu selection and object movement, were defined for the educational scenarios. The polka-dot and color-band markers act like a computer mouse. The interface of the markers is so natural that the students do not have to learn any kinds of poses in advance. And we constructed the database of images and objects from the textbooks for recognition.

Fig. 11 first shows that a moving graphic is augmented on the color-band marker. The left image (Fig. 11 (a)) is the image captured by the web camera, and the right image (Fig. 11 (b)) shows the augmented reality with graphic contents. The page ID is recognized from the image and objects on the text page. Then, the related audio-visual contents are augmented according to the scenarios and the student's interaction. The graphics are displayed above the marker so that the interactive augmented reality is performed naturally. The augmented graphic objects also move as the marker moves.

Fig. 11. Example of augmented reality using the marker. (a) A video frame captured by the web camera, (b) A visual content is augmented on the marker; since the content is rendered over the marker, the marker itself is not seen on the monitor.

Fig. 12 shows the commercial system and an exemplary image of interactive augmented reality. The proposed interactive e-learning system using augmented reality was applied to a public elementary school in English and science courses. The interactive augmented reality made the students take more interest in learning. Therefore, the proposed e-learning system not only provides audio-visual contents, but also improves the learning efficiency and concentration of the students. The application to the public elementary school was satisfactory, and it is expected that the proposed e-learning system will be very useful in various educational courses. If authoring tools are provided to develop the educational contents and scenarios easily, we expect that the proposed interactive e-learning system using augmented reality will rapidly become popular in the education system and industry. Further research will focus on reducing recognition errors in various camera environments.

Fig. 12. The proposed e-learning system applied to a public elementary school. (a) Example of interactive augmented reality using image and marker recognition.

VI. CONCLUSIONS

This paper has proposed an interactive e-learning system using recognition algorithms and augmented reality. The proposed e-learning system provides students with realistic audio-visual contents according to the recognition results. When the images in the textbook are identified by the image recognition, the audio contents are played, and the visual contents such as graphic animations or movies are augmented on the images captured by the web camera. For real-time interactive learning, the polka-dot and color-band markers are designed to indicate objects in the textbook just like a mouse cursor. The proposed e-learning system has been applied to a public elementary school successfully. It is expected that the proposed e-learning system will become popular more quickly when recognition errors are reduced and authoring tools are provided to produce the educational contents and scenarios.


ACKNOWLEDGMENT
This work was supported by ETRI (Electronics and Telecommunications Research Institute) under the Development of Elementary Technology for Promoting Digital Textbook and U-Learning project.
REFERENCES
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, 2002.
[2] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, pp. 91-110, Nov. 2004.
[3] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Proc. European Conf. on Computer Vision, 2006.
[4] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, Oct. 2005.
[5] R. Dondera, C. Jia, V. Popescu, C. Nita-Rotaru, M. Dark, and C. S. York, "Virtual classroom extension for effective distance education," IEEE Computer Graphics and Applications, pp. 64-74, Jan./Feb. 2008.
[6] S. G. Deshpande and J.-N. Hwang, "A real-time interactive virtual classroom multimedia distance learning system," IEEE Trans. Multimedia, vol. 3, no. 4, pp. 432-444, Dec. 2001.
[7] Y. Shi, W. Xie, and G. Xu, "Smart remote classroom: Creating a revolutionary real-time interactive distance learning," Proc. Int'l Conf. Web-Based Learning, LNCS 2436, 2002.
[8] M. J. Lavooy, "Computer mediated communications: Online instruction and interactivity," Journal of Interactive Learning Research, vol. 14, no. 2, pp. 157-165, June 2003.
[9] Y. Ohta and H. Tamura, Mixed Reality - Merging Real and Virtual Worlds, Springer-Verlag, 1999.
[10] D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2003.
[11] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2001.
[12] M. Brown and D. G. Lowe, "Recognizing panoramas," Proc. IEEE Int'l Conf. on Computer Vision, 2003.
[13] O. Bimber and R. Raskar, Spatial Augmented Reality, A K Peters, 2005.
[14] R. Azuma, "A survey of augmented reality," Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, pp. 355-385, 1997.
[15] R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, and B. MacIntyre, "Recent advances in augmented reality," IEEE Computer Graphics and Applications, vol. 21, no. 6, pp. 34-47, 2001.
[16] M. Kanbara and N. Yokoya, "Real-time estimation of light source environment for photorealistic augmented reality," Proc. Int'l Conf. on Pattern Recognition (ICPR'04), vol. 2, pp. 911-914, Aug. 2004.
[17] W. R. Sherman and A. B. Craig, Understanding Virtual Reality, Morgan Kaufmann, 2003.
[18] H.-C. Lee, Introduction to Color Imaging Science, Cambridge University Press, 2005.
[19] ARToolKit Homepage, http://www.hitl.washington.edu/artoolkit/.
Sang Hwa Lee received the B.S., M.S., and Ph.D. degrees in electrical engineering and computer science from Seoul National University, Seoul, Korea, in 1994, 1996, and 2000, respectively. He has been a research professor with BK21 Information Technology, Department of Electrical Engineering and Computer Science, Seoul National University, since 2005. His research interests include image and video processing, stereoscopic systems, pattern recognition, MRF modeling, and image-based rendering.

Junyeong Choi received the B.S. and M.S. degrees in electrical and computer engineering from Hanyang University, Seoul, Korea, in 2007 and 2009, respectively. He is now a Ph.D. candidate at Hanyang University. His research interests include augmented reality, human-computer interaction, and affective computing.

Jong-Il Park (M'87) received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1987, 1989, and 1995, respectively. From 1996 to 1999, he was with the ATR Media Integration and Communication Research Laboratories, Japan. He joined the Department of Electrical and Computer Engineering, Hanyang University, Seoul, Korea, in 1999, where he is currently a Professor. His research interests include computational imaging, augmented reality, 3-D computer vision, and HCI.
