
Android: Simple Shape Recognition using OpenCV, JavaCV

Pi19404
April 9, 2013

Contents

Android JavaCV: Simple Shape Recognition using OpenCV, JavaCV
  0.1 Introduction
  0.2 Recognizing Gesture Shape
  0.3 Data Pre Processing
  0.4 Gesture Normalization
    0.4.1 Registering candidate
    0.4.2 Rejecting invalid Gesture
    0.4.3 Re-sampling points
    0.4.4 Scaling
    0.4.5 Translation
    0.4.6 Feature Extraction : Histogram of Oriented Gradients
  0.5 Classification Task
  0.6 Implementation Details
    0.6.1 Implementation details of HOG
    0.6.2 libSVM File
  0.7 Code
  References

Android JavaCV: Simple Shape Recognition using OpenCV, JavaCV


0.1 Introduction
In this article a simple gesture shape recognition technique is explored. The features used to represent the gestures are histograms of oriented gradients (HOG), and an SVM is used as a discriminative classifier to classify the gesture shape.

0.2 Recognizing Gesture Shape


In an earlier article, HOG was used to recognize shapes. The same approach will be used to recognize gestures. Feature extraction and training are performed on the desktop. Training generates an SVM model file. This file is copied to the AndroidGesture directory on the mobile device and is used by the SVM prediction code. The SVM code is available in Java; it is used with slight modifications for the purpose of prediction.

0.3 Data Pre Processing


The data pre-processing steps are the same as those of the $1 gesture recognizer described in the earlier article; only the method used to compute the similarity between the candidate and the template has been changed, from Euclidean path distance to the FastDTW algorithm. The pre-processing steps are described below.


0.4 Gesture Normalization


The template and the candidate may contain different numbers of sampled points, and they may differ in size and spatial location, i.e. the template and candidate points would not line up. Hence the first step is to transform the candidate and template points so that they can be compared. This pre-processing step is called gesture normalization. The aim is to transform the gestures so that they are invariant to translation and rotation.

0.4.1 Registering candidate


The first step is to capture the candidate template. This step is called registering the candidate. As mentioned earlier, the number of points captured depends on the spatial resolution of the device.

The gesture capture process is defined to be in one of three states: start, dragged, released. The start state indicates that a gesture has started and that any previously stored information should be cleared. The dragged state indicates that a unistroke is being performed without lifting the finger and that the 2D co-ordinates of the gesture are being captured. The released state indicates that the finger has been lifted, that the gesture capture process is complete, and that the gesture recognition process should start.

The class AndroidDollar defines the Android routines that capture the touch gesture performed by the user. The class DataCapture provides a high-level interface to capture the data and to initiate gesture recognition. The AndroidDollar class contains an instance of DataCapture, whose methods are called based on the touch events detected. The Java class DataVector captures the 2D co-ordinate information of the drawn gesture. The DataCapture class contains an instance of DataVector.

The class PUtils contains all the methods for gesture pre-processing and recognition. The DataCapture class contains an instance of the PUtils class.
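The three capture states can be illustrated with a small, self-contained sketch. This is not the AndroidDollar or DataCapture code from the repository: the class name CaptureSketch, its methods and the use of plain {x, y} arrays are hypothetical, and storing the first touch position in the start state is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the capture logic; not the article's DataCapture class.
final class CaptureSketch {
    enum State { START, DRAGGED, RELEASED }

    private final List<double[]> points = new ArrayList<>();
    private boolean ready = false;

    // Called once per touch event with the state and the touch position.
    void onEvent(State state, double x, double y) {
        switch (state) {
            case START:
                // A new gesture begins: discard anything left from the previous one.
                points.clear();
                ready = false;
                points.add(new double[]{x, y}); // assumption: the first touch is stored too
                break;
            case DRAGGED:
                // Finger is still down: keep recording 2D co-ordinates.
                points.add(new double[]{x, y});
                break;
            case RELEASED:
                // Finger lifted: capture is complete, recognition may start.
                ready = true;
                break;
        }
    }

    int size() { return points.size(); }

    boolean isReady() { return ready; }

    public static void main(String[] args) {
        CaptureSketch capture = new CaptureSketch();
        capture.onEvent(State.START, 0, 0);
        capture.onEvent(State.DRAGGED, 5, 5);
        capture.onEvent(State.RELEASED, 5, 5);
        System.out.println(capture.size() + " points, ready=" + capture.isReady());
    }
}
```

In an Android application these three calls would typically be driven by the touch-down, touch-move and touch-up events of the view that receives the gesture.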

0.4.2 Rejecting invalid Gesture


A simple check determines whether the gesture was intentional by imposing a path length criterion. If the path length of the gesture is less than a specified threshold, no further processing is performed and a "no gesture recognized" status is displayed. The PathLength method defined in the PUtils class simply computes the sum of the linear distances between all adjacent points of the captured/template gesture.

    public double PathLength(Vector points) {
        double length = 0;
        for (int i = 1; i < points.size(); i++) {
            length += Distance((Point) points.elementAt(i - 1),
                               (Point) points.elementAt(i));
        }
        return length;
    }

In the present implementation the path length threshold used is 100. The path length is computed by the PathLength method of the PUtils class.
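The rejection check can be tried out in isolation with the following minimal sketch. It assumes points are stored as plain {x, y} arrays instead of the article's Vector of Point objects; the class name PathLengthSketch and the method isIntentional are hypothetical, and 100 is the threshold quoted above.

```java
// Hypothetical, self-contained version of the path length check.
final class PathLengthSketch {
    // Threshold quoted in the article; shorter strokes are treated as unintentional.
    static final double MIN_PATH_LENGTH = 100.0;

    // Sum of the straight-line distances between consecutive points.
    static double pathLength(double[][] pts) {
        double length = 0.0;
        for (int i = 1; i < pts.length; i++) {
            length += Math.hypot(pts[i][0] - pts[i - 1][0], pts[i][1] - pts[i - 1][1]);
        }
        return length;
    }

    // A gesture is processed further only if its path is long enough.
    static boolean isIntentional(double[][] pts) {
        return pathLength(pts) >= MIN_PATH_LENGTH;
    }

    public static void main(String[] args) {
        double[][] tap = {{10, 10}, {13, 14}};
        double[][] stroke = {{0, 0}, {60, 80}, {60, 200}};
        System.out.println("tap accepted: " + isIntentional(tap));
        System.out.println("stroke accepted: " + isIntentional(stroke));
    }
}
```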

0.4.3 Re-sampling points


Once the gesture has been captured, some pre-processing operations are performed before the candidate gesture is compared with the template gesture. Re-sampling is one such operation. The re-sampling operation selects a fixed number of points from the provided candidate/template gesture. This ensures that the candidate and the template have the same number of points, enabling a point-based comparison.

The method used for sampling the data points is uniform sampling. The path length is divided by the number of re-sampled points; this is the interval length between the points. We start with the initial point, and the next point is selected such that the distance between the points is greater than or equal to the interval length. Let these points be labeled pt1 and pt2. A linear path is assumed to exist between adjacent sample points. Using the simple trigonometric relationships of sin and cos, we can estimate the location that lies between pt1 and pt2 at a distance of one uniform path interval. This new co-ordinate is inserted into the candidate/template co-ordinate array ahead of pt2, where it serves as the next starting point, and the same process is repeated until the last point of the co-ordinate array is reached.

    double d = Distance(pt1, pt2);
    if ((D + d) >= I) {
        // fraction of the segment pt1 -> pt2 still needed to complete the interval I,
        // given that a distance D has already been accumulated before pt1
        double t = (I - D) / d;
        // new x co-ordinate using the cos relationship: cos(theta) = (pt2.x - pt1.x) / d
        double qx = pt1.x + t * (pt2.x - pt1.x);
        // new y co-ordinate using the sin relationship: sin(theta) = (pt2.y - pt1.y) / d
        double qy = pt1.y + t * (pt2.y - pt1.y);
        Point q = new Point(qx, qy);
        // adding the point to the re-sampled array
        dstPts.addElement(q);
        // inserting the point in the source array so it starts the next segment
        srcPts.insertElementAt(q, i);
        // resetting the cumulative distance
        D = 0.0;
    } else {
        // accumulating the distance covered so far
        D = D + d;
    }

This is implemented by the method Re-sample in the PUtils Class. In the present implementation the number of re-sampled points used is 32.
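For reference, a complete re-sampling loop might look like the sketch below. It is not the PUtils implementation: the class name ResampleSketch and the {x, y} array representation are hypothetical. Two choices are assumptions borrowed from the $1 recognizer rather than from the text above: the interval is the path length divided by n - 1 (so that both end points are kept), and the result is padded with the last point when rounding leaves it one point short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained uniform re-sampling of a stroke.
final class ResampleSketch {
    static double pathLength(List<double[]> pts) {
        double length = 0.0;
        for (int i = 1; i < pts.size(); i++) {
            length += Math.hypot(pts.get(i)[0] - pts.get(i - 1)[0],
                                 pts.get(i)[1] - pts.get(i - 1)[1]);
        }
        return length;
    }

    // Returns n points spaced evenly along the path through the input points.
    static List<double[]> resample(List<double[]> points, int n) {
        List<double[]> src = new ArrayList<>(points);   // working copy, modified below
        double interval = pathLength(src) / (n - 1);    // assumption: n - 1 intervals
        double accumulated = 0.0;                        // distance since the last output point
        List<double[]> dst = new ArrayList<>();
        dst.add(src.get(0));
        for (int i = 1; i < src.size(); i++) {
            double[] p1 = src.get(i - 1);
            double[] p2 = src.get(i);
            double d = Math.hypot(p2[0] - p1[0], p2[1] - p1[1]);
            if (d > 0 && accumulated + d >= interval) {
                // Linear interpolation along p1 -> p2 for the remaining distance.
                double t = (interval - accumulated) / d;
                double[] q = {p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1])};
                dst.add(q);
                src.add(i, q);       // q becomes the start of the next segment
                accumulated = 0.0;
            } else {
                accumulated += d;
            }
        }
        // Rounding can leave the result short; pad with the final input point.
        while (dst.size() < n) {
            dst.add(src.get(src.size() - 1));
        }
        return dst;
    }

    public static void main(String[] args) {
        List<double[]> stroke = new ArrayList<>();
        stroke.add(new double[]{0, 0});
        stroke.add(new double[]{10, 0});
        List<double[]> out = resample(stroke, 5);
        for (double[] p : out) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

With the article's setting, the call would be resample(points, 32).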


0.4.4 Scaling
The next pre-processing step is to scale the co-ordinates so that their width and height remain within fixed bounds. First the bounding width and height of the current set of points are computed, which are simply $\max(x) - \min(x)$ and $\max(y) - \min(y)$. The new width and height are denoted by $W$ and $H$, and the x and y co-ordinates of all the points are scaled by the factors $W / (\max(x) - \min(x))$ and $H / (\max(y) - \min(y))$ respectively.

    Rectangle B = BoundingBox(points);
    Vector newpoints = new Vector(points.size());
    for (int i = 0; i < points.size(); i++) {
        Point p = (Point) points.elementAt(i);
        double qx = p.x * (size / B.Width);
        double qy = p.y * (size1 / B.Height);
        newpoints.addElement(new Point(qx, qy));
    }

This step provides invariance with respect to scaling, since all gestures are bounded to lie within a rectangle of the same size. It is implemented by the method ScaleToSquare1 in the PUtils class. In the present implementation the scaling is done so that the bounding box is a square of dimension 250.

The method above scales the x and y axes independently, so it does not preserve the aspect ratio of the gesture. An alternative is to scale both axes by the same factor, which maintains the aspect ratio. To choose between the two, compute the ratio between the width and height (or vice versa) of the bounding rectangle. If the ratio is closer to 0 than to 1, as for a nearly one-dimensional gesture such as a straight line, scale while maintaining the aspect ratio; otherwise scale each axis independently. The ScaleDimTo method in the PUtils class implements this.
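The choice between the two scaling modes can be sketched as follows. The internals of ScaleToSquare1 and ScaleDimTo are not shown in the article, so this is only an illustration: the class name ScaleSketch, the ratioThreshold parameter and the {x, y} array representation are hypothetical, and the ratio is taken as the shorter side divided by the longer side.

```java
// Hypothetical, self-contained scaling of a stroke to a target size.
final class ScaleSketch {
    // Scales pts so that the bounding box fits a square of side `size`.
    // Nearly one-dimensional strokes keep their aspect ratio.
    static double[][] scale(double[][] pts, double size, double ratioThreshold) {
        double minX = Double.POSITIVE_INFINITY, maxX = Double.NEGATIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        double w = maxX - minX;
        double h = maxY - minY;
        double longer = Math.max(w, h);
        double shorter = Math.min(w, h);

        double sx = 1.0;
        double sy = 1.0;
        if (longer > 0) {
            boolean keepAspect = shorter == 0 || shorter / longer < ratioThreshold;
            if (keepAspect) {
                sx = size / longer;   // same factor on both axes
                sy = sx;
            } else {
                sx = size / w;        // independent factor per axis
                sy = size / h;
            }
        }

        double[][] out = new double[pts.length][2];
        for (int i = 0; i < pts.length; i++) {
            out[i][0] = pts[i][0] * sx;
            out[i][1] = pts[i][1] * sy;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] box = {{0, 0}, {100, 50}};
        double[][] line = {{0, 0}, {100, 10}};
        double[][] a = scale(box, 250, 0.3);
        double[][] b = scale(line, 250, 0.3);
        System.out.println(a[1][0] + " x " + a[1][1]);
        System.out.println(b[1][0] + " x " + b[1][1]);
    }
}
```

Keeping the aspect ratio for thin strokes avoids stretching a straight line into a shape that fills the whole square.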

0.4.5 Translation
The first step is to compute the mean (centroid) of the set of co-ordinate locations.

    double xsum = 0.0, ysum = 0.0;
    Enumeration e = points.elements();
    while (e.hasMoreElements()) {
        Point p = (Point) e.nextElement();
        xsum += p.x;
        ysum += p.y;
    }
    return new Point(xsum / points.size(), ysum / points.size());

This is implemented by the method Centroid in the class PUtils. The next step is to translate all the points so that the centroid lies at the origin of the co-ordinate system, that is, to translate all the points by $(-\mathrm{centroid}_x, -\mathrm{centroid}_y)$.

    Point c = Centroid(points);
    Vector newpoints = new Vector(points.size());
    for (int i = 0; i < points.size(); i++) {
        Point p = (Point) points.elementAt(i);
        double qx = p.x - c.x;
        double qy = p.y - c.y;
        newpoints.addElement(new Point(qx, qy));
    }

This is implemented by the method TranslateToOrigin in the PUtils class. After gesture normalization is complete, the candidate gesture points are plotted on an IplImage data structure available in JavaCV, and the image is passed to the feature extraction method.

0.4.6 Feature Extraction : Histogram of Oriented Gradients


Histogram of oriented gradients (HOG) features are used to represent the image. As the name suggests, it is a histogram of gradients in different orientation directions. The HOG descriptor has become one of the most popular low-level image representations in computer vision. Local shape information is often well described by the distribution of intensity gradients or edge directions, even without precise information about the location of the edges themselves. Shape information is encoded by HOG and spatial information is encoded by sliding windows.

The derivative of the image is computed along the x and y directions. Applying a Cartesian-to-polar transformation gives the magnitude and orientation of the gradient at every point of the image. We consider 9 orientation directions, an orientation resolution of 20°, and compute the histogram of oriented gradients that lie along these predefined orientations. This gives a feature vector of length 9.

The image is subdivided into blocks and HOG is computed over each block. To capture the correlation among neighboring blocks, a simple sliding-window technique is used. To speed up the computation, integral images are used to compute sums of pixels (that is, histogram bin counts) quickly over the windows.

The most basic features are raw pixel values. If we used raw pixels directly we would get a very long feature vector; with the HOG representation the feature vector is relatively small. Below is an example of HOG feature vectors computed for 9 images; a relatively small training set of 20 samples per class is used for training. The training time required is also very small, so training can be performed in real time.

The parameters are:

1. Image size: 160x120
2. Number of orientations: 12
3. Number of blocks: 3x3 = 9
4. Feature descriptor length: 12x9 = 108

The descriptors can be computed for a set of training images; the descriptor of each image, along with its class label, is written to the csv file.
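To make the descriptor layout concrete, here is a simplified sketch of block-wise HOG on a plain 2D array. It is not the HOG3_Fast class: it accumulates the histograms directly instead of using integral images, uses non-overlapping blocks, and the class name HogSketch and its parameters are hypothetical. With 12 orientation bins and 3x3 blocks it produces the 108 values listed above.

```java
// Hypothetical, self-contained block-wise histogram of oriented gradients.
final class HogSketch {
    // gray[y][x] holds pixel intensities; returns blocksX * blocksY * bins values.
    static double[] describe(double[][] gray, int bins, int blocksX, int blocksY) {
        int h = gray.length;
        int w = gray[0].length;
        double[] feat = new double[bins * blocksX * blocksY];

        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                // First-order derivatives with the [-1 0 1] and [1 0 -1]^T filters.
                double gx = gray[y][x + 1] - gray[y][x - 1];
                double gy = gray[y - 1][x] - gray[y + 1][x];
                double mag = Math.hypot(gx, gy);
                if (mag == 0) {
                    continue;
                }
                double theta = Math.atan2(gy, gx);            // in [-pi, pi]
                int bin = (int) ((theta + Math.PI) / (2 * Math.PI) * bins);
                if (bin >= bins) {
                    bin = bins - 1;                            // theta == pi
                }
                int bx = x * blocksX / w;                      // block column
                int by = y * blocksY / h;                      // block row
                feat[(by * blocksX + bx) * bins + bin] += mag; // magnitude-weighted vote
            }
        }

        // L2-normalize the histogram of each block separately.
        for (int b = 0; b < blocksX * blocksY; b++) {
            double norm = 0.0;
            for (int k = 0; k < bins; k++) {
                norm += feat[b * bins + k] * feat[b * bins + k];
            }
            norm = Math.sqrt(norm);
            if (norm > 0) {
                for (int k = 0; k < bins; k++) {
                    feat[b * bins + k] /= norm;
                }
            }
        }
        return feat;
    }

    public static void main(String[] args) {
        // 160x120 test image with a vertical step edge in the middle.
        double[][] img = new double[120][160];
        for (int y = 0; y < 120; y++) {
            for (int x = 80; x < 160; x++) {
                img[y][x] = 255;
            }
        }
        double[] feat = describe(img, 12, 3, 3);
        System.out.println("descriptor length: " + feat.length);
    }
}
```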

0.5 Classification Task


Given the feature set corresponding to an unknown gesture, we are required to classify it into one of the known classes. We will use an SVM as the classification tool. The LibSVM software package is used to train the classifier and to test its accuracy on a small gesture set. We will use the Java classes provided by the LibSVM package.

Figure 1: Normalized gesture plots

The file generated earlier is in a format suitable for the LibSVM package. The file contains the feature label number followed by the feature value.

The first task is to perform feature scaling so that all the features lie in a predefined range. This is required to ensure that the feature values along one dimension do not bias the classifier. Along each dimension the features are scaled to lie in a fixed range. The default range provided by the LibSVM tool is [-1, 1].

The following files are used:

1. train.file - input training data file name
2. test.file - input test data file name
3. t1.range - file containing the feature scaling parameters
4. t1.scale - output data file name after performing feature scaling on the training data file
5. t2.scale - output data file name after performing feature scaling on the test data file
6. t1.model - SVM classifier model file

The command to perform feature scaling on the training data is

    java svm_scale -s t1.range train.file > t1.scale

The command to train the classifier is

    java svm_train t1.scale t1.model

The command for feature scaling the test data is

    java svm_scale -r t1.range test.file > t2.scale

The command to perform prediction on the test data is

    java svm_predict t2.scale t1.model test.out

We obtain perfect classification on this small data set. About 300 samples of each class were used for training. The shapes included in the training stage exhibit high variability.
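The effect of the two scaling commands can be illustrated with a small sketch of per-feature min-max scaling: the range is learned from the training data (the role of t1.range) and then re-applied unchanged to the test data. This is not the svm_scale source; the class name FeatureScaleSketch and its methods are hypothetical, and mapping a constant feature to 0 is an assumption.

```java
import java.util.Arrays;

// Hypothetical, self-contained per-feature min-max scaling.
final class FeatureScaleSketch {
    // Learns {min[], max[]} for every feature column from the training rows.
    static double[][] fitRange(double[][] train) {
        int n = train[0].length;
        double[] min = new double[n];
        double[] max = new double[n];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : train) {
            for (int j = 0; j < n; j++) {
                min[j] = Math.min(min[j], row[j]);
                max[j] = Math.max(max[j], row[j]);
            }
        }
        return new double[][]{min, max};
    }

    // Maps each feature linearly so that [min, max] becomes [lower, upper].
    static double[] apply(double[] x, double[][] range, double lower, double upper) {
        double[] out = new double[x.length];
        for (int j = 0; j < x.length; j++) {
            double min = range[0][j];
            double max = range[1][j];
            if (max == min) {
                out[j] = 0.0; // assumption: constant features carry no information
            } else {
                out[j] = lower + (upper - lower) * (x[j] - min) / (max - min);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{0, 10}, {5, 20}, {10, 30}};
        double[][] range = fitRange(train);
        // The same range is reused for unseen data, as with the -r option.
        double[] scaled = apply(new double[]{5, 10}, range, -1.0, 1.0);
        System.out.println(Arrays.toString(scaled));
    }
}
```

Reusing the training range for the test data matters: scaling the test set with its own minimum and maximum would place the same raw feature value at a different position in [-1, 1].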

0.6 Implementation Details


The Java version of the code for HOG feature extraction and SVM is used, as it provides a simpler interface with Android. A later article will include a native C/C++ version of the code.

0.6.1 Implementation details of HOG


The code used for HOG is based on the paper by Ludwig et al., 2009; the Matlab code provided at Ludwig, 2010 is used with some modifications. The HOG code is generic and can also be used to represent textured objects. The first step of HOG is to compute first-order gradients along the x and y directions. The code below creates the filters for computing the first-order derivatives along the x and y directions.

    hx = cvCreateMat(1, 3, CV_32F);
    FloatPointer hxf = hx.data_fl();
    hxf.put(0, -1);
    hxf.put(1, 0);
    hxf.put(2, 1);
    hy = cvCreateMat(3, 1, CV_32F);
    FloatPointer hyf = hy.data_fl();
    hyf.put(0, 1);
    hyf.put(1, 0);
    hyf.put(2, -1);

The code below creates a floating-point representation of the image and computes the first-order derivatives along the x and y directions. A Cartesian-to-polar transformation is then applied to the derivatives to obtain the gradient magnitude and orientation. cvCartToPolar returns angles between $0$ and $2\pi$, so $\pi$ is subtracted to normalize the orientation to lie between $-\pi$ and $\pi$.

    cvConvertScale(Im, Im1, 1.0, 0.0);
    // computing gradient along x and y directions
    cvFilter2D(Im1, grad_xr, hx, cvPoint(1, 0));
    cvFilter2D(Im1, grad_yu, hy, cvPoint(-1, -1));
    // cartesian to polar transformation
    cvCartToPolar(grad_xr, grad_yu, magnitude, orientation, 0);
    // normalization of orientation
    cvSubS(orientation, cvScalar(pi, pi, pi, 0), orientation, null);

If the image is a color image, the orientation corresponding to the dominant channel (the channel with the largest gradient magnitude) is extracted at each pixel.

    cvSplit(orientation, I1, I2, I3, null);
    cvSplit(magnitude, I4, I5, I6, null);
    FloatBuffer I4b = I4.getFloatBuffer(); // magnitude
    FloatBuffer I5b = I5.getFloatBuffer();
    FloatBuffer I6b = I6.getFloatBuffer();
    FloatBuffer I1b = I1.getFloatBuffer(); // orientation
    FloatBuffer I2b = I2.getFloatBuffer();
    FloatBuffer I3b = I3.getFloatBuffer();
    int i1 = 0;
    while (i1 < I4b.capacity()) {
        float max = I4b.get(i1); // largest magnitude seen so far
        float pt = I1b.get(i1);  // orientation of the dominant channel
        float pt2 = I5b.get(i1);
        float pt3 = I6b.get(i1);
        if (pt2 > max) { max = pt2; pt = I2b.get(i1); }
        if (pt3 > max) { max = pt3; pt = I3b.get(i1); }
        // keep the dominant magnitude and its orientation in channel 1
        I4b.put(i1, max);
        I1b.put(i1, pt);
        i1++;
    }
    cvCopy(I4, magnitude1, null);
    cvCopy(I1, orientation1, null);

The next step is to create an IplImage data structure corresponding to each histogram bin, and an integral image corresponding to each histogram bin.

    IplImage bins[] = new IplImage[B];
    IplImage integrals[] = new IplImage[B];
    for (int i = 0; i < B; i++) {
        bins[i] = cvCreateImage(cvGetSize(magnitude1), IPL_DEPTH_32F, 1);
        cvSetZero(bins[i]);
    }
    for (int i = 0; i < B; i++) {
        // an integral image is one pixel wider and taller than its source
        integrals[i] = cvCreateImage(cvSize(size.width() + 1, size.height() + 1), IPL_DEPTH_64F, 1);
        cvZero(integrals[i]);
    }

Next, the IplImage corresponding to each histogram orientation is populated by segmenting the original magnitude image according to orientation. Then the integral image representation is computed for each of the magnitude images corresponding to the histogram bins.

    // orientation and magnitude values at the current pixel
    temp_gradient = ptr1b.get(index);
    temp_magnitude = ptr2b.get(index);
    for (int i = 0; i < B; i++) {
        // bin i covers orientations up to -pi + (i + 1) * 2 * pi / B
        if (temp_gradient <= -pi + (((i + 1) * 2 * pi) / B)) {
            ptrs[i].put(index, temp_magnitude);
            break; // each pixel contributes to exactly one orientation bin
        }
    }

    // compute integral image for each orientation image
    for (int i = 0; i < B; i++) {
        cvIntegral(bins[i], integrals[i], null, null);
    }

As mentioned earlier, the image is divided into blocks and the HOG feature must be computed for each block separately. The calculateHOG_rect method in the HOG3_Fast class performs this operation using the integral image representation. The inputs to the method are the x, y co-ordinates of the starting point, the width and height of the block, and the integral images.

If A, C, B, D are the corners of the image block in clockwise direction, then the mean value of the magnitudes within this block must be computed for the corresponding orientation image. The method computes the block sum from the integral image $I$ using the formula

$$I_{\text{block}} = I(A) + I(B) - I(C) - I(D) \qquad (1)$$

This computation is performed for each orientation integral image, which gives 9 values for each block of the image. After this computation, L2 normalization is performed on the values. This normalizes the histogram values to lie between 0 and 1.

    for (int i = 0; i < B; i++) {
        IplImage a1 = integrals[i];
        DoubleBuffer da1 = integrals[i].getDoubleBuffer();
        double a = da1.get((cell.y() + 0) * a1.width() + cell.x());
        double b = da1.get((cell.y() + cell.height()) * a1.width() + cell.x() + cell.width());
        double c = da1.get((cell.y() + 0) * a1.width() + cell.x() + cell.width());
        double d = da1.get((cell.y() + cell.height()) * a1.width() + cell.x() + 0);
        double f = (float) ((a + b) - (c + d));
        hog_cell.put(i, f);
    }
    cvNormalize(hog_cell, hog_cell, 1, 0, 4, null);
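The four-corner lookup of equation (1) can be checked independently of JavaCV with the sketch below. It builds an integral image that is one row and one column larger than the source, the same layout cvIntegral produces, and evaluates a block sum from the four corners. The class name IntegralImageSketch and the 2D array representation are hypothetical.

```java
// Hypothetical, self-contained integral image and block sum.
final class IntegralImageSketch {
    // ii[y][x] is the sum of img over all rows < y and all columns < x.
    static double[][] integral(double[][] img) {
        int h = img.length;
        int w = img[0].length;
        double[][] ii = new double[h + 1][w + 1];
        for (int y = 1; y <= h; y++) {
            for (int x = 1; x <= w; x++) {
                ii[y][x] = img[y - 1][x - 1] + ii[y - 1][x] + ii[y][x - 1] - ii[y - 1][x - 1];
            }
        }
        return ii;
    }

    // Sum of the block with top-left corner (x, y), width w and height h.
    static double blockSum(double[][] ii, int x, int y, int w, int h) {
        double a = ii[y][x];         // A: top-left
        double b = ii[y + h][x + w]; // B: bottom-right
        double c = ii[y][x + w];     // C: top-right
        double d = ii[y + h][x];     // D: bottom-left
        return (a + b) - (c + d);
    }

    public static void main(String[] args) {
        double[][] img = {
            {1, 2, 3, 4},
            {5, 6, 7, 8},
            {9, 10, 11, 12}
        };
        double[][] ii = integral(img);
        System.out.println(blockSum(ii, 1, 1, 2, 2)); // 6 + 7 + 10 + 11
    }
}
```

Once the integral image is built, every block sum costs four array reads regardless of the block size, which is why it speeds up the per-block histogram computation.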

After obtaining the N descriptors for each block, they are concatenated; with 9 blocks the final descriptor has length 9N. The SVM prediction routines are then called to predict the class represented by the feature vector. The output of the prediction routine is a class label and a probability. Even if a gesture that does not belong to the defined gesture set is performed, the routine still generates a class label and a probability. Only gestures above a certain probability threshold are considered valid. Different probability thresholds are used for each gesture, based on test data.
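The per-gesture thresholding can be sketched as a small filter applied to the output of the prediction routine. The article does not list the threshold values or show this code, so the class name GestureFilterSketch, the NO_GESTURE marker, the default threshold and the numbers used in main are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained rejection of low-confidence predictions.
final class GestureFilterSketch {
    static final int NO_GESTURE = -1;

    // Returns the label if its probability reaches the threshold for that class,
    // otherwise NO_GESTURE.
    static int accept(int label, double probability,
                      Map<Integer, Double> thresholds, double defaultThreshold) {
        double threshold = thresholds.getOrDefault(label, defaultThreshold);
        return probability >= threshold ? label : NO_GESTURE;
    }

    public static void main(String[] args) {
        Map<Integer, Double> thresholds = new HashMap<>();
        thresholds.put(1, 0.90); // illustrative values only
        thresholds.put(2, 0.75);
        System.out.println(accept(1, 0.80, thresholds, 0.85));
        System.out.println(accept(2, 0.80, thresholds, 0.85));
    }
}
```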


0.6.2 libSVM File

The files for SVM prediction and scaling taken from the libsvm package are svm_model.java, svm_node.java, svm_parameter.java, svm_predict.java, svm.java, svm_p...

feature_scale.java is a file written with reference to the libsvm files; it performs feature scaling before classification. The classifier class defined in classifier.java provides a high-level interface to perform feature extraction, feature scaling and classification. The output of the classification routine is a class label representing the shape. The SVM training and feature scaling files are placed on the sdcard and are used by the SVM routines.

0.7 Code
The code can be found in the code repository https://github.com/pi19404/m19404/tree/master/Android/AndroidOpenCV or https://code.google.com/p/m19404/source/browse/Android/AndroidOpenCV. The SVM training and scaling files are placed in the svm directory of the repository. Copy the files to AndroidShapeClassifier on the mobile sdcard.


Bibliography

[1] Lisa Anthony and Jacob O. Wobbrock. "A lightweight multistroke recognizer for user interface prototypes". In: Proceedings of Graphics Interface 2010. GI '10. Ottawa, Ontario, Canada: Canadian Information Processing Society, 2010, pp. 245-252. isbn: 978-1-56881-712-5. url: http://dl.acm.org/citation.cfm?id=1839214.1839258.

[2] Dynamic Time Warping. At http://web.science.mq.edu.au/~cassidy/comp449/html/ch11s02.html.

[3] Ricardo Gutierrez-Osuna. "Introduction to Speech Processing". In: CSE@TAMU.

[4] O. Ludwig et al. "Trainable classifier-fusion schemes: An application to pedestrian detection". In: Intelligent Transportation Systems, 2009. ITSC '09. 12th International IEEE Conference on. 2009, pp. 1-6. doi: 10.1109/ITSC.2009.5309700.

[5] Oswaldo Ludwig. HOG descriptor for Matlab. 2010. url: http://www.mathworks.in/matlabcentral/fileexchange/28689-hog-descriptor-for-matlab.

[6] Stan Salvador and Philip Chan. "Toward accurate dynamic time warping in linear time and space". In: vol. 11. 5. Amsterdam, The Netherlands: IOS Press, Oct. 2007, pp. 561-580. url: http://dl.acm.org/citation.cfm?id=1367985.1367993.
