Pi19404
April 9, 2013
Contents
Android JavaCV: Simple Shape Recognition using OpenCV, JavaCV
0.1 Introduction
0.2 Recognizing Gesture Shape
0.3 Data Pre Processing
0.4 Gesture Normalization
    0.4.1 Registering candidate
    0.4.2 Rejecting invalid Gesture
    0.4.3 Re-sampling points
    0.4.4 Scaling
    0.4.5 Translation
    0.4.6 Feature Extraction : Histogram of Oriented Gradients
0.5 Classification Task
0.6 Implementation Details
    0.6.1 Implementation details of HOG
    0.6.2 libSVM File
0.7 Code
References
The class PUtils contains all the methods for gesture pre-processing and recognition. The DataCapture class contains an instance of the PUtils class.
    public double PathLength(Vector points) {
        double length = 0;
        // sum the distances between consecutive points of the stroke
        for (int i = 1; i < points.size(); i++) {
            length += Distance((Point) points.elementAt(i - 1), (Point) points.elementAt(i));
        }
        return length;
    }
In the present implementation the path length threshold used is 100. The computation is implemented by the method PathLength in the PUtils class.
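The rejection step can be sketched as follows. This is only an illustrative sketch: the class name GestureFilter, the helper isValidGesture and the representation of a point as a two-element double array are assumptions made for the example, and it assumes that a stroke is rejected when its path length falls below the threshold of 100.

```java
import java.util.List;

/** Sketch: reject strokes whose total path length is below a threshold. */
final class GestureFilter {
    /** Threshold quoted in the text; treated here as a length in pixels (assumption). */
    static final double MIN_PATH_LENGTH = 100.0;

    /** Sum of the Euclidean distances between consecutive points {x, y}. */
    static double pathLength(List<double[]> points) {
        double length = 0.0;
        for (int i = 1; i < points.size(); i++) {
            double dx = points.get(i)[0] - points.get(i - 1)[0];
            double dy = points.get(i)[1] - points.get(i - 1)[1];
            length += Math.hypot(dx, dy);
        }
        return length;
    }

    /** A stroke is kept only if it is at least MIN_PATH_LENGTH long (assumed rule). */
    static boolean isValidGesture(List<double[]> points) {
        return pathLength(points) >= MIN_PATH_LENGTH;
    }
}
```

A short tap or an accidental touch produces only a few nearby points, so its path length stays well below the threshold and it never reaches the later normalization steps.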
point-based comparison.

The method used for sampling the data points is uniform sampling. The path length is divided by the number of re-sampled points; this gives the interval length between the points. We start with the initial point, and the next point is selected such that the accumulated distance between the points is greater than or equal to the interval length. Let these points be labeled pt1 and pt2. A linear path is assumed to exist between adjacent sample points. Using the simple sine and cosine relationships along this segment (equivalently, linear interpolation between pt1 and pt2), we can estimate the location that lies between pt1 and pt2 at exactly one interval length along the path. This new co-ordinate is inserted into the candidate/template co-ordinate array ahead of pt2, where it becomes the starting point of the next segment, and the same process is repeated until the last point of the co-ordinate array is reached.
    double d = Distance(pt1, pt2);
    if ((D + d) >= I) {
        // fraction of the segment pt1 -> pt2 still needed to complete the interval I
        // (I - D is the remaining distance; using I / (D + d) would misplace the point when D > 0)
        double t = (I - D) / d;
        // x co-ordinate of the new point (cos relationship along the segment)
        double qx = pt1.x + t * (pt2.x - pt1.x);
        // y co-ordinate of the new point (sin relationship along the segment)
        double qy = pt1.y + t * (pt2.y - pt1.y);
        Point q = new Point(qx, qy);
        // adding the point to the resampled array
        dstPts.addElement(q);
        // inserting the point into the source array so that it starts the next segment
        srcPts.insertElementAt(q, i);
        // resetting the cumulative distance
        D = 0.0;
    } else {
        // accumulating the distance covered so far
        D = D + d;
    }
This is implemented by the method Re-sample in the PUtils Class. In the present implementation the number of re-sampled points used is 32.
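For context, a complete re-sampling loop might look like the sketch below. It is not the PUtils code: the class name Resampler and the use of double arrays for points are assumptions, and the interval is taken as the path length divided by n - 1 (the convention of the $1 family of recognizers) so that both the first and the last point are kept, whereas the text above divides by the number of points.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: resample a stroke to n points spaced uniformly along its path. */
final class Resampler {
    static double pathLength(List<double[]> pts) {
        double length = 0.0;
        for (int i = 1; i < pts.size(); i++) {
            length += Math.hypot(pts.get(i)[0] - pts.get(i - 1)[0],
                                 pts.get(i)[1] - pts.get(i - 1)[1]);
        }
        return length;
    }

    /** Assumes a non-empty input and n >= 2. */
    static List<double[]> resample(List<double[]> input, int n) {
        List<double[]> src = new ArrayList<double[]>(input); // working copy, modified below
        double interval = pathLength(src) / (n - 1);         // assumed convention, see lead-in
        double accumulated = 0.0;
        List<double[]> dst = new ArrayList<double[]>();
        dst.add(src.get(0));
        for (int i = 1; i < src.size(); i++) {
            double[] p1 = src.get(i - 1);
            double[] p2 = src.get(i);
            double d = Math.hypot(p2[0] - p1[0], p2[1] - p1[1]);
            if (d > 0.0 && accumulated + d >= interval) {
                // linear interpolation: move the remaining distance along p1 -> p2
                double t = (interval - accumulated) / d;
                double[] q = { p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]) };
                dst.add(q);
                src.add(i, q);      // q becomes the start of the next segment
                accumulated = 0.0;
            } else {
                accumulated += d;
            }
        }
        // rounding can leave the result one point short; pad with the last input point
        if (dst.size() == n - 1) {
            dst.add(src.get(src.size() - 1));
        }
        return dst;
    }
}
```

With n = 32, as used in the present implementation, every gesture is reduced to the same number of points regardless of how fast it was drawn.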
0.4.4 Scaling
The next pre-processing step is to scale the co-ordinates so that their width and height remain within fixed bounds. First, the bounding width and height of the current set of points are computed, which are simply $\max(x) - \min(x)$ and $\max(y) - \min(y)$. The new width and height are denoted by $W$ and $H$, and all the points are scaled by the factors $W / (\max(x) - \min(x))$ along x and $H / (\max(y) - \min(y))$ along y.
    Rectangle B = BoundingBox(points);
    Vector newpoints = new Vector(points.size());
    for (int i = 0; i < points.size(); i++) {
        Point p = (Point) points.elementAt(i);
        // scale x and y independently to the target width and height
        double qx = p.x * (size / B.Width);
        double qy = p.y * (size1 / B.Height);
        newpoints.addElement(new Point(qx, qy));
    }
This step provides invariance with respect to scale, since all gestures are bounded to lie within a rectangle of the same size. It is implemented by the method ScaleToSquare1 in the PUtils class. In the present implementation the scaling is done so that the bounding box is a square of dimension 250.

The above method scales the width and the height independently, so the aspect ratio of the gesture is not preserved. An alternative is to scale both axes by the same factor, which maintains the aspect ratio. To choose between the two, compute the ratio between the width and the height of the bounding rectangle, or vice versa, whichever is smaller. If the ratio is closer to 0 than to 1, the gesture is nearly one-dimensional and the aspect-ratio-preserving scaling is used; otherwise the axes are scaled independently. The ScaleDimTo method implements this in the PUtils class.
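The choice between the two scaling modes can be sketched as below. This is not the ScaleDimTo method itself: the class name Scaler, the array-based point representation and the cut-off of 0.5 (one reading of "closer to 0 than to 1") are assumptions made for the illustration.

```java
/** Sketch: scale a stroke to a target size, keeping the aspect ratio for thin gestures. */
final class Scaler {
    /** Assumed cut-off: ratios below 0.5 are closer to 0 than to 1. */
    static final double ONE_D_THRESHOLD = 0.5;

    static double[][] scale(double[][] pts, double size) {
        double minX = Double.POSITIVE_INFINITY, maxX = Double.NEGATIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        double w = maxX - minX;
        double h = maxY - minY;
        double longer = Math.max(w, h);
        double shorter = Math.min(w, h);

        double[][] out = new double[pts.length][2];
        if (longer == 0.0) {
            // all points coincide: nothing to scale
            for (int i = 0; i < pts.length; i++) {
                out[i][0] = pts[i][0];
                out[i][1] = pts[i][1];
            }
            return out;
        }
        double sx;
        double sy;
        if (shorter / longer < ONE_D_THRESHOLD) {
            // nearly one-dimensional gesture: same factor on both axes
            sx = size / longer;
            sy = size / longer;
        } else {
            // otherwise stretch width and height independently to the target size
            sx = size / w;
            sy = size / h;
        }
        for (int i = 0; i < pts.length; i++) {
            out[i][0] = pts[i][0] * sx;
            out[i][1] = pts[i][1] * sy;
        }
        return out;
    }
}
```

Keeping the aspect ratio for thin gestures avoids blowing up the small jitter of a nearly straight line into a full-height shape.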
0.4.5 Translation
The first step is to compute the mean (centroid) of the set of co-ordinate locations.
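    // running sums of the x and y co-ordinates
    double xsum = 0.0, ysum = 0.0;
    Enumeration e = points.elements();
    while (e.hasMoreElements()) {
        Point p = (Point) e.nextElement();
        xsum += p.x;
        ysum += p.y;
    }
    // the centroid is (xsum / points.size(), ysum / points.size())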
This is implemented by the method Centroid in the PUtils class. The next step is to translate all the points so that the centroid lies at the origin of the co-ordinate system, i.e. all the points are translated by $(-\mathrm{centroid}_x, -\mathrm{centroid}_y)$.
    Point c = Centroid(points);
    Vector newpoints = new Vector(points.size());
    for (int i = 0; i < points.size(); i++) {
        Point p = (Point) points.elementAt(i);
        // subtract the centroid so that the gesture is centred on the origin
        double qx = p.x - c.x;
        double qy = p.y - c.y;
        newpoints.addElement(new Point(qx, qy));
    }
This is implemented in the method TranslateToOrigin in the PUtils class. After the completion of gesture normalization, the candidate gesture points are plotted on an IplImage data structure available in JavaCV, and the image is passed to the feature extraction method.
The derivative of the image is computed along the x and y directions. Applying a Cartesian-to-polar transformation gives the magnitude and orientation of the gradient at every point of the image. We consider 9 orientation directions, i.e. an orientation resolution of 20°. We compute the histogram of oriented gradients that lie along these predefined orientations, which gives a feature vector of length 9.

The image is subdivided into blocks and the HOG is computed over each block. To capture the correlation among neighbouring blocks, a simple sliding-window technique is used. To speed up the computation, integral images are used to compute the sum of pixels, i.e. the histogram bin count, quickly over the windows.

The most basic features are raw pixel values. If we used raw pixels directly we would get a very long feature vector; the HOG representation gives a relatively small one. Below is an example of a HOG feature vector computed for 9 images; a relatively small training set of 20 samples per class is used for training. The training time required is also very small, so training can be performed in real time. The parameters are:
1. Image size: 160x120
2. Number of orientations: 12
3. Number of blocks: 3x3 = 9
4. Feature descriptor length: 12x9 = 108

The descriptors can be computed for a set of training images; the descriptor of each image, together with its class label, is written to the CSV file.
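The per-block histogram can be illustrated without OpenCV on a plain grey-level array. The sketch below is an assumption-laden illustration, not the HOG3_Fast code: the class name HogSketch is hypothetical, gradients use the [-1 0 1] kernels directly, orientations are binned over the full range from -pi to pi, and the histogram is L2-normalized.

```java
/** Sketch: magnitude-weighted orientation histogram of one image block. */
final class HogSketch {
    /**
     * img is indexed as img[row][column]; the block starts at (x0, y0) and is w x h pixels.
     * Border pixels of the image are skipped because the 3-tap kernels need both neighbours.
     */
    static double[] blockHistogram(float[][] img, int x0, int y0, int w, int h, int bins) {
        double[] hist = new double[bins];
        int rows = img.length;
        int cols = img[0].length;
        for (int y = Math.max(y0, 1); y < Math.min(y0 + h, rows - 1); y++) {
            for (int x = Math.max(x0, 1); x < Math.min(x0 + w, cols - 1); x++) {
                double gx = img[y][x + 1] - img[y][x - 1];   // [-1 0 1] along x
                double gy = img[y - 1][x] - img[y + 1][x];   // [1 0 -1] transposed, along y
                double magnitude = Math.hypot(gx, gy);
                if (magnitude == 0.0) {
                    continue;                                // flat region, no orientation
                }
                double theta = Math.atan2(gy, gx);           // orientation in [-pi, pi]
                int bin = (int) ((theta + Math.PI) / (2 * Math.PI) * bins);
                if (bin >= bins) {
                    bin = bins - 1;                          // theta == +pi falls in the last bin
                }
                hist[bin] += magnitude;
            }
        }
        // L2 normalization so that the values lie between 0 and 1
        double norm = 0.0;
        for (double v : hist) {
            norm += v * v;
        }
        norm = Math.sqrt(norm);
        if (norm > 0.0) {
            for (int i = 0; i < bins; i++) {
                hist[i] /= norm;
            }
        }
        return hist;
    }
}
```

Calling blockHistogram once per block of a 3x3 grid and concatenating the results gives a descriptor of length 9 times the number of bins, which matches the 12x9 = 108 figure quoted above when 12 bins are used.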
classifier and test accuracy for a small gesture set. We will use the Java classes provided by the LibSVM software package. The file generated earlier is in a format suitable for the LibSVM package: it contains, for each feature, the feature number followed by the feature value.

The first task is to perform feature scaling so that all the features lie in a predefined range. This is required to ensure that the feature values along one dimension do not bias the classifier. Along each dimension the features are scaled to lie in a fixed range. The default range provided by the LibSVM tool is [-1, 1].
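The scaling itself is a per-dimension linear map. The following sketch shows the idea in plain Java; it is not the svm_scale source, and the class name FeatureScaler, the dense array layout and the choice to map constant features to 0 are assumptions made for the example.

```java
import java.util.Arrays;

/** Sketch: min-max scaling of each feature dimension to [lower, upper]. */
final class FeatureScaler {
    private final double[] min;
    private final double[] max;
    private final double lower;
    private final double upper;

    /** Learns the per-dimension minimum and maximum from the training data. */
    FeatureScaler(double[][] train, double lower, double upper) {
        int n = train[0].length;
        this.min = new double[n];
        this.max = new double[n];
        this.lower = lower;
        this.upper = upper;
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : train) {
            for (int j = 0; j < n; j++) {
                min[j] = Math.min(min[j], row[j]);
                max[j] = Math.max(max[j], row[j]);
            }
        }
    }

    /** Applies the learned ranges; test data must reuse the training ranges. */
    double[] scale(double[] x) {
        double[] out = new double[x.length];
        for (int j = 0; j < x.length; j++) {
            if (max[j] == min[j]) {
                out[j] = 0.0;   // constant feature: carries no information (choice of this sketch)
                continue;
            }
            out[j] = lower + (upper - lower) * (x[j] - min[j]) / (max[j] - min[j]);
        }
        return out;
    }
}
```

The important design point is that the ranges are learned once from the training data and then reused unchanged for the test data, which is why the scaling parameters are saved to a separate range file.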
The following files are used:
1. train.file - input training data file name
2. test.file - input test data file name
3. t1.range - file containing the feature scaling parameters
4. t1.scale - output data file name after performing feature scaling on the training data file
5. t2.scale - output data file name after performing feature scaling on the test data file
6. t1.model - SVM classifier model file
The command to perform feature scaling on the training data is

    java svm_scale -s t1.range train.file > t1.scale

The command to train the classifier is

    java svm_train t1.scale t1.model

The command for feature scaling of the test data is

    java svm_scale -r t1.range test.file > t2.scale

The command to perform prediction on the test data is

    java svm_predict t2.scale t1.model test.out

We obtain perfect classification on this small data set. About 300 samples of each class were used for training. The shapes included in the training stage exhibit high variability.
    // 1x3 derivative kernel [-1 0 1] along x
    hx = cvCreateMat(1, 3, CV_32F);
    FloatPointer hxf = hx.data_fl();
    hxf.put(0, -1);
    hxf.put(1, 0);
    hxf.put(2, 1);
    // 3x1 derivative kernel [1 0 -1] along y
    hy = cvCreateMat(3, 1, CV_32F);
    FloatPointer hyf = hy.data_fl();
    hyf.put(0, 1);
    hyf.put(1, 0);
    hyf.put(2, -1);
The code below creates a floating-point representation of the image and computes the first-order derivatives along the x and y directions. A Cartesian-to-polar transformation is then applied to the derivatives to obtain the gradient magnitude and orientation. The orientation is normalized to lie between $-\pi$ and $\pi$.
cvConvertScale ( Im , Im1 ,1.0 ,0.0) ; /* * computing gradient along x and y directions * */ cvFilter2D ( Im1 , grad_xr , hx , cvPoint (1 ,0) ) ; cvFilter2D ( Im1 , grad_yu , hy , cvPoint ( -1 , -1) ) ; /* * cartesian to polar transformation * */ cvCartToPolar ( grad_xr , grad_yu , magnitude , orientation ,0) ; /* * normalization of orientation * */ cvSubS ( orientation , cvScalar ( pi , pi , pi ,0) , orientation , null ) ;
If the image is a color image, then at each pixel the orientation of the dominant channel, i.e. the channel with the largest gradient magnitude, is extracted.
    cvSplit(orientation, I1, I2, I3, null);
    cvSplit(magnitude, I4, I5, I6, null);
    FloatBuffer I4b = I4.getFloatBuffer(); // magnitude channels
    FloatBuffer I5b = I5.getFloatBuffer();
    FloatBuffer I6b = I6.getFloatBuffer();
    FloatBuffer I1b = I1.getFloatBuffer(); // orientation channels
    FloatBuffer I2b = I2.getFloatBuffer();
    FloatBuffer I3b = I3.getFloatBuffer();
    // i1 is the pixel index, assumed to be initialised to 0 before the loop
    while (i1 < I4b.capacity()) {
        float max = I4b.get(i1);  // largest magnitude seen so far
        float pt = I1b.get(i1);   // orientation of the channel with that magnitude
        float pt2 = I5b.get(i1);
        float pt3 = I6b.get(i1);
        if (pt2 > max) {
            max = pt2;
            pt = I2b.get(i1);
        }
        if (pt3 > max) {
            max = pt3;
            pt = I3b.get(i1);
        }
        // store the dominant magnitude and its orientation in the first channel
        I4b.put(i1, max);
        I1b.put(i1, pt);
        i1++;
    }
    cvCopy(I4, magnitude1, null);
    cvCopy(I1, orientation1, null);
The next step is to create an IplImage data structure for each histogram bin, and an integral image for each histogram bin.
    IplImage bins[] = new IplImage[B];
    IplImage integrals[] = new IplImage[B];
    // one single-channel float image per histogram bin
    for (int i = 0; i < B; i++) {
        bins[i] = cvCreateImage(cvGetSize(magnitude1), IPL_DEPTH_32F, 1);
        cvSetZero(bins[i]);
    }
    // integral images are one pixel wider and taller than the source image
    for (int i = 0; i < B; i++) {
        integrals[i] = cvCreateImage(cvSize(size.width() + 1, size.height() + 1), IPL_DEPTH_64F, 1);
        cvZero(integrals[i]);
    }
Next, the IplImage corresponding to each histogram orientation is populated by segmenting the original magnitude image according to orientation. Then the integral image representation is computed for the magnitude image of each histogram bin.
    // gradient orientation and magnitude values at the current pixel
    temp_gradient = ptr1b.get(index);
    temp_magnitude = ptr2b.get(index);
    for (int i = 0; i < B; i++) {
        // bin i covers orientations up to -pi + (i + 1) * 2 * pi / B
        if (temp_gradient <= -pi + (((i + 1) * 2 * pi) / B)) {
            ptrs[i].put(index, temp_magnitude);
            // stop at the first matching bin so that each pixel contributes to one bin only
            break;
        }
    }
    /* compute the integral image for each orientation image */
    for (int i = 0; i < B; i++) {
        cvIntegral(bins[i], integrals[i], null, null);
    }
As mentioned earlier, the image is divided into blocks and the HOG feature has to be computed for each block separately. The calculateHOG_rect method in the HOG3_Fast class performs this operation using the integral image representation. The inputs to the method are the x, y co-ordinates of the starting point, the width and height of the block, and the integral images. If A, C, B, D are the corners of the image block in clockwise direction, starting from the top-left corner, then the sum of the magnitudes within this block has to be computed for each orientation image. The method performs the computation based on the formula

$$I = I(A) + I(B) - I(C) - I(D) \tag{1}$$

This computation is performed for each orientation integral image, which gives 9 values for each block of the image. L2 normalization is then performed on these values, so that the histogram values lie between 0 and 1.
    for (int i = 0; i < B; i++) {
        IplImage a1 = integrals[i];
        DoubleBuffer da1 = integrals[i].getDoubleBuffer();
        // integral image values at the four corners of the block
        double a = da1.get((cell.y() + 0) * a1.width() + cell.x());                            // A: top-left
        double b = da1.get((cell.y() + cell.height()) * a1.width() + cell.x() + cell.width()); // B: bottom-right
        double c = da1.get((cell.y() + 0) * a1.width() + cell.x() + cell.width());             // C: top-right
        double d = da1.get((cell.y() + cell.height()) * a1.width() + cell.x() + 0);            // D: bottom-left
        double f = (float) ((a + b) - (c + d));
        hog_cell.put(i, f);
    }
    // L2 normalization of the block histogram (norm type 4 is CV_L2)
    cvNormalize(hog_cell, hog_cell, 1, 0, 4, null);
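The corner formula can be checked on a small array without OpenCV. The sketch below is an illustration only: the class name IntegralImage is hypothetical, and it builds the same (height + 1) x (width + 1) layout with a zero first row and column that the integral images above use.

```java
/** Sketch: integral image with a constant-time box sum. */
final class IntegralImage {
    /** sum[y][x] holds the sum of all pixels above and to the left of (x, y), exclusive. */
    final double[][] sum;

    IntegralImage(float[][] img) {
        int h = img.length;
        int w = img[0].length;
        sum = new double[h + 1][w + 1];   // first row and column stay zero
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                sum[y + 1][x + 1] = img[y][x] + sum[y][x + 1] + sum[y + 1][x] - sum[y][x];
            }
        }
    }

    /** Sum of the w x h block whose top-left pixel is (x, y): I(A) + I(B) - I(C) - I(D). */
    double boxSum(int x, int y, int w, int h) {
        double a = sum[y][x];           // A: top-left
        double b = sum[y + h][x + w];   // B: bottom-right
        double c = sum[y][x + w];       // C: top-right
        double d = sum[y + h][x];       // D: bottom-left
        return (a + b) - (c + d);
    }
}
```

Once the table is built, every block sum costs four look-ups regardless of the block size, which is what makes computing one histogram per block and per orientation cheap.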
After obtaining the N descriptors for each block, they are concatenated; with 9 blocks the final descriptor has length 9N. The SVM prediction routines are then called to predict the class represented by the feature vector. The output of the prediction routine is a class label and a probability. Even if a gesture that does not belong to the defined gesture set is performed, the classifier will still generate a class label and a probability. Therefore only gestures whose probability exceeds a certain threshold are considered valid. A different probability threshold is used for each gesture, chosen on the basis of test data.
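The rejection rule can be sketched as a small lookup on top of the classifier output. The class name GestureDecision, the REJECTED marker and the threshold values used in the usage note are hypothetical; the real thresholds are chosen from test data as described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: accept a predicted label only if its probability clears a per-class threshold. */
final class GestureDecision {
    /** Marker returned when the prediction is not trusted (hypothetical convention). */
    static final int REJECTED = -1;

    private final Map<Integer, Double> thresholds = new HashMap<Integer, Double>();
    private final double defaultThreshold;

    GestureDecision(double defaultThreshold) {
        this.defaultThreshold = defaultThreshold;
    }

    /** Overrides the threshold for one class label. */
    void setThreshold(int label, double threshold) {
        thresholds.put(label, threshold);
    }

    /** Returns the label if its probability is high enough, otherwise REJECTED. */
    int decide(int label, double probability) {
        double t = thresholds.containsKey(label) ? thresholds.get(label) : defaultThreshold;
        return probability >= t ? label : REJECTED;
    }
}
```

For example, with a default threshold of 0.5 and a stricter value of 0.9 for class 2, a prediction of class 2 with probability 0.8 is rejected while class 1 with probability 0.6 is accepted.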
The files for SVM prediction and scaling taken from the libsvm package are svm_model.java, svm_node.java, svm_parameter.java, svm_predict.java, svm.java, svm_p. The file feature_scale.java, written with reference to the libsvm files, performs feature scaling before classification. The classifier class defined in classifier.java provides a high-level interface to perform feature extraction, feature scaling and classification. The output of the classification routine is a class label representing the shape. The SVM training and feature scaling files are placed on the sdcard and are used by the SVM routines.
0.7 Code
The code can be found in the code repository https://github.com/pi19404/m19404/tree/master/Android/AndroidOpenCV or https://code.google.com/p/m19404/source/browse/Android/AndroidOpenCV. The SVM training and scaling files are placed in the svm directory of the repository. Copy the files to the AndroidShapeClassifier directory on the mobile sdcard.