
Autonomous Classification of Intra- and Interspecific Bee Species Using Acoustic Signals in Real Time

David Ireland
School of Information Technology and Electrical Engineering
University of Queensland
Brisbane, Australia 4072
Abstract: This paper pertains to the development of a real-time classification system for the discrimination of intraspecific and interspecific bee species using the K-nearest neighbor and probabilistic neural network classification algorithms. The intended applications for this system are autonomous surveillance of invasive bee species and monitoring tools for entomologists. The system was developed on a low-cost platform and showed at least 80% classification accuracy on two intraspecific bee colonies and 100% accuracy in the classification of four distinct bee species.
I. INTRODUCTION
With the rapid decline of insect pollinators there is an increasing demand for tools that provide autonomous tracking of the movements and activities of pollinating insects. This paper focuses on the initial development of a system for the detection and classification of bees: an essential insect in the production of the global food supply. In a cost-effective and portable platform, our system aims to:
1) Provide a tool for entomologists to study the behavior traits of foraging bees, for example, to determine which bee species favor pollinating particular agricultural crops.
2) Provide an autonomous surveillance system for invasive bee species that present a potential hazard to current ecosystems. Australia, for example, considers the Asian bee (Apis cerana) and the bumble bee (Bombus terrestris) invasive.
Given the enormous diversity of insects, autonomous detection is a widely researched field. Insect classification methods can usually be placed into two broad categories: acoustic and imaging methods. A perusal of the literature shows acoustic methods are mainly used for field measurements while imaging approaches are conducted in a laboratory environment, usually post mortem. Examples using acoustic methods include detection systems for insects in grain silos [1], [2], classification of mosquitoes in [3], [4] and aphids in [5]. Identifying crickets based on their sounds can be found in [6] and [7]. Examples of insect classification using imaging systems can be found in [8] for the identification of aquatic insects and [9] for the identification of aphids.
The method proposed in this paper relies solely on acoustic signals emitted by the insects during flight. The novelty of this paper is the discrimination of bee insects using acoustic signals, which is absent from the literature. Moreover, the emphasis on a cost-effective field detection/classification system that operates in real time is a major feature of this paper.
II. ACOUSTIC INSECT DETECTION
Insect classification by acoustic signals emitted during flight is not a new technique. The method relies on the phenomenon that the acoustics emitted by an insect in flight have a fundamental frequency approximately equal to the wing-beat frequency of the insect [11]. Further spectrum analysis also reveals a harmonic series in which the dominant frequencies are often not the fundamental frequency [11]. Figures 2 and 3 give the spectrograms of two distinct bee species, Apis mellifera and Amegilla cingulata. Both waveforms have similar fundamental frequencies (approximately 220 Hz); however, in the latter example the fundamental frequency is not the dominant frequency, and more power lies in the higher harmonics compared to the Apis mellifera species.
A statistical analysis done by [10] has shown that the wing-beat frequency (and thus the produced fundamental frequency) is inversely proportional to the wing area of the insect. Given the extensive variation in insect anatomy, the wing-beat frequency and the associated harmonic series, a feature set can be extracted electronically. This work was inspired by Moore et al. [3], [4], [5], who pioneered insect discrimination using harmonic sets.
Figure 1 provides a flowchart of our proposed field system. The system records continuously; after some duration, the fundamental frequency f_o is determined. If f_o is determined to lie in a region of interest, a feature vector is constructed from the audio sample and subsequently classified and logged. Sections III and IV discuss the feature vector extraction and the classification algorithms considered in this instance.
III. FEATURE VECTOR GENERATION
Given the harmonic nature of the signal emitted by insects in flight, the cepstrum method was used in determining the fundamental frequency. This method involves first finding the cepstrum power using:

C(q) = \left| \mathcal{F}\left\{ \log\left( |\mathcal{F}\{y(n)\}|^2 \right) \right\} \right|^2 \quad (1)
Fig. 1. An overview flowchart of the proposed detection/classification system: record an audio sample and compute f_o; if f_o \in [f_o^{min}, f_o^{max}], extract the feature vector, classify it and log the event; otherwise discard the audio sample.
Fig. 2. Spectrogram of an acoustic signal emitted by a European honey bee Apis mellifera during flight.
where y(n) is the sampled waveform, \mathcal{F}(\cdot) denotes the Fourier transform, and q is the unit of the cepstrum power, referred to as the quefrency. Subsequently f_o is found by:

f_o = \frac{f_s}{\operatorname{argmax}_q C(q)} \quad (2)
where f_s is the sampling frequency. In order to scale f_o for the classification algorithm, we propose the following normalisation function:

f_o' = \frac{f_o - f_o^{min}}{f_o^{max} - f_o^{min}} \quad (3)

where f_o^{min} and f_o^{max} are the minimum and maximum possible values of f_o.
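The cepstrum pipeline of Eqs. (1)-(3) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's C++ implementation; the function name and the default band limits are illustrative, and a production system would likely add windowing and averaging:

```python
import numpy as np

def estimate_fo(y, fs, fo_min=150.0, fo_max=300.0):
    """Estimate the fundamental frequency f_o of a waveform via the
    cepstrum (Eqs. 1-3). fo_min/fo_max bound the admissible range."""
    # Eq. (1): cepstrum power C(q) = |F{ log |F{y(n)}|^2 }|^2
    log_spec = np.log(np.abs(np.fft.fft(y)) ** 2 + 1e-12)
    c = np.abs(np.fft.fft(log_spec)) ** 2
    # Eq. (2): quefrency index q maps to frequency fs / q, so restrict
    # the peak search to q in [fs / fo_max, fs / fo_min]
    q_lo = int(np.ceil(fs / fo_max))
    q_hi = int(np.floor(fs / fo_min))
    q_star = q_lo + int(np.argmax(c[q_lo:q_hi + 1]))
    fo = fs / q_star
    # Eq. (3): scale f_o to [0, 1] for the classifier
    fo_norm = (fo - fo_min) / (fo_max - fo_min)
    return fo, fo_norm
```

For a harmonic-rich test tone at 220 Hz sampled at 44.1 kHz, the cepstrum peak lands near quefrency fs/220 and the estimate falls within a few hertz of the true value.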
The next step in creating the feature vector is to compute the relative power at multiples of the estimated f_o. This is first achieved by summing the power spectrum density G_y(f) over the harmonics of interest. In this instance we considered a ±5% band around each harmonic region. Using a sampling frequency f_s of 44.1 kHz and a fast Fourier transform length of f_s, we have for each multiple n, after some simplifications:
Fig. 3. Spectrogram of an acoustic signal emitted by an Australian blue-banded bee Amegilla cingulata during flight.
h_n = \sum_{f = \lfloor 19 n f_o / 20 \rfloor}^{\lceil 21 n f_o / 20 \rceil} G_y(f), \quad n = 1, \ldots, N_h \quad (4)

where N_h is the number of multiples considered; this is an arbitrary value and depends on the bee species to be detected. The functions \lfloor\cdot\rfloor and \lceil\cdot\rceil denote the floor and ceiling functions respectively. The h_n values are further normalised using:

h_n' = \frac{h_n}{\max\{h_1, h_2, \ldots, h_{N_h}\}} \quad (5)
Finally the feature vector is defined as:

x = \left[ f_o', h_1', h_2', \ldots, h_{N_h}' \right] \quad (6)

For future reference we denote the length of x as N_x, where N_x = N_h + 1.
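Eqs. (4)-(6) can be sketched as follows. The sketch assumes, as in the paper, that the FFT length equals the sampling rate so that `psd[f]` is the power spectral density at integer frequency f Hz; the function name and default band limits are illustrative:

```python
import numpy as np

def harmonic_features(psd, fo, n_harmonics, fo_min=150.0, fo_max=300.0):
    """Build the feature vector x of Eq. (6) from a one-sided PSD."""
    h = np.empty(n_harmonics)
    for n in range(1, n_harmonics + 1):
        # Eq. (4): sum the PSD over a +/-5% band around the n-th harmonic
        lo = int(np.floor(19 * n * fo / 20))
        hi = int(np.ceil(21 * n * fo / 20))
        h[n - 1] = psd[lo:hi + 1].sum()
    h = h / h.max()                                # Eq. (5): normalise
    fo_norm = (fo - fo_min) / (fo_max - fo_min)    # Eq. (3)
    return np.concatenate(([fo_norm], h))          # Eq. (6): length N_h + 1
```

The returned vector has length N_h + 1, matching N_x above.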
IV. CLASSIFICATION ALGORITHMS
For convenience we define the classification of the feature vector x as the function:

T(x) \in \{1, 2, \ldots, N_{class}\} \quad (7)

where N_{class} is the number of classes (or bee species) considered.
A. K-Nearest Neighbor Method
The K-nearest neighbor method (kNN) is a widely used classification method. Given an unknown sample, the kNN method finds the K nearest objects (training data), typically using the Euclidean distance as the metric. Subsequently, the sample is classified based on a majority vote of the K objects. For example, if:

x_1, x_2, x_3, \ldots, x_K \quad (8)

denote the K nearest feature vectors to the unknown feature vector, determined by some distance metric, then the newly assigned class is determined by:

k = \mathcal{M}\left( T(x_1), T(x_2), \ldots, T(x_K) \right) \quad (9)

where \mathcal{M}(\cdot) computes the mode of the set of classes. The Euclidean distance metric was used in this paper for all uses of the kNN algorithm.
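Eqs. (8)-(9) amount to a few lines of code. The following is a minimal sketch (names are illustrative, not the paper's implementation), using the same Euclidean metric:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=5):
    """kNN vote of Eqs. (8)-(9): find the k nearest training vectors
    under the Euclidean metric and return the mode of their labels."""
    d = np.linalg.norm(train_X - x, axis=1)   # distance to every training vector
    nearest = np.argsort(d)[:k]               # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]         # Eq. (9): the mode M(...)
```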
B. Probabilistic Neural Network
Probabilistic neural networks (pNNs) are a practical means of implementing Bayesian classification techniques. If an object is to be classified into one of two classes denoted i and j, then class i is chosen according to Bayes' optimal decision rule:

h_i c_i f_i(x) > h_j c_j f_j(x) \quad (10)

where c_i denotes the loss associated with misclassifying x, h_i is the prior probability of occurrence of the ith class, and f_i(x) is the probability density function (PDF) for the ith class. In practice f_i(x) is usually not known and must be estimated using Parzen's method. This involves taking an average sum of a suitably chosen kernel for each observation in the training data [12].
The Gaussian function is a common choice for the kernel as it is well behaved and easily computed [12]. After some simplification the estimated PDF for a particular class k with N_k training observations becomes:

f_k(x) = \frac{1}{N_k} \sum_{i=1}^{N_k} \exp\left( -\frac{\| x - x_{ki} \|^2}{\sigma^2} \right) \quad (11)
where x_{ki} is the ith example of the training data for class k and \sigma is a scaling parameter that controls the area of influence of the kernel. There is no rigorous mathematical method to determine an optimal \sigma; however, the author has found a simple first-order optimisation approach such as the gradient descent method [13] quite efficient in determining a suitable \sigma for the training set prior to the system being placed online.
Assuming the misclassification losses and prior probabilities of occurrence are constant, the class belonging to the feature vector is determined by:

T(x) = \operatorname{argmax}_n \{ f_1, f_2, \ldots, f_n, \ldots, f_{N_{class}} \} \quad (12)
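Under the equal-prior, equal-loss assumption, Eqs. (11)-(12) reduce to the short sketch below. This is an illustrative re-implementation, not the paper's C++ code; `sigma` would be tuned offline, e.g. by the gradient descent approach mentioned above:

```python
import numpy as np

def pnn_classify(x, train_X, train_y, sigma=1.0):
    """pNN decision of Eqs. (11)-(12), assuming equal priors h_i and
    equal misclassification losses c_i."""
    classes = sorted(set(train_y))
    densities = []
    for cls in classes:
        Xk = train_X[[i for i, label in enumerate(train_y) if label == cls]]
        # Eq. (11): average of Gaussian kernels centred on each training vector
        sq_dist = np.sum((Xk - x) ** 2, axis=1)
        densities.append(np.mean(np.exp(-sq_dist / sigma ** 2)))
    # Eq. (12): pick the class with the largest estimated density
    return classes[int(np.argmax(densities))]
```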
V. EXPERIMENT SETUP
A. Hardware
An algorithm to perform the operation given in figure 1 was programmed on a FriendlyARM mini2440 platform [14]. This platform features a 400 MHz Samsung ARM9 processor with on-board circuitry for sound recording and a USB interface for data storage. The platform is capable of running both Linux and Windows based operating systems. It was powered by a 12 V lead-acid battery. The cost of the platform is approximately $90 AUD.
The developed classification software was written in C++ and provided continuous recording using dual threads and dual alternating buffers. Two threads were initially created; these will be referred to as the recording thread and the classification thread. The recording thread continuously places audio samples into an available buffer while the classification thread waits for a buffer to be full (1 second of recording time). Once a buffer is full, the recording thread redirects the audio samples into the second buffer while the classification thread computes the f_o of the waveform stored in the full buffer and subsequently classifies the waveform if the right conditions are met, i.e. f_o \in [f_o^{min}, f_o^{max}]. Continuous recording was maintained provided the f_o computation and classification stages required no more than 1 second of computation time. The freely available FFTW subroutine library [15] was used to compute the PSD. This library is considered to be the most efficient freely available library for computing the fast Fourier transform: benchmarks performed on a variety of platforms show that FFTW's performance is typically superior to that of other publicly available FFT software, and is even competitive with vendor-tuned codes [15]. Figure 4 provides a photo of the classification system being tested on a colony of Apis mellifera honey bees.
Fig. 4. Photo of the classification system being tested on a colony of Apis mellifera honey bees.
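The dual-thread, dual-buffer scheme described above is a classic producer/consumer pattern. The paper's implementation is in C++; the Python sketch below illustrates the same idea with a bounded queue standing in for the two alternating buffers, and `record`/`process` as illustrative stand-ins for the audio capture and f_o/classification stages:

```python
import threading
import queue

# Two alternating buffers: the recording thread only blocks if the
# classification thread falls more than one buffer behind.
buffers = queue.Queue(maxsize=2)

def recording_thread(record, n_blocks):
    """Producer: continuously capture 1-second blocks into a free buffer."""
    for _ in range(n_blocks):
        buffers.put(record())

def classification_thread(process, n_blocks, results):
    """Consumer: wait for a full buffer, then compute f_o and classify."""
    for _ in range(n_blocks):
        results.append(process(buffers.get()))

# Demonstration with stub stages: record() yields a dummy 4-sample block
# and process() simply measures its length.
results = []
rec = threading.Thread(target=recording_thread, args=(lambda: [0.0] * 4, 3))
cls = threading.Thread(target=classification_thread, args=(len, 3, results))
rec.start(); cls.start()
rec.join(); cls.join()
```

The design keeps recording gap-free as long as processing one buffer takes no longer than filling the next, which is exactly the real-time condition stated above.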
B. Classication Performance Criteria
The performance of the algorithm was determined by the proportion of successful classifications that occurred during testing. We mathematically define the function:

g_i = \begin{cases} 1 & \text{if } T(x_i) = k \\ 0 & \text{if } T(x_i) \neq k \end{cases}

where x_i is the ith testing sample. The classification accuracy, which represents the proportion of successful classifications, is given as:

\epsilon = \frac{1}{N_{test}} \sum_{i=1}^{N_{test}} g_i \quad (13)

where N_{test} is the number of testing samples.
VI. EXPERIMENT 1: COLONY CLASSIFICATION
The first study presented in this paper is on the efficacy of the classification system to classify between two intraspecific colonies of European honey bees (Apis mellifera) with an arbitrary size of training data. The system was given a total of N_train training samples with a 50% distribution of training samples for each colony. Each training sample was audibly checked to ensure it contained an acoustic signal produced by a bee and had f_o \in [f_o^{min}, f_o^{max}], where f_o^{min} = 200 Hz and f_o^{max} = 250 Hz. The system was stopped after it had classified 100 bees. This was repeated 5 times, with the classification accuracy defined in equation 13 evaluated at each instance. It has been observed a priori that the harmonics emitted by the Apis mellifera bee have negligible amplitude past the 3rd harmonic; therefore N_x was set to 4.
The results of this experiment are given in table I, where \bar{\epsilon} denotes the mean of the classification accuracy over the five system runs. Evidently, with a minimal training size of 2, the system is able to obtain, on average, 61% and 54% accuracy using the pNN and kNN algorithms respectively. There was an observable increase in classification accuracy as the number of training samples increased: on average, 78% and 72% accuracy was obtained for the pNN and kNN algorithms respectively. The pNN algorithm is seen to be the more accurate algorithm.
TABLE I
PERCENTAGE OF SUCCESSFUL CLASSIFICATIONS DETERMINED BY EQUATION 13 FOR THE PNN AND KNN ALGORITHMS AS A FUNCTION OF TRAINING SIZE N_train FOR WHEN N_x = 4.

N_train | Algorithm   | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Mean
2       | pNN         | 67%   | 38%   | 59%   | 73%   | 69%   | 61%
2       | kNN (k = 1) | 45%   | 24%   | 59%   | 72%   | 69%   | 54%
10      | pNN         | 65%   | 72%   | 70%   | 80%   | 71%   | 72%
10      | kNN (k = 5) | 64%   | 72%   | 67%   | 73%   | 68%   | 69%
20      | pNN         | 79%   | 75%   | 76%   | 86%   | 73%   | 76%
20      | kNN (k = 5) | 68%   | 77%   | 72%   | 73%   | 63%   | 71%
40      | pNN         | 71%   | 78%   | 77%   | 73%   | 74%   | 75%
40      | kNN (k = 5) | 63%   | 74%   | 67%   | 69%   | 73%   | 69%
100     | pNN         | 79%   | 78%   | 78%   | 76%   | 80%   | 78%
100     | kNN (k = 5) | 68%   | 77%   | 76%   | 70%   | 73%   | 73%
VII. EXPERIMENT 2: INTERSPECIFIC BEE
CLASSIFICATION
The second experiment presented in this paper is on the efficacy of the classification system to classify between four different bee species: the Asian honey bee (Apis cerana), the native Australian blue-banded bee (Amegilla cingulata), the European honey bee (Apis mellifera) and the bumble bee (Bombus terrestris). In Australia, Apis cerana and Bombus terrestris are prohibited species, and therefore audio recordings of these insects in flight are very difficult to obtain. As such, the author obtained permission to use audio recordings taken by amateur entomologists in Japan for the Apis cerana bee and in South America for the Bombus terrestris bee. From these audio recordings and further recordings done locally, a training set was constructed which contained 5 one-second audio samples of each bee species under consideration.
The centroids of the testing set for each species are given in figure 5. There is evidently a large variation in wing-beat frequency and the distribution of power in the harmonics. To provide some evidence of the veracity of this figure, the average recorded f_o (measured wing-beat frequency) for each species was compared to values cited in the literature, shown in table II. Consistency is generally shown with previously cited values for all species except Amegilla cingulata, for which no literature value could be found.

TABLE II
TABLE OF THE WING-BEAT FREQUENCY ESTIMATIONS FOR FOUR DIFFERENT BEE SPECIES. ESTIMATIONS DONE IN THE PRESENT STUDY WERE AVERAGE VALUES FROM THE TRAINING SET.

Species            | f_o    | Citation
Apis cerana        | 265 Hz | Present study
Apis cerana        | 306 Hz | [18]
Amegilla cingulata | 229 Hz | Present study
Apis mellifera     | 225 Hz | Present study
Apis mellifera     | 240 Hz | [16]
Apis mellifera     | 197 Hz | [17]
Bombus terrestris  | 175 Hz | Present study
Bombus terrestris  | 156 Hz | [16]
Bombus terrestris  | 130 Hz | [17]
Due to the small number of training and testing samples, the classification system was tested offline. A testing sample was removed from the training set and applied to the classification algorithm. Table III provides the results for N_x = 1, i.e. only the wing-beat frequency is used in the classification algorithms, and for N_x = 12. As seen, both algorithms performed the same and were able to provide 88% accuracy when using only f_o as the classification feature. However, when given the complete feature vector (N_x = 12), both algorithms achieved 100% classification accuracy. It would also appear the kNN in this instance preferred a low value of k. Given figure 5, these results are not surprising as there is a significant difference between the feature vectors for the different bee species. It is also apparent the wing-beat frequency alone can be a reasonable feature in interspecific bee discrimination.
Fig. 5. Centroids of the training samples (h_n', n = 1, ..., 12) for the four bee species: Apis cerana, Amegilla cingulata, Apis mellifera and Bombus terrestris. f_o^{min} = 150 Hz and f_o^{max} = 300 Hz.
TABLE III
PERCENTAGE OF SUCCESSFUL CLASSIFICATIONS DETERMINED BY EQUATION 13 FOR THE PNN AND KNN ALGORITHMS FOR WHEN N_x = 1 AND N_x = 12.

Algorithm   | N_x = 1 | N_x = 12
pNN         | 88%     | 100%
kNN (k = 1) | 88%     | 100%
kNN (k = 2) | 88%     | 100%
kNN (k = 3) | 71%     | 82%
kNN (k = 4) | 82%     | 88%
kNN (k = 5) | 65%     | 76%
VIII. CONCLUSION
This paper has presented a system for the surveillance and classification of bee insects in real time using the acoustics emitted by the insects during flight. The intended purposes of this system are the surveillance of invasive bee species and tools for the tracking of bee behavior for new entomology studies. Extraction of a feature vector from the sampled acoustic waveform was described, followed by two classification algorithms implemented on a low-cost prototype platform.
The first experiment pertained to the intraspecific classification of two colonies of Apis mellifera. An average classification accuracy of 79% was obtained using a probabilistic neural network. The second experiment pertained to the interspecific classification of four distinct bee species; 100% classification accuracy was obtained using both the probabilistic neural network and k-nearest neighbor methods. This shows intraspecific classification is possible and obtains reasonable accuracy with the proposed algorithms. The results of interspecific classification were very promising, albeit with a limited training and testing set.
Future work on the proposed system includes the inclusion of more bee species in the training set and the addition of wireless connectivity for event notification. Subsequently the system is expected to be deployed over a wider area and operated for long periods of time.
ACKNOWLEDGMENT
The author would like to thank Yu's apiaries for the use of their beehives and the amateur and professional entomologists who donated their audio recordings of various insects. The author acknowledges the technical assistance given by Dr. Konstanty Bialkowski of the University of Queensland.
REFERENCES
[1] K.M. Coggins and J. Principe, "Detection and classification of insect sounds in a grain silo using a neural network," Neural Networks Proceedings, 1998 IEEE World Congress on Computational Intelligence, vol. 3, pp. 1760-1765, 4-9 May 1998.
[2] F. Fleurat-Lessard, B. Tomasini, L. Kostine and B. Fuzeau, "Acoustic detection and automatic identification of insect stages activity in grain bulks by noise spectra processing through classification algorithms," Proceedings of the 9th International Working Conference on Stored Product Protection, 15-18 October 2006, Campinas, Sao Paulo, Brazil.
[3] A. Moore, J.R. Miller, B.E. Tabashnik and S.H. Gage, "Automated identification of flying insects by analysis of wingbeat frequencies," J. Econ. Entomol. 79: 1703-1706.
[4] A. Moore, "Artificial neural network trained to identify mosquitoes in flight," Journal of Insect Behavior, vol. 4, no. 3, 1991.
[5] A. Moore and R.H. Miller, "Automated identification of optically sensed aphid (Homoptera: Aphidae) wing waveforms," Annals of the Entomological Society of America, 95(1):1-8, 2002.
[6] I. Potamitis, T. Ganchev and N. Fakotakis, "Automatic acoustic identification of insects inspired by the speaker recognition paradigm," INTERSPEECH 2006 - ICSLP, 9th International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006.
[7] E.D. Chesmore, "Application of time domain signal coding and artificial neural networks to passive acoustical identification of animals," Applied Acoustics 62 (2001) 1359-1374.
[8] M.J. Sarpola, R.K. Paasch, E.N. Mortensen, T.G. Dietterich, D.A. Lytle, A.R. Moldenke and L.G. Shapiro, "An aquatic insect imaging system to automate insect classification," Transactions of the American Society of Agricultural and Biological Engineers, 51(6): 2217-2225, 2008.
[9] R. Kumar, V. Martin and S. Moisan, "Robust insect classification applied to real time greenhouse infestation monitoring," IEEE ICPR Workshop on Visual Observation and Analysis of Animal and Insect Behavior, Istanbul, 2010.
[10] M. Deakin, "Formulae for insect wingbeat frequency," Journal of Insect Science, 10(96):1-9, 2010.
[11] R. Dudley, The Biomechanics of Insect Flight, Princeton University Press, Oxfordshire, United Kingdom.
[12] T. Masters, Practical Neural Network Recipes in C++, 1st edition, Academic Press, April 1993.
[13] J.A. Snyman, Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms, Springer Publishing, 2005.
[14] FriendlyARM. [Online]. Available: http://www.friendlyarm.net [Accessed: April 12, 2011].
[15] FFTW. [Online]. Available: http://www.fftw.org/ [Accessed: April 12, 2011].
[16] O. Sotavalta, "The essential factor regulating the wing stroke frequency of insects in wing mutilation and loading experiments and in experiments at subatmospheric pressure," Ann. Zool. Soc. Vanamo 15, 1-67.
[17] D.N. Byrne, "Relationship between wing loading, wingbeat frequency and body mass in Homopterous insects," Journal of Experimental Biology, 135, 9-23, 1988.
[18] N.P. Goyal and A.S. Atwal, "Wingbeat frequency of Apis indica indica F. and Apis mellifera L.," Journal of Apiculture Research, 16:47-48, 1977.
