

Exact Identification of an All-Pole System From Its Response to a Periodic Input
C.S. Ramalingam
Department of Electrical Engineering
Indian Institute of Technology Madras
Chennai 600 036
csr@iitm.ac.in
Abstract

We show that it is possible to identify exactly an all-pole system from its response to a periodic input. The problem is motivated by speech analysis, where voiced speech is modeled as the response of an all-pole filter to an impulse train. The autocorrelation method of linear predictive (LP) analysis estimates the inverse filter using a criterion that is equivalent to maximizing the residual signal's spectral flatness measure. This is not a satisfying criterion; a better choice is to require that the spectral envelope be flat, but this results in a nonlinear problem. However, if we constrain the spectrum to be constant at a discrete set of points, namely at multiples of the fundamental frequency, the problem not only becomes linear but also yields the exact inverse filter under certain conditions. The framework is general in that it can handle any periodic excitation, but it requires knowledge of the input spectrum at multiples of the fundamental frequency. We illustrate the effectiveness of our method using synthetic examples. The proposed method is sensitive to the starting point of the analysis window.

1. Introduction

A common approach for modeling short segments of voiced speech is to view them as the output of an all-pole filter excited by an impulse train [1]. In this simple but effective model, the filter models the vocal tract, and our goal is to estimate its coefficients from a short segment of the output. Fig. 1 shows the spectrum of a filter and of its output when excited by an impulse train with f0 = 100 Hz. Clearly, the spectral envelope of the output closely matches the filter response, except in the valley region. On the other hand, Fig. 2 shows what happens when f0 is increased to 1000 Hz. In this case, the harmonic peaks sample the filter's response only sparsely, and the filter's spectrum does not at all appear to be the envelope of the output signal. We will build on this example later.

Figure 1: Spectrum of an all-pole filter and its output when excited by an impulse train with f0 = 100 Hz. The envelope of the output and the filter spectrum match closely (except in the valley region).

Figure 2: The output spectrum when the impulse train's f0 is increased to 1000 Hz. The harmonic peaks sample the filter spectrum sparsely, and there seems to be a gross mismatch between the output's spectral envelope and the filter response.
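The sparse sampling seen in Fig. 2 is easy to quantify: below the 4 kHz Nyquist frequency there are 39 pitch harmonics when f0 = 100 Hz, but only three when f0 = 1000 Hz. A short sketch (our own illustration, not from the paper; the coefficient signs are restored from the resonances and bandwidths stated in Section 3):

```python
import numpy as np
from scipy.signal import freqz

# Filter of Figs. 1 and 2 (coefficient signs restored from the
# resonances/bandwidths stated in Section 3); fs = 8 kHz
a = np.array([1.0, -4.1780, 8.2209, -9.8011, 7.5151, -3.5005, 0.7717])
fs = 8000

for f0 in (100, 1000):
    harm = np.arange(f0, fs / 2, f0)          # pitch harmonics below Nyquist
    _, H = freqz([1.0], a, worN=harm, fs=fs)  # filter response at the harmonics
    print(f0, len(harm), np.round(20 * np.log10(np.abs(H)).max(), 1))
```

At f0 = 1000 Hz the output spectrum reveals the filter response at only three frequencies, which is why its envelope tells us so little about 1/A(z).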

The most popular method of estimating the model coefficients is based on the principles of linear predictive coding (LPC) [2, 3]. In many practical applications the autocorrelation method [1] is used because it always yields a stable filter. It is well known that the LPC approach suffers from drawbacks; e.g., for voiced speech segments the LP filter's spectral peaks are biased towards the pitch harmonics, a drawback that is inherent in the error criterion [2, 4]. Fig. 3 shows this bias very clearly: the LPC spectrum there was obtained by applying the autocorrelation method to the example of Fig. 2 (see Section 3 for more details).
It has been shown that the autocorrelation method is equivalent to maximizing the spectral flatness measure of the output of the inverse filter [5, 6]. If the excitation of the all-pole filter is an impulse train and we filter the output by the corresponding exact inverse filter, the resulting residual signal is the original impulse train. For this residual signal, what is flat is the spectral envelope, rather than the spectrum itself. Therefore, a more appropriate criterion would be to require that the spectral envelope of the residual signal be flat, rather than requiring this of the spectrum. However, constraining the envelope is a very difficult problem to formulate: while it may be easy to visualize the envelope of a spectrum, it is very difficult to translate this mental picture into concrete mathematical terms. Even if we could come up with a mathematical formulation, it would most likely result in a nonlinear problem that is difficult to solve.
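The distinction is easy to see numerically. A minimal sketch (our own, using the usual geometric-to-arithmetic-mean definition of spectral flatness from [5]): a single impulse has a perfectly flat spectrum, while a periodic impulse train — the ideal residual — has a comb spectrum whose flatness measure is essentially zero, even though its envelope is flat.

```python
import numpy as np

def sfm(x, eps=1e-12):
    """Spectral flatness measure: ratio of geometric to arithmetic
    mean of the power spectrum (1 = perfectly flat)."""
    P = np.abs(np.fft.rfft(x)) ** 2 + eps
    return np.exp(np.mean(np.log(P))) / np.mean(P)

N, P = 240, 8
delta = np.zeros(N); delta[0] = 1.0    # single impulse: flat spectrum
train = np.zeros(N); train[::P] = 1.0  # impulse train: comb spectrum

print(round(sfm(delta), 3), round(sfm(train), 3))  # → 1.0 0.0
```

Maximizing spectral flatness therefore rewards the wrong target when the excitation is periodic.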
Instead, we propose to constrain the residual signal's spectrum to be constant at a discrete set of points. An intuitively appealing choice of frequencies is kω₀, where ω₀ is the fundamental frequency of the periodic input. In the next section we show that seeking the inverse filter that minimizes the norm of the residual, subject to the constraint that its spectrum be constant at kω₀, leads to a linear minimization problem. In Section 3 we show through simulation examples that the solution gives the exact inverse filter. Constraining the residual spectrum to be constant at multiples of the pitch frequency is a special case of the general approach of constraining it to be G(e^{jkω₀}), where G(e^{jω}) is the spectrum of the input, which is assumed to be known at ω = kω₀.
The effectiveness of placing constraints at a discrete
set of frequencies, thereby converting a nonlinear problem into a linear one and yet obtaining an effective solution, has been demonstrated in a different context in
[7]. In that work, given a non-positive sequence, i.e., one
whose Fourier transform was not strictly non-negative,
the goal was to find a positive sequence that was closest
in the mean-square sense. The solution proposed in [8]
was nonlinear. Virtually the same solution was arrived at
by iteratively correcting the spectrum at the most negative
points, which resulted in a linear minimization problem
at each step [7].

2. Proposed Method
Let x[n], n = 0, 1, ..., N−1, be the output of a p-th order all-pole filter 1/A(z) = 1/(1 + a₁z⁻¹ + ··· + a_p z⁻ᵖ) in response to the periodic input g[n]. To motivate our idea, let the input be a periodic impulse train, although our framework is applicable to any periodic input. We assume that we know G(e^{jω}). Let B(z) = Σ_{k=0}^{p} b_k z⁻ᵏ be the inverse filter, which, when excited by x[n], produces the residual e[n]. Over the interval n = p, ..., N−1 this can be expressed in matrix form as follows:

    e = Xb                                         (1)
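In code, the matrix form (1) is just FIR filtering by B(z) with the first p start-up samples discarded; a quick check (a sketch of our own, with random data standing in for x[n] and b):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N, p = 32, 6
x = rng.standard_normal(N)
b = rng.standard_normal(p + 1)

# Row n of X holds [x[n], x[n-1], ..., x[n-p]] for n = p, ..., N-1
X = np.column_stack([x[p - k : N - k] for k in range(p + 1)])
e = X @ b

# The same residual via direct FIR filtering with B(z),
# discarding the first p samples where the filter is still filling
e_ref = lfilter(b, [1.0], x)[p:]
print(np.allclose(e, e_ref))  # → True
```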

where the (N−p) × (p+1) matrix X is

    X = [ x[p]     x[p−1]   ···  x[0]
          x[p+1]   x[p]     ···  x[1]
          ⋮        ⋮             ⋮
          x[N−1]   x[N−2]   ···  x[N−p−1] ]

and b = (b₀ b₁ ... b_p)ᵀ. Note that b₀ is not constrained to be unity. We seek the b that minimizes E = ‖e‖₂² = eᵀe, subject to the constraint that the residual spectrum be constant at kω₀, k = 0, 1, ..., L. That is,

    min eᵀe   subject to   Wᵀe = d                 (2)

where

    W = [ 1   1              ···  1
          1   e^{jω₀}        ···  e^{jLω₀}
          1   e^{j2ω₀}       ···  e^{j2Lω₀}
          ⋮   ⋮                   ⋮
          1   e^{j(M−1)ω₀}   ···  e^{j(M−1)Lω₀} ]

and M = N − p. For the examples in Section 3, L was chosen such that Lω₀ < 2π ≤ (L+1)ω₀. The constant value of the spectrum at kω₀ has been set to 1 without loss of generality, and hence d = [1 1 ··· 1]ᵀ ≜ 1. For a general periodic input g[n], d = [G(e^{j0}) G(e^{jω₀}) G(e^{j2ω₀}) ... G(e^{jLω₀})]ᵀ. In particular, for the impulse train, G(e^{jkω₀}) is a real-valued constant.
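As a concrete sketch (our own illustration, not the authors' code), X, W, and d can be assembled for the impulse-train example of Section 3 and problem (2) solved using the closed-form solution derived below. The restored coefficient signs, the window offset that aligns the residual impulses, and the pseudo-inverse regularization are our assumptions:

```python
import numpy as np
from scipy.signal import lfilter

# All-pole filter of Section 3 (coefficient signs restored from the
# stated resonances/bandwidths); fs = 8 kHz, f0 = 1000 Hz, p = 6, L = 7
a_true = np.array([1.0, -4.1780, 8.2209, -9.8011, 7.5151, -3.5005, 0.7717])
fs, f0, p, L, N = 8000, 1000, 6, 7, 240

P = fs // f0                            # pitch period: 8 samples
g = np.zeros(600); g[::P] = 1.0         # impulse-train excitation
x_full = lfilter([1.0], a_true, g)

# The method is sensitive to the window start (see Fig. 4); this offset
# aligns the residual impulses with m = 0 mod P, making d = 1 exact
s = (-p) % P
x = x_full[s:s + N]

M = N - p
X = np.column_stack([x[p - k : N - k] for k in range(p + 1)])   # (M, p+1)
w0 = 2 * np.pi / P
W = np.exp(1j * w0 * np.outer(np.arange(M), np.arange(L + 1)))  # (M, L+1)
d = np.ones(L + 1)                      # unity constraint for impulse train

# b = R^{-1} C (C^T R^{-1} C)^{-1} d; pinv guards against the rank
# deficiency of C^T R^{-1} C when L + 1 > p + 1
R, C = X.T @ X, X.T @ W
RinvC = np.linalg.solve(R, C)
lam = np.linalg.pinv(C.T @ RinvC, rcond=1e-8) @ d
b = RinvC @ lam

b_hat = (b / b[0]).real                 # normalize b0 to unity
print(np.round(b_hat, 4))
```

With the window aligned as above, `b_hat` recovers the true coefficients (up to numerical precision), matching the exact identification shown in Fig. 3.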
The solution to the constrained minimization of (2) is well known and is obtained using the method of Lagrange multipliers [9]. We begin by replacing e with Xb, to get

    E = bᵀRb − 2λᵀ(Cᵀb − d)                        (3)

where R = XᵀX and C = XᵀW. Differentiating E with respect to b, we get

    ∂E/∂b = 2Rb − 2Cλ                              (4)

Setting the above to zero yields b = R⁻¹Cλ. The parameter λ is obtained from the constraint Cᵀb = d,

leading us to the inverse filter that yields the minimum error:

    b = R⁻¹C(CᵀR⁻¹C)⁻¹d                            (5)

Note that the corresponding B(z) is not guaranteed to be minimum phase.

The solution to the problem of minimizing a quadratic form with linear constraints has been known for a long time. What is novel in this work is formulating our system identification problem within this framework and demonstrating, through simulation examples, exact identification.

3. Simulation Results

Consider an all-pole filter with coefficients [1, −4.1780, 8.2209, −9.8011, 7.5151, −3.5005, 0.7717]ᵀ. This is the same filter used in Figs. 1 and 2. The corresponding resonant frequencies and bandwidths (in Hz) are 350 (90), 750 (110), and 1500 (130). The excitation is an impulse train with f0 = 1000 Hz, and the sampling frequency is fs = 8 kHz. These parameters have been chosen such that the formants are located well away from the pitch harmonics. In particular, note that there are two formants between DC and the fundamental.

The result of applying (5) to a 30 ms segment of the above filter's output with p = 6 and L = 7 is shown in Fig. 3, which shows the magnitude spectra of 1/A(z) and 1/B(z), along with the result of LPC analysis (autocorrelation method, sixth order). Equation (5) is not guaranteed to give a solution with b₀ = 1, but after normalization it yields the exact answer. On the other hand, the LPC spectrum is severely biased towards the pitch harmonics. Although not shown here, it can easily be verified that the LPC method produces very good results when f0 = 100 Hz.

Figure 3: Frequency response of the original and estimated filters when the input is an impulse train. They match exactly. The estimated filter's spectrum was computed after normalizing b₀ to unity. f0 = 1000 Hz, p = 6, L = 7. Also shown is the sixth-order LPC filter's response (autocorrelation method). Its bias towards the pitch harmonics is clearly seen.

The performance of the proposed method comes at a price, namely, it is very sensitive to where the analysis window is located. In Fig. 4 the result of applying (5) to a different segment is shown. It is clear that for this analysis window position, the new method gives an estimate that is significantly worse than the exact answer obtained previously. In sharp contrast, the autocorrelation method of LPC analysis is far less sensitive to the analysis position. In [11] we have modified our method such that it is no longer sensitive to the analysis window location.

Figure 4: The proposed method gives an estimate that is very sensitive to the analysis window position. The result for a different position is shown above, which is significantly worse than the exact answer obtained previously. On the other hand, the LPC result has hardly changed.

An intuitively appealing criterion for locating the analysis window is to choose the position that gives the minimum ‖e‖. That is, for a window beginning at m, if we denote the corresponding error vector by e_m, the optimal choice is given by min_m ‖e_m‖. In the examples that we tried, this proved to be effective.

We present one more example, in which the excitation is a synthetic glottal pulse train rather than impulses. The chosen resonant frequencies and pitch are closer to a natural speech example. The true coefficients are [1, −2.0535, 2.4818, −2.1442, 2.2336, −1.6961, 0.7717]ᵀ, corresponding to the following resonances and bandwidths (in Hz): 750 (90), 1100 (110), and 2550 (130). One period of the excitation pulse was constructed as follows [10]: the first ten samples were zero; the next 13 samples were generated using sin(πn/26), 0 ≤ n ≤ 12; the last four samples were generated using sin(πn/8 + π/8), 1 ≤ n ≤ 4. These three sequences were concatenated to get one period (27 samples), giving an f0 of 296.3 Hz. The complex spectrum of a 30 ms segment of the input was evaluated at kω₀ to obtain d. The results for p = 6 and L = 26 are given in Fig. 5 for the window location that minimized ‖e‖. As in the previous example, the proposed method gives virtually exact results.

Figure 5: Exact results are also obtained when the excitation is a periodic pulse. f0 = 296.3 Hz, p = 6, L = 26.

When the excitation is not an impulse train, if we set d = 1 instead of using G(e^{jkω₀}) as the constraint values, then in the light of the above simulation results we cannot hope to get the exact estimate. For the window location that gave the exact answer when G(e^{jkω₀}) was used, the result of using d = 1 in the second example is given in Fig. 6. Also shown in that figure is the estimate that minimizes ‖e_m‖, which occurs at a different analysis window location.

Figure 6: When the excitation is not an impulse train and yet we constrain d = 1, the performance of the proposed method suffers. "Exact answer window location": the analysis window is in the same position as in Fig. 5, i.e., the location that will yield the exact answer if the correct constraint is used. "Minimum norm window location": the window position that yields min ‖e_m‖ with d = 1.

4. Discussion

In both the examples given in the previous section and the others that we have tried, the solution of (5) gives an answer that is real-valued, even though we did not impose any such constraint.

Our method requires knowledge of G(e^{jω}), which is almost never known if we want to apply our method to a segment of natural speech. In such cases we need to solve the problem in a blind manner, i.e., by assuming that G(e^{jω}) is not known. This is currently being investigated.

In Fig. 6 we have given an example of how the performance suffers if we use the constraint d = 1 even when the input is not an impulse train. Both LPC and our method estimate the first two formants reasonably well, although for this example our method gives smaller bandwidth estimates. Both methods failed to capture the third formant at 2550 Hz. We have not yet carefully studied or attempted to quantify how badly the performance suffers in such cases.

As mentioned in Section 3, in [11] we have addressed the method's sensitivity to the starting point of the analysis window. In that work, we have also investigated the effects of analysis window size, model order, choice of constraint frequencies, and errors in ω₀, and analyzed natural speech in cases where the corresponding electroglottograph signal was available.

5. Conclusion

We have proposed a method of identifying exactly an all-pole system from its response to a periodic input. We need knowledge of the input spectrum at kω₀, where ω₀ is the fundamental frequency. We obtained the inverse filter by minimizing the norm of the residual subject to the constraint that its spectrum be equal to that of the input at kω₀. The method, as presented in this paper, is sensitive to the analysis window location, and based on the examples that we have tried, the criterion of choosing the position that minimizes ‖e_m‖ seems to be effective.

6. Acknowledgment

The author wishes to thank Prof. S. Umesh of IIT Kanpur for his comments on the paper.

7. References

[1] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice-Hall, 1978.

[2] J. I. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, pp. 561-580, Apr. 1975.

[3] B. S. Atal and S. L. Hanauer, "Speech analysis and synthesis by linear prediction of the speech wave," Journal of the Acoustical Society of America, vol. 50, no. 2, pp. 637-655, Aug. 1971.

[4] A. El-Jaroudi and J. Makhoul, "Discrete all-pole modeling," IEEE Transactions on Signal Processing, vol. 39, no. 2, pp. 411-422, Feb. 1991.

[5] A. H. Gray, Jr. and J. D. Markel, "A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis," IEEE Trans. Acoust., Speech, Signal Processing, vol. 22, no. 3, pp. 207-217, Jun. 1974.

[6] J. D. Markel and A. H. Gray, Jr., Linear Prediction of Speech. New York, NY: Springer-Verlag, 1976.

[7] C. S. Ramalingam and R. J. Vaccaro, "A simplified computational algorithm to obtain sequences with non-negative Fourier transforms," IEEE Transactions on Signal Processing, vol. 39, pp. 1459-1462, Jun. 1991.

[8] J. Cadzow and Y. Sun, "Sequences with positive semidefinite Fourier transforms," IEEE Trans. Acoust., Speech, Signal Processing, vol. 34, no. 6, pp. 1502-1510, 1986.

[9] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Nashua, NH: Athena Scientific, 1996.

[10] P. Hedelin, "High quality glottal LPC-vocoding," in Proceedings of IEEE ICASSP '86, Tokyo, Japan, Apr. 1986, pp. 465-468.

[11] C. S. Ramalingam and B. H. Sri Hari, "A constrained least-squares method for all-pole modeling of speech," in preparation.
