1. Introduction
A common approach to modeling short segments of voiced speech is to view them as the output of an all-pole filter
excited by an impulse train [1]. In this simple but effective model, the filter models the vocal tract, and our goal
is to estimate its coefficients from a short segment of the output. Fig. 1 shows the spectrum of a filter and of its
output when excited by an impulse train with f0 = 100 Hz. Clearly, the spectral envelope of the output is a close
match to the filter response, except in the valley region. On the other hand, Fig. 2 shows what happens when f0
is increased to 1000 Hz. In this case, the harmonic peaks sample the filter's response only sparsely, and the filter's
spectrum does not at all appear to be the envelope of the output signal. We will later build on this example.
[Figure 1: Filter spectrum and output spectrum, f0 = 100 Hz. Axes: magnitude (dB) versus frequency (Hz), 0 to 4000 Hz.]
[Figure 2: Filter spectrum and output spectrum, f0 = 1000 Hz. Axes: magnitude (dB) versus frequency (Hz), 0 to 4000 Hz.]
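The sparse sampling of the filter response can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the filter coefficients and the 8 kHz sampling rate quoted in Section 3, with the alternating signs that the stated formants imply; the function names are illustrative. It evaluates the gain of 1/A(z) at the harmonics k f0 up to the Nyquist frequency for both values of f0.

```python
import cmath
import math

# Coefficients of A(z) as quoted in Section 3 (signs assumed to alternate).
A_COEFFS = [1.0, -4.1780, 8.2209, -9.8011, 7.5151, -3.5005, 0.7717]
FS = 8000.0  # sampling frequency in Hz, from Section 3


def filter_gain_db(freq_hz, a=A_COEFFS, fs=FS):
    """Gain of the all-pole filter 1/A(z) in dB at a single frequency."""
    w = 2.0 * math.pi * freq_hz / fs
    A = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
    return -20.0 * math.log10(abs(A))


def harmonic_samples(f0, a=A_COEFFS, fs=FS):
    """Return (frequency, gain in dB) at each harmonic k*f0 from DC to Nyquist."""
    count = int((fs / 2) // f0)
    return [(k * f0, filter_gain_db(k * f0, a, fs)) for k in range(count + 1)]


dense = harmonic_samples(100.0)    # 41 harmonics: the envelope is well sampled
sparse = harmonic_samples(1000.0)  # 5 harmonics: the formant peaks fall between samples

for freq, gain in sparse:
    print(f"{freq:6.0f} Hz  {gain:7.2f} dB")
print(f"gain at the 350 Hz formant: {filter_gain_db(350.0):.2f} dB")
```

With f0 = 1000 Hz none of the harmonics lands near the formants at 350 Hz and 750 Hz, so the harmonic peaks alone say little about the envelope between them.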
2. Proposed Method
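Let $x[n]$, $n = 0, 1, \ldots, N-1$, be the output of a $p$-th order all-pole filter $1/A(z) = 1/(1 + a_1 z^{-1} + \cdots + a_p z^{-p})$
in response to the periodic input $g[n]$. To motivate our idea, let the input be a periodic impulse train, although
our framework is applicable to any periodic input. We assume that we know $G(e^{j\omega})$. Let $B(z) = \sum_{k=0}^{p} b_k z^{-k}$ be
(1)
\[
\mathbf{X} =
\begin{bmatrix}
x[p] & x[p-1] & \cdots & x[0] \\
x[p+1] & x[p] & \cdots & x[1] \\
\vdots & \vdots & \ddots & \vdots \\
x[N-1] & x[N-2] & \cdots & x[N-p-1]
\end{bmatrix}
\]
subject to $\mathbf{W}^T \mathbf{e} = \mathbf{d}$ (2)
where
\[
\mathbf{W} =
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & e^{-j\omega_0} & \cdots & e^{-jL\omega_0} \\
1 & e^{-j2\omega_0} & \cdots & e^{-j2(L\omega_0)} \\
\vdots & \vdots & & \vdots \\
1 & e^{-j(M-1)\omega_0} & \cdots & e^{-j(M-1)(L\omega_0)}
\end{bmatrix}
\]
(4)
Setting the above to zero yields $\mathbf{b} = \mathbf{R}^{-1}\mathbf{C}\boldsymbol{\lambda}$. The parameter $\boldsymbol{\lambda}$ is obtained from the constraint $\mathbf{C}^T \mathbf{b} = \mathbf{d}$,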
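The step from the stationarity condition to the multiplier can be sketched with a standard Lagrange-multiplier argument. The definitions used here are assumptions, since they do not survive in the text: the residual is taken as $\mathbf{e} = \mathbf{X}\mathbf{b}$ with $\mathbf{b} = [b_0, \ldots, b_p]^T$, and $\mathbf{R} = \mathbf{X}^T\mathbf{X}$, $\mathbf{C} = \mathbf{X}^T\mathbf{W}$, so that $\mathbf{W}^T\mathbf{e} = \mathbf{C}^T\mathbf{b}$. The quantities are treated formally, ignoring the distinction between real and complex multipliers. This is a sketch of the usual argument, not necessarily the form of the missing equations.

```latex
\begin{align*}
J(\mathbf{b}, \boldsymbol{\lambda})
  &= \mathbf{b}^T \mathbf{R}\, \mathbf{b}
     - 2\boldsymbol{\lambda}^T \left( \mathbf{C}^T \mathbf{b} - \mathbf{d} \right) \\
\frac{\partial J}{\partial \mathbf{b}}
  &= 2\mathbf{R}\mathbf{b} - 2\mathbf{C}\boldsymbol{\lambda} = \mathbf{0}
  \quad\Longrightarrow\quad
  \mathbf{b} = \mathbf{R}^{-1}\mathbf{C}\boldsymbol{\lambda} \\
\mathbf{C}^T \mathbf{b} = \mathbf{d}
  &\quad\Longrightarrow\quad
  \mathbf{C}^T \mathbf{R}^{-1} \mathbf{C}\, \boldsymbol{\lambda} = \mathbf{d}
\end{align*}
```

Under these assumptions $\mathbf{C}^T\mathbf{R}^{-1}\mathbf{C}$ is $(L+1) \times (L+1)$ with rank at most $p+1$. When $L + 1 > p + 1$, as in the examples of Section 3, the last system cannot be solved by a plain inverse and would need a pseudoinverse or a least-squares solve.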
[Two plots: magnitude (dB) versus frequency (Hz), 0 to 4000 Hz. Curve labels in one plot: original and estimated filter, LPC filter. Curve labels in the other plot: original filter, estimated filter, LPC filter.]
Figure 3: Frequency response of the original and estimated filters when the input is an impulse train. They
match exactly. The estimated filter's spectrum was computed after normalizing b0 to unity. f0 = 1000 Hz,
p = 6, and L = 7. Also shown is the sixth-order LPC filter's response (autocorrelation method). Its bias towards the pitch
harmonics is clearly seen.
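The LPC comparison in the caption can be reproduced in outline. The following is a minimal sketch assuming NumPy, a rectangular window (the text does not state which window was used), and the Section 3 coefficients with alternating signs; the function names are illustrative. It applies the autocorrelation method to two signals: the filter's response to a single impulse, and a 240-sample steady-state segment of its response to an impulse train with f0 = 1000 Hz.

```python
import numpy as np


def allpole_response(a, g):
    """Filter g through 1/A(z) by direct recursion (a[0] is assumed to be 1)."""
    x = np.zeros(len(g))
    for n in range(len(g)):
        acc = g[n]
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * x[n - k]
        x[n] = acc
    return x


def lpc_autocorrelation(x, order):
    """LPC by the autocorrelation method: solve the Toeplitz normal equations."""
    x = np.asarray(x, dtype=float)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    alpha = np.linalg.solve(R, -r[1:])
    return np.concatenate([[1.0], alpha])


a = np.array([1.0, -4.1780, 8.2209, -9.8011, 7.5151, -3.5005, 0.7717])

# Case 1: single impulse, long enough for the response to decay fully.
single = np.zeros(3000)
single[0] = 1.0
a_single = lpc_autocorrelation(allpole_response(a, single), 6)

# Case 2: impulse train with a period of 8 samples (f0 = 1000 Hz at 8 kHz).
train = np.zeros(3000)
train[::8] = 1.0
x = allpole_response(a, train)
a_train = lpc_autocorrelation(x[2000:2240], 6)  # 30 ms steady-state segment

print("single impulse:", np.round(a_single, 4))
print("impulse train :", np.round(a_train, 4))
```

For the single impulse the normal equations are satisfied by the true coefficients, so the estimate matches them. For the periodic excitation the autocorrelation is aliased by the period, and the estimate moves away from the true filter.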
, 0 n 12; the
sin n
ated using the formula
26
12
3. Simulation Results
Consider an all-pole filter with coefficients [1, −4.1780, 8.2209, −9.8011, 7.5151, −3.5005, 0.7717]^T. This is the
same filter used in Figs. 1 and 2. The corresponding resonant frequencies and bandwidths (in Hz) are 350 (90),
750 (110), and 1500 (130). The excitation is an impulse train with f0 = 1000 Hz, and the sampling frequency is
fs = 8 kHz. These parameters have been chosen such that the formants are located well away from the pitch
harmonics. In particular, note that there are two formants between DC and the fundamental.
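The coefficients can be cross-checked against the stated formants. The following is a minimal sketch that assumes the common mapping from bandwidth B to pole radius, r = exp(−πB/fs), and from formant frequency F to pole angle, θ = 2πF/fs. The text does not state which mapping was used, and the function name is illustrative.

```python
import math


def allpole_from_formants(formants, fs):
    """Build the coefficients of A(z) from (frequency_hz, bandwidth_hz) pairs."""
    a = [1.0]
    for freq, bw in formants:
        r = math.exp(-math.pi * bw / fs)   # pole radius from the bandwidth
        theta = 2.0 * math.pi * freq / fs  # pole angle from the frequency
        section = [1.0, -2.0 * r * math.cos(theta), r * r]
        # Multiply the running polynomial by this second-order section.
        out = [0.0] * (len(a) + 2)
        for i, ai in enumerate(a):
            for j, sj in enumerate(section):
                out[i + j] += ai * sj
        a = out
    return a


coeffs = allpole_from_formants([(350, 90), (750, 110), (1500, 130)], fs=8000)
print([round(c, 4) for c in coeffs])
```

Under this mapping the signs of the coefficients alternate, which is consistent with three complex pole pairs in the right half of the z-plane.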
The result of applying (5) to a 30 ms segment of the above filter's output with p = 6, L = 7 is shown in
Fig. 3, which shows the magnitude spectra of 1/A(z) and 1/B(z), along with the result of LPC analysis
(autocorrelation method, sixth order). Equation (5) is not guaranteed to give a solution with b0 = 1, but after
normalization it yields the exact answer. On the other hand, the LPC spectrum is severely biased towards the pitch
harmonics. Although not shown here, it can be easily
verified that the LPC method produces very good results
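The impulse-train experiment can be sketched numerically. The following assumes NumPy and is not the paper's equation (5), which does not survive in the text. It solves the same constrained problem, minimizing ‖Xb‖² subject to W^T X b = d, with a generic null-space method. The function names are illustrative, the coefficient signs are assumed to alternate, and the analysis window is deliberately aligned so that the first residual sample coincides with an excitation impulse.

```python
import numpy as np


def allpole_response(a, g):
    """Filter g through 1/A(z) by direct recursion (a[0] is assumed to be 1)."""
    x = np.zeros(len(g))
    for n in range(len(g)):
        acc = g[n]
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * x[n - k]
        x[n] = acc
    return x


def constrained_inverse_filter(x, p, w0, d):
    """Minimize ||X b||^2 subject to W^T X b = d using a null-space method."""
    d = np.asarray(d)
    N = len(x)
    # Row m of the data matrix holds x[m+p], x[m+p-1], ..., x[m].
    X = np.array([x[m + p - np.arange(p + 1)] for m in range(N - p)])
    M = X.shape[0]
    W = np.exp(-1j * w0 * np.outer(np.arange(M), np.arange(len(d))))
    Cc = W.T @ X                            # complex constraints on b
    A = np.vstack([Cc.real, Cc.imag])       # stacked real form, since b is real
    rhs = np.concatenate([d.real, d.imag])
    b_part = np.linalg.lstsq(A, rhs, rcond=None)[0]
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-9 * s[0]))
    Z = Vt[rank:].T                         # basis of the constraint null space
    if Z.shape[1] == 0:
        return b_part                       # the constraints alone determine b
    y = np.linalg.lstsq(X @ Z, -X @ b_part, rcond=None)[0]
    return b_part + Z @ y


a = np.array([1.0, -4.1780, 8.2209, -9.8011, 7.5151, -3.5005, 0.7717])
fs, f0, p, L = 8000, 1000, 6, 7
period = fs // f0

g = np.zeros(3000)
g[::period] = 1.0
x_full = allpole_response(a, g)

start = 2000 - p                     # sample 2000 carries an impulse (2000 % 8 == 0)
segment = x_full[start:start + 240]  # 30 ms at 8 kHz

b = constrained_inverse_filter(segment, p, 2 * np.pi * f0 / fs, np.ones(L + 1))
b_norm = b / b[0]                    # normalize b0 to unity
print(np.round(b_norm, 4))
```

In this example the L + 1 = 8 harmonic constraints outnumber the p + 1 = 7 unknowns, so the constraints already determine b and the null-space step is skipped. The normalized result reproduces the coefficients of A(z).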
4. Discussion
[Figure 5 plot: magnitude (dB) versus frequency (Hz), 0 to 4000 Hz. Curve labels: original and estimated filter, LPC filter.]
Figure 5: Exact results are also obtained when the excitation is a periodic pulse. f0 = 296.3 Hz, p = 6, L = 26.
[Figure 6 plot: magnitude (dB) versus frequency (Hz), 0 to 4000 Hz. Curve labels: original; LPC; minimum-norm window location, unity constraint; exact-answer window location, unity constraint.]
5. Conclusion
We have proposed a method of exactly identifying an all-pole system from its response to a periodic input. We
need knowledge of the input spectrum at $k\omega_0$, where $\omega_0$ is the fundamental frequency. We obtained the
inverse filter by minimizing the norm of the residual subject to the constraint that its spectrum be equal to that
of the input at $k\omega_0$. The method, as presented in this paper, is sensitive to the analysis window location. Based
on the examples that we have tried, the criterion of choosing the position that minimizes $\|\mathbf{e}_m\|$ seems to be effective.
6. Acknowledgment
was evaluated at $k\omega_0$ to obtain $\mathbf{d}$. The results for p = 6 and L = 26 are given in Fig. 5 for the window
location that minimized $\|\mathbf{e}\|$. As in the previous example, the proposed method gives virtually exact results.
When the excitation is not an impulse train and we set $\mathbf{d} = \mathbf{1}$ instead of using $G(e^{jk\omega_0})$ as the
constraint values, we cannot, in light of the above simulation results, hope to get the exact estimate. For the
window location that gave the exact answer when $G(e^{jk\omega_0})$ was used, the result of using $\mathbf{d} = \mathbf{1}$ in the
second example is given in Fig. 6. Also shown in that figure is the estimate that minimizes $\|\mathbf{e}_m\|$, which occurs
at a different analysis window location.
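The constraint values used in this experiment are the excitation spectrum sampled at the harmonics. The following is a minimal sketch of that evaluation using the standard DTFT sign convention. The pulse shape is a hypothetical stand-in (a half-sine on 0 ≤ n ≤ 12), because the pulse formula does not survive in the text. The period of 27 samples is inferred from f0 = 296.3 Hz at fs = 8 kHz and agrees with L = 26; the function name is illustrative.

```python
import cmath
import math


def harmonic_spectrum(pulse, period):
    """Evaluate the DTFT of one excitation period at k*w0 for k = 0..period-1."""
    w0 = 2.0 * math.pi / period
    return [
        sum(g * cmath.exp(-1j * k * w0 * n) for n, g in enumerate(pulse))
        for k in range(period)
    ]


PERIOD = 27  # 8000 Hz / 27 samples is approximately 296.3 Hz

# One period of an impulse train: every constraint value equals 1.
impulse = [1.0] + [0.0] * (PERIOD - 1)
d_impulse = harmonic_spectrum(impulse, PERIOD)

# Hypothetical pulse: half-sine on n = 0..12, zero for the rest of the period.
half_sine = [math.sin(math.pi * n / 12.0) if n <= 12 else 0.0 for n in range(PERIOD)]
d_pulse = harmonic_spectrum(half_sine, PERIOD)

for k in range(4):
    print(k, round(abs(d_pulse[k]), 4))
```

For any pulse other than an impulse the values differ from unity in both magnitude and phase, which is why replacing them with a vector of ones changes the constrained solution.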
7. References
[1] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice-Hall, 1978.
[2] J. I. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, pp. 561–580, Apr. 1975.
[3] B. S. Atal and S. L. Hanauer, Speech analysis