Speech Coding

The transmission of speech is, at the moment, the most important service of a mobile cellular system.
The GSM speech codec, which transforms the analog signal (voice) into a digital representation, has to
meet the following criteria:
• It must deliver good speech quality, at least as good as that obtained with previous cellular systems.
• It must reduce the redundancy in the speech signal. This reduction is essential because the
transmission capacity of a radio channel is limited.
• It must not be very complex, because complexity translates into high cost.
The final choice for the GSM speech codec is a codec named RPE-LTP (Regular Pulse Excitation Long-Term
Prediction). This codec uses the information from previous samples (which does not change very
quickly) to predict the current sample. The speech signal is divided into blocks of 20 ms. Each
block is passed to the speech codec, which runs at 13 kbps and therefore produces 260 bits
per block.
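The frame size follows directly from the codec rate and the block length. As a minimal sketch (the constant and function names are illustrative, not taken from any GSM specification):

```python
FRAME_MS = 20            # speech block duration given in the text
CODEC_RATE_BPS = 13_000  # full-rate codec output rate given in the text

def bits_per_frame(rate_bps=CODEC_RATE_BPS, frame_ms=FRAME_MS):
    """Bits produced by the codec for one speech block."""
    return rate_bps * frame_ms // 1000

def frames_per_second(frame_ms=FRAME_MS):
    """Number of speech blocks coded each second."""
    return 1000 // frame_ms

print(bits_per_frame(), "bits per frame,", frames_per_second(), "frames per second")
```

Multiplying the two results back together gives the 13 kbps codec rate.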
The speech coding algorithm used in GSM is a regular pulse excited linear predictive coder
with long-term prediction (RPE-LTP). The speech coder outputs one frame every 20 ms at a 13 kbps
bit rate, so each frame contains 260 bits. These 260 bits are divided into 182 class 1 and 78 class
2 bits based on a subjective evaluation of their sensitivity to bit errors, with the class 1 bits being the
most sensitive. Channel coding adds parity check bits and applies half-rate convolutional coding
to the class 1 bits; the class 2 bits are sent unprotected. The output of the channel coder is a 456-bit
frame, which is divided into eight 57-bit components and interleaved over eight consecutive TDMA bursts
of 114 bits each (one burst per TDMA frame). Each burst correspondingly carries two sets of 57 bits from
two separate 456-bit channel coder frames. Together, channel coding and interleaving counter the effects
of fading, interference, and other sources of bit errors.
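The bit budget and the interleaving pattern described above can be checked with a short sketch. Several details here are not in the text and are assumptions to verify against the GSM 05.03 channel coding specification: the split of the 182 class 1 bits into 50 class 1a and 132 class 1b bits, the 3 parity and 4 tail bits, and the burst mapping formula. "Burst" here means the 114 bits a call sends in one TDMA frame.

```python
# Assumed split and overhead (not stated in the text above).
CLASS_1A, CLASS_1B, CLASS_2 = 50, 132, 78   # 50 + 132 = 182 class 1 bits
PARITY_BITS, TAIL_BITS = 3, 4

def channel_coded_bits():
    """Size of the channel coder output for one 260-bit speech frame."""
    protected = CLASS_1A + PARITY_BITS + CLASS_1B + TAIL_BITS  # 189 bits
    return 2 * protected + CLASS_2                             # 378 + 78

def burst_position(n, k):
    """Map coded bit k (0..455) of speech frame n to (burst index, bit position 0..113)."""
    burst = 4 * n + (k % 8)
    position = 2 * ((49 * k) % 57) + ((k % 8) // 4)
    return burst, position

def interleave(num_frames):
    """Return {burst: {position: source frame}} for num_frames coded frames."""
    bursts = {}
    for n in range(num_frames):
        for k in range(channel_coded_bits()):
            burst, position = burst_position(n, k)
            bursts.setdefault(burst, {})[position] = n
    return bursts

bursts = interleave(3)
sources = sorted(set(bursts[4].values()))
print(channel_coded_bits(), "coded bits; burst 4 carries frames", sources)
```

In this mapping each coded frame is spread over eight bursts, and every burst in the steady state holds 57 bits from one frame in its even positions and 57 bits from the next frame in its odd positions, so a lost burst damages only part of any one frame.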
Speech Coding
GSM is a digital communications standard, but voice is analog, and therefore it must be converted to a digital bit stream. GSM
uses Pulse Code Modulation (PCM, 64 kbps) to digitize voice, and then uses the Full-Rate speech codec to remove the redundancy
in the signal and achieve a bit rate of 13 kbps.
GSM Speech and Channel Coding
In order to send our voice across a radio network, we have to turn it into a digital signal. GSM uses a method called
RPE-LPC (Regular Pulse Excited - Linear Predictive Coder with a Long Term Predictor Loop) to turn our analog voice into a
compressed digital equivalent. Once we have a digital signal, we have to add some redundancy so that we can
recover from errors when we transmit our digital voice over the radio channel. GSM uses convolutional codes to encode the
digital speech representation.
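To illustrate how a convolutional code adds redundancy, here is a minimal sketch of a rate-1/2 encoder that emits two output bits for every input bit. The generator polynomials G0 = 1 + D^3 + D^4 and G1 = 1 + D + D^3 + D^4 are not given in the text; they are assumed here from the GSM 05.03 full-rate channel coding and should be checked against that specification. The function name is illustrative.

```python
def conv_encode(bits):
    """Rate-1/2 convolutional encoder with a 4-bit shift register.

    Assumed generators: G0 = 1 + D^3 + D^4, G1 = 1 + D + D^3 + D^4.
    """
    state = [0, 0, 0, 0]  # state[0] is the previous input bit (D), state[3] is D^4
    out = []
    for u in bits:
        g0 = u ^ state[2] ^ state[3]
        g1 = u ^ state[0] ^ state[2] ^ state[3]
        out.extend((g0, g1))
        state = [u] + state[:3]  # shift the new bit into the register
    return out

coded = conv_encode([1, 0, 1, 1, 0, 0, 0, 0])
print(len(coded), "coded bits from 8 input bits")
```

Because every output bit depends on the current input bit and several earlier ones, a decoder (typically a Viterbi decoder, not shown) can recover the input even when some received bits are wrong.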
SPEECH ENCODING
RPE-LPC
In modern land-line telephone systems, digital coding is used. The electrical variations induced in the microphone are sampled
at a rate of 8 kHz, and each sample is converted into an 8-bit binary number representing one of 256 distinct values.
Since we sample 8000 times per second and each sample is 8 bits, we have a bit rate of 8 kHz × 8 bits = 64 kbps.
This bit rate is unrealistic to transmit across a radio network, where capacity is limited and interference is likely to corrupt
the transmitted waveform. GSM speech encoding compresses the speech waveform to a lower bit rate
using RPE-LPC.
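The sampling and quantization step can be sketched as follows. This is a simplification: it uses a uniform 8-bit quantizer, whereas telephone PCM (ITU-T G.711) uses logarithmic companding (A-law or mu-law). The function names and the 440 Hz test tone are illustrative assumptions.

```python
import math

SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8

def quantize(x, bits=BITS_PER_SAMPLE):
    """Map an amplitude in [-1.0, 1.0] to an integer code in 0 .. 2**bits - 1 (uniform steps)."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    return min(levels - 1, int((x + 1.0) / 2.0 * levels))

def pcm_encode(signal):
    """Quantize every sample of an already-sampled signal."""
    return [quantize(s) for s in signal]

def bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate of the PCM stream in bits per second."""
    return sample_rate_hz * bits

# One second of a 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(SAMPLE_RATE_HZ)]
codes = pcm_encode(tone)
print(len(codes), "samples,", bit_rate(), "bits per second")
```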
An LPC encoder [1] fits a given speech signal against a set of vocal characteristics. The best-fit parameters are transmitted and used by
the decoder to generate synthetic speech that is similar to the original. Information from previous samples is used to predict the current
sample. The signal is represented by the coefficients of the linear combination of the previous samples, plus an encoded form of the
residual (the difference between the predicted and actual sample). Speech is divided into 20 millisecond frames, each of which is encoded
as 260 bits, giving a total bit rate of 13 kbps. This way GSM can carry 4 times (floor[64 kbps / 13 kbps]) as many phone calls as
land-line PCM coding in the same bit rate. See Figure 1 for a representation of RPE-LPC.
Figure 1- A block diagram detailing how an analog voice is digitized and encoded to produce a digital voice signal.
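The short-term prediction idea (predict each sample from a linear combination of earlier samples and keep only the coefficients and the residual) can be sketched in plain Python. This is not the GSM codec itself: it omits the long-term predictor, the regular pulse excitation, and all quantization. The predictor order of 8, the synthetic test frame, and all function names are assumptions made for illustration.

```python
import math
import random

ORDER = 8  # assumed predictor order for this sketch

def autocorrelation(x, max_lag):
    """Autocorrelation of the frame for lags 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x))) for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return coefficients a[0..order-1] so that x_hat[n] = sum(a[i] * x[n-1-i])."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k  # remaining prediction error energy
    return a[1:]

def residual(x, coeffs):
    """Difference between each sample and its prediction from earlier samples."""
    p = len(coeffs)
    return [x[n] - sum(coeffs[i] * x[n - 1 - i] for i in range(min(p, n)))
            for n in range(len(x))]

def energy(x):
    return sum(v * v for v in x)

# Synthetic 20 ms frame at 8 kHz: two tones plus a little noise (illustrative only).
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 200 * n / 8000)
         + 0.5 * math.sin(2 * math.pi * 450 * n / 8000)
         + 0.05 * rng.uniform(-1.0, 1.0) for n in range(160)]

coeffs = levinson_durbin(autocorrelation(frame, ORDER), ORDER)
res = residual(frame, coeffs)
print(f"prediction gain: {10 * math.log10(energy(frame) / energy(res)):.1f} dB")
```

The residual carries less energy than the original frame, which is why it can be encoded with fewer bits than the raw samples.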
2 Speech coding
GSM is a digital system, so speech signals, which are inherently analog, have to be digitized. The method
employed by ISDN, and by current telephone systems for multiplexing voice lines over high-speed trunks and
optical fiber lines, is Pulse Code Modulation (PCM). The output stream from PCM is 64 kbps, too high a rate to
be feasible over a radio link. Although PCM is simple to implement, the 64 kbps signal contains much
redundancy. The GSM group studied several voice coding algorithms on the basis of subjective speech
quality and complexity (which is related to cost, processing delay, and power consumption once
implemented) before arriving at the choice of a Regular Pulse Excited - Linear Predictive Coder (RPE-LPC)
with a Long Term Predictor loop. Basically, information from previous samples, which does not change
very quickly, is used to predict the current sample. The coefficients of the linear combination of the
previous samples, plus an encoded form of the residual (the difference between the predicted and actual
sample), represent the signal.
