CHAPTER I
INTRODUCTION
1.1 Introduction to LDPC
Due to their near-Shannon-limit performance and inherently parallelizable
decoding scheme, low-density parity-check (LDPC) codes have been extensively
investigated in research and practical applications. Recently, LDPC codes have been
adopted in many industrial standards for next-generation communication systems
such as DVB-S2, WLAN (IEEE 802.11n), WiMAX (IEEE 802.16e), and 10GBase-T
Ethernet (IEEE 802.3an). For high-throughput applications, the decoding parallelism
is usually very high. Hence, a complex interconnect network is required, which
consumes a significant amount of silicon area and power.
A message broadcasting technique has been proposed to reduce the routing
congestion in a fully parallel LDPC decoder. Because all check nodes and variable
nodes are directly mapped to hardware, the implementation cost of such a decoder is
very high. Other reported decoders are targeted to specific LDPC codes which have
very simple interconnections between check nodes and variable nodes; the constraints
imposed on the H matrix structure to reduce the routing complexity unavoidably limit
the performance of the LDPC codes. These decoders are based on the two-phase
message-passing (TPMP) decoding scheme.
Recently, the layered decoding approach has been of great interest in LDPC
decoder design because it converges much faster than the TPMP decoding approach.
A previously reported 4.6 Gb/s LDPC decoder adopted the layered decoding
approach; however, it is only suited for array LDPC codes, which can be viewed as a
sub-class of LDPC codes. It should be noted that a shuffled iterative decoding
algorithm based on vertical partitioning of the parity-check matrix can also speed up
LDPC decoding in principle.
In practice, LDPC codes have attracted considerable attention due to their
excellent error-correction performance and the regularity of their parity-check
matrices, which is well suited for VLSI implementation. In this paper, we present a
high-throughput, low-cost layered decoding architecture for generic QC-LDPC codes.
A row permutation approach is proposed to significantly reduce the
implementation complexity of the shuffle network in the LDPC decoder. An
approximate layered decoding approach is explored to increase the clock speed and
hence the decoding throughput. An efficient implementation technique based on the
Min-Sum algorithm is employed to minimize the hardware complexity. The
computation core is further optimized to reduce the computation delay.
Low-density parity-check (LDPC) codes were invented by R. G. Gallager
(Gallager 1962; Gallager 1963) in 1962. He discovered an iterative decoding
algorithm which he applied to a new class of codes. He named these codes low-
density parity-check (LDPC) codes, since the parity-check matrices had to be sparse
to perform well. Yet LDPC codes were ignored for a long time, due mainly to the
high computational complexity they require when very long codes are considered. In
1993, C. Berrou et al. invented turbo codes (Berrou, Glavieux, and Thitimajshima
1993) and their associated iterative decoding algorithm. The remarkable performance
observed with turbo codes raised many questions and much interest in iterative
techniques. In 1995, D. J. C. MacKay and R. M. Neal (MacKay and Neal
1995; MacKay and Neal 1996; MacKay 1999) rediscovered LDPC codes and
established a link between their iterative algorithm and Pearl's belief propagation
algorithm (Pearl 1988) from the artificial intelligence community (Bayesian
networks). At the same time, M. Sipser and D. A. Spielman (Sipser and Spielman
1996) used the first decoding algorithm of R. G. Gallager (algorithm A) to decode
expander codes.
1.2 Objectives:
The objective of this project is to develop a high-throughput decoder
architecture for generic quasi-cyclic low-density parity-check (QC-LDPC) codes.
Various optimizations are employed to increase the clock speed. A row permutation
scheme is proposed to significantly simplify the implementation of the shuffle
network in the LDPC decoder. An approximate layered decoding approach is
explored to reduce the critical path of the layered LDPC decoder. The computation
core is further optimized to reduce the computation delay. It is estimated that a
4.7 Gb/s decoding throughput can be achieved at 15 iterations using current
technology.
Low-density parity-check (LDPC) codes, which offer capacity-approaching
performance, were first invented by Gallager in 1962 and rediscovered by MacKay in
1996 as a class of linear block codes. LDPC codes show excellent error-correction
capability even for low signal-to-noise-ratio applications. Moreover, the inner
independence of the parity-check matrix enables parallel decoding and thus makes
high-speed LDPC decoders possible. Hence, LDPC codes have been adopted in many
recent wire-line and wireless communication standards such as IEEE 802.11n,
DVB-S2, and IEEE 802.16e (WiMAX).

LDPC codes can be effectively decoded by the standard belief propagation (BP)
algorithm, which is also called the sum-product algorithm (SPA). The min-sum
algorithm (MSA) was later introduced to reduce the computational complexity of the
check-node processing in the SPA, which makes the algorithm suitable for VLSI
implementation. VLSI implementation of LDPC decoders has attracted attention from
researchers in the past few years, covering both fully parallel and partly parallel
architectures. A fully parallel architecture directly maps the standard BP algorithm
into hardware by instantiating the connections between check nodes and variable
nodes. However, the interconnections become more complex as the block length
increases, which leads to large chip area and power consumption. A partly parallel
architecture can effectively balance hardware complexity and system throughput by
employing architecture-aware LDPC (AA-LDPC) codes that have a regularly
constructed parity-check matrix. However, the decoder complexity is still a great
challenge for LDPC codes that have an irregular parity-check matrix, such as the
codes used in the IEEE 802.16e standard for WiMAX systems. The decoding
throughput for irregular LDPC codes decreases because the irregular parity-check
matrix destroys the inherent parallelism of partly parallel decoding architectures.

The layered decoding algorithm (LDA), based on either horizontal or vertical
partitioning, uses the newest data from the current iteration rather than data from the
previous iteration and can thus double the convergence speed. Conventional LDA
processes messages serially, from the first layer to the last, leading to limited
decoding throughput. Grouped layered decoding can improve the throughput but
requires more hardware resources. In this paper, we introduce a new parallel layered
decoding architecture (PLDA) to enable different layers to operate concurrently.
Precisely scheduled message-passing paths among the layers guarantee that newly
calculated messages are delivered to their designated locations before they are used
by the next layer. By adding offsets to the permutation values of the submatrices in
the base parity-check matrix, the time intervals among different layers become large
enough for message passing. With the PLDA, the decoding latency per iteration can
be reduced greatly, and hence the decoding throughput is improved.

The remainder of this paper is organized as follows. Section II introduces the code
structure used in WiMAX, the MSA, and the LDA. The corresponding hardware
implementation of the PLDA and the message-passing network are presented in
Section III. Section IV shows experimental results for the proposed decoder,
including FPGA implementation results, ASIC implementation results, and
comparisons with existing WiMAX LDPC decoders.
CHAPTER II
2.1 Turbo codes
Turbo Coding is an iterated soft-decoding scheme that combines two or more
relatively simple convolutional codes and an interleaver to produce a block code that
can perform to within a fraction of a decibel of the Shannon limit. Predating LDPC
codes in terms of practical application, they now provide similar performance.
One of the earliest commercial applications of turbo coding was the
CDMA2000 1x (TIA IS-2000) digital cellular technology developed by Qualcomm
and sold by Verizon Wireless, Sprint, and other carriers. It is also used for the
evolution of CDMA2000 1x specifically for Internet access, 1xEV-DO (TIA IS-856).
Like 1x, EV-DO was developed by Qualcomm, and is sold by Verizon Wireless,
Sprint, and other carriers (Verizon's marketing name for 1xEV-DO is Broadband
Access, Sprint's consumer and business marketing names for 1xEV-DO are Power
Vision and Mobile Broadband, respectively.).
2.1.1 Characteristics of Turbo Codes
1) Turbo codes have extraordinary performance at low SNR.
a) Very close to the Shannon limit.
b) Due to a low multiplicity of low weight code words.
2) However, turbo codes have a BER floor.
- This is due to their low minimum distance.
3) Performance improves for larger block sizes.
a) Larger block sizes mean more latency (delay).
b) However, larger block sizes are not more complex to decode.
c) The BER floor is lower for larger frame/interleaver sizes.
4) The complexity of a constraint-length K_TC turbo code is about the same as that
of a constraint-length K_CC convolutional code, where:
K_CC ≈ 2 + K_TC + log2(number of decoder iterations)
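As a quick illustration of this rule of thumb, the following sketch (the function name is ours) evaluates the relation for a hypothetical K_TC = 4 turbo code decoded with 8 iterations:

```python
import math

def equivalent_cc_constraint_length(k_tc: int, iterations: int) -> float:
    """Rule of thumb from the relation above:
    K_CC ~ 2 + K_TC + log2(number of decoder iterations)."""
    return 2 + k_tc + math.log2(iterations)

# A K_TC = 4 turbo code with 8 decoder iterations is roughly as complex
# as a K_CC = 2 + 4 + 3 = 9 convolutional code.
print(equivalent_cc_constraint_length(4, 8))  # -> 9.0
```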
2.2 Performance of Error Correcting Codes
The performance of error-correcting codes is compared by referring to the
gap to the Shannon limit, as mentioned in Section 1.1. This section aims at defining
exactly what the Shannon limit is, and what exactly is measured when the gap to the
Shannon bound is referred to. It is important to know exactly what is measured, since
many near-Shannon-limit codes have now been discovered. The results hereafter are
classical in information theory and may be found in many references; the first part is
inspired by the work of (Schlegel 1997).

Forward Error Correction (FEC) is an important feature of most modern
communication systems, including wired and wireless systems. Communication
systems use a variety of FEC coding techniques to permit correction of bit errors in
transmitted symbols.
2.2.1 Forward error correction
In telecommunication and information theory, forward error correction (FEC)
(also called channel coding) is a system of error control for data transmission,
whereby the sender adds (carefully selected) redundant data to its messages, also
known as an error-correcting code. This allows the receiver to detect and correct
errors (within some bound) without the need to ask the sender for additional data. The
advantages of forward error correction are that a back-channel is not required and
retransmission of data can often be avoided (at the cost of higher bandwidth
requirements, on average). FEC is therefore applied in situations where
retransmissions are relatively costly or impossible. In particular, FEC information is
usually added to most mass storage devices to protect against damage to the stored
data.
FEC processing often occurs in the early stages of digital processing after a
signal is first received. That is, FEC circuits are often an integral part of the
analog-to-digital conversion process, also involving digital modulation and
demodulation, or line coding and decoding. Many FEC coders can also generate a
bit-error rate (BER) signal which can be used as feedback to fine-tune the analog
receiving electronics. Soft-decision algorithms, such as the Viterbi decoder, can take
(quasi-)analog data in and generate digital data on output.
The maximum fraction of errors that can be corrected is determined in
advance by the design of the code, so different forward error correcting codes are
suitable for different conditions.
How it works:
FEC is accomplished by adding redundancy to the transmitted information
using a predetermined algorithm. Each redundant bit is invariably a complex function
of many original information bits. The original information may or may not appear in
the encoded output; codes that include the unmodified input in the output are
systematic, while those that do not are nonsystematic.
An extremely simple example would be an analog-to-digital converter that
samples three bits of signal strength data for every bit of transmitted data. If the three
samples are mostly zero, the transmitted bit was probably a zero, and if three samples
are mostly one, the transmitted bit was probably a one. The simplest example of error
correction is for the receiver to assume the correct output is given by the most
frequently occurring value in each group of three.
Triplet received Interpreted as
000 0
001 0
010 0
100 0
111 1
110 1
101 1
011 1
This allows an error in any one of the three samples to be corrected by
"democratic voting". This is a highly inefficient FEC, but it does illustrate the
principle. In practice, FEC codes typically examine the last several dozen, or even the
last several hundred, previously received bits to determine how to decode the current
small handful of bits (typically in groups of 2 to 8 bits).
Such triple modular redundancy, the simplest form of forward error correction,
is widely used.
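A minimal sketch of this "democratic voting" rule (the helper name is ours), which reproduces the interpretation table above:

```python
def tmr_decode(triplet: str) -> int:
    """Majority vote over the three received samples of one transmitted bit."""
    return 1 if triplet.count("1") >= 2 else 0

# Reproduces the table: any single corrupted sample is voted away.
for t in ["000", "001", "010", "100", "111", "110", "101", "011"]:
    print(t, "->", tmr_decode(t))
```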
Averaging noise to reduce errors:
FEC could be said to work by "averaging noise"; since each data bit affects
many transmitted symbols, the corruption of some symbols by noise usually allows
the original user data to be extracted from the other, uncorrupted received symbols
that also depend on the same user data.
Because of this "risk-pooling" effect, digital communication systems that use
FEC tend to work well above a certain minimum signal-to-noise ratio and not
at all below it.
This all-or-nothing tendency (the cliff effect) becomes more pronounced as
stronger codes are used that more closely approach the theoretical Shannon limit.
Interleaving FEC-coded data can reduce the all-or-nothing properties of
transmitted FEC codes. However, this method has limits; it is best used on
narrowband data.
Most telecommunication systems use a fixed channel code designed to tolerate
the expected worst-case bit error rate, and then fail to work at all if the bit error rate is
ever worse. However, some systems adapt to the given channel error conditions:
hybrid automatic repeat-request uses a fixed FEC method as long as the FEC can
handle the error rate, then switches to ARQ when the error rate gets too high; adaptive
modulation and coding uses a variety of FEC rates, adding more error-correction bits
per packet when there are higher error rates in the channel, or taking them out when
they are not needed.
Types of FEC:
The two main categories of FEC codes are block codes and convolutional
codes.
Block codes work on fixed-size blocks (packets) of bits or symbols of
predetermined size. Practical block codes can generally be decoded in
polynomial time in their block length.
Convolutional codes work on bit or symbol streams of arbitrary length. They
are most often decoded with the Viterbi algorithm, though other algorithms are
sometimes used. Viterbi decoding allows asymptotically optimal decoding
efficiency with increasing constraint length of the convolutional code, but at
the expense of exponentially increasing complexity. A convolutional code can
be turned into a block code, if desired.
There are many types of block codes, but among the classical ones the most
notable is Reed-Solomon coding because of its widespread use on the Compact disc,
the DVD, and in hard disk drives. Golay, BCH, Multidimensional parity, and
Hamming codes are other examples of classical block codes.
Hamming ECC is commonly used to correct NAND flash memory errors. This
provides single-bit error correction and 2-bit error detection. Hamming codes are only
suitable for more reliable single-level cell (SLC) NAND; denser multi-level cell
(MLC) NAND requires stronger multi-bit correcting ECC such as BCH or
Reed-Solomon codes.

Classical block codes are usually implemented using hard-decision
algorithms, which means that for every input and output signal a hard decision is
made whether it corresponds to a one or a zero bit. In contrast, soft-decision
algorithms like the Viterbi decoder process (discretized) analog signals, which allows
for much higher error-correction performance than hard-decision decoding. Nearly all
classical block codes apply the algebraic properties of finite fields.
Concatenated FEC codes for improved performance:
Classical (algebraic) block codes and convolutional codes are frequently
combined in concatenated coding schemes in which a short constraint-length Viterbi-
decoded convolutional code does most of the work and a block code (usually Reed-
Solomon) with larger symbol size and block length "mops up" any errors made by the
convolutional decoder.
Concatenated codes have been standard practice in satellite and deep-space
communications since Voyager 2 first used the technique in its 1986 encounter with
Uranus.
Low-density parity-check (LDPC):
Low-density parity-check (LDPC) codes are a class of recently rediscovered,
highly efficient linear block codes. They can provide performance very close to the
channel capacity (the theoretical maximum) using an iterated soft-decision decoding
approach, at linear time complexity in terms of their block length. Practical
implementations can draw heavily from the use of parallelism.
LDPC codes were first introduced by Robert G. Gallager in his PhD thesis in
1960, but due to the computational effort of implementing encoders and decoders and
the introduction of Reed-Solomon codes, they were mostly ignored until recently.
LDPC codes are now used in many recent high-speed communication
standards, such as DVB-S2 (digital video broadcasting), WiMAX (IEEE 802.16e
standard for microwave communications), high-speed wireless LAN (IEEE
802.11n), 10GBase-T Ethernet (802.3an) and G.hn/G.9960 (ITU-T standard for
networking over power lines, phone lines and coaxial cable).
Channel Capacity:
Stated by Claude Shannon in 1948, the theorem describes the maximum
possible efficiency of error-correcting methods versus levels of noise interference and
data corruption. The theory doesn't describe how to construct the error-correcting
method, it only tells us how good the best possible method can be. Shannon's theorem
has wide-ranging applications in both communications and data storage. This theorem
is of foundational importance to the modern field of information theory. Shannon only
gave an outline of the proof. The first rigorous proof is due to Amiel Feinstein in
1954.
The Shannon theorem states that given a noisy channel with channel capacity
C and information transmitted at a rate R, then if R < C there exist codes that allow
the probability of error at the receiver to be made arbitrarily small. This means that,
theoretically, it is possible to transmit information nearly without error at any rate
below a limiting rate, C.
The converse is also important. If R > C, an arbitrarily small probability of
error is not achievable. All codes will have a probability of error greater than a certain
positive minimal level, and this level increases as the rate increases. So, information
cannot be guaranteed to be transmitted reliably across a channel at rates beyond the
channel capacity. The theorem does not address the rare situation in which rate and
capacity are equal.
Simple schemes such as "send the message 3 times and use a best 2 out of 3
voting scheme if the copies differ" are inefficient error-correction methods, unable to
asymptotically guarantee that a block of data can be communicated free of error.
Advanced techniques such as ReedSolomon codes and, more recently, turbo codes
come much closer to reaching the theoretical Shannon limit, but at a cost of high
computational complexity. Using low-density parity-check (LDPC) codes or turbo
codes and with the computing power in today's digital signal processors, it is now
possible to reach very close to the Shannon limit. In fact, it was shown that LDPC
codes can reach within 0.0045 dB of the Shannon limit (for very long block lengths).
Mathematical statement:
Theorem (Shannon, 1948):

1. For every discrete memoryless channel, the channel capacity

   C = max_{p_X} I(X; Y)

   has the following property: for any ε > 0 and R < C, for large enough N, there
   exists a code of length N and rate ≥ R and a decoding algorithm, such that the
   maximal probability of block error is ≤ ε.

2. If a probability of bit error p_b is acceptable, rates up to R(p_b) are achievable,
   where

   R(p_b) = C / (1 − H_2(p_b))

   and H_2(p_b) is the binary entropy function

   H_2(p_b) = −p_b log2(p_b) − (1 − p_b) log2(1 − p_b)

3. For any p_b, rates greater than R(p_b) are not achievable.

(MacKay (2003), p. 162; cf. Gallager (1968), ch. 5; Cover and Thomas (1991),
p. 198; Shannon (1948), thm. 11)
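To make the statement concrete, a small sketch (the function names are ours) that evaluates H_2 and the achievable rate R(p_b) for an assumed capacity value:

```python
import math

def binary_entropy(p: float) -> float:
    """H2(p) = -p*log2(p) - (1-p)*log2(1-p), with H2(0) = H2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def achievable_rate(capacity: float, p_b: float) -> float:
    """R(p_b) = C / (1 - H2(p_b)): the highest rate achievable when a
    residual bit-error probability p_b is acceptable."""
    return capacity / (1.0 - binary_entropy(p_b))

# Example: a channel with capacity C = 0.5 bit/use; tolerating p_b = 1e-3
# raises the achievable rate slightly above C.
print(achievable_rate(0.5, 1e-3))  # ~0.5058
```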
Error Correction in Communication Systems:
Error correction is widely used in most communication systems. The encoder
adds redundancy to the binary information; the channel adds noise to the encoded
information; and the decoder performs error detection and correction on the noisy
information to recover the corrected information.

Figure 1: Error correction in communication systems (encoder adds redundancy;
decoder performs error detection and correction).
2.3 Row Permutation of Parity Check Matrix of LDPC Codes
The parity-check matrix of a QC-LDPC code is an array of circulant
submatrices. To achieve very high decoding throughput, an array of cyclic shifters is
needed to shuffle the soft messages corresponding to multiple submatrices for check
nodes and variable nodes. In order to reduce the VLSI implementation complexity of
the shuffle network, the shifting structure of the circulant matrices is extensively
exploited. Suppose the parity-check matrix H of an LDPC code is a J × C array of
P × P circulant submatrices. With row permutation, it can be converted to the form
shown in Fig. 3.
Figure 2: Array of circulant submatrices
Figure 3: Permuted Matrix
Here each P × P permutation matrix represents a single left or right cyclic
shift, so each submatrix can be obtained by cyclically shifting the previous submatrix
by a single step. A_i is a J × P matrix determined by the shift offsets of the
circulant submatrices in block column i (i = 1, 2, ..., C), and m is an integer such
that P is divisible by m.
For example, the matrix H_a shown in Fig. 2 is a 2 × 3 array of 8 × 8
cyclically shifted identity submatrices. With the row permutation described in the
following, a new matrix H_b can be obtained, which has the form shown in Fig. 3.
First, the first four rows of the first block row of H_a are distributed to the four block
rows of H_b in a round-robin fashion (i.e., rows 1-4 of H_a are distributed to rows
1, 5, 9, and 13 of H_b). Then the second four rows are distributed in the same way.
The permutation is continued until all rows in the first block row of matrix H_a are
moved to matrix H_b. Then the second block row of H_a is distributed in the same
way. It can be seen that H_b has the form shown in Fig. 3. In the previous example,
the row distribution is started from the first row of each block row. In general, the
distribution can be started from any row of a block row. To minimize the data
dependency between two adjacent block rows, an optimum row distribution scheme
is desired.
For an LDPC decoder which can process all messages corresponding to
the 1-components in an entire block row of the permuted matrix H_p (e.g., H_b in
Fig. 3), the shuffle network for LDPC decoding can be implemented with very simple
data shifters.
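A minimal sketch of this round-robin row redistribution (the function name and the dense 0/1 storage of H are our illustration choices):

```python
import numpy as np

def permute_rows(H: np.ndarray, m: int) -> np.ndarray:
    """Round-robin row redistribution described above (a sketch).

    Row r of H (0-indexed) is moved to row (r % m) * (T // m) + (r // m)
    of H_b, where T is the total number of rows: consecutive rows are
    dealt out to the m block rows of the permuted matrix."""
    T = H.shape[0]
    assert T % m == 0
    r = np.arange(T)
    dest = (r % m) * (T // m) + r // m
    Hb = np.empty_like(H)
    Hb[dest] = H          # row r of H lands at row dest[r] of H_b
    return Hb

# Example of Fig. 2: a 2 x 3 array of 8 x 8 circulants (16 x 24) with m = 4;
# rows 1-4 of H_a land in rows 1, 5, 9, 13 of H_b (1-indexed).
Ha = np.random.randint(0, 2, size=(16, 24))
Hb = permute_rows(Ha, m=4)
```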
Min-Sum decoding:
The Min-Sum algorithm iterates two processing phases, starting from the
received channel values (the initial values γ_j, where γ_j is the received information
for code bit j):

Initial value (received information from channel) → Row processing →
Column processing → Error correction → Parity check → repeat until the parity
check is satisfied or the iteration limit is reached.

Row (check node) processing: each check-to-variable message takes the product of
the signs and the minimum of the magnitudes of the incoming variable-to-check
messages, excluding the target variable node:

L_ij = ( ∏_{j'∈N(i)\j} sign(z_ij') ) × min_{j'∈N(i)\j} |z_ij'|

Column (variable node) processing: each variable-to-check message is the channel
value plus the sum of the incoming check-to-variable messages, excluding the target
check node:

z_ij = γ_j + Σ_{i'∈M(j)\i} L_i'j

Error correction (hard decision): with the a posteriori value
y_j = γ_j + Σ_{i∈M(j)} L_ij, decide

v_j = 1 if y_j < 0
v_j = 0 if y_j ≥ 0

Parity check: if H·v^T = 0, stop decoding; otherwise repeat decoding.

These steps are illustrated with the 6 × 9 parity-check matrix H given in the example
below.
LDPC Codes:
An LDPC code is defined by a binary matrix H called the parity-check matrix.
- Rows define parity-check equations (constraints) between the encoded symbols of
a codeword; the columns correspond to the code symbols, so the number of columns
is the length of the code.
- v is a valid codeword if H·v^T = 0.
- The decoder in the receiver checks whether the condition H·v^T = 0 holds.

Example: parity-check matrix for a (9, 5) LDPC code, row weight = 3, column
weight = 2:

H =
[ 1 0 0 0 0 1 0 1 0 ]
[ 0 1 0 1 0 0 0 0 1 ]
[ 0 0 1 0 1 0 1 0 0 ]
[ 0 0 1 1 0 0 0 1 0 ]
[ 1 0 0 0 1 0 0 0 1 ]
[ 0 1 0 0 0 1 1 0 0 ]

For a hard-decision vector v = (v_1, ..., v_9), the decoder evaluates the syndrome
H·v^T:
H·v^T ≠ 0 (there is an error)
H·v^T = 0 (there is no error)
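To make the procedure concrete, here is a compact sketch of Min-Sum decoding over this example matrix (a minimal illustration: the function name, the LLR sign convention, with positive meaning bit 0, and the test vector are ours):

```python
import numpy as np

# The 6 x 9 example parity-check matrix from the text.
H = np.array([[1,0,0,0,0,1,0,1,0],
              [0,1,0,1,0,0,0,0,1],
              [0,0,1,0,1,0,1,0,0],
              [0,0,1,1,0,0,0,1,0],
              [1,0,0,0,1,0,0,0,1],
              [0,1,0,0,0,1,1,0,0]])

def min_sum_decode(H, gamma, max_iter=15):
    """Min-Sum decoding: gamma holds the channel LLRs (one per code bit);
    positive LLR means bit 0 is more likely. Returns (codeword, success)."""
    M, N = H.shape
    L = np.zeros((M, N))                 # check-to-variable messages
    for _ in range(max_iter):
        # Column processing: z_ij = gamma_j + sum of L_i'j over i' != i
        total = gamma + L.sum(axis=0)    # a posteriori value per bit
        Z = np.where(H == 1, total - L, 0.0)
        # Row processing: sign product and the two smallest magnitudes
        for i in range(M):
            idx = np.flatnonzero(H[i])
            z = Z[i, idx]
            signs = np.sign(z); signs[signs == 0] = 1.0
            sign_prod = signs.prod()
            mags = np.abs(z)
            k = mags.argsort()
            min1, min2 = mags[k[0]], mags[k[1]]
            for t, j in enumerate(idx):
                mag = min2 if j == idx[k[0]] else min1
                L[i, j] = sign_prod * signs[t] * mag
        # Error correction (hard decision) and parity check
        y = gamma + L.sum(axis=0)
        v = (y < 0).astype(int)          # v_j = 1 if y_j < 0 else 0
        if not (H @ v % 2).any():        # H v^T = 0 -> stop decoding
            return v, True
    return v, False

# All-zero codeword over BPSK: positive LLRs, with one corrupted bit.
gamma = np.full(9, 2.0); gamma[4] = -1.0
v, ok = min_sum_decode(H, gamma)
print(v, ok)   # expected: all-zero codeword recovered, ok == True
```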
CHAPTER III
CHANNEL CODING
This chapter introduces the channel code decoding issue and the problem
of optimal code decoding in the case of linear block codes. First, the main notations
used in the thesis are presented, especially those related to the graph representation
of linear block codes. Then optimal decoding is discussed: it is shown that under the
cycle-free hypothesis, optimal decoding can be performed using an iterative
algorithm. Finally, the performance of error-correcting codes is also discussed.
3.1 Optimal decoding:
Communication over noisy channels can be improved by the use of a channel
code C, as demonstrated by C. E. Shannon in his famous channel coding theorem:
Let a discrete channel have the capacity C and a discrete source the entropy per
second H. If H ≤ C there exists a coding system such that the output of the source can
be transmitted over the channel with an arbitrarily small frequency of errors. If H > C
it is possible to encode the source so that the equivocation is less than H − C + ε.
This theorem states that below a maximum rate R, which is equal to the capacity of
the channel, it is possible to find error-correcting codes achieving any given
probability of error. Since the theorem does not explain how to construct such codes,
it was the kick-off for a lot of activity in the coding theory community. When
Shannon announced his theory in the July and October issues of the Bell System
Technical Journal in 1948, the largest communications cable in operation at that time
carried 1800 voice conversations. Twenty-five years later, the highest-capacity cable
was carrying 230,000 simultaneous conversations. Today a single optical fiber as thin
as a human hair can carry more than 6.4 million conversations. In the quest for
capacity-achieving codes, the performance of a code is measured by its gap to
capacity. For a given code, the smallest gap is obtained by an optimal decoder: the
maximum a posteriori (MAP) decoder. Before dealing with optimal decoding, some
notation and a model of the communication scheme are presented hereafter.
Figure 4: Basic scheme for channel code encoding/decoding.
3.1.1 Communication model
Figure 4 depicts a classical communication scheme. The source block delivers
information by means of sequences which are row vectors x of length K. The
encoder block delivers the codeword c of length N, which is the coded version of x.
The code rate is defined by the ratio R = K/N. The codeword c is sent over the
channel, and the vector y is the received word: a distorted version of c. The matched
filters, the modulator and the demodulator, and the synchronization are supposed to
work perfectly. Hence, the channel is represented by a discrete-time equivalent
model. The channel is a non-deterministic mapper between its input c and its output
y. We assume that y depends on c via a conditional probability density function (pdf)
p(y|c). We assume also that the channel is memoryless:

p(y|c) = ∏_{n=1}^{N} p(y_n|c_n)

For example, if the channel is the binary-input additive white Gaussian noise
(BI-AWGN) channel and the modulation is binary phase-shift keying (BPSK), each
code bit c_n is mapped to a symbol x_n = 1 − 2c_n, and the received sample is
y_n = x_n + w_n, where w_n is white Gaussian noise of variance σ².
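As a small illustration of this channel model (σ and the variable names are ours), the channel log-likelihood ratios used by iterative decoders can be computed as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.8                       # noise standard deviation (assumed)

# BPSK over the BI-AWGN channel: bit c=0 -> +1, c=1 -> -1.
c = rng.integers(0, 2, size=8)    # code bits
x = 1 - 2 * c                     # BPSK symbols
y = x + sigma * rng.normal(size=x.size)   # received word

# Channel LLRs: log p(y|c=0)/p(y|c=1) = 2*y / sigma^2 for the BI-AWGN channel.
llr = 2 * y / sigma**2
print(c, np.round(llr, 2))        # positive LLR favours c = 0
```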
In Figure 4, two types of decoders are depicted: decoders of type 1 compute
the best estimate x̂ of the source word x; decoders of type 2 compute the best
estimate ĉ of the sent codeword c. In the latter case, x̂ is extracted from ĉ by a
post-processing step (the reverse of the encoding) when the code is non-systematic.
Both decoders can perform two types of decoding.
3.2 Classes of LDPC codes:
R. Gallager defined an (N, j, k) LDPC code as a block code of length N
having a small fixed number j of ones in each column of the parity-check matrix H,
and a small fixed number k of ones in each row of H. This class of codes is decoded
by the iterative algorithm described in Chapter I. This algorithm computes exact a
posteriori probabilities, provided that the Tanner graph of the code is cycle-free.
Generally, LDPC codes do have cycles. The sparseness of the parity-check matrix
aims at reducing the number of cycles and at increasing the size of the cycles.
Moreover, as the length N of the code increases, the cycle-free hypothesis becomes
more and more realistic. The iterative algorithm is processed on these graphs;
although it is not optimal, it performs quite well. Since then, the class of LDPC codes
has been enlarged to all sparse parity-check matrices, thus creating a very wide class
of codes, including the extension to codes over GF(q) and irregular LDPC codes.
Irregularity:
In Gallager's original LDPC code design, there is a fixed number of ones
in both the rows (k) and the columns (j) of the parity-check matrix: each bit is
involved in j parity-check constraints, and each parity-check constraint is the
exclusive-OR (XOR) of k bits. This class of codes is referred to as regular LDPC
codes. On the contrary, irregular LDPC codes do not have a constant number of
non-zero entries in the rows or in the columns of H. They are specified by the degree
distribution λ(x) of the bits and ρ(x) of the parity-check constraints, where λ_i
denotes the proportion of columns of H having weight i and, similarly, ρ_i denotes
the proportion of rows having weight i.
Code rate:
The rate R of an LDPC code is lower-bounded by the design code rate R_d,
and R_d = R if the parity-check matrix has full rank. It has been shown that as N
increases, the parity-check matrix is almost surely full rank. Hereafter, we will
assume that R = R_d unless the contrary is mentioned. The rate is then linked to the
other parameters of the class: for a regular (N, j, k) code, R_d = 1 − j/k. Note that,
for random constructions, the rank behaviour differs slightly depending on whether
j is odd or even.
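To make the distinction between R and R_d concrete, the following sketch (gf2_rank is our helper) computes both for the 6 × 9 example matrix of Chapter II; its six rows sum to zero over GF(2) (every column has weight 2), so H is rank-deficient and the true rate exceeds the design rate:

```python
import numpy as np

def gf2_rank(H: np.ndarray) -> int:
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    A = H.copy() % 2
    rank, rows, cols = 0, A.shape[0], A.shape[1]
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if A[r, c]), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]          # move pivot row up
        for r in range(rows):
            if r != rank and A[r, c]:
                A[r] ^= A[rank]                      # eliminate column c
        rank += 1
    return rank

# The (j=2, k=3) example matrix from Chapter II.
H = np.array([[1,0,0,0,0,1,0,1,0],
              [0,1,0,1,0,0,0,0,1],
              [0,0,1,0,1,0,1,0,0],
              [0,0,1,1,0,0,0,1,0],
              [1,0,0,0,1,0,0,0,1],
              [0,1,0,0,0,1,1,0,0]])
N = H.shape[1]
print("design rate:", 1 - 2/3)                # R_d = 1 - j/k = 1/3
print("true rate:  ", (N - gf2_rank(H)) / N)  # (9 - 5)/9 ~ 0.444
```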
3.2.1 Optimization of LDPC codes:
The bounds and performance of LDPC codes are derived from their parameter
set. The large number of independent parameters makes it possible to tune them so
as to fit some external constraint, such as a particular channel. Two algorithms can
be used to design a class of irregular LDPC codes under channel constraints: the
density evolution algorithm and extrinsic information transfer (EXIT) charts.
Density evolution algorithm:
Richardson designed capacity-approaching irregular codes with the density
evolution (DE) algorithm. This algorithm tracks the probability density function (pdf)
of the messages through the graph nodes under the assumption that the cycle-free
hypothesis is verified. It is a kind of belief propagation algorithm with pdf messages
instead of log-likelihood-ratio messages. Density evolution is processed on the
asymptotic performance of the class of LDPC codes: an infinite number of iterations
is processed on an infinite-length LDPC code. If the length of the code tends to
infinity, the probability that a randomly chosen node belongs to a cycle of a given
length tends towards zero.

Usually, either the channel threshold or the code rate is optimized under the
constraints of the degree distributions and of the SNR. The threshold of the channel
is the value of the channel parameter above which the error probability tends towards
zero as the number of iterations (and the code length) tends to infinity. The
optimization tries to lower the threshold or to raise the rate as much as possible. For
example, rate-1/2 irregular LDPC codes have been designed for binary-input AWGN
channels that approach the Shannon limit very closely (up to 0.0045 dB).
Optimizations based on the DE algorithm are often processed by means of a
differential evolution algorithm when they are non-linear, for example to optimize an
irregular LDPC code for uncorrelated flat Rayleigh fading channels. A Gaussian
approximation of the DE algorithm can also be used: the probability density functions
of the messages are assumed to be Gaussian, and the only parameter that has to be
tracked in the nodes is the mean.
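Under the Gaussian approximation, tracking the mean is easy to do numerically. The sketch below is a simplified illustration, not the full DE: it adopts Chung et al.'s widely used curve-fit of the φ function (an assumption on our part) and estimates the threshold of the (3, 6)-regular code on the BI-AWGN channel:

```python
import numpy as np

def phi(x):
    """Curve-fit approximation of phi(x) = 1 - E[tanh(u/2)], u ~ N(x, 2x)."""
    x = np.asarray(x, dtype=float)
    small = np.exp(-0.4527 * np.power(np.maximum(x, 1e-12), 0.86) + 0.0218)
    safe = np.maximum(x, 1e-12)
    large = np.sqrt(np.pi / safe) * np.exp(-safe / 4) * (1 - 10 / (7 * np.maximum(x, 1)))
    return np.where(x < 10, small, large)

def phi_inv(y, lo=1e-10, hi=1e4):
    """Invert the (decreasing) function phi by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def converges(sigma, j=3, k=6, iters=200):
    """Gaussian-approximation DE for a regular (j, k) code on the BI-AWGN
    channel: track the mean LLR and decide whether it diverges to infinity
    (vanishing error probability)."""
    m0 = 2.0 / sigma**2                  # mean of the channel LLR
    m_v = m0
    for _ in range(iters):
        m_c = phi_inv(1.0 - (1.0 - phi(m_v)) ** (k - 1))   # check update
        m_v = m0 + (j - 1) * m_c                           # variable update
        if m_v > 1e3:
            return True
    return False

# Bisection on sigma: should land near 0.87 for the (3,6) code
# (published exact-DE thresholds are about 0.88).
lo, hi = 0.5, 2.0
for _ in range(30):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if converges(mid) else (lo, mid)
print("GA-DE threshold sigma* ~", round(0.5 * (lo + hi), 3))
```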
EXIT chart:
Extrinsic information transfer (EXIT) charts are 2-D graphs on which the
mutual-information transfer characteristics of the two constituent codes of a turbo
code are superposed. EXIT charts have also been transposed to LDPC code
optimization.
3.2.2 Regular vs. Irregular LDPC codes:
An LDPC code is regular if the rows and columns of H have uniform weight,
i.e., all rows have the same number of ones (d_c) and all columns have the same
number of ones (d_v).
- The codes of Gallager and MacKay were regular (or as close as possible).
- Although regular codes had impressive performance, they are still about 1 dB
from capacity and generally perform worse than turbo codes.
An LDPC code is irregular if the rows and columns have non-uniform weight.
- Irregular LDPC codes tend to outperform turbo codes for block lengths of about
n > 10^5.
The degree distribution pair (λ, ρ) for an LDPC code is defined as

λ(x) = Σ_{i=2}^{d_v} λ_i x^{i−1},  ρ(x) = Σ_{i=2}^{d_c} ρ_i x^{i−1}

where λ_i and ρ_i represent the fraction of edges emanating from variable (check)
nodes of degree i.
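As a small worked example of this definition (the distribution pair chosen here is our own illustration), the design rate implied by (λ, ρ) follows the standard relation R = 1 − (Σ_i ρ_i/i) / (Σ_i λ_i/i):

```python
# Edge-perspective degree distributions, stored as {degree: fraction of edges}.
lam = {2: 0.3, 3: 0.7}          # variable-node side; fractions sum to 1
rho = {6: 1.0}                  # check-node side; all checks have degree 6

def design_rate(lam, rho):
    """R = 1 - (sum_i rho_i / i) / (sum_i lam_i / i), the design rate
    implied by a degree distribution pair (lambda, rho)."""
    int_lam = sum(f / d for d, f in lam.items())
    int_rho = sum(f / d for d, f in rho.items())
    return 1.0 - int_rho / int_lam

print(design_rate(lam, rho))    # ~0.565 for this pair
```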
3.2.3 Constructing Regular LDPC Codes:
Around 1996, MacKay and Neal described methods for constructing sparse H
matrices. The idea is to randomly generate an M × N matrix H with weight-d_v
columns and weight-d_c rows, subject to some constraints.
- Construction 1A: the overlap between any two columns is no greater than 1.
This avoids length-4 cycles.
- Construction 2A: M/2 columns have d_v = 2, with no overlap between any pair
of these columns; the remaining columns have d_v = 3. As with 1A, the overlap
between any two columns is no greater than 1.
- Constructions 1B and 2B: obtained by deleting select columns from 1A and 2A;
this can result in a higher-rate code.
A random-search sketch of Construction 1A is given after this list.
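The following sketch builds a matrix in the spirit of Construction 1A, by rejection sampling columns until the overlap constraint holds (a minimal illustration, not MacKay and Neal's exact procedure; function name and parameters are ours, and row weights are only as uniform as the random search allows):

```python
import numpy as np

rng = np.random.default_rng(1)

def construction_1a(M, N, dv, max_tries=10000):
    """Randomly build an M x N binary matrix with column weight dv such
    that any two columns overlap in at most one row (no length-4 cycles)."""
    H = np.zeros((M, N), dtype=int)
    for col in range(N):
        for _ in range(max_tries):
            rows = rng.choice(M, size=dv, replace=False)
            cand = np.zeros(M, dtype=int)
            cand[rows] = 1
            # Overlap with every earlier column must be <= 1.
            if col == 0 or (H[:, :col].T @ cand).max() <= 1:
                H[:, col] = cand
                break
        else:
            raise RuntimeError("no valid column found; retry with a new seed")
    return H

H = construction_1a(M=6, N=9, dv=2)
print(H.sum(axis=0))   # column weights: all equal to dv
print(H.sum(axis=1))   # row weights: N*dv/M on average
```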
3.2.4 Constructing Irregular LDPC Codes:
Luby et al. developed LDPC codes based on irregular Tanner graphs.
Message (variable) nodes and check nodes have conflicting requirements.