
ASYMPTOTIC ANALYSIS OF LATTICE-BASED QUANTIZATION

by Peter Warren Moo

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering: Systems) in The University of Michigan 1998

Doctoral Committee:
    Professor David L. Neuhoff, Chair
    Associate Professor Jeffrey A. Fessler
    Professor Edward D. Rothman
    Professor Demosthenis Teneketzis
    Assistant Professor Kimberly M. Wasserman

© Peter Warren Moo 1998
All Rights Reserved

To Michelle


ACKNOWLEDGEMENTS
I am deeply indebted to my advisor, Dave Neuhoff, for his guidance and encouragement throughout the course of this research. I have benefitted tremendously from his instruction and expertise over the past five years. I would like to thank Professors Jeff Fessler, Ed Rothman, Demosthenis Teneketzis and Kim Wasserman for serving on my doctoral committee. My gratitude also goes to Professors Lorne Campbell and Paul Wittke of Queen's University for encouraging my interest in communications as an undergraduate. I would like to thank my fiancée Michelle for her constant love and patience. She has been a boundless source of support. I also thank my parents, Warren and Josephine Moo, and my sister Rachel, for providing a loving family atmosphere to grow up in and for their continued love and support. My grandparents, Charles and Berthebell Moo, and Alex and Hilda Watts, have been a true source of inspiration; I thank them for their love and encouragement. I would also like to thank my friends and colleagues here at Michigan, who have kept me sane and taught me a great deal. Finally, I thank the National Science Foundation for its generous financial support.


TABLE OF CONTENTS

DEDICATION ........................................................ ii
ACKNOWLEDGEMENTS .................................................. iii
LIST OF FIGURES ................................................... vi
LIST OF TABLES .................................................... vii

CHAPTER

I. Introduction .................................................... 1
   Motivation for Lattice-Based Quantization ...................... 7
   Main Contributions ............................................. 10
   References ..................................................... 13

II. Asymptotically Optimal Fixed-Rate Lattice Quantization for a
    Class of Generalized Gaussian Sources ......................... 16
   Introduction ................................................... 16
   Distortion Bounds .............................................. 22
      Bounds to Overload Distortion ............................... 22
      Bounds to Granular Distortion ............................... 23
   Main Results ................................................... 24
      Minimizing the Upper and Lower Bounds ....................... 24
      Minimizing Lattice Quantizer Distortion ..................... 26
      Summary ..................................................... 28
   Proofs and Derivations ......................................... 28
      Proof of Lemma 1 ............................................ 28
      Proof of Lemma 2 ............................................ 29
      Proof of Lemma 3 ............................................ 30
      Proof of Proposition 5 ...................................... 31
      Proof of Proposition 6 ...................................... 32
      Proof of Proposition 10 ..................................... 35
   Appendix ....................................................... 38
   References ..................................................... 42

III. Optimal Compressor Functions for Multidimensional Companding
     of Memoryless Sources ........................................ 45
   Introduction ................................................... 45
   Main Result .................................................... 48
   Comparison to Optimal Vector Quantization ...................... 51
   Derivation of Main Result ...................................... 54
   Appendices ..................................................... 57
      Inertial Profile Inequality ................................. 57
      Definitions ................................................. 58
      Key Lemmas .................................................. 59
   References ..................................................... 63

IV. Polar Quantization Revisited .................................. 65
   Introduction ................................................... 65
   Nonuniform Polar Quantization .................................. 69
      Analysis via Bennett's Integral ............................. 70
      Optimization of Power Law Polar Quantization ................ 71
      Optimization of Unrestricted Polar Quantization ............. 73
   Restricted Uniform Polar Quantization .......................... 76
      Gaussian Sources ............................................ 78
      Approximations to W ......................................... 80
   Details ........................................................ 83
      Proof of Lemma 2 ............................................ 83
      Proof of Lemma 3 ............................................ 84
      Proof of Lemma 4 ............................................ 84
      Proof of Proposition 6 ...................................... 85
      Proof of Lemma 7 ............................................ 85
   Acknowledgements ............................................... 90
   References ..................................................... 90

V. Conclusions .................................................... 95
   Summary of Contributions ....................................... 95
   Future Research Issues ......................................... 96

LIST OF FIGURES

Figure

1.1  A lattice quantizer based on the two-dimensional square lattice with hexagonal support. ... 15
1.2  A lattice quantizer based on the two-dimensional hexagonal lattice with hexagonal support. ... 15
2.1  The optimal scaling factor a_N for N fixed. ... 44
2.2  Illustration of inner and outer supports. ... 44
3.1  Block diagram of multidimensional companding. ... 64
4.1  Examples of polar quantizers: (a) restricted nonuniform, (b) restricted uniform, (c) unrestricted nonuniform. ... 92
4.2  The height H and width W of a polar quantization cell. ... 92
4.3  Optimal phase rates for restricted uniform polar quantization. The optimal rate from Proposition 6 is R*; the two-term and three-term approximations are R*2 and R*3, respectively. ... 93
4.4  Optimal magnitude quantizer support for restricted uniform polar quantization. The optimal support from Proposition 6 is LM*; the two-term and three-term approximations are LM2 and LM3, respectively. ... 93
4.5  Maximum signal-to-noise ratio (SNR) for restricted uniform polar quantization. The optimal SNR from Proposition 6 is S*; the two-term and three-term approximations are S2 and S3, respectively. ... 94
4.6  Signal-to-noise ratio (SNR) of restricted nonuniform polar quantization, restricted uniform polar quantization and uniform scalar quantization. ... 94

LIST OF TABLES

Table

3.1  Losses of optimal companding, L_comp, and optimal product VQ, L_prod, for an IID Gaussian source. ... 54
4.1  Comparison of magnitude allocation M and MSE D for a pair of IID Gaussian random variables. ... 74

CHAPTER I

Introduction
The core of this dissertation is composed of three self-contained manuscripts, each serving as a chapter. This chapter presents an introduction to the research area considered in the manuscripts, a summary of some open problems in the area, and an overview of this dissertation and its results. The work presented in this dissertation analyzes three quantization methods, each of which has at its core a lattice-structured codebook. As we will discuss, these methods are of interest because some lattice encoding techniques have low complexity and because lattices have a strong connection to optimal quantizers. We begin our discussion with a brief introduction to vector quantization.
Vector quantization (VQ) is the process of mapping a k-dimensional real-valued random vector X to a reproduction vector in a finite or countably infinite set. A vector quantizer performs this mapping and consists of an encoder and decoder. The encoder maps the random vector to a reproduction vector and transmits a finite-length binary codeword, representing the index of the reproduction vector, across a channel to the decoder. The decoder maps the received binary sequence to a reproduction vector. In general, the channel is noisy and introduces errors to the binary codeword, but for this discussion, we will assume that the channel is noiseless.
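The encoder and decoder just described can be sketched in a few lines of code; the tiny two-dimensional codebook below is an illustrative stand-in, not a quantizer from this dissertation:

```python
import numpy as np

def encode(x, codebook):
    """Full-search encoder: index of the nearest codevector in squared error."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def decode(index, codebook):
    """Decoder: look up the reproduction vector for the received index."""
    return codebook[index]

# Toy codebook: k = 2, N = 4, so the fixed rate is (1/k) log2 N = 1 bit/sample.
codebook = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
x = np.array([0.9, -1.2])
i = encode(x, codebook)        # index, sent as a 2-bit binary codeword
x_hat = decode(i, codebook)    # reproduction vector
```

With a noiseless channel the decoder receives the index exactly, so the end-to-end error is simply x − x_hat.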

A VQ is principally characterized by the dimension k, a partition S = {S_i : i ∈ I} of ℝ^k, and a codebook C = {y_i : i ∈ I} of reproduction vectors or codevectors, where I is the index set of the codevectors. The mapping performed by a VQ is given by the quantization rule Q(x) = y_i if x ∈ S_i.

There are three key measures of vector quantizer performance: distortion, rate, and complexity. Distortion is a measure of the accuracy of the reproduction vector in comparison to X. Throughout this dissertation, we will only consider a widely used distortion measure, average mean-squared error (MSE), which is given by

    D(Q) = ∫_{ℝ^k} ‖x − Q(x)‖² p(x) dx,

where p(x) is the probability density of X.

The rate of a VQ measures the average length of the binary codewords produced by the VQ encoder. If the codebook is of size N and all codevectors have binary codewords of the same length (modulo one bit), then the quantizer is said to be fixed-rate, and the rate is given by

    R_FR(Q) = (1/k) log₂ N.

On the other hand, if the binary codewords are of varying lengths, then the quantizer is variable-rate. One can think of a variable-rate VQ as consisting of a partition, a codebook, and a lossless binary code. It is known that Huffman's algorithm designs binary codes with the smallest possible average codeword lengths for a given set of probabilities [11]. We will refer to a variable-rate VQ that employs such a binary code as an entropy-coded vector quantizer, because the rate of a quantizer employing a Huffman code is approximately the entropy of the quantizer output, that is,

    R_EC(Q) = − ∑_{i∈I} P_i log₂ P_i,

where P_i = P(X ∈ S_i).

The complexity of a VQ is a measure of the computational and storage requirements of the encoder and decoder. For example, in full-search encoding, the encoder must compare the source vector to every codevector to find the closest codevector and must therefore also store the codebook. In this case, the decoder need only store the codebook. Ideally, we would like to design VQs with simultaneously low distortion, rate and complexity, but there are in fact tradeoffs that must be made among these three performance measures. Research in vector quantization looks to characterize these tradeoffs, both for arbitrary VQs and for various classes of VQs.

This dissertation focuses on the distortion-rate performance of a particular class of vector quantizers, namely lattice-based quantization. Lattice quantization has at its core an infinite lattice Λ, which is a countably infinite set of points in ℝ^k that is closed under addition [6, 10]. A lattice quantizer codebook is given by R ∩ Λ, where R ⊆ ℝ^k is the quantizer support. Figures 1.1 and 1.2 show examples of lattice quantizers with hexagonal support. Note that R is bounded for fixed-rate quantization but may be unbounded for variable-rate quantization. The complement R^C is the overload region. Lattice quantization in one dimension is called uniform scalar quantization. We will scale the lattice quantizer by a scaling factor a ∈ ℝ⁺. For uniform scalar quantization, the scaling factor is often called the step size. We loosely define a lattice-based quantization method as a VQ that utilizes as its codebook a subset of either Λ or a transformation of Λ. Before continuing our discussion of lattice-based quantization, we will consider optimal vector quantization. It will be seen that optimal VQ provides a distortion-rate performance benchmark and motivates our consideration of lattice-based VQ.

An optimal vector quantizer minimizes distortion for a given rate and dimension. In general, we are interested in designing optimal VQs and in characterizing their performance. The design of optimal VQ is well understood. Necessary and sufficient conditions for an optimal fixed-rate scalar (k = 1) quantizer were developed by Lloyd [13], who devised an iterative descent algorithm for scalar quantizer design. Lloyd's optimality conditions can be extended to multiple dimensions, and his design algorithm was later generalized to fixed-rate vector quantizers by Linde et al. [12]. In multiple dimensions, an optimal vector quantizer satisfies two necessary conditions. First, each codevector is the centroid of its respective cell, that is, y_i = E[X | X ∈ S_i]. Second, each quantization cell is the Voronoi region of its codevector, that is, S_i = {x : ‖x − y_i‖ ≤ ‖x − y_j‖, j ≠ i}. The descent algorithms in [13, 12] iteratively design a codebook using these necessary conditions. In general, they converge to local minima, but various techniques can be used to increase the chances of finding global minima. These algorithms yield optimal codebooks, but require full-search encoding; that is, the encoder must compare the source vector to every codevector to find the closest codevector. As we will see, this causes the search complexity to increase exponentially as rate and dimension increase. Another important issue is the problem of indexing, which is the process of assigning binary codewords to each codevector. Again, as rate and dimension increase, efficient indexing is a significant problem.

To characterize the performance of optimal VQ, we are interested in determining the least possible distortion of vector quantizers with rate R or less, called the operational distortion-rate function (ODRF). The ODRF among VQs of any dimension is denoted δ(R), while the ODRF among VQs with dimension k is denoted δ_k(R). We would like to find analytical expressions for δ(R) and δ_k(R). In his celebrated 1948 paper, Shannon [18] showed that δ(R) equals D(R), which is the distortion-rate function and is defined using information-theoretic quantities that depend on the source statistics. It was shown in [18] that D(R) can be achieved arbitrarily

closely by using VQs of increasing vector dimension k, and that when k is large, a good VQ has a uniform distribution of codevectors in the region of typical sequences, where the source distribution is approximately uniform.

When rate R is large, δ_k(R) for fixed-rate VQ can be characterized using high resolution quantization theory, which is also called asymptotic quantization theory or high rate theory. Bennett's integral for vector quantizers [1, 8, 14] gives an expression for the distortion of an arbitrary VQ when the number of codevectors N is large,

    D(N) = (1/N^{2/k}) ∫ m(x)/λ(x)^{2/k} p(x) dx,    (1.1)

where the point density λ(x) is the normalized density of codevectors near x, the inertial profile m(x) is approximately the normalized moment of inertia of the quantization cells near x, and the normalized moment of inertia (NMI) of a cell S is given by M(S) = k^{−1} v(S)^{−(1+2/k)} ∫_S ‖x‖² dx. The ODRFs for k-dimensional fixed-rate VQ can then be derived by choosing m(x) and λ(x) to minimize Bennett's integral. It is not known how to optimize m(x), but according to Gersho's widely accepted conjecture [8], the best inertial profile m*(x) is a constant, that is, m*(x) = m_k, where m_k is the least NMI of any tessellating polytope in k dimensions. Using variational calculus or Hölder's inequality then shows that the best point density is λ*(x) = c p(x)^{k/(k+2)}. This gives the ODRF for k-dimensional VQ, which, expressed as a function of rate, is

    lim_{R→∞} 2^{2R} δ_k(R) = m_k ( ∫ p(x)^{k/(k+2)} dx )^{(k+2)/k}.    (1.2)

This was first derived by Zador [19], who determined the right-hand side of (1.2) to within a constant; Gersho then conjectured that the constant is m_k. Equivalently, the least distortion of a fixed-rate VQ with N codevectors is approximately

    D*(N) = (1/N^{2/k}) m_k ( ∫ p(x)^{k/(k+2)} dx )^{(k+2)/k}.    (1.3)

Since Bennett's integral gives the distortion of an arbitrary VQ, we can use (1.1) and (1.3) to quantify the loss L of suboptimal VQ. Optimal VQs have the optimal point density and inertial profile, which implies that the total loss can be factored as

    L = D(N)/D*(N) = [ ∫ m(x)/λ(x)^{2/k} p(x) dx ] / [ ∫ m*(x)/λ*(x)^{2/k} p(x) dx ] = L_ce · L_pt,

where the cell shape loss L_ce is the loss due to suboptimal inertial profile and the point density loss L_pt is the loss due to suboptimal point density. For product quantization, which is scalar quantization used k times, cell shape loss can be further decomposed as the product of cubic loss L_cu, which is due to product quantization's inability to form cells with lower NMI than cubes, and oblongitis loss L_ob, which is due to the fact that the quantization cells are rectangles instead of cubes.

High resolution theory also gives an analytical expression for δ_k(R) for entropy-coded VQ. In this case, a high resolution formula for distortion is given by

    D = ∫ m(x)/λ(x)^{2/k} p(x) dx,

where λ(x) is now the unnormalized point density. A high resolution formula for rate is

    R = h_k(X) + (1/k) ∫ p(x) log₂ λ(x) dx,

where h_k(X) = −(1/k) ∫ p(x) log₂ p(x) dx is the kth-order differential entropy of the source. To optimize entropy-coded VQ, we wish to choose m(x) and λ(x) to minimize D when R is fixed. The best m(x) is not known, so we again apply Gersho's conjecture, which yields m*(x) = m_k. It can then be shown using variational calculus that λ*(x) = c, that is, the optimal point density is uniform. It follows that the ODRF of VQ with n-th order entropy coding is

    lim_{R→∞} 2^{2R} δ_k(R) = m_k 2^{2h_{kn}(X)},

which was first derived by Zador [19].

We have characterized the ODRFs for k-dimensional VQ using high resolution quantization theory. In addition, as we have seen, there are algorithms for designing optimal VQs using a training sequence from the source. However, the complexity of optimal VQ is potentially prohibitive, especially as rate and dimension increase. To see this, consider an optimal unstructured codebook and full-search encoding. The encoder needs to store and search over N = 2^{kR} codevectors to implement full-search encoding, and the decoder needs to store N = 2^{kR} codevectors. Therefore, the complexity of such a VQ is O(2^{kR}), which increases exponentially in dimension and rate. In order to achieve performance close to that of optimal VQs under reasonable complexity constraints, one must consider structured vector quantizers. There are many types of structured VQs [9], such as product, multistage, tree-structured, trellis, lattice, and hierarchical, many of which have been shown to achieve good performance with moderate complexity.
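As an illustrative numerical check of formula (1.3), which is our own sketch and not part of the original development: for a uniform source on [0, 1] with k = 1, m_1 = 1/12 and the integral term equals 1, so D*(N) = 1/(12N²), and a midpoint uniform quantizer attains it:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                              # codevectors (rate R = log2 N = 6 bits)
x = rng.random(200_000)             # uniform source on [0, 1)

# N-level uniform quantizer on [0, 1]: cell width 1/N, midpoint codevectors.
x_hat = (np.floor(x * N) + 0.5) / N
mse = np.mean((x - x_hat) ** 2)

# Zador/Gersho prediction (1.3) with m_1 = 1/12 and p uniform on [0, 1]:
# D*(N) = N^{-2/k} m_k (∫ p^{k/(k+2)} dx)^{(k+2)/k} = 1 / (12 N^2).
predicted = 1.0 / (12 * N ** 2)
print(mse / predicted)              # ratio close to 1
```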

Motivation for Lattice-Based Quantization


We now discuss several reasons for considering lattice-based quantization, which has potentially low complexity due to the development of encoding and indexing techniques for lattice codebooks. It will also be seen that there are several interesting open problems related to lattice quantization. Recall that encoding is the process of mapping a source vector to a reproduction vector, while indexing is the process of assigning a binary codeword to each codevector. In order to design a fixed-rate lattice quantizer for a source with unbounded support, we need to choose a good quantizer support region and a good scaling factor for the codebook. At the same time, we would like to encode source vectors efficiently and

use an efficient algorithm for indexing. Efficient encoding and indexing algorithms have been developed for certain quantizer support regions. Given an arbitrary point x ∈ ℝ^k, Conway and Sloane [4] have shown that the closest point to x in Λ can be found with search complexity O(k), which is significantly less than the search complexity of full-search VQ, O(2^{kR}). This result shows that fixed-rate lattice quantization of bounded sources can be implemented with low complexity. However, this algorithm applies only to the infinite lattice and does not directly extend to lattice quantizers that use only a finite number of points from Λ. For fixed-rate lattice quantization of unbounded sources, some suboptimal low-complexity encoding schemes for source vectors in the overload region have been proposed. However, it has not yet been shown that source vectors in the overload region can be optimally encoded with complexity less than O(2^{kR}), the complexity of full-search encoding. Thus a significant open problem involves developing a low-complexity algorithm that finds the closest lattice point in the support to an arbitrary point in the overload region. Conway and Sloane have also developed indexing algorithms of order O(k) for lattice quantizers whose support region is the Voronoi region of a sublattice of Λ [5]. However, these algorithms do not extend to other support regions, such as spherical. Efficient indexing for non-Voronoi support regions is an important open problem. The optimal scaling of a lattice quantizer has, for the most part, been determined experimentally. If the source has bounded support, it can be seen that, at least for high rate, the lattice should be scaled to cover the support of the source density. For unbounded sources, however, it is not clear how to scale the lattice or how the scaling should vary as rate increases. Optimal scaling may vary with the shape of the support region. For example, the scaling for a spherical support might differ

from that for a cubic support, especially in higher dimensions. In Chapter II of this dissertation, we derive analytical expressions for the optimal scaling factor for spherical support regions. There are still more reasons for considering lattice quantization. In our discussion of optimal VQ, we saw a number of intriguing connections between optimal VQ and lattices. Shannon's asymptotic equipartition property shows that for large dimension, the source is approximately uniformly distributed; this in turn suggests that lattice quantization may be optimal. It is clear that for fixed dimension, lattice quantizers are not optimal; however, they may achieve performance close to optimal. From high resolution theory, we know that an optimal fixed-rate VQ has a point density that matches the source density, which implies that lattice quantizers are optimal only for uniformly distributed sources. However, Gersho's conjecture suggests that optimal VQs are locally a lattice quantizer. This suggests that an optimal VQ might be obtained from a transformation of a lattice quantizer. In fact, this is known to be the case in one dimension. Bennett showed that any scalar quantizer, e.g. an optimal one, can be implemented by a companding structure, which consists of a nonlinear mapping, called a compressor function, followed by a uniform scalar quantizer and the inverse of the compressor function. A multidimensional compander utilizes a vector-valued compressor function and a lattice quantizer. In multiple dimensions, it has been shown that optimal VQs cannot be implemented using a companding structure [8, 2, 3], except for a very restricted class of source densities. It may be that the best companders are close to optimal; however, the best multidimensional companders are not currently known. In Chapter III, we determine the optimal high resolution compressor function for memoryless sources, under certain technical conditions on the compressor function.
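Bennett's one-dimensional companding structure can be sketched numerically. The density p(x) = 2x on [0, 1) and the compressor g(x) = x^{4/3} are our illustrative choices, not examples taken from the dissertation; g realizes the high-resolution-optimal point density, which is proportional to p(x)^{1/3}:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 128                                 # cells of the inner uniform quantizer
x = np.sqrt(rng.random(200_000))        # toy source with density p(x) = 2x on [0, 1)

def uniform_q(u, n):
    """Midpoint uniform quantizer with n cells on [0, 1]."""
    return np.minimum(np.floor(u * n) + 0.5, n - 0.5) / n

# Compander: compressor g, uniform scalar quantizer, expander g^{-1}.
# For p(x) = 2x, g(x) = x**(4/3) gives g'(x) proportional to p(x)**(1/3),
# the point density that minimizes Bennett's integral in one dimension.
g = lambda t: t ** (4.0 / 3.0)
g_inv = lambda u: u ** 0.75
x_comp = g_inv(uniform_q(g(x), N))      # companded quantization
x_unif = uniform_q(x, N)                # plain uniform scalar quantization

mse_comp = np.mean((x - x_comp) ** 2)
mse_unif = np.mean((x - x_unif) ** 2)
# High resolution theory predicts mse_comp / mse_unif = 27/32, about 0.84, here.
```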

Following from the companding idea, it is also interesting to consider lattice quantization with other transformations. For example, the source can be quantized in terms of its polar coordinates [17], [7]. Lattice quantization can also be used in two-stage quantization to quantize the first-stage errors [16]. In Chapter IV, we optimize uniform polar quantization, a simple lattice-based method that uses a polar transformation and a two-dimensional integer lattice. Recall that for asymptotically optimal variable-rate VQ, the best point density is uniform. This suggests that lattice quantization may be optimal. The encoding technique of Conway and Sloane can be implemented because the lattice VQ codebook consists of the entire lattice; therefore, entropy-coded lattice VQ has low encoding complexity. However, in order to achieve good performance, variable-rate VQs must use higher order entropy coders, which have high complexity. Thus, optimal VQs have high complexity: for fixed-rate VQ, the complexity lies in the quantizer, while for variable-rate VQ, the complexity lies in the entropy coder [15]. As we have seen, optimal variable-rate VQ is well understood. The best asymptotic quantizer has a uniform point density and may use an infinite number of codevectors. There are few interesting open questions in variable-rate lattice VQ, and as a result, this dissertation considers fixed-rate lattice quantization only.
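The low-complexity lattice encoding mentioned above can be illustrated for two simple lattices, in the spirit of Conway and Sloane's fast-quantizing algorithms; the function names and the tie-breaking detail for D_k are our own:

```python
import numpy as np

def nearest_Zk(x, a=1.0):
    """Closest point of the scaled integer lattice aZ^k to x, in O(k) time."""
    return a * np.round(np.asarray(x, dtype=float) / a)

def nearest_Dk(x):
    """Closest point of the checkerboard lattice D_k = {z in Z^k : sum(z) even}:
    round each coordinate, and if the coordinate sum is odd, re-round the
    coordinate with the largest rounding error in the other direction."""
    x = np.asarray(x, dtype=float)
    f = np.round(x)
    if int(f.sum()) % 2 != 0:
        i = int(np.argmax(np.abs(x - f)))
        f[i] += np.sign(x[i] - f[i]) or 1.0   # exact ties: move up by one
    return f

z2 = nearest_Zk([0.6, -1.2])    # closest Z^2 point
d2 = nearest_Dk([0.6, 0.2])     # closest D_2 point (even coordinate sum)
```

Both run in time linear in k, in contrast to the O(2^{kR}) cost of full-search encoding over an unstructured codebook.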

Main Contributions
In this thesis, three lattice-based quantization methods are analyzed and their optimal high resolution distortion-rate performance is determined.

In Chapter II, we study simple lattice quantization for generalized Gaussian sources. Despite the simplicity of lattice quantization, it is still not known how the quantizer support should increase with rate. A related open question asks how the distortion of optimized lattice quantization decreases with N. We saw that for asymptotically optimal VQ, distortion decreases as N^{−2/k}, and it has long been suspected that the distortion of optimized lattice quantization decreases less rapidly with N. In this chapter, we derive asymptotic formulas for the optimal scaling factor, which uniquely determines the quantizer support, and the resulting minimum MSE. These expressions are derived by minimizing upper and lower bounds to distortion. It is shown that the optimal scaling factor a_N decreases as (ln N)^{1/ν} N^{−1/k}, where ν is the exponent of the generalized Gaussian density, and that for scale-optimized lattice quantization, granular distortion asymptotically dominates overload distortion. Consequently, the minimum distortion is D_N ≈ c (ln N)^{2/ν} N^{−2/k}. This result indicates that the performance of optimal lattice quantizers diverges from that of asymptotically optimal vector quantization as N increases.

Because simple lattice quantization has poor distortion-rate performance compared to optimal VQ, we are led to consider multidimensional companding in Chapter III. In particular, we would like to determine the compressor function that minimizes compander distortion. For an arbitrary source, this is a difficult open problem. In this chapter, we find the asymptotically optimal compressor function for a memoryless source, under certain technical conditions on the compressor function. In order to do so, we consider Bucklew's asymptotic expression for compander distortion [2]. We then show, using variational calculus arguments, that the compressor function consisting of the optimal scalar compressor functions for each source component is the minimizing function. As a result, one concludes that optimized companding has distortion that decreases as N^{−2/k} but nevertheless suffers the same point density and oblongitis losses as optimal scalar quantization. Companding is able to recover the cubic loss by appropriate choice of the lattice Λ.
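The divergence between optimized lattice quantization and optimal VQ noted above for Chapter II can be made concrete with a little arithmetic. This is our own illustration; the Gaussian member of the class, ν = 2, is assumed, so the gap factor (ln N)^{2/ν} reduces to ln N:

```python
import math

# D_N ≈ c (ln N)^{2/nu} N^{-2/k} for scale-optimized lattice quantization,
# versus D*(N) ≈ c' N^{-2/k} for optimal VQ: the ratio grows like (ln N)^{2/nu}.
nu = 2                                    # Gaussian case
gaps = {bits: math.log(2.0 ** bits) ** (2.0 / nu) for bits in (8, 16, 32, 64)}
for bits, gap in gaps.items():
    print(f"log2 N = {bits:2d}: gap factor (ln N)^(2/nu) = {gap:.1f}")
```

The gap factor grows without bound, but only logarithmically in N.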

Although it is not able to achieve performance near that of optimal VQ, companding offers a significant improvement over simple lattice quantization. As a result, we are interested in knowing whether there are simple lattice-based methods that can offer similar improvement. One such method is polar quantization, which we consider in Chapter IV. In particular, we consider uniform polar quantization, where the source vector is quantized in terms of its magnitude and phase, using a uniform scalar quantizer for each. This is also a lattice-based method, involving a polar transformation and an integer lattice quantizer in two dimensions. For a Gaussian source, we are able to use a recent result on the support and distortion of an optimal uniform scalar quantizer for a Rayleigh random variable, similar to that in Chapter II, to find the optimal step size for the magnitude quantizer. The asymptotically optimal rate allocation between magnitude and phase is then derived. The results show that the optimal rate allocation gives increasingly more rate to the magnitude as total rate increases, compared to nonuniform polar quantization, where the difference between optimal magnitude and phase rates is asymptotically a constant. In Chapter IV, we also present a unified analysis of several nonuniform polar quantization methods by focusing on their point densities and inertial profiles and using Bennett's integral to express the mean-squared error. The subsequent analysis is straightforward and leads to new insights into the relationship between polar quantization and Cartesian quantization. With this approach, unrestricted polar quantization, which is arguably the best method, may be optimized essentially by inspection. As another example, a new polar quantization method, called power law polar, is analyzed and optimized.
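A uniform polar quantizer of the kind studied in Chapter IV can be sketched as follows. The level counts N_m, N_p and magnitude support L below are illustrative choices of ours, not the optimized values derived there:

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_normal((100_000, 2))      # IID Gaussian pair
r = np.hypot(z[:, 0], z[:, 1])             # magnitude (Rayleigh distributed)
theta = np.arctan2(z[:, 1], z[:, 0])       # phase, uniform on (-pi, pi]

# Uniform polar quantization: independent uniform scalar quantizers for
# magnitude (N_m levels on [0, L]) and phase (N_p levels on (-pi, pi]).
N_m, N_p, L = 16, 64, 5.0
r_hat = np.minimum(np.floor(r / L * N_m) + 0.5, N_m - 0.5) * L / N_m
idx = np.minimum(np.floor((theta + np.pi) / (2 * np.pi) * N_p), N_p - 1)
t_hat = (idx + 0.5) * 2 * np.pi / N_p - np.pi

z_hat = np.stack([r_hat * np.cos(t_hat), r_hat * np.sin(t_hat)], axis=1)
snr_db = 10 * np.log10(np.mean(z ** 2) / np.mean((z - z_hat) ** 2))
```

Since log₂(N_m · N_p) = 10, this spends 5 bits per dimension, split unevenly between magnitude and phase; the chapter derives how this split should behave asymptotically.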


References
[1] W.R. Bennett, "Spectra of quantized signals," Bell Syst. Tech. J., vol. 27, pp. 446-472, July 1948.
[2] J.A. Bucklew, "Companding and random quantization in several dimensions," IEEE Trans. Inform. Theory, vol. IT-27, pp. 207-211, Mar. 1981.
[3] J.A. Bucklew, "A note on optimal multidimensional companders," IEEE Trans. Inform. Theory, vol. IT-29, p. 279, Mar. 1983.
[4] J.H. Conway and N.J.A. Sloane, "Fast quantizing and decoding algorithms for lattice quantizers and codes," IEEE Trans. Inform. Theory, vol. IT-28, pp. 227-232, Mar. 1982.
[5] J.H. Conway and N.J.A. Sloane, "A fast encoding method for lattice codes and quantizers," IEEE Trans. Inform. Theory, vol. IT-29, pp. 820-824, Nov. 1983.
[6] J.H. Conway and N.J.A. Sloane, Sphere Packings, Lattices, and Groups, 2nd ed. New York: Springer-Verlag, 1993.
[7] T.R. Fischer, "A pyramid vector quantizer," IEEE Trans. Inform. Theory, vol. IT-32, pp. 568-583, July 1986.
[8] A. Gersho, "Asymptotically optimal block quantization," IEEE Trans. Inform. Theory, vol. IT-25, pp. 373-380, July 1979.
[9] A. Gersho and R.M. Gray, Vector Quantization and Signal Compression. New York: Springer, 1991.
[10] J.D. Gibson and K. Sayood, "Lattice quantization," Advances in Electronics and Electron Physics, vol. 72, pp. 259-330, 1988.
[11] D.A. Huffman, "A method for the construction of minimum redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, Sept. 1952.
[12] Y. Linde, A. Buzo, and R.M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. 28, pp. 84-95, Jan. 1980.
[13] S.P. Lloyd, "Least squares quantization in PCM," Bell Laboratories Technical Note, 1957.
[14] S. Na and D.L. Neuhoff, "Bennett's integral for vector quantizers," IEEE Trans. Inform. Theory, vol. IT-41, pp. 886-900, July 1995.
[15] D.L. Neuhoff, "Source coding strategies: simple quantizers vs. simple noiseless coders," Proc. 1986 CISS, pp. 267-271, 1986.
[16] J. Pan and T.R. Fischer, "Two-stage vector quantization-lattice vector quantization," IEEE Trans. Inform. Theory, vol. 41, pp. 155-163, Jan. 1995.
[17] M.J. Sabin and R.M. Gray, "Product code vector quantizers for waveform and voice coding," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, no. 3, June 1984.


Figure 1.1: A lattice quantizer based on the two-dimensional square lattice with hexagonal support.

Figure 1.2: A lattice quantizer based on the two-dimensional hexagonal lattice with hexagonal support.

CHAPTER II

Asymptotically Optimal Fixed-Rate Lattice Quantization for a Class of Generalized Gaussian Sources1

I. Introduction
Lattice quantization is one of the simplest methods of vector quantization (VQ) [1, 2], but for sources with unbounded support, it is still not known how the quantizer support should vary with the number of codevectors. To be specific, consider a lattice codebook of size $N$ and let the quantizer codevectors be scaled by a real number $a$.² We would like to determine the best scaling factor $a_N$ and the resulting minimum distortion $D_N$. In this paper, we find asymptotic expressions for $a_N$ and $D_N$ for a class of generalized Gaussian sources.

Past work on the optimal support of uniform scalar quantization, which is lattice quantization in one dimension, has included numerical optimization [3, 4] and curve fitting [5]-[8]. Nonlinear equations that can be solved numerically to give good approximations to $a_N$ and $D_N$ have also been developed [6, 9, 10]. The most comprehensive work to date is that of Hui and Neuhoff [11], who found asymptotic formulas for $a_N$ and $D_N$ for a large class of sources. For multidimensional lattice quantization, determination of $a_N$ has been performed largely by experimentation [12]-[17]. Jeong and Gibson [18] developed a nonlinear equation whose solution gives an approximation to $a_N$. Their analytical solution was then shown to match simulation results for the 16-dimensional integer lattice, when the source is memoryless Gaussian and memoryless Laplacian.

In this paper, we consider a class of stationary, memoryless, generalized Gaussian sources and find asymptotic formulas for $a_N$ and $D_N$. This work generalizes some of the results in [11] to arbitrary dimension. For the strictly Gaussian case, similar expressions have been derived independently by Eriksson and Agrell [19].

We consider a zero-mean random vector $X$ with independent and identically distributed components, where each component has variance $\sigma^2$ and a density given by the generalized Gaussian (GG) density with exponential decay parameter $\beta$. It follows that $X$ has $k$-dimensional density
\[ p(x) = \left[\frac{\eta}{2\,\Gamma(\frac{1}{\beta}+1)}\right]^k \exp\left\{-\eta^\beta \sum_{i=1}^k |x_i|^\beta\right\} \]
where
\[ \eta = \frac{1}{\sigma}\sqrt{\frac{\Gamma(3/\beta)}{\Gamma(1/\beta)}} \]
and $\Gamma(x)$ is the Gamma function
\[ \Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt. \]
The density of $R = \|X\|_\beta$ is given by
\[ p_R(r) = c_\beta\, r^{k-1} \exp[-(\eta r)^\beta] \]  (2.1)
where $c_\beta$ makes the density integrate to one. For $\beta = 1$ and $\beta = 2$, the univariate GG density reduces to the Laplacian and Gaussian densities, respectively. When $\beta \to \infty$, the univariate GG density approaches a uniform density. In general the GG density is defined for all positive values of $\beta$, but in this work we restrict attention to $\beta \ge 1$, for reasons to be made clear later.

The contours of equal probability of $p(x)$ are boundaries of $k$-dimensional spheres, called $\beta$-spheres, of the form
\[ \Lambda_\beta(r) = \{x \in \mathbb{R}^k : \|x\|_\beta \le r\} \]
where
\[ \|x\|_\beta = \left(\sum_{i=1}^k |x_i|^\beta\right)^{1/\beta}. \]

¹This work was supported by an NSF Graduate Fellowship and by NSF Grant NCR-9415754 and was submitted for publication in the IEEE Transactions on Information Theory, with coauthor David L. Neuhoff. Part of this work was presented in September 1996 at the International Symposium on Information Theory and Its Applications in Victoria, Canada.
²For a given $N$, the scaling factor $a$ uniquely determines the quantizer support.
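As a numerical sanity check of the univariate GG density (a sketch, not part of the original analysis; the integration grid and the chosen $\beta$ values are arbitrary), the density with $\eta$ as defined above should integrate to one and have variance $\sigma^2$:

```python
import math
import numpy as np

# Check the univariate generalized Gaussian density: with
# eta = sqrt(Gamma(3/beta)/Gamma(1/beta))/sigma, the density
# p(x) = eta/(2*Gamma(1/beta+1)) * exp(-(eta|x|)^beta)
# integrates to one and has variance sigma^2.
sigma = 1.0
x = np.linspace(-40, 40, 800001)
results = {}
for beta in (1.0, 2.0):  # Laplacian and Gaussian special cases
    eta = math.sqrt(math.gamma(3 / beta) / math.gamma(1 / beta)) / sigma
    p = eta / (2 * math.gamma(1 / beta + 1)) * np.exp(-(eta * np.abs(x)) ** beta)
    results[beta] = (np.trapz(p, x), np.trapz(x ** 2 * p, x))
print(results)
```

For $\beta = 2$ the leading constant reduces to $1/\sqrt{2\pi\sigma^2}$, recovering the Gaussian density exactly.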

It can be shown [20, pp. 7-8] that for $\gamma, \beta \in \mathbb{R}^+$ and $x \in \mathbb{R}^k$,
\[ \|x\|_\gamma \le c_{\gamma,\beta}\,\|x\|_\beta \]  (2.2)
where
\[ c_{\gamma,\beta} = \begin{cases} 1, & \gamma \ge \beta \\ k^{1/\gamma - 1/\beta}, & \gamma < \beta. \end{cases} \]
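The norm-comparison inequality (2.2) can be spot-checked empirically; the sketch below draws random vectors and verifies both regimes of the constant (the dimension and the $(\gamma,\beta)$ pairs are arbitrary test choices):

```python
import numpy as np

# Empirical check of ||x||_gamma <= c * ||x||_beta, with c = 1 when
# gamma >= beta and c = k^(1/gamma - 1/beta) when gamma < beta.
rng = np.random.default_rng(0)
k = 8
violations = 0
for _ in range(1000):
    x = rng.normal(size=k)
    for gamma, beta in ((1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (4.0, 2.0)):
        c = 1.0 if gamma >= beta else k ** (1 / gamma - 1 / beta)
        ng = np.sum(np.abs(x) ** gamma) ** (1 / gamma)
        nb = np.sum(np.abs(x) ** beta) ** (1 / beta)
        if ng > c * nb + 1e-12:
            violations += 1
print(violations)
```

The $\gamma \ge \beta$ case is the usual monotonicity of $\ell_p$ norms; the $\gamma < \beta$ case is the Hölder-type reverse bound with the dimension-dependent constant.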

Let $v(F)$ denote the volume of a $k$-dimensional set $F$ and let $V_\beta = v(\Lambda_\beta(1))$ denote the volume of the unit radius $\beta$-sphere. It is known that [21, p. 108]
\[ V_\beta = \frac{\big[2\,\Gamma(\frac{1}{\beta}+1)\big]^k}{\Gamma(\frac{k}{\beta}+1)}. \]
Consider a $k$-dimensional infinite lattice $\Lambda$ with a fundamental cell $S_0$; that is, $\Lambda$ is a countably infinite set of points $\{\lambda_i\}_{i=1}^\infty$ that is closed under addition, and $S_0$ is a set with finite diameter such that $\{\lambda_i + S_0\}_{i=1}^\infty$ is a partition of $\mathbb{R}^k$. Without loss of generality, assume that $\|\lambda_i\|_\beta \le \|\lambda_{i+1}\|_\beta$ for all $i$. Consider the sequence of lattice VQ codebooks $\{C_N\}_{N=1}^\infty$, where $C_N$ contains the $N$ codevectors $\{y_i\}_{i=1}^N$, where $y_1 = \lambda_1$, $y_2 = \lambda_2$, $\dots$, $y_N = \lambda_N$. Associated with each codevector is a support cell $S_i = y_i + S_0$, $i = 1, \dots, N$. We define the support $S_N$ of the $N$-point lattice quantizer as
\[ S_N = \bigcup_{i=1}^N S_i. \]
The complement $S_N^C$ is the overload region. The lattice quantizer maps $x \in \mathbb{R}^k$ to a codevector in $C_N$ according to the rule
\[ Q_N(x) = \begin{cases} y_i, & x \in S_i \\ \arg\min_{y \in C_N} \|x - y\|_2, & x \in S_N^C. \end{cases} \]
Note that, as conventionally defined, the $i$th quantization cell contains the $i$th support cell and possibly a portion of the overload region.

We consider scaling the codebook by a positive and finite scaling factor $a$. The scaled codebook, denoted $aC_N$, has codevectors $ay_1, \dots, ay_N$ and support cells $aS_1, \dots, aS_N$. In this work, we use mean-squared error (MSE) as the distortion measure, as given by
\[ D(N,a) = \frac{1}{k\sigma^2}\int_{\mathbb{R}^k} \|x - Q_N(x)\|_2^2\, p(x)\, dx \]  (2.3)
where $Q_N$ here denotes the quantizer with codebook $aC_N$. For the codebook $C_N$, we are interested in determining the scaling factor $a_N$ that minimizes mean-squared error, that is,
\[ a_N = \arg\min_a D(N,a). \]
It is easily seen that $D(N,a)$ is a continuous function of $a$ and that $a_N \ne 0$ and $a_N \ne \infty$. Therefore, the minimum exists but may not be unique. If the minimum is not unique, we allow $a_N$ to be any scaling factor that achieves the minimum. The corresponding minimum distortion is
\[ D_N = D(N, a_N). \]
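The quantizer rule above can be sketched concretely for the cubic lattice $\mathbb{Z}^k$, where interior points are quantized by coordinate-wise rounding and overload points fall back to a nearest-codevector search (a hedged illustration only; the codebook size, dimension, and scaling factor below are arbitrary choices, and the cell-membership test is the simplest possible, not an efficient one):

```python
import numpy as np

# Sketch of an N-point lattice quantizer on the cubic lattice Z^k, scaled
# by a: the codebook keeps the N lattice points of smallest norm, interior
# inputs quantize by rounding, and overload inputs map to the nearest
# codevector, as in the definition of Q_N.
def make_codebook(N, k):
    r = int(np.ceil(N ** (1 / k))) + 1
    g = np.arange(-r, r + 1)
    pts = np.array(np.meshgrid(*([g] * k))).reshape(k, -1).T
    pts = pts[np.argsort(np.linalg.norm(pts, axis=1))]
    return pts[:N].astype(float)

def quantize(x, codebook, a):
    y = np.round(x / a)                        # nearest point of the full lattice
    if any((y == c).all() for c in codebook):  # x lies in a support cell
        return a * y
    d = np.linalg.norm(a * codebook - x, axis=1)  # overload: nearest codevector
    return a * codebook[np.argmin(d)]

cb = make_codebook(N=9, k=2)
print(quantize(np.array([0.2, -0.1]), cb, a=0.5))  # inside the support
print(quantize(np.array([5.0, 5.0]), cb, a=0.5))   # overload region
```

For $N = 9$, $k = 2$ the codebook is the $3\times 3$ block of integer points around the origin, so the overload input above maps to the scaled corner codevector.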
We will derive asymptotic expressions for $a_N$ and $D_N$. We decompose the distortion into
\[ D(N,a) = D_g(N,a) + D_o(N,a) \]
where $D_g$ and $D_o$ are, respectively, granular distortion and overload distortion, given by
\[ D_g(N,a) = \frac{1}{k\sigma^2}\sum_{i=1}^N \int_{aS_i} \|x - a y_i\|_2^2\, p(x)\, dx, \qquad D_o(N,a) = \frac{1}{k\sigma^2}\int_{(aS_N)^C} \|x - Q_N(x)\|_2^2\, p(x)\, dx. \]
Figure 2.1 illustrates $a_N$ and its relation to granular and overload distortion.

In general, it is difficult to work with the exact expressions for granular and overload distortion. As a result, we develop upper and lower bounds to granular distortion. These induce upper and lower bounds to total distortion. We then find asymptotic expressions for the scaling factors $\bar{a}_N$ and $\underline{a}_N$ that minimize the upper and lower bounds, respectively. As with $a_N$, the minima of the upper and lower bounds exist but are not necessarily unique. We allow $\bar{a}_N$ and $\underline{a}_N$ to be any scaling factors that achieve their respective minima. We show that $\bar{a}_N/\underline{a}_N \to 1$ as $N \to \infty$ and that $a_N/\bar{a}_N \to 1$ as $N \to \infty$, where $a_N$ is the optimal scaling factor that we seek. We also find asymptotic expressions for the minimum values of the upper and lower bounds, denoted $\bar{D}_N$ and $\underline{D}_N$ respectively, and consequently for $D_N$.

We now introduce some important parameters that will be used in this paper. We define the $l_\beta$ diameter of $S_0$ as
\[ d_\beta = \max_{x,y \in \mathrm{cl}(S_0)} \|x - y\|_\beta \]
where $\mathrm{cl}(S_0)$ is the closure of $S_0$. Because $\mathrm{cl}(S_0)$ is compact, the maximum exists, and $d_\beta$ is finite. The normalized moment of inertia (NMI) of $S_0$ is given by
\[ M(S_0) = \frac{\frac{1}{k}\int_{S_0} \|x - y\|_2^2\, dx}{v(S_0)^{1+2/k}} \]
where $y$ is the codevector of $S_0$. The NMI is normalized in the sense that it is invariant to scaling and also in a per-dimension sense. We define the equivalent sphere of the codebook $C_N$ by
\[ S_{\mathrm{eq},N} = \Lambda_\beta(r_N) \qquad \text{where} \qquad r_N = \left(\frac{N\,v(S_0)}{V_\beta}\right)^{1/k}. \]  (2.4)
Note that $S_N$ and $S_{\mathrm{eq},N}$ have the same volume and that $S_{\mathrm{eq},N}$ does not depend on the ordering of the $\lambda_i$'s. Finally, in statements and proofs of some of our results, we use the notation $c$ to denote a constant whose value is not important, and we use $o_N$ to indicate a quantity that tends to zero as $N$ tends to infinity. The values of $c$ and $o_N$ in one such statement are not necessarily the same as those in another.

The rest of this paper is organized as follows. Section II presents the upper and lower bounds. Section III presents the main results of this paper, namely, an asymptotic expression for the scaling factor that minimizes lattice quantizer distortion. Proofs are given in Section IV. Some details of the derivation are relegated to the Appendix.
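As a concrete instance of the NMI definition, for the unit-cube fundamental cell of the cubic lattice the NMI equals $1/12$ in every dimension (the normalization makes it scale-invariant and per-dimension); a minimal midpoint-rule computation in $k = 2$, with an arbitrary grid size, illustrates this:

```python
import numpy as np

# Midpoint-rule check that the NMI of the unit cube [0,1]^k about its
# center is 1/12 (shown here for k = 2; v(S0) = 1, so no volume factor).
k, n = 2, 200
m = (np.arange(n) + 0.5) / n                # midpoints of a 1-d grid
xx, yy = np.meshgrid(m, m)
sq = (xx - 0.5) ** 2 + (yy - 0.5) ** 2      # ||x - y||^2 with y the cell center
M = sq.mean() / k                           # (1/k) * integral, since v(S0) = 1
print(M)
```

The value $1/12$ is the familiar uniform-quantization constant; better lattices (e.g. the hexagonal lattice in $k = 2$) have fundamental cells with smaller NMI.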


II. Distortion Bounds


In this section, we present the upper and lower bounds to granular and overload distortion that will be used to derive the main results of this paper, presented in Section III. Proofs of the results in this section are given in Section IV.

Bounds to Overload Distortion


Bounds to overload distortion are based on bounds to the quantizer support. In particular, we define the inner support $\underline{S}_N$ and outer support $\bar{S}_N$ of $S_N$ as the $\beta$-spheres
\[ \underline{S}_N = \Lambda_\beta(r_N - 3 d_\beta), \qquad \bar{S}_N = \Lambda_\beta(r_N + 3 d_\beta). \]
For a Gaussian source, inner and outer supports are illustrated in Figure 2.2. Corresponding to a codebook scaled by $a$, the inner and outer supports are $a\underline{S}_N$ and $a\bar{S}_N$. The following result relates the support $S_N$ to its inner and outer supports $\underline{S}_N$ and $\bar{S}_N$.

Lemma 1: For all sufficiently large $N$,
\[ \underline{S}_N \subseteq S_N \subseteq \bar{S}_N. \]
Using the inner and outer supports, we can then derive bounds to overload distortion.

Lemma 2: For all sufficiently large $N$, upper and lower bounds to overload distortion are given by
\[ D_o(N,a) \le \bar{D}_o(N,a) \stackrel{\mathrm{def}}{=} \frac{c_U^2}{k\sigma^2}\int_{(a\underline{S}_N)^C} \big(\|x\|_\beta - a(\underline{r}_N - d_\beta)\big)^2\, p(x)\, dx \]
and
\[ D_o(N,a) \ge \underline{D}_o(N,a) \stackrel{\mathrm{def}}{=} \frac{c_L^2}{k\sigma^2}\int_{(a\bar{S}_N)^C} \big(\|x\|_\beta - a\,\bar{r}_N\big)^2\, p(x)\, dx, \]
where $\underline{r}_N = r_N - 3 d_\beta$, $\bar{r}_N = r_N + 3 d_\beta$,
\[ c_U = \begin{cases} 1, & \beta \le 2 \\ k^{1/2 - 1/\beta}, & \beta > 2 \end{cases} \qquad c_L = \begin{cases} 1, & \beta \ge 2 \\ k^{1/2 - 1/\beta}, & \beta < 2. \end{cases} \]

Bounds to Granular Distortion

Bounds to granular distortion are based on bounds on the probability in each support cell. For $i = 1, \dots, N$, define
\[ \bar{y}_i = \arg\max_{x \in \mathrm{cl}(S_i)} p(x), \qquad \underline{y}_i = \arg\min_{x \in \mathrm{cl}(S_i)} p(x). \]  (2.5)
Note that the maximum (resp. minimum) exists, because $p$ is continuous and $\mathrm{cl}(S_i)$ is compact, but may not be unique. If the maximum (resp. minimum) is not unique, we allow $\bar{y}_i$ (resp. $\underline{y}_i$) to be any vector that achieves the maximum (resp. minimum).

Lemma 3: Upper and lower bounds to granular distortion are given by
\[ D_g(N,a) \le \bar{D}_g(N,a) \stackrel{\mathrm{def}}{=} \frac{M(S_0)\,v(S_0)^{1+2/k}\,a^{k+2}}{\sigma^2}\sum_{i=1}^N p(a\bar{y}_i) \]
and
\[ D_g(N,a) \ge \underline{D}_g(N,a) \stackrel{\mathrm{def}}{=} \frac{M(S_0)\,v(S_0)^{1+2/k}\,a^{k+2}}{\sigma^2}\sum_{i=1}^N p(a\underline{y}_i). \]
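The cell-by-cell argument behind Lemma 3 — replace the density on each cell by its maximum or minimum and pull it out of the integral, leaving the cell's moment of inertia — can be checked numerically; the sketch below does this in one dimension with cubic cells and a Gaussian density (the step size and cell range are arbitrary choices):

```python
import numpy as np

# One-dimensional check of the granular-distortion bounds: on each cell
# [y - a/2, y + a/2], the exact cell distortion lies between
# p_min * a^3/12 and p_max * a^3/12, since the cell's moment of inertia
# about its center is a^3/12.
a = 0.2
centers = a * np.arange(-10, 11)
p = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
lo = hi = exact = 0.0
for y in centers:
    t = np.linspace(y - a / 2, y + a / 2, 2001)
    exact += np.trapz((t - y) ** 2 * p(t), t)
    lo += p(t).min() * a ** 3 / 12
    hi += p(t).max() * a ** 3 / 12
print(lo, exact, hi)
```

As the cell size shrinks, the lower and upper sums pinch together, which is exactly what drives the high-resolution analysis of Section III.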

III. Main Results


In this section, lattice quantization is optimized asymptotically when the source is stationary, memoryless and generalized Gaussian. Proofs of results in this section are given in Section IV, except where noted. In order to find the optimal scaling factor and the resulting minimum distortion, we first optimize bounds to quantizer distortion. It will then be shown that the optimized bounds lead directly to the desired optimization.

Lemma 4: As $N \to \infty$, $\underline{a}_N, \bar{a}_N, a_N \to 0$ and $\underline{a}_N N^{1/k}, \bar{a}_N N^{1/k}, a_N N^{1/k} \to \infty$.

This holds because, as $N \to \infty$, the quantization cell diameter of an optimal quantizer must tend to zero and the radius of the support of an optimal quantizer, which is proportional to $a N^{1/k}$, must increase to infinity, or else distortion will not tend to zero.

A. Minimizing the Upper and Lower Bounds


We begin by optimizing the upper bound to distortion. The following result characterizes the scaling factor $\bar{a}_N$.

Proposition 5: The scaling factor $\bar{a}_N$ satisfies
\[ M(S_0)\,v(S_0)^{2/k}\,k\,c_U^{-2}\,(1 + o_N) = \frac{\underline{r}_N - d_\beta}{\bar{a}_N}\int_{\bar{a}_N\underline{r}_N}^\infty r\,p_R(r)\,dr - (\underline{r}_N - d_\beta)^2\int_{\bar{a}_N\underline{r}_N}^\infty p_R(r)\,dr \]  (2.6)
where $p_R(r)$ is given by (2.1).

Thus we have reduced the problem of characterizing the scaling factor $\bar{a}_N$ of a $k$-dimensional lattice quantizer to the solution of a one-dimensional integral equation. Note that (2.6) is similar to Equation (52) in Hui and Neuhoff [11], which characterizes the optimal support of uniform scalar quantization for all $N$. The integral equation in (2.6) leads to the following asymptotic expression for $\bar{a}_N$.

Proposition 6: For a stationary, memoryless, generalized Gaussian source with $\beta \ge 1$, $\bar{a}_N$ satisfies
\[ \lim_{N\to\infty} \frac{\bar{a}_N\,N^{1/k}}{(\ln N)^{1/\beta}} = \frac{1}{\eta}\left(\frac{2}{k}\right)^{1/\beta}\left(\frac{V_\beta}{v(S_0)}\right)^{1/k}. \]
The resulting minimum MSE of the upper bound to distortion satisfies
\[ \lim_{N\to\infty} \frac{\bar{D}_N}{(\ln N)^{2/\beta}\,N^{-2/k}} = \left(\frac{2}{k}\right)^{2/\beta}\frac{M(S_0)\,V_\beta^{2/k}}{\eta^2\sigma^2}. \]

This optimized upper bound indicates that lattice quantizer distortion decreases to zero at least as quickly as $(\ln N)^{2/\beta} N^{-2/k}$. We also see that when the lattice is scaled by $\bar{a}_N$, the overload distortion of the upper bound is asymptotically negligible, so that total distortion asymptotically equals granular distortion, i.e. $\bar{D}_N \cong \bar{a}_N^2\, M(S_0)\, v(S_0)^{2/k}/\sigma^2$ when $N$ is large. However, derivation of $\bar{a}_N$ requires consideration of both granular and overload distortion.

In order to provide more definitive conclusions about the distortion of lattice quantization, we now consider minimizing the lower bound to distortion. Because the upper and lower bounds have a similar form, minimizing the lower bound is very similar to minimizing the upper bound. Accordingly, we present the following results without proofs.
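In the simplest special case — $k = 1$, $\Lambda = \mathbb{Z}$, a unit-variance Gaussian source ($\beta = 2$, $\eta = 1/\sqrt{2}$, $M(S_0) = 1/12$, $v(S_0) = 1$, $V_\beta = 2$) — the asymptotic scaling factor reduces to $a_N \approx 4\sqrt{\ln N}/N$. The sketch below compares this with a brute-force numerical minimization of $D(N,a)$; the grids and codebook size are arbitrary choices, and agreement is only approximate at finite $N$:

```python
import numpy as np

# Brute-force check of the asymptotic scaling factor for uniform scalar
# quantization of a unit-variance Gaussian: distortion of an N-level
# uniform quantizer is computed by numerical integration and minimized
# over the step size a, then compared with a_N ~ 4 sqrt(ln N)/N.
N = 256
x = np.linspace(-9, 9, 360001)
p = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)

def distortion(a):
    # Codevectors a*(i - (N-1)/2); nearest-codevector (clipped rounding) rule.
    i = np.clip(np.round(x / a + (N - 1) / 2), 0, N - 1)
    q = a * (i - (N - 1) / 2)
    return np.trapz((x - q) ** 2 * p, x)

grid = np.linspace(0.02, 0.06, 161)
a_star = grid[np.argmin([distortion(a) for a in grid])]
a_asym = 4 * np.sqrt(np.log(N)) / N
print(a_star, a_asym)
```

At $N = 256$ the two values agree to within roughly ten percent; the asymptotic formula slightly over-predicts the support, consistent with the higher-order corrections in [11].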

Proposition 7: The scaling factor $\underline{a}_N$ satisfies
\[ M(S_0)\,v(S_0)^{2/k}\,k\,c_L^{-2}\,(1 + o_N) = \frac{\bar{r}_N}{\underline{a}_N}\int_{\underline{a}_N\bar{r}_N}^\infty r\,p_R(r)\,dr - \bar{r}_N^2\int_{\underline{a}_N\bar{r}_N}^\infty p_R(r)\,dr. \]

Proposition 8: For a stationary, memoryless, generalized Gaussian source with $\beta \ge 1$, $\underline{a}_N$ satisfies
\[ \lim_{N\to\infty} \frac{\underline{a}_N\,N^{1/k}}{(\ln N)^{1/\beta}} = \frac{1}{\eta}\left(\frac{2}{k}\right)^{1/\beta}\left(\frac{V_\beta}{v(S_0)}\right)^{1/k}. \]
The resulting minimum MSE of the lower bound to distortion satisfies
\[ \lim_{N\to\infty} \frac{\underline{D}_N}{(\ln N)^{2/\beta}\,N^{-2/k}} = \left(\frac{2}{k}\right)^{2/\beta}\frac{M(S_0)\,V_\beta^{2/k}}{\eta^2\sigma^2}. \]
Again, we see that overload distortion is asymptotically negligible.

B. Minimizing Lattice Quantizer Distortion


Combining Propositions 6 and 8 and Lemmas 2 and 3 immediately leads to the following result, presented without proof.

Proposition 9: For a stationary, memoryless, generalized Gaussian source with $\beta \ge 1$, the minimum MSE satisfies
\[ \lim_{N\to\infty} \frac{D_N}{(\ln N)^{2/\beta}\,N^{-2/k}} = \left(\frac{2}{k}\right)^{2/\beta}\frac{M(S_0)\,V_\beta^{2/k}}{\eta^2\sigma^2}. \]
The following result then characterizes the asymptotic behavior of $a_N$.

Proposition 10: For a stationary, memoryless, generalized Gaussian source with $\beta \ge 1$, $a_N$ satisfies
\[ \lim_{N\to\infty} \frac{a_N\,N^{1/k}}{(\ln N)^{1/\beta}} = \frac{1}{\eta}\left(\frac{2}{k}\right)^{1/\beta}\left(\frac{V_\beta}{v(S_0)}\right)^{1/k}. \]

Proposition 9 shows how scale-optimized lattice quantization distortion depends on the codebook size $N$, dimension $k$, fundamental cell $S_0$, and decay parameter $\beta$. High-resolution quantization theory often assumes or shows that overload distortion is negligible when analyzing VQ. For scale-optimized lattice quantization, as with the upper and lower bounds, the overload distortion is asymptotically negligible. Therefore for large $N$, $D_N$ has the form
\[ D_N \cong \frac{M(S_0)\,V_\beta^{2/k}}{\eta^2\sigma^2}\cdot\frac{(\ln N^{2/k})^{2/\beta}}{N^{2/k}}. \]  (2.7)

Let us compare (2.7) to $\delta_N = c\,N^{-2/k}$, which is the distortion of asymptotically optimal vector quantization [22], [23], where $c$ is a constant that depends on $\beta$ and $k$. We immediately see that $D_N/\delta_N \to \infty$ as $N \to \infty$. Although overload distortion is asymptotically negligible for scale-optimized lattice quantization, derivation of $a_N$ requires consideration of both granular and overload distortion. The fact that the support must grow to keep overload distortion small is what makes $D_N/\delta_N \to \infty$ as $N \to \infty$.

Further insight can be gained by expressing distortion as a function of rate. The distortion of scale-optimized lattice quantization with rate $R = (\log_2 N)/k$ is
\[ D(R) \cong \frac{M(S_0)\,V_\beta^{2/k}}{\eta^2\sigma^2}\,(\ln 4)^{2/\beta}\,R^{2/\beta}\,2^{-2R}, \]
while the distortion of asymptotically optimal quantization is $\delta(R) = c\,2^{-2R}$. Equivalently, the signal-to-noise ratio (SNR) of scale-optimized lattice quantization is
\[ S(R) = 6R - \frac{20}{\beta}\log_{10} R + c \ \text{dB}, \]
as compared to $6R + c$ dB for asymptotically optimal vector quantization, where the "$c$" constants depend on $\beta$ and $k$. In other words, both lattice quantization and optimal quantization have SNRs that increase as 6 dB per bit, but the difference between their SNRs increases to infinity as rate increases.

As $\beta \to \infty$, the GG distribution tends to a uniform distribution. For large $\beta$, assuming Gersho's conjecture [23] and that $\Lambda$ and $S_0$ are chosen optimally, it can be shown that $D_N \cong \delta_N$. Thus our results are consistent with the widely-held belief that lattice quantizers are optimal for uniformly distributed sources.
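The widening SNR gap can be made concrete numerically; a small sketch for $\beta = 2$ (the Gaussian case), where the penalty term $(20/\beta)\log_{10} R$ grows without bound even though both SNRs gain about 6 dB per bit (the rates below are arbitrary choices):

```python
import math

# SNR penalty of scale-optimized lattice quantization relative to optimal
# VQ: the gap (20/beta) * log10(R) dB grows without bound in the rate R.
beta = 2.0
gaps = {R: (20 / beta) * math.log10(R) for R in (2, 4, 8, 16)}
for R, g in gaps.items():
    print(R, round(g, 2))
```

At 8 bits per sample the lattice quantizer already gives up about 9 dB relative to the optimal-VQ benchmark, for this Gaussian case.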

C. Summary
We have derived asymptotic expressions for the optimal scaling factor and resulting minimum distortion of lattice vector quantization. This scheme assumes fixed-rate quantization and fixed dimension, and our results apply to stationary, memoryless, generalized Gaussian sources. It is shown that scale-optimized lattice quantizer distortion diverges from that of asymptotically optimal quantization as rate increases.

IV. Proofs and Derivations


A. Proof of Lemma 1
We have restricted attention to $\beta \ge 1$ because we make use of the triangle inequality in this and other proofs. In order for the inner support to make sense, we need its radius $a(r_N - 3 d_\beta)$ to be nonnegative. This occurs when $N$ is sufficiently large. Define
\[ \rho_N = \max_{y \in C_N} \|y\|_\beta. \]
For all $x \in S_N$,
\[ \|x\|_\beta \le \|x - Q_N(x)\|_\beta + \|Q_N(x)\|_\beta \le d_\beta + \rho_N \]
by the definition of $d_\beta$ and $\rho_N$. Therefore
\[ x \in \Omega_N \stackrel{\mathrm{def}}{=} \Lambda_\beta(\rho_N + d_\beta), \]
and it immediately follows that $S_N \subseteq \Omega_N$.

Next, observe that by the definition of $C_N$,
\[ \rho_N \le \min_{y \in \Lambda \setminus C_N} \|y\|_\beta. \]
For $x \in S_N^C$, let $y$ be any point in $\Lambda \setminus C_N$ that is closest to $x$ and notice that
\[ \|x\|_\beta \ge \|y\|_\beta - \|y - x\|_\beta > \rho_N - 2 d_\beta. \]
Therefore
\[ x \notin \omega_N \stackrel{\mathrm{def}}{=} \Lambda_\beta(\rho_N - 2 d_\beta), \]
which implies that $\omega_N \subseteq S_N$.

From the above, we know that $v(\omega_N) \le v(S_N) \le v(\Omega_N)$. Since $v(S_N) = v(S_{\mathrm{eq},N})$, we see that $v(\omega_N) \le v(S_{\mathrm{eq},N}) \le v(\Omega_N)$. Because $\omega_N$, $S_{\mathrm{eq},N}$ and $\Omega_N$ are $\beta$-spheres centered at $0$ with increasing volumes, it follows that they have increasing radii, i.e.,
\[ \rho_N - 2 d_\beta \le r_N \le \rho_N + d_\beta. \]
As a result, $r_N - 3 d_\beta \le \rho_N - 2 d_\beta$, and $\underline{S}_N \subseteq \omega_N \subseteq S_N$. Similarly, $\rho_N + d_\beta \le r_N + 3 d_\beta$, so it can be shown that $S_N \subseteq \bar{S}_N$, which completes the proof.

B. Proof of Lemma 2
To prove the upper bound, let $n_{a\underline{S}_N}(x)$ be the closest point in $a\underline{S}_N$ to $x$. Then for all $x \in (a\underline{S}_N)^C$,
\[ \|x - Q_N(x)\|_2 \le \|x - Q_N(n_{a\underline{S}_N}(x))\|_2 \le \|x - n_{a\underline{S}_N}(x)\|_2 + \|n_{a\underline{S}_N}(x) - Q_N(n_{a\underline{S}_N}(x))\|_2 \]  (2.8)
\[ \le c_U\big(\|x - n_{a\underline{S}_N}(x)\|_\beta + \|n_{a\underline{S}_N}(x) - Q_N(n_{a\underline{S}_N}(x))\|_\beta\big) \le c_U\big(\,[\|x\|_\beta - a(r_N - 3d_\beta)] + a\,d_\beta\,\big) \]  (2.9)
where (2.8) follows from the triangle inequality and (2.9) follows from (2.2). Since $(aS_N)^C \subseteq (a\underline{S}_N)^C$ by Lemma 1, we then have
\[ D_o(N,a) = \frac{1}{k\sigma^2}\int_{(aS_N)^C}\|x - Q_N(x)\|_2^2\,p(x)\,dx \le \frac{1}{k\sigma^2}\int_{(a\underline{S}_N)^C}\|x - Q_N(x)\|_2^2\,p(x)\,dx \le \frac{c_U^2}{k\sigma^2}\int_{(a\underline{S}_N)^C}\big(\|x\|_\beta - a(\underline{r}_N - d_\beta)\big)^2\,p(x)\,dx = \bar{D}_o(N,a). \]

To prove the lower bound, note that for all $x \in (a\bar{S}_N)^C$,
\[ \|x - Q_N(x)\|_2 \ge c_L\,\|x - Q_N(x)\|_\beta \ge c_L\,\|x - n_{a\bar{S}_N}(x)\|_\beta = c_L\big(\|x\|_\beta - a(r_N + 3d_\beta)\big) \]
where the first inequality follows from (2.2). Therefore, observing that $(a\bar{S}_N)^C \subseteq (aS_N)^C$ by Lemma 1,
\[ D_o(N,a) = \frac{1}{k\sigma^2}\int_{(aS_N)^C}\|x - Q_N(x)\|_2^2\,p(x)\,dx \ge \frac{1}{k\sigma^2}\int_{(a\bar{S}_N)^C}\|x - Q_N(x)\|_2^2\,p(x)\,dx \ge \frac{c_L^2}{k\sigma^2}\int_{(a\bar{S}_N)^C}\big(\|x\|_\beta - a\,\bar{r}_N\big)^2\,p(x)\,dx = \underline{D}_o(N,a), \]
which completes the proof.
C. Proof of Lemma 3
It can be easily shown that for $i = 1, \dots, N$,
\[ a\underline{y}_i = \arg\min_{x \in \mathrm{cl}(aS_i)} p(x) \]  (2.10)
\[ a\bar{y}_i = \arg\max_{x \in \mathrm{cl}(aS_i)} p(x). \]  (2.11)
To prove the upper bound, consider granular distortion and note that
\[ D_g(N,a) = \frac{1}{k\sigma^2}\sum_{i=1}^N\int_{aS_i}\|x - a y_i\|_2^2\,p(x)\,dx \le \frac{1}{k\sigma^2}\sum_{i=1}^N p(a\bar{y}_i)\int_{aS_i}\|x - a y_i\|_2^2\,dx = \frac{M(S_0)\,v(S_0)^{1+2/k}\,a^{k+2}}{\sigma^2}\sum_{i=1}^N p(a\bar{y}_i) = \bar{D}_g(N,a), \]
where the inequality follows from (2.11) and the penultimate equality follows from the definition of NMI and the fact that $v(aS_i) = a^k v(S_i)$. Using (2.10), it can similarly be shown that $D_g(N,a) \ge \underline{D}_g(N,a)$. This completes the proof of the lemma.

D. Proof of Proposition 5
It is easily seen that $\bar{a}_N$ is a solution to
\[ \frac{\partial}{\partial a}\bar{D}(N,a) = \frac{\partial}{\partial a}\bar{D}_g(N,a) + \frac{\partial}{\partial a}\bar{D}_o(N,a) = 0. \]
To evaluate the partial derivative of overload distortion, one can use Leibniz's rule to show that
\[ \frac{1}{a}\frac{\partial}{\partial a}\bar{D}_o(N,a) = \frac{2 c_U^2}{k\sigma^2}\left((\underline{r}_N - d_\beta)^2\int_{a\underline{r}_N}^\infty p_R(r)\,dr - \frac{\underline{r}_N - d_\beta}{a}\int_{a\underline{r}_N}^\infty r\,p_R(r)\,dr\right) + o_N. \]  (2.12)
To evaluate the partial derivative of granular distortion, it is straightforward to show that
\[ \frac{1}{a}\frac{\partial}{\partial a}\bar{D}_g(N,a) = \frac{M(S_0)\,v(S_0)^{2/k}}{\sigma^2}\left((k+2)\sum_{i=1}^N p(a\bar{y}_i)\,v(aS_i) - \beta\sum_{i=1}^N v(aS_i)\,p(a\bar{y}_i)\,(\eta\|a\bar{y}_i\|_\beta)^\beta\right). \]
Lemma A-1 in the Appendix shows that
\[ \sum_{i=1}^N p(a\bar{y}_i)\,v(aS_i) = 1 + o_N, \qquad \sum_{i=1}^N v(aS_i)\,p(a\bar{y}_i)\,(\eta\|a\bar{y}_i\|_\beta)^\beta = \eta^\beta\,E\|X\|_\beta^\beta\,(1 + o_N). \]
For a generalized Gaussian density,
\[ \eta^\beta\,E\|X\|_\beta^\beta = \frac{k}{\beta}. \]
Therefore,
\[ \frac{1}{a}\frac{\partial}{\partial a}\bar{D}_g(N,a) = \frac{M(S_0)\,v(S_0)^{2/k}}{\sigma^2}\big[(k+2)(1+o_N) - k(1+o_N)\big]. \]  (2.13)
Combining (2.12) and (2.13) proves the result. □
E. Proof of Proposition 6
To prove the asymptotic expression for $\bar{a}_N$, start by defining
\[ K(y) = \frac{1}{y}\int_y^\infty x\,p_R(x)\,dx, \qquad J(y) = \frac{\int_y^\infty (x - y)\,p_R(x)\,dx}{\int_y^\infty x\,p_R(x)\,dx}, \]
which were first considered in [11]. Rewriting (2.6) from Proposition 5 in terms of $K(\cdot)$ and $J(\cdot)$, using the identities $\int_y^\infty (x-y)\,p_R(x)\,dx = y\,K(y)J(y)$ and $\int_y^\infty p_R(x)\,dx = K(y)(1 - J(y))$, we see that $\bar{a}_N$ satisfies
\[ \alpha = (\underline{r}_N - d_\beta)\,K(\bar{a}_N\underline{r}_N)\,\big[\,d_\beta + (\underline{r}_N - d_\beta)\,J(\bar{a}_N\underline{r}_N)\,\big] + o_N \]  (2.14)
where $\alpha = M(S_0)\,v(S_0)^{2/k}\,k\,c_U^{-2}$. We then have the following lemma.

Lemma 11: $\bar{a}_N$ satisfies

(a) $\displaystyle \limsup_{N\to\infty}\,(\underline{r}_N - d_\beta)^2\,K(\bar{a}_N\underline{r}_N)\,J(\bar{a}_N\underline{r}_N) \le \alpha$

(b) $\displaystyle \liminf_{N\to\infty}\,\underline{r}_N\,(\underline{r}_N - d_\beta)\,K(\bar{a}_N\underline{r}_N) \ge \alpha$.

Proof: To show (a), start with (2.14) and note that $d_\beta \ge 0$ to obtain
\[ (\underline{r}_N - d_\beta)^2\,K(\bar{a}_N\underline{r}_N)\,J(\bar{a}_N\underline{r}_N) \le (\underline{r}_N - d_\beta)\,K(\bar{a}_N\underline{r}_N)\big[d_\beta + (\underline{r}_N - d_\beta)J(\bar{a}_N\underline{r}_N)\big] = \alpha + o_N \]
and take the lim sup. To show (b), use the bound $J(\bar{a}_N\underline{r}_N) \le 1$ in (2.14) to obtain
\[ \underline{r}_N\,(\underline{r}_N - d_\beta)\,K(\bar{a}_N\underline{r}_N) \ge (\underline{r}_N - d_\beta)\,K(\bar{a}_N\underline{r}_N)\big[d_\beta + (\underline{r}_N - d_\beta)J(\bar{a}_N\underline{r}_N)\big] = \alpha + o_N \]
and take the lim inf of the resulting inequality. □

Returning to the proof of Proposition 6, we note that $R$ is a generalized Gamma random variable, so from Hui and Neuhoff [11], as $y \to \infty$,

\[ K(y) = \exp\{-(\eta y)^\beta(1 + o(1))\}, \qquad J(y)\,K(y) = \exp\{-(\eta y)^\beta(1 + o(1))\}. \]
Evaluating at $\bar{a}_N\underline{r}_N$ yields
\[ K(\bar{a}_N\underline{r}_N) = \exp\{-(\eta\,\bar{a}_N\underline{r}_N)^\beta(1 + o_N)\}, \qquad K(\bar{a}_N\underline{r}_N)\,J(\bar{a}_N\underline{r}_N) = \exp\{-(\eta\,\bar{a}_N\underline{r}_N)^\beta(1 + o_N)\}. \]
Combining the above with (a) and (b) from Lemma 11 and the definition (2.4) of $r_N$ shows that $\bar{a}_N$ satisfies
\[ N^{2/k}\,\exp\{-(\eta\,\bar{a}_N r_N)^\beta(1 + o_N)\} = 1. \]  (2.15)
Taking the logarithm of both sides and using (2.4), we see that (2.15) is equivalent to
\[ \ln N^{2/k} = \left(\eta\,\bar{a}_N\left(\frac{v(S_0)}{V_\beta}\right)^{1/k} N^{1/k}\right)^\beta(1 + o_N). \]
Therefore, we see that asymptotically, the radius of the optimal quantizer support, $\bar{a}_N r_N$, grows as $(1/\eta)(\ln N^{2/k})^{1/\beta}$. This in turn implies that
\[ \frac{\bar{a}_N N^{1/k}}{(\ln N^{2/k})^{1/\beta}} = \frac{1}{\eta}\left(\frac{V_\beta}{v(S_0)}\right)^{1/k}(1 + o_N), \]
and taking the limit $N \to \infty$ gives the asymptotic expression for $\bar{a}_N$.

We now establish the asymptotic expression for $\bar{D}_N$. It is immediately seen that $\bar{D}_g(N, \bar{a}_N)$ satisfies
\[ \lim_{N\to\infty}\frac{\sigma^2\,\bar{D}_g(N,\bar{a}_N)}{\bar{a}_N^2} = M(S_0)\,v(S_0)^{2/k}\lim_{N\to\infty}\sum_{i=1}^N p(\bar{a}_N\bar{y}_i)\,v(\bar{a}_N S_i) = M(S_0)\,v(S_0)^{2/k}, \]  (2.16)
where the last equality follows from Lemma A-1. We now show that overload distortion is asymptotically negligible by showing that
\[ \lim_{N\to\infty}\frac{\int_{\bar{a}_N\underline{r}_N}^\infty \big(r - \bar{a}_N(\underline{r}_N - d_\beta)\big)^2\,p_R(r)\,dr}{\bar{a}_N^2} = 0. \]  (2.17)
We first observe that, expanding the square and applying the Cauchy-Schwarz inequality to the cross term,
\[ \frac{\int_{\bar{a}_N\underline{r}_N}^\infty \big(r - \bar{a}_N(\underline{r}_N - d_\beta)\big)^2\,p_R(r)\,dr}{\bar{a}_N^2} \le \frac{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr}{\bar{a}_N^2} + \frac{2 d_\beta}{\bar{a}_N}\sqrt{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr} + d_\beta^2\int_{\bar{a}_N\underline{r}_N}^\infty p_R(r)\,dr, \]
where
\[ \lim_{N\to\infty}\int_{\bar{a}_N\underline{r}_N}^\infty p_R(r)\,dr = 0, \]
which follows from Lemma 4. Therefore, in order to show (2.17), it is sufficient to show
\[ \lim_{N\to\infty}\frac{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr}{\bar{a}_N^2} = 0. \]  (2.18)
To show (2.18), we note the identity
\[ \frac{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr}{\bar{a}_N^2} = \underline{r}_N^2\,K(\bar{a}_N\underline{r}_N)\,J(\bar{a}_N\underline{r}_N)\cdot\frac{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr}{\bar{a}_N\underline{r}_N\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)\,p_R(r)\,dr}. \]
Therefore, since Lemma 11 shows that $\underline{r}_N^2\,K(\bar{a}_N\underline{r}_N)\,J(\bar{a}_N\underline{r}_N)$ is bounded,
\[ \limsup_{N\to\infty}\frac{\int_{\bar{a}_N\underline{r}_N}^\infty (r - \bar{a}_N\underline{r}_N)^2\,p_R(r)\,dr}{\bar{a}_N^2} \le c\,\limsup_{y\to\infty}\frac{\int_y^\infty (r - y)^2\,p_R(r)\,dr}{y\int_y^\infty (r - y)\,p_R(r)\,dr} = c\,\limsup_{y\to\infty}\frac{2\int_y^\infty (r - y)\,p_R(r)\,dr}{y\int_y^\infty p_R(r)\,dr - \int_y^\infty (r - y)\,p_R(r)\,dr} = c\,\limsup_{y\to\infty}\frac{2 J(y)}{1 - 2 J(y)} = 0, \]
where the second equality follows from L'Hopital's rule and where the last equality follows because $J(y) \to 0$ as $y \to \infty$ for generalized Gamma densities [11]. Therefore, (2.18) holds, which establishes the asymptotic negligibility of $\bar{D}_o(N, \bar{a}_N)$. Proposition 6 then follows from (2.16).

F. Proof of Proposition 10
Note that
\[ \underline{D}(\underline{a}_N) \le D(a_N) \le D(\bar{a}_N) \le \bar{D}(\bar{a}_N). \]  (2.19)
Propositions 6 and 8 imply that $\bar{D}(\bar{a}_N)/\underline{D}(\underline{a}_N) \to 1$, and hence by (2.19),
\[ \lim_{N\to\infty}\frac{D(a_N)}{D(\bar{a}_N)} = 1. \]  (2.20)
We will show that if a sequence of scaling factors $\tilde{a}_N$ satisfies
\[ \lim_{N\to\infty}\frac{D(\tilde{a}_N)}{D(\bar{a}_N)} = 1, \]  (2.21)
then
\[ \lim_{N\to\infty}\frac{\tilde{a}_N}{\bar{a}_N} = 1. \]  (2.22)
With $\tilde{a}_N = a_N$, this fact, together with (2.20), then proves the proposition. We will prove that (2.21) implies (2.22) by showing the contrapositive, namely, if (2.22) does not hold, then (2.21) does not hold. Accordingly, we assume that there exists $\epsilon > 0$ and a sequence of integers increasing to infinity, denoted $\{N_n\}_{n=1}^\infty$, such that either
\[ \frac{\tilde{b}_n}{b_n} > 1 + \epsilon \quad \text{for all } n \]  (2.23)
or
\[ \frac{\tilde{b}_n}{b_n} < 1 - \epsilon \quad \text{for all } n, \]  (2.24)
where $\tilde{b}_n = \tilde{a}_{N_n}$ and $b_n = \bar{a}_{N_n}$. We will show that this implies
\[ \frac{D(\tilde{b}_n)}{D(b_n)} > 1 + \epsilon \quad \text{for all sufficiently large } n, \]  (2.25)
which in turn will imply that (2.21) does not hold.

Without loss of generality, assume $\epsilon < 1$. Note that $b_n \to 0$ and $b_n N_n^{1/k} \to \infty$, because $\bar{a}_N$ has these properties and $b_n$ is a subsequence of $\bar{a}_N$. We also assume $\tilde{b}_n$ has these properties, because if not, it is obvious that $D(\tilde{b}_n) \not\to 0$, whereas $D(b_n) \to 0$, so (2.25) holds. Because $b_n$ and $\tilde{b}_n$ have these properties, an argument similar to that in (2.16) shows that $D_g(\tilde{b}_n) = \gamma\,\tilde{b}_n^2(1 + o_n)$ and $D_g(b_n) = \gamma\,b_n^2(1 + o_n)$, where $\gamma$ is a constant.

Now suppose (2.23) holds. It is easily shown that a condition similar to (2.18) holds for $D(b_n)$, which implies that $D(b_n) = D_g(b_n)(1 + o_n)$. Therefore, we see that
\[ \frac{D(\tilde{b}_n)}{D(b_n)} = \frac{D_g(\tilde{b}_n) + D_o(\tilde{b}_n)}{D_g(b_n) + D_o(b_n)} \ge \frac{D_g(\tilde{b}_n)}{D_g(b_n)(1 + o_n)} = \frac{\tilde{b}_n^2}{b_n^2}(1 + o_n) \ge (1 + \epsilon)^2(1 + o_n), \]
so that for large enough $n$, $D(\tilde{b}_n)/D(b_n) > 1 + \epsilon$, which establishes (2.25).

Next suppose (2.24) holds. We will show later that
\[ \lim_{n\to\infty}\frac{D_o(\tilde{b}_n)}{D_g(b_n)} = \infty. \]  (2.26)
It will follow that
\[ \frac{D(\tilde{b}_n)}{D(b_n)} = \frac{D_g(\tilde{b}_n) + D_o(\tilde{b}_n)}{D_g(b_n) + D_o(b_n)} \ge \frac{D_o(\tilde{b}_n)}{D_g(b_n)(1 + o_n)} \to \infty, \]
so that for $n$ large enough, $D(\tilde{b}_n)/D(b_n) > 1 + \epsilon$, which again establishes (2.25).

In order to prove (2.26), since $D_o(\tilde{b}_n) \ge \underline{D}_o(\tilde{b}_n)$ and $D_g(b_n) = \gamma\,b_n^2(1 + o_n)$, we will show that
\[ \lim_{n\to\infty}\frac{\int_{\tilde{b}_n r_n}^\infty (r - \tilde{b}_n r_n)^2\,p_R(r)\,dr}{b_n^2} = \infty, \]  (2.27)
where $r_n = \bar{r}_{N_n}$. It can be shown that [11]
\[ \int_y^\infty (r - y)^2\,p_R(r)\,dr = c\,\exp[-(\eta y)^\beta]\,(1 + o(1)). \]
The above fact, together with $\tilde{b}_n < (1 - \epsilon)\,b_n$ and the monotonicity of the left-hand side in $y$, shows that
\[ \int_{\tilde{b}_n r_n}^\infty (r - \tilde{b}_n r_n)^2\,p_R(r)\,dr \ge \int_{(1-\epsilon)b_n r_n}^\infty \big(r - (1-\epsilon)b_n r_n\big)^2\,p_R(r)\,dr = c\,\exp\big[-\big((1-\epsilon)\,\eta\,b_n r_n\big)^\beta\big](1 + o_n). \]  (2.28)
Proposition 8 implies that
\[ b_n = \frac{1}{\eta}\left(\frac{2}{k}\right)^{1/\beta}\left(\frac{V_\beta}{v(S_0)}\right)^{1/k}\frac{(\ln N_n)^{1/\beta}}{N_n^{1/k}}(1 + o_n), \]
so that for large enough $n$,
\[ \big((1-\epsilon)\,\eta\,b_n r_n\big)^\beta = (1-\epsilon)^\beta\,\frac{2}{k}\,(\ln N_n)(1 + o_n) \le (1 - \epsilon/2)\,\ln N_n^{2/k}. \]  (2.29)
Then, for large enough $n$, the right-hand side of (2.28) satisfies
\[ c\,\exp\big[-\big((1-\epsilon)\,\eta\,b_n r_n\big)^\beta\big](1 + o_n) \ge c\,\exp\big[-(1 - \epsilon/2)\ln N_n^{2/k}\big](1 + o_n) = c\,N_n^{-2/k}\,N_n^{\epsilon/k}\,(1 + o_n). \]  (2.30)
In order to prove (2.27), note that (2.28)-(2.30), together with $b_n^2 = c\,(\ln N_n)^{2/\beta}\,N_n^{-2/k}(1 + o_n)$, imply that
\[ \lim_{n\to\infty}\frac{\int_{\tilde{b}_n r_n}^\infty (r - \tilde{b}_n r_n)^2\,p_R(r)\,dr}{b_n^2} \ge c\,\lim_{n\to\infty}\frac{N_n^{\epsilon/k}}{(\ln N_n)^{2/\beta}}(1 + o_n) = \infty. \]
This completes the proof of (2.26) and the proof of the proposition.

Appendix: Riemann Convergence


This Appendix proves an intuitive lemma concerning the convergence of a Riemann sum to its Riemann integral. Although the result is well known for functions with compact support, convergence is not obvious when the function has unbounded support.

Lemma A-1: If $b_N \to 0$ and $b_N N^{1/k} \to \infty$ as $N \to \infty$, $\tilde{y}_i \in S_i$ for all $i$, and $0 \le \rho < \infty$, then
\[ \lim_{N\to\infty}\sum_{i=1}^N \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) = E\|X\|_\beta^\rho. \]

Remark: This lemma explicitly includes the case $\rho = 0$, which is needed in the proof of Proposition 5.
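The convergence claimed by Lemma A-1 can be watched numerically in one dimension with cubic cells and a Gaussian density ($\rho = 2$, so the limit is $E|X|^2 = 1$); the cell sizes below are arbitrary choices, and the sum is truncated well into the Gaussian tail:

```python
import numpy as np

# Riemann-sum convergence over an unbounded support (Lemma A-1, k = 1,
# rho = 2, Gaussian p): sum (b i)^2 p(b i) b -> E|X|^2 = 1 as b -> 0,
# with cell centers at b*i and cell volume b.
p = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
sums = []
for b in (0.5, 0.1, 0.02):
    y = b * np.arange(-int(20 / b), int(20 / b) + 1)  # cell centers b*i
    sums.append(np.sum(y ** 2 * p(y) * b))
print(sums)
```

The point of the lemma is exactly that truncating at any fixed radius loses a tail contribution that can be made uniformly small, which is what makes the unbounded-support case work.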

Proof: Notice that the sum is over all support cells. We will group the support cells and separately evaluate the portion of the above sum due to each of these groups. In order to establish these groups, we fix $s < \infty$, to be chosen later, and consider a $\beta$-sphere with radius $s$. The various groups are determined by the relation of the support cells to this sphere. To be specific, define
\[ I^{(1)}_{N,b_N} = \{\,i : 1 \le i \le N,\ b_N S_i \subseteq \Lambda_\beta(s)\,\}, \qquad R^{(1)}_{N,b_N} = \bigcup_{i \in I^{(1)}_{N,b_N}} b_N S_i. \]
Note that, for each $N$, every index in $I^{(1)}_{N,b_N}$ corresponds to a support cell that is entirely contained within $\Lambda_\beta(s)$. We also define
\[ I^{(3)}_{N,b_N} = \{\,i : 1 \le i \le N,\ b_N S_i \subseteq \Lambda_\beta(s)^C\,\}, \qquad R^{(3)}_{N,b_N} = \bigcup_{i \in I^{(3)}_{N,b_N}} b_N S_i, \]
\[ I^{(2)}_{N,b_N} = \{\,i : i \notin I^{(1)}_{N,b_N},\ i \notin I^{(3)}_{N,b_N}\,\}, \qquad R^{(2)}_{N,b_N} = \bigcup_{i \in I^{(2)}_{N,b_N}} b_N S_i. \]
Next we show that for each $N$,
\[ R^{(2)}_{N,b_N} \subseteq \Lambda_\beta(s + b_N d_\beta) \setminus \Lambda_\beta(s - 2 b_N d_\beta). \]  (2.31)
By the construction of $R^{(2)}_{N,b_N}$, it is easily seen that for each $i \in I^{(2)}_{N,b_N}$, there exists a point $z \in b_N S_i$ such that $\|z\|_\beta = s$. Let $x$ be any point in $b_N S_i$. Then
\[ \|x\|_\beta \le \|x - z\|_\beta + \|z\|_\beta \le b_N d_\beta + s. \]
This holds for all $x \in b_N S_i$ and all $i \in I^{(2)}_{N,b_N}$, so $R^{(2)}_{N,b_N} \subseteq \Lambda_\beta(s + b_N d_\beta)$. Similarly, for any $i \in I^{(2)}_{N,b_N}$ and any $x \in b_N S_i$, we have
\[ \|x\|_\beta \ge \|z\|_\beta - \|z - x\|_\beta > s - 2 b_N d_\beta, \]
which in turn shows that $R^{(2)}_{N,b_N} \subseteq \Lambda_\beta(s - 2 b_N d_\beta)^C$. This establishes (2.31).

Using (2.31) and the fact that $p(x)$ is bounded, there exists an $M$ such that
\[ \sum_{i \in I^{(2)}_{N,b_N}} \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) \le (s + b_N d_\beta)^\rho\, M \sum_{i \in I^{(2)}_{N,b_N}} v(b_N S_i) \le (s + b_N d_\beta)^\rho\, M\, v\big(\Lambda_\beta(s + b_N d_\beta) \setminus \Lambda_\beta(s - 2 b_N d_\beta)\big), \]
which implies that
\[ \sum_{i \in I^{(2)}_{N,b_N}} \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) = o_N \]  (2.32)
since $b_N \to 0$. Since $\Lambda_\beta(s)$ is bounded and $p(x)$ is continuous on $\Lambda_\beta(s)$, we see that
\[ \sum_{i \in I^{(1)}_{N,b_N}} \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) = \int_{\Lambda_\beta(s)} \|x\|_\beta^\rho\, p(x)\, dx + o_N \]  (2.33)
for all $s$, because as $N \to \infty$, $b_N \to 0$ and $\bigcup_{i=1}^N b_N S_i$ eventually covers $\Lambda_\beta(s)$, which follows from the fact that $b_N N^{1/k} \to \infty$. In other words, the left-hand side of (2.33) converges to its Riemann integral.

Next, we will show that
\[ \sum_{i \in I^{(3)}_{N,b_N}} \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) \le E(s) + o_N \]  (2.34)
where $E(s)$ does not depend on $N$ and $E(s) \to 0$ as $s \to \infty$. In order to show (2.34), we note that for any $x \in b_N S_i$ with $i \in I^{(3)}_{N,b_N}$,
\[ \|b_N\tilde{y}_i\|_\beta \le \|x\|_\beta + b_N d_\beta \le 2\,\|x\|_\beta \]  (2.35)
\[ p(b_N\tilde{y}_i) = p(b_N\underline{y}_i)\,\exp\big\{\eta^\beta\big(\|b_N\underline{y}_i\|_\beta^\beta - \|b_N\tilde{y}_i\|_\beta^\beta\big)\big\} \le 2\, p(b_N\underline{y}_i) \le 2\, p(x), \]  (2.36)
where $\underline{y}_i$ was defined in (2.5), and where the last inequality in (2.35) and the middle inequality in (2.36) assume sufficiently large $N$. Therefore,
\[ \limsup_{N\to\infty} \sum_{i \in I^{(3)}_{N,b_N}} \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) \le 2^{\rho+1}\limsup_{N\to\infty} \sum_{i \in I^{(3)}_{N,b_N}} \int_{b_N S_i} \|x\|_\beta^\rho\, p(x)\, dx \le 2^{\rho+1}\int_{\Lambda_\beta(s)^C} \|x\|_\beta^\rho\, p(x)\, dx \stackrel{\mathrm{def}}{=} E(s), \]
which establishes (2.34); $E(s) \to 0$ as $s \to \infty$ because $E\|X\|_\beta^\rho < \infty$. Combining (2.32)-(2.34), we then see that
\[ \limsup_{N\to\infty}\left|\sum_{i=1}^N \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) - E\|X\|_\beta^\rho\right| \le \int_{\Lambda_\beta(s)^C} \|x\|_\beta^\rho\, p(x)\, dx + E(s). \]
Since the above holds for all $s$, and both terms on the right-hand side tend to zero as $s \to \infty$,
\[ \limsup_{N\to\infty}\left|\sum_{i=1}^N \|b_N\tilde{y}_i\|_\beta^\rho\, p(b_N\tilde{y}_i)\, v(b_N S_i) - E\|X\|_\beta^\rho\right| = 0, \]
which completes the proof. □


References
[1] J.H. Conway and N.J.A. Sloane, Sphere Packings, Lattices, and Groups, 2nd ed. New York: Springer-Verlag, 1993.
[2] J.D. Gibson and K. Sayood, "Lattice quantization," Advances in Electronics and Electron Physics, vol. 72, pp. 259-330, 1988.
[3] J. Max, "Quantizing for minimum distortion," IEEE Trans. Inform. Theory, vol. IT-6, pp. 7-12, Mar. 1960.
[4] M.D. Paez and T.H. Glisson, "Minimum mean-squared-error quantization in speech PCM and DPCM systems," IEEE Trans. Commun., vol. COM-20, pp. 225-230, Apr. 1972.
[5] W.A. Pearlman and G.H. Senge, "Optimum quantization of the Rayleigh probability distribution," IEEE Trans. Commun., vol. COM-27, pp. 101-112, Jan. 1979.
[6] J.A. Bucklew and N.C. Gallagher, Jr., "Some properties of uniform step size quantizers," IEEE Trans. Inform. Theory, vol. IT-26, pp. 610-613, Sept. 1980.
[7] F.-S. Lu and G.L. Wise, "A simple approximation for minimum mean-squared error symmetric uniform quantization," IEEE Trans. Commun., vol. COM-32, pp. 470-474, Apr. 1984.
[8] G. Vasquez, "Comments on 'A simple approximation for minimum mean-squared error symmetric uniform quantization'," IEEE Trans. Commun., vol. COM-34, pp. 298-300, Mar. 1986.
[9] G.M. Roe, "Quantizing for minimum distortion," IEEE Trans. Inform. Theory, vol. IT-10, pp. 384-385, Oct. 1964.
[10] V.R. Algazi, "Useful approximations to optimum quantization," IEEE Trans. Commun., vol. COM-14, pp. 297-301, Jun. 1966.
[11] D. Hui and D.L. Neuhoff, "Asymptotic analysis of optimal fixed-rate uniform scalar quantization," submitted to IEEE Trans. Inform. Theory, Aug. 1997.
[12] K. Sayood, J.D. Gibson and M.C. Rost, "An algorithm for uniform vector quantizer design," IEEE Trans. Inform. Theory, vol. IT-30, pp. 805-814, Nov. 1984.
[13] M.C. Rost and K. Sayood, "The root lattices as low bit rate vector quantizers," IEEE Trans. Inform. Theory, vol. 34, pp. 1053-1058, Sept. 1988.
[14] D.G. Jeong and J.D. Gibson, "Lattice vector quantization for image coding," in Proc. 1989 IEEE Int. Conf. Acoust., Speech, Signal Processing, Glasgow, Scotland, May 23-26, pp. 1743-1746.
[15] M.V. Eyuboglu and G.D. Forney, Jr., "Lattice and trellis quantization with lattice- and trellis-bounded codebooks - high-rate theory for memoryless sources," IEEE Trans. Inform. Theory, vol. 39, pp. 46-59, Jan. 1993.
[16] F. Chen, Z. Gao and J. Villasenor, "Lattice vector quantization of generalized Gaussian sources," IEEE Trans. Inform. Theory, vol. 43, pp. 92-103, Jan. 1997.
[17] C. Pepin, J.-C. Belfiore and J. Boutros, "Quantization of both stationary and nonstationary Gaussian sources with Voronoi constellations," in Proc. 1997 IEEE Int. Symp. on Information Theory (Ulm, Germany, June 1997), p. 59.
[18] D.G. Jeong and J.D. Gibson, "Uniform and piecewise uniform lattice vector quantization for memoryless Gaussian and Laplacian sources," IEEE Trans. Inform. Theory, vol. 39, pp. 786-804, May 1993.
[19] T. Eriksson and E. Agrell, "Lattice-based quantization, Part II," Report No. 18, Department of Information Theory, Chalmers University of Technology, Goteborg, Sweden, Oct. 1996.
[20] A. Wilansky, Functional Analysis. New York: Blaisdell, 1964.
[21] R.M. Gray, Source Coding Theory. Boston: Kluwer, 1990.
[22] P.L. Zador, "Topics in the asymptotic quantization of continuous random variables," Bell Laboratories Technical Memorandum, 1966.
[23] A. Gersho, "Asymptotically optimal block quantization," IEEE Trans. Inform. Theory, vol. IT-25, pp. 373-380, July 1979.

44

Figure 2.1: The optimal scaling factor aN for N fixed. (The figure shows the curves Do(N,a), D(N,a), and Dg(N,a).)

Figure 2.2: Illustration of inner and outer supports.

CHAPTER III

Optimal Compressor Functions for Multidimensional Companding of Memoryless Sources^1

I. Introduction
Multidimensional companding is a type of structured vector quantization (VQ) with low complexity. As shown in Figure 3.1, a compander maps a k-dimensional source vector X to a k-dimensional vector Y in some rectangular support region, using a continuous compressor function f. The transformed source vector Y is quantized by a lattice quantizer to Ŷ, which is then mapped by f^{-1} to X̂, the reproduction of X. In this work, only fixed-rate quantization (i.e. no entropy coding) is considered. The lattice quantizer has as its codevectors all points from some infinite lattice that are contained in the rectangle. When given a compressor function and a desired number of codevectors N, we assume that the lattice is maximally scaled so that the number of lattice points in the corresponding support rectangle is at most N. We also assume that the partition of the lattice quantizer is
^1 This work was supported by an NSF Graduate Fellowship and by NSF Grant NCR-9415754 and was submitted for publication in the IEEE Transactions on Information Theory, with co-author David L. Neuhoff. Part of this work was presented at the 1997 IEEE International Symposium on Information Theory in Ulm.


generated by tessellating some fundamental cell T0 at points of the lattice. (We will not be concerned with the manner in which points outside the support rectangle are assigned to cells.) The resulting compander is a VQ whose codevectors and partition are those of the lattice quantizer transformed by f^{-1}.

In pioneering work, Bennett [1] argued that any scalar quantizer (e.g. an optimal one) can be implemented by companding, and he heuristically derived an asymptotic expression for the mean squared error (MSE) of scalar companding. Specifically, when N is large, the companding MSE is given by

D(N) = (1/(12 N^2)) ∫ p(x) / f'(x)^2 dx

where p(x) is the probability density of the source and f'(x) is the derivative of the compressor function. From Panter and Dite [11] one concludes that the compressor function f that minimizes MSE is given by

f(x) = c ∫_{-∞}^{x} p(y)^{1/3} dy

where c is chosen so that f(x) → 1 as x → ∞. Using the optimal compressor in Bennett's formula shows that asymptotically optimal scalar quantizers have MSE given by

D*(N) = (1/(12 N^2)) ||p||_{1/3}

where ||p||_r = (∫ p(x)^r dx)^{1/r}.
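These two scalar formulas are easy to check numerically. The sketch below is our own illustration, not part of the dissertation: it assumes a N(0,1) source with N = 64 levels, and uses the fact that p^{1/3} for a unit Gaussian is proportional to a N(0,3) density, so the optimal compressor is the N(0, √3) CDF.

```python
import math, random, statistics as st

# For a N(0,1) source, p(x)^{1/3} is proportional to a N(0,3) density, so the
# Panter-Dite compressor f is the N(0, sqrt(3)) CDF, mapping R onto (0, 1).
f = st.NormalDist(0.0, math.sqrt(3))
N = 64                                    # number of quantizer levels

def compand_quantize(x):
    u = f.cdf(x)                          # compress
    cell = min(int(u * N), N - 1)         # uniform quantizer on (0, 1)
    return f.inv_cdf((cell + 0.5) / N)    # expand

random.seed(0)
n = 200_000
mse = sum((x - compand_quantize(x)) ** 2
          for x in (random.gauss(0, 1) for _ in range(n))) / n

# ||p||_{1/3} = 3^{3/2} * 2*pi for the N(0,1) density, so D*(N) = ||p||_{1/3}/(12 N^2).
theory = 3 * math.sqrt(3) * 2 * math.pi / (12 * N * N)
print(mse, theory)
```

For N = 64 the Monte Carlo MSE agrees with D*(N) to within a few percent, as the high-resolution approximation predicts.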

For k-dimensional companding, Bucklew [2] extended Bennett's scalar result to show that when N is large and the fundamental cell T0 is white, in the sense that the components of a random vector uniformly distributed over T0 are white, the compander MSE D(f, N) is approximately given by D(f, N) ≅ DB(f, N), where

DB(f, N) = (1/N^{2/k}) M(T0) G(f)        (3.1)

and^2

G(f) = v(f(R^k))^{2/k} ∫ (1/k) ||F^{-1}(x)||^2 p(x) dx        (3.2)

where v denotes k-dimensional volume, f(R^k) is the image of the compressor function f, T0 is the fundamental cell of the lattice Λ [4], F(x) is the derivative matrix of the compressor function f(x), F^{-1}(x) is the inverse of the matrix F(x), and ||·|| is the l2 matrix norm.^3 M(T0) is the normalized moment of inertia (NMI) of T0, given by M(T0) = k^{-1} v(T0)^{-(1+2/k)} ∫_{T0} ||x||^2 dx. Bucklew also showed that an optimal two-dimensional vector quantizer for a source with a circularly symmetric density cannot be implemented using a companding structure and a differentiable compressor function. See also Gersho [5]. Subsequently, Bucklew [3] showed that asymptotically optimal vector quantizers, for vector dimension three and greater, cannot be implemented using a companding structure, except for a very restricted class of source densities.

An important open question remains: What is the best multidimensional compressor function for an arbitrary source? This is a difficult unsolved problem. As a first step, in this paper we find the asymptotically optimal compressor function for a memoryless source, under certain technical conditions on the compressor function. The result agrees with intuition and shows that the best k-dimensional compressor function consists of the best scalar compressor functions for each component of the source vector. Though we suspect that this result holds over a broader class of compressors, we were not able to show this. We also compare the MSE of optimal multidimensional companding to that of optimal VQ and optimal scalar companding. For example, we show that companding suffers the same point density and oblongitis

^2 In [2], G(f) did not contain v(f(R^k))^{2/k}, because it was assumed that f(R^k) = (0,1)^k.
^3 That is, the sum of squares of the matrix elements.
losses as scalar quantization, but recovers the cubic loss. The results in this paper can be easily generalized to pth-power distortion measures. An outline of the paper is as follows. The main result is stated and discussed in Section II. Optimal companding is compared to optimal VQ and scalar quantization in Section III. A derivation of the main result is presented in Section IV, with some details given in the Appendix.

II. Main Result


Let the source vector be a random vector X with probability density

p(x) = ∏_{i=1}^{k} pi(xi),   pi continuous, i = 1, ..., k.        (3.3)

We consider a multidimensional compander that consists of a compressor function f : R^k → R^k, a lattice quantizer constructed from a lattice Λ, and the inverse function f^{-1}. The lattice Λ is assumed to have a fundamental cell T0 that is white.^4 The support region of the lattice quantizer is a rectangle of the form R(a,b) := (a1,b1) × ... × (ak,bk) for some a, b ∈ R^k. Without loss of generality, we assume ai < bi for all i.

Definition 1  For a, b ∈ R^k, a function f : R^k → R^k is boundary limited to R(a,b) if for each i = 1, ..., k, fi(x) → ai when xi → −∞ for any fixed x1, ..., x_{i−1}, x_{i+1}, ..., xk, and fi(x) → bi when xi → +∞ for any fixed x1, ..., x_{i−1}, x_{i+1}, ..., xk.

In this paper, we only consider compressor functions f in a class C, as defined below.

^4 In any dimension, the integer lattice and the optimal lattice are white [14].

Definition 2  The class C is the set of all mappings f : R^k → R^k that are onto and boundary limited to some rectangle R(a,b), and whose derivative matrix F(x) is continuous in x and positive definite^5 for all x.

We are interested in finding the compressor function that minimizes G(f) over the class C, and the resulting minimum MSE; that is,

f* = arg min_{f ∈ C} G(f),    DB*(N) = DB(f*, N).        (3.4)

The following is the main result of this paper.

Proposition 1  Let the source be memoryless, as given by (3.3). Then over the class C of compressor functions, there exists a minimum of G(f), and a compressor function achieves the minimum if and only if it is a memoryless compressor of the form f*(x) = (f1*(x), f2*(x), ..., fk*(x))^T, where

fi*(x) = c ||pi||_{1/3}^{1/6} ∫_{-∞}^{xi} pi(y)^{1/3} dy + ai,   i = 1, ..., k,

for some c ∈ R and a ∈ R^k. The image of the resulting compressor function is R(a, b*), where bi* = ai + c ||pi||_{1/3}^{1/2}, i = 1, ..., k. The resulting MSE is, asymptotically,

DB*(N) = (M(T0) / N^{2/k}) ( ∏_{j=1}^{k} ||pj||_{1/3} )^{1/k}        (3.5)

where T0 is the white fundamental cell of the lattice Λ.

The proof of this proposition is given in Section IV.
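As an independent numerical illustration of Proposition 1 (ours, not part of the dissertation's own numerics): take k = 2, Λ = Z^2 (so M(T0) = 1/12), X1 ~ N(0,1), and X2 ~ Laplace(1). The optimal compander is then a pair of scalar companders with levels split in proportion to ||pi||_{1/3}^{1/2}; the closed-form norms used below (3^{3/2}·2π for the Gaussian, 108 for the Laplacian) and the particular level split 47 × 87 are our own choices.

```python
import math, random, statistics as st

# ||p||_{1/3} = (integral of p^{1/3})^3, in closed form (our computation):
n1 = 3 * math.sqrt(3) * 2 * math.pi   # N(0,1): 3^{3/2} * 2*pi ~ 32.65
n2 = 108.0                            # Laplace(scale 1): (6 / 2^{1/3})^3

# Level allocation N_i proportional to ||p_i||_{1/3}^{1/2}; 47 * 87 codevectors.
N1, N2 = 47, 87
N = N1 * N2

# Optimal per-component compressors: the CDF of the normalized p_i^{1/3}.
g = st.NormalDist(0.0, math.sqrt(3))  # Gaussian^{1/3} ~ N(0, 3) density

def lap_cdf(x):                       # CDF of Laplace(scale 3)
    return 0.5 * math.exp(x / 3) if x < 0 else 1 - 0.5 * math.exp(-x / 3)

def lap_inv(u):                       # its inverse
    return 3 * math.log(2 * u) if u < 0.5 else -3 * math.log(2 * (1 - u))

def quant(u, levels):                 # uniform quantizer on (0,1), cell centers
    cell = min(int(u * levels), levels - 1)
    return (cell + 0.5) / levels

random.seed(2)
nsamp = 200_000
err = 0.0
for _ in range(nsamp):
    x1 = random.gauss(0, 1)
    x2 = random.expovariate(1.0) * (1 if random.random() < 0.5 else -1)
    y1 = g.inv_cdf(quant(g.cdf(x1), N1))
    y2 = lap_inv(quant(lap_cdf(x2), N2))
    err += (x1 - y1) ** 2 + (x2 - y2) ** 2

mse = err / (2 * nsamp)                        # per-component MSE
theory = (1 / 12) / N * math.sqrt(n1 * n2)     # (3.5) with k = 2, M(T0) = 1/12
print(mse, theory)
```

With Λ = Z^2 the compander is exactly a product quantizer, so the simulated per-component MSE should track (3.5) to within high-resolution and Monte Carlo error.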

Remarks:

^5 A positive definite matrix with real elements is symmetric (cf. [8]).

1. We do not assume that the compressor functions in C are one-to-one. However, Proposition 1 demonstrates that the optimal compressor functions are one-to-one.

2. The optimal compressor function operates on each component of the source vector independently, applying the optimal compressor function for scalar companding.

3. We have restricted attention to compressor functions with a rectangular image. It is possible that a compressor function with a non-rectangular image could yield lower MSE than f*. However, the resulting optimal image might have a very complex shape, making the implementation of the lattice quantizer very complex. This would defeat the motivating idea behind companding, namely that it have low complexity.

4. If the ||pi||_{1/3} are all equal, then the optimal image is a cube. If not, then the optimal image is a rectangle, which induces a kind of rate allocation among the components of X, with Xi receiving bits in proportion to log2 bi*. This is easiest to see when Λ = Z^k, the k-dimensional integer lattice, for in this case companding is simply product quantization, and the optimal compander implements an optimal product quantizer, which, as is well known, assigns

Ni = ( ||pi||_{1/3}^{1/2} / ( ∏_{j=1}^{k} ||pj||_{1/3}^{1/2} )^{1/k} ) N^{1/k}

levels to Xi. But it also holds for any other white lattice, as does the fact that the resulting distortions are the same for each component, which is well known for optimal product quantizers.

5. If f(R^k) were restricted to be a cube, then one may see from the proof of Proposition 1 that the optimal compressor function f~ would have component functions

f~i(x) := ( c / ||pi||_{1/3}^{1/3} ) ∫_{-∞}^{xi} pi(y)^{1/3} dy + ai

where c ∈ R and a ∈ R^k. In this case, the loss in MSE of companding to a cube instead of to the optimal rectangle is given by

DB(f~, N) / DB(f*, N) = ( (1/k) Σ_{j=1}^{k} ||pj||_{1/3} ) / ( ∏_{j=1}^{k} ||pj||_{1/3} )^{1/k} ≥ 1

where the inequality follows from the arithmetic-geometric mean inequality.
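The quantities in Remarks 4 and 5 are easy to evaluate for concrete densities. The sketch below is our own check, not from the text: it computes ||p||_{1/3} by crude numerical integration for a unit Gaussian and a unit-scale Laplacian (the closed-form values 3^{3/2}·2π and 108 are our computations), then evaluates the cube-versus-rectangle loss of Remark 5 for a k = 2 source with one component of each type.

```python
import math

# ||p||_{1/3} = (integral of p(x)^{1/3} dx)^3, by a simple Riemann sum.
def norm_13(pdf, lo=-60.0, hi=60.0, step=1e-3):
    s, x = 0.0, lo
    while x < hi:
        s += pdf(x) ** (1 / 3) * step
        x += step
    return s ** 3

gauss = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
laplace = lambda x: 0.5 * math.exp(-abs(x))

n_g = norm_13(gauss)      # closed form: 3^{3/2} * 2*pi ~ 32.65
n_l = norm_13(laplace)    # closed form: (6 / 2^{1/3})^3 = 108

# Remark 5's cube-vs-rectangle loss for k = 2: arithmetic mean of the norms
# over their geometric mean (>= 1, with equality only for equal norms).
loss = ((n_g + n_l) / 2) / math.sqrt(n_g * n_l)
print(n_g, n_l, loss)
```

For this mixed Gaussian/Laplacian pair the loss works out to roughly 1.18 (about 0.7 dB), illustrating that the penalty for a cubic image is real but modest.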

III. Comparison to Optimal Vector Quantization


It is interesting to compare the MSE of an arbitrary compander, given by (3.1), to the asymptotic form of δk(N), the MSE of optimal vector quantization. Zador [13] and Gersho [5] showed that for large N,

δk(N) ≅ (mk / N^{2/k}) ||p||_{k/(k+2)}

under the assumption that optimal VQ has cells that are congruent to the tessellating polytope with least NMI, mk, in k dimensions. In order to facilitate this comparison, we provide a point density, cell-shape analysis in the style of [10]. Recall from [10] that Bennett's integral gives the MSE of a VQ with many points in terms of two key characteristics, the point density λ(x) and inertial profile m(x). Specifically, Bennett's integral is given by

B(λ, m) = (1/N^{2/k}) ∫ ( m(x) / λ(x)^{2/k} ) p(x) dx.

A VQ that achieves δk(N) has optimal point density^6 λ*(x) = c p(x)^{k/(k+2)} and optimal inertial profile m*(x) = mk. Let the point density and inertial profile of

^6 Constants in this section make their respective functions integrate to one.

companding be denoted by λC(x) and mC(x), respectively. Then the loss of an arbitrary multidimensional compander relative to optimal VQ is

Lcomp = DB(f, N) / δk(N) = B(λC, mC) / B(λ*, m*) = [ B(λC, mC) / B(λC, mk) ] × [ B(λC, mk) / B(λ*, mk) ] = Lce × Lpt

where Lce is the loss due to suboptimal inertial profile (or cell shape) and Lpt is the loss due to suboptimal point density. It is easily shown that the point density and inertial profile of companding are given by

λC(x) = c |det F(x)|,
mC(x) ≅ M(T0) k^{-1} ||F^{-1}(x)||^2 (det F(x))^{2/k} ≥ M(T0)        (3.6)

where the approximate equality in (3.6) assumes, as usual, that the fundamental cell of the lattice is white, and where the inequality, which is proved in Appendix I, holds with equality if and only if F(x) is a scaling of an orthogonal matrix. Equation (3.6) reflects the fact that the cells of the compander are stretched versions of T0, except where and only where F(x) is a scaling of an orthogonal transformation. We can isolate the effect of this stretching, or oblongation, by decomposing the cell shape loss into Lce = Lob × Lce,T0, where Lob is the oblongitis loss [10], and Lce,T0 = M(T0)/mk is the cell shape loss of the lattice, which is independent of the compressor function f. Therefore, the loss of arbitrary companding relative to optimal VQ is expressed as Lcomp = Lpt × Lob × Lce,T0. Note that companding achieves Lce,T0 = 1 if and only if it uses a lattice Λ such that T0 has NMI equal to mk.

One would like to choose the compressor f to minimize Lpt × Lob. On one hand, companding can achieve the optimal point density if the component functions of the compressor function are given by

fi(x) = c ∫_{-∞}^{xi} pi(y)^{k/(k+2)} dy        (3.7)

for i = 1, ..., k. In this case Lpt = 1, but it can be shown that this causes Lob = ∞. In other words, when the compander operates with a compressor function specified by (3.7), MSE does not decrease as N^{-2/k}. On the other hand, a compander achieves Lob = 1 if for all x, k^{-1} ||F^{-1}(x)||^2 (det F(x))^{2/k} = 1, which is approximately true if and only if the compressor function is a scaling of an orthogonal transformation on some large probability set.^7 In this case Lpt > 1. As a result, we see that companding cannot simultaneously achieve Lpt = 1 and Lob = 1, and therefore cannot achieve the performance of optimal VQ, which agrees with Bucklew's analysis [2]. In fact the best compressor f*, presented in Proposition 1, is a compromise which yields a compander with Lpt > 1 and Lob > 1.

It is immediately seen that companding makes the same tradeoff as optimal product VQ [10] (it generates the same point density), and therefore has the same Lpt and Lob. Companding's sole advantage over product VQ is its ability to choose a lattice that achieves Lce,T0 = 1, whereas product VQ must incur cell shape loss due to cubic cells, that is, Lce,cube > 1. Therefore, optimal companding suffers the same point density and oblongitis losses as optimal product VQ while recovering the cubic loss.

Table 3.1 summarizes the losses of optimal companding and optimal product VQ, relative to optimal VQ, for an IID Gaussian source. Note that, in dB, Lcomp = Lpt + Lob, which is the shape loss^8 of optimal scalar quantization relative to optimal VQ, defined by Lookabaugh and Gray [9]. On the other hand, optimal product VQ has loss Lprod = Lpt + Lob + Lce,cube, which shows that companding achieves the space-filling advantage of vector quantization over scalar [9], or equivalently, the inverse of the cubic loss of optimal product quantization relative to optimal vector

^7 See Appendix I.
^8 Actually, [9] defines the inverse of this to be the shape advantage of vector quantization over scalar quantization.

quantization [10].

  k     Lce,cube   Lob     Lpt     Lcomp   Lprod
        (dB)       (dB)    (dB)    (dB)    (dB)
  2     0.17       0.64    0.50    1.14    1.30
  3     0.26       0.75    0.86    1.61    1.87
  4     0.39       0.81    1.06    1.87    2.27
  ∞     1.53       1.71    1.10    2.81    4.35

Table 3.1: Losses of optimal companding, Lcomp, and optimal product VQ, Lprod, for an IID Gaussian source.
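The finite-k Lpt and Lob entries of Table 3.1 can be reproduced from closed-form Gaussian integrals. The formulas below are our own derivation, not the dissertation's: they use ∫φ^r dx = r^{-1/2}(2π)^{(1-r)/2} for the N(0,1) density φ, together with the product point density λC ∝ p^{1/3} per component, and they match the table to within its rounding.

```python
import math

def I(r):
    # integral of phi(x)^r dx for the N(0,1) density phi: r^{-1/2}*(2*pi)^{(1-r)/2}
    return r ** -0.5 * (2 * math.pi) ** ((1 - r) / 2)

def losses_db(k):
    # Point-density and oblongitis losses (in dB) of companding with the
    # product point density (p_i^{1/3} per component) relative to optimal VQ,
    # for an IID N(0,1) source of dimension k.
    a = I(1 - 2 / (3 * k)) ** k                    # integral of p^{1-2/(3k)} over R^k
    l_pt = I(1 / 3) ** 2 * a / I(k / (k + 2)) ** (k + 2)
    l_ob = I(1 / 3) / a
    return 10 * math.log10(l_pt), 10 * math.log10(l_ob)

for k in (2, 3, 4):
    lpt, lob = losses_db(k)
    print(k, round(lpt, 2), round(lob, 2), round(lpt + lob, 2))
```

For k = 2, 3, 4 these evaluate to (Lpt, Lob) of roughly (0.51, 0.62), (0.86, 0.75), and (1.07, 0.80) dB, with sums matching the Lcomp column.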

IV. Derivation of Main Result


In this section we find the minimum of the function G(f) over C, as defined in Section II. Our basic approach is to use variational calculus arguments to show that f* is the optimal compressor function. It is a well-known fact that a stationary point of a convex functional on a convex set is a minimizing function. Upon initial inspection of the companding problem, it is evident that G(f) is not a convex functional and that C is not a convex set (due to the onto assumption), so that we cannot apply the above fact directly. However, we will consider certain subclasses of C and decompose G(f) into the product of two functionals, one of which is strictly convex on each subclass and the other of which is constant on each subclass. Though these subclasses are not convex, it will nevertheless be possible to minimize G(f) over them using the above fact. We then perform an auxiliary minimization that leads to the main result.

We have defined C so that we can use the above approach for optimization. Continuity of the derivatives and the boundary-limited assumptions are needed to show that a function is a stationary point. The assumption that f be onto R(a,b) is needed to establish the aforementioned product decomposition of G(f). Finally, we require that the derivative matrix of f be positive definite to ensure the convexity of H(f) and so that the appropriately chosen subclass of C is convex.


As an aside, we note that in the scalar case, Gish and Pierce [6] derived the optimal compressor function using variational techniques, and subsequently Gray and Gray [7] presented a different proof using Hölder's inequality. However, we have not been able to develop a proof using Hölder's inequality for the multidimensional case. To begin the derivation of the main result, we define the subclasses of C as follows.

Definition 3  For a, b ∈ R^k, the class Cab is the subset of C consisting of those f that are onto and boundary limited to R(a,b).

We wish to minimize G(f) over C; let G* denote the resulting minimum value. Then

G* = min_{f ∈ C} G(f) = min_{a,b} G*_{ab}

where

G*_{ab} = min_{f ∈ Cab} G(f).

It will be shown that the minima G* and G*_{ab} exist. We decompose G(f) into

G(f) = V(f) H(f)

where

V(f) = v(f(R^k))^{2/k} = ( ∫_{f(R^k)} dy )^{2/k} = ( ∫_{R^k} det F(x) dx )^{2/k},

H(f) = ∫_{R^k} h(f, F, x) dx,   where   h(f, F, x) = (1/k) ||F^{-1}(x)||^2 p(x).

To find G*, we first find G*_{ab} for arbitrary a, b and then perform the auxiliary minimization of G* over the choice of a, b. Since V(f) = ∏_{i=1}^{k} (bi − ai)^{2/k} for all f ∈ Cab,

we have

G*_{ab} = ∏_{i=1}^{k} (bi − ai)^{2/k} min_{f ∈ Cab} H(f)

and the minimum will be shown to exist. Therefore, it remains to minimize H(f) on Cab. Though Cab is not convex (due to the onto assumption), relaxing the onto assumption leads to a larger class C̃ab that is convex, and, fortunately, such that the compressor that minimizes H(f) over this larger class is in Cab.

Definition 4  For a, b ∈ R^k, the class C̃ab is the set of all mappings f : R^k → R^k that are boundary limited to R(a,b) and whose derivative matrix F(x) is continuous in x and positive definite for all x.

We see immediately that Cab ⊂ C̃ab, and it is easy to check that C̃ab is convex. Lemma A-1 shows that H(f) is strictly convex on C̃ab, and Lemma A-2 shows that fab := [fab,1, ..., fab,k]^T, where

fab,i(x) := ( (bi − ai) / ||pi||_{1/3}^{1/3} ) ∫_{-∞}^{xi} pi(y)^{1/3} dy + ai,

is a stationary point of H(f) on C̃ab. (The definitions of convexity and stationary point are given in Appendix II.) Since H(f) is strictly convex, we see that fab uniquely minimizes H(f) on C̃ab. Moreover, since fab ∈ Cab and Cab ⊂ C̃ab, we see that fab uniquely minimizes H(f) on Cab, as well. It follows immediately that fab also uniquely minimizes G(f) on Cab. Substituting fab into G(f) yields

G*_{ab} = ∏_{i=1}^{k} (bi − ai)^{2/k} × (1/k) Σ_{i=1}^{k} ||pi||_{1/3} / (bi − ai)^2.

To minimize G*_{ab} over the choice of a, b, we apply the arithmetic-geometric mean inequality. Specifically, for any a, b,

G*(a, b) = ∏_{i=1}^{k} (bi − ai)^{2/k} × (1/k) Σ_{i=1}^{k} ||pi||_{1/3} / (bi − ai)^2
         ≥ ∏_{i=1}^{k} (bi − ai)^{2/k} × ( ∏_{i=1}^{k} ||pi||_{1/3} / (bi − ai)^2 )^{1/k}
         = ( ∏_{j=1}^{k} ||pj||_{1/3} )^{1/k}

with equality if and only if ||pi||_{1/3} / (bi − ai)^2 is the same for all i, or equivalently, if there exists a constant c such that bi − ai = c ||pi||_{1/3}^{1/2}, i = 1, ..., k. It follows then that

G* = ( ∏_{j=1}^{k} ||pj||_{1/3} )^{1/k}

and that the compressor functions that achieve G* are f*(x) = [f1*(x), ..., fk*(x)]^T, where

fi*(x) = c ||pi||_{1/3}^{1/6} ∫_{-∞}^{xi} pi(y)^{1/3} dy + ai,   i = 1, ..., k,

where c ∈ R and a ∈ R^k are arbitrary. The image and MSE formulas stated in the Proposition follow directly.
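The auxiliary minimization over the side lengths si = bi − ai can be checked directly. In the sketch below (our own illustration), the norms ni stand in for ||pi||_{1/3} (the specific values are arbitrary), and G(s) is G*_{ab} viewed as a function of the sides; the claimed minimum (∏ ni)^{1/k} is attained at si = c·ni^{1/2} for any c > 0.

```python
import math, random

# G*_{ab} as a function of the side lengths s_i = b_i - a_i, for fixed
# one-dimensional norms n_i standing for ||p_i||_{1/3}.
def G(sides, norms):
    k = len(sides)
    prod = math.prod(s ** (2 / k) for s in sides)
    return prod * sum(n / s ** 2 for n, s in zip(norms, sides)) / k

norms = [32.65, 108.0, 5.0]
k = len(norms)
g_star = math.prod(norms) ** (1 / k)        # claimed minimum value

# The claimed minimizer: s_i = c * n_i^{1/2}; any c > 0 gives the same G.
best = [math.sqrt(n) for n in norms]
print(G(best, norms), G([2.7 * s for s in best], norms), g_star)

# Random side lengths never beat the claimed minimum.
random.seed(3)
worst_gap = min(G([random.uniform(0.1, 10) for _ in norms], norms) - g_star
                for _ in range(10_000))
print(worst_gap)
```

The scale invariance in c reflects the fact that G*_{ab} depends only on the shape of the support rectangle, not its overall size.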

Appendix I
Inertial Profile Inequality

In this appendix, we prove the lower bound on the inertial profile of companding given in (3.6). Expressing the inertial profile as a function of the eigenvalues λ1(x), λ2(x), ..., λk(x) of F(x), we have

m(x) = M(T0) k^{-1} ||F^{-1}(x)||^2 (det F(x))^{2/k}
     = M(T0) ( (1/k) Σ_{i=1}^{k} λi(x)^{-2} ) ( ∏_{i=1}^{k} λi(x) )^{2/k}
     ≥ M(T0) ( ∏_{i=1}^{k} λi(x)^{-2} )^{1/k} ( ∏_{i=1}^{k} λi(x) )^{2/k}
     = M(T0)

where it is easily seen that ||F^{-1}(x)||^2 = tr[F^{-1}(x)^T F^{-1}(x)] = Σ_{i=1}^{k} λi(x)^{-2}, and where the inequality follows from the arithmetic-geometric mean inequality. We have equality if and only if all λi(x)^2 are equal for all x, which occurs when and only when F(x) is a scaling of an orthogonal transformation.
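Since the bound depends only on the eigenvalues of F(x), it can be spot-checked numerically; the sketch below is our own illustration, not part of the proof.

```python
import math, random

def m_ratio(eigs):
    # k^{-1} * ||F^{-1}||^2 * (det F)^{2/k} for a symmetric positive definite F
    # with eigenvalues eigs; this is m_C(x) / M(T0) in (3.6).
    k = len(eigs)
    return (sum(1 / e ** 2 for e in eigs) / k) * math.prod(eigs) ** (2 / k)

random.seed(4)
vals = [m_ratio([random.uniform(0.2, 5) for _ in range(4)]) for _ in range(10_000)]
print(min(vals))                 # always >= 1, per the AM-GM argument
print(m_ratio([1.7] * 4))        # equal eigenvalues (orthogonal scaling): exactly 1
```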

Appendix II
Definitions

This appendix introduces certain definitions and facts that will be needed in the sequel. Let G denote a linear space of continuous functions, and let F denote a convex subset of G; that is, α f1 + (1 − α) f2 ∈ F whenever f1, f2 ∈ F and 0 ≤ α ≤ 1.

Let J be a functional defined on F. For any f1, f2 ∈ F, the first variation of J at f1 in the direction towards f2 is given by

δJ(f1; f2 − f1) := (∂/∂α) J(f1 + α(f2 − f1)) |_{α=0},

assuming that the derivative exists at α = 0. From now on, we assume that δJ(f1; f2 − f1) is well defined for all f1, f2 ∈ F.

A function f0 ∈ F is said to be a stationary point of J on F if

δJ(f0; f − f0) = 0 for all f ∈ F.

If J has a minimum point on F, then it is either a stationary point or a point in the boundary of F. The functional J is said to be convex on the convex set F if

J(α f1 + (1 − α) f2) ≤ α J(f1) + (1 − α) J(f2) whenever f1, f2 ∈ F and 0 ≤ α ≤ 1,

and strictly convex if the inequality is strict whenever f1 ≠ f2. Equivalently [12, Def. 3.1], J is convex if

J(f2) − J(f1) ≥ δJ(f1; f2 − f1) for all f1, f2 ∈ F,

and strictly convex if the inequality is strict whenever f2 ≠ f1. If f is a stationary point of a convex functional J, then f minimizes J. Furthermore, f is the unique minimizer of J on F if J is strictly convex.

Appendix III
Key Lemmas

This appendix proves the two key lemmas needed in the proof of Proposition 1 in Section IV. Specifically, we will show that H(f) is strictly convex on C̃ab and that fab is a stationary point of H(f) on C̃ab. We will need the definitions and facts of the previous appendix. Throughout, we assume C is the class defined in Section II. One may straightforwardly check that δH(f1; f2 − f1) is well defined for all f1, f2 ∈ C.

Lemma A-1  The function H(f) is strictly convex on C̃ab.

Proof: By the condition for convexity given in the previous appendix, it suffices to show that for all f1, f2 ∈ C̃ab, the function H(f) satisfies

H(f2) − H(f1) ≥ δH(f1; f2 − f1)        (3.8)

with equality if and only if f1(x) = f2(x) for all x. Let F1(x) and F2(x) be the derivative matrices of f1 and f2. By the definitions of H and δH, the above is equivalent to

∫ [ ||F2^{-1}(x)||^2 − ||F1^{-1}(x)||^2 ] p(x) dx ≥ ∫ (∂/∂α) || (F1(x) + α(F2(x) − F1(x)))^{-1} ||^2 |_{α=0} p(x) dx.

Lemma A-3, given later, shows that the function ||A^{-1}||^2 is convex on the space of positive definite matrices A. By the definition of C, it follows from such convexity that

||F2^{-1}(x)||^2 − ||F1^{-1}(x)||^2 ≥ (∂/∂α) || (F1(x) + α(F2(x) − F1(x)))^{-1} ||^2 |_{α=0}   for all x,

with equality for some x if and only if F1(x) = F2(x). It now follows directly from this that (3.8) holds, with equality if and only if F1(x) = F2(x) for all x, which in turn is equivalent to f1(x) = f2(x) for all x, because compressors in C̃ab are continuous and boundary limited to R(a,b). (Since the f's (and F's) are continuous, if they are equal almost everywhere, they are equal everywhere.) This completes the proof of the lemma. □

Lemma A-2  fab is a stationary point of H(f) on C̃ab.

Proof: It is easily verified that fab ∈ C̃ab. To show that fab is a stationary point of H(f) on C̃ab, we will show that

δH(fab; f − fab) = 0 for all f ∈ C̃ab.        (3.9)

Fix f ∈ C̃ab, define g = f − fab, and let G be the derivative matrix of g. Using a Taylor series expansion,

(∂/∂α) H(fab + α g)|_{α=0}
  = ∫ (∂/∂α) h(fab + α g, Fab + α G, x)|_{α=0} dx
  = ∫ (∂/∂α) [ h(fab, Fab, x) + α Σ_{i=1}^{k} Σ_{j=1}^{k} (∂/∂Fij) h(fab, Fab, x) Gij(x) + o(α) ]|_{α=0} dx
  = Σ_{i=1}^{k} Σ_{j=1}^{k} [ ∫ (∂/∂Fij) h(fab, Fab, x) Gij(x) dx ].

In order to show (3.9), we will show that for i = 1, ..., k and j = 1, ..., k,

∫ (∂/∂Fij) h(fab, Fab, x) Gij(x) dx = 0        (3.10)

for all f such that f ∈ C̃ab, where g = f − fab.

First consider (3.10) for i ≠ j. From the definition of h and the fact that Fab is diagonal, the derivative of h reduces to

(∂/∂Fij) h(fab, Fab, x) = (1/k) p(x) (∂/∂Fij) ||Fab^{-1}(x)||^2 = 0 for all x.

Next consider (3.10) for i = j. Using the formula for f* in Proposition 1, we find

(∂/∂Fii) h(fab, Fab, x) = (1/k) p(x) (∂/∂Fii) ||Fab^{-1}(x)||^2 = −(2/(k c^3)) ∏_{m≠i} pm(xm).

Therefore,

∫ (∂/∂Fii) h(fab, Fab, x) Gii(x) dx = −(2/(k c^3)) ∫_{-∞}^{∞} ··· ∫_{-∞}^{∞} ∏_{m≠i} pm(xm) [ ∫_{-∞}^{∞} Gii(x) dxi ] ∏_{m≠i} dxm = 0

because

∫_{-∞}^{∞} Gii(x) dxi = lim_{xi→+∞} gi(x) − lim_{xi→−∞} gi(x) = 0

for all x1, ..., x_{i−1}, x_{i+1}, ..., xk, by the fundamental theorem of calculus and the fact that, since g is the difference between two functions that are boundary limited to R(a,b), gi(x) → 0 when xi → −∞ for any fixed x1, ..., x_{i−1}, x_{i+1}, ..., xk, and gi(x) → 0 when xi → +∞ for any fixed x1, ..., x_{i−1}, x_{i+1}, ..., xk. Note that this argument relies on the continuity of Gii; therefore, it is necessary that the diagonal entries of Fab be continuous. By assumption, the densities pi are continuous, which implies that the Fab,ii = c pi(xi)^{1/3}, i = 1, ..., k, are continuous. It follows that (3.10) holds for all i, j, which implies that fab satisfies (3.9), and completes the proof of the lemma. □

Lemma A-3  The function ||A^{-1}||^2 is strictly convex on the set of positive definite matrices.

Proof: Note that ||A^{-1}||^2 = tr[(A^{-1})^T A^{-1}]. We will show that for A, B positive definite,

tr[(B^{-1})^T B^{-1}] − tr[(A^{-1})^T A^{-1}] ≥ (∂/∂α) tr[ {(A + α(B − A))^{-1}}^T (A + α(B − A))^{-1} ] |_{α=0}        (3.11)

with equality if and only if A = B.

Because they are both positive definite, A and B can be written as A = C I C^T and B = C D C^T, where D is a diagonal matrix with positive diagonal elements d11, d22, ..., dkk [8, Cor. 7.6.5]. Then the left-hand side of (3.11) can be written as

tr[(B^{-1})^T B^{-1}] − tr[(A^{-1})^T A^{-1}] = tr[ {(C^{-1})^T C^{-1}}^T (C^{-1})^T C^{-1} ((D^{-1})^2 − I) ]        (3.12)

since DM = MD for any matrix M and any diagonal matrix D. We then see that (3.12) is given by

tr[ {(C^{-1})^T C^{-1}}^T (C^{-1})^T C^{-1} ((D^{-1})^2 − I) ] = Σ_{i=1}^{k} si (1/dii^2 − 1)        (3.13)

where si is the sum of column i of the matrix (C^{-1})^T C^{-1}. In order to simplify the right-hand side of (3.11), we note that

tr[ {(A + α(B − A))^{-1}}^T (A + α(B − A))^{-1} ]
  = tr[ {(C^{-1})^T C^{-1}}^T (C^{-1})^T C^{-1} ( [(1 − α)I + α D]^{-1} )^2 ]
  = Σ_{i=1}^{k} si / [1 + α(dii − 1)]^2        (3.14)

where (3.14) is derived using an argument similar to that in (3.13). Now the right-hand side of (3.11) is given by

(∂/∂α) tr[ {(A + α(B − A))^{-1}}^T (A + α(B − A))^{-1} ] |_{α=0} = (∂/∂α) Σ_{i=1}^{k} si / [1 + α(dii − 1)]^2 |_{α=0} = −2 Σ_{i=1}^{k} si (dii − 1).

Therefore, (3.11) holds if and only if

Σ_{i=1}^{k} si (1/dii^2 − 1) ≥ −2 Σ_{i=1}^{k} si (dii − 1)        (3.15)

where si > 0. Since D and I are diagonal matrices with positive diagonal elements, it is sufficient to show that f(x) = Σ_{i=1}^{k} si xi^{-2} is strictly convex on {x : xi > 0}, which can be easily shown. Therefore (3.15) holds, with equality if and only if dii = 1 for all i, or equivalently when D = I and therefore A = B. □
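The scalar inequality (3.15) underlying the proof is just the convexity of x ↦ 1/x² applied termwise; the quick randomized check below is our own illustration, not the dissertation's.

```python
import random

# Discrete core of Lemma A-3: phi(d) = sum_i s_i / d_i^2 on positive d.
# Inequality (3.15) says phi(d) - phi(1) >= grad phi(1) . (d - 1), i.e.
# sum s_i (1/d_i^2 - 1) >= -2 sum s_i (d_i - 1), strict unless all d_i = 1.
random.seed(5)
for _ in range(10_000):
    k = 5
    s = [random.uniform(0.1, 3) for _ in range(k)]
    d = [random.uniform(0.05, 6) for _ in range(k)]
    lhs = sum(si * (1 / di ** 2 - 1) for si, di in zip(s, d))
    rhs = -2 * sum(si * (di - 1) for si, di in zip(s, d))
    assert lhs >= rhs - 1e-9
print("inequality (3.15) verified on random instances")
```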

References
[1] W.R. Bennett, "Spectra of quantized signals," Bell Syst. Tech. J., vol. 27, pp. 446-472, July 1948.
[2] J.A. Bucklew, "Companding and random quantization in several dimensions," IEEE Trans. Inform. Theory, vol. IT-27, pp. 207-211, Mar. 1981.
[3] J.A. Bucklew, "A note on optimal multidimensional companders," IEEE Trans. Inform. Theory, vol. IT-29, p. 279, Mar. 1983.
[4] J.H. Conway and N.J.A. Sloane, Sphere Packings, Lattices, and Groups, 2nd ed. New York: Springer-Verlag, 1993.
[5] A. Gersho, "Asymptotically optimal block quantization," IEEE Trans. Inform. Theory, vol. IT-25, pp. 373-380, July 1979.
[6] H. Gish and J.N. Pierce, "Asymptotically efficient quantizing," IEEE Trans. Inform. Theory, vol. IT-14, pp. 676-683, Sept. 1968.
[7] R.M. Gray and A.H. Gray, Jr., "Asymptotically optimal quantizers," IEEE Trans. Inform. Theory, vol. IT-23, pp. 143-144, Jan. 1977.
[8] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge: Cambridge University Press, 1985.
[9] T.D. Lookabaugh and R.M. Gray, "High-resolution quantization theory and the vector quantizer advantage," IEEE Trans. Inform. Theory, vol. 35, pp. 1020-1033, Sept. 1989.
[10] S. Na and D.L. Neuhoff, "Bennett's integral for vector quantizers," IEEE Trans. Inform. Theory, vol. 41, pp. 886-900, July 1995.
[11] P. Panter and W. Dite, "Quantization distortion in pulse-count modulation with nonuniform spacing of levels," Proc. IRE, vol. 39, pp. 44-48, Jan. 1951.
[12] J.L. Troutman, Variational Calculus and Optimal Control: Optimization with Elementary Convexity, 2nd ed. New York: Springer-Verlag, 1996.
[13] P.L. Zador, "Topics in the asymptotic quantization of continuous random variables," Bell Laboratories Technical Memorandum, 1966.
[14] R. Zamir and M. Feder, "On lattice quantization noise," IEEE Trans. Inform. Theory, vol. 42, pp. 1152-1159, July 1996.
Figure 3.1: Block diagram of multidimensional companding. (X → compressor function f : R^k → R^k → Y → lattice quantizer → Ŷ → expander function f^{-1} → X̂.)

CHAPTER IV

Polar Quantization Revisited^1

I. Introduction
In polar quantization (PQ), a two-dimensional random vector is quantized in terms of its magnitude and phase. This is advantageous for many sources, especially those with spherically symmetric densities. Several different polar quantization methods have been proposed [1]-[11]; optimality conditions and numerical methods of determining and optimizing performance have been developed [1, 2, 3, 12, 13, 4, 6, 7, 8, 9, 11, 14]; and high-resolution analyses have been developed that determine and optimize performance when distortion is small and rate is large [12, 13, 8, 15, 16, 17, 10, 14]. Such analyses are, by and large, individually tailored to the specific polar method being analyzed. In this paper, we show that for several polar schemes, high-resolution analyses can be performed in a unified manner, by using Bennett's integral for vector quantizers (VQ). We also analyze and optimize polar quantization with a uniform magnitude quantizer, using recent high-resolution results on the optimal step size and support of uniform scalar quantization for sources
^1 This work was supported by NSF Grant NCR-9415754 and is to be submitted for publication in the IEEE Transactions on Information Theory, with co-author David L. Neuhoff. Part of this work was presented at the 1997 IEEE International Symposium on Information Theory in Ulm.

with unbounded support. Previously, such uniform polar quantization had been analyzed only with numerical methods and with high-resolution methods that assumed the support of the magnitude quantizer was independent of the encoding rate.

Let the source vector X = (X1, X2) be zero mean with joint density pX(x) and variance σ² = (1/2) E||X||² < ∞. The magnitude of X is given by M = (X1² + X2²)^{1/2} and the phase is given by Θ = tan^{-1}(X2/X1). The random vector is spherically symmetric if pX(x) depends only on ||x||. Equivalently, M and Θ are independent, Θ has a uniform distribution, and

pX(x) = (1/(2π ||x||)) pM(||x||)

where pM(m) is the density of M. If the components of X are independent and identically distributed (IID) Gaussian random variables, then M is Rayleigh.

Polar quantization employs a uniform scalar quantizer for the phase.^2 In restricted polar quantization, the magnitude and phase are quantized independently. Note that the resulting two-dimensional quantization cells are not Voronoi regions, as in shape-gain VQ [9]. Let NM and NΘ denote the numbers of levels in the magnitude and phase quantizers, respectively. The step size of the uniform phase quantizer is ΔΘ = 2π/NΘ, and the total number of quantization cells is N = NM NΘ. The magnitude and phase rates are given by RM = log2 NM and RΘ = log2 NΘ, respectively. The overall rate is R = (1/2) log2 N = (1/2)(RM + RΘ). Optimizing restricted PQ to minimize distortion for a given overall rate R involves finding the best rate allocation {RM, RΘ} (equivalently, the best level allocation {NM, NΘ}) and the best magnitude levels m̂1, m̂2, ..., m̂NM.

In unrestricted polar quantization, the phase is quantized after the magnitude, and the phase quantizer varies with the quantized magnitude. If there are NM
Uniform phase quantization has been shown to be optimal for spherically symmetric densities 2, 4, 13].
2

magnitude levels, denoted $\hat m_1, \hat m_2, \ldots, \hat m_{N_M}$, then there are integers $N_{\Theta,1}, \ldots, N_{\Theta,N_M}$ such that if the magnitude level is $\hat m_i$, then the phase is quantized with a uniform scalar quantizer with step size $\Delta_{\Theta,i} = 2\pi/N_{\Theta,i}$. In this case, the total number of quantization cells is $N = N_{\Theta,1} + \cdots + N_{\Theta,N_M}$, and again $R = \frac{1}{2}\log_2 N$.
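The restricted scheme described above is easy to prototype. The following sketch quantizes an IID Gaussian pair by rounding the phase to a uniform grid and the magnitude to its nearest reproduction level; the level values and counts are illustrative choices, not the optimized ones derived later in the chapter.

```python
import numpy as np

def polar_quantize(x, mag_levels, n_phase):
    """Quantize 2-D points by quantizing magnitude and phase separately.

    mag_levels : 1-D array of magnitude reproduction levels.  In restricted
    PQ a single uniform phase quantizer with n_phase levels is shared by
    all magnitude levels.
    """
    m = np.hypot(x[:, 0], x[:, 1])
    theta = np.arctan2(x[:, 1], x[:, 0])
    # nearest magnitude reproduction level
    m_hat = mag_levels[np.argmin(np.abs(m[:, None] - mag_levels[None, :]), axis=1)]
    # uniform phase quantizer with step size 2*pi / n_phase
    step = 2 * np.pi / n_phase
    theta_hat = np.round(theta / step) * step
    return np.stack([m_hat * np.cos(theta_hat), m_hat * np.sin(theta_hat)], axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((200_000, 2))           # IID Gaussian source, sigma = 1
levels = np.linspace(0.25, 3.75, 8)             # 8 magnitude levels (illustrative)
x_hat = polar_quantize(x, levels, n_phase=32)   # N = 8 * 32 = 256 cells, R = 4 bits
mse = 0.5 * np.mean(np.sum((x - x_hat) ** 2, axis=1))  # per-dimension MSE
```

With $N = 256$ cells the overall rate is $R = \frac{1}{2}\log_2 256 = 4$ bits per dimension, and the Monte Carlo MSE lands near the asymptotic predictions developed below.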

In nonuniform polar quantization, the magnitude is quantized with an arbitrary scalar quantizer (SQ), whereas in uniform polar quantization a uniform scalar quantizer is used for the magnitude. Figure 4.1 shows examples of various polar quantizers. Optimizing uniform PQ involves finding the best rate allocation between the magnitude and phase quantizers and the best support (equivalently, step size) of the magnitude quantizer. Like most prior analysis, we use mean-squared error (MSE) as the distortion measure.

For restricted nonuniform polar quantization, Gallagher [2], Pearlman [4], and Bucklew and Gallagher [12, 13] found Lloyd-Max-type necessary conditions for optimality and numerically optimized nonuniform and uniform PQ, by solving for the optimal rate allocation between the magnitude and phase quantizers. Pearlman [1] and Bucklew and Gallagher [12, 13] (for asymptotically large rate) found analytically that the difference between the optimal phase and magnitude rates is a constant. Bucklew and Gallagher also found asymptotic expressions for the minimum MSE of restricted nonuniform PQ, which is less than that of Cartesian quantization.^3

^3 We refer to scalar quantization optimized for $X_1$ and $X_2$ as Cartesian quantization.

Unrestricted nonuniform polar quantization was first proposed by Wilson [6], who derived Lloyd-Max-type necessary conditions for optimality. Subsequently, Swaszek and Ku [17] performed an asymptotic analysis and optimization that showed that optimized unrestricted nonuniform PQ comes close to the distortion of asymptotically optimal two-dimensional vector quantization. In comparison, in this work, we present a unified analysis of several nonuniform polar quantization methods, by focusing on their point densities and inertial profiles, and using Bennett's integral for vector quantizers [18] to express the mean-squared error. This makes the analysis straightforward, and leads to new insights. With this approach, unrestricted polar quantization, which is arguably the best polar method, may be optimized essentially by inspection. As another example, a new polar quantization method, called power law polar, is analyzed and optimized.

Restricted uniform polar quantization was optimized numerically by Gallagher [2] and Pearlman [4] and analytically by Swaszek [15]. However, the analysis in [15] assumed a fixed support region for the magnitude quantizer, when in fact for true optimality the support needs to increase with rate. In this paper, we optimize restricted uniform PQ asymptotically by using a recent result from Hui and Neuhoff [19] on the support of uniform scalar quantization to choose the optimal support of the magnitude quantizer. We then find the optimal rate allocation between magnitude and phase and the resulting mean-squared error.

Two styles of argument are used in high-resolution quantization theory: informal, as in [20], [21], [17], and rigorous, as in [22], [18], [19]. In this paper, we use primarily the informal style. However, uniform polar quantization requires more care than its nonuniform counterpart because overload distortion is not necessarily negligible. Because of this, we provide some rigorous results in our analysis of uniform PQ.

The rest of this paper is organized as follows. In Section II several unrestricted nonuniform PQ methods are analyzed using Bennett's integral. Section III considers restricted uniform polar quantization and optimizes this technique for an IID Gaussian source. Various details are presented in Section IV.


II. Nonuniform Polar Quantization


In this section, several nonuniform polar quantization methods are analyzed. Assuming the total number of cells $N$ is large, unrestricted PQ is principally characterized by the magnitude point density function $\lambda_M(m)$, which gives the density of magnitude levels near $m$, the magnitude allocation $\nu_M = N_M/\sqrt{N}$, and the phase level profile $\phi(m)$, which gives the number of phase levels (normalized by $\sqrt{N}$) when the magnitude is near $m$. The functions $\lambda_M(m)$ and $\phi(m)$ are nonnegative and satisfy

\[ \lambda_M(m) = \frac{1}{N_M\, l(S_i)} \quad \text{when } m = \hat m_i \qquad (4.1) \]

\[ \phi(m) = \frac{N_{\Theta,i}}{\sqrt{N}} \quad \text{when } m = \hat m_i \qquad (4.2) \]

\[ \int_0^\infty \lambda_M(m)\, dm = 1 \qquad (4.3) \]

\[ \nu_M \int_0^\infty \phi(m)\, \lambda_M(m)\, dm = 1 \qquad (4.4) \]

where $l(S_i)$ is the length of the $i$th magnitude cell $S_i$. It can be shown, at least approximately, that (4.1) implies (4.3), and that (4.1), (4.2) and the fact that $N = N_{\Theta,1} + \cdots + N_{\Theta,N_M}$ imply (4.4).

Power Law Polar Quantization is a special case of unrestricted PQ in which the phase level profile is a power of $m$, namely,

\[ \phi(m) = c\, m^\alpha, \qquad 0 \le \alpha \le 1. \]

When $N$ is large, it can be easily shown using (4.4) that

\[ c = \frac{\sqrt{N}}{N_M} \left( \int_0^\infty m^\alpha\, \lambda_M(m)\, dm \right)^{-1}. \]

Power law PQ is of interest because it considers a conceptually simple class of phase level profiles. It may also be of interest when quantizing a complex-valued random variable in situations, such as for synthetic aperture radar, where more phase accuracy is required for larger magnitudes than small. When $\alpha = 0$, power law PQ is equivalent to restricted nonuniform polar quantization. When $\alpha = 1$, all cells of the resulting PQ have approximately the same radial width. We refer to this case as constant width polar quantization. When $0 < \alpha < 1$, the cells have radial widths increasing as $m^{1-\alpha}$.
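As a discrete illustration of the power-law profile, one can split a total cell budget $N$ across magnitude rings in proportion to $\hat m_i^\alpha$, which is the finite-codebook analogue of choosing the constant $c$ to satisfy the normalization constraint (4.4). The magnitude levels and budget below are hypothetical values chosen only for the sketch.

```python
import numpy as np

def ring_allocation(m_hat, n_total, alpha):
    """Split a total cell budget across magnitude rings as N_i ~ m_hat[i]**alpha.

    This mimics the power-law phase level profile: ring i receives a number
    of phase levels proportional to m_hat[i]**alpha, normalized so that the
    ring counts sum (up to rounding) to n_total.
    """
    weights = m_hat ** alpha
    n_theta = weights / weights.sum() * n_total
    return np.maximum(1, np.round(n_theta).astype(int))

m_hat = np.linspace(0.3, 3.9, 16)                         # hypothetical magnitude levels
n_theta = ring_allocation(m_hat, n_total=1024, alpha=0.4)  # phase levels per ring
```

With $\alpha = 0$ every ring would receive the same count (restricted PQ); larger $\alpha$ shifts phase levels toward the outer rings.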

Analysis via Bennett's Integral


In order to analyze the mean-squared error of nonuniform polar quantization, we use the vector version of Bennett's integral [18], which gives the following approximation to the MSE of a two-dimensional vector quantizer when the number of cells $N$ is large:

\[ D = \frac{1}{2} E\|X - \hat X\|^2 \cong \frac{1}{N} \int \frac{m(x)}{\lambda(x)}\, p(x)\, dx \]

where $\lambda(x)$ is the point density of the polar quantizer and $m(x)$ is the inertial profile, giving the normalized moment of inertia (NMI) of the cells near $x$. The NMI of a cell $T_0$ with a codevector $y$ is given by $M(T_0) = k^{-1}\, v(T_0)^{-(1+2/k)} \int_{T_0} \|x - y\|^2\, dx$.

We proceed by expressing the point density and inertial profile of polar quantization as functions of $\lambda_M(m)$, $\nu_M$ and $\phi(m)$. As illustrated in Figure 4.2, when $N$ is large, the cell containing $x$ is approximately a rectangle with height

\[ H = \frac{1}{N_M\, \lambda_M(\|x\|)} = \frac{1}{\sqrt{N}\, \nu_M\, \lambda_M(\|x\|)} \]

and width

\[ W = \frac{2\pi \|x\|}{\sqrt{N}\, \phi(\|x\|)}. \]

Therefore the point density and inertial profile of polar quantization are given by

\[ \lambda(x) = \frac{1}{N H W} = \frac{\nu_M\, \lambda_M(\|x\|)\, \phi(\|x\|)}{2\pi \|x\|} \]

\[ m(x) = \frac{1}{12}\, \frac{(H^2 + W^2)/2}{H W} = \frac{H^2 + W^2}{24\, H W}. \]

Substituting $\lambda(x)$ and $m(x)$ into Bennett's integral yields the following asymptotic expression for the MSE of unrestricted nonuniform polar quantization,

\[ D = \frac{1}{24 N} \int \left( \frac{1}{\nu_M^2\, \lambda_M(\|x\|)^2} + \frac{4\pi^2 \|x\|^2}{\phi(\|x\|)^2} \right) p(x)\, dx. \qquad (4.5) \]

This expression was first derived by Swaszek and Ku [17] using a longer argument. We see that the Bennett integral approach leads to a straightforward derivation of (4.5). Note that because the phase quantizer is uniform, the asymptotic MSE of polar quantization depends only on the magnitude density and not on the phase density; that is, not on any assumption of spherical symmetry. However, if the source vector is not spherically symmetric, then it is possible to exploit the asymmetry to do better than polar quantization. Because asymptotic distortion does not depend on the phase density, one could assume without loss of asymptotic generality that the phase is uniform and independent of the magnitude. We make this assumption later in the paper.

Optimization of Power Law Polar Quantization

Proposition 1 For power law polar quantization, the magnitude point density and magnitude allocation that minimize (4.5) subject to (4.3) and (4.4) are given by

\[ \lambda_M(m) = \frac{p_M(m)^{1/3}\, m^{-\alpha/3}}{c'} \]

where $c'$ makes $\lambda_M(m)$ integrate to one, and

\[ \nu_M = \frac{1}{\sqrt{2\pi}} \left( E M^{2-2\alpha} \int_0^\infty m^{2\alpha/3}\, p_M(m)^{1/3}\, dm \right)^{-1/4} \int_0^\infty m^{-\alpha/3}\, p_M(m)^{1/3}\, dm. \]

The resulting minimum MSE is

\[ D = \Gamma / N \]

where

\[ \Gamma = \frac{\pi}{6} \left( E M^{2-2\alpha} \right)^{1/2} \left( \int_0^\infty m^{2\alpha/3}\, p_M(m)^{1/3}\, dm \right)^{3/2}. \]

This can be shown by fixing $\lambda_M(m)$ and equating to zero the derivative of (4.5) with respect to $\nu_M$. This yields

\[ \nu_M = \frac{1}{\sqrt{2\pi}} \left( E M^{2-2\alpha} \right)^{-1/4} \left( \int_0^\infty m^\alpha\, \lambda_M(m)\, dm \right)^{-1/2} \left( \int_0^\infty \frac{p_M(m)}{\lambda_M(m)^2}\, dm \right)^{1/4} \]

and the resulting distortion is

\[ D(\lambda_M) = \frac{\pi}{6N} \left( E M^{2-2\alpha} \right)^{1/2} \left( \int_0^\infty \frac{p_M(m)}{\lambda_M(m)^2}\, dm \right)^{1/2} \int_0^\infty m^\alpha\, \lambda_M(m)\, dm. \]

Calculus of variations can then be used to find the optimal $\lambda_M(m)$. The details are omitted.

Note that when $\alpha = 0$, these formulae reduce to those for restricted nonuniform polar quantization [13]. When $\alpha = 1$ (constant width PQ), the resulting two-dimensional point density and MSE are identical to those for Cartesian quantization. In [18], it was shown that the loss of Cartesian quantization relative to optimal VQ of the same dimension can be decomposed into the product of point density and cell shape losses. The former isolates the loss caused by the suboptimal point density of Cartesian quantization, while the latter isolates the effect of suboptimal inertial profile. Since, when optimized, constant width PQ and Cartesian quantization have the same point density and the same MSE, it follows that they incur the same point density loss and the same total loss. Since total loss is the product of point density and cell shape losses, we see that constant width PQ and Cartesian quantization also

have the same cell shape loss, even though their inertial profiles are quite different: constant width PQ has a spherically symmetric inertial profile, whereas Cartesian quantization does not.

For the important special case of IID Gaussian random variables, $\nu_M$ and $\Gamma$ are given in Table 4.1 for various values of $\alpha$. We do not have a method for optimizing power law PQ over the choice of $\alpha$, but have found by trial-and-error that the best choice of $\alpha$ for a Gaussian source is $\alpha = 0.395$. In this case, the resulting signal-to-noise ratio (SNR) is 0.62 dB less than that of optimal two-dimensional vector quantization [22] and 0.45 dB less than that of optimized unrestricted PQ. We see that the phase is allocated more rate than magnitude for all cases of power law PQ. Note also that $\nu_M$ is largest at $\alpha = 0.8$ and decreases slowly as $\alpha \to 0$. Table 4.1 also gives the loss of power law PQ relative to optimal two-dimensional VQ. This loss, expressed in dB, is then decomposed into the sum of square loss $L_{sq}$, oblongitis loss $L_{ob}$ and point density loss $L_{pt}$ [18]. The square loss is the portion of cell shape loss due to the fact that the cells are at best squares rather than hexagons. The oblongitis loss, which is the remaining portion of the cell shape loss, is due to the fact that the cells are rectangles rather than squares.
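The power-law entries of the table can be checked numerically. Below is a minimal sketch, assuming a unit-variance-per-component Gaussian source (so the magnitude is Rayleigh with $\sigma = 1$), that evaluates the $\nu_M$ and $\Gamma$ expressions of Proposition 1 by simple Riemann sums; at $\alpha = 0$ it should reproduce the restricted nonuniform PQ row.

```python
import numpy as np

def power_law_constants(alpha, n=200_001, m_max=12.0):
    """Evaluate nu_M and Gamma of Proposition 1 for a Rayleigh magnitude."""
    m = np.linspace(1e-6, m_max, n)
    dm = m[1] - m[0]
    p_m = m * np.exp(-m**2 / 2)                       # Rayleigh density, sigma = 1
    k = np.sum(m ** (2 * alpha / 3) * p_m ** (1 / 3)) * dm    # int m^(2a/3) p^(1/3)
    c3 = np.sum(m ** (-alpha / 3) * p_m ** (1 / 3)) * dm      # int m^(-a/3) p^(1/3)
    e_mom = np.sum(m ** (2 - 2 * alpha) * p_m) * dm           # E[M^(2 - 2a)]
    nu_m = c3 * (e_mom * k) ** -0.25 / np.sqrt(2 * np.pi)
    gamma = (np.pi / 6) * np.sqrt(e_mom) * k ** 1.5
    return nu_m, gamma

nu0, g0 = power_law_constants(alpha=0.0)   # should match the alpha = 0 table row
```

The computed values agree with the tabulated $\nu_M = 0.6133$ and $\Gamma = 2.4748$ to within the accuracy of the numerical integration.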

Optimization of Unrestricted Polar Quantization


In order to optimize unrestricted polar quantization, we first recall the properties of optimal two-dimensional vector quantizers with high rate. These are known to have hexagonal cells [21]. In this case, $m(x) = 5/(36\sqrt{3})$ for all $x$. The optimal point density $\lambda^*(x)$ may be found by minimizing Bennett's integral using variational calculus or Holder's inequality. The result is $\lambda^*(x) = c\, p(x)^{1/2}$, where $c$ is a normalizing constant. Now consider unrestricted PQ. The best that its point density could be is $\lambda^*(x)$.

alpha                 nu_M     Gamma    L2vq   Lsq    Lob    Lpt
                                        (dB)   (dB)   (dB)   (dB)
power law (PL)
0    (restricted)     0.6133   2.4748   0.89   0.17   0.39   0.33
0.1                   0.6227   2.4066   0.77   0.17   0.33   0.27
0.2                   0.6312   2.3595   0.69   0.17   0.29   0.23
0.3                   0.6388   2.3320   0.63   0.17   0.26   0.21
0.395 (optimal PL)    0.6450   2.3235   0.62   0.17   0.25   0.20
0.4                   0.6453   2.3235   0.62   0.17   0.25   0.20
0.5                   0.6508   2.3339   0.64   0.17   0.26   0.21
0.6                   0.6551   2.3639   0.69   0.17   0.29   0.24
0.7                   0.6580   2.4147   0.79   0.17   0.34   0.28
0.8                   0.6595   2.4886   0.92   0.17   0.41   0.34
0.9                   0.6593   2.5890   1.09   0.17   0.50   0.42
1.0  (constant width) 0.6573   2.7207   1.30   0.17   0.62   0.51
unrestricted          0.707    2.0944   0.17   0.17   0.00   0.00
Cartesian             --       2.7207   1.30   0.17   0.62   0.51
optimal 2-dim.        --       2.0157   0.00   0.00   0.00   0.00

Table 4.1: Comparison of magnitude allocation $\nu_M$ and MSE $D = \Gamma/N$ for a pair of IID Gaussian random variables. $L_{2vq}$ is loss relative to optimal fixed-rate two-dimensional VQ and is decomposed as the sum of square loss $L_{sq}$, oblongitis loss $L_{ob}$ and point density loss $L_{pt}$.

Asymptotically, its cells are rectangles, so the best they could be is squares, in which case $m(x) = 1/12$ for all $x$. In order to optimize unrestricted PQ, we would like, if possible, to choose $\lambda_M(m)$, $\nu_M$ and $\phi(m)$ so that for all $x$ the cells are square (that is, $H = W$) and $\lambda(x) = \lambda^*(x)$. Given the rectangular shape of the quantization cells, this is the best that one could do. In order to achieve $H = W$, it must be that

\[ \lambda(x) = \frac{1}{N H^2} = \nu_M^2\, \lambda_M(\|x\|)^2. \]

Moreover, in order to achieve $\lambda(x) = \lambda^*(x)$, we must have

\[ \nu_M^2\, \lambda_M(\|x\|)^2 = c\, p(x)^{1/2} = c\, \frac{1}{\sqrt{2\pi \|x\|}}\, p_M(\|x\|)^{1/2}. \]

From this we see that choosing

\[ \lambda_M(m) = c' \left( \frac{p_M(m)}{m} \right)^{1/4} \]

yields unrestricted PQ with square cells and the optimal point density. The resulting best magnitude allocation and phase level profile are given by

\[ \nu_M = \left( 2\pi \int_0^\infty m^{1/2}\, p_M(m)^{1/2}\, dm \right)^{-1/2} \int_0^\infty m^{-1/4}\, p_M(m)^{1/4}\, dm \]

\[ \phi(m) = \sqrt{2\pi}\, m^{3/4}\, p_M(m)^{1/4} \left( \int_0^\infty m^{1/2}\, p_M(m)^{1/2}\, dm \right)^{-1/2} \]

and the resulting minimum MSE is

\[ D(N) = \frac{\Gamma}{N} \]

where

\[ \Gamma = \frac{1}{12} \left( \int_0^\infty \left( 2\pi\, m\, p_M(m) \right)^{1/2} dm \right)^2. \]

Compared to optimal two-dimensional VQ, optimized unrestricted PQ suffers only a 0.17 dB loss, which is the ratio of the NMI of a square to that of a hexagon, expressed in dB. In contrast to the optimization in [17], we have optimized unrestricted PQ very simply by inspection of its point density and cell shape. Moreover, our analysis provides the insight that the only shortcoming of unrestricted PQ is its inability to form hexagonal cells, which completely accounts for the 0.17 dB loss.
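Both figures are easy to verify numerically for the Rayleigh magnitude with $\sigma = 1$: the $\Gamma$ of optimized unrestricted PQ evaluates to $2\pi/3 \approx 2.0944$ (the unrestricted row of Table 4.1), and the square-to-hexagon NMI ratio gives the 0.17 dB loss. A short check:

```python
import numpy as np

# Gamma for optimized unrestricted PQ with a Rayleigh magnitude (sigma = 1).
m = np.linspace(0.0, 12.0, 200_001)
dm = m[1] - m[0]
p_m = m * np.exp(-m**2 / 2)
gamma = (np.sum(np.sqrt(2 * np.pi * m * p_m)) * dm) ** 2 / 12   # analytically 2*pi/3

# Loss of a square cell relative to a hexagonal cell, as a ratio of NMIs.
nmi_square = 1 / 12
nmi_hexagon = 5 / (36 * np.sqrt(3))
loss_db = 10 * np.log10(nmi_square / nmi_hexagon)               # about 0.17 dB
```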

III. Restricted Uniform Polar Quantization


In this section, we use recent high resolution results on the optimal step size of uniform scalar quantization to analyze uniform polar quantization. We will consider the restricted case only, so throughout this section we will refer to restricted uniform PQ as uniform PQ. We assume without loss of asymptotic generality that the source is spherically symmetric. Proofs of lemmas and propositions stated in this section are given in Section IV.

The parameters of uniform PQ are the magnitude quantizer step size $\Delta_M$, the number of magnitude levels $N_M$ and the number of phase levels $N_\Theta$. The phase quantizer step size is $\Delta_\Theta = 2\pi/N_\Theta$. The magnitude quantizer support length $N_M \Delta_M$ determines the support of the polar quantizer. If the support increases too slowly with $N_M$, then overload distortion dominates. On the other hand, if the support increases too rapidly, then granular distortion dominates. It is not known a priori how rapidly the support should increase; therefore, Bennett's integral is not a reliable asymptotic expression for total distortion, because it accounts only for granular distortion. Thus we use other methods to derive an asymptotic expression for MSE.
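The granular/overload tension can be seen directly. The sketch below (with hypothetical grid choices and reproduction levels placed at cell midpoints) sweeps the support of a uniform magnitude quantizer for a Rayleigh density and locates the minimizing support for two codebook sizes; the optimal support visibly grows with $N_M$.

```python
import numpy as np

m = np.linspace(0.0, 12.0, 120_001)
dm = m[1] - m[0]
p_m = m * np.exp(-m**2 / 2)                 # Rayleigh magnitude density, sigma = 1

def mag_mse(n_levels, support):
    """MSE of a uniform quantizer on [0, support] with midpoint levels.

    Magnitudes beyond the support fall in the top cell and are mapped to the
    top level; this is the overload contribution.
    """
    step = support / n_levels
    idx = np.minimum((m / step).astype(int), n_levels - 1)
    m_hat = (idx + 0.5) * step
    return np.sum((m - m_hat) ** 2 * p_m) * dm

supports = np.linspace(1.0, 8.0, 141)
best_support = {n: supports[np.argmin([mag_mse(n, s) for s in supports])]
                for n in (8, 64)}           # optimal support grows with N_M
```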

Lemma 2 The exact MSE of a restricted uniform or nonuniform polar quantizer is

\[ D = \frac{1}{2} D_M + D_\Theta\, I_M \qquad (4.6) \]

where

\[ D_M = E\left[ (M - \hat M)^2 \right] \]
\[ D_\Theta = 2\, E\left[ 1 - \cos(\Theta - \hat\Theta) \right] \]
\[ I_M = \frac{1}{2}\, E\left[ M \hat M \right] \]

and $\hat M$ and $\hat\Theta$ are the quantizer outputs of the magnitude and phase quantizers, respectively.

We now find asymptotic expressions for the last two terms in (4.6).

Lemma 3 For uniform scalar phase quantization with $N_\Theta$ levels,

\[ \lim_{N_\Theta \to \infty} N_\Theta^2\, D_\Theta = \frac{\pi^2}{3}. \]

Lemma 4 If a sequence of magnitude quantizers has MSE $D_M \to 0$, then

\[ \lim_{D_M \to 0} I_M = \frac{1}{2} E M^2. \]

Whereas the above results have straightforward derivations, asymptotic expressions for $D_M$ as a function of $\Delta_M$ and $N_M$ are not so easily developed. The well known formula $\Delta_M^2/12$ is an asymptotically accurate expression for the granular distortion of the magnitude quantizer, but is not always asymptotically accurate for total distortion. In general, asymptotic expressions for $D_M$ depend on the source density and on the dependence of $\Delta_M$ on $N_M$.

We optimize uniform PQ by optimizing the step size $\Delta_M$ of the magnitude quantizer for a given number of magnitude levels $N_M$ and then finding the best magnitude allocation. To optimize the step size, we see from Lemma 4 that $I_M$ has an asymptotically negligible effect on MSE. Therefore, we choose the step size to minimize $D_M(\Delta_M, N_M)$, the MSE of the magnitude quantizer. In general, the optimal choice of $\Delta_M$ is source dependent, and we now consider the most important special case.

Gaussian Sources

Consider an IID Gaussian source, so that the magnitude is Rayleigh. Hui and Neuhoff [19] found asymptotic expressions for the optimal step size $\Delta_M^*$ and the resulting minimum distortion $D_M$. Their results are summarized in the following lemma.

Lemma 5 For an IID Gaussian source, the step size $\Delta_M^*$ that minimizes magnitude distortion $D_M(\Delta_M, N_M)$ for a quantizer with $N_M$ levels satisfies

\[ \lim_{N_M \to \infty} \frac{\Delta_M^*}{2\sigma \sqrt{\ln N_M} / N_M} = 1. \]

The resulting minimum MSE $D_M^* = D_M(\Delta_M^*, N_M)$ satisfies

\[ \lim_{N_M \to \infty} \frac{N_M^2}{\ln N_M}\, D_M^* = \frac{\sigma^2}{3}. \qquad (4.7) \]

This shows that asymptotically, the total distortion is the granular distortion and that the overload distortion is negligible.

The asymptotic results given above can be used to derive the following proposition, which shows that MSE is approximately

\[ \tilde D(N_M, N_\Theta) \stackrel{\rm def}{=} \frac{\sigma^2 \ln N_M}{6 N_M^2} + \frac{\pi^2 \sigma^2}{3 N_\Theta^2}. \qquad (4.8) \]

Proposition 6 For an IID Gaussian source, the distortion $D(\Delta_M^*, N_M, N_\Theta)$ of a uniform polar quantizer with $N_M$ magnitude levels and $N_\Theta$ phase levels satisfies

\[ \lim_{N_M \to \infty,\; N_\Theta \to \infty} \frac{D(\Delta_M^*, N_M, N_\Theta)}{\tilde D(N_M, N_\Theta)} = 1. \]

Using the above and replacing $N_M$ and $N_\Theta$ with rates $R_M$ and $R_\Theta$, respectively, we find that with the asymptotically optimal magnitude step size, the distortion of uniform PQ with rate allocation $(R_M, R_\Theta)$ is

\[ D(R_M, R_\Theta) = 2^{-f(R_M) + o_{R_M}(1)} + 2^{-g(R_\Theta) + o_{R_M}(1) + o_{R_\Theta}(1)} \qquad (4.9) \]

where $f(r) = 2r - \log_2 r - \log_2(\sigma^2 \ln 2 / 6)$ and $g(r) = 2r - \log_2(\pi^2 \sigma^2 / 3)$, and where $o_{R_M}(1)$ and $o_{R_\Theta}(1)$ denote quantities that tend to zero as $R_M$, respectively $R_\Theta$, tend to infinity.

We want to find the rates $R_M$ and $R_\Theta$ that minimize $D(R_M, R_\Theta)$ subject to the constraint $\frac{1}{2}(R_M + R_\Theta) = R$. The following lemma shows that for large $R$, $R_M$ and $R_\Theta$ are asymptotically close to the values that minimize

\[ \bar D(R_M, R_\Theta) \stackrel{\rm def}{=} 2^{-f(R_M)} + 2^{-g(R_\Theta)}. \qquad (4.10) \]

Lemma 7 Let

\[ \tilde D_R(r) \stackrel{\rm def}{=} 2^{-f(r) + o_r} + 2^{-g(2R - r) + o_r + o_{2R - r}} \]
\[ \bar D_R(r) \stackrel{\rm def}{=} 2^{-f(r)} + 2^{-g(2R - r)} \]

and let $\tilde r_R$ and $\bar r_R$ minimize $\tilde D_R(r)$ and $\bar D_R(r)$, respectively, for a given value of $R$. Then

\[ \lim_{R \to \infty} |\tilde r_R - \bar r_R| = 0 \qquad (4.11) \]
\[ \lim_{R \to \infty} \frac{\tilde D_R(\tilde r_R)}{\bar D_R(\bar r_R)} = 1. \qquad (4.12) \]

The rates that minimize $\bar D(R_M, R_\Theta)$ can then be derived using Lagrange multipliers. By the previous lemma these rates asymptotically minimize $D(R_M, R_\Theta)$ as well. The result is summarized in the following.

Proposition 8 Let the source be IID Gaussian. The rates that minimize the MSE $\bar D(R_M, R_\Theta)$ of uniform polar quantization subject to $\frac{1}{2}(R_M + R_\Theta) = R$ are asymptotically given by

\[ R_M = \frac{1}{2} \log_2 e - \frac{1}{4} (\log_2 e)\, W\!\left( -8\pi^2 e^2\, 2^{-4R} \right) \qquad (4.13) \]

and $R_\Theta = 2R - R_M$, where for $-1/e < x < 0$, $W(x)$ is the more negative of the two values $y$ such that $y e^y = x$ [23]. The resulting magnitude quantizer support is

\[ L_M = \sigma \sqrt{2 - W\!\left( -8\pi^2 e^2\, 2^{-4R} \right)} \]

and the resulting MSE is

\[ \bar D(R) = \frac{\sigma^2}{12 e}\, e^{W(-8\pi^2 e^2 2^{-4R})/2} + \frac{2\pi^2 \sigma^2}{3}\, e\, 2^{-4R}\, e^{-W(-8\pi^2 e^2 2^{-4R})/2}. \]

We now compare the result of Proposition 8 to that of Pearlman's numerical optimization [4] and Swaszek's analysis [15]. The optimization in [4] was limited to values of $N$ for which there was a good factorization $N = N_M N_\Theta$. Figure 4.3 shows that all three analyses allocate more rate to the phase quantizer than to the magnitude quantizer. Although $R_\Theta$ overestimates Pearlman's phase rate from rates 3 to 4, it is a good estimate for rates 4 to 5. Figure 4.4 shows that $L_M$ provides a good estimate of the increasing optimal support of restricted uniform PQ, especially compared to Swaszek's analysis, which assumed that the support is fixed for all rates. As shown in Figure 4.5, our expression for SNR slightly overestimates Pearlman's results. This is because our analytic formula ignores overload distortion. We note that the above analysis can be repeated for other magnitude distributions. We also note that the more accurate expressions in [19] for the support and distortion of optimal uniform scalar quantizers could be used to improve the approximations given here.
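The $W$ in Proposition 8 is the lower real branch of the Lambert W function. A minimal sketch that solves for it by Newton's method (seeded with the two-term series discussed in the next subsection) and evaluates the rate allocation at $R = 5$ with $\sigma = 1$:

```python
import math

def lambert_w_lower(x, tol=1e-13):
    """Lower real branch: the more negative solution of y * exp(y) = x,
    for -1/e < x < 0.  Newton's method from the series seed converges fast."""
    assert -1 / math.e < x < 0
    w = math.log(-x) - math.log(-math.log(-x))     # two-term series seed
    for _ in range(60):
        e = math.exp(w)
        step = (w * e - x) / (e * (1 + w))
        w -= step
        if abs(step) < tol:
            break
    return w

R = 5.0
x = -8 * math.pi**2 * math.e**2 * 2 ** (-4 * R)
w = lambert_w_lower(x)
r_mag = 0.5 * math.log2(math.e) - 0.25 * math.log2(math.e) * w   # (4.13)
r_phase = 2 * R - r_mag
support = math.sqrt(2 - w)                                       # L_M with sigma = 1
```

At this rate the phase quantizer still receives more rate than the magnitude quantizer, consistent with the discussion of the crossover near $R = 29$ below.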

Approximations to W

Although the above result gives a solution to the rate allocation problem for uniform PQ, because of the $W$ function, it is difficult to immediately see the behavior of the magnitude and phase rates and the resulting minimum MSE. For this reason, we consider a series expansion for $W(x)$. As rate $R$ increases, the argument of $W$ tends to zero, and we will be able to approximate $W$ with only a few terms from the series. It is shown in [23, Eqn. 4.18] that

\[ W(x) = L_1 - L_2 + \frac{L_2}{L_1} + \frac{L_2(-2 + L_2)}{2 L_1^2} + \frac{L_2(6 - 9L_2 + 2L_2^2)}{6 L_1^3} + O\!\left( \left( \frac{L_2}{L_1} \right)^4 \right) \qquad (4.14) \]

when $x < 0$, where $L_1 = \ln(-x)$ and $L_2 = \ln(-\ln(-x))$. We will consider the two-term and three-term approximations,

\[ W_2(x) = L_1 - L_2 = \ln(-x) - \ln(-\ln(-x)) \]
\[ W_3(x) = L_1 - L_2 + \frac{L_2}{L_1} = \ln(-x) - \ln(-\ln(-x)) + \frac{\ln(-\ln(-x))}{\ln(-x)} \]

which are good approximations for $W(x)$ when $x$ is small and negative. We denote $\tilde R_{\Theta,2}$ as the approximation to $R_\Theta$ when $W(x)$ is approximated by $W_2(x)$. The rate $\tilde R_{\Theta,3}$ similarly corresponds to $W_3(x)$. It is easily shown that

\[ \tilde R_{\Theta,2} = R - \frac{1}{4} \log_2\!\left( 4 (\ln 2) R - \beta \right) + \log_2 \sqrt{\pi} + \frac{3}{4} \]
\[ \tilde R_{\Theta,3} = R - \frac{1}{4} \log_2\!\left( 4 (\ln 2) R - \beta \right) \left( 1 + \frac{1}{4 (\ln 2) R - \beta} \right) + \log_2 \sqrt{\pi} + \frac{3}{4} \]

where $\beta = \ln(8\pi^2 e^2)$. Figure 4.3 shows that $\tilde R_{\Theta,3}$ is a very accurate approximation to $R_\Theta$ for rates 4 to 5.

From these estimates, one sees that for large values of rate $R$, the optimal phase rate $R_\Theta$ is less than $R$. In other words, magnitude is allocated more rate than phase, which never happened with nonuniform PQ. By numerically equating (4.13) to $R$, we find that the magnitude and phase receive equal rate when $R \approx 29$. In addition, one sees from the above estimates that $R_M - R_\Theta \to \infty$ as $R \to \infty$ and, nevertheless, $R_M / R_\Theta \to 1$, or equivalently, $R_M / (2R) \to 1/2$.
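The accuracy of the two- and three-term series can be checked against a numerically exact $W$. The sketch below also confirms the closed-form phase-rate estimate derived from the two-term series: it agrees exactly (to floating point) with $2R - R_M$ when $W$ is replaced by $W_2$.

```python
import math

beta = math.log(8 * math.pi**2 * math.e**2)

def w_series(x, terms):
    """Series approximations for the lower W branch at small negative x:
    W2 = L1 - L2 and W3 = W2 + L2/L1, with L1 = ln(-x), L2 = ln(-L1)."""
    l1 = math.log(-x)
    l2 = math.log(-l1)
    return l1 - l2 if terms == 2 else l1 - l2 + l2 / l1

def w_exact(x):
    # Newton refinement of the two-term seed.
    w = w_series(x, 2)
    for _ in range(60):
        e = math.exp(w)
        w -= (w * e - x) / (e * (1 + w))
    return w

R = 5.0
x = -8 * math.pi**2 * math.e**2 * 2 ** (-4 * R)
w, w2, w3 = w_exact(x), w_series(x, 2), w_series(x, 3)

# Phase rate: (4.13) evaluated with W2, versus the closed-form estimate.
r_theta_w2 = 2 * R - (0.5 * math.log2(math.e) - 0.25 * math.log2(math.e) * w2)
r_theta_series = (R - 0.25 * math.log2(4 * math.log(2) * R - beta)
                  + math.log2(math.sqrt(math.pi)) + 0.75)
```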

Using $W_2(x)$ and $W_3(x)$, approximations to the optimal support of the magnitude quantizer are given by

\[ \tilde L_{M,2} = \sigma \sqrt{ 4 (\ln 2) R - \ln(8\pi^2) + \ln\!\left[ 4 (\ln 2) R - \beta \right] } \]
\[ \tilde L_{M,3} = \sigma \sqrt{ 4 (\ln 2) R - \ln(8\pi^2) + \ln\!\left[ 4 (\ln 2) R - \beta \right] \left( 1 + \frac{1}{4 (\ln 2) R - \beta} \right) }. \]

Figure 4.4 shows that $\tilde L_{M,3}$ is very accurate for rates 4 to 5.

Approximations to the SNR of optimized restricted uniform PQ are given by

\[ \tilde S_2(R) = 6.02 R - 10 \log_{10}\!\left( \frac{\sqrt{2}\,\pi}{6} \right) - 10 \log_{10}\!\left[ \left( 4 (\ln 2) R - \beta \right)^{-1/2} + \left( 4 (\ln 2) R - \beta \right)^{1/2} \right] \]

\[ \tilde S_3(R) = 6.02 R - 10 \log_{10}\!\left( \frac{\sqrt{2}\,\pi}{6} \right) - 10 \log_{10}\!\left[ \left( 4 (\ln 2) R - \beta \right)^{-\frac{1}{2}\left( 1 + \frac{1}{4 (\ln 2) R - \beta} \right)} + \left( 4 (\ln 2) R - \beta \right)^{\frac{1}{2}\left( 1 + \frac{1}{4 (\ln 2) R - \beta} \right)} \right]. \]

Figure 4.5 shows that $\tilde S_3(R)$ is an accurate approximation for $S(R)$ at rates 4 to 5. These approximations show that the SNR of uniform PQ increases asymptotically as 6 dB/bit, but separates from any line with slope 6 dB/bit.

Finally, Figure 4.6 compares the SNR of optimized restricted uniform PQ to that of restricted nonuniform PQ and Cartesian product uniform scalar quantization (USQ), as optimized in [19]. We see that the SNRs of all three methods increase asymptotically as 6 dB/bit. However the difference in SNR between restricted nonuniform and restricted uniform tends to infinity as rate increases. This quantifies the loss in using a uniform scalar quantizer for the magnitude as opposed to using

an arbitrary scalar quantizer. Furthermore, the difference in SNR between restricted uniform PQ and product USQ also tends to infinity as rate increases, which demonstrates the gain of using the polar transformation before quantization. The SNR of unrestricted nonuniform PQ is parallel to and 0.73 dB greater than that of restricted nonuniform PQ.

IV. Details
A. Proof of Lemma 2
When the source is spherically symmetric, the MSE of restricted polar quantization is given by

\[ D = \frac{1}{2} \int_{\mathbb{R}^2} \|x - \hat x\|^2\, p_X(x)\, dx \]
\[ = \frac{1}{2} \int_0^{2\pi}\! \int_0^\infty \left[ (m \sin\theta - \hat m \sin\hat\theta)^2 + (m \cos\theta - \hat m \cos\hat\theta)^2 \right] p_\Theta(\theta)\, p_M(m)\, dm\, d\theta \qquad (4.15) \]
\[ = \frac{1}{2} \int_0^{2\pi}\! \int_0^\infty \left[ m^2 - 2 m \hat m \cos(\theta - \hat\theta) + \hat m^2 \right] p_\Theta(\theta)\, p_M(m)\, dm\, d\theta \]
\[ = \frac{1}{2} \int_0^{2\pi}\! \int_0^\infty (m - \hat m)^2\, p_\Theta(\theta)\, p_M(m)\, dm\, d\theta + \frac{1}{2} \int_0^{2\pi}\! \int_0^\infty 2 m \hat m \left[ 1 - \cos(\theta - \hat\theta) \right] p_\Theta(\theta)\, p_M(m)\, dm\, d\theta \qquad (4.16) \]
\[ = \underbrace{\frac{1}{2} E\left[ (M - \hat M)^2 \right]}_{\frac{1}{2} D_M} + \underbrace{2 E\left[ 1 - \cos(\Theta - \hat\Theta) \right]}_{D_\Theta}\, \underbrace{\frac{1}{2} E\left[ M \hat M \right]}_{I_M} \]

where the factorization of the joint density into $p_\Theta$ and $p_M$ in (4.15) follows from the spherical symmetry assumption on the source. The integrals in the second term on the right-hand side of (4.16) can be separated by the restricted assumption on the quantizer.
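The derivation rests on the per-sample identity $\|x - \hat x\|^2 = (m - \hat m)^2 + 2 m \hat m (1 - \cos(\theta - \hat\theta))$; the factored form (4.6) then follows from the independence of the magnitude and phase terms. A quick Monte Carlo check, using arbitrary illustrative quantizer parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((100_000, 2))            # spherically symmetric source
m = np.hypot(x[:, 0], x[:, 1])
theta = np.arctan2(x[:, 1], x[:, 0])

# Any restricted polar quantizer will do: here a 0.5-step magnitude grid
# and a 16-level uniform phase quantizer (illustrative choices).
m_hat = np.round(m / 0.5) * 0.5
step = 2 * np.pi / 16
theta_hat = np.round(theta / step) * step
x_hat = np.stack([m_hat * np.cos(theta_hat), m_hat * np.sin(theta_hat)], axis=1)

lhs = 0.5 * np.mean(np.sum((x - x_hat) ** 2, axis=1))
# Per-sample identity: holds to floating-point precision.
rhs_exact = (0.5 * np.mean((m - m_hat) ** 2)
             + np.mean(m * m_hat * (1 - np.cos(theta - theta_hat))))
# Factored form (4.6): exact in expectation via independence, so the sample
# version matches only up to Monte Carlo error.
d_mag = np.mean((m - m_hat) ** 2)
d_phase = 2 * np.mean(1 - np.cos(theta - theta_hat))
i_mag = 0.5 * np.mean(m * m_hat)
rhs_factored = 0.5 * d_mag + d_phase * i_mag
```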

B. Proof of Lemma 3

We use an infinite series expansion for cosine,

\[ \cos(x) = \sum_{j=0}^\infty (-1)^j \frac{x^{2j}}{(2j)!} = 1 + \sum_{j=1}^\infty (-1)^j \frac{x^{2j}}{(2j)!} \]

which converges for all $x$. Therefore

\[ D_\Theta = \int_0^{2\pi} 2 \left( \sum_{j=1}^\infty (-1)^{(j+1)} \frac{(\theta - \hat\theta)^{2j}}{(2j)!} \right) p_\Theta(\theta)\, d\theta = \sum_{j=1}^\infty (-1)^{(j+1)} \frac{2\pi^{2j}}{(2j+1)!} \frac{1}{N_\Theta^{2j}} \]

where the sum and integral can be exchanged because the integrand is absolutely integrable [24, p. 307]. It then follows that

\[ \limsup_{N_\Theta \to \infty} \left| N_\Theta^2 D_\Theta - \frac{\pi^2}{3} \right| = \limsup_{N_\Theta \to \infty} \left| \sum_{j=2}^\infty (-1)^{(j+1)} \frac{2\pi^{2j}}{(2j+1)!} \frac{1}{N_\Theta^{2(j-1)}} \right| \]
\[ \le \limsup_{N_\Theta \to \infty} \sum_{j=2}^\infty \frac{2\pi^{2j}}{(2j+1)!} \frac{1}{N_\Theta^{2(j-1)}} \]
\[ \le \limsup_{N_\Theta \to \infty} \frac{2\pi^4}{N_\Theta^2} \sum_{j=0}^\infty \frac{\pi^{2j}}{(2j+1)!} \]
\[ = \limsup_{N_\Theta \to \infty} \frac{2\pi^4}{N_\Theta^2} \cdot \frac{1}{2\pi} \left( e^\pi - e^{-\pi} \right) \]
\[ = 0 \]

where the penultimate equality follows from [25, p. 13].
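Since the phase error is uniform on $[-\pi/N_\Theta, \pi/N_\Theta]$ (uniform phase, midpoint levels), $D_\Theta$ also has the closed form $2(1 - \sin(\pi/N_\Theta)/(\pi/N_\Theta))$, and the limit of Lemma 3 can be observed directly:

```python
import math

def d_theta(n):
    """Exact D_theta for a uniform phase quantizer with n levels, using the
    fact that the phase error is uniform on [-pi/n, pi/n]."""
    u = math.pi / n
    return 2 * (1 - math.sin(u) / u)

scaled = [n * n * d_theta(n) for n in (8, 64, 4096)]   # tends to pi**2 / 3
```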

C. Proof of Lemma 4

We immediately see that

\[ \limsup_{D_M \to 0} \left| I_M - \frac{1}{2} E M^2 \right| = \limsup_{D_M \to 0} \frac{1}{2} \left| E\left[ (\hat M - M) M \right] \right| \]
\[ \le \limsup_{D_M \to 0} \frac{1}{2}\, E\left[ |\hat M - M|\, M \right] \]
\[ \le \limsup_{D_M \to 0} \frac{1}{2}\, E\left[ (M - \hat M)^2 \right]^{1/2} E\left[ M^2 \right]^{1/2} \]
\[ = 0 \]

where the second inequality follows from the Cauchy-Schwarz inequality.

D. Proof of Proposition 6

We use $o_N$ to denote a term that tends to zero as $N$ tends to infinity. Lemmas 3, 4 and 5 show that

\[ D_\Theta = \frac{\pi^2}{3} \frac{1}{N_\Theta^2} (1 + o_{N_\Theta}) \]
\[ I_M = \frac{1}{2} E M^2 + o_{N_M} \qquad (4.17) \]
\[ D_M(\Delta_M^*, N_M) = \frac{\sigma^2}{3} \frac{\ln N_M}{N_M^2} (1 + o_{N_M}) \]

where (4.17) follows because $D_M(\Delta_M^*, N_M) \to 0$ as $N_M \to \infty$. Therefore, from (4.8),

\[ \frac{D(\Delta_M^*, N_M, N_\Theta)}{\tilde D(N_M, N_\Theta)} = 1 + \frac{ \dfrac{\sigma^2 \ln N_M}{6 N_M^2}\, o_{N_M} + \dfrac{\pi^2 \sigma^2}{3 N_\Theta^2}\, o_{N_\Theta} + \dfrac{\pi^2}{3 N_\Theta^2}\, o_{N_M} + \dfrac{\pi^2}{3 N_\Theta^2}\, o_{N_\Theta}\, o_{N_M} }{ \dfrac{\sigma^2 \ln N_M}{6 N_M^2} + \dfrac{\pi^2 \sigma^2}{3 N_\Theta^2} }. \]

It is easily seen that the second term goes to zero as $N_M \to \infty$ and $N_\Theta \to \infty$, since each term in the numerator is an $o(1)$ multiple of a term in the denominator. This proves Proposition 6.

E. Proof of Lemma 7

To show $|\tilde r_R - \bar r_R| \to 0$ as $R \to \infty$, it will suffice to show $|f(\tilde r_R) - f(\bar r_R)| \to 0$ as $R \to \infty$. This follows because $\tilde r_R \to \infty$ and $\bar r_R \to \infty$ and because for $r$ large enough $f'(r) \ge 1$, which implies that $|f(\tilde r_R) - f(\bar r_R)| \ge |\tilde r_R - \bar r_R|$.

To show $|f(\tilde r_R) - f(\bar r_R)| \to 0$ as $R \to \infty$, we will show

\[ |f(\tilde r_R) - f(r_R^0)| \to 0 \quad \text{as } R \to \infty \qquad (4.18) \]

and

\[ |f(\bar r_R) - f(r_R^0)| \to 0 \quad \text{as } R \to \infty \qquad (4.19) \]

where $r_R^0$ is defined as the rate such that

\[ f(r_R^0) = g(2R - r_R^0), \]

that is, the rate such that the exponents in $\bar D_R(r)$ are identical (cf. [26, 27]).

We will show (4.18) by contradiction, namely if (4.18) does not hold, then for large enough $R$, $\tilde D_R(\tilde r_R) > \tilde D_R(r_R^0)$, which contradicts the optimality of $\tilde r_R$. Accordingly, we assume that there exists $\delta > 0$ and a sequence of rates increasing to infinity, denoted $\{R_\nu\}_{\nu=1}^\infty$, such that either

\[ f(r_\nu) - f(r_\nu') > \delta \quad \text{for all } \nu, \qquad (4.20) \]

or

\[ f(r_\nu') - f(r_\nu) > \delta \quad \text{for all } \nu, \qquad (4.21) \]

where $r_\nu = \tilde r_{R_\nu}$ and $r_\nu' = r_{R_\nu}^0$. We will show that there is some $\epsilon > 0$ such that

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} > 1 + \epsilon \quad \text{for all sufficiently large } \nu, \qquad (4.22) \]

which establishes the contradiction.

It is easily seen that as $\nu \to \infty$,

\[ r_\nu \to \infty \quad \text{and} \quad 2R_\nu - r_\nu \to \infty \qquad (4.23) \]

for otherwise $\tilde D(r_\nu)$ would not tend to zero. Also,

\[ r_\nu' \to \infty \quad \text{and} \quad 2R_\nu - r_\nu' \to \infty. \qquad (4.24) \]

This follows because either $r_\nu'$ or $2R_\nu - r_\nu'$ must increase to infinity as $R_\nu \to \infty$. The fact that $f(r)$ and $g(r)$ increase to infinity as $r \to \infty$ implies that either $f(r_\nu') \to \infty$ or $g(2R_\nu - r_\nu') \to \infty$. However, by definition, $f(r_\nu') = g(2R_\nu - r_\nu')$. Therefore, $f(r_\nu') \to \infty$ and $g(2R_\nu - r_\nu') \to \infty$, which implies (4.24). Since there exists $r_0$ such that $f(r)$ is strictly increasing for all $r > r_0$, we may assume that $r_\nu$, $2R_\nu - r_\nu$, $r_\nu'$ and $2R_\nu - r_\nu'$ are all greater than or equal to $r_0$.

Before proceeding with the proof, we establish a general expression for the ratio of $\tilde D(r_\nu)$ to $\tilde D(r_\nu')$:

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} = \frac{ 2^{-f(r_\nu) + o_{r_\nu}} + 2^{-g(2R_\nu - r_\nu) + o_{r_\nu} + o_{2R_\nu - r_\nu}} }{ 2^{-f(r_\nu') + o_{r_\nu'}} + 2^{-g(2R_\nu - r_\nu') + o_{r_\nu'} + o_{2R_\nu - r_\nu'}} } \]
\[ = \underbrace{ \frac{ 2^{-f(r_\nu) + o_{r_\nu}} + 2^{-g(2R_\nu - r_\nu) + o_{r_\nu} + o_{2R_\nu - r_\nu}} }{ 2^{-f(r_\nu)} + 2^{-g(2R_\nu - r_\nu)} } }_{1 + o_\nu} \cdot \frac{ 2^{-f(r_\nu)} + 2^{-g(2R_\nu - r_\nu)} }{ 2^{-f(r_\nu')} + 2^{-g(2R_\nu - r_\nu')} } \cdot \underbrace{ \frac{ 2^{-f(r_\nu')} + 2^{-g(2R_\nu - r_\nu')} }{ 2^{-f(r_\nu') + o_{r_\nu'}} + 2^{-g(2R_\nu - r_\nu') + o_{r_\nu'} + o_{2R_\nu - r_\nu'}} } }_{1 + o_\nu} \]
\[ = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{g(2R_\nu - r_\nu') - g(2R_\nu - r_\nu)} \right) (1 + o_\nu) \qquad (4.25) \]

where the $1 + o_\nu$ terms follow from (4.23) and (4.24), and the last step uses $f(r_\nu') = g(2R_\nu - r_\nu')$.

Returning to the proof, we begin by assuming that (4.20) holds. The assumption $f(r_\nu) > f(r_\nu')$, together with the monotonicity of $f$, imply that $r_\nu > r_\nu' > r_0$, and it can be easily seen that

\[ f(r_\nu) - f(r_\nu') = 2(r_\nu - r_\nu') - \log_2\!\left( 1 + \frac{r_\nu - r_\nu'}{r_\nu'} \right) \le 2(r_\nu - r_\nu') = g(2R_\nu - r_\nu') - g(2R_\nu - r_\nu). \]

From the above and (4.25) we have

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{g(2R_\nu - r_\nu') - g(2R_\nu - r_\nu)} \right) (1 + o_\nu) \]
\[ \ge \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} \right) (1 + o_\nu) \]
\[ > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) + o_\nu \qquad (4.26) \]

where (4.26) follows from Lemma 9, given later, and the assumption (4.20) that $f(r_\nu) - f(r_\nu') > \delta$. Let $\nu$ be large enough so that

\[ |o_\nu| < \frac{2^{-\delta} + 2^{\delta} - 2}{4}, \]

where Lemma 9 implies that $2^{-\delta} + 2^{\delta} - 2 > 0$. Then for $\nu$ large enough,

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) - |o_\nu| > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) - \frac{2^{-\delta} + 2^{\delta} - 2}{4} = 1 + \epsilon \]

where

\[ \epsilon \stackrel{\rm def}{=} \frac{2^{-\delta} + 2^{\delta} - 2}{4}, \]

which establishes (4.22).

Next assume that (4.21) holds. In this case, it is easily shown that

\[ g(2R_\nu - r_\nu') - g(2R_\nu - r_\nu) = f(r_\nu) - f(r_\nu') - \log_2\!\left( 1 + \frac{r_\nu' - r_\nu}{r_\nu} \right). \]

Therefore, from (4.25),

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{g(2R_\nu - r_\nu') - g(2R_\nu - r_\nu)} \right) (1 + o_\nu) \]
\[ = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} \right) \underbrace{ \frac{ 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu') - \log_2(1 + (r_\nu' - r_\nu)/r_\nu)} }{ 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} } }_{J(\nu)} (1 + o_\nu). \qquad (4.27) \]

Note that

\[ J(\nu) = \frac{ 2^{f(r_\nu') - f(r_\nu)} + \dfrac{r_\nu}{r_\nu'}\, 2^{f(r_\nu) - f(r_\nu')} }{ 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} } = 1 + \frac{ \dfrac{r_\nu}{r_\nu'} - 1 }{ 2^{2f(r_\nu') - 2f(r_\nu)} + 1 } \qquad (4.28) \]
\[ \to 1 \qquad (4.29) \]

where (4.29) follows because the denominator in (4.28) is bounded away from zero for $\nu$ large enough and $r_\nu / r_\nu' \to 1$, which can be easily shown. Returning to (4.27), we have

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} \right) J(\nu) (1 + o_\nu) = \frac{1}{2} \left( 2^{f(r_\nu') - f(r_\nu)} + 2^{f(r_\nu) - f(r_\nu')} \right) (1 + o_\nu') > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) + o_\nu' \qquad (4.30) \]

where (4.30) follows from Lemma 9 and the assumption (4.21) that $f(r_\nu') - f(r_\nu) > \delta$. Again, let $\nu$ be large enough so that

\[ |o_\nu'| < \frac{2^{-\delta} + 2^{\delta} - 2}{4}. \]

Then for $\nu$ large enough,

\[ \frac{\tilde D(r_\nu)}{\tilde D(r_\nu')} > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) - |o_\nu'| > \frac{1}{2} \left( 2^{-\delta} + 2^{\delta} \right) - \frac{2^{-\delta} + 2^{\delta} - 2}{4} = 1 + \epsilon, \]

which establishes (4.22) and completes the proof of (4.18). Since $\bar D_R(r)$ equals $\tilde D_R(r)$ modulo the $o$ terms, this implies that (4.19) holds as well, which establishes (4.11) in Lemma 7. It then follows immediately that (4.12) in Lemma 7 holds, as well.

Lemma 9 If $0 \le a \le b$, then

\[ 2^{-a} + 2^{a} \le 2^{-b} + 2^{b} \]

with equality if and only if $a = b$.

Proof: Note that $2^{a+b} \ge 1$, with equality if and only if $a = b = 0$. We then have

\[ 2^b - 2^a \ge \frac{2^b - 2^a}{2^{a+b}} = 2^{-a} - 2^{-b} \]

with equality if and only if $a = b$, and simple manipulation gives the result. $\Box$

Acknowledgements
The authors would like to thank Susan Werness for spurring their interest in polar quantization and Dennis Hui for assistance with Table 4.1.

References
[1] W.A. Pearlman, "Quantization error bounds for computer generated holograms," Stanford Univ. Inform. Syst. Lab., Stanford, CA, Tech. Rep. 6503-1, Aug. 1974.
[2] N.C. Gallagher, Jr., "Quantizing schemes for the discrete Fourier transform of a random time-series," IEEE Trans. Inform. Theory, vol. IT-24, pp. 156-162, Mar. 1978.
[3] W.A. Pearlman and R.M. Gray, "Source coding of the discrete Fourier transform," IEEE Trans. Inform. Theory, vol. IT-24, pp. 683-692, Nov. 1978.
[4] W.A. Pearlman, "Polar quantization of a complex random variable," IEEE Trans. Commun., vol. COM-27, pp. 892-899, June 1979.
[5] A. Buzo, A.H. Gray, Jr., R.M. Gray, and J.D. Markel, "Speech coding based upon vector quantization," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, no. 5, Oct. 1980.
[6] S.G. Wilson, "Magnitude/phase quantization of independent Gaussian variates," IEEE Trans. Commun., vol. COM-28, pp. 1924-1929, Nov. 1980.
[7] P.F. Swaszek and J.B. Thomas, "Optimal circularly symmetric quantizers," J. Franklin Institute, vol. 313, pp. 373-384, June 1982.
[8] P.F. Swaszek and J.B. Thomas, "Multidimensional spherical coordinates quantization," IEEE Trans. Inform. Theory, vol. IT-29, pp. 570-576, July 1983.
[9] M.J. Sabin and R.M. Gray, "Product code vector quantizers for waveform and voice coding," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, no. 3, June 1984.
[10] T.R. Fischer, "A pyramid vector quantizer," IEEE Trans. Inform. Theory, vol. IT-32, pp. 568-583, July 1986.
[11] J.-P. Adoul and M. Barth, "Nearest neighbor algorithm for spherical codes from the Leech lattice," IEEE Trans. Inform. Theory, vol. 34, pp. 1188-1202, Sept. 1988.
[12] J.A. Bucklew and N.C. Gallagher, Jr., "Quantization schemes for bivariate Gaussian random variables," IEEE Trans. Inform. Theory, vol. IT-25, pp. 537-543, Sept. 1979.
[13] J.A. Bucklew and N.C. Gallagher, Jr., "Two-dimensional quantization of bivariate circularly symmetric densities," IEEE Trans. Inform. Theory, vol. IT-25, pp. 667-671, Nov. 1979.
[14] P.F. Swaszek, "A vector quantizer for the Laplace source," IEEE Trans. Inform. Theory, vol. 37, pp. 1355-1365, Sept. 1991.
[15] P.F. Swaszek, "Uniform spherical coordinate quantization of spherically symmetric sources," IEEE Trans. Commun., vol. COM-33, pp. 518-521, June 1985.
[16] P.F. Swaszek, "Asymptotic performance of Dirichlet rotated polar quantizers," IEEE Trans. Inform. Theory, vol. IT-31, pp. 537-540, July 1985.
[17] P.F. Swaszek and T. Ku, "Asymptotic performance of unrestricted polar quantizers," IEEE Trans. Inform. Theory, vol. IT-32, pp. 330-333, Mar. 1986.
[18] S. Na and D.L. Neuhoff, "Bennett's integral for vector quantizers," IEEE Trans. Inform. Theory, vol. IT-41, pp. 886-900, July 1995.
[19] D. Hui and D.L. Neuhoff, "Asymptotic analysis of optimal fixed-rate uniform scalar quantization," submitted to IEEE Trans. Inform. Theory, Aug. 1997.
[20] P. Panter and W. Dite, "Quantization in pulse-count modulation with nonuniform spacing of levels," Proc. IRE, vol. 39, pp. 44-48, Jan. 1951.
[21] A. Gersho, "Asymptotically optimal block quantization," IEEE Trans. Inform. Theory, vol. IT-25, pp. 373-380, July 1979.
[22] P.L. Zador, "Topics in the asymptotic quantization of continuous random variables," Bell Laboratories Technical Memorandum, 1966.
[23] R.M. Corless, G.H. Gonnet, D.E.G. Hare, D.J. Jeffrey, and D.E. Knuth, "On the Lambert W function," Advances in Computational Mathematics, vol. 5, pp. 329-359, 1996.
[24] A.N. Kolmogorov and S.V. Fomin, Introductory Real Analysis. Dover, New York, 1970.
[25] I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products. Academic Press, New York, 1994.
[26] C.E. Shannon, "Certain results in coding theory for noisy channels," Inform. Contr., vol. 1, pp. 6-25, 1957.
[27] K. Zeger and V. Manzella, "Asymptotic bounds on optimal noisy channel quantization via random coding," IEEE Trans. Inform. Theory, vol. 40, pp. 1926-1938, Nov. 1994.


Figure 4.1: Examples of polar quantizers: (a) restricted nonuniform, (b) restricted uniform, (c) unrestricted nonuniform.

Figure 4.2: The height H and width W of a polar quantization cell.

[Plot: optimal phase rate versus total rate; curves: Two-term Approximation, Three-term Approximation, Proposition 6, Pearlman, Swaszek, Equal allocation]

Figure 4.3: Optimal phase rates for restricted uniform polar quantization. The optimal rate from Proposition 6 is R. The two-term and three-term approximations are R_2 and R_3, respectively.

[Plot: support of magnitude quantizer versus rate; curves: Swaszek, Pearlman, Proposition 6, Three-term Approximation, Two-term Approximation]

Figure 4.4: Optimal magnitude quantizer support for restricted uniform polar quantization. The optimal support from Proposition 6 is L_M. The two-term and three-term approximations are L_{M,2} and L_{M,3}, respectively.

[Plot: SNR versus rate; curves: Two-term Approximation, Three-term Approximation, Proposition 6, Pearlman, Swaszek]

Figure 4.5: Maximum signal-to-noise ratio (SNR) for restricted uniform polar quantization. The SNR from Proposition 6 is S. The two-term and three-term approximations are S_2 and S_3, respectively.

[Plot: SNR versus rate; curves: Restricted Nonuniform PQ, Uniform PQ, Uniform Scalar Quantization]

Figure 4.6: Signal-to-noise ratio (SNR) of restricted nonuniform polar quantization, restricted uniform polar quantization, and uniform scalar quantization.

CHAPTER V

Conclusions
In this final chapter, we summarize the contributions of this dissertation and point to future research issues in the area.

Summary of Contributions
In Chapter II, we presented analytical formulas for the optimal scaling factor a_N and the resulting minimum distortion D_N when the source is generalized Gaussian with decay parameter 1. Upper and lower bounds to distortion were derived, and it was then shown that the ratio of the scaling factors that minimize each bound approaches one asymptotically, and that the ratio of each of these scaling factors to a_N approaches one asymptotically. It was also shown that for scale-optimized lattice quantization, granular distortion asymptotically dominates overload distortion, which leads directly to the asymptotic expression for D_N.

In Chapter III, multidimensional companding for memoryless sources was optimized asymptotically, under certain technical conditions on the compressor function. The proof of the result was based on variational calculus arguments. It was then seen that optimized multidimensional companding incurs the same point density and oblongitis losses as optimized scalar quantization, while recovering the cubic loss.
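The granular-versus-overload behavior noted above for Chapter II can be illustrated in one dimension. The sketch below is my own construction, not the dissertation's analysis: it numerically integrates the distortion of an N-level uniform scalar quantizer on a standard Gaussian source, splitting the error into granular (inside the support) and overload (outside it). The support rule L ≈ sqrt(2 ln N) + 1 is a simple heuristic standing in for the optimized scaling factor, not the a_N of Chapter II:

```python
import math

def gaussian_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def distortions(num_levels, support):
    """Granular and overload MSE of an N-level uniform quantizer on [-L, L]
    for a standard Gaussian source, by simple numerical integration."""
    step = 2.0 * support / num_levels
    levels = [-support + (i + 0.5) * step for i in range(num_levels)]
    granular = overload = 0.0
    dx = 1e-3
    x = -10.0
    while x < 10.0:
        # uniform quantizer with saturation: clamp the cell index
        idx = min(max(int((x + support) / step), 0), num_levels - 1)
        err = (x - levels[idx]) ** 2 * gaussian_pdf(x) * dx
        if -support <= x <= support:
            granular += err
        else:
            overload += err
        x += dx
    return granular, overload

# As N grows with this support rule, overload falls off much faster
# than granular distortion, so granular distortion dominates.
for n in (8, 32, 128):
    L = math.sqrt(2.0 * math.log(n)) + 1.0  # heuristic support, an assumption
    g, o = distortions(n, L)
    assert o < g
```

With the Gaussian tail decaying as exp(-L^2/2), pushing the support out logarithmically in N is enough for overload to become negligible next to the Δ²/12-type granular term, which is the qualitative content of the Chapter II result.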

In Chapter IV, we considered uniform polar quantization and found the optimal rate allocation between magnitude and phase. It was shown that the magnitude receives increasingly more rate as total rate increases. The optimal rates were shown to depend on the Lambert W function, an inverse function with no closed-form expression. We showed that a simple approximation to W is very accurate for total rates greater than 4. We also analyzed and optimized several nonuniform polar quantization methods by using Bennett's integral.
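The W dependence mentioned above can be made concrete. The sketch below is not the approximation used in Chapter IV (whose exact form is not reproduced here); it solves w e^w = x by Newton's method and compares the result against the standard three-term asymptotic approximation W(x) ≈ L1 - L2 + L2/L1, with L1 = ln x and L2 = ln L1, which is already quite accurate at moderate arguments:

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch of the Lambert W function for x > 0, via Newton's method.
    The start point ln(1 + x) lies above the root, so the iteration
    decreases monotonically to W(x)."""
    w = math.log(1.0 + x)
    while True:
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            return w

def w_approx(x):
    """Three-term asymptotic approximation, accurate for large x."""
    l1 = math.log(x)
    l2 = math.log(l1)
    return l1 - l2 + l2 / l1

x = 1.0e6
exact = lambert_w(x)
assert abs(exact * math.exp(exact) - x) < 1e-6 * x   # Newton solved w e^w = x
assert abs(w_approx(x) - exact) / exact < 0.01        # within 1% at x = 1e6
```

The rapid accuracy of such truncated expansions is consistent with the chapter's observation that a simple approximation suffices once the total rate exceeds 4.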

Future Research Issues


The results in this dissertation suggest a number of avenues for future research activity. We have optimized fixed-rate lattice quantization for a class of generalized Gaussian sources. What is the optimal scaling factor for an arbitrary multidimensional source? It may also be possible to derive the minimum MSE of lattice quantization for arbitrary sources without knowing the optimal scaling factor. The results in Chapter II assumed a nearly spherical support. Is the scaling factor different for different support shapes, such as Voronoi regions?

In multidimensional companding, the most promising work lies in extending the results in Chapter III. Namely, it would be useful to show that independent compressor functions are optimal over a broader class of compressor functions. It is hoped that this broader class might include compressor functions followed by a linear transform. This may then allow consideration of Gaussian sources with memory. The general companding question remains open: what is the best compander for an arbitrary source? Although this is a difficult problem, it may be possible to find optimal or near-optimal compressor functions in low dimensions using numerical methods.

In polar quantization, it would be interesting to extend the uniform polar results in Chapter IV to dimensions greater than two. What is the best lattice-based polar quantization method in three dimensions? For the two-dimensional case, optimizing unrestricted uniform polar quantization is still an open problem. Such an optimization would give further insight into the tradeoffs between restricted and unrestricted polar quantization and between nonuniform and uniform polar quantization.

ABSTRACT
ASYMPTOTIC ANALYSIS OF LATTICE-BASED QUANTIZATION by Peter Warren Moo

Chair: David L. Neuhoff

Lattice-based quantization is an attractive method of vector quantization because of its potentially low complexity and its ability to form nearly spherical cells. In this dissertation, three lattice-based quantization methods are optimized using high resolution analysis.

In the first part of this dissertation, fixed-rate lattice quantization for a class of generalized Gaussian sources is considered. Asymptotic expressions for the optimal scaling factor and resulting minimum distortion are presented. These expressions are derived by minimizing upper and lower bounds to distortion. It is also shown that for scale-optimized lattice quantization, granular distortion asymptotically dominates overload distortion, and the ratio of optimal lattice quantizer distortion to optimal vector quantizer distortion is shown to increase without bound as rate increases.

In the second part, multidimensional companding, which utilizes a nonlinear compressor function and a lattice quantizer, is optimized for memoryless sources. Under certain technical assumptions on the compressor function, it is shown that the best compressor function consists of the best scalar compressor functions for each component of the source vector. The gain of optimal companding compared to scalar quantization is shown to be the gain in normalized moment of inertia of the lattice cell compared to a cube.

The third part considers uniform polar quantization, which utilizes a polar transformation and a two-dimensional integer lattice quantizer. It is shown that the optimal rate allocation between magnitude and phase gives increasingly more rate to the magnitude as total rate increases, in contrast to nonuniform polar quantization, where the difference between optimal magnitude and phase rates is a constant. In addition, a unified analysis of several nonuniform polar quantization schemes is developed by focusing on their point densities and inertial profiles and using Bennett's integral to express the mean-squared error. The subsequent analysis is straightforward and leads to new insights into the relationship between polar quantization and Cartesian quantization.