
Abstract It is hard to avoid ASCII art in today's digital world: from the ubiquitous emoticons ;) to the esoteric artistic creations that reside in many people's e-mail signatures, everybody has come across ASCII art at some stage. The origins of ASCII art can be traced back to the days when computers were expensive, slow and limited in their graphics capabilities, which forced computer programmers and enthusiasts to develop innovative ways to render images using the only graphics blocks available, viz., text characters. Here, we treat automatic ASCII art conversion of binary images as an optimisation problem, and present an application of our work on Non-Negative Matrix Factorisation to this task, where a basis constructed from monospace font glyphs is fitted to a binary image using a winner-takes-all assignment.

Keywords ASCII Art, Non-Negative Matrix Factorisation.

I Introduction
The art of creating images using text characters as their constituent elements was around long before the advent of computers. A well-known early example is a text arranged in the form of a mouse's tail, which is considered to be one of the first printed character art creations. In the digital realm, character art was first created using Radio Teletype (RTTY), a machine-to-machine communication method using radio or telephone lines. RTTY typically used the 5-bit Baudot code character encoding scheme, which provides a limited palette that includes only capital letters. It was not until the arrival of computer bulletin board systems and the Internet, when ASCII was the most popular standard for character encoding, that character art really took off, giving the name ASCII Art [1] to this creative endeavour. Furthermore, a subculture developed around the art, resulting in the creation of ASCII animations, comic books, etc.

ASCII art continues to capture the imagination of computer enthusiasts. The advent of text markup languages such as HTML has enabled ASCII art to evolve to include colour information, where it is possible to quantise images to a full RGB colour space. Furthermore, new character encoding schemes such as Unicode [2] have a huge variety of characters, enabling the creation of character art with a lot more detail.

In this work, we employ methods related to Non-Negative Matrix Factorisation (NMF) in the conversion of binary images, i.e., black and white, to ASCII art, where a fixed basis constructed from monospace font glyphs is fitted to the image, and the glyphs used to represent the image are selected using a winner-takes-all approach. Furthermore, we use a parameterisable divergence measure known as the $\beta$-divergence as the algorithm's reconstruction cost function, which enables the selection of a range of different cost functions, each producing different ASCII art. We focus our attention on classic ASCII art, which is created using the standard 7-bit ASCII table, resulting in a palette of 95 printable characters (numbered 32 to 126), as opposed to high ASCII art, which is created from an extended 8-bit ASCII set that includes block graphics characters [1].

This paper is organised as follows: We present an overview of NMF in Section II and discuss its application to ASCII art conversion of binary images in Section III. We present some examples in Section IV and finish the paper with a discussion and conclusion in Sections V and VI, respectively.

II Non-negative Matrix Factorisation
Non-Negative Matrix Factorisation [3, 4] is a method for the decomposition of multivariate data, where a non-negative matrix, $V$, is approximated as a product of two non-negative matrices, $V \approx WH$. NMF is a parts-based approach that makes no statistical assumption about the data. Instead, it assumes for the domain at hand, e.g. binary images, that negative numbers are physically meaningless, which is the foundation for the

assumption that the search for a decomposition should be confined to a non-negative space, i.e., the non-negativity assumption. The lack of statistical assumptions makes it difficult to prove that NMF will give correct decompositions. However, it has been shown in practice to give correct results. NMF, and its extensions, has been applied to a wide variety of problems including face recognition [5], brain imaging [6] and tensor factorisation [7]. Furthermore, in combination with a magnitude spectrogram representation, NMF has been applied to audio processing tasks such as speech separation [8, 9, 10] and automatic transcription of music [11].

a) Standard NMF Algorithm
Given a non-negative matrix $V \in \mathbb{R}^{\geq 0,\, M \times T}$, the goal is to approximate $V$ as a product of two non-negative matrices $W \in \mathbb{R}^{\geq 0,\, M \times R}$ and $H \in \mathbb{R}^{\geq 0,\, R \times T}$, $V \approx WH$,

$$ v_{ik} \approx \sum_{j=1}^{R} w_{ij} h_{jk}. \qquad (1) $$

Typically, $R < M$, where $W$ contains a low-rank basis and $H$ contains associated activations. Two NMF algorithms were introduced by Lee and Seung [4], each optimising a different cost function to measure reconstruction quality. The cost functions specified are the Squared Euclidean Distance (SED),

$$ D_{SED}(V, W, H) = \tfrac{1}{2} \| V - WH \|^2, \qquad (2) $$

and a generalised version of the Kullback-Leibler Divergence (KLD),

$$ D_{KLD}(V \| W, H) = \sum_{ik} \left( v_{ik} \log \frac{v_{ik}}{[WH]_{ik}} - v_{ik} + [WH]_{ik} \right). \qquad (3) $$

NMF is treated as an optimisation problem that minimises the selected cost function, and enforces a non-negativity constraint on the factors:

$$ \min_{W,H} D(V \| W, H) \qquad W, H \geq 0, $$


resulting in a parts-based decomposition, where the basis vectors in $W$ resemble parts of the input data, which can only be summed together to approximate $V$. Both Eq. 2 and Eq. 3 are convex in $W$ and $H$ individually, but not together. Therefore NMF algorithms usually alternate updates of $W$ and $H$. The cost functions are minimised using a diagonally rescaled gradient descent algorithm [4], which enforces the non-negativity constraint and leads to the following multiplicative updates for Squared Euclidean Distance (SED),

$$ w_{ij} \leftarrow w_{ij} \frac{[VH^T]_{ij}}{[WHH^T]_{ij}}, \qquad (4a) $$

$$ h_{jk} \leftarrow h_{jk} \frac{[W^T V]_{jk}}{[W^T W H]_{jk}}, \qquad (4b) $$

and Kullback-Leibler Divergence (KLD),

$$ w_{ij} \leftarrow w_{ij} \frac{\sum_{k=1}^{T} (v_{ik}/[WH]_{ik})\, h_{jk}}{\sum_{k=1}^{T} h_{jk}}, \qquad (5a) $$

$$ h_{jk} \leftarrow h_{jk} \frac{\sum_{i=1}^{M} w_{ij}\, (v_{ik}/[WH]_{ik})}{\sum_{i=1}^{M} w_{ij}}. \qquad (5b) $$

As the NMF algorithm iterates, its factors converge to a local optimum of its cost function. The parameter $R$, which specifies the number of columns in $W$ and rows in $H$, determines the rank of the approximation. If $R < M$ then $W$ is overdetermined and NMF reveals low-rank features of the data. The selection of an appropriate value for $R$ usually requires prior knowledge, and is important in obtaining a satisfactory decomposition.

III ASCII Art Conversion Using Non-Negative Constraints
To increase the flexibility of our NMF algorithm we employ the $\beta$-divergence as its cost function. The Beta Divergence (BD) (proposed as a cost function for NMF by [12]; also referred to as the modified alpha divergence [13]) is a parameterised divergence measure that encompasses the previously discussed cost functions of SED and KLD, and the Itakura-Saito Divergence (ISD) [14],

$$ D_{ISD}(V \| W, H) = \sum_{ik} \left( \frac{v_{ik}}{[WH]_{ik}} - \log \frac{v_{ik}}{[WH]_{ik}} - 1 \right). \qquad (6) $$

The NMF cost function utilising $\beta$-divergence is

$$ D_{BD}(V \| W, H, \beta) = \sum_{ik} \left( v_{ik} \frac{v_{ik}^{\beta-1} - [WH]_{ik}^{\beta-1}}{\beta(\beta-1)} + [WH]_{ik}^{\beta-1} \frac{[WH]_{ik} - v_{ik}}{\beta} \right), \qquad (7) $$

for $\beta = 2$, SED is obtained; for $\beta \to 1$, the divergence tends to KLD; and for $\beta \to 0$, it tends to ISD. The choice of the $\beta$ parameter depends on the statistical distribution of the data, and requires prior knowledge, see [15, Chapter 3]. The utility of the $\beta$-divergence cost function is that it enables the selection of many different reconstruction penalty schemes, as illustrated in Fig. 1, through the selection of a single parameter. In effect, it provides a wide selection of NMF algorithms, each producing different results.

For the purposes of binary image to ASCII art conversion, $W$ is known in advance, where the font glyphs of the specified monospace font are used to construct the basis. Therefore, no $W$ update is required, unlike in standard NMF. The columns of matrix $V$ are constructed from the binary image under consideration, where the image is partitioned into blocks, which are the same dimension as the glyphs used to construct $W$. Each image block corresponds to a single font glyph in the ASCII art image. $V$ is fitted to $W$ using the following update rule for $H$,

$$ h_{jk} \leftarrow h_{jk} \frac{\sum_{i=1}^{M} w_{ij}\, (v_{ik}/[WH]_{ik}^{2-\beta})}{\sum_{i=1}^{M} w_{ij}\, [WH]_{ik}^{\beta-1}}. \qquad (8) $$

Subsequent to fitting, the ASCII art representation of the image is indicated by $H$:

$$ V \approx W\, \mathrm{maxcol}(H, \theta), \qquad (9) $$

where the maxcol operator sets all entries to zero and replaces the maximum activation in each column with a one, provided that the maximum value is greater than the threshold, denoted here by $\theta$. From our experience, thresholding of the maximum values improves the appearance of the resultant ASCII art, as whitespace is more pleasing to the eye than a glyph with a small activation. The columns of the resulting approximation of $V$ contain the bitmaps for the winning glyphs at each block location; the block partitioning step is then reversed and the ASCII art image is constructed.

More formally, we use the following procedure for automatic conversion of binary images to ASCII art:
1. Construct $W$ from a monospace font, e.g., Courier, where the glyphs that represent the 95 printable characters (numbered 32 to 126) of the 7-bit ASCII character encoding scheme are stored as $M \times N$ bitmaps, which are arranged as vectors of size $MN$ and placed in the columns of $W$, $w_j$. Rescale each column to the unit L2-norm, $w_j = w_j / \|w_j\|$, $j = 1, \ldots, R$.
2. Partition the binary image $X \in \mathbb{R}^{P \times Q}$ into $M \times N$ blocks forming a $P/M \times Q/N$ grid, where each block corresponds to a font glyph in the final ASCII art image. Construct $V$ from the blocks by arranging them as vectors and placing them in columns. If $X$ is not evenly divisible into $M \times N$ blocks then perform zero padding

to the required dimensions.
3. Randomly initialise $H$; specify $\beta$ and the threshold $\theta$ of Eq. 9.
4. Fit $V$ to $W$ using the $H$ update rule (Eq. 8), and repeat for the desired number of iterations.
5. Assign each block location in the original image a glyph based on a winner-takes-all approach, where the maximum value in each column of $H$ corresponds to the winning glyph in $W$ (Eq. 9). Reverse the block partitioning procedure of step 2 and render the ASCII art image using the identified glyphs in the specified monospace font.

IV ASCII Art Examples
In order to demonstrate the utility of the proposed approach we select a test image (UCD CASL logo) and perform conversion using SED ($\beta = 2$) and KLD ($\beta = 1$). Following the procedure detailed in Section III, we specify 100 iterations of Eq. 8 and a threshold of $\theta = 0$, and construct $W$ from a Courier font (font file = c0419bt_.pfb) with a glyph size of 19 $\times$ 38 pixels (width $\times$ height). The glyph basis is fitted to our 1209 $\times$ 962 pixel test image, resulting in an ASCII representation with 9137 characters. By way of comparison, we also convert the test image using the pseudoinverse, $H = |(W^T W)^{-1} W^T V|$, and present the resultant ASCII art images in Fig. 2.

[Figure 2 panels: Binary Image, Pseudoinverse, NMF SED, NMF KLD]
Fig. 2: A test image (UCD CASL logo) and three ASCII art representations, which are created using the pseudoinverse and NMF utilising the SED ($\beta = 2$) and KLD ($\beta = 1$) cost functions. Inspection of the logo text reveals that NMF preserves the curves best and minimises black space. Furthermore, the selection of a different $\beta$ creates a different ASCII art representation.

On initial inspection, the most noticeable difference between the three ASCII art images is the glyph used to represent a fully black block ('p' for pseudoinverse, 'Q' for SED and '|' for KLD), which occurs because each cost function has a different notion of what a correct solution should be, resulting in the selection of different glyphs. Comparison of the pseudoinverse image to the SED image, where both methods produce minimum L2-norm solutions for an over-determined system of equations, reveals that the non-negative constraint on the SED image appears to make the logo text look better defined. This is especially evident when inspecting the logo letters C and S on the pseudoinverse image, where the curve at the top of each letter, as indicated in the original binary image, appears to be truncated. Therefore, it appears that introducing a non-negativity constraint minimises the black space in the ASCII representation, preserving the curves in the original binary image. Moreover, our subjective assertion is backed up quantitatively: the Frobenius norms of the matrix representations of the pseudoinverse and SED images are $1.0 \times 10^5$ and $9.5 \times 10^4$, respectively.

The selection of the $\beta$-divergence as the proposed algorithm's cost function introduces an element of flexibility to the algorithm, where different ASCII art can be produced for the same input image by specifying a different $\beta$. The effect of the selection of $\beta$ is demonstrated in Fig. 2, where it is evident that the KLD image utilises different glyphs in its ASCII representation than SED, while continuing to minimise black space. Using the proposed method, we present a number of ASCII art examples of various images in Fig. 3.
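The conversion procedure of Section III can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used to produce the figures: the 3 x 3 toy glyph set `GLYPHS`, the function names, the small constant `eps` that guards against division by zero, and the decision to leave the blank glyph out of the basis (so that whitespace arises only through the threshold) are all hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical 3x3 toy glyphs standing in for real font bitmaps (assumption).
GLYPHS = {
    "|": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
    "#": np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]]),
    "/": np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]]),
    "\\": np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]),
}


def image_to_blocks(X, gh, gw):
    """Step 2: zero-pad X, cut it into gh x gw blocks, one block per column of V."""
    P, Q = X.shape
    rows, cols = -(-P // gh), -(-Q // gw)  # ceiling division
    Xp = np.zeros((rows * gh, cols * gw))
    Xp[:P, :Q] = X
    V = Xp.reshape(rows, gh, cols, gw).transpose(1, 3, 0, 2)
    return V.reshape(gh * gw, rows * cols), (rows, cols)


def fit_activations(V, W, beta=2.0, n_iter=100, eps=1e-12, seed=0):
    """Steps 3-4: multiplicative H update of Eq. 8 with W held fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        numer = W.T @ (V * WH ** (beta - 2.0))
        denom = W.T @ (WH ** (beta - 1.0)) + eps
        H *= numer / denom
    return H


def maxcol(H, theta=0.0):
    """Eq. 9: one-hot winner per column, kept only if the maximum exceeds theta."""
    winners = H.argmax(axis=0)
    cols = np.arange(H.shape[1])
    keep = H[winners, cols] > theta
    out = np.zeros_like(H)
    out[winners[keep], cols[keep]] = 1.0
    return out


def to_ascii(X, glyphs, beta=2.0, n_iter=100, theta=0.0):
    """Steps 1-5: build W from glyph bitmaps, fit, and return the text lines."""
    chars = list(glyphs)
    gh, gw = glyphs[chars[0]].shape
    W = np.stack([glyphs[c].ravel() for c in chars], axis=1).astype(float)
    W /= np.linalg.norm(W, axis=0, keepdims=True)  # unit L2-norm columns
    V, (rows, cols) = image_to_blocks(X, gh, gw)
    A = maxcol(fit_activations(V, W, beta, n_iter), theta)
    picks = [chars[A[:, k].argmax()] if A[:, k].any() else " "
             for k in range(A.shape[1])]
    return ["".join(picks[r * cols:(r + 1) * cols]) for r in range(rows)]


if __name__ == "__main__":
    demo = np.block([[GLYPHS["|"], GLYPHS["-"], np.zeros((3, 3))],
                     [GLYPHS["#"], GLYPHS["/"], GLYPHS["\\"]]])
    print("\n".join(to_ascii(demo, GLYPHS)))
```

With a real font, `GLYPHS` would instead hold the rasterised bitmaps of the printable characters, and `beta` would be varied to obtain the different renderings discussed above.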
Fig. 3: ASCII art representations, created by the proposed NMF procedure, of Homer J. Simpson, the Aphex Twin Logo and Andy Warhol's print of Marilyn Monroe (after thresholding), which are available at http://ee.ucd.ie/_pogrady/
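The special cases of the $\beta$-divergence stated in Section III (SED at $\beta = 2$, KLD as $\beta \to 1$, ISD as $\beta \to 0$) can be checked numerically for a single pair of positive values. The following is a small sketch for that purpose only; the function name `beta_divergence` and the scalar interface are assumptions made for the example.

```python
import math


def beta_divergence(v, y, beta):
    """Element-wise beta-divergence between positive scalars v and y (one term of Eq. 7).

    The cases beta == 1 and beta == 0 are handled with their limiting forms,
    i.e. the KLD term of Eq. 3 and the ISD term of Eq. 6.
    """
    if beta == 1:
        return v * math.log(v / y) - v + y
    if beta == 0:
        return v / y - math.log(v / y) - 1
    return (v * (v ** (beta - 1) - y ** (beta - 1)) / (beta * (beta - 1))
            + y ** (beta - 1) * (y - v) / beta)


if __name__ == "__main__":
    v, y = 2.0, 3.0
    # beta = 2 should match the SED term 0.5 * (v - y) ** 2.
    print(beta_divergence(v, y, 2), 0.5 * (v - y) ** 2)
    # Values of beta close to 1 and 0 should approach the KLD and ISD terms.
    print(beta_divergence(v, y, 1 + 1e-6), beta_divergence(v, y, 1))
    print(beta_divergence(v, y, 1e-6), beta_divergence(v, y, 0))
```

Summing this quantity over all entries of $V$ and $WH$ gives the cost of Eq. 7.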

V Discussion
It may be possible to improve the resultant ASCII art representations by finding the most natural grid for the binary image, which may be achieved by shifting the image both vertically and horizontally and fitting the image to $W$. The grid that results in the best reconstruction, as indicated by the signal-to-noise ratio for example, may be considered to be the most natural grid.
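One way this grid search might look is sketched below. It is an illustration of the idea rather than a tested extension of the method, and it rests on several assumptions: only the SED case ($\beta = 2$) is fitted, $H$ is initialised to a constant rather than randomly so that the result is repeatable, shifts are realised by zero padding on the top and left, and the reconstruction used for the signal-to-noise ratio is the binary bitmap of each winning glyph. All function names and the layout of `B` are hypothetical.

```python
import numpy as np


def blocks(X, gh, gw):
    """Zero-pad X and return its gh x gw blocks as the columns of V."""
    P, Q = X.shape
    rows, cols = -(-P // gh), -(-Q // gw)  # ceiling division
    Xp = np.zeros((rows * gh, cols * gw))
    Xp[:P, :Q] = X
    return Xp.reshape(rows, gh, cols, gw).transpose(1, 3, 0, 2).reshape(gh * gw, -1)


def winners_sed(V, W, n_iter=100, eps=1e-12):
    """SED multiplicative H update (Eq. 4b) with W fixed; returns winner index and value."""
    H = np.full((W.shape[1], V.shape[1]), 0.5)  # constant init (assumption)
    numer = W.T @ V
    for _ in range(n_iter):
        H *= numer / (W.T @ (W @ H) + eps)
    return H.argmax(axis=0), H.max(axis=0)


def best_grid_shift(X, B, gh, gw, theta=0.0):
    """Try every vertical/horizontal offset within one glyph cell.

    B holds the binary glyph bitmaps as columns, shape (gh * gw, R).
    Returns (snr_in_dB, dy, dx) for the shift with the highest SNR.
    """
    W = B / np.linalg.norm(B, axis=0, keepdims=True)
    best = None
    for dy in range(gh):
        for dx in range(gw):
            V = blocks(np.pad(X, ((dy, 0), (dx, 0))), gh, gw)
            idx, top = winners_sed(V, W)
            recon = B[:, idx] * (top > theta)  # blank where below threshold
            noise = np.sum((V - recon) ** 2)
            snr = np.inf if noise == 0 else 10 * np.log10(np.sum(V ** 2) / noise)
            if best is None or snr > best[0]:
                best = (snr, dy, dx)
    return best
```

The search costs one fit per candidate offset, i.e. `gh * gw` fits in total, so for a 19 x 38 pixel glyph it would multiply the conversion time by several hundred.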

The chosen glyphs in an ASCII art image are selected based on a winner-takes-all approach (Eq. 9). It is possible to reduce the number of activations in $H$ by using a sparse NMF algorithm [10], which may result in fewer iterations to achieve the same ASCII art representation.

For the glyph set used to construct $W$ in our examples, 'M' had the largest amount of black space as indicated by the Frobenius norm. However, 'M' was not chosen as the fully black block glyph using any of the presented cost functions, which suggests that a more suitable cost function exists.

The utility of ASCII art in the early computing era is clear. In today's world, where transmission of photographic-quality images is not a problem, ASCII art still has relevance. For example, the proposed method may be employed in image manipulation software, or may be used to create ASCII art for the many bulletin board systems that are still popular today, e.g. 2channel [16].

Finally, in this work we concentrate on binary images, where the resultant ASCII art is monochromatic. However, it is possible to create multi-colour ASCII art, where a binary image is created from a colour image and ASCII art conversion is performed, giving a monochromatic ASCII art representation, which is subsequently used to mask the original colour image.

VI Conclusion
In this paper, we presented a novel application of NMF-related methods to the task of automatic ASCII art conversion, where we fit a binary image to a basis constructed from monospace font glyphs using a winner-takes-all assignment. We presented some examples, and demonstrated that, when compared to a standard pseudoinverse approach, non-negative constraints minimise the black space of the ASCII art image, producing better defined curves. Furthermore, we propose the use of the $\beta$-divergence cost function for this task, as it provides an element of control over the final ASCII art representation.

Acknowledgements
This material is based upon works supported by the Science Foundation Ireland under Grant No. 05/YI2/I677.

References
[1] Wikipedia. ASCII Art - Wikipedia, the free encyclopedia, 2008. [Online; accessed 11-March-2008].
[2] Wikipedia. Unicode - Wikipedia, the free encyclopedia, 2008. [Online; accessed 11-March-2008].
[3] P. Paatero and U. Tapper. Positive matrix factorization: A nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics, 5:111-26, 1994.
[4] Daniel D. Lee and H. Sebastian Seung. Algorithms for non-negative matrix factorization. In Adv. in Neu. Info. Proc. Sys. 13, pages 556-62. MIT Press, 2001.
[5] David Guillamet and Jordi Vitrià. Classifying faces with non-negative matrix factorization, 2002.
[6] M. N. Schmidt and H. Laurberg. Non-negative matrix factorization with Gaussian process priors. Computational Intelligence and Neuroscience, 2008.
[7] Amnon Shashua and Tamir Hazan. Non-negative tensor factorization with applications to statistics and computer vision. In ICML '05: Proceedings of the 22nd International Conference on Machine Learning, pages 792-799, New York, NY, USA, 2005. ACM.
[8] Paris Smaragdis. Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs. In Fifth International Conference on Independent Component Analysis, LNCS 3195, pages 494-9, Granada, Spain, September 22-24 2004. Springer-Verlag.
[9] D. FitzGerald, M. Cranitch, and E. Coyle. Sound source separation using shifted non-negative tensor factorisation. In Proceedings, IEEE International Conference on Acoustics, Speech and Signal Processing, 2006.
[10] Paul D. O'Grady and Barak A. Pearlmutter. Discovering convolutive speech phones using sparseness and non-negativity. In Seventh International Conference on Independent Component Analysis, LNCS 4666, pages 520-7, London, UK, September 2007. Springer-Verlag.
[11] S. A. Abdallah and M. D. Plumbley. Polyphonic transcription by non-negative sparse coding of power spectra. In Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), pages 318-25, 2004.
[12] Raul Kompass. A generalized divergence measure for non-negative matrix factorization. In Neuroinformatics Workshop, Torun, Poland, September 2005.
[13] Andrzej Cichocki, Rafal Zdunek, and Shun-ichi Amari. Csiszár's divergences for non-negative matrix factorization: Family of new algorithms. In Justinian P. Rosca, Deniz Erdogmus, José Carlos Príncipe, and Simon Haykin, editors, Independent Component Analysis and Blind Signal Separation, 6th International Conference, ICA 2006, Charleston, SC, USA, March 5-8, 2006, Proceedings, volume 3889 of Lecture Notes in Computer Science, pages 32-39. Springer, 2006.
[14] F. Itakura and S. Saito. An analysis-synthesis telephony based on maximum likelihood method. In 6th Int. Conf. Acoustics, pages 17-20, 1968.
[15] Paul D. O'Grady. Sparse Separation of Under-Determined Speech Mixtures. PhD thesis, National University of Ireland Maynooth, 2007.
[16] Wikipedia. 2channel - Wikipedia, the free encyclopedia, 2008. [Online; accessed 21-May-2008].
