Frank L. Borchardt
Duke University
ABSTRACT: After twenty years of disfavor, a technology has returned which imitates
the processes of the brain. Natural language experiments (Sejnowski & Rosenberg 1986)
demonstrate that a neural network computing architecture can learn from actual spoken
language, observe rules of pronunciation, and reproduce sounds from the patterns
derived by its own processes. The consequences of neural network computing for natural
language processing activities, including second language acquisition and
representation, machine translation, and knowledge processing, may be more convulsively
revolutionary than anything imagined in current technology. This paper introduces
neural network concepts to a traditional natural language processing audience.
Tolerant Computers
A certain exasperation is commonly experienced when a present-day
computer does what it is told to do instead of what its user intended it to do. The
reason for the simple-mindedness of computers is itself simple. The modern
deterministic, serial, digital computer has at its core a central processing unit,
which is capable (a) of following one instruction at a time, therefore in sequence,
and (b) of calculating an outcome arithmetically as either zero or one, or logically as
either true or false. To be sure, the modern computer can do this very quickly, in
some cases many millions of times a second, but nonetheless, finally, one
instruction at a time, in sequence, with an either/or outcome. That means that a
computer can tell very quickly whether someone has typed something correctly
or not, or will very quickly follow the erroneous instruction given it, with no
tolerance for error. With effort, to be sure, someone could probably program a
somewhat more flexible judge of typing, but finally that judge, too, has to
produce an unequivocal either/or for the machine to work. This constitutes in
part what is known as "the Von Neumann Bottleneck" (Brown 1986).
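A minimal sketch, in modern Python and with invented command names, shows the unforgiving either/or judgment at issue: typed input either matches an instruction exactly or it fails outright.

```python
# A minimal sketch (command names invented) of the "either/or" judgment
# described above: a conventional program matches input exactly or not at all.
COMMANDS = {"list", "copy", "delete"}

def interpret(typed: str) -> str:
    # Membership is strictly true or false; "lsit" is as wrong as "xyzzy".
    if typed in COMMANDS:
        return "executing " + typed
    return "error: unknown command '" + typed + "'"

print(interpret("list"))  # executing list
print(interpret("lsit"))  # error: unknown command 'lsit' -- no tolerance
```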
It has often been claimed that the activities and functions of the
human nervous system are so complicated that no ordinary
mechanism could possibly perform them. It has also been
attempted to name specific functions which by their nature
exhibit this limitation. It has been attempted to show that such
specific functions, logically, completely described, are per se
unable of mechanical neural realization. The McCulloch-Pitts
result puts an end to this. It proves that anything that can be
completely and unambiguously put into words is ipso facto
realizable by a suitable finite neural network. (McCorduck 1979,
65).
Pattern Recognition
Consider, for a moment, the instructions typed on a keyboard as
something other than mathematically or logically precise values: as patterns,
more or less like handwriting, where no two representations of the same letter
are precisely identical, but where one can detect, in most cases, a general
similarity from one representation to the next. Sometimes a letter can be
identified on its own (a "Palmer Method" handwritten 'a'); sometimes only by
comparison or by knowledge of the context (a sloppy handwritten 'e' as against a
sloppy handwritten 'l'). A computer that could deal with such patterns would have
to have a good idea what the intended character would look like ideally, yet
remain tolerant of a large number of variations, some of which might come closer
to the ideals of other letters altogether. The mathematics underlying problems of
precisely this kind began to be addressed in the forties and fifties.
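A minimal sketch of such tolerant matching, under stated assumptions (letters reduced to invented 3x3 binary grids), assigns an input to whichever ideal form it differs from least, so that a sloppy variant still finds its letter:

```python
# A minimal sketch of tolerant pattern matching. The 3x3 binary "letters"
# below are invented for illustration: an input is assigned to the ideal
# prototype it differs from least, rather than matched exactly.
IDEALS = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
}

def hamming(a, b):
    # Count the cells where the two patterns disagree.
    return sum(x != y for x, y in zip(a, b))

def closest(pattern):
    # Pick the ideal form with the smallest disagreement.
    return min(IDEALS, key=lambda name: hamming(pattern, IDEALS[name]))

# A "sloppy" L, one cell short of the ideal, is still recognized as an L.
sloppy_L = (1, 0, 0,
            1, 0, 0,
            1, 1, 0)
print(closest(sloppy_L))  # L
```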
The next major step forward in the development of theoretical models of
"tolerant" computers was taken in the area of pattern recognition. Pitts and
McCulloch (1947) themselves developed models which would recognize
"properties common to all possible variants of the pattern" and a mechanism by
which a new variant could be transformed into a standard representation
(Cowan & Sharp 1987, 12-13). In 1958, a decade after the publication of the
McCulloch-Pitts pattern recognition models, Frank Rosenblatt (Rosenblatt 1958,
386) invented what he called the "Perceptron," a front end to McCulloch-Pitts
networks by which they "could be trained to classify certain sets of patterns as
similar or distinct" (Cowan & Sharp 1987, 13). Within a couple of years (1960), a
comparable trainable device, the "Adaline," had been developed by Bernard
Widrow and Marcian Hoff at Stanford.
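A minimal sketch of a Perceptron-style training rule may make the idea concrete; the data, rate, and epochs below are invented for illustration, and Rosenblatt's own Perceptron was of course a device, not a program:

```python
# A minimal sketch of a perceptron-style training rule: whenever the unit
# misclassifies, nudge the weights toward the correct answer. Data, epochs,
# and learning rate are invented for illustration.
def train(samples, epochs=10, rate=0.1):
    w = [0.0, 0.0]  # one weight per input feature
    b = 0.0         # bias (negative threshold)
    for _ in range(epochs):
        for x, target in samples:  # target is 0 or 1
            out = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = target - out     # -1, 0, or +1
            w = [wi + rate * err * xi for wi, xi in zip(w, x)]
            b += rate * err
    return w, b

# Logical AND is linearly separable, so the rule settles on a solution.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train(AND))  # learned weights and bias that implement AND
```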
The Catastrophe
In the interim between the recognition of this problem and its solution, a
great disaster befell the development of "tolerant" computers. In the mid-sixties,
the chief proponents of Artificial Intelligence, alert to the logical gap in the then-
current neural models, are said to have successfully made the case before
government that further research in the area of neural networks was premature.
In 1969, Marvin Minsky and his associate S. A. Papert published a monograph
which proved that "elementary" (as they are now called) Perceptrons or Adalines
could not perform two crucial logical operations: exclusive OR and its negation,
not (exclusive OR). They conjectured then, and Minsky maintains now (Johnson
1987, 52), that no multi-layering of McCulloch-Pitts neurons within the
Perceptron or Adaline could solve the problem. For all practical purposes,
funding in the United States came to a dead stop for twenty years, and research
slowed to a crawl.
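The objection itself can be stated in a few lines. A single threshold unit fires when the weighted sum of its inputs exceeds a threshold, and exclusive OR imposes four conditions that no choice of weights and threshold can satisfy at once:

```latex
% A single unit fires when  w_1 x_1 + w_2 x_2 > \theta.
% Exclusive OR would require:
\begin{align*}
(0,0) \mapsto 0 &: \quad 0 \le \theta \\
(1,0) \mapsto 1 &: \quad w_1 > \theta \\
(0,1) \mapsto 1 &: \quad w_2 > \theta \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 \le \theta
\end{align*}
% The middle two lines give w_1 + w_2 > 2\theta \ge \theta (since \theta \ge 0),
% contradicting the last line: no single-layer unit computes exclusive OR.
```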
An important exception to that generalization is Stephen Grossberg of
Boston University, who continued to labor in neural networks for a handful of
admirers and now leads the recent explosive renewal of interest in the field as
first President of the International Neural Network Society. Research in Europe
was less affected by this turn of events: Christian von der Malsburg of the Max
Planck Institute at Göttingen, and Teuvo Kohonen of the University of Helsinki,
made important contributions to the field throughout the 'seventies
(Klimasauskas 1987, 49-53, 68-70, 123).
Minsky has remained skeptical. Even today, confronting dazzling
demonstrations of neural network applications, he is quoted as saying: "we don't
really know if these demonstrations are only the beginning—or the final
achievement" (Newsweek 110, no. 3 [20 July 1987]: 53).
Learning Machines
The properties of both spin-glass models and Hopfield nets closely resemble a
theory of learning published as early as 1949 by D. O. Hebb and going back, by
Hebb's own admission, to the very discovery of the neuron in the last decades of
the nineteenth century, namely, "connectionism" (Cowan & Sharp 1987, 9 & 38).
Hebb postulated that learning, and subsequent memory, take place as groups of
weakly connected cells organize into more strongly connected assemblies under
repeated stimulus and become relatively stable, that is, less susceptible to change
by new stimuli. This theory has been enormously influential in the development
of artificial neural nets, despite an absence of confirming evidence from
neurophysiological research (Cowan & Sharp 1987, 9).
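Hebb's postulate reduces to a very small rule, sketched below with invented activity values and learning rate: strengthen a connection whenever the cells at both of its ends are active together.

```python
# A minimal sketch of Hebb's postulate: a connection strengthens whenever
# the two cells it joins fire together. Activities and the learning rate
# are invented for illustration.
def hebbian_update(w, pre, post, rate=0.5):
    # The weight grows only when both connected cells are active at once.
    return w + rate * pre * post

w = 0.1  # a weakly connected pair of cells
for pre, post in [(1, 1), (1, 1), (1, 0), (1, 1)]:  # repeated joint stimulus
    w = hebbian_update(w, pre, post)
print(w)  # 1.6 -- the assembly is now much more strongly connected
```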
The problems pointed out by Minsky and Papert back in 1969 (variously
called the "credit assignment," the "exclusive OR," and the "T/C" problem) were
solved in rapid order, first by an adaptation of a Hopfield net, called a "Boltzmann
machine" by its inventors, Terrence Sejnowski of Johns Hopkins and Geoffrey
Hinton of Carnegie Mellon (Hinton & Sejnowski 1983). They introduced into a
Hopfield net a randomizing function in the shape of a version of the "Monte
Carlo" algorithm, well known to statisticians (and card sharks), a procedure
which permits the network occasionally to accept a change for the worse and
thereby to escape merely local solutions.
The Architectures
These dazzling experiments were performed on conventional computers
programmed to emulate the various unconventional architectures demanded by
neural networks. To differentiate among the architectures, it is possible to isolate
the significant categorical differences, which might be classified as "Time,"
"Manner," and "Place," in the honored taxonomy of adverbs:
*This paper was originally delivered at the national CALICO meetings, Salt Lake
City, 26 February 1988.
References
Author's Biodata
Dr. Frank L. Borchardt is Associate Professor of German at Duke
University and Chairman of the Department. He also functions as Principal
Investigator for Duke's Humanities Computing Projects.
Author's Address
Frank L. Borchardt
Department of German
Duke University
Durham, NC 27706