Melodic Similarity

Melodic Similarity
Concepts, Procedures, and Applications

Computing in Musicology 11

Edited by
Walter B. Hewlett
Eleanor Selfridge-Field

The MIT Press
Cambridge, Massachusetts
London, England

CCARH
Stanford University
Stanford, CA

title: Melodic Similarity: Concepts, Procedures, and Applications. Computing in Musicology; 11
author: Hewlett, Walter B.; Selfridge-Field, Eleanor.
publisher: MIT Press
isbn10 | asin: 0262581752
print isbn13: 9780262581752
ebook isbn13: 9780585069784
language: English
subject: Melodic analysis--Data processing.
publication date: 1998
lcc: ML73.D57 1998eb
ddc: 781.2
inside front cover
The series Computing in Musicology is a co-publication of the Center for Computer Assisted Research in the Humanities and The MIT Press. Established in 1985, CM treats topics related to the acquisition, representation, and use of musical information in applications related to musical composition, sound, notation, analysis, and pedagogy and to significant digital collections of textual information supporting the study of music.

Editorial matters and enquiries concerning submissions should be directed to CCARH, Braun #129, Stanford University, Stanford, CA 94305-3076. Prospective contributors should consult the guidelines given on the last page of this book and send a query to esf@ccrma.stanford.edu.

Editors: WALTER B. HEWLETT, ELEANOR SELFRIDGE-FIELD
Associate Editor: EDMUND CORREIA, JR.
Assistant Editor: DON ANTHONY

Advisory Board: MARIO BARONI, LELIO CAMILLERI, TIM CRAWFORD, EWA DAHLIG, ICHIRO FUJINAGA, DAVID HALPERIN, JOHN WALTER HILL, KEIJI HIRATA, JOHN HOWARD, DAVID HURON, THOMAS J. MATHIESEN, KIA NG, JOHN STINSON, YO TOMITA, ARVID VOLLSNES, LISA WHISTLECROFT, FRANS WIERING

Volume 11 and subsequent issues of Computing in Musicology are distributed by The MIT Press, Massachusetts Institute of Technology, Cambridge, MA, and London, England (http://mitpress.mit.edu). Back issues, highlights of which are listed on the inside back cover, are available from CCARH.

CCARH welcomes queries, offprints, unpublished studies (including theses), notices of work-in-progress, and citations for Web sites of interest to its readers. Links may be found at http://musedata.stanford.edu
This book contains characters with diacritics. When the characters can be represented using the ISO 8859-1 character set (http://www.w3.org/TR/images/latin1.gif), netLibrary will represent them as they appear in the original text, and most computers will be able to show the full characters correctly. In order to keep the text searchable and readable on most computers, characters with diacritics that are not part of the ISO 8859-1 list will be represented without their diacritical marks.

© 1998 Center for Computer Assisted Research in the Humanities

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

This issue of Computing in Musicology is dedicated to the memory of Prof. Dr. Helmut Schaffrath (1942-1994).

ISBN 0-262-58175-2
ISSN 1057-9478
Library of Congress Catalog Card Number 98-88104

Printed and bound by Thomson-Shore, Dexter, Michigan.

Java and UltraSPARC are registered trademarks of Sun Microsystems, Inc. Macintosh is a registered trademark of Apple Computer, Inc. MuseData is a registered trademark of the Center for Computer Assisted Research in the Humanities. SCORE is a registered trademark of San Andreas Press. Windows is a registered trademark of Microsoft Corp. Other product names mentioned in this text may also be protected.

The Center for Computer Assisted Research in the Humanities is a non-profit educational research facility located at Braun #129, Stanford University, Stanford, CA 94305-3076.

Tel.: (650) 725-9240; (800) JSB-MUSE
Fax: (650) 725-9290
ccarh@ccrma.stanford.edu
http://musedata.stanford.edu
TABLE OF CONTENTS

Preface

I. Concepts and Procedures

1. Conceptual and Representational Issues in Melodic Comparison
   Eleanor Selfridge-Field
   CCARH, Stanford University
      Concepts of Melody
      Searchable Representations of Pitch
      Searchable Representations of Duration and Accent
      Strategies for Multi-dimensional Data Comparison
      Prototypes, Reductions, and Similarity Searches
      Conclusions

2. A Geometrical Algorithm for Melodic Difference
   Donncha Ó Maidín
   Department of Computer Science, University of Limerick, Ireland
      General Properties of a Difference Algorithm
      Implementation

3. String-Matching Techniques for Musical Similarity and Melodic Recognition
   Tim Crawford, Costas S. Iliopoulos, and Rajeev Raman
   Departments of Music and Computer Science, King's College, London, UK
      Introduction
      String-Matching Problems in Musical Analysis
      Exact-Match Algorithms
      Inexact-Match Algorithms
      Musical Examples

4. Sequence-Based Melodic Comparison: A Dynamic-Programming Approach
   Lloyd A. Smith, Rodger J. McNab, and Ian H. Witten
   Department of Computer Science, University of Waikato, Hamilton, NZ
      Previous Research in the Field
      Sequence-Comparison Using Dynamic Programming
      Algorithms for Melodic Comparison
      Additional Operations for Comparing Music
      Effect of Music Representation and Match Parameters
      A Sample Application: Retrieving Tunes from Folksong Databases
      Conclusion

5. Strategies for Sorting Musical Incipits
   John Howard
   Digital Librarian, Widener Library, Harvard University
      Potentials and Practicalities
      The Frankfurt Experience
      The Harvard Experience

6. Signatures and Earmarks: Computer Recognition of Patterns in Music
   David Cope
   Porter College, University of California, Santa Cruz
      Signatures
      Earmarks

II. Tools and Applications

7. A Multi-scale Neural-Network Model for Learning and Reproducing Chorale Variations
   Dominik Hörnel
   Institute for Logic, Complexity, and Deduction Systems, University of Karlsruhe, Germany
      Background
      Task Description
      A Multi-scale Neural Network Model
      Motivic Classification and Recognition
      Network Structure
      Intervallic Representation
      System Performance
      Conclusions

8. Melodic Pattern-Detection Using MuSearch in Schubert's Die schöne Müllerin
   Nigel Nettheim
   University of New South Wales, Australia
      The MuSearch Program
      Musical Examples
      Conclusions

9. Rhythmic Elements of Melodic Process in Nagauta Shamisen Music
   Masato Yako
   Kyushu Institute of Design, Fukuoka, Japan
      Shamisen Music of the Edo Era
      Towards a Catalogue of Melodic Types
      Classification of Rhythmic Patterns
      Observations on Nagauta Shamisen Rhythmic Patterns and Features of Melodic Process
      Summary of Results

III. Human Melodic Judgments

10. Concepts of Melodic Similarity in Music-Copyright Infringement Suits
    Charles Cronin
    School of Information Management, University of California, Berkeley
      Fundamentals
      Demonstrating Melodic Similarity
      Some Case Decisions
      Conclusion

11. Judgments of Human and Machine Authorship in Real and Artificial Folksongs
    Ewa Dahlig
    Helmut Schaffrath Laboratory, Polish Academy of Sciences, Warsaw, Poland
    and Helmut Schaffrath
    [Hochschule für Musik, Essen University, Germany]
      Correlations of Judgment with Musical Features
      Correlations of Judgment with Social Factors

IV. Online Tools for Melodic Searching

12. MELDEX: A Web-based Melodic Index Service
    David Bainbridge
    Department of Computer Science, University of Waikato, Hamilton, New Zealand
      Databases
      Software
      Search Algorithm
      The New Zealand Digital Library Project

13. Themefinder: A Web-based Melodic Search Tool
    Andreas Kornstädt
    Computer Science Division, Department of Human Interface Design, University of Hamburg
      Using Themefinder
      List of Features
      Technical Background
      Future Plans

Index
PREFACE

With this issue, Computing in Musicology makes its debut as a co-publication with MIT Press. We believe that this change will benefit both our readers and our contributors. We welcome your comments.

This special issue is devoted entirely to the subject of melodic processing. Much of what it discusses is more concerned with conceptualization than with technology. Yet it is technology that makes the subject topical. Computers offer the potential of enabling us to perform many tasks related to melodic searching, recognition, comparison, or generation, yet their purveyors sometimes convey a false sense of the ease with which we can produce useful results.

Quite apart from the details of technology, we lack an articulate vocabulary for discussing melody. The intellectual apparatus for framing specific questions awaits development. Technology cannot proceed without it, except in superficial ways that may enjoy only temporary value.

The subject of melody cuts across many subdisciplines including music history, theory, analysis, cognition, perception, and ethnic studies. The subject has many bridges to other disciplines as well. Thus two articles bring perspectives to similarity-searching in general from the domain of computer science. Another contribution recounts legal judgments on claims of melodic plagiarism. One incorporates procedures from the social sciences. In the introductory article some fundamental issues of conceptualization and data representation are considered.

While the contributions included in this issue are representative of the range of approaches that have been proposed, they barely suggest all the possible implementations or adaptations to particular repertories or query types. Much important work is still to come.

Many of the contributions presented here have been given at conferences but none has been published previously. A conference on computer applications in music (Belfast, 1991) was the original venue for the work of Selfridge-Field. John Howard's work was presented at the study session organized by the International Musicological Society's Study Group on Musical Data and Computer Applications (Madrid, 1992). Ó Maidín's paper was given at a meeting of the same group (London, 1997), while drafts of the papers by the King's and Waikato consortia were given at a special follow-on conference (King's College, London, also 1997). The contributions of Hörnel and Yako have been given at conferences in their respective fields and
countries. Cronin's paper provides the basis for a conference on music and copyright provisionally scheduled to take place in 1999. Kornstädt's work, which builds on software developed by David Huron, was scheduled to debut at the Colloquio sull'Informatica Musicale, held in collaboration with the IEEE Computer Society Task Force on Computer-Generated Music, in Gorizia (Italy) in September 1998.

<><><><><><><><><><><><>

It is wholly appropriate that we dedicate this issue of Computing in Musicology to our late colleague, Prof. Dr. Helmut Schaffrath, who was prodigiously involved in the development of datasets and tools for melodic searching for twelve years prior to his untimely death in 1994. The value of his example and his direct contribution to the resolution of issues both conceptual and practical has become more apparent with each passing year. Users of the Essen databases now seem to exist on five continents.

Helmut is remembered as the capable organizer of the study group on computer applications of the International Council on Traditional Music (ICTM), as a key member of the study group on computer applications of the International Musicological Society (IMS), as an imaginative teacher at the Essen Gesamthochschule für Musik, and as a diligent scholar of the indigenous musics of Europe, South America, and China. Prior to his studies in these areas he had been apprenticed to an organ-builder, which may well have provided some of the intellectual and technical foundations for his prescient grasp of issues and his inventiveness in methodology. A substantial list of his publications accompanies his contribution to Beyond MIDI: The Handbook of Musical Codes (MIT Press, 1997). The article we include here, long dormant in our files, was summarized by Helmut a few months before his death from a more extensive study made in collaboration with Ewa Dahlig, who has become the official custodian of his work.

<><><><><><><><><><><><>

Significant contributions to the production of this issue were made by Don Anthony, who set most of the musical examples; by Akiko Orita, who explored the mysteries of obsolete Hiragana as used in shamisen music (thus enabling us to prepare the illustrations for Yako's article); and by Craig Sapp, who resolved many hardware and software issues at appropriate junctures. Ed Correia prepared camera-ready copy, implementing our new format and indexing the entire volume with characteristic diligence. Doug Sery and many other members of the MIT Press staff have been endlessly helpful in resolving the practical details of our new publication arrangements.
Readers should note that in December 1996 the Center for Computer Assisted Research in the Humanities moved to Stanford University and that address and contact information in earlier publications is now obsolete. Current particulars are given on the obverse of the title page. General information for contributors is found on the inside front cover and details of style are given on the last page of the book. The inside back cover itemizes the contents of previous issues of Computing in Musicology, which continue to be handled by CCARH.

STANFORD UNIVERSITY
AUGUST 1998
I. CONCEPTS AND PROCEDURES
1 Conceptual and Representational Issues in Melodic Comparison

Eleanor Selfridge-Field
CCARH
Braun Music Center
Stanford University
Stanford, CA 94305-3076
esf@ccrma.stanford.edu

Abstract

Melodic discrimination is a fundamental requirement for many musical activities. It is essential to psychological recognition and often signifies cultural meaning. In computer applications, melodic searching and matching tasks have been extensively explored in the realm of bibliographical tools and monophonic repertories but have been relatively little used in analyses of polyphonic works.

Computer approaches to melodic comparison often produce vague or unacceptably prolific lists of false "matches." In some cases these occur because the underlying representation of the melodic material is too general. In other cases they arise from queries that are poorly adapted to the kind of repertory being examined or are inadequately articulated. Some queries assumed to be simple must operate within contexts that are conceptually complex or purposefully ambiguous.

Lessons we have learned from finding-tool projects and monophonic repertories may help to clarify and resolve issues of representation, design, and conceptualization as we approach studies of entire corpora of encoded materials and daunting quests for "melodic similarity."
The ability to recognize melodic similarity lies at the heart of many of the questions most commonly asked about music. It is melody that enables us to distinguish one work from another. It is melody that human beings are innately able to reproduce by singing, humming, and whistling. It is melody that makes music memorable: we are likely to recall a tune long after we have forgotten its text.

It is often the subtlety of the effect that leads us to consider the possibility of engaging a computer in our research. Music bibliographers want to be able to overcome the deceptive impression of difference given by transposition or re-orchestration. They want help in resolving questions of identification and attribution. Music historians want dependable tools for locating similar tunes disguised by modular substitutions (as in the transmission of early chant), retexting (as in masses that parody motets), consolidation of parts (as in lute or keyboard transcriptions of chansons), and elaboration (as in divisions, diminutions, and variations). Folk-music researchers seeking to identify tune families depend on stress constants where rhythmic details, reflecting variations of text, may vary.

1.1 Concepts of Melody

1.1.1 Intellectual Frameworks

For two centuries or more theorists have concentrated their attention on harmony, counterpoint, and "form" in examining the fabric of music of the past. Rightly they proclaim that these aspects of music distinguish European art music from that of other cultures. In consequence, the component elements of harmony and counterpoint are rigorously and systematically described in the literature of music theory. Such terms as "I6" and "second species" are unambiguous in meaning, at least from the perspective of observation: we would not identify a V7 as a I6 or fourth-species counterpoint as second.

The rule-based vocabularies of harmony and counterpoint are moderately supportive of efforts at artificial composition in the styles of the sixteenth and eighteenth centuries. Schottstaedt's implementations of species counterpoint following Fux's teachings (1989) offer an apt example. It has generally been recognized by those engaged in generative applications that even those rule-systems which we regard as extensive are far from exhaustive. Thus the task of deriving "invisible" rules of practice, never expressed formally in music-theoretical works, has attracted welcome
attention. The rules accumulated through Kemal Ebcioglu's artificial harmonizations of chorale melodies set by J. S. Bach (1986; 1992), for example, now number more than 300.

The intellectual framework for discussions of melody in European art and folk music is not yet nearly so well formed. Formal discussions of melody in German and Italian music theory of the eighteenth century are engaging but few, particularly in comparison with the copious literature on harmony, counterpoint, and form of the past three centuries, and even, vis-à-vis recent literature, with expositions of reductionist techniques of musical analysis or principles of music perception and cognition. This is partly the result of the idea, current in earlier eras, that the construction of melody allowed for inspiration and experimentation, for permutation and transformation. The "invention" of a melody was considered to be concept-based, but it was not rule-driven in the same way that harmonizations and realizations of form were. Folk melodies were considered to have arisen "unconsciously."

There is little automatic agreement on definitions for various manifestations of melody. We may all believe we have a common focus in mind when we discuss our notions of a theme, a phrase, a motive, or a fugal subject. When it comes to the design of computer applications, however, we may disagree about thematic hierarchies, about phrase boundaries, about motivic overlaps, or about the termination point of a fugal subject. These conceptual lapses are quite paralytic, because computers cannot make intuitive judgments for themselves and there is too little consensus for them to make "scientific" judgments founded on our beliefs.

The degree to which melody is a dominant part of the musical fabric is another conceptual variable. It goes without saying that folksongs in the Western tradition frequently offer a 1:1 ratio of melodic content to overall length, but the ratio is almost always diminished in art music. The extent to which it is diminished is highly variable. The point of departure for many recent discussions of melodic and structural processes has been the opening theme of Mozart's Piano Sonata K. 331 (Figure 1).

Figure 1. The opening theme of Mozart's Piano Sonata K. 331.

We must remember, however, that Mozart is extremely cogent in his musical thinking and precise in his definition of thematic material, and rarely more so than here. It is not so easy to identify motives and phrases in the rambling prose of, let us say, Glinka, or in the whimsy of seventeenth-century toccatas. These and a host of other less tractable repertories are little discussed in recent analytical literature, much less in computer applications.

The degree to which the definition of a melody may be determined by cultural convention is another issue that warrants respect. As long as we work within the European/American art-music tradition we may be able to ignore the possible confounds that divergent cultural views may bring to estimates of melodic similarity, but their influence should nonetheless be acknowledged. Ethnomusicologists, beginning with Frances Densmore's 1918 study of Teton Sioux music, seem to have the longest-lived interest in methods of melodic comparison. This is all the more commendable because these scholars work against an enormous obstacle as they change venues. George List's revealing article about Hopi concepts of melody relates that the Hopi "conceives of melody as a series of contours that are differentiated by their combination of rising and falling pitch lines. For two performances of a contour to be considered the same, only a general relationship must be maintained" (1985: 152).

Where precise definitions of a melody are intended, it is well known that lapses occur in oral transmission. Conversely, however, Spitzer (1994) noted that in a situation in which variation and improvisation were acceptable (American minstrel shows), written variants of "Oh! Susanna" nonetheless tended towards regularization of accent and simplification of tonal structure.

Differences within academic subcultures also flavor our findings. Approaches to melodic comparison issuing from computer science and other statistically oriented fields value "degrees" of similarity in comparative studies, but researchers nurtured in the humanities and the performing arts tend to seek decisive answers, or at least persuasive interpretations. They want to know whether one work is or is not a paraphrase of another. If a continuum between quantitative and qualitative evaluations could be constructed, its use might be appropriate in the field of melodic comparison: some results that are "decisive" are nonetheless incorrect while, within the framework of results rated by degrees of confidence, statistical findings may occasionally oppose psychological verities.

Writing on composition at the end of the eighteenth century, the theorist Heinrich Christoph Koch attempted to provide a series of "mechanical rules"
for the creation of melodies (1782-93, partial translation 1983; see also Baker 1988). He considered only genius to be capable of endowing a melody with beauty and only taste to facilitate the perception of this beauty. These obviously are human qualities that lie outside the music itself. Koch's citation of them reminds us that our concepts are ultimately an amalgamation of what is within the music and what is within our minds. Computers can only address the first.

1.1.2 Interdisciplinary Contexts

Interest in melodic studies has risen significantly in recent years. An important part of this interest is separate from, but secondarily related to, the rise of computer use. This secondary relationship often comes about through a sympathy for some other discipline which itself has been heavily influenced by computer use.

For example, Eugene Narmour's books on the Analysis and Cognition of Basic Melodic Structures (1990) and the Analysis and Cognition of Melodic Complexity (1992), which provide exhaustive description of one specific melodic process (the "realization" of melodic "implication"), show the influences (via Leonard Meyer) of Gestalt psychology, artificial intelligence, and cognitive psychology.

The work of Mario Baroni and his colleagues on the establishment of rule-based grammars (1978, 1983, 1984, 1990, 1992), which illustrates a long-sustained attempt to bring systematics to the study of melody, is indebted to computational linguistics and the Chomskian idea of the existence of universal principles of grammar.

Theories of grammar, in combination with techniques of artificial intelligence, also play a role in the artificial-composition experiments of David Cope (1991A, 1991B, 1992A, 1992B, 1996). Cope's composition software identifies patterns unique to individual composers and repertories and stores them in a lexicon. It selects them at random but in conformance with a grammatical classification scheme, which governs the generation of new compositions in the style of the designated composer and genre.

The exploration of the fundamental role that implicit knowledge plays in many human activities was stimulated by unforeseen difficulties that arose in artificial intelligence. Machines could not be programmed to simulate human perception beyond the limits of current understanding. The line between perceptual studies, in which the focus is on human performance or understanding (i.e., on "subjects"), and music-theoretical studies, in which the
focus is on the music itself (i.e., "objects"), is often more delicate than we might suppose. In order to recognize antecedent and consequent phrases, for example, we must be (1) cognitively familiar with the concept, (2) perceptually astute in listening, and (3) neurologically able to match precept and example. Yet we cannot identify these components of a melody in music that does not contain them. Similarly, in many other tasks related to musical analysis, both subjective and objective conditions must be satisfied.

Cognitive studies may lead us towards philosophical questions that seem quite distant from practical applications. In studying melody, we must acknowledge, as Koch suggested, the role of fluctuating aesthetic values. The beauty (or "genius") of some musical effects is borne by clarity, of others by subtlety. "Taste" may be an instance of perception conditioned by cultural expectation. As musicians, we will always value the distinction between clarity and subtlety. As researchers using computers, we are challenged to provide the same degree of reliability in pursuing the one as the other. As the following considerations reveal, this is a goal which may always be somewhat elusive.

1.1.3 Content Variables: Prototypical, Disguised, and Implied Melodies

The vacuum of theoretical work underpinning studies of melody obliges us to agree on some basic terms and concepts. These terms and concepts fall into four areas: actual and ideal melodies, locations of melodies within the work, elements of melodies, and contextual aspects of melody.

Between an actual melody and the encoding of it there is often an intermediate conceptual entity, the prototypical melody. The prototype is a kind of generalization to which elements of information represented in the actual melody, such as rhythm, may seem irrelevant, or indeed to which unwritten information, such as stress, may seem relevant. It is this prototype, rather than the actual music, that has the greatest influence on the way in which melody is remembered, and later sought.

Although, if the music is simple enough, there may be no difference between an actual melody and its prototype, much of the literature of music theory and of music perception and cognition over the past two decades has been absorbed in questions of ambiguity: rhythmic, harmonic, and melodic. Composers have often gone to considerable lengths to hide, or to pretend to
hide, the melody (to bury the treasure, as it were, so that we can enjoy the hunt). For the purposes of computer searching, works in which the melody is isolated in a consistent way are far easier to handle than those in which it is instead interwoven with other material.

Some commonly encountered kinds of disguised melodies are the following:

(1) Compound melodies, in which there is really only one melody but its principal notes, many fewer in number than the surface level of activity suggests, are not automatically distinguishable from the passage in which they are embedded. These occur particularly in unaccompanied string music [Figure 2a].

Figure 2a. A compound melody: Bach's Chaconne for unaccompanied violin. Two melodies are collapsed into one line.

(2) Self-accompanying melodies, in which some pitches pertain both to the thematic idea and to the harmonic (or rhythmic) support [Figure 2b].

Figure 2b. A self-accompanying melody: Schubert's Impromptu, Op. 142, No. 3, Variation 1. Some pitches jointly belong to the melody and the accompaniment.

(3) Submerged melodies consigned to inner voices, while decorative outer voices leave different residues in the listener's mind. These are likely to occur in keyboard music [Figure 2c].

Figure 2c. A submerged melody: Brahms's Ballade Op. 10, No. 4. Brahms's instruction calls attention to the wish that the theme, found in an intermediate voice, should be played "with the most intimate sentiment but without too much marking of the melody." The essential content is given in the top staff as a "rhetorical reduction" by Leonard Ratner.

(4) Roving melodies, in which the theme migrates from part to part [Figure 2d].

Figure 2d. A roving melody: Haydn's keyboard variations "Gott erhalte." The melody (System 1) passes between the bass and tenor voices in Variation 2 (System 2) and between the alto and soprano in Variation 3 (System 3).

The "track" problem is particularly acute in Variation 3 (third system), where the parts might more accurately be considered to be two altos and two sopranos or bass, tenor, and two sopranos. Throughout the first half of the movement, the two "sopranos" have a syncopated canon at the unison.

(5) Distributed melodies, in which the defining notes are divided between parts and the prototype cannot be isolated in a single part. These occur mainly in orchestral repertory [Figure 2e].

Figure 2e. A distributed melody: The opening of the Adagio lamentoso movement of Tchaikovsky's Sixth Symphony. The notes defining the principal theme are introduced alternately between the first and second violins. A composite representation is given on the third staff.

The problem with disguised melodies in computer applications is that either the encoding or the query must be structured in such a way as to make the simple strand retrievable. In keyboard music, this requires the differentiation of the material into multiple tracks or threads. Notes with two stems (2b, 2c) must appear in both tracks. To find roving melodies (2d), a query must be able to thread its way through different parts and to traverse the same material over and over. Even then, a program would only be able to compile a complete melody by having a carefully constructed model to attempt to match.

The only hope for ferreting out the prototypical melodies in compound and distributed melodies (2a, 2e) would be to separate the pitches by register into two tracks (2a) or, conversely, to collapse the tracks into one (2e) and then extract the notes that are highest in each time-slice. Unless one has reason to suspect that such phenomena occur, one would be unlikely to provide the extra apparatus necessary to find such themes. Some similarly deceptive examples demonstrating other strategies for elusiveness are given by Crawford et al. in Chapter 3.
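
The two strategies just mentioned for compound and distributed melodies can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from any of the tools discussed in this volume: notes are assumed to be (onset, MIDI pitch) pairs, a "time-slice" is taken to mean a shared onset value, and the function names, the fixed register boundary, and the sample pitches are all invented for the example.

```python
# Hypothetical data layout: each note is an (onset, midi_pitch) pair.

def skyline(tracks):
    """Collapse several tracks into one and keep the highest pitch at each onset."""
    highest = {}
    for track in tracks:
        for onset, pitch in track:
            if onset not in highest or pitch > highest[onset]:
                highest[onset] = pitch
    return [(onset, highest[onset]) for onset in sorted(highest)]


def split_by_register(notes, boundary):
    """Separate a single line into upper and lower tracks at a fixed pitch boundary."""
    upper = [(onset, pitch) for onset, pitch in notes if pitch >= boundary]
    lower = [(onset, pitch) for onset, pitch in notes if pitch < boundary]
    return upper, lower


# Invented pitches in which the top line alternates between two parts.
first_violins = [(0, 78), (1, 69), (2, 74), (3, 66)]
second_violins = [(0, 71), (1, 76), (2, 67), (3, 73)]
composite = skyline([first_violins, second_violins])

# An invented single line that leaps between two registers.
compound_line = [(0, 50), (1, 72), (2, 48), (3, 74)]
upper_voice, lower_voice = split_by_register(compound_line, boundary=60)
```

A fixed boundary is the crudest possible way to separate registers; in real compound melodies the two strands may cross or drift, so a practical tool would probably need a more adaptive criterion.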
The most prevalent kind of lapse between precept and example may occur in Mozartean melodies where every statement of the same "thematic" material is presented slightly differently, where no single statement contains a completely unornamented "model," and yet where there may be general consensus that a common melody is implied. Agreement about the exact components of this implied, but undisclosed, model (or prototype) may be difficult to achieve. Consider, by way of a tangible example, the various iterations of the "theme" of the Andante of Mozart's Piano Sonata K. 311 (Figure 3).
Figures 3a-e. Five iterations of the "theme" of Mozart's Piano Sonata K. 311.
None of these permutations gives a clear statement of the prototypical melody that is implied, which is arguably what is shown in Figure 3f.
Figure 3f. A prototypical melody related to the iterations shown in Figures 3a-e.
Some algorithmic approaches to the resolution of this problem will be discussed in Section 5.
1.1.4 Position Variables: Incipits and Themes
Bibliographical (or finding) tools tend to concentrate not on complete melodies but on samples of them. These samples may be taken either from the start of a movement or work, in which case they are called incipits, or from a random portion of the work that enjoys the greatest melodic importance. In the latter case the melodic material is called a theme. Most experience with computer-assisted studies of melody is based on the use of incipits or themes. Incipits form the basis of most so-called "thematic" catalogues. Some finding tools (e.g., RISM's index of music manuscripts, Lincoln's madrigal and motet indexes, LaRue's symphony catalogue) also concentrate on incipits, while others (e.g., the Barlow and Morgenstern [1948] and Parsons [1975] dictionaries) concentrate on "themes." Incipits serve well for early and folksong repertories, in which "themes" may be coincident with incipits. Within the domain of incipit representation and searching there are significant traps. In polyphonic repertories the characteristics of incipits vary markedly between voice parts. The most important melodic information may be in the tenor in late medieval repertories; it is likely to be in the highest vocal or instrumental part in homophonic repertories of later centuries. One is more likely to find rough equivalence of melodic importance in the various part-incipits of imitative works than in homophonic ones, although in works involving elaborate counterpoint, double subjects and other complications may be present. The development of classical instrumental music in the eighteenth and nineteenth centuries brought with it an ever greater tendency toward a concentration of interest on themes and a disintegration of the notion of continuous melody. Some passages are melodically more significant than others. The value of the melodic information in an incipit gradually decreases over this period.
Opening bars less and less frequently contain important thematic information, although they might, as in the case of Wagner, contain potent harmonic indicators of what was later to evolve. "Theme" catalogues are usually concerned entirely with the melodic material that is most memorable and most essential to the description of larger musical structures (symphonies, concertos, etc.). Programs that can automatically select "themes" from within long streams of melodic information are unlikely to appear soon, because rubrics for selection are poorly defined. At present, "themes" that are not coincident with incipits must be preselected and hand-fed or tagged in existing data. However, the data encoded for bibliographical projects (whether books of
themes or books of incipits) provide a valuable laboratory for evaluating the relative success of different approaches to representation and sorting strategies, with or without algorithms for thematic identification. Increasingly [see the articles by Howard, Bainbridge, and Kornstädt in this issue], these underlying databanks are being made accessible via the World-Wide Web. Thus, issues that until recently may have seemed remote from common user experience suddenly assume importance. The hard scrutiny that comes with frequent use can be expected to follow as a natural consequence of improved access.
1.2 Searchable Representations of Pitch
The level and nature of the detail captured in a musical encoding exert considerable influence on the kinds of searches that can be undertaken. The directly representable components of melody are pitch and duration. Derivable components include intervallic motion and accent. Non-derivable components include articulation and dynamics indications. More general aspects of the music that may be of relevance to analytical tasks include the number of voices and the work's instrumentation, genre, texture, and other general features. Two pertinent questions about the encoding of such data are these: (1) What is the minimum set of parameters required to define a melody? (2) What is the minimum level of specificity required of any given parameter to support multiple uses of the encoded material? Most algorithms for melodic searching currently in use treat pitch in a relatively general way and ignore rhythmic information entirely. The equivalent in text searching might be to encode only consonants and to ignore punctuation: the results are often too inclusive and, when reference is made to the original compositions, are seen not to represent some of the most distinctive characteristics of the music. To explain why such results may occur, we first consider levels of pitch representation and then review methods for comparing pitch profiles.
1.2.1 Levels of Pitch Representation
The octave may be subdivided in many different ways. Leaving aside discussions of tuning systems, which can engage hundreds of ways for subdividing the octave, and concentrating on methods most closely associated with common experience, the most prevalent representations of pitch are the base-7, or diatonic representation (Figure 4a),
Figure 4a. Numerical values in a base-7 system of pitch representation.
which allocates one slot for each white key of the piano and/or each name-class (A..G) of common nomenclature, and the base-12 representation (Figure 4b),
Figure 4b. Numerical values in a base-12 system of pitch representation.
which allocates one slot for each black and white key and/or each equal-tempered pitch-class. The base-7 system thus is correlated with physical entities and notational concepts, while the base-12 system is correlated with physical entities and theoretical concepts. Base-7 systems have been used in many finding tools, and base-12 systems are ubiquitous in both pitch-set theory and in MIDI applications. Both systems have limitations which can be crippling for particular kinds of melodic applications, and for this reason, many more articulate systems of pitch representation have been devised and implemented in computer applications. Sometimes these are used only as metacodes for internal processing to improve the accuracy of processing. They are not necessarily apparent to the user. Two that are representative are the base-21 system (Figure 4c),
Figure 4c. Numerical values in a base-21 system of pitch representation.
and the base-40 system (Figure 4d):
Figure 4d. Numerical values in a base-40 system of pitch representation.
The base-21 system provides a sufficient number of numerals to differentiate each enharmonic tone within the range of single sharps and flats. While it does allocate a slot for all the name- and inflection-classes of common music notation, its scheme of representation does not directly correspond to physical entities or theoretical concepts. Hewlett (1992) discovered in the mid-Eighties that two problems could be solved at once by allocating a slot to each of the "phantom" spaces that happen to correspond to black notes on the piano. This resulted in the base-40 system, which is one of a class of "solutions" to increasingly more abstract representations of tonal pitch-space. The two problems solved were interval invariance and discrete accommodation of enharmonic pitches through all double sharps and flats. There are larger numbers that also secure interval invariance and can accommodate greater degrees of inflection (e.g., triple sharps, etc.), but 40 is the lowest common denominator for tonal music through the nineteenth century. Interval invariance guarantees that there is always consistency between an interval and its numerical mapping. All of the systems mentioned have some degree of invertibility, but none guarantees complementarity for all conceivable interval combinations. All systems accommodate perfect-interval combinations, provided that the tones involved regularly occur in the scale and are not altered. The base-7 system cannot discriminate between major and minor intervals; it can therefore accommodate thirds but not major or minor
thirds. Thus it cannot be used to determine whether a chord is major or minor, since it cannot evaluate the quality of the component thirds. It would not be useful for examples in a melodic minor scale (with different ascending and descending versions), since it is unable to represent inflections. The base-12 system supports complementarity in terms of measuring the number of semitones correctly, but it operates outside the conventions of written tonality. For example, it would assign a difference of 4 (semitones) to both the diminished fourth C♯-F and the major third D♭-F. This is an asset in evaluations of atonal music. The base-21 system appears to be better suited to discrimination of written tones, since C♯ and D♭ have separate numerals. When these numerals are combined to compute intervallic sizes, however, the results are not always consistent. The minor third C-E♭ has a score of 5 (6-1), while the minor third E♭-G♭ has a score of 6 (12-6). The base-40 system produces consistent results for such measurements. The minor third C-E♭ (14-3) has a measure of 11; the minor third D♭♭-F♭♭ (18-7) has a measure of 11. The minor third G♯-B (38-27) has a measure of 11. Similarly, all major sixths, irrespective of their notation, have a score of 29. The results of computation with different bases become more unstable when the intervals involved include chromatic tones. Thus in the base-12 system the augmented fourth (e.g., C-F♯) has a score of 6, but so does the diminished fifth (C-G♭; F♯-C). In the base-21 system, one augmented fourth, C-F♯ (11-1), has a score of 10, while another, F-B (19-10), has a score of 9. The diminished seventh C♯-B♭ (18-2) has a score of 16, while the diminished seventh D♯-C (22-5) has a score of 17. In the base-40 system, all augmented fourths (e.g., C-F♯ or 21-3) have a score of 18, while all diminished fifths (e.g., F♯-C or 43-21) have a score of 22.
Similarly, all diminished sevenths (e.g., C♯-B♭ or 37-4) have a score of 33, and all augmented seconds (e.g., C-D♯ or 10-3) have a score of 7. Thus the complementarity used in ordinary music theory exercises is preserved, with such intervals in combination always producing a sum of 40. This system for mapping enharmonic tones has been used with success in diverse applications in teaching (e.g., MacGamut) and analysis (e.g., MuseData analysis routines by Walter B. Hewlett, Essen software conversions by Lincoln Myers and Humdrum tools by Craig Sapp). The base-40 scheme works well as an intermediate representation, invisible to the user, provided that the original data provide the level of detail required to make it operative.
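The interval arithmetic described above can be checked with a short sketch. The slot assignments below are an assumption inferred from the worked numbers in the text (for example C = 3 and a sharpened F = 21 in base-40, C = 1 and a sharpened F = 11 in base-21); the names `NATURAL`, `ACCIDENTAL`, `slot`, and `interval` are hypothetical, and note names are spelled with ASCII `#` and `b`.

```python
# Slots of the natural notes in each system, inferred from the examples in the text.
NATURAL = {
    21: {"C": 1, "D": 4, "E": 7, "F": 10, "G": 13, "A": 16, "B": 19},
    40: {"C": 3, "D": 9, "E": 15, "F": 20, "G": 26, "A": 32, "B": 38},
}
ACCIDENTAL = {"bb": -2, "b": -1, "": 0, "#": 1, "##": 2}


def slot(name, base):
    """Return the numeral for a spelled pitch such as 'C', 'Eb' or 'F#'."""
    letter, accidental = name[0], name[1:]
    offset = ACCIDENTAL[accidental]
    if base == 21 and abs(offset) > 1:
        raise ValueError("base-21 has no slot for double accidentals")
    return NATURAL[base][letter] + offset


def interval(lower, upper, base):
    """Size of the ascending interval (less than an octave) from lower to upper."""
    return (slot(upper, base) - slot(lower, base)) % base


minor_thirds = [("C", "Eb"), ("Eb", "Gb"), ("G#", "B")]
for base in (21, 40):
    # Base-21 gives differing sizes for the same interval; base-40 does not.
    print(base, [interval(lo, hi, base) for lo, hi in minor_thirds])

# Complementary intervals sum to the base in the base-40 system.
print(interval("C", "F#", 40) + interval("F#", "C", 40))
```

The modulo operation stands in for the octave shift that the text writes out explicitly (43-21 for the diminished fifth from the sharpened F up to C).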
1.2.2 Pitch-based Comparisons
Methods of pitch representation merely set the stage for methods of pitch comparison. It is obvious that one cannot match models at levels of detail that have not been represented in the data being searched. For practical reasons, it is often the data that have been collected most copiously that are represented with the least detail. Four common approaches to "melodic" comparison rely solely on pitch data. These respectively compare (1) profiles of pitch direction, (2) pitch contours, (3) pitch-event strings, or (4) intervallic contours. Rigorous questions about levels of detail in pitch representation seem almost irrelevant to the first two but play a central role in the last two. One additional approach is to combine different levels of pitch representation (e.g., intervals and contour); it is little tested in melodic-search applications. The most general method (1) for the comparison of two or more melodies in current use evaluates sequences of pitch-direction codes using the parameters up (U), down (D), and repeat (R). It was a favorite device of finding tools created in the middle decades of the twentieth century. Using a simple base-7 system of pitch representation, an "up-down" incipit might "match" such strings as 132, 142, 143, 154, 153, 152... and also 243, 354, 465, 576... and even 451, 581, 786, and so forth. The general procedure is illustrated in McAll's Melodic Index to the Works of Johann Sebastian Bach (1962), a finding tool organized in three parts. The first gives a general classification of directional relationships between the first four notes of each item. This is followed by a list of chromatically notated instantiations (assuming transposition to C Major or A Minor) of each contour. These are grouped according to the scale degree on which they begin.
Thus for the UUU contour, there are 44 instantiations beginning on the first degree (I), two on the second (II), 11 on the third (III), one on the fourth (IV), 25 on the fifth (V), and two on the seventh (VII). (No examples begin on the sixth degree.) Extracts are shown in Figure 5a.
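The up/down/repeat coding described above is easy to sketch, and doing so shows how many distinct pitch strings collapse onto a single profile. The function name `direction_profile` is hypothetical, and the digit strings are the scale-degree examples quoted in the text, read one digit per note.

```python
def direction_profile(pitches):
    """Encode each successive pair of pitches as U (up), D (down) or R (repeat)."""
    codes = []
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            codes.append("U")
        elif current < previous:
            codes.append("D")
        else:
            codes.append("R")
    return "".join(codes)


# Some of the strings cited in the text as "matches" for an up-down incipit.
incipits = ["132", "143", "243", "576", "451", "786"]
profiles = {s: direction_profile([int(ch) for ch in s]) for s in incipits}
print(profiles)
```

Every one of these strings reduces to the same two-letter code, which is why a finding tool built on profiles alone needs fuller notation, as in McAll's second section, to separate the candidates.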
Figure 5a. Selected instantiations of the UUU sequence in McAll's Melodic Index to the Works of J. S. Bach. Roman numerals represent scale degrees.
In McAll's work, the profile serves merely as a mnemonic device: the actual fully notated incipits for all "matched" sources are given in the second section of the book. The five matches of the first instantiation of the UUU contour anchored to the first degree are shown in Figure 5b.
Figure 5b. The five "matches" of the first instantiation (I:1 in Figure 5a) of the UUU contour, represented by individual incipits.
When directional profiles are used without such collateral magnification of the underlying detail, the results may be of diminished value. For example, in Pont's article on "Geography and Human Song" (1990), citing a universal tendency of incipits beginning UU to predominate, only the two melodic intervals created by the first three notes are considered in the output, whereas in the underlying data, assembled by Parsons for his Dictionary of Tunes (1975), which covers roughly 10,000 classical themes and 4,000 popular tunes, longer streams of intervals were collected. The more general the system of representation, the longer the string will need to be to produce meaningful discriminations. Conversely, the richer the data, the shorter the string can be. In this case the combination of highly general data with very short strings leads to a superficial view, but one that is provocative when regarded as a model in need of improvement. Pont added to data from Parsons' working tapes corresponding information for five monophonic repertories: chants from ancient Greece, chants from the Liber Usualis, Gaelic melodies, Peyote Indian songs, and Aboriginal songs. The three kinds of directional motion tracked produce nine logical combinations. The results of their comparison were reported by the percentage of occurrence of each of the nine intervallic-direction patterns. The results are simplified in the rank-order lists given in Figure 6.
Figure 6. Comparative rankings on directional profiles in six repertories. 9 = the greatest number of occurrences.
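A tally of the nine two-interval direction patterns might be computed along the following lines. This is a sketch only: the helper names are hypothetical, and `toy_repertory` is invented data, not drawn from the Pont or Parsons materials.

```python
from collections import Counter
from itertools import product


def opening_pattern(pitches):
    """Direction codes (U, D, R) for the two intervals formed by the first three notes."""
    first_three = pitches[:3]
    return "".join(
        "U" if b > a else "D" if b < a else "R"
        for a, b in zip(first_three, first_three[1:])
    )


# Three directions over two intervals give the nine logical combinations.
PATTERNS = ["".join(p) for p in product("UDR", repeat=2)]


def pattern_percentages(melodies):
    """Percentage of melodies opening with each of the nine patterns."""
    counts = Counter(opening_pattern(m) for m in melodies if len(m) >= 3)
    total = sum(counts.values())
    if total == 0:
        return {p: 0.0 for p in PATTERNS}
    return {p: 100.0 * counts[p] / total for p in PATTERNS}


# Invented toy data on a numerical pitch scale.
toy_repertory = [[1, 2, 3], [1, 3, 2], [5, 5, 6], [3, 2, 1], [1, 2, 4]]
percentages = pattern_percentages(toy_repertory)
print(percentages)
```

With so few categories, the percentages say nothing about how far the third note lies from the first, which is the limitation taken up in the discussion of directional profiles.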
These results raise issues that must frequently be considered in the comparison of more fully represented melodies. For comparison, the same tests have been run on other data near to hand. In Figure 7 we see comparisons with four collections in the Essen database: Lieder from the sixteenth century, songs from Central Europe, songs from Southeastern Europe, and children's songs. These form four different profiles, none of them consistent with any of the profiles in Figure 6.
Figure 7. Directional profiles for four collections of European song.
Three problems with directional profiles become apparent. First, there is no way to determine from such a limited set of information the relationship of the third note to the first. Second, as we saw from the first level of McAll's work, each directional category conceals a great range of contours and notated pitch-strings. Third, Pont's quantities of data were small compared with those used by Parsons and those available in finding tools and source databases, such as the Essen collection and the MuseData corpora encoded and archived at the Center for Computer Assisted Research in the Humanities. To more fully explore the proposition that there are universal preferences for direction, we assembled comparable information from such sets and determined that the evidence for universal preferences is weak (Figure 8). Among eighteenth-century repertories, for example, the profiles for two Italian composers (B. and A. Marcello) show poor conformance to the general profile of Italian composers in a sample of RISM data. A greater degree of similarity exists between early cantatas (Nos. 1-20) by Bach and Telemann. Handel's Messiah does not exhibit strong preferences for initial direction at all.
Figure 8. Directional profiles for various encoded repertories of Baroque vocal music.
In Figure 9, we present similar profiles for eighteenth-century instrumental music, with data for twentieth-century bebop (compiled by Williams 1985) included for further comparison. The contrast between the even-handedness of Beethoven and the clear preferences of bebop is possibly more striking than the indicated consistency within the Bach and Telemann repertories when both instrumental and vocal music are considered (see again Figure 8). We would expect differences between vocal and instrumental music, but these differences do not seem to be as significant as those between particular composers and even, in results not shown, between the early and late styles of individual composers. The problems associated with unknown relationships between noncontiguous intervals in directional profiles (which pertain to all the material shown in Figures 6-9) have been well recognized in ethnomusicology. Although Seeger (1960) claimed that an up/down/repeat typology was adequate for distinguishing contours, Adams (1976) refined this approach to distinguish three primary features, which were (1) slope (degree of), (2) deviation (from current direction), and (3) reciprocal (the relationship of the current pitch to the initial pitch),
Figure 9. Directional profiles for various encoded repertories of eighteenth- and twentieth-century instrumental music.
from three possible secondary features, (1) repetition (of the pitch), (2) recurrence (of the same pitch after intervening pitches), and (3) accommodation of "conjunct" and "disjunct" segments of melody. Adams noted that there was wide latitude in the use of the term "contour" in ethnomusicological studies from Densmore forward. Pitch contours (2) (frequently called "melodic" contours) give more definite information than directional profiles while retaining the kind of generality that may be required for studies based on performance or general features of melodic content. Sonographic data from audio input may be used to show undulating lines that illustrate the shapes of melodies (see, for example, Lubej 1995-96). Another type of contour is that which considers the overall relationship of average pitches within a succession of phrases, in short a phrase contour. Relationships between successive starting or ending points are more clearly conveyed than in a directional profile, since such a procedure requires a numerical representation of pitch at the outset. In his study of melodic arches, for example, Huron (1995-96) noted the tendency for folksongs with six phrases to exhibit the "McDonald's effect": a concave arch nested in the convex arch formed by the outer phrases (Figure 10).
Figure 10. Phrase contour: the "McDonald's" effect in six-phrase melodies (after Huron's analysis of the Essen database).
Like directional profiles, graphic contours are useful for demonstrating highly generalized results. However, they lack the specificity to support direct comparison of works. Greater levels of detail in pitch representation play a defining role in two more explicit approaches to melodic comparison: (1) sequential profiles of pitch strings (diatonic, chromatic, or enharmonic) and (2) sequential profiles of melodic intervals. Pitch-event strings (3) may employ the base-7, -12, -21, -40 or any other workable system of pitch representation. In addition, they may or may not indicate the register (or octave) in which each pitch occurs. Indications may be relative to the most prevalent octave (as in EsAC code or Braille musical notation) or absolute (as in MIDI key numbers). In moveable-register representations, an arbitrary range of limited extent is set as a default. Pitch may be defined in relation to the vertical position of written notes on a clef. Some early computer systems of representation, such as IML-MIR and DARMS, encoded pitch according to graphic placement. In Princeton's IML-MIR system, pitches were identified by letter name, but no absolute octave or register information was provided. DARMS (extensively
described in Selfridge-Field 1997) assigns the numbers 1..7 to the same seven-note span addressed by letter-names in IML-MIR, and like IML-MIR, DARMS is clef-dependent. If one takes account of both clef and pitch name, IML-MIR may be said to be absolute in its transcription of pitches. DARMS, in contrast, is always relative, for "pitches" do not have names; they only have "vertical heights." DARMS is relative over a very long number line, however, for negative and positive numbers of arbitrary extent can be tolerated. Moveable-register systems were actually in use in the last century, particularly among those following Curwen's "tonic sol-fa" system (1875) for singing. Later employed both as a simplified way to distribute popular songs and as the foundation for a method to collate hymn tunes (see, for example, Love 1891), Curwen's method explicitly represented scale degree, relative octave, duration, barlines, slurs, grace notes, and other features of music. See Figure 11. The most successful equivalent of the "tonic sol-fa" approach is found in the diatonically encoded material of the Essen Database of folksongs originated by Helmut Schaffrath (1992A, 1992B, 1993; also Selfridge-Field 1997). The associated EsAC code (described in Selfridge-Field 1997) makes a relative provision for octave encoding: each work has a principal octave and can refer, through the use of plus and minus signs, to the immediately adjacent octaves.
Figure 11. A moveable-register ("tonic sol-fa") representation of the "New St. Ann" tune (from Love, p. 258). Moveable do (d) here = B in the octave above Middle C.
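The relative octave provision just described can be illustrated by mapping scale-degree tokens onto a continuous base-7 number line. The token format used here (an optional plus or minus sign followed by a digit from 1 to 7) is a simplified assumption made for illustration; it is not a parser for actual EsAC data, which also carries durations, accidentals, and rests. The name `diatonic_position` is hypothetical.

```python
def diatonic_position(token, principal_octave=0):
    """Map a token such as '5', '+1' or '-7' to a position on a base-7 number line.

    A leading '+' or '-' refers to the octave immediately above or below the
    principal octave; the digit is the scale degree (1..7).
    """
    shift = {"+": 1, "-": -1}.get(token[0], 0)
    degree = int(token[-1])
    if not 1 <= degree <= 7:
        raise ValueError(f"not a scale degree: {token!r}")
    return 7 * (principal_octave + shift) + (degree - 1)


# An invented five-note fragment that crosses into both adjacent octaves.
melody = ["5", "+1", "7", "+2", "-5"]
positions = [diatonic_position(t) for t in melody]
print(positions)
```

Once the register marks are resolved in this way, the contour of the line can be read directly from the differences between successive positions.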
Does reliance on registral switches handicap melodic searches? Missing registral cues in the Princeton data would have created mistaken impressions of melodic contour had the data been used extensively for analysis. DARMS data has been more extensively used for analytical tasks involving melody (e.g., Lincoln 1988, 1993) of various kinds. That its system of representation is not biased towards any particular key or mode is a blessing for atonal applications (DARMS has found its most sustained following among set-theorists) and sometimes a curse (data verification is difficult). The lack of explicit registral information may incline DARMS towards slightly cumbersome meta-representations, such as Brinkman's binomial approach (1986A) to pitch representation. Combinatory schemes have the disadvantage (vis-à-vis the base-7, -12, -21, and -40 systems reviewed earlier) that two numbers (with different numerical bases) must be manipulated in analytical endeavors, as in nested English measures such as pounds/shillings/pence or feet/yards/miles computations. What is more problematic in melodic searches is the complete absence of either registral or directional information. For example, the seven-letter code (plus chromatic symbols) used by Barlow and Morgenstern (1948), in strings of six to ten pitch names, enables low-level sorting but, given short strings as queries, can produce some absurd "matches" (Figure 12a-b). Matches for the opening of Beethoven's Fifth Symphony, (GGGEb) as given by Barlow and Morgenstern, include the following:
Figure 12a. Handel's Organ Concerto Op. 7, No. 1.
Figure 12b. Dvořák's Slavonic Dance Op. 72, No. 6.
Similarly, the thematic-identifier volume of Jan LaRue's eighteenth-century symphony catalogue (1988) provides only schematic information
about pitch; for registral information we must await the publication of the music volume. The principal themes from the opening movement of Haydn's "Military" Symphony appear as shown in Figure 13. In the related incipits of Figure 13, a parameter for register or octave would enable us to know that the contours are similar.
14646G:DGDCBA//DEGDEDCBAH411HAYDN
Figure 13. LaRue's listing for the opening movement of Haydn's "Military" Symphony and the music represented.
Pitch-string comparisons are defensible in the study of monophonic medieval repertories, because in many cases there is no firm knowledge of durational values. Even here, though, another parameter may be essential to infer contours. In Andrew Hughes's database of Late Medieval Liturgical Offices (1994), for example, a mode indicator makes a base-7 pitch-code interpretable. Medievalists have been making enviable progress in studies of chant centonization by coordinating syllabic information from text underlay with pitch information (Figure 14; see Haas 1991 and 1992), or details about the physical appearance of neumes with pitch information (Binford-Walsh 1990, 1991, et al.).
Figure 14. A comparison of chant melodies, based on coordination of text syllables, produced by Max Haas's Chatull Gadol program.
The SCRIBE database of fourteenth-century music explicitly encodes neume types and pitches (Stinson 1992; Selfridge-Field, 1990: 25). Associated software supports color separation of mensural notation, which expresses durational relationships through the use of black, red, and "white" (unfilled) notes. Yet in studies of search strategies, SCRIBE's manager, John Stinson, has discovered that when notational information is ignored and pitch sequence alone is considered, concordances with chants of the tenth century are more likely to be found. This result demonstrates the apparently immutable truth of computer applications that attributes that are absolutely essential in one application may simply hinder the efficiency of another. While precise note names are often desired for analytical tasks in music history, theory, and bibliography, more generalized representations tend to be preferred in psychological studies. Intervallic profiles (4) provide this generalization. They may either be encoded as such or derived from note-specific data. Direct encoding of intervals inhibits data verification, however. Recent literature on studies of melodic perception is sometimes vague on the subject of data representation. Rosner and Meyer (1986) used "recordings." Vos and Troost (1989) say their material was played on the organ. Edworthy (1985) and Dowling (1986) used pitch strings (in the latter case a base-12 one), while letter names were used in the reporting phase of Monaghan and Carterette (1985) and Bartlett and Dowling (1988). In one study posing a particularly abstract question (How able are listeners to recognize similar contours with non-similar intervallic sequences?), results are reported in terms of pure-tone frequencies (Carterette, Kohl, and Pitt 1986). Would a different approach yield a different psychological profile, or a more uniform presentation of data lead to more consistent results?
Studies in this field remain too rare to predict the answer to this question. In general psychologists focus intense scrutiny on small quantities of musical data, while music scholars may be inclined to survey larger quantities of data more loosely. Halperin, after directly encoding intervals in a study of troubadour music (1978), developed a simple but foresighted procedure for intervallic comparison in a later study of Ambrosian chant (1986: 32). He first encoded the music on a continuous number-line of semitones representing the gamut (G-c"). The F below gamma-ut (the first G of the index) was encoded as a zero, the G as 2, the A above it as 4, and so forth. Then he converted his data to an arbitrary intervallic code suited to sorting: a prime was 0, a minor second was 1, a major second 2, and so forth. Because his original encoding
used a number-line, he was able to specify whether each of these intervals was rising or falling. Thus his searches (which ultimately showed different kinds of modal use in diverse elements of the liturgy) preserved directional and intervallic information while his meta-data stayed within the range of signed single-digit integers. 1.2.3 Confounds : Rests, Repeated Notes, and Grace Notes Three elements of notation can confound both contour and intervallic-profile comparisons. These are rests, repeated notes, and grace notes. Researchers focussed on contours often argue that all three disrupt the "flow" of the line. Contour, after all, conceptually requires a continuous line, but this line is a representation of the melody. In the music itself the disruption may be an intended means of punctuation (rests), accentuation (grace notes), or energy-building (repeated notes). When pitch is encoded without any reference to duration, rests will normally be absent because effectively they are durations without pitch content. Repeated notes are more problematical, because the numbers in which they can occur are so variable and because without durational information their importance cannot be evaluated. Thus they are treated in highly diverse ways. LaRue gives 64 as the greatest number of times a single note was repeated consecutively in the symphonic material he listed in the thematic-identifier volume of his catalogue of eighteenth-century symphonies (1988). If we were working with short incipits, the inclusion of such repeats would squeeze out the information that we really wantthat which is closest to the prototypical melody. Bryant and Chapman, in creating a melodic index to Haydn's works (1982), give the first four repetitions of a note as separate ciphers but use summary numerical indicators for five and more repetitions (e.g., C6DEF ). 
Repeated notes are ignored in Lincoln's meta-data (1988; 1993), which is sorted by intervallic size and direction based on the first nine discrete pitches. Durations, although undoubtedly included in the original DARMS code and although obviously necessary for printing, were excluded from the indexes in his two important finding aids (each based on several tens of thousands of incipits). When the intervallic index of the madrigal volumes was used to seek matches to unattributed works, the absence of repeated notes proved to have an insidious effect (Selfridge-Field, 1990).
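The two interval-based indexing strategies just described can be compared in a short Python sketch. It assumes pitches stored on a semitone number-line of the kind Halperin used (F = 0, G = 2, A = 4) and reads "discrete pitches" as the pitches that remain once immediate repetitions are dropped. The function names and the sample melody are hypothetical.

```python
def signed_intervals(semitones):
    """Signed semitone intervals between successive pitches (0 = prime)."""
    return [b - a for a, b in zip(semitones, semitones[1:])]


def drop_repeats(semitones):
    """Remove immediate repetitions of a pitch."""
    out = []
    for p in semitones:
        if not out or out[-1] != p:
            out.append(p)
    return out


def interval_key(semitones, max_pitches=9):
    """Sort key built from the first `max_pitches` non-repeated pitches."""
    return signed_intervals(drop_repeats(semitones)[:max_pitches])


# Hypothetical melody: G G A B-flat A G F on the number-line F=0, G=2, A=4.
melody = [2, 2, 4, 5, 4, 2, 0]
print(signed_intervals(melody))  # [0, 2, 1, -1, -2, -2]
print(interval_key(melody))      # [2, 1, -1, -2, -2]
```

The second result no longer records that the opening note was struck twice, which is the kind of information loss the passage above describes.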
Grace notes are infinitely problematical, since their interpretation has varied over time while their notation has remained fixed. The grace notes in Figure 12b (above) might legitimately be considered extraneous to the "theme" but the one in Figure 13 is essential to the melody. Bryant and Chapman represent incipits containing grace notes twice, in order not to guess which way a user will search for it, whereas LaRue gives grace notes full weight. A listener remembering a theme is unlikely to know whether any of the notes might have been notated as grace notes, and thus would expect to find the pitch included in a finding tool. When grace notes were included in RISM experiments in searching [see Howard], fewer events were required to produce discrete sorts, suggesting that although we may consider them parenthetical in Figure 12b, grace notes are an essential part of the melody. There may be a historical confound here, however, since RISM concentrates on works composed between roughly 1600 and 1825 (better typified in Figure 13); Dvořák's Slavonic Dances, in which the grace notes contribute to accentuation more than to melodic substance, are from 1878.
1.3 Searchable Representations of Duration and Accent
That few of the studies thus far cited take any durational or accentual information into account results partly from practical considerations. Several databases that are now large were designed two or three decades ago. When computer memory was scarce, optimal design favored concise representations. Also, multi-dimensional searches are necessarily more complex than one-dimensional ones, so programming and debugging time are significantly greater. Yet a tendency to dismiss durational information as being "not strictly necessary" for melodic enquiries seems often to stem from the belief that melody can be defined entirely by pitch-events or by other information that can be extrapolated from pitch information.
Apparently we are more conscious of pitch change than of duration or accentual features of melodies in performance. Certainly pitch control requires more conscious effort by vocalists and instrumentalists than does the control of rhythm or stress. Dorothy Gross (1975) was among the first computer researchers to observe that a pitch-string is not the equivalent of a melody. Recent studies in musical perception suggest that durational values may outweigh pitch values in facilitating melodic recognition. No one would maintain that the three
themes shown in Figure 15, with pitch-content matching the (transposed) pitch-string EDCDE , as in the children's song "Mary Had a Little Lamb,"
are qualitatively the same.
Figures 15a-c. "Themes" from (a) Bruckner's Symphony No. 7, Movement 3; (b) Mozart's String Quartet in D, K. 575, Movement 3; and (c) Schubert's Overture to Rosamunde.
General theories of rhythm have attracted some interest in recent years, but given their long history (stretching back to antiquity), it may be a fair summary to say that rhythmic patterns have been far more stable in Western culture than have pitch patterns. Therefore there may be less need to develop theories of rhythm. No general theory of rhythm has been rigorously tested in analytical applications. Tangian's proposal for a binary classification of rhythmic patterns (1992) is indicative of approaches that may be testable. Craig Sapp (1998, unpublished research) has recently implemented some elements of Swain's theory of harmonic rhythm (1998), which is eminently well suited to computer implementation, as Humdrum tools.
One highly systematic approach to the concurrent representation of rhythmic and accentual patterns is the one appearing in Moritz Hauptmann's study of harmony and meter (1853; reprinted 1991). Note the binomial method for specifying rhythmic patterns in Figure 16.
Figure 16. Excerpt from Moritz Hauptmann's system [1853] for distinguishing rhythmic patterns in 6/8 meter.
For practical purposes, the provisions of such comprehensive encoding schemes as DARMS, SCORE, Kern (for the Humdrum Toolkit), MuseData, and Plaine and Easie Code (used by RISM) are fully functional. Among these, SCORE is noteworthy for sequestering all durational information in a separate string. Humdrum is notable for accepting encodings that may consist only of durational information; syntax checkers for other systems will ordinarily object to the lack of concurrent pitch information.
Accentuation is especially pertinent to the study of folk repertories, and in general ethnomusicologists seem to have devoted more thought to the matter of comparing stress patterns than have musicologists. In studies of American folk-tune repertories, for example, Bevil (1988, 1992A, 1992B) encoded indicators for stress together with those for pitch and duration (in what appear to be separate digits of a single integer). This facilitated searches for particular combinations of pitch and duration, pitch and stress, and duration and stress. From these measures multiple viewpoints on the same thematic material could be obtained. His software also overlaid images of closely related variants (Figure 17, showing Bevil 1988: 119).
Figure 17. J. Marshall Bevil's computer overlay of melodic variants of an Appalachian folksong.
There are several fairly simple ways of including brief information about duration, accentuation, or phrase structure in databases consisting principally of pitch information. Gustafson's thematic locator for the works of Lully (1989) uses a blank space to represent a barline in a base-7 representation of pitches. The incipits are separated by mode (major or minor), and each one includes a field for meter and key. The number of items within the bar gives some idea of the kinds of durations included. Some examples in the minor mode are shown in Figure 18.
11323771673g
1132176543252g
11321171234217225112e
Figure 18. Use of a blank space to separate measures in Gustafson's thematic locator for the works of Lully.
A similar use of barline divisions is employed in Temperley's Hymn Tune Index (see hti.music.uiuc.edu/introduction.htm). The Essen databases not only use a blank space to encode barlines but also give a visual representation of duration with underline characters and mark phrase endings with a hard return (Figures 19a-b). Here too a base-7 octave is used to represent pitch.
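A blank-delimited pitch string of this kind is easy to take apart by machine. The sketch below, in Python, splits an incipit into bars and counts the items in each bar, which is the rough durational hint mentioned above. The sample incipit and its bar divisions are invented for illustration and are not taken from Gustafson's locator. The function names are likewise hypothetical.

```python
def split_bars(incipit: str):
    """Split a base-7 digit string into bars at each blank space."""
    return [[int(ch) for ch in bar] for bar in incipit.split()]


def items_per_bar(incipit: str):
    """Number of pitch events in each bar (a crude indicator of note values)."""
    return [len(bar) for bar in split_bars(incipit)]


# Hypothetical incipit: an upbeat, then bars of three, four, and one event.
sample = "1 232 4321 5"
print(split_bars(sample))     # [[1], [2, 3, 2], [4, 3, 2, 1], [5]]
print(items_per_bar(sample))  # [1, 3, 4, 1]
```

In a fixed meter, a bar holding four events must on average use shorter values than a bar holding one, which is all that this representation can reveal about duration.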
Figure 19. (a) Beginning of an EsAC encoding of the German folksong "Der Mai tritt ein mit Freuden" and (b) the corresponding music.
The Essen analysis tools, designed by Barbara Jesser (1991) with contributions by other Schaffrath pupils, used data in this format to derive profiles of pitches on the first beat of each measure (i.e., the most heavily accented notes), of rhythmic features, and of song structure. Although in the initial output, the accented notes were extracted with their original values, in later processing they were regularized (Figure 20a) to facilitate the printing of a melodic spine (Figure 20b).
(a) 1_.-5_.-6_.-5_.-4_.-3_.3_.1_.
Figure 20. (a) Extracted melodic-spine information for the music shown in Figure 19 and (b) its musical realization.
Although, apart from the Essen databases, it is unusual to encode phrase information, its presence can be critical for certain kinds of "melodic" searches. In Brinkman's study of the reuse of chorale tunes in Bach's Orgelbüchlein (1986; thesis submitted in 1978), it was postulated that from any given sequence of pitches constituting a phrase in a chorale, a series of "patterns" of progressively fewer pitches could be culled. Great care was taken not to store patterns that crossed phrase boundaries in the look-up lexicon of pitch-strings, but the search algorithm (as judged from the output) seemed to be indifferent to phrase boundaries when producing "matches" (Selfridge-Field, 1993).
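The difference between a phrase-aware lexicon and a phrase-indifferent search can be made concrete with a small Python sketch. It is not a reconstruction of Brinkman's program: it assumes that "patterns of progressively fewer pitches" means contiguous sub-sequences of a phrase down to a minimum length of three. The pitch data, function names, and the `respect_phrases` switch are all invented for illustration.

```python
def windows(seq, min_len):
    """All contiguous sub-sequences of seq, longest first, down to min_len."""
    n = len(seq)
    for length in range(n, min_len - 1, -1):
        for start in range(n - length + 1):
            yield tuple(seq[start:start + length])


def build_lexicon(phrases, min_len=3):
    """Store patterns phrase by phrase, so none crosses a phrase boundary."""
    return {w for phrase in phrases for w in windows(phrase, min_len)}


def matches(target_phrases, lexicon, min_len=3, respect_phrases=True):
    """Look up patterns in a target melody, with or without phrase limits."""
    if respect_phrases:
        spans = target_phrases
    else:
        # Run all phrases together, ignoring the boundaries between them.
        spans = [[p for phrase in target_phrases for p in phrase]]
    return {w for span in spans for w in windows(span, min_len) if w in lexicon}


chorale = [[1, 2, 3, 4], [5, 4, 3]]   # hypothetical chorale phrases
setting = [[7, 1, 2], [3, 4, 6]]      # hypothetical target, two phrases
lexicon = build_lexicon(chorale)
print(matches(setting, lexicon, respect_phrases=True))   # set()
print(sorted(matches(setting, lexicon, respect_phrases=False)))
# [(1, 2, 3), (1, 2, 3, 4), (2, 3, 4)]
```

In this toy case every "match" reported by the boundary-indifferent search straddles the join between the two target phrases, which illustrates how the output of such a search can mislead.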
1.4 Strategies for Multi-Dimensional Data Comparison
We have already seen several prototypes for two-dimensional searches, but so far they have not concentrated on a parallel traversal of information representing pitch and duration. Many strategies for doing this have been tried, but none have been generally adopted.
1.4.1 The Kernel-Filling Model
Concurrent representations of pitch and duration data may pair diverse levels of generality and specificity in an asymmetric way. In studies aimed at discovering unwritten rules of melodic construction in detail sufficient to generate new melodies of the same kinds, Mario Baroni and his collaborators have concentrated on "melodic symmetries [which] take into account pitch contours [and] metrical patterns" (1990: 211). The repertories they have explored include German chorale tunes, eighteenth-century French chansons, and seventeenth-century Italian cantatas.
In Baroni's work, the melody is seen to evolve from a kernel that consists of the outer notes of a phrase. The intervening notes, which give the melody its contour and character, are seen to "fill" a space created by these poles. The apparent resemblance of this approach to Leonard Meyer's theory of gap-fill (1956) is superficial, since Meyer enlists elements of harmony and rhythm in his argument and gives much greater emphasis to contour.
By its nature, Baroni's approach tolerates approximation in the representation of pitch. The study of chanson melodies (1990) determined that "there are no cases of phrases having the same melodic [=pitch] contour but a different metrical structure." Would a more precise system of pitch representation (from which a larger roster of "contours" could be derived) produce different results? At present there is no evidence one way or the other. To create new melodies in the style of the model thus derived, Baroni first generates a rhythmic profile. Pitches are then poured into this profile.
Some resulting chorale melodies (from Baroni and Jacoboni 1978: 146f.) are shown in Figure 21. The "kernel" in 21a is a descending third, while that of 21b is an ascending fourth. (In this base-7 system any clef could display the results.)
Figure 21a-b. Two chorale melodies generated by Baroni and Jacoboni.
1.4.2 Accented-Note Models
In an effort to create meta-information commensurate with that of melodic intervals, the Russian mathematician Zarhipov (1965) proposed the use of profiles of stress change as one of four tiers of information available for melodic analysis (see Bakhmutova 1989). Zarhipov's four-tier scheme of melodic representation is illustrated in Figure 22. The tiers are: (1) ordinal numbers representing successive events, (2) melodic intervals, indicated diatonically at the intermediate positions, (3) ordinal numbers representing the positions within the bar at which these tones are struck, and (4) rhythmic transitions, coincident with (2), represented by a plus sign (+) when moving from a stronger to a weaker position and by a minus sign (-) in the opposite case.
Figure 22. Zarhipov's four-tier system for representing (a) event numbers, (b) intervallic change, (c) beats on which events occur, and (d) durational change.
Abstractions (2) and (4) can be collapsed into a single line in which the first operator following a numeral indicates melodic direction and the second indicates stress change, e.g., 1++1+-2-+2+-0++1--1-+1+-4-+. Zarhipov's system is noteworthy for its combination of explicit and implicit information, for its effort to coordinate several strings of data at the same time, and for its model of collapsing multiple variables into a single string for easier processing.
Another approach to the coordination of elements representing both the pitch and duration domains is offered by the music psychologist Mari Riess Jones (1993). She differentiates between two kinds of pitch accents, melodic-contour accents and melodic-interval accents, and also between two kinds of implications of duration, which correspond to the literary concepts (traceable to antiquity) of qualitative and quantitative accents. She calls these strong-beat accents and elongational accents. From these she contemplates the possibility of investigating joint-accent structure and temporal phrasing. Joint-accent structure codes give a composite view of the amount of activity collected from earlier profiles, giving one stroke each for a "filled" beat, a "melodic" (i.e., pitch) accent, and a temporal accent (Figure 23). Jones's collective structures are conceptually similar to Lerdahl and Jackendoff's representation of metrical structure (1983: Chapter 4) in "time-span reductions."
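The single-line form of Zarhipov's tiers (2) and (4) quoted above can be separated again mechanically. The following Python sketch assumes that each token is a numeral followed by exactly two signs, the first for melodic direction and the second for stress change, as the text describes. The function names are hypothetical, and the sketch makes no claim about how Zarhipov himself processed the data.

```python
import re

TOKEN = re.compile(r"(\d+)([+-])([+-])")


def parse_collapsed(line: str):
    """Split a collapsed string into (interval, direction, stress) triples."""
    tokens = TOKEN.findall(line)
    # Reject input that contains anything other than well-formed tokens.
    if "".join(n + d + s for n, d, s in tokens) != line:
        raise ValueError("malformed collapsed string")
    return [(int(n), d, s) for n, d, s in tokens]


def split_tiers(line: str):
    """Recover tier (2) as signed intervals and tier (4) as a sign string."""
    events = parse_collapsed(line)
    intervals = [-n if d == "-" else n for n, d, _ in events]
    stress = "".join(s for _, _, s in events)
    return intervals, stress


intervals, stress = split_tiers("1++1+-2-+2+-0++1--1-+1+-4-+")
print(intervals)  # [1, 1, -2, 2, 0, -1, -1, 1, -4]
print(stress)     # +-+-+-+-+
```

Because every token has a fixed shape, the collapsed string loses nothing relative to the two separate tiers, which is what makes it convenient for sorting and searching.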
Figure 23. Jones's concepts of melodic accent (derived from pitch contour), temporal accent (derived from beat structure), and joint-accent structure (cumulative).
Where the notion of a "pattern" includes information thus derived, she proposes to investigate both dynamic-pattern similarity and dynamic-pattern simplicity. Although Jones gives only a few examples and modest statistics, her notions may be of potential value in music-theoretic endeavors involving computer processing.
1.4.3 Coupling Procedures
Because of the possibility that two-dimensional searches can process both absolute (e.g., pitch) and relative (e.g., intervallic) information concurrently, there is some potential for a warping effect to occur in the combination. This is particularly the case if one of the data variables represents elapsed time, as it does in the work of Suk Won Yi on similarity. Because data describing intervals (relative information) is combined with data describing duration (absolute information) in a composite measure, it is unclear whether the assumed coordinates are vertically aligned in the most advantageous way.
The "warp" comes about in the following way. In a sounding melody, pitch lingers for the duration of the event. When the music is to be represented for processing, the pitch can be determined at the onset of the event, while the duration must be computed at its termination. A melodic interval cannot be computed until the next pitch is sounded, and so there is a continual zigzag in the data-collection path (see Figure 24).
1.4.4 Synthetic Data-Models
Another approach to coordinated comparisons of pitch and duration data is to synthesize elements from each domain in modules of "complete" melodic information. This reduces the number of symbols that must be manipulated, thereby simplifying (but also in some cases restricting) the search. Arthur Wenk designed such a system for his work on Debussy (1988): recurrent rhythmic patterns assumed a single code which was linked with a series of pitch names suited to the pattern.
Balaban has written extensively about synthetic schemes of representation and manipulation, but usually from the standpoint of investigating structural issues (for a list of her writings, see Balaban 1992).
Figure 24. A time-ordered schematic view of the collection points for pitch, duration, and intervallic data.
A more elaborate, in fact quite elegant mathematical procedure was devised by Wolfram Steinbeck, an early pupil of Helmut Schaffrath and an encoder of several thousand folksongs in the Essen databases, to synthesize information about stress, pitch, and duration into one integer (1982: 48-49).
With his terms translated into English, Steinbeck's formula worked as follows. Where S = score, B = beat position, P = pitch, D = duration, and N = the number of digits in the duration specification,
$$S = ((B \times 100) + P) \times 10^{N} + D$$
In his system, the code for a half note is 8 and the code for E4 is 45 (Steinbeck uses a base-19 system of pitch representation; it excludes F and B ). Where the beat position is 4 and N = 1,
$$S = ((4 \times 100) + 45) \times 10^{1} + 8 = (445 \times 10) + 8 = 4458$$
If the same note were on the first beat (B = 1), the composite score would be 1458. In the original example, if the pitch-code (P) were 60 (C 3), the composite score would be 4608. This composite numeral (Figure 25) is parsable and easily comprehensible.
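Steinbeck's composite can be checked with a few lines of Python. The sketch below follows the formula as given above and adds the inverse operation, which recovers beat position, pitch, and duration from the composite. The function names are illustrative, and the decoder assumes a two-digit pitch code, as in the worked examples.

```python
def compose(beat: int, pitch: int, duration: int, n_digits: int = 1) -> int:
    """S = ((B * 100) + P) * 10**N + D, with N digits reserved for duration."""
    return ((beat * 100) + pitch) * 10 ** n_digits + duration


def decompose(score: int, n_digits: int = 1):
    """Recover (beat, pitch, duration) from a composite score.

    Assumes the pitch code occupies exactly two decimal digits.
    """
    rest, duration = divmod(score, 10 ** n_digits)
    beat, pitch = divmod(rest, 100)
    return beat, pitch, duration


print(compose(4, 45, 8))  # 4458
print(compose(1, 45, 8))  # 1458
print(compose(4, 60, 8))  # 4608
print(decompose(4458))    # (4, 45, 8)
```

Because beat position occupies the leading digit, a plain numerical sort of such composites groups events first by beat, then by pitch, then by duration.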
Figure 25. Wolfram Steinbeck's composite four-digit representation of beat position, pitch, and duration.
Steinbeck's intention was to simplify the task of recognizing similarity in pairs of melodies similar to those shown in the discussion of Leppig's work (below).
In his study of bebop melodies, J. Kent Williams (1985: 60) assessed the frequency of a series of pitch-direction profiles (UU, RR, etc.) in conjunction with a list of thirteen composite rhythmic figures. To produce the rhythmic-figure dimension of this array, Williams collapsed another two-dimensional array coupling four first-note/last-note accentual cases (strong-strong, strong-anticipated, anticipated-strong, or anticipated-anticipated) with specific rhythmic patterns. This first-order array is shown in Figure 26.
Figure 26. J. Kent Williams's scheme for coupling accentual relationships of first and last notes of a pattern with specific rhythmic formulae. This look-up array feeds composite results (e.g., 5SS, for Group 5, strong-strong) to a second array with directional profiles of pitch (e.g., UU, DR).
The basic approach of using streams of arrays suggests various other procedures for relating elements of information about pitch and duration. The pitch-axis of a subsequent array could contain information about direction (as it does here) or contour, about diatonic or enharmonic pitch-string categories, and so forth. The rhythm-axis could convey information about duration or stress, or about some hierarchical scale of events or combinations of events. Williams's choice was to couple fairly general information on pitch direction with more precise information about rhythmic figures. He also examined the Baroni-Jacoboni concept of the kernel, which he terms a cumulative interval.
A third approach to synthesizing information about pitch and rhythm is provided by Suk Won Yi (1990). Starting from an orientation influenced by semiotics, linguistics, and studies in perception, he creates statistical measures of "melodic activity" as an aid to comparing melodic contours in selected Schubert and Schumann Lieder. Eight measurements are used. The first (intervallic change), the second (duration), and last (a coefficient of melodic activity) for the first ten events of Mozart's G-Minor Symphony (K. 550) are represented by Yi in a vertically organized table (Figure 27), which is rotated for presentation here.
Figure 27. Synthesis by Yi of intervallic and durational data to produce a "coefficient of melodic activity."
Yi (1990: 52) is eager to get at the problem of a truly melodic contour as opposed to a mere contour of pitches. Being sensitive to the emphasis of perceptual studies, he has chosen to couple intervallic data with durational data. Intervals are computed from a semitone representation converted to a decimal format (as is duration) for processing. To the traditional analyst of music this may seem to be an orthogonal coupling, since the size of, let us
say, the first interval of a work can only be established slightly after the second duration begins (Figure 24). The consequence of this implied diagonal coupling is that inter-domain pattern relationships may be less evident in the statistical result than in another format. In Figure 27, the coupling of pitches rather than intervals with durations would have preserved an indication of the third iteration of the three-note pattern in the ninth coefficient. The problematic element here is the rest in Bar 3 (Figure 27), which completes the ninth "interval." After changing the last number for duration to reflect the actual note value (Yi seems to have elongated the value of the high B by that of the rest [event #11, if the first two rests are ignored] in order to produce his 1.0 for the duration in the tenth cell), we find that an overlapping set of patterns has emerged. In each three-event group, repetition of a duration regularly introduces repetition of a pitch. Viewed conversely, a decrease in pitch prompts an increase in duration. If Yi had coordinated pitch information with rhythmic information, his results for the above example would resemble those in Figure 28 and the third iteration of the pattern would be evident.
Figure 28. A revision of Yi's data, substituting pitch for intervallic data in order to produce a clearer sense of overlapping patterns.
This attempt to make cognitive and perceptual principles the basis for melodic analysis inadvertently poses an array of important questions for logical approaches to melodic analysis. It points out that paradoxes inhere in our basic definitions of pitch and duration. The concept of pitch without duration is as meaningless for music as the concept of duration without pitch. We discuss these parameters as though they were entirely separable, but at some cosmological level, they are not. The idea of coupling pitch with duration also implies an orthogonal coupling; it simply has a different slant from the coupling of intervallic data with pitch data (see Figure 24).
Yi himself calls attention to another difficulty inherent in composite statistics, one that Crerar (1985) called "statistical mush": once multiple threads of information are reduced to a single measure (here of melodic activity for each "event"), it is no longer possible to know whether the
identical results (here coefficients) express a high score for melody combined with a low one for duration, the reverse, or equal measures for both. Thus, measures in which pitch and durational information (or their substitutes) are synthesized in a single figure have some limitations in helping us answer musical questions about music.
1.4.5 Parallel-Processing Models
If synthetic data and composite results are not an ultimate answer, what procedures exist for managing multiple data streams in an uncoupled way? The German mathematician Manfred Leppig (1987) has proposed some simple routines for similarity searches in which perceptually correct results might be impeded by contextual differences or surface details. For example, to find parallel passages obscured by variant meters, such as those shown in Figures 29a and b,
Figures 29. (a) The French folksong "Ah, vous dirai-je, maman" and (b) the German folksong "Alle Vögel sind schon da."
Leppig's procedure involves (1) the assignment of diatonic numbers to map pitch and arbitrary integers to identify durations, (2) the comparison of pitch-strings, and (3) the computation of item-by-item differences, with the aim of establishing the total number of matching tones. A comparison of the above examples would yield the series of differences shown in Figure 29c:
1: 13586865453121
2: 115566544332231
D: 0203020010-200
Figure 29c. Pitch-strings for the melodies shown in Figures 29a and b. Numerical differences (D) are given in Row 3.
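Steps (2) and (3) of Leppig's procedure reduce to an element-wise subtraction and a count of zeroes. The Python sketch below applies them to the first six digits of Rows 1 and 2 as printed in Figure 29c; only those digits are used because the alignment of the remaining events cannot be recovered from the printed strings. The sketch also includes a simple version of the offset and transposition "sliding" that Leppig applies to such comparisons. The function names and the search limits are assumptions, and the second data set is invented.

```python
def differences(a, b):
    """Item-by-item differences between two diatonic pitch-strings."""
    return [x - y for x, y in zip(a, b)]


def matching_tones(a, b):
    """Number of positions at which the two strings agree."""
    return sum(1 for d in differences(a, b) if d == 0)


def best_alignment(a, b, max_shift=4, max_transpose=7):
    """Try horizontal offsets and vertical transpositions of b against a.

    Returns (zeroes, shift, transpose) for the first alignment that yields
    the largest number of zero differences.
    """
    best = (-1, 0, 0)
    for shift in range(-max_shift, max_shift + 1):
        # A positive shift skips events at the start of a, a negative one of b.
        x, y = (a[shift:], b) if shift >= 0 else (a, b[-shift:])
        for t in range(-max_transpose, max_transpose + 1):
            zeroes = matching_tones(x, [q + t for q in y])
            if zeroes > best[0]:
                best = (zeroes, shift, t)
    return best


row1 = [1, 3, 5, 8, 6, 8]   # first six digits of Row 1 in Figure 29c
row2 = [1, 1, 5, 5, 6, 6]   # first six digits of Row 2 in Figure 29c
print(differences(row1, row2))     # [0, 2, 0, 3, 0, 2]
print(matching_tones(row1, row2))  # 3

# Hypothetical pair: the same figure, one event late and two steps higher.
print(best_alignment([9, 3, 4, 5, 6], [1, 2, 3, 4]))  # (4, 1, 2)
```

An exhaustive search over small offsets and transpositions is adequate for short incipits; longer passages would call for a more economical alignment method.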
Then (4) by various sliding routines (both horizontal sliding, to offset differences in starting position, and vertical sliding, to capture ephemeral transpositions), nearly corresponding passages can be identified by the large number of zeroes their comparison generates. In comparing durations Leppig only computes hits (*) and misses (-). See Figure 30. This procedure can also be substituted for the more articulate pitch-comparison shown in Figure 29c.
Pitch comparison: *-*-*-**-*-**
Duration comparison: --***-*--**-*
Figure 30. Comparison of pitch and duration streams as hits (*) or misses (-).
Scores can be produced by converting hits and misses to percentages of events. What this particular result reveals is a similarity in the pitch-content of accented tones. We would not cognitively consider "Ah, vous dirai-je maman" and "Alle Vögel sind schon da" to be "similar" melodies. The purpose of the discovery here is to reveal structural similarities that defy perception. Leppig's explorations, which were based on the use of the Essen databases, show the rich potential that exists in some fairly simple approaches to comparison. At the same time, they support the goal of accountability for both pitch-data and duration-data.
1.5 Prototypes, Reductions, and Similarity Searches
Are reductions prototypes? If we look again at Figure 20b in relation to Figure 19b, we see that the spine (20b) hardly gives the flavor of the original (19b). In the reduction, the upbeat is absent and the octave leap is distracting, for despite its lowly beat position, it is somehow the a' in Bar 6 that seems to be centrally important, as it completes the triadic flourish created by the preceding eighth notes. Computers can produce simple reductions, like 20b, with the greatest of ease. Prototypes that are not monorhythmic are much more difficult to derive. We can explore this lapse in a more elaborate context by reconsidering Mozart's Piano Sonata K. 311 (see Figures 3a-e).
Reductions offer a tempting solution to fuzzy matching because they suppress the differences of surface detail. But do they suppress the most appropriate surface details? In a search
for the prototypical theme (conjecturally Figure 3f), we find that no simply-implemented reductive method produces it. Nine reductions (Figures 31a-i) are easily derived:
(1) A pitch-string representation of each example, e.g., 34323454321, would readily separate 3d, because of its chromatic notes, from the others. If ties were suppressed, it would recognize 1a and 1e as being the same through the first nine pitches. It would recognize 1c and 1d as identical through the first ten pitches. The length of the string on which the comparison is run would obviously influence the results of the search.
(2) A duration profile of each example would pair 3a with 3b as ES.TEE/ESSEE (where E=eighth, etc.) and would indicate 3c, 3d, and 3e to be different from each other.
(3) An accented-note profile capturing the pitch on each quarter-note beat would render two results: one for 3a through 3d (Figure 31a) and one for 3e (Figure 31b).
(4) An accented-note profile capturing the pitch on each eighth-note beat would differentiate four sets: 3a, 3b/c, 3d, and 3e (Figures 31c-f).
(5) A more rhythmically varied profile could be created by capturing only the notes that are emphasized by harmonic change in the full context. Three harmonic-reinforcement profiles (Figures 31g-i) could thus be created. These would represent the sets 3a, 3b-d, and 3e.
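Procedures (3) and (4) differ only in the spacing of the sampling grid. A minimal Python sketch follows, assuming that a melody is given as (pitch, duration) pairs with durations in quarter-note units and that the profile records the pitch sounding at each grid point. Whether a tied or sustained note should count as "on the beat" is a design decision made here for illustration. The melody and all names are hypothetical and are not a transcription of K. 311.

```python
from fractions import Fraction


def accented_profile(events, grid):
    """Pitch sounding at every multiple of `grid` (in quarter-note units).

    `events` is an ordered list of (pitch, duration) pairs; a rest can be
    represented by a pitch of None.
    """
    timeline, onset = [], Fraction(0)
    for pitch, duration in events:
        end = onset + Fraction(duration)
        timeline.append((onset, end, pitch))
        onset = end
    profile, t = [], Fraction(0)
    while t < onset:
        # Find the event whose time span contains the grid point t.
        for start, end, pitch in timeline:
            if start <= t < end:
                profile.append(pitch)
                break
        t += Fraction(grid)
    return profile


# Hypothetical melody in base-7 pitch numbers, three quarter-note beats long.
melody = [(3, Fraction(1, 2)), (4, Fraction(1, 4)), (3, Fraction(1, 4)),
          (2, 1), (3, Fraction(1, 2)), (4, Fraction(1, 2))]
print(accented_profile(melody, 1))               # [3, 2, 3]
print(accented_profile(melody, Fraction(1, 2)))  # [3, 4, 2, 2, 3, 4]
```

The coarser grid hides the passing 4 altogether, while the finer grid counts the sustained 2 twice. That is the kind of divergence between reductions that procedures (3) and (4) produce.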
Figures 31a-i. Nine reductions of the melodic material in Figures 3a-e.
In order to derive these profiles, a program would need to be able to determine from other voices of the work which parts are so reinforced. Thus information solely from within a melody may be insufficient to support this kind of "melodic" search. The results of procedures (1) through (5) are summarized in Figure 32.
Figure 32. Summary of reductionist procedures, results, and passages to which they are applicable.
Note that none of these procedures produces the melodic prototype shown in Figure 3f. If we further examine the music in its full context, we find that there is an important melodic/harmonic implication that cannot be derived by any of the above means. This is that the C that falls on the second beat of the second bar is an appoggiatura, which causes all of the above reductions to fail to select the implied (that is, harmonically congruent) B in this position. The B is required to produce the prototypical melody
That Mozart avoids ever placing this note squarely on the beat may tell us something important about his style and about the difficulty of finding formulae to compensate for the melodic, harmonic, and rhythmic displacements caused by appoggiature. To overcome the problem posed by Figure 3 it would be necessary to design a procedure permitting not only mixed durations but also rhythmic alterations and/or pitch substitutions for the purpose of facilitating the creation of a provisionally prototypical model. The result would have a vague resemblance to an added treble in fifth-species counterpoint. Williams approached this strategy when he employed a rhythmic-regularization
procedure to facilitate the analysis of jazz melodies. In his data, syncopation was "usually effected by shifting notes slightly backward with respect to the metrical grid so that they anticipate the beat," and in his analysis, "unsyncopation...shift[ed] the attack-point...ahead to make melody congruent with harmony." Williams also removed "jazz turns" [inverted mordents], which usually begin on a strong beat, to facilitate melodic comparison (1985: 45).
Since a growing proportion of recent writings on melody have been stimulated by the works of Narmour (1990, 1992), it seemed appropriate to solicit his views on this example. The first iteration (Figure 3a) is in fact included in Narmour (1992: 219), in a discussion of "missing structural tones" under the general subject of "melodic chaining." Because Narmour's response was substantial and earnest, it is useful to quote it here together with an accompanying set of examples (Figure 33).
In brief, Narmour's interest is in describing various combinations of intervals and durations in incremental three-pitch contexts. Therefore it is combinatorially more complex than other systems discussed here. The system is hierarchical, admitting harmonic information at higher levels, looks both forward and backward, observes direction, and classifies intervallic sizes in precisely defined clusters (e.g., "a fifth or greater"). In Figure 33, brackets mark melodic segments. Letters and letter combinations classify pitch sequences with regard to (a) the size and (b) the direction of the interval, as well as whether (c) the intervallic movement and (d) the directional relationship satisfy or deny principles of "implication-realization" (I-R). IP, for example, is an intervallic process which satisfies expectations of intervallic behavior but denies registral implications. D indicates a duplicate (i.e., a repeated note). A change of direction is a "reversal" (R).
A VP (registral process) satisfies a registral implication but denies an intervallic one, and so forth. Narmour (personal correspondence, 13 May 1992) writes of K. 311,
The theoretical symbols [used in the implication-realization model] capture very well the similarities and differences between the variations (chiefly in their second measures). For instance, P(VR) appears in a structural chain in measure 2 in Figures 3a and 3b; an IPP begins in measure 2 of both examples 3b and 3c; PIDP appears at the end (measure 2) in both examples 3d and 3e. Note that according to the implication-realization model, in the first three versions (a, b, c) the tone B does not
<previous page
page_50
next page>
<previous page
page_51
next page>
Page 51
Figures33a-e. EugeneNarmour'sreworkingofFigures3a-ein accordancewiththeimplication-realizationmodel.
Figure33f. Narmour'sharmonicreductionof3f.
<previous page
page_51
next page>
<previous page
page_52
next page>
Page 52
transform to a higher level because, though a resolution of dissonance, it is metrically weak; thus I disagree that this B is a "dominating tone" in the second half of the second bar. Observe significantly, however, that this B does become transformationally structural (and "dominating") in Figure 3d. This would also stylistically influence the immediate repeat of the pattern in Figure 3e. In other words, Mozart deforms the schema (or prototype) during the first three appearances of the melody, allowing it to merge fully only toward the end of the piece. Whether it is a common compositional strategy of his style needs to be investigated, as you say. As regards the style schema on which this passage is based, I would have said the melody was a simple changing-tone one (a single IP in my terms), which I have shown in the schema on my manuscript sheet [Figure 33]. Thus to treat your question about whether the I-R model would generate your prototype (Figure 3e), the answer is no because your reduction mixes both quarter-note and eighth-note levels, which the I-R model generally disallows (unless chaining or combining occur). This prompts the thought that the reductions best suited to melodic searching and comparison may not be as concise as those used by the I-R model, which was developed for analytical purposes. Consider some possible reductions of the well-known Bach minuet shown in Figure 34. Figure 34b is a reduction of the melody (34a) that almost aspires to reductive logic of a compound melody (cf. Figure 2a). Once we have heard 34b, however, we may well ask why the "distracting" surface detail should be retained in the odd-numbered bars. Bar 1 cannot be reduced to quarter-note values, because the changing tone (A) on beat 2, if selected as a surrogate pitch for the beat in a prototypical melody, would give a misleading sense of the harmony outlined in the bar (D minor instead of B major). 
Thus we arrive at 34c, where the bar-by-bar accentual quality of the reduction in 34b is retained, but the details are reversed: there is more activity in the even-numbered bars. Figure 34d, analogous to a simple Urlinie (in the vocabulary of Heinrich Schenker), does not have nearly as much character as a "melody" as 34b or 34c, with their mixed durations.

Figures 34a-d. (a) The Minuet I in B-flat Major from the First Partita and (b-d) three possible reductions of it.

Or consider the C-Major Prelude in Book I of Bach's Well-Tempered Clavier (Figure 35). It too is reducible to a one-note-per-measure format. An intermediate reduction with mixed durations could retain a sense of identity that is missing from the mere set of parallel tenths that could result. Yet it could not be derived simply by further reduction of the right-hand "melody," since some of the tones of intermediate significance are marked accentually in the left-hand accompaniment. Here the issue hinges on the question of which notes should be repeated (for there are no novel pitches to flesh out an intermediate reduction). A regular quarter-eighth pattern captures the rhythmic sense conveyed by the left-hand part, but if one took literally only those notes repeated in the right-hand part, the result would be more erratic. Some method for admitting repeated notes is as essential here as it is in searches, even those based on "contours."
Figure 35. The opening bars of Bach's Prelude in C Major from Book I of the Well-Tempered Clavier.
How much reduction is enough? Enough, that is, to facilitate computer recognition of cognitively recognizable "matches." How much is too much? Too much, that is, for the retention of some semblance of musical coherence. The answers will vary with the goals of the query and with the repertory at hand. This necessarily complicates the prospects for the derivation of algorithms that identify melodic prototypes. Conversely, prototypes may have only a general identity not susceptible to explicit translation in every case. That is why they are potentially so valuable. Melodic prototypes are cognitive entities, while reductions are derived from the music itself. In the process of searching for melodic matches, we are frequently trying to match mind with matter.

1.6 Conclusions

There are many approaches to melodic study that have not been broached here. We have not considered Renaissance diminution techniques, Baroque Figuren (and their rhetorical implications) or improvisational formulae, variation techniques of the eighteenth and nineteenth centuries, modular approaches to the "musical grammar" of song repertories, or features of self-similarity in the construction of melodies. We have not taken account of many recent analytical procedures involving grouping structures or sophisticated reduction techniques. These are all important components of the larger study of melodic process, but computer applications at this juncture need to solidify their position in simpler domains before putting elaborate models into practice.

We have made an effort to look at the ways in which melody is conceived and represented in various domains of music scholarship. Differences of objective certainly dictate divergent emphases. It appears, however, that if we were to generalize across these subdisciplines, we would agree that the most successful efforts involving computer searching to date have been those that manipulate more than one variable. That is, a coordinated search of any two streams of information, however general, seems to yield a more refined result than any one parameter, however specific. Even when both parameters come from the same domain (e.g., direction and contour from the pitch domain), rather than from an equilateral pairing (pitch interval and stress interval), the results seem to be improved.
What users most seem to need is multiple options, in order to suit highly divergent goals. In this respect, our dedicatee, the late Prof. Dr. Helmut Schaffrath, together with his pupils and colleagues, set a sterling example. In their collective effort, they have permitted us to extract and compare pitches or intervals, durations, indexes of the tones actually used (in contrast to one-octave scales), cadence tones, accented tones, "form" as dictated by pitch profiles, "form" as dictated by duration profiles, and contour information. These capabilities can be used to compare two items or to analyze entire repertories. Admittedly, these capabilities all rest on an encoding system designed for monophonic music, but melodic searching is by nature a monophonic exercise and most search data currently available for analysis is monophonic.

We need to approach enquiries into melodic similarity with flexibility, with respect for the underlying complexity of the largely unconscious procedures that produce melodies, and with gratitude to those who have had the boldness to tackle the conceptual problems that inhere in the subject.

This article shifts back and forth between conceptual and representational issues precisely because in computer applications the two are intertwined. Neither a naive idea implemented in a sophisticated way nor a sophisticated idea filtered through a crude implementation will get us very far. We need to examine clearly articulated concepts of melody in relation to their suitability for processing, but we almost certainly need more theoretical literature on the subject of melody as well as more interdisciplinary discussions of approaches. The computer adds one more level of complexity to an already challenging mass of substance, but it also creates common interests and new opportunities for understanding in an otherwise disparate array of pursuits.
References

Adams, Charles R., "Melodic Contour Typology," Ethnomusicology 20 (1976), 179-215.
Balaban, Mira, "Music Structures: A Temporal-Hierarchical Representation for Music," Musikometrika 2 (1990), 1-51.
Balaban, Mira, "Music Structures: Interleaving the Temporal and Hierarchical Aspects in Music" in Understanding Music with AI: Perspectives on Music Cognition, ed. Mira Balaban, Kemal Ebcioğlu, and Otto Laske (Cambridge: AAAI Press/MIT Press, 1992), pp. 110-138.
Baker, Nancy K., "'Der Urstoff der Musik': Implications for Harmony and Melody in the Theory of Heinrich Koch," Music Analysis 7/1 (1988), 3-30.
Bakhmutova, I. V., V. D. Gusev, and T. N. Titkova, "Repetition and Variation in Melody: Towards a Quantitative Study," Musikometrika II (1989), 143-168.
Bakhmutova, I. V., V. D. Gusev, and T. N. Titkova, "The Search for Adaptations in Song Melodies," Computer Music Journal 21/1 (1997), 58-67.
Barlow, Harold and Sam Morgenstern. A Dictionary of Musical Themes. New York: Crown Publishers, 1948.
Baroni, Mario, "The Concept of Musical Grammar," tr. Simon Maguire with William Drabkin, Music Analysis 2 (1983), 175-208.
Baroni, Mario, R. Brunetti, Laura Callegari, and Carlo Jacoboni, "A Grammar for Melody: Relationships between Melody and Harmony" in Baroni and Callegari (eds.), Musical Grammars and Computer Analysis, Florence: Olschki, 1984.
Baroni, Mario, and Laura Callegari, "Analysis of a Repertoire of Eighteenth-Century French Chansons," Musikometrika 2 (1990), 197-240.
Baroni, Mario, Rossana Dalmonte, and Carlo Jacoboni. "Theory and Analysis of European Melody" in Computer Representations and Models in Music, ed. Alan Marsden and Anthony Pople (San Diego: Academic Press, 1992), pp. 187-206.
Baroni, Mario, and Carlo Jacoboni. Proposal for a Grammar of Melody: The Bach Chorales. Montréal: Les Presses de l'Université de Montréal, 1978.
Bartlett, James C., and W. Jay Dowling, "Scale Structure and Similarity of Melodies," Music Perception 5/3 (1988), 285-314.
Bharucha, Jamshed J., "Melodic Anchoring," Music Perception 13/3 (1996), 383-400.
Bevil, J. Marshall, "Comparative Melodic Analysis," Directory of Computer-Assisted Research in Musicology 4 (1988), 119.
Bevil, J. Marshall, "MelAnaly," Computing in Musicology 8 (1992A), 68-69.
Bevil, J. Marshall, "Principles and Methods of Element Encoding and Array Formation in the MelAnaly System of Comparative Folktune Analysis," Computers in Music Research 4 (1992B), 77-104.
Binford-Walsh, Hilde M., "Applications in Historical Musicology: The Melodic Grammar of Aquitanian Tropes," Computing in Musicology 7 (1991), 41-42.
Binford-Walsh, Hilde M., "The Ordering of Melody in Aquitanian Chant: A Study of Mode-One Introit Types," in Cantus Planus (Budapest, 1990), pp. 327-339.
Boroda, Moisei, "A Methodology of Quantitative Investigation of Methods of Variation of Segments in Song Melodies" [typescript in Russian], Tbilisi, 1987.
Boroda, Moisei G., "Towards a Phrase-Type Melodic Unit in Music," Musikometrika 50/4 (1992), 15-82.
Breslauer, Peter, "Diminutional Rhythm and Melodic Structure," Journal of Music Theory 32/1 (1988), 1-21.
Brinkman, Alexander R., "A Binomial Representation of Pitch for Computer Processing of Musical Data," Music Theory Spectrum 8 (1986A), 44-57.
Brinkman, Alexander. "Johann Sebastian Bach's Orgelbüchlein: A Computer-Assisted Study of the Melodic Influence of the Cantus Firmus on the Contrapuntal Voices." Ann Arbor: UMI, 1986.
Brinkman, Alexander R., Pascal Programming for Music Research. Chicago: The University of Chicago Press, 1990.
Brinkman, Alexander R., "Representing Musical Scores for Computer Analysis," Journal of Music Theory 30/2 (1986B), 225-275.
Bryant, Stephen C. and Gary Chapman. A Melodic Index to Haydn's Instrumental Music: A Thematic Locator for Anthony van Hoboken's THEMATISCH-BIBLIOGRAPHISCHES WERKVERZEICHNIS, Vols. I and III (Thematic Catalogues, No. 3). New York: Pendragon Press, 1982.
Camilleri, Lelio, "A Grammar of the Melodies of Schubert's Lieder" in Musical Grammars and Computer Analysis, ed. Mario Baroni and Laura Callegari (Modena, 1982; pub. Florence: Olschki, 1984), pp. 229-236.
Carterette, Edward C., Donald V. Kohl, and Mark A. Pitt, "Similarities among Transformed Melodies: The Abstraction of Invariants," Music Perception 3/4 (1986), 393-409.
Cooper, Grosvenor, and Leonard Meyer. The Rhythmic Structure of Music. Chicago: The University of Chicago Press, 1960.
Cope, David, Computers and Musical Style. Madison: A-R Editions, Inc., 1991A.
Cope, David, Experiments in Musical Intelligence. Madison: A-R Editions, Inc., 1996.
Cope, David, "On the Algorithmic Representation of Musical Style" in Understanding Music with AI: Perspectives on Music Cognition, ed. Mira Balaban, Kemal Ebcioğlu, and Otto Laske (Cambridge: AAAI Press/MIT Press, 1992A), pp. 354-363.
Cope, David, "Pattern Matching as an Engine for the Simulation of Musical Style," Computing in Musicology 8 (1992B), 107-110.
Cope, David, "Recombinant Music: Using the Computer to Explore Musical Style" in IEEE Computer (July 1991B), 22-29.
Crerar, Alison, "Elements of a Statistical Approach to the Question of Authorship," Computers and the Humanities 19 (1985), 175-182.
Curwen, John. The Art of Teaching and the Teaching of Music: Being the Teacher's Manual of the Tonic Sol-fa Method. London, 1875.
Dowling, W. Jay, "Context Effects on Melody Recognition: Scale-Step versus Interval Representations," Music Perception 3/3 (1986), 281-296.
Dowling, W. Jay, "Melodic Contour in Hearing and Remembering Melodies" in Musical Perceptions, ed. Rita Aiello with John Sloboda (New York: Oxford University Press, 1994), pp. 173-190.
Ebcioğlu, Kemal. "An Expert System for Harmonization of Chorales in the Style of J. S. Bach." Ph.D. thesis, State University of New York at Buffalo (Technical Report #86-09), 1986.
Ebcioğlu, Kemal. "An Expert System for Harmonizing Chorales in the Style of J. S. Bach" in Understanding Music with AI: Perspectives on Music Cognition, ed. Mira Balaban, Kemal Ebcioğlu, and Otto Laske (Cambridge: AAAI Press/MIT Press, 1992), pp. 294-334.
Edworthy, Judy, "Interval and Contour in Melody Processing," Music Perception 2/3 (1985), 375-388.
Eitan, Zohar. Highpoints: A Study of Melodic Peaks. Philadelphia: University of Pennsylvania Press, 1997.
Eitan, Zohar, "Melodic Contour and Musical Style: A Quantitative Study," Musikometrika 5 (1993), 1-68.
Ellis, Mark. "Linear Aspects of the Fugues and J. S. Bach's The Well-Tempered Clavier: A Quantitative Survey." 2 vols. Ph.D. thesis, Nottingham University, 1980.
Franzke, Ulrich, "Formale und endliche Melodiesprachen und das Problem der Musikdatenkodierung," Musikometrika 5 (1993), 107-149.
Gjerdingen, Robert O., "The Formation and Deformation of Classic/Romantic Phrase Schemata," Music Theory Spectrum 8 (1986), 25-43.
Gingerich, Lora L., "A Technique for Melodic Motivic Analysis in the Music of Charles Ives," Music Theory Spectrum 8 (1986), 75-93.
Gross, Dorothy. "A Set of Computer Programs to Aid in Musical Analysis." Ph.D. thesis, Indiana University, 1975.
Gustafson, Bruce, with Matthew Leshinskie. A Thematic Locator for the Works of Jean-Baptiste Lully. New York: Performers' Editions, 1989.
Haas, Max, "Chatull Gadol 1.0," Computing in Musicology 8 (1992), 63-64.
Haas, Max, Chatull Gadol 1.0: Computergestützte Erforschung mittelalterlicher Musik am Beispiel liturgischer Einstimmigkeit. Basel: Max Haas, 1991.
Halperin, David, "A Segmentation Algorithm and its Application to Medieval Monophonic Music," Musikometrika 2/43 (1990), 107-119.
Halperin, David, "A Structural Analysis of Troubadour Music." M.A. thesis: Tel-Aviv University, 1978.
Halperin, David, "Contributions to a Morphology of Ambrosian Chant." Ph.D. thesis, Tel Aviv University, 1986.
Hauptmann, Moritz. The Nature of Harmony and Metre [1853], tr. W. E. Heathcote [1888], foreword by Siegmund Levarie. New York: Da Capo Press, 1991.
Hewlett, Walter B., "A Base-40 Number-Line Representation of Musical Pitch Notation," Musikometrika 50/4 (1992), 1-14.
Hughes, Andrew. Late Medieval Liturgical Offices: Resources for Electronic Research. Toronto: Pontifical Institute of Mediaeval Studies, 1994.
Huron, David, "The Melodic Arch in Western Folksongs," Computing in Musicology 10 (1995-96), 3-23.
Huron, David, and Matthew Royal, "What is Melodic Accent? Converging Evidence from Musical Practice," Music Perception 13/4 (1996), 489-516.
Iszmirli, ., and S. Bilgen, "Classification of Note-Rhythm Complexities in Melodies using a Neural Network Model" in Artificial Intelligence and Music: 10th European Conference on Artificial Intelligence, Workshop W12, ed. Gerhard Widmer (Vienna, August 4, 1992).
Jesser, Barbara. Interaktive Melodieanalyse. Bern: Peter Lang, 1991.
Jones, Mari Riess, "Dynamics of Musical Patterns: How do Melody and Rhythm Fit Together?," Psychology and Music: The Understanding of Melody and Rhythm, ed. Thomas J. Tighe and W. Jay Dowling (Hillsdale, NJ: Erlbaum, 1993), pp. 67-92.
Koch, Heinrich Christoph. Introductory Essay on Composition: The Mechanical Rules of Melody, Section 3 and 4, tr. and intro. by Nancy Kovaleff Baker. New Haven: Yale University Press, 1983.
Laden, Bernice, "Melodic Anchoring and Tone Duration," Music Perception 12/2 (1994), 199-212.
LaRue, Jan. A Catalogue of Eighteenth-Century Symphonies: Vol. I, Thematic Identifier. Bloomington: Indiana University Press, 1988.
LaRue, Jan. Guidelines for Style Analysis. New York: W. W. Norton, 1970.
LaRue, Jan, "Symbols for Melodic Description," Ethnomusicology 8 (1964), 165-166.
Leppig, Manfred, "Musikuntersuchungen im Rechenautomaten," Musica 41/2 (1987), 140-150.
Leppig, Manfred, "Tonfolgenverarbeitung in Rechenautomaten: Muster und Former," Zeitschrift für Musikpädagogik 42 (1987), 59-65.
Lerdahl, Fred, and Ray Jackendoff. A Generative Theory of Tonal Music. Cambridge: MIT Press, 1983.
Lincoln, Harry B. The Italian Madrigal and Related Repertories: Indexes to Printed Collections, 1500-1600. New Haven: Yale University Press, 1988.
Lincoln, Harry B. The Latin Motet. Ottawa: Institute of Mediaeval Music, 1993.
List, George, "Hopi Melodic Concepts," Journal of the American Musicological Society XXXVIII/1 (1985), 143-152.
Love, James. Scottish Church Music: Its Composers and Sources. Edinburgh: William Blackwood and Sons, 1891.
Lubej, Emil H., "EMAP (EthnoMusicological Analysis Program) for Windows 95 and Windows NT," Computing in Musicology 10 (1995-96), 151-154.
McAll, May DeForest. Melodic Index of the Works of Johann Sebastian Bach. New York: C. F. Peters, 1962.
Meyer, Leonard. Emotion and Meaning in Music. Chicago: Chicago University Press, 1956.
Monaghan, Caroline B., and Edward C. Carterette, "Pitch and Duration as Determinants of Musical Space," Music Perception 3/1 (1985), 1-32.
Mongeau, Marcel, and David Sankoff, "Comparison of Musical Sequences," Computers and the Humanities 24 (1990), 161-175.
Narmour, Eugene, "Toward an Analytical Symbology: The Melodic, Harmonic, and Durational Functions of Implication and Realization" in Musical Grammars and Computer Analysis, ed. Mario Baroni and Laura Callegari (Modena, 1982; pub. Florence: Olschki, 1984), pp. 83-114.
Narmour, Eugene. The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. Chicago: The University of Chicago Press, 1990.
Narmour, Eugene. The Analysis and Cognition of Melodic Complexity: The Implication-Realization Model. Chicago: The University of Chicago Press, 1992.
Parsons, Denys. The Directory of Tunes and Musical Themes. Cambridge: Spencer Brown, 1975.
Pont, Graham, "Geography and Human Song," The New Scientist 20 January 1990, pp. 56-58.
Ratner, Leonard. Classic Music: Expression, Form, and Style. New York: Schirmer Books, 1980.
Ratner, Leonard. Romantic Music: Sound and Syntax. New York: Schirmer Books, 1992.
Riemann, Hugo. System der musikalischen Rhythmik und Metrik. Leipzig: Breitkopf und Härtel, 1903.
Rosner, Burton S., and Leonard B. Meyer, "The Perceptual Roles of Melodic Process, Contour, and Form," Music Perception 4/1 (1986), 1-39.
Schaffrath, Helmut, "The EsAC Databases and MAPPET Software," Computing in Musicology 8 (1992A), 66.
Schaffrath, Helmut, "The EsAC Electronic Songbooks," Computing in Musicology 9 (1993-94), 78.
Schaffrath, Helmut, "Repräsentation einstimmiger Melodien; computerunterstützte Analyse und Musikdatenbanken" in Neue Musiktechnologie, ed. Bernd Enders and Stefan Hanheide (Mainz: B. Schott's Söhne, 1993), pp. 277-300.
Schaffrath, Helmut, "The Retrieval of Monophonic Melodies and their Variants: Concepts and Strategies for Computer-Aided Analysis" in Computer Representations and Models in Music, ed. Alan Marsden and Anthony Pople (San Diego: Academic Press, 1992B), pp. 95-110.
Schottstaedt, Bill, "Automatic Counterpoint," Current Directions in Computer Music Research, ed. Max V. Mathews and John R. Pierce. Cambridge: MIT Press, 1989.
Seeger, Charles, "On the Moods of a Music Logic," Journal of the American Musicological Society 13 (1960), 224-261.
Selfridge-Field, Eleanor (ed.). Beyond MIDI: The Handbook of Musical Codes. Cambridge and London: MIT Press, 1997.
[Selfridge-Field, Eleanor], "Encoding Neumes and Mensural Notation," Computing in Musicology 6 (1990), 23-35.
Selfridge-Field, Eleanor, "Music Analysis by Computer" in Music Processing, ed. Goffredo Haus (Madison: A-R Editions, Inc., 1993), pp. 3-24.
Selfridge-Field, Eleanor. The Music of Benedetto and Alessandro Marcello: A Thematic Catalogue with Commentary on the Composers, Repertory, and Sources. Oxford: Clarendon Press, 1990.
Selfridge-Field, Eleanor, "Reflections on Technology and Musicology," Acta musicologica LXII/2-3 (1990), 302-314.
Sisman, Elaine R. Haydn and the Classical Variation. Cambridge: Harvard University Press, 1993.
Smith, Matt, and Simon Holland, "An AI Tool for the Analysis and Generation of Melodies," International Computer Music Conference Proceedings (1992), 61-64.
Spitzer, John, "'Oh Susanna': Oral Transmission and Tune Transformation," Journal of the American Musicological Society 47/1 (1994), 90-136.
Stech, David, "A Computer-Assisted Approach to Micro-Analysis of Melodic Lines," Computers and the Humanities 15/4 (1981), 211-221.
Steinbeck, Wolfram. Struktur und Ähnlichkeit: Methoden automatisierter Melodienanalyse (Kieler Schriften zur Musikwissenschaft, XXV). Kassel: Bärenreiter, 1982.
Stinson, John, "The SCRIBE Database," Computing in Musicology 8 (1992), 65.
Swain, Joseph P., "Dimensions of Harmonic Rhythm," Music Theory Spectrum 20/1 (1998), 48-71.
Tangian, Andranik, "A Binary System for Classification of Rhythmic Patterns," Computing in Musicology 8 (1992), 75-81.
Temperley, David, "An Algorithm for Harmonic Analysis," Music Perception 15/1 (1997), 31-68.
Temperley, David, "Motivic Perception and Modularity," Music Perception 13/2 (1995), 141-170.
Temperley, Nicholas, with Charles G. Manns and Joseph Herl. The Hymn Tune Index: A Census of English-Language Hymn Tunes in Printed Sources from 1535 to 1820. 4 vols. Oxford: Clarendon Press, 1997.
Thompson, William Forde, and Murray Stainton, "Using Humdrum to Analyze Melodic Structure: An Assessment of Narmour's Implication-Realization Model," Computing in Musicology 10 (1995-96), 24-33.
Trowbridge, Lynn. "The Fifteenth-Century French Chanson: A Computer-Aided Study of Styles and Style Change." Ph.D. thesis, University of Illinois. Partially duplicated in "Style Change in the Fifteenth-Century Chanson," Journal of Musicology IV/2 (1985), 146-170.
Van Den Toorn, Pieter C., "What's in a Motive? Schoenberg and Schenker Reconsidered," The Journal of Musicology 14/3 (1996), 370-399.
Vos, Piet G., and Jim M. Troost, "Ascending and Descending Melodic Intervals: Statistical Findings and their Perceptual Relevance," Music Perception 6/4 (1989), 383-396.
Ward, John M., "The Morris Tune," Journal of the American Musicological Society XXXIX/2 (1986), 294-331.
Wenk, Arthur B., "Parsing Debussy: Proposal for a Grammar of his Melodic Practice," Musikometrika I (1988), 237-256.
Williams, J. Kent, "A Method for the Computer-Aided Analysis of Jazz Melody in the Small Dimensions," Annual Review of Jazz Studies 3 (1985), 41-70.
Yi, Suk Won. "A Theory of Melodic Contour as Pitch-Time Interaction: The Linguistic Modeling and Statistical Analysis of Vocal Melodies in the Selected Lied Collections of Schubert and Schumann." Ph.D. thesis, University of California at Los Angeles, 1990.
Zarhipov, R. K., "Construction of Frequency Lexicons of Musical Segments for Analysis and Modelling of Melodies" [in Russian], Science 41 (1965), 207-252.
2 A Geometrical Algorithm for Melodic Difference

Donncha Ó Maidín
University of Limerick
Plassey Technological Park
Limerick, Ireland
donncha.omaidin@ul.ie

Abstract

This paper describes a parametrically driven difference algorithm that depends loosely on general principles of music perception. It may be used for making comparisons between pre-selected melodic segments of equal length. It has been tested on a corpus of Irish folksongs.
The approach to melodic comparison described here has been applied successfully to identifying similar 8-bar segments in corpora of Irish folk-dance music. Its performance in identifying such melodic segments from a corpus of 419 pieces of Irish dance music has exceeded manual performance. The algorithm was also used for identifying shorter phrase segments of one bar's duration. The design of the algorithm involves decisions on the handling of the following analytical features:

(1) juxtapositioning of notes in the two melodic segments under comparison
(2) pitch differences
(3) note durations
(4) metrical stress
(5) transpositions

2.1 General Properties of a Difference Algorithm

The algorithm is implemented as a function called difference, which evaluates a numerical measure of melodic difference between segments of duration r. This function has the properties

difference(si1, si2, r) >= 0,
difference(si1, si1, r) == 0,
difference(si1, si2, r) always yields the same value as difference(si2, si1, r),

where r >= 0 and where si1 and si2, respectively, represent the starting locations of each of the score segments under comparison.

2.1.1 Juxtapositioning

In a comparison of two melodic segments, a decision has to be made on which pairs of notes participate in comparisons. The approach taken here is to juxtapose the two melodic segments in time sequence. If one were to lay out the two melodic segments so that they share the same time axis, one could then divide this axis into time-windows, where each window represents the longest time for which both melodic segments have a uniform activity. This is illustrated in Fig. 1c, where the two melodic segments from Fig. 1a and
Fig. 1b are laid out so as to share the same time axis, which is represented by the lower horizontal line. The resulting time-windows are represented by divisions on the lower horizontal line. These windows determine which pairs of note segments are compared.
Figures 1a-c. (a) Start of "Shandon Bells" from O'Neill's The Dance Music of Ireland (Chicago 1907); (b) start of "The Yellow Flail" (op. cit.); (c) segments of a and b juxtaposed in time-window order. The dotted vertical lines segment the lower horizontal line into divisions, each one of which represents a window.
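The juxtaposition into time-windows can be sketched in a few lines of Python. This is only an illustrative sketch, not the C.P.N. View implementation: the function name time_windows, the (pitch, duration) input format, and the sample pitches are assumptions made for the example, and rests are not handled.

```python
from fractions import Fraction as F

def time_windows(seg1, seg2):
    """Split two equal-length monophonic segments into shared time-windows.

    Each segment is a list of (pitch, duration) pairs (hypothetical format).
    Returns a list of (start, width, pitch1, pitch2), one entry per window.
    """
    def spans(seg):
        t, out = F(0), []
        for pitch, dur in seg:
            out.append((t, t + dur, pitch))
            t += dur
        return out, t

    notes1, end1 = spans(seg1)
    notes2, end2 = spans(seg2)
    if end1 != end2:
        raise ValueError("segments must have equal total duration")

    # A window boundary falls wherever either segment starts a new note.
    bounds = sorted({s for s, _, _ in notes1} | {s for s, _, _ in notes2} | {end1})

    def pitch_at(notes, t):
        for start, end, pitch in notes:
            if start <= t < end:
                return pitch
        raise ValueError("no note sounding at time %s" % t)

    return [(a, b - a, pitch_at(notes1, a), pitch_at(notes2, a))
            for a, b in zip(bounds, bounds[1:])]

# Hypothetical one-bar segments in 6/8 (MIDI key numbers, durations in whole notes).
seg1 = [(67, F(3, 8)), (69, F(3, 8))]
seg2 = [(67, F(1, 4)), (66, F(1, 8)), (69, F(1, 8)), (71, F(1, 4))]
windows = time_windows(seg1, seg2)
```

Within each resulting window both segments hold a single pitch, so every window contributes exactly one pair of notes to the comparison.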
2.1.2 Pitch Differences

For this repertory, two pitches under comparison may be converted into either a base-12 note number or a base-7 note number. In the first representation, which employs the MIDI key-number specifications for pitch, Middle C is represented by the number 60 and the note E above it by the number 64. The pitch difference between these two notes is taken as the positive difference between them, that is, 4. If, on the other hand, one were to use a base-7 representation, where Middle C = 1 and the E above it = 3, then a difference of 2 would be obtained. Using either of these pitch representations, an estimate of the melodic distance between two tune-segments could be obtained by summing such (positive) pitch differences. This is very roughly analogous to calculating the sum of the lengths of the heavy vertical lines superimposed in Figure 1. This algorithm has its origin in the idea of representing musical pitch geometrically (Krumhansl, 1990). To date, extensive use has been made of the base-12 representation in MIDI-based analysis systems.

2.1.3 Note Durations

Intuitively, it makes sense for note lengths to influence difference measures. Thus, when all other things are equal, pairs of long notes contribute to the difference measure to a greater extent than pairs of short notes. Here individual pitch differences are weighted according to the width of the window to which they belong.

2.1.4 Metrical Stress

The incorporation of metrical stress into the difference measure is achieved by assigning differential weights to notes that start at different places in a bar. These can be shown as a weight map which, for a piece in 6/8 time, might be assembled as in Table 1. The choice of the weights is arbitrary.
Table 1. Stress weights for 6/8 time.

Distance in bar    Weight
0                  4
1/8                2
2/8                2
3/8                3
4/8                2
5/8                2
otherwise          1
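A weight map such as Table 1 amounts to a lookup keyed by the onset's position within the bar. The sketch below encodes the table as a Python dictionary; the names STRESS_6_8 and stress_weight, and the choice of exact fractions for positions, are assumptions of this example.

```python
from fractions import Fraction

# Table 1: weight by distance from the start of the bar, in 6/8 time.
STRESS_6_8 = {
    Fraction(0): 4,
    Fraction(1, 8): 2,
    Fraction(2, 8): 2,
    Fraction(3, 8): 3,
    Fraction(4, 8): 2,
    Fraction(5, 8): 2,
}

def stress_weight(onset, bar_length=Fraction(6, 8), table=STRESS_6_8, default=1):
    """Return the stress weight for a note starting at `onset`.

    `onset` is measured in whole notes from the start of the first bar;
    positions not listed in the table receive the "otherwise" weight.
    """
    position = Fraction(onset) % bar_length
    return table.get(position, default)
```

Exact fractions avoid the rounding problems that floating-point positions such as 1/3 of a beat would introduce when used as dictionary keys.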
The one-bar segments shown in Figure 2 are used for illustrating the operation of various difference algorithms.
Figure 2. Sample melodic segments for illustrating difference algorithms.

2.1.5 Transpositions

The difference algorithm can be expressed as

$$\mathrm{difference} = \sum_{k=1}^{n} |p_{1k} - p_{2k}| \, w_k \, ws_k$$

where
$p_{1k}$ is the pitch of the note from the first segment at the kth window,
$p_{2k}$ is the pitch of the note from the second segment at the kth window,
$w_k$ is the width of window k,
$ws_k$ is the weight derived from metrical stress for window k, and
n is the number of windows.
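Read as a sum over windows of the absolute pitch difference multiplied by the window width and the stress weight, the measure can be sketched as follows. The function signature differs from the difference(si1, si2, r) function described in the text: here the windows are passed in directly as (p1, p2, width, stress) tuples, a format assumed for this example, with pitches given as MIDI key numbers and widths counted in eighth notes.

```python
def difference(windows):
    """Weighted sum of absolute pitch differences over time-windows.

    `windows` is a list of (p1, p2, width, stress) tuples, one per window.
    """
    return sum(abs(p1 - p2) * width * stress
               for p1, p2, width, stress in windows)

# Hypothetical one-bar example in 6/8; stress weights follow Table 1.
windows = [
    (67, 67, 2, 4),  # same pitch: contributes 0
    (67, 66, 1, 2),  # 1 semitone * width 1 * stress 2 = 2
    (69, 69, 1, 3),  # same pitch: contributes 0
    (69, 71, 2, 2),  # 2 semitones * width 2 * stress 2 = 8
]
total = difference(windows)  # 0 + 2 + 0 + 8 = 10
```

Because each term is an absolute value times positive weights, the result is non-negative, is zero when a segment is compared with itself, and does not change when the two segments are swapped.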
The above formula can be expressed as

$$\text{difference} = \sum_{k} W_k \, |p_{1k} - p_{2k}|$$

where $W_k = w_k \, ws_k$ combines the window width and the stress weight for window k.

The creation of a key-independent version of the algorithm might be derived from a process of transposing one of the segments so as to minimize the above difference. Considering various transposed versions of one of these tune-segments, such as where the second tune-segment has been transposed up m semitones, we get

$$\text{difference}(m) = \sum_{k} W_k \, |p_{1k} - p_{2k} - m|$$

One possible way in which we can visualize a key-independent comparison is by making multiple estimates of the distance by means of one of the previous algorithms, where we allow one of the tune-segments to be transposed to all possible keys in the pitch vicinity of the other segment. A difference is calculated for each transposition, and the minimum value of this set of differences is taken as a measure of melodic similarity. This is illustrated in Figure 3 and in Figure 4, by considering a comparison between two related segments. We can see that the difference calculation for the original untransposed method gives 88, but that if the second segment is transposed down either a major or a minor second, a smaller value of 63 results. The process of finding this difference is equivalent to finding the value of m which minimizes

$$\sum_{k} W_k \, |p_{1k} - p_{2k} - m|$$

A well-known theorem in statistics (Aitken, 1939) enables us to find the required value of m which minimizes the sum, without the repeated calculations involved above. The minimum is attained when m is the median value of the sequence of pitch differences ($p_{1k} - p_{2k}$), each taken with its associated weight $W_k$. In statistical applications, $W_k$ is normally interpreted as a frequency. The use of this theorem gives us a way of arriving at the answer efficiently.
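The median shortcut can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the published implementation: the pitches are assumed to be already aligned window by window, the weights are assumed positive, and the function names are hypothetical. The lower weighted median is returned; when the minimum is shared by several transpositions, it is one of them.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value

def cost(p1, p2, weights, m):
    """Weighted difference when the second segment is transposed up m semitones."""
    return sum(w * abs(a - b - m) for a, b, w in zip(p1, p2, weights))

def best_transposition(p1, p2, weights):
    """Transposition m minimizing the cost, found without trying every key."""
    diffs = [a - b for a, b in zip(p1, p2)]
    return weighted_median(diffs, weights)

# Hypothetical per-window pitches and combined weights.
p1 = [60, 62, 64, 65]
p2 = [62, 64, 67, 67]
weights = [4, 2, 2, 3]
m = best_transposition(p1, p2, weights)
print(m, cost(p1, p2, weights, m))
```

A brute-force scan over a range of m values gives the same minimum, which is a convenient check on the shortcut.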
Figure 3. Two related tune-segments for comparison.

Figure 4. Calculation of a transformationally independent difference. This example illustrates the evaluation of difference between Segment 1 and various transpositions of Segment 2.

The difference matrix shown as Table 2 was produced by a transposition-independent difference algorithm using windows and stress weighting.
Table 2. Differences weighted by windows, stresses with transpositions.

       a     b     c     d
b    217
c     42   200
d    133   154   133
e    188    63   170    92

2.2 Implementation

The algorithm was implemented in C.P.N. View (for further information see the ScoreView user manual in Ó Maidín, 1995: 175-240), in the form of the difference function shown earlier. Settings are available to select the features discussed here, together with some extra features. Options may be selected to control such things as key transitions, metrical stresses and window widths. Differences may be weighted by note durations at onset points. Either base-7 (diatonic) or base-12 (chromatic) pitch comparisons may be selected. Additionally, the weights may be adjusted in cases where it is desirable to optimize the algorithm, as for example when a corpus is used for training the algorithm.

In practice, the algorithm worked well with the intuitive weights, and it was possible, by experimentation, to pick a critical value of difference for a particular application that partitioned the melodic segments into similar and dissimilar pairs in a reasonably satisfactory way.

References

Aitken, A. C. Statistical Mathematics (Edinburgh: Oliver and Boyd, 1939), I, 32.

Krumhansl, Carol L. Cognitive Foundations of Musical Pitch (Oxford, 1990), pp. 112-119.

Ó Maidín, Donncha. "A Programmer's Environment for Music Analysis," Technical Report UL-CSIS-95-1. Copies are available from the Department of Computer Science, University of Limerick, Ireland.
3 String-Matching Techniques for Musical Similarity and Melodic Recognition

Tim Crawford
Music Department
King's College London
London WC2R 2LS, England
t.crawford@kcl.ac.uk

Costas S. Iliopoulos
csi@dcs.kcl.ac.uk
Rajeev Raman
raman@dcs.kcl.ac.uk
Algorithm Design Group
Dept. of Computer Science
King's College London
London WC2R 2LS, England
www.dcs.kcl.ac.uk

Abstract

The primary goal of this paper is to identify some computational tasks which arise in computer-assisted musical analysis and data retrieval, to formalize them as string-pattern-matching problems and to survey some possible algorithms for tackling them. The selection of computational tasks includes some foreseen as useful in research into such historical repertories as lute or keyboard music.
3.1 Introduction

We wish to identify some computational approaches used in the biological and technical sciences which may be of value in pursuing certain kinds of musical queries, and to formalize them as string-pattern-matching problems. The approaches are of two general kinds:

(1) Approaches for which computationally efficient procedures are suggested in the computer-science literature. By identifying these solutions we hope to provide a basis by which musicologists and computer scientists may collaborate to develop efficient software for musical analysis problems.

(2) Approaches for which computationally efficient solutions are not known to exist in the computer-science literature. By describing these unresolved problems we hope to stimulate further research in the field of string-algorithm design.

The approaches discussed here are representative rather than inclusive.

3.1.1 Objectives

An important direction of our research is towards a formal definition of musical similarity between such musical entities as "themes" or "motifs." We aim to produce a quantitative measure, or "characteristic signature," of a musical entity. This measure is essential for melodic recognition. It could have multiple uses, including that of data retrieval from musical databases.

The ideal characteristic signature would be derived from the pattern of notes (and other musical objects) as they occur in temporal sequence in the musical entity. The note-pattern itself may be derived from an unstructured audio input, from symbolically encoded score data containing a high degree of logical structure, or from some intermediate state, such as a stream of MIDI commands, wherein pitches may be clearly identifiable but their structural relationship is not clear. Note-pattern derivation is not considered in this paper.
Two musical entities which are said to be similar will be expected to have matching characteristic signatures, that is, both entities will satisfy a set of properties (Cambouropoulos and Smaill 1995). A property is said to be satisfied when it achieves a certain score. Each property is assigned a certain weight, and the characteristic signature is the combination of the weighted properties of objects in the musical entity. Intuitively, these properties will encode patterns in the musical entity. It is hoped that the pattern-matching and pattern-discovering problems that we discuss in this paper will provide the basis for obtaining a set of properties that can be used as parameters in musical similarity and as a touchstone for creating the characteristic signature of a piece of music.

3.1.2 Computational Resources for Musicologists

Textbooks on string algorithms, although few, may sometimes be useful. Apostolico and Galil (1985, 1997) provide surveys of the majority of fundamental algorithms and results on strings. The textbook of Crochemore and Rytter (1996) is the only one covering a plethora of problems in this area. Aho (1990) gives an excellent survey focussing on a certain set of problems. In relation to possible connections between computational biology and computer-assisted musicology, Setubal and Meidanis (1997) give an excellent introduction to computational molecular biology.

Some software packages designed for molecular biology may possibly be useful for musical analysis. FAST is a family of programs for sequence database search: for example, FASTP is a subprogram for general sequence comparison, LFASTA is a tool for local similarity comparison, and TFASTA is a sequence query tool. The FAST implementation is described by Lipman and Pearson (1985), Pearson and Lipman (1988) and Pearson (1990). BLAST is another sequence-similarity tool, described in Altschul et al. (1991).
A more general and wider-ranging software library for string processing, aimed at a larger range of applications, is also under development (Czumaj et al. 1997).
3.2 String-Matching Problems in Musical Analysis

3.2.1 A Perspective from Computational Science

In computational science, string-matching procedures have their own jargon. A string is a (usually finite) sequence of symbols drawn from a finite set of symbols which is called the alphabet. Patterns and texts are both strings. The prefixes of a string are the strings formed from a concatenation of its initial characters, e.g., the prefixes of the string abcd are the empty string, a, ab, abc and abcd itself. The suffixes of a string are the strings formed from a concatenation of its final characters, e.g., the suffixes of the string abcd are the empty string, d, cd, bcd and abcd itself.

The text usually corresponds to a score or other musical entity, and the pattern could be a motif, in the form of a sequence of notes, provided by a user, or some other sequence of items. Some pattern-matching problems do not involve a user-specified pattern, but rather involve discovering a pattern in a text: for instance, analyzing a score to find repeated passages.

The mapping of musical entities onto texts depends heavily on the way in which the entity is represented in a computer [see the opening article by Selfridge-Field in this issue]. The subject of musical representation for computers is comprehensively covered in Selfridge-Field (1997). The schemes in that book take account, to some extent, of the multidimensional quality of musical information, whose parameters may involve pitch (both chromatic and diatonic), duration, loudness, timbre, notational and other features. To these schemes should be added the GPIR representation described in Cambouropoulos (1996A).
It is assumed that the values of musical parameters to be compared in a matching task can be mapped to symbols in an alphabet as defined above.1 A polyphonic score may be treated either as a collection of melodic strings laid end-to-end, each of which is labelled as explicitly belonging to a certain voice (double-stopping is not considered in this discussion), or as a sequence of collections of notes lacking voice information, each of which occurs simultaneously within a certain time-slot. Musical data that is derived from a score in conventional musical notation can usually be treated as belonging to the first kind. Data derived from MIDI performance may lack explicit voice-information and thus will generally need to be treated as the second kind, while note-data originating from raw audio can at best be expected to conform to an error-ridden form of the second kind.

1 Such values, for example, could be MIDI note-numbers (0-127) for a base-12 representation of pitch. For the present purposes, rests are regarded as a special form of note, and are treated no differently; multiple rests contiguous within a voice are assumed to have been concatenated into single rests separating notes.

3.2.2 Symbols Used in Algorithmic Descriptions

The alphabet is denoted by S, the size of the alphabet is denoted by |S|, and the lengths of the text and pattern strings are usually denoted by n and m. The running time of an algorithm is expressed as a function of one or more of |S|, n and m, and is expressed using the O-notation (Cormen et al. 1990). Roughly, an algorithm with a running time of O(f(n, m, |S|)) is predicted to run in time proportional to f(n, m, |S|) or less. For instance, an algorithm with a running time O(m+n) might require no more than 50(m+n) microseconds of CPU time to execute on a particular computer, for different values of m and n. The constant of proportionality depends upon the computer on which the program is executing, the programming language in which the program is written, the compiler used, and indeed upon the algorithm itself: two algorithms for the same problem with equal running times in the O-notation (e.g., O(m+n)) may, ceteris paribus, have different running times in real life (e.g., 5(m+n) microseconds versus 50(m+n) microseconds).

An algorithm with running time O(n) or O(m+n) is said to be linear since its running time is a linear function of the size of the input. Since in most cases it would take O(m+n) steps simply to read the input, this running time is considered the best possible (neglecting the constant of proportionality).

3.2.3 Pattern-Match Categories

The problems discussed here are based on a two-dimensional model: pitch and duration. At some point in the future it will be necessary to consider the same problems in higher dimensions by adding other parameters, such as loudness or timbre. Here we consider two kinds of matches: exact matches and transposed matches.
In the first, specific pitch information is matched. In the second, intervallic information is matched. A special case of a transposed match is an octave-displaced match, where the matchability of a sequence of pitches may be obscured by octave displacement.
The algorithms we describe deal principally with exact matches. Transposed and octave-displaced matches may be found in some cases by suitably transforming the pattern and the score, and applying an exact matching algorithm to the transformed pattern and score.

3.2.4 Problem Typologies

The first series of problems (Types 1-9) deals with polyphonically structured musical entities in which there is explicit formal voice-leading information. The accompanying diagrams are intended as a generalized representation of a musical situation sufficient to clarify the text. In most cases we provide references to the musical examples given in score notation in the appendix to this article. The examples are intended to show some possible uses of these techniques in real musical applications. The horizontal axis refers to the time domain and the vertical one to that of pitch; musical pitch relates to frequency, but it is important to recognize that chromatic and diatonic pitch-standards are not usually interchangeable (see Cambouropoulos 1996A, 1996B, 1997).

The second series of problems (Types 10-13) concerns entities in which the voice-leading information is unspecified.

3.3 Exact-Match Algorithms (Category 1)

3.3.1 Type 1: Exact Matching

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice), find whether an exact subsequence occurs in one of the voices.

Figure 1. Searching for an exact sequence of notes in any one voice. See Musical Example 1 on p. 90.
This problem can be solved by the Knuth-Morris-Pratt algorithm (1977) in linear time. This algorithm preprocesses the pattern by finding prefixes of the pattern which occur in the pattern itself and by storing the results in a (failure-function) array. The algorithm proceeds by attempting to match the symbols of the pattern with those of the text one by one. When a mismatch occurs, the algorithm attempts to find the pattern in another position of the text using the failure-function value as a yardstick. The algorithm requires O(n+m) operations.

In practice, the best performing method is a variant of Boyer-Moore (1977, see also Hume and Sunday 1991). The Boyer-Moore algorithm, an extension of the Knuth-Morris-Pratt algorithm, also makes use of the failure function as well as the "mismatch" information. In the Boyer-Moore method, when a mismatch occurs, the algorithm shifts the pattern either until it matches the mismatched symbol or by the failure-function amount, whichever is greater. On some pathological inputs, the Boyer-Moore algorithm is slow, requiring O(nm + |S|) operations. However, the expected time complexity of the Boyer-Moore algorithm is linear and this bound matches the times achieved in practice.

It is also worth considering the Aho-Corasick automaton (1975), which is implemented in the grep command in UNIX systems (Hume 1988). The Aho-Corasick automaton is an extended version of the suffix-tree data structure (Weiner 1983). This automaton is particularly suitable for the case where several patterns are given and one wants to test whether any one of them occurs in the text.

It is worth noting that the running time of the Knuth-Morris-Pratt algorithm does not depend on the size of the alphabet, but the Boyer-Moore algorithm and the Aho-Corasick automaton depend on the alphabet. This alphabet dependence causes O(|S|) preprocessing operations in the case of the Boyer-Moore algorithm, but since in a musical context the alphabet is small, this cost is negligible in practice. On the other hand, the Aho-Corasick automaton needs O(|S|) outgoing edges for each state, which makes it inefficient in terms of storage space, but again, once the automaton is constructed, it is very efficient to use.

These algorithms can be modified to find transposed matches by matching the intervals between successive notes in the pattern to intervals between successive notes in the score. A survey of algorithms for this problem on parallel computers and systems can be found in Iliopoulos (1993).
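As an illustration of the failure-function idea and of the interval transformation for transposed matches, here is a minimal Python sketch of Knuth-Morris-Pratt search over sequences of note numbers. It is a textbook-style sketch, not the authors' software; the function names are hypothetical, and pitches are assumed to be MIDI note numbers within a single voice.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def kmp_search(text, pattern):
    """Return every start position of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    hits, k = [], 0
    for i, symbol in enumerate(text):
        while k > 0 and symbol != pattern[k]:
            k = fail[k - 1]          # fall back instead of rescanning the text
        if symbol == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]          # keep going to find overlapping matches
    return hits

def intervals(pitches):
    """Successive pitch differences; invariant under transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def transposed_search(voice, motif):
    """Positions in voice where motif occurs at any transposition."""
    return kmp_search(intervals(voice), intervals(motif))

voice = [60, 62, 64, 65, 67, 69, 71]
print(kmp_search(voice, [64, 65, 67]))
print(transposed_search(voice, [72, 74, 76]))
```

A position in the interval string is also the index of the first matching note in the voice. A one-note motif has no intervals, so this sketch reports no transposed matches for it.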
3.3.2 Type 2: Matching with Deletions

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice), find whether an exact subsequence occurs in one of the voices without preserving the duration times of each pattern.

Figure 2. Searching for a sequence of notes in any one voice without observing onset-times and duration. See Musical Example 2.

Approximate string-matching algorithms (Ukkonen 1985; Galil and Giancarlo 1990; Aho 1990) can be used to solve the above problem. Furthermore, software packages like BLAST and FASTA can be used in identifying such sequences of notes. But this problem appears to be simpler than approximate string-matching (e.g., there are no insertions), and therefore new, faster algorithms could be designed specifically for this type of problem, for both exact and transposed matches.

3.3.3 Type 3: Repetition Identification

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice), identify non-overlapping repeated patterns in different voices or the same voice.

Figure 3. Identifying repeated patterns (not given a priori) in a score. See Musical Example 3.
One of the best methods for solving this type of problem was given by Main and Lorentz (1984, 1985). The Main-Lorentz method is based on the failure function, and its running time is O(n log n) operations, where n is the length of the text. Crochemore (1981) also gave an O(n log n) algorithm for repetition identification based on "set partitioning" techniques. All the above algorithms can also find transposed repetitions.

It is not yet conclusively settled whether O(n log n) is the best running time for this problem. Crochemore (also 1981) showed that the Fibonacci strings can have of the order of n log n repetitions, and argued that since it could take of the order of n log n steps simply to write the output, his method was optimal. However, recent results (Iliopoulos, Moore, and Smyth [1996]; Iliopoulos and Smyth [1995]) showed that one can report all the repetitions in a Fibonacci string in linear time using special encodings. This raises the question whether it is possible to design an optimal (linear) algorithm for this problem whose output is a linear-sized encoded representation of all the repetitions in the string.

3.3.4 Type 4: Overlapping Repetition Identification

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice), identify repeated patterns that may overlap in different voices or the same voice.

Figure 4. Identifying overlapping repeats in a score. See Musical Example 4.

For computing possibly overlapping repetitions that occur locally somewhere in the score, Apostolico and Ehrenfeucht (1993) and Iliopoulos and Mouchard (in preparation) provide efficient methods (see also Guibas and Odlyzko, 1981). The Apostolico and Ehrenfeucht method makes use of
the suffix-tree data structure. It identifies all text positions where a repetition starts, as these positions are mapped on the same locations of the suffix-tree data structure. The Iliopoulos and Mouchard method is based on set partitioning similar to that in Crochemore (1981). Both methods proceed by identifying the gaps between the repetitions. Both algorithms are non-linear and it remains open whether these methods are optimal. A generalization of this problem occurs when one wants to cut up the score into segments and to test whether each segment can be tiled by repeated identical substrings. For example, the string abcabcabca is tiled by overlapping occurrences of abca occurring at positions 0, 3 and 6.
For this one can use the methods presented in Apostolico, Farach and Iliopoulos (1991), Iliopoulos and Park (1996), Mongeau and Sankoff (1990). The first algorithm is based on the failure function, the second is based on monitoring the gaps between repetitions, and the third is based on the string-border computation.2 All three algorithms are linear. Another variant occurs if the segmentation is not perfect and one wants to test whether a score segment can be tiled by repeated identical substrings, except perhaps the edges of the segment (due to imperfect cut-off). For example, cabcabcabc is tiled by overlapping and complete occurrences of abca occurring at positions 1 and 4, and by incomplete occurrences at position 7 and (non-existent) position -2.
2 A border is the largest prefix of a string that is also its suffix.
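The tiling example and the notion of a border can be checked with a short Python sketch. It is a naive illustration with hypothetical function names and does not reproduce the linear-time algorithms cited above; the border is computed with the standard failure function, and the cover test simply scans all occurrences of the candidate substring.

```python
def border_length(s):
    """Length of the longest proper prefix of s that is also a suffix of s."""
    fail = [0] * len(s)
    k = 0
    for i in range(1, len(s)):
        while k > 0 and s[i] != s[k]:
            k = fail[k - 1]
        if s[i] == s[k]:
            k += 1
        fail[i] = k
    return fail[-1] if s else 0

def occurrences(x, w):
    """All start positions of w in x (naive scan, overlaps allowed)."""
    return [i for i in range(len(x) - len(w) + 1) if x[i:i + len(w)] == w]

def is_cover(x, w):
    """True if complete, possibly overlapping occurrences of w tile all of x."""
    covered_to = 0
    for i in occurrences(x, w):
        if i > covered_to:           # a gap that no occurrence reaches
            return False
        covered_to = i + len(w)
    return covered_to == len(x)

print(occurrences("abcabcabca", "abca"))
print(is_cover("abcabcabca", "abca"))
print(is_cover("cabcabcabc", "abca"))
```

The second string fails the strict test because its edges hold only incomplete occurrences, which is the situation the imperfect-segmentation variant is meant to handle.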
For this one can use the methods given by Iliopoulos, Moore, and Park (1996) and Berkman, Iliopoulos, and Park (1996). Note that the methods in Apostolico, Farach, and Iliopoulos (1991), Iliopoulos and Park (1996), and Mongeau and Sankoff (1990) cannot be used in this case, as all three depend on a perfect segment edge, which here might have been cut off erroneously. The Iliopoulos/Moore/Park method is based on partitioning the set of text positions into subsets of positions where identical occurrences start, and then combinatorially evaluating whether tiling is possible. The Berkman method is a complex one, based on the suffix-tree data structure and computational geometry techniques (so-called outer envelope computation). Both methods require O(n log n) operations, but the Iliopoulos method is simpler and practical. All the above algorithms can also find transposed repetitions.

3.3.5 Type 5: Transformed Matching

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice) and a pattern, find whether the pattern occurs in one of the given sequences in either the original form, inversion, retrograde or retrograde inversion.

Figure 5. Identifying transformations: retrograde, inversion, retrograde inversion.

This type of problem can be tackled by three consecutive applications of one of the exact pattern-matching methods given for solving problem 1 above. The Aho-Corasick algorithm is the most suitable method for this type of problem, since it can handle all four patterns (original, inversion, retrograde and retrograde inversion) in one pass.
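A small Python sketch of the four pattern forms follows. It makes assumptions the text does not fix: pitches are MIDI note numbers held in lists, the inversion is taken about the first note of the pattern, and each form is searched with a naive sublist scan, one form at a time, rather than with a single Aho-Corasick pass. All function names are hypothetical.

```python
def inversion(pitches):
    """Mirror every pitch about the first note (assumed axis of inversion)."""
    axis = pitches[0]
    return [2 * axis - p for p in pitches]

def retrograde(pitches):
    """The pattern played backwards."""
    return pitches[::-1]

def transformed_forms(pitches):
    """The four forms of the pattern, keyed by name."""
    return {
        "original": list(pitches),
        "inversion": inversion(pitches),
        "retrograde": retrograde(pitches),
        "retrograde inversion": retrograde(inversion(pitches)),
    }

def find_sublist(text, pattern):
    """Naive exact search; any Type 1 method could be substituted here."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

def transformed_matches(voices, pattern):
    """List (form, voice index, position) for every exact occurrence."""
    hits = []
    for form, candidate in transformed_forms(pattern).items():
        for v, voice in enumerate(voices):
            for pos in find_sublist(voice, candidate):
                hits.append((form, v, pos))
    return hits

voices = [[55, 60, 62, 64, 67], [64, 62, 60, 58, 56]]
print(transformed_matches(voices, [60, 62, 64]))
```

Applying the same transformations to interval sequences instead of pitches would make the search independent of both transposition and the choice of inversion axis.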
3.3.6 Type 6: Distributed Matching

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice) and a pattern, find whether the pattern occurs distributed horizontally, either in one voice or across several other voices.

Figure 6. Pattern distributed across voices. See Musical Examples 1 and 4.

There is no specific method in the literature for this type of problem. A naive way of solving the problem is as follows: first design an Aho-Corasick automaton that accepts all the prefixes of the given pattern. Now view the input string as a two-dimensional array. Assume that we have computed and stored all the pattern prefixes that occur in the text up to a certain column. At the next unit of time (column), we either extend these prefixes by one symbol that occurs in that column or the prefix is voided. This leads to an algorithm requiring O(nm) operations, where n is the length of the text and m is the length of the pattern. This algorithm can also be adapted to find transposed matches with a complexity of O(nm) using |S| extra locations. It will be interesting to design a more efficient algorithm for this problem.

However, if the value of m is small enough so that an (m-1)-bit value may be stored in one word of the computer (on a 32-bit processor like the Pentium, m should be no more than 31), the above algorithm can be implemented to find exact matches very efficiently, in O(n + |S|) time, in fact.3 This implementation may not be suitable for very large alphabets. A straightforward implementation of the transposed matching variant of this algorithm seems considerably slower (about 5 times slower for patterns of length 4, and 13 times slower for patterns of length 16). We are investigating heuristics to speed up this algorithm.

3 Our implementation, for example, takes 0.12 seconds of processing time to search in a file of one million characters on a 200 MHz Pentium processor, and about half as long on a 300 MHz Sun UltraSPARC processor.
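The prefix-extension idea, with the set of live prefixes packed into the bits of one integer, can be sketched in Python as below. This is an illustrative bit-parallel sketch and not the authors' implementation: the score is assumed to be a list of time-slots, each a set of the note numbers sounding in any voice, successive pattern notes are assumed to fall in successive time-slots, and the function name is hypothetical. Python integers are unbounded, so the 31-note limit of a 32-bit word does not arise here.

```python
def distributed_matches(columns, pattern):
    """Return the time-slots at which a full occurrence of pattern ends,
    where each pattern note may come from any voice in its time-slot."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[s] has bit j set when pattern[j] == s.
    masks = {}
    for j, symbol in enumerate(pattern):
        masks[symbol] = masks.get(symbol, 0) | (1 << j)
    accept = 1 << (m - 1)
    state, ends = 0, []
    for t, column in enumerate(columns):
        present = 0
        for symbol in column:
            present |= masks.get(symbol, 0)
        # Bit j survives only if the prefix of length j was alive in the
        # previous slot and pattern[j] sounds in this slot; otherwise the
        # prefix is voided.
        state = ((state << 1) | 1) & present
        if state & accept:
            ends.append(t)
    return ends

columns = [{60, 48}, {55, 62}, {64, 52}, {60, 65}]
print(distributed_matches(columns, [60, 62, 64]))
```

Here the three pattern notes sit in different voices of successive slots, and the match is reported at the slot where it ends.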
3.3.7 Type 7: Chord Recognition

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice) and a pattern, the pattern has to be found with all its elements located in the same time-slot.

Figure 7. Pattern distributed as a chord.

This problem appears to be similar to the distributed matching one, but it is simpler in the sense that the pattern has to be found in the same time-slot. The exact version of this problem can be solved in time linear in the score length: for each time-slot in the score, if a note of the score is also a chord note, then we mark the chord note. A chord occurs in a time-slot only when all chord notes are marked. This algorithm may also be adapted for transposed matching with some reduction in performance. (This problem will need to be solved, for example, in matching versions of a lute or keyboard piece in which a chord is written in "broken" form in one version only. It has many other applications in harmonic analysis.)

3.3.8 Type 8: Approximate Matching

PROBLEM DESCRIPTION: Given a set of sequences of notes (one for each voice) and a pattern, find whether approximate occurrences (accommodating insertion, deletion, and/or replacement of notes) of the pattern occur in one of the sequences.

Algorithms for solving this type of problem can be found in Crochemore and Rytter (1996), Aho (1990), and Ukkonen (1985). In fact, in addition to considering the approximate string-matching problem, we would like to consider all of the above problems in the presence of errors, such as the identification of substrings that are duplicated to within a certain tolerance k. The tolerance is normally measured using distance metrics such as Hamming distance and edit distance.
The Hamming distance of two strings u and v is defined to be the number of substitutions necessary to get u from v (u and v have the same length). The edit distance is the (weighted) number of edit operations (taken from a set of permissible edit operations) needed to transform u to v. Edit-distance has been used by Mongeau and Sankoff to define a notion of musical similarity.

Given two strings A, B, we say that B is k-approximate to A if the distance is k. Given a string x and an integer k, a substring w of x is called a k-approximate period of x if x is partitioned into disjoint blocks such that every block has distance at most k from w. Given a string x and an integer k, a substring w of x is called a k-approximate cover of x if x is divided into disjoint or overlapping blocks such that every block has distance at most k from w. The most efficient algorithm for computing non-overlapping repetitions is that by Schmidt (1994). Computing overlapping repeats in the presence of errors is an open problem.

3.3.9 Type 9: Evolution Detection

PROBLEM DESCRIPTION: Given a sequence of notes and a pattern u, find whether there exists a sequence u_1 = u, u_2, ..., u_k in the score such that u_{i+1} begins to the right of u_i, and u_i and u_{i+1} have an edit-distance of one.4

There is no specific algorithm for this problem. Landau and Vishkin (1986) gave a simple algorithm for the "1-difference" problem (find all substrings of the text which have an edit-distance of 1 from the pattern) which runs in O(n log n) time. A naive way to solve this problem is to repeatedly apply the Landau/Vishkin algorithm to the text using u_i as the pattern, for i = 1, 2, ..., giving an approach with O(n^2 log n) worst-case running time.

Figure 8. Local approximations in search pattern trace gradual change (analogous to "evolution") in a motif. See Musical Example 6.

4 i.e., one can transform u_i to u_{i+1} by one insertion, deletion or replacement of a note.
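The two distance measures used in Types 8 and 9 can be sketched in Python as follows. The edit distance shown is the standard dynamic-programming computation with unit costs for insertion, deletion and replacement, which is an assumption: the text allows weighted operations, and this sketch is not the Landau/Vishkin algorithm. The function names are hypothetical.

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length sequences differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(u, v))

def edit_distance(u, v):
    """Minimum number of insertions, deletions and replacements
    (unit cost each) needed to transform u into v."""
    prev = list(range(len(v) + 1))       # distances from the empty prefix of u
    for i, a in enumerate(u, 1):
        curr = [i]
        for j, b in enumerate(v, 1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # replace or keep
        prev = curr
    return prev[-1]

motif = [60, 62, 64, 65]
variant = [60, 64, 65]                   # one note deleted
print(hamming_distance([60, 62, 64], [60, 63, 64]))
print(edit_distance(motif, variant))
```

Two successive forms of a motif in the sense of Type 9 would be a pair for which `edit_distance` returns 1, as it does for `motif` and `variant` here.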
A variant of this problem is to find whether the sequence u_1, ..., u_k exists in the case that u is not given. Another variant of the same problem is to be considered over a set of sequences of notes (one for each voice), where the u_i's are distributed over the voices. Another variant is where the edit-distance between u_i and u_{i+1} is allowed to be some fixed number larger than one. It is not known whether the Landau/Vishkin algorithm can be generalized to efficiently find occurrences of the pattern at a distance greater than 1, so even the naive solution described above does not work. Further investigation as to whether methods such as Landau and Vishkin (1988) and Galil and Park (1990) can be adapted to solve the above problems is needed.

The three remaining problem types to be discussed concern musical entities that are polyphonically unstructured with no explicit voice leading.

3.4 Inexact-Match Algorithms (Category 2)

3.4.1 Type 10: Unstructured Exact Matching

PROBLEM DESCRIPTION: Given a sequence of notes (voices unspecified) and a pattern, find whether the pattern occurs in the mixed set of sequences. A variant of this problem is to identify the pattern spread over time with several notes intervening.

Figure 9. Unstructured exact matching. See Musical Example 4.

The O(mn)-time algorithm for Problem-Type 6 can solve this problem as well. However, it will be interesting to design new methods for this type of problem and in particular to examine the relationship of this problem with that of distributed matching (Type 6), settling the question whether the lack of structure (voice information) affects the complexity of the problem.
3.4.2 Type 11: Unstructured Repetitions

PROBLEM DESCRIPTION: Given a sequence of notes (voices unspecified), identify repeated patterns (i.e., significant motifs; see Cambouropoulos and Smaill 1995) that may or may not overlap.

Figure 10. Unstructured repetitions. See Musical Examples 4, 5.

Algorithms for two-dimensional string matching (see Crochemore and Rytter 1996) may be useful for this problem. If overlaps are allowed, then methods for identifying repetitions can be found in Crochemore, Iliopoulos, and Korda (1997) and in Iliopoulos and Korda (1996A and 1996B).

3.4.3 Type 12: Unstructured Approximate Matching

PROBLEM DESCRIPTION: Consider the above problems in the presence of errors and find approximate matchings and repetitions.
Figure 11. Unstructured approximate matching.

Very little work has been done in this direction (see Crochemore and Rytter 1996). New algorithms for handling errors in two dimensions need to be designed.

3.5 Musical Examples

The following musical examples illustrate some of the problem types discussed above and identify the most appropriate algorithmic approaches to melodic matching.

In Example 1, the often-cited "stereophonic" theme of the fourth movement of Tchaikovsky's Symphony No. 6 is shown as a likely search query which could be encoded as a string (1a). Algorithms of Type 1 will find a match in the first violin part at Measure 104 (1b). They will not, however, find the theme at the beginning of the movement (1c), since the melodic string is distributed between first and second violins as indicated. To match this, algorithms of Type 6 are required.
a) a reasonable search-query:
b) measure 104:
c) beginning of movement:
Example 1. Tchaikovsky: Symphony 6, fourth movement: (a) the theme as perceived, (b) the theme as rendered in Measure 104, and (c) the "distributed" version of the theme at the opening of the movement.
In Example 2, the opening of the song (2a) is based on a motif (2b) which appears in the piano part (right hand, with partial echoes in the left hand) in imitation with the voice, but with different onset-times and durations.
a) Vocal score:
b) opening motif:
Example 2. Brahms: Deutsche Volkslieder, No. 15 ("Schwesterlein").
In Example 3, Bach's chorale prelude "Kyrie, Gott Vater in Ewigkeit," a fugal subject of 13 pitches, based on and opening with the first three notes of the chorale melody, appears in each of the voices. This non-overlapping repeated pattern will be detected by algorithms of Type 3. Note the partial match of seven pitches only at Measure 5, and the inverted form of the pattern at Measure 8.
"Polyphonic" transcriptions of a lute piece may be searched for a given pattern using algorithms of Type 1, for non-overlapping repetitions using algorithms of Type 3, or for overlapping repetitions with algorithms of Type 4. The apparently trivial, but perhaps structurally significant, two-note pattern e'-d' in Example 4, for example, can be found as non-overlapping repetitions (Type 3).
Example 3. J. S. Bach: organ chorale prelude "Kyrie, Gott Vater in Ewigkeit," BWV 669, from Clavierübung III.
The descending tetrachord, on the other hand, appears in several overlapping repetitions (Type 4), but the number of such occurrences depends on the nature of the transcription. Since voice leading is largely absent from lute tablature notation, transcriptions may thus be unsafe for such analysis, or at least a different approach is needed, such as using algorithms of Type 6. Matching a pattern given a priori in the "raw" pitch-data derived from tablature requires algorithms of Type 10. The detection of repetitions in such data requires algorithms of Type 11 (this is also true of matches in pitch derived from polyphonic audio input).
In Example 5a, we first see the separation of rhythmic and pitch data. Then, in the arrangements A and B, we see the melodic lines of Gaultier's piece revised (with important modifications of pitch and rhythmic strings) in two arrangements, one for spinet and one for violin or flute. In both cases the treble pitches have been transposed up an octave.
Example 4. Denis Gaultier: the allemande "Le Tombeau de L'Enclos" for lute: computer facsimile of original tablature, pitch and duration data, and excerpts from three divergent transcriptions. Ornament signs have been omitted.
a) Scores:
b) Edit operations on melodic strings extracted from scores:
Example 5. Denis Gaultier: "Le Tombeau de L'Enclos." Here (a) the pitch and duration data derived from the tablature shown in Example 4 are given first. Then two arrangements (A and B) contemporary with the tablature are shown. Next (b) the melodies derived separately from the treble and bass of A and B are shown, with markers to indicate edit operations (insertions, deletions, replacements, and temporal displacements).
c) All three versions reduced to note-data form: Lute
(Vertical lines between notes link those occurring in the same time-slot.)
Example 5, continued. In 5c, time-slice information is added to the recoupled treble and bass lines of the two arrangements shown in 5a.
In Example 5c, timing data is introduced in the representations of A and B, and coincident pitches are indicated by vertical lines.
In Example 6, the five successive entries (A-E) of one ricercare for lute are audibly related. They can be treated as stages in the evolution of a diatonic motif by a series of alterations of edit-distance 2 (where the deletion, insertion, replacement, and time-displacement operations each have a weight of 1). This requires algorithms of Type 9.
a) Selected entries (in their original sequence):
b) 'Evolution' of diatonic-pitch pattern
Example 6. Francesco da Milano: monothematic lute ricercar from the Cavalcanti Lutebook, f. 71v.
References
Aho, A. V., "Pattern Matching in Strings," in Handbook of Theoretical Computer Science, Volume A, Algorithms and Complexity, Elsevier, 1990.
Altschul, S., W. Gish, W. Miller, E. Myers, and D. Lipman, "A Basic Local Alignment Tool," Journal of Molecular Biology 219 (1991), 555-565.
Aho, A. V., and M. J. Corasick, "Efficient String Matching," Communications of the ACM 18 (1975), 333-340.
Apostolico, A., and A. Ehrenfeucht, "Efficient Detection of Quasiperiodicities in Strings," Theoretical Computer Science 119 (1993), 247-265.
Apostolico, A., M. Farach, and Costas S. Iliopoulos, "Optimal Superprimitivity Testing for Strings," Information Processing Letters 39 (1991), 17-20.
Apostolico, A., and Z. Galil, eds., Combinatorial Algorithms on Words, Springer-Verlag, NATO ASI Series, 1985.
Apostolico, A., and Z. Galil, eds., Pattern Matching Algorithms, Oxford University Press, 1997.
Berkman, O., Costas S. Iliopoulos, and K. Park, "String Covering," Information and Computation 123 (1996), 127-137.
Boyer, R., and J. Moore, "A Fast String Searching Algorithm," Communications of the ACM 20 (1977), 262-272.
Cambouropoulos, E., "A General Pitch Interval Representation: Theory and Applications," Journal of New Music Research 25 (1996A), 231-251.
Cambouropoulos, E., "A Formal Theory for the Discovery of Local Boundaries in a Melodic Surface," in Proceedings of the III Journées d'Informatique Musicale, Caen, France, 1996B.
Cambouropoulos, E., "The Role of Similarity in Categorisation: Music as a Case Study," in Proceedings of the Third Triennial Conference of the European Society for the Cognitive Sciences of Music (ESCOM), Uppsala, 1997.
Cambouropoulos, E., and A. Smaill, "A Computational Theory for the Discovery of Parallel Melodic Passages," in Proceedings of the XI Colloquio di Informatica Musicale, Bologna, Italy, 1995.
Cormen, T. H., C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, MIT Press, 1990.
Crawford, Tim, "Lute Tablature and Concordance Recognition: Special Problems Needing Special Solutions," read at the 15th Congress of the International Musicological Society, Madrid, 1992. This paper is available on the WWW using the URL http://www.kcl.ac.uk/kis/schools/hums/music/ttc/madrid.html
Crochemore, M., "An Optimal Algorithm for Computing the Repetitions in a Word," Information Processing Letters 12 (1981), 244-250.
Crochemore, M., Costas S. Iliopoulos, and M. Korda, "An Optimal Algorithm for Prefix String Matching and Covering," to appear in Algorithmica, 1997.
Crochemore, M., and W. Rytter, Text Algorithmics, Oxford Press, 1996.
Czumaj, A., P. Ferragina, L. Gasieniec, S. Muthukrishnan, and J. Traeff, "The Architecture of a Software Library for String Processing," to be presented at Workshop on Algorithm Engineering, Venice, September 1997.
Galil, Z., and R. Giancarlo, "Efficient Algorithms for Molecular Biology," in Sequences: Combinatorics, Compression, Security, Transmission, Springer-Verlag, 1990, pp. 59-74.
Guibas, L., and A. Odlyzko, "String Overlaps, Pattern Matching, and Non-transitive Games," J. Combinatorial Theory (Series A) 23 (1981), 183-208.
Galil, Z., and K. Park, "An Improved Algorithm for Approximate String Matching," SIAM Journal on Computing 19 (1990), 989-999.
Hume, A., "A Tale of Two Greps," Software Practice & Experience 18 (1988), 1063-1072.
Hume, A., and D. Sunday, "Fast String Searching," Software Practice & Experience 21 (1991), 1221-1248.
Iliopoulos, Costas S., "Parallel Algorithms for String Pattern Matching," in A. Gibbons and P. Spirakis, eds., Lectures on Parallel Computation, Volume 4, Cambridge University Press, 1993, 109-121.
Iliopoulos, Costas S., and M. Korda, "Parallel Two-dimensional Covering," in Proceedings of the Australasian Workshop on Combinatorial Algorithms (AWOCA '96), University of Sydney, 1996A, 62-75.
Iliopoulos, Costas S., and M. Korda, "Optimal Parallel Superprimitivity Testing on Square Arrays," Parallel Processing Letters 6 (1996B), 299-308.
Iliopoulos, Costas S., and L. Mouchard, "Fast Local Covers," in preparation.
Iliopoulos, Costas S., and K. Park, "An O(n log n) PRAM Algorithm for Computing All Seeds of a String," Theoretical Computer Science 164 (1996), 299-310.
Iliopoulos, Costas S., D. W. G. Moore, and K. Park, "Covering a String," Algorithmica 16 (1996), 288-297.
Iliopoulos, Costas S., D. W. G. Moore, and W. F. Smyth, "A Linear Algorithm for Computing the Squares of a Fibonacci String," in P. Eades and M. Moule, eds., Proceedings CATS '96, "Computing: Australasian Theory Symposium," University of Melbourne, 1996, 55-63.
Iliopoulos, Costas S., and W. F. Smyth, "A Fast Average Case Algorithm for Lyndon Decomposition," International Journal for Computer Mathematics 57 (1995), 15-31.
Iliopoulos, Costas S., and W. F. Smyth, "An On-line Algorithm for Computing a Minimal Set of k-covers of a String," submitted.
Knuth, D. E., J. Morris, and V. R. Pratt, "Fast Pattern Matching in Strings," SIAM Journal on Computing 6 (1977), 323-350.
Lipman, D., and W. R. Pearson, "Rapid and Sensitive Protein Similarity Search," Science 227 (1985), 1435-1441.
Landau, G. M., and U. Vishkin, "Introducing Efficient Parallelism into Approximate String Matching and a New Serial Algorithm," in Proc. Annual ACM Symposium on Theory of Computing, ACM Press, 1986, 220-230.
Landau, G. M., and U. Vishkin, "Fast String Matching with k Differences," Journal of Computer and Systems Sciences 37 (1988), 63-78.
Main, G., and R. Lorentz, "An O(n log n) Algorithm for Finding All Repetitions in a String," Journal of Algorithms 5 (1984), 422-432.
Main, G., and R. Lorentz, "Linear Time Recognition of Square Free Strings," in A. Apostolico and Z. Galil, eds., Combinatorial Algorithms on Words, Springer-Verlag, 1985, 271-278.
Mongeau, Marcel, and David Sankoff, "Comparison of Musical Sequences," Computers and the Humanities 24 (1990), 161-175.
Pearson, W., "Rapid and Sensitive Sequence Comparison with FASTP and FASTA," in Methods in Enzymology, Academic Press, 1990, 63-98.
Pearson, W., and D. Lipman, "Improved Tools for Biological Sequence Comparison," Proceedings of National Academy of Sciences of the USA 85 (1988), 2444-2446.
Schmidt, J. P., "All Shortest Paths in Weighted Grid Graphs and its Application to Finding All Approximate Repeats in Strings," in Proc. Fifth Symposium on Combinatorial Pattern Matching, Springer-Verlag Lecture Notes in Computer Science, 1994.
Selfridge-Field, Eleanor, ed., Beyond MIDI: The Handbook of Musical Codes, MIT Press, 1997.
Setubal, J., and J. Meidanis, Introduction to Computational Molecular Biology, PWS Publishing, 1997.
Ukkonen, E., "Algorithms for Approximate String Matching," Information and Control 64 (1985), 100-118.
Weiner, P., "Linear Pattern Matching Automaton," in Proc. of the 14th IEEE Symposium on Switching and Automata Theory, 1983, 1-11.
Sources for Musical Examples
Buch, 1990: D. Buch, ed., Denis Gaultier, La rhétorique des dieux (A-R Editions: Madison, 1990).
Cavalcanti Lutebook: Brussels, Belgium, Bibliothèque Royale (B-Br), MS II 275.
Darmstadt 18: Darmstadt, Germany, Stadtbibliothek, MS 18.
Gaultier, 1670: Denis Gaultier, Pièces de luth (Paris, 1670; repr. Minkoff: Geneva, 1978).
Perrine, 1680: Pièces de luth en musique...par le Sr. Perrine (Paris, 1680; repr. Minkoff: Geneva, 1982).
Rollin, 1996: M. Rollin and F.-P. Goy, eds., Oeuvres de Denis Gaultier (CNRS: Paris, 1996).
Suittes faciles: Suittes faciles (Amsterdam: Roger, 1701).
4 Sequence-Based Melodic Comparison: A Dynamic-Programming Approach
Lloyd A. Smith, Rodger J. McNab, and Ian H. Witten
Department of Computer Science
University of Waikato
Private Bag 3105
Hamilton, New Zealand
{las, rjmcnab, ihw}@cs.waikato.ac.nz
Abstract
Because of the importance of melodic comparison in musical analysis, several methods have been developed for comparing melodies. Most of these methods have been targeted at particular styles of analysis, and thus cannot be used for carrying out the melodic comparison necessary for other kinds of analysis. The observation that music is a sequence of symbols, however, leads to the use of general-purpose sequence-comparison algorithms for melodic comparison. This paper describes one such algorithm, dynamic programming, and discusses experiments in which dynamic programming is used to match input melodic phrases against a database of 9400 folk songs in order to retrieve closely matching tunes from the database.
4.1 Previous Research in the Field
Melodic comparison is a fundamental operation aimed at determining whether two melodies are, in some sense, similar. It has important practical applications whenever any kind of search for melodic patterns is performed. For that reason, several algorithms have been developed for melodic comparison. The earliest of these were focused on a particular application. Stech (1981), for example, developed a method for micro-analysis of melodies. His system searched for similarities within a song using a combination of exact pitch- and rhythm-matching modes. For pitch, the modes were (1) original sequence, (2) inversion, (3) retrograde, and (4) retrograde inversion. For rhythm, they were (1) original and (2) retrograde.
Dillon and Hunter (1982) developed a more flexible system using Boolean operators to search for combinations of pitches. Their system required that melodies be encoded in five ways: (1) by measured pitches, (2) by unmeasured pitches, (3) by measured stressed pitches, (4) by unmeasured stressed pitches, and (5) by pitches with phrase information. Using this representation, a user could specify, for example, that a melody's third stressed pitch must be the dominant note of the scale.
In an attempt to gain still more flexibility, Logrippo and Stepien (1986) suggested using cluster analysis to determine the similarity of melodies. The difficulty they faced was in specifying a distance measure that makes sense with music. Their metric was based on the percentage of occurrence of notes within melodies, thus ignoring the sequential ordering of notes.
Perhaps the most general method of musical pattern matching was developed by Cope (1991). Cope's Experiments in Musical Intelligence (EMI) system looks for similar patterns across works of a given composer, in an attempt to find commonly occurring musical signatures relevant to that
composer. These signatures are used by EMI to compose music in the style of the composer. [See Cope's article, "Signatures and Earmarks," in this issue.]
All the above methods were developed specifically for comparing music and were tailored to the user's particular goal in performing the comparison. The observation, however, that a musical score is a sequence of symbols (or, in computer science terminology, a string) suggests that general sequence-comparison algorithms, developed in the fields of information systems and pattern recognition, might be productively applied to melodies and melodic patterns. This has the advantage of divorcing the technical aspects of how the comparison is performed from the features used to determine melodic similarity, thus freeing the researcher to focus on selecting an appropriate feature set and musical representation to use when conducting the comparison.
While many sequence-comparison algorithms have been developed, the most general is a technique known as dynamic programming, used, for example, in matching biological DNA sequences (Goad and Kanehisa, 1982). Dynamic programming has been independently adapted for musical applications by Mongeau and Sankoff (1990) and by Orpen and Huron (1992). This paper describes the use of dynamic programming in matching melodies; the particular approach followed is based on Mongeau and Sankoff (McNab et al., 1996).
4.2 Sequence-Comparison Using Dynamic Programming
Dynamic programming is based on the concept of edit distance; when matching sequence a against sequence b, the edit distance is the cost of changing sequence a (the source string) into sequence b (the target string). The sequences may consist of virtually any type of symbol including alphabetic characters (input to spelling checkers), numeric feature vectors (used in pattern recognition), DNA symbols (for gene identification), or musical pitches and rhythms.
The cost of changing the source string into the target is calculated in terms of edit operators. The standard edit operators used in dynamic programming are replacement, insertion, and deletion. To change the sequence ABCDE into ABDEF, for example, requires leaving the letters A, B, D and E (i.e., replacing them with themselves), deleting C and inserting F.
Each of these operations incurs a cost, or weight. If the cost of replacing a letter is the "alphabetic distance" (where B - A = 1, C - A = 2, and so forth, and the cost of replacing a letter with itself is 0), and the cost of inserting or deleting is 1, then the cost of changing ABCDE into ABDEF is 2; that is, the distance between ABCDE and ABDEF is 2. If the cost of insertion is equal to the cost of deletion, and the cost of replacement is the same regardless of which is source and which is target (i.e., the distance from A to B equals the distance from B to A), then the algorithm is symmetric, meaning that it does not matter which is the source and which is the target: changing ABCDE into ABDEF incurs the same cost as changing ABDEF into ABCDE.
In the above example, other sequences of operations are possible. For example, we could replace C with D, replace D with E, and replace E with F. This has the desired effect (ABCDE is transformed into ABDEF), but now the total cost is 3 instead of 2. In the dynamic-programming paradigm, it is assumed that the goal is to find an optimal way of transforming the source into the target, so that the lowest cost is returned.
Furthermore, a dynamic programming algorithm, in addition to returning the distance between two sequences, also returns an optimal alignment between the two sequences; it shows how the sequences match in order to calculate the best score. Sometimes more than one alignment may yield the same score. In that case, the algorithm has no way of knowing which alignment is "correct." All alignments returning the same score are considered equally good.
4.3 Algorithms for Melodic Comparison
Equation 1 expresses the dynamic programming algorithm for matching sequence a against sequence b (Sankoff and Kruskal, 1983).
$$d_{i,j} = \min\left[\, d_{i-1,j} + w(a_i, \emptyset),\; d_{i-1,j-1} + w(a_i, b_j),\; d_{i,j-1} + w(\emptyset, b_j) \,\right] \qquad (1)$$
where
$1 \le i \le$ length of sequence $a$
$1 \le j \le$ length of sequence $b$
$w(a_i, b_j)$ is the cost (or weight) of substituting element $a_i$ with $b_j$
$w(a_i, \emptyset)$ is the cost of inserting $a_i$
$w(\emptyset, b_j)$ is the cost of deleting $b_j$
$d_{i,j}$ is the accumulated distance of the best alignment ending with $a_i$ and $b_j$
Initial conditions are:
$$d_{0,0} = 0 \qquad (2)$$
$$d_{i,0} = d_{i-1,0} + w(a_i, \emptyset), \quad i \ge 1 \qquad (3)$$
$$d_{0,j} = d_{0,j-1} + w(\emptyset, b_j), \quad j \ge 1 \qquad (4)$$
While there are many ways of implementing a dynamic programming algorithm, the process may be viewed in three stages. The first stage generates a local score matrix which holds the distances between each element of the source string and each element of the target; in equation 1, the local score matrix provides the values for references to $w(a_i, b_j)$. The second stage uses the local score matrix to generate a global score matrix which holds the cost of a complete match. In equation 1, $d$ refers to cells in the global score matrix; $d_{i,j}$ is the cell currently being computed, while $d_{i-1,j}$, $d_{i-1,j-1}$, and $d_{i,j-1}$ are previously computed cells. The third stage traces back from the final score to determine the alignment that generated that score.
The process may be illustrated by stepping through a match of the two incipits, or phrase beginnings, shown in Figure 1 (after Leppig, 1987).
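Equations 1 to 4 translate fairly directly into code. The following is a minimal sketch rather than the authors' implementation: the function names (`edit_distance`, `alphabetic_distance`, `unit_cost`) are illustrative, and a single insertion/deletion weight function is assumed for both $w(a_i, \emptyset)$ and $w(\emptyset, b_j)$. It is exercised on the ABCDE/ABDEF letter example from Section 4.2.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def edit_distance(
    a: Sequence[T],
    b: Sequence[T],
    w_replace: Callable[[T, T], float],
    w_indel: Callable[[T], float],
) -> float:
    """Return the accumulated distance for matching source a against target b."""
    m, n = len(a), len(b)
    # Global score matrix, with one extra row and column for the empty prefix.
    d = [[0.0] * (n + 1) for _ in range(m + 1)]   # d[0][0] = 0 (equation 2)
    for i in range(1, m + 1):                     # equation 3
        d[i][0] = d[i - 1][0] + w_indel(a[i - 1])
    for j in range(1, n + 1):                     # equation 4
        d[0][j] = d[0][j - 1] + w_indel(b[j - 1])
    for i in range(1, m + 1):                     # equation 1
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + w_indel(a[i - 1]),
                d[i - 1][j - 1] + w_replace(a[i - 1], b[j - 1]),
                d[i][j - 1] + w_indel(b[j - 1]),
            )
    return d[m][n]


def alphabetic_distance(x: str, y: str) -> float:
    """Replacement cost from Section 4.2: B - A = 1, C - A = 2, and so on."""
    return float(abs(ord(x) - ord(y)))


def unit_cost(_: str) -> float:
    """Insertion or deletion of any letter costs 1."""
    return 1.0


distance = edit_distance("ABCDE", "ABDEF", alphabetic_distance, unit_cost)
```

With these weights the call reproduces the distance of 2 derived in the text, and swapping source and target gives the same value, as expected for symmetric costs.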
Figure 1. A sample pair of melodies for comparison.
Figure 2 illustrates the local score matrix associated with the match. Innsbruck, ich muss dich lassen, the source string, is represented going from top to bottom on the left, and Nun ruhen alle Wälder, the target, is from left to right across the top. In the example, pitch is represented by MIDI pitch number (Middle C = 60) and rhythm by duration in terms of sixteenth-note units, so the first note of Nun ruhen is 69-4: A4 is MIDI 69, and a quarter note has a duration of four sixteenth notes. In this example, distances are calculated by taking the absolute value of the difference in MIDI note number, then adding half the absolute value of the difference in duration (this particular relative weighting of pitch and rhythm was chosen simply because it yields an unambiguous alignment for these two melodic phrases). Each cell of the matrix holds the distance
between one note in Innsbruck and one note in Nun ruhen, so the top leftmost cell holds the distance between the first note of each melody. Each row holds the distances between one note of Innsbruck and all notes of Nun ruhen. The top row, for example, holds the distances between the first note of Innsbruck and each note of Nun ruhen. Each column holds the distances between one note of Nun ruhen and all notes of Innsbruck.
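The local score matrix described above can be sketched as follows. The distance function follows the text (absolute MIDI pitch difference plus half the absolute duration difference), but the note lists are hypothetical placeholders and are not the melodies of Figure 1; only the first target note, 69-4, is taken from the text. The names `Note`, `local_distance`, and `local_score_matrix` are illustrative.

```python
from typing import List, Tuple

# A note is (MIDI pitch number, duration in sixteenth-note units).
Note = Tuple[int, int]


def local_distance(x: Note, y: Note, rhythm_weight: float = 0.5) -> float:
    """Pitch difference in semitones plus weighted duration difference."""
    return abs(x[0] - y[0]) + rhythm_weight * abs(x[1] - y[1])


def local_score_matrix(source: List[Note], target: List[Note]) -> List[List[float]]:
    """Rows follow the source melody, columns follow the target melody."""
    return [[local_distance(s, t) for t in target] for s in source]


# Hypothetical fragments, for illustration only.
source = [(65, 4), (65, 4), (67, 4), (69, 6)]
target = [(69, 4), (69, 2), (67, 2)]
matrix = local_score_matrix(source, target)
```

Here `matrix[0]` is the row of distances between the first source note and every target note, and `matrix[i][j]` supplies the value of $w(a_i, b_j)$ used when the global score matrix is filled.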
Figure 2. Local score matrix.
Each cell in the global score matrix represents the best possible score matching the two strings up to that point. It is filled by sweeping through the local distance matrix, moving to the right and down (toward the end of both melodies) and calculating the distance for each cell based on cells immediately to the left of, above, or diagonal (left and above) to the target cell. The easiest way to meet this requirement is to proceed row by row, left to right from top to bottom. Each cell is calculated, following equation 1, by adding the cost of an insertion to the cell to the left, adding the cost of a deletion to the cell above, adding the local distance in the target cell to the cell diagonally left and above, then taking the minimum of those three figures.
Figure 3 shows the global score matrix for the match of the two incipits from Figure 1. Note that there is one more row and one more column than shown by the local score matrix. The extra row and column reflect the cost of inserting or deleting all notes of one melody, relative to the other. The bold numbers in Figure 3 show the trace of the best path through the global score matrix; this path gives the alignment between the two melodies, shown in Figure 4. Because initial good paths through the matrix may lead to poorly scoring regions, the trace is done looking back from the bottom right corner, where
the final score occurs, to the top left, where the match began. If the alignment between the two melodies is not needed, then the third stage (tracing the alignment) may be skipped.
Figure 3. Global score matrix.
It is often useful, however, for the researcher to view the alignment to make sure that the match "makes sense," even when the alignment is not needed for the application. An alignment that does not make sense may be generated by an inappropriate setting of replacement or insertion/deletion costs, or by an inappropriate data representation; these issues are discussed below.
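The third stage, tracing back from the bottom right corner of the global score matrix, can be sketched as below. This is an illustrative sketch on the letter example of Section 4.2, not the authors' program: the function name `align` and the tie-breaking order (diagonal first, then the cell above, then the cell to the left) are assumptions, and when several alignments share the best score it simply reports one of them.

```python
from typing import List, Optional, Tuple

# (source element or None, target element or None)
Pair = Tuple[Optional[str], Optional[str]]


def align(a: str, b: str, indel: float = 1.0) -> Tuple[float, List[Pair]]:
    """Return the distance and one optimal alignment of source a with target b."""

    def w(x: str, y: str) -> float:
        return float(abs(ord(x) - ord(y)))  # "alphabetic distance"

    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + indel,
                          d[i - 1][j - 1] + w(a[i - 1], b[j - 1]),
                          d[i][j - 1] + indel)

    # Trace back from the bottom right corner to the top left.
    pairs: List[Pair] = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + w(a[i - 1], b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))   # elements aligned with each other
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + indel:
            pairs.append((a[i - 1], None))       # source element with no partner
            i -= 1
        else:
            pairs.append((None, b[j - 1]))       # target element with no partner
            j -= 1
    pairs.reverse()
    return d[m][n], pairs


score, alignment = align("ABCDE", "ABDEF")
```

For this input the alignment pairs A, B, D and E with themselves and leaves C (in the source) and F (in the target) unpaired, which is the reading given in Section 4.2. Printing the pairs is a quick way to check that a match "makes sense" for a given choice of costs.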
Figure 4. Alignment generated by Figure 3 score matrix.
Figure 5. Note alignment generated by different match parameters.
4.4 Additional Operations for Comparing Music
The example discussed above applies standard dynamic programming to matching melodic sequences. It is possible, however, to introduce additional edit operators for a specific application. Mongeau and Sankoff (1990) define fragmentation and consolidation for matching melodies. Fragmentation allows a note to match notes of lesser duration that combine to cover the same time span; a quarter note, for example, may match two eighth notes or a dotted eighth followed by a sixteenth. Consolidation is the inverse of fragmentation: four sixteenth notes, for example, may consolidate to match a quarter note.
Orpen and Huron (1992) make a distinction based on whether notes are repeated in one string or the other, or both. They define two versions of insertion and deletion (one for repeated notes and one for nonrepeated notes) and four versions of replacement, based on whether the replaced note repeats in neither string, both strings, in the source or in the target string. This allows the deletion of a repeated note to incur a cost of 0.
4.5 Effect of Music Representation and Match Parameters
While dynamic programming provides a powerful and flexible tool for comparing melodic patterns, the comparison must be guided, and the result interpreted, on the basis of sound musical judgement. In particular, the researcher must be aware of the implications of both edit operator costs and of the musical data representation. Figure 5 shows one of several equal-scoring alignments of Innsbruck with Nun ruhen if the cost of insertions and deletions is set to 2 and pitch and rhythm differences are weighted equally. In this case, only four notes "match," while all others are deletions (deletions from Innsbruck or insertions into Nun ruhen; either perspective is valid). In setting the parameters, the researcher must consider what the musical question is, and what feature(s) form the basis for similarity measurement.
In addition, it should be kept in mind that the value of the score returned by the comparison is meaningless except as it relates to ranking different comparisons; if the cost of all operations is doubled, the score of the corresponding match will double, but that does not make the two melodies less similar.
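The fragmentation and consolidation operators of Section 4.4 can be added to the recurrence as two extra families of terms. The sketch below is an illustration under stated assumptions, not Mongeau and Sankoff's published weighting: the cost function `group_cost`, the insertion/deletion cost of 2, and the limit `max_group` on how many notes may be grouped are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

Note = Tuple[int, int]  # (MIDI pitch, duration in sixteenth-note units)


def replace_cost(x: Note, y: Note) -> float:
    return abs(x[0] - y[0]) + 0.5 * abs(x[1] - y[1])


def group_cost(single: Note, group: Sequence[Note]) -> float:
    """Hypothetical cost of matching one note against several shorter notes."""
    pitch = sum(abs(single[0] - g[0]) for g in group)
    rhythm = abs(single[1] - sum(g[1] for g in group))
    return pitch + 0.5 * rhythm


def distance_with_grouping(a: List[Note], b: List[Note],
                           indel: float = 2.0, max_group: int = 4) -> float:
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = min(d[i - 1][j] + indel,
                       d[i - 1][j - 1] + replace_cost(a[i - 1], b[j - 1]),
                       d[i][j - 1] + indel)
            # Fragmentation: a[i-1] matches the k target notes ending at b[j-1].
            for k in range(2, min(max_group, j) + 1):
                best = min(best, d[i - 1][j - k] + group_cost(a[i - 1], b[j - k:j]))
            # Consolidation: the k source notes ending at a[i-1] match b[j-1].
            for k in range(2, min(max_group, i) + 1):
                best = min(best, d[i - k][j - 1] + group_cost(b[j - 1], a[i - k:i]))
            d[i][j] = best
    return d[m][n]


quarter = [(60, 4)]
eighths = [(60, 2), (60, 2)]
```

Under these assumed weights, `distance_with_grouping(quarter, eighths)` is 0 because the quarter note fragments exactly into the two eighth notes, whereas setting `max_group=1` disables both operators and the same pair scores 3.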
Mongeau and Sankoff (1990) base their replacement costs on musical consonance: the cost of replacing one note by another at the interval of a fifth, for example, is lower than the cost of replacing the same note at an interval of a second. This assignment of costs indicates the focus of their test application, identifying theme variations. Mongeau and Sankoff arrived at the particular values they used for their parameters (costs of replacement, insertion and deletion, as well as relative weighting of pitch and rhythm) through an empirical investigation of theme variation clustering in Mozart's well-known variations on a theme, K. 300e. While Mongeau and Sankoff's algorithm, like most based on dynamic programming, is symmetric, Orpen and Huron (1992) point out that a nonsymmetric algorithm may be more appropriate for matching a melody with an embellished variant.
Attention must also be given to the musical representation. In the above examples (illustrated by Figures 2-5), rhythm was represented by duration in sixteenth-note units. A representation in eighth-note units would be equally valid, but the difference between a half note and a quarter note would now be 2 instead of 4. Other representations may be equally valid, but harder to manage in terms of relative differences. The Humdrum **kern format (Huron, 1994) represents rhythm by the reciprocal of note value: a whole note is 1, half note 2, quarter note 4, and so forth. This method has the advantage of easily handling triplets or other time values that "defeat the meter," but the researcher must carefully consider how to carry out a rhythmic match.
Finally, music researchers may encounter matching algorithms in "canned" form. The Humdrum music analysis suite (Huron, 1994), for example, provides a dynamic programming comparison algorithm in the form of simil (Orpen and Huron, 1992).
Such utilities can empower the user to ask complex questions concerning musical similarity, but it is the user's responsibility to understand the assumptions made by the programmer and what control, if any, he or she has over parameters and/or data representation.
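One way to carry out a rhythmic match on reciprocal-coded durations is to convert them to a common additive unit first. The sketch below converts a **kern-style reciprocal value to sixteenth-note units using exact fractions, so that triplets remain exact. The function name `kern_to_sixteenths` is illustrative, and the handling of augmentation dots is an assumption added for completeness; the text above describes only the reciprocal itself.

```python
from fractions import Fraction


def kern_to_sixteenths(reciprocal: int, dots: int = 0) -> Fraction:
    """Convert a reciprocal duration (1 = whole, 2 = half, 4 = quarter, ...)
    to a duration in sixteenth-note units."""
    if reciprocal <= 0 or dots < 0:
        raise ValueError("reciprocal must be positive and dots non-negative")
    base = Fraction(16, reciprocal)
    # Each augmentation dot adds half of the previously added value.
    return base * (2 - Fraction(1, 2 ** dots))


half = kern_to_sixteenths(2)              # 8 sixteenths
quarter = kern_to_sixteenths(4)           # 4 sixteenths
triplet_eighth = kern_to_sixteenths(12)   # 4/3 of a sixteenth
```

After conversion, the difference between a half note and a quarter note is 4, matching the sixteenth-note representation used in Figures 2-5, while a triplet eighth keeps an exact value instead of being rounded.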
4.6 A Sample Application: Retrieving Tunes from Folk-Song Databases
In order to get a feel for the issues involved in sequence-based melody comparison, we have implemented a comparison program and used it to perform an extensive simulation based on the task of retrieving tunes from a database of folk songs. The database incorporated two folk song corpora: the Digital Tradition (Greenhaus, 1994) and the Essen database (Schaffrath, 1992). [See also David Bainbridge's article on MELDEX in this issue.] At the time we downloaded it, the Digital Tradition contained approximately 1700 tunes, most of North American origin. The Essen database contains approximately 8300 melodies, about 6000 of which are German folk songs and 2200 are Chinese; most of the remainder are Irish. Nearly 400 duplicates (the same tune with a different name and, often, in a different key) were removed from the Essen database, and 14 duplicates were removed from the Digital Tradition. Because our music display program does not currently display tuplet note values, the approximately 200 songs containing tuplets were also removed. Combining the two sources and eliminating the three songs common to both gave us a database of 9400 melodies. There are just over half a million notes in the database, with the average length of a melody being 56.8 notes.
4.6.1 Experimental Method
The experiments focused on the number of notes required to identify a melody uniquely under various matching conditions. The dimensions of matching include whether an intervallic or merely a directional (up/down/same) contour is used as the pitch representation; whether comparison is based on pitch only or is inclusive of rhythm; whether matching is exact or approximate, with the possibility of note deletion, insertion or substitution; and whether note fragmentation and consolidation are allowed.
Based on these dimensions, we have examined exact matching of: interval and rhythm; contour and rhythm; interval regardless of rhythm; contour regardless of rhythm;
and approximate matching of: interval and rhythm; contour and rhythm.
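The two pitch representations compared in these experiments can be derived from a list of MIDI pitch numbers as sketched below. The function names and the letters U, D and S for up, down and same are illustrative choices, not taken from the authors' program.

```python
from typing import List


def intervals(pitches: List[int]) -> List[int]:
    """Signed interval in semitones between each pair of successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches: List[int]) -> List[str]:
    """Directional contour: U (up), D (down) or S (same) for each step."""
    return ["U" if step > 0 else "D" if step < 0 else "S"
            for step in intervals(pitches)]


melody = [60, 62, 62, 59]
transposed = [p + 2 for p in melody]
```

Both representations are unchanged by transposition, so `intervals(melody)` equals `intervals(transposed)`; the contour discards the interval sizes and therefore carries less information per note.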
4.6.2 User Input
For each matching scheme we imagine a user singing the beginning of a melody, comprising a certain number of notes, and asking for it to be identified in the database. For this application, the most appropriate distance measure, when musical interval is used as the pitch representation, is the absolute distance in semitones, and that is the metric used for matching pitch. Rhythm is scored using the difference in duration between notes.
If the tune sung by the user is in the database, how many other melodies that begin this way might be expected? We examined this question by randomly selecting 1000 songs from the database, then matching patterns ranging from 5 to 20 notes against the entire database. This experiment was carried out both for matching incipits and for matching sequences of notes embedded within songs; in order to match embedded sequences of notes, it is necessary to modify the dynamic programming starting condition so that deletions preceding the match of the pattern receive a score of 0. The only change is to equation 4 (Galil and Park, 1990), which is replaced by:
$$d_{0,j} = 0, \quad j \ge 1 \qquad (5)$$
For each sequence of notes, we counted the average number $c_n$ of "collisions," that is, other melodies that match. Fragmentation and consolidation are relevant only when rhythm is used in the match; in these experiments, fragmentation and consolidation were allowed for approximate matching but not for exact ones.
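The modified starting condition of equation 5 can be sketched as follows for sequences of intervals in semitones. Two points are assumptions beyond what the text states: song notes following the matched passage are also treated as free, by taking the minimum over the final row, and a unit insertion/deletion cost is used. The function name `best_embedded_distance` is illustrative.

```python
from typing import Sequence


def best_embedded_distance(pattern: Sequence[int], song: Sequence[int],
                           indel: float = 1.0) -> float:
    """Lowest cost of matching the pattern against any passage of the song."""
    m, n = len(pattern), len(song)
    prev = [0.0] * (n + 1)                      # equation 5: d[0][j] = 0
    for i in range(1, m + 1):
        curr = [prev[0] + indel] + [0.0] * n    # equation 3 for the first column
        for j in range(1, n + 1):
            curr[j] = min(prev[j] + indel,
                          prev[j - 1] + abs(pattern[i - 1] - song[j - 1]),
                          curr[j - 1] + indel)
        prev = curr
    # Assumption: the match may end at any position in the song.
    return min(prev)


song = [0, 2, 2, 1, -3]
exact = best_embedded_distance([2, 2, 1], song)
near = best_embedded_distance([2, 3, 1], song)
```

An exact occurrence inside the song scores 0 regardless of how many notes precede it, and a pattern with one interval a semitone off scores 1 under these costs.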
Figure 6. Number of collisions for different lengths of input sequence when matching incipits. From left to right: exact interval and rhythm; exact contour and rhythm; exact interval; exact contour; approximate interval and rhythm; approximate contour and rhythm.

4.6.3 Results of Retrieval Experiments

Figure 6 shows the expected number of collisions plotted against n, for each of the matching regimes when queries are matched at the beginnings of songs. The number of notes required to reduce the collisions to any given level increases monotonically as the matching criteria weaken. All exact-matching schemes require fewer notes for a given level of identification than all approximate-matching methods. Within each group the number of notes decreases as more information is used: if rhythm is included, and if interval is used instead of contour. For example, for exact matching with rhythm included, if contour is used instead of interval two more notes are needed to reduce the average number of items retrieved to one. The contribution of rhythm is also illustrated at the top of Figure 6, which shows that, if rhythm is included, the first note disqualifies a large number of songs. It is interesting that melodic contour with rhythm is a more powerful discriminator than interval without rhythm; removing rhythmic information increases the number of notes needed for unique identification by about three if interval is used and about six if contour is used. A similar picture emerges for approximate matching except that the note sequences required are considerably longer.
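For the exact-matching regimes, a collision count of the kind plotted in Figure 6 could be computed along the following lines. This is a sketch only: the four-tune database, its titles and the function name are hypothetical, and each tune is assumed to be stored as a list of semitone intervals.

```python
from typing import Dict, List


def count_collisions(database: Dict[str, List[int]],
                     query_title: str, n: int) -> int:
    """Number of other tunes whose first n symbols equal the query's."""
    key = database[query_title][:n]
    return sum(
        1
        for title, seq in database.items()
        if title != query_title and seq[:n] == key
    )


# Hypothetical toy database of interval sequences.
toy_db = {
    "tune_a": [0, 7, -2, -1, -2],
    "tune_b": [0, 7, -2, 2, 1],
    "tune_c": [0, 7, 2, -1, -2],
    "tune_d": [2, 2, 1, -1, -2],
}

for n in (2, 3, 4):
    print(n, count_collisions(toy_db, "tune_a", n))
# prints: 2 2, then 3 1, then 4 0
```

Averaging this count over many randomly chosen query tunes, for each n, gives one curve of the kind shown in the figure.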
An important consideration is how the sequence lengths required for retrieval scale with the size of the database. Figure 7 shows the results, averaged over 1000 runs, obtained by testing smaller databases extracted at random from the collection. The number of notes required for retrieval seems to scale logarithmically with database size.
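One informal way to see why logarithmic growth is plausible (an illustrative assumption, not a result of the experiments) is to suppose that each additional note of the query reduces the set of matching melodies by a roughly constant factor r > 1. With N melodies in the database, the expected number of collisions after n notes, and the query length at which it falls to one, are then approximately:

```latex
c_n \approx \frac{N}{r^{\,n}}, \qquad c_n = 1 \;\Longrightarrow\; n \approx \frac{\log N}{\log r}
```

Under this reading, weaker matching criteria correspond to a smaller r, and hence to more notes for the same database size.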
Figure 7. Number of notes for unique tune retrieval in databases of different sizes. Lines correspond, from bottom to top, to the matching regimes listed in Figure 3.

Figure 8 shows the expected number of collisions for matching embedded note patterns. As expected, all matching methods require more notes than searches conducted on the beginnings of songs. In general, an additional three to five notes are needed to avoid collisions, with approximate matching on contour now requiring, on average, over 20 notes to uniquely identify a given song.

Figure 8. Number of collisions for different lengths of input sequence when matching embedded patterns. Lines correspond, from left to right, to those in Figure 3.
4.6.4 Timing Considerations

The computational complexity of dynamic programming is O(n × m) for an n-note search pattern matched against a database of m notes, meaning that the time taken for performing the comparison increases multiplicatively with the size of either the database or search pattern. For comparing single pairs of melodies, or for searching small databases, dynamic programming is fast enough for interactive applications. For the folk song database used in these experiments, running on a Macintosh PowerPC 8500 with a clock speed of 120 MHz, a search for an embedded theme of 20 notes takes 23.7 seconds if fragmentation and consolidation are allowed, and 16.7 seconds if those operations are not allowed. While this may be reasonable performance, much larger databases (a million folk songs, for example, or a thousand symphonies) might take an unacceptably long time to search.

There are approximate string matching algorithms that have the potential to speed up approximate searches, and we are currently investigating one, based on the UNIX agrep text searching utility (Wu and Manber, 1992), that represents the state of the match, at any point, by a bit vector stored as a binary number. This method is not as flexible as dynamic programming; it allows only a predetermined number of errors (insertions, deletions, replacements) and all such errors are weighted equally: a semitone difference is scored the same as any other interval. Furthermore, state matching does not return the alignment between the two melodies. It does, however, have the benefit of a very low execution time which is constant, for a given database, for all search patterns up to the length of a machine word in bits. Figure 9 shows timing results for dynamic programming, with and without fragmentation and consolidation, and state matching for search patterns of up to 20 notes on the folk song database. State matching takes half a second to match all the search patterns, making it about 47 times faster, for a 20-note search pattern, than dynamic programming with fragmentation and consolidation.
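The bit-vector idea can be sketched in the spirit of the bit-parallel recurrence of Wu and Manber (1992). This is not the agrep code or the program used for the timings above: the function name is hypothetical, melodies are assumed to be lists of semitone intervals, and Python integers stand in for machine words (so the word-length limit mentioned above does not arise here).

```python
from typing import Dict, List


def bitap_search(pattern: List[int], text: List[int],
                 max_errors: int = 0) -> List[int]:
    """Return the text positions at which a match of the pattern ends,
    allowing up to max_errors insertions, deletions or replacements."""
    m = len(pattern)
    if m == 0 or max_errors >= m:
        raise ValueError("need a non-empty pattern longer than max_errors")
    full = (1 << m) - 1    # keep only the m state bits
    accept = 1 << (m - 1)  # bit set when the whole pattern has matched
    masks: Dict[int, int] = {}
    for i, symbol in enumerate(pattern):
        masks[symbol] = masks.get(symbol, 0) | (1 << i)
    # state[d] has bit i set if pattern[:i+1] matches a suffix of the
    # text read so far with at most d errors.
    state = [(1 << d) - 1 for d in range(max_errors + 1)]
    ends = []
    for pos, symbol in enumerate(text):
        mask = masks.get(symbol, 0)
        prev = state[:]
        state[0] = ((prev[0] << 1) | 1) & mask
        for d in range(1, max_errors + 1):
            exact = ((prev[d] << 1) | 1) & mask
            inserted = prev[d - 1]
            replaced_or_deleted = ((prev[d - 1] | state[d - 1]) << 1) | 1
            state[d] = (exact | inserted | replaced_or_deleted) & full
        if state[max_errors] & accept:
            ends.append(pos)
    return ends


pattern = [2, 2, -4]
print(bitap_search(pattern, [0, 5, 2, 2, -4, 1]))                # [4]
print(bitap_search(pattern, [0, 5, 2, 3, -4, 1], max_errors=1))  # [4]
```

Each text symbol is processed with a fixed number of shifts and logical operations per error level, whatever the pattern length, which is where the constant execution time comes from. As the text notes, every error costs the same and no alignment is returned.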
Figure 9. Time required to search for different length patterns.
An alternative way of speeding retrieval based on embedded patterns is to automatically identify themes using an offline matching method, storing those themes in a separate collection indexed to the original database. Because themes are relatively short (in comparison to an entire composition), the theme database can be searched much more quickly. In addition, it may be unnecessary to search for embedded patterns in a database containing only themes.

4.7 Conclusion

The focus of this paper is on the use of general sequence comparison techniques for comparing melodies. Such methods have a number of applications. We have already discussed the use of sequence-based melody comparison for retrieving tunes from a database of 9400 folk melodies. In other work, we have used dynamic programming melody comparison as part of a sight-singing tutorial program (Smith and McNab, 1996). Other applications include identifying theme variations (Mongeau and Sankoff, 1990), studying the use of motifs by a given composer or group of composers, analysing and tracing folk song variants (Dillon and Hunter, 1982), performing copyright searches, or any of a number of other musicological studies; indeed, the list is endless. In all cases, however, the researcher must remember that the analysis tool is blind and must be guided by sound musical knowledge and judgement.

The primary method discussed here, dynamic programming, operates by finding the cost required to transform one string, the source, into another, the target. The cost is defined in terms of edit operations, the standard ones being insertion of a note, deletion of a note, or replacement of one note by another. Each of these operations carries its own cost, or weight, and other operations, specific to the application, may be defined, such as fragmentation and consolidation (Mongeau and Sankoff, 1990).
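As a minimal sketch of the three standard edit operations (fragmentation and consolidation are omitted), the following assumes melodies given as MIDI-style note numbers, a replacement cost equal to the pitch difference in semitones, and a single hypothetical weight `indel` for insertion and deletion. It also walks back through the score table to list which operations were used.

```python
from typing import List, Optional, Tuple

Operation = Tuple[str, Optional[int], Optional[int]]


def align(source: List[int], target: List[int],
          indel: float = 2.0) -> Tuple[float, List[Operation]]:
    """Cost of transforming source into target, with the operations used."""
    n, m = len(source), len(target)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * indel
    for j in range(1, m + 1):
        score[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = min(
                score[i - 1][j - 1] + abs(source[i - 1] - target[j - 1]),
                score[i - 1][j] + indel,   # delete a source note
                score[i][j - 1] + indel,   # insert a target note
            )
    # Trace back from the final cell to recover one best-scoring path.
    ops: List[Operation] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + abs(source[i - 1] - target[j - 1])):
            kind = "match" if source[i - 1] == target[j - 1] else "replace"
            ops.append((kind, source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + indel:
            ops.append(("delete", source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, target[j - 1]))
            j -= 1
    ops.reverse()
    return score[n][m], ops


total, operations = align([60, 62, 64, 65], [60, 62, 63, 64, 65])
print(total)  # 2.0
print([kind for kind, _, _ in operations])
# ['match', 'match', 'insert', 'match', 'match']
```

In this hypothetical example the cheapest transformation inserts the one extra note rather than replacing two notes and inserting a third.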
In order to apply these operations, the algorithm first creates a local score matrix, which reflects the distance of all notes of the source melody from all notes of the target. The local score matrix is then used to generate the global score matrix by applying the dynamic programming operations in such a way that the score at each cell of the matrix is minimized. The score in the final (bottom rightmost) cell is the overall score for the match; an alignment, showing how the melodies
match, is then generated by tracing back from the final score to find the best scoring path through the global score matrix.

While dynamic programming is fast enough for comparing single pairs of melodies, or for searching small databases, it can be too slow for performing searches of large databases, or for exhaustively comparing all pairs of melodies even in relatively small databases. For that reason, we are currently investigating state matching, another general sequence-based matching method, which is not as flexible as dynamic programming but is much faster.

References

Cope, David, "Recombinant Music: Using the Computer to Explore Musical Style," Computer 24/7 (1991), 22-28.
Dillon, M., and M. Hunter, "Automated Identification of Melodic Variants in Folk Music," Computers and the Humanities 16 (1982), 107-117.
Galil, Z., and K. Park, "An Improved Algorithm for Approximate String Matching," SIAM J. Comput. 19/6 (1990), 989-999.
Goad, W. B., and M. I. Kanehisa, "Pattern Recognition in Nucleic Acid Sequences," Nucleic Acids Research 10/1 (1982), 247-263.
Greenhaus, D., "About the Digital Tradition," http://www.deltablues.com (1994).
Huron, David. The Humdrum Toolkit: Reference Manual. Stanford University: Center for Computer Assisted Research in the Humanities, 1994.
Leppig, M., "Musikuntersuchungen im Rechenautomaten," Musica 41/2 (1987), 140-150.
Logrippo, Luigi, and Bernard Stepien, "Cluster Analysis for the Computer-assisted Statistical Analysis of Melodies," Computers and the Humanities 20/1 (1986), 19-33.
McNab, Rodger J., Lloyd A. Smith, Ian H. Witten, C. L. Henderson, and S. J. Cunningham, "Towards the Digital Music Library: Tune Retrieval from Acoustic Input," Proc. ACM Digital Libraries, Bethesda, Maryland (1996), 11-18.
Mongeau, Marcel, and David Sankoff, "Comparison of Musical Sequences," Computers and the Humanities 24 (1990), 161-175.
Orpen, K. S., and David Huron, "Measurement of Similarity in Music: A Quantitative Approach for Nonparametric Representations," Computers in Music Research 4 (Fall 1992), 1-44.
Sankoff, David, and J. B. Kruskal (eds.). Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Reading, MA: Addison-Wesley, 1983.
Schaffrath, Helmut, "The EsAC Databases and MAPPET Software," Computing in Musicology 8 (1992), 66.
Smith, Lloyd A., and Rodger J. McNab, "A Program to Teach Sight-singing," Proc. Third Int. Conf. Technological Directions in Music Education, San Antonio, Texas (1996), 43-47.
Stech, David A., "A Computer-assisted Approach to Micro-analysis of Melodic Lines," Computers and the Humanities 15 (1981), 211-221.
Wu, S., and U. Manber, "Fast Text Searching Allowing Errors," Commun. ACM 35/10 (1992), 83-91.
5 Strategies for Sorting Melodic Incipits

John B. Howard
Widener Library, Room 188
Harvard University
Cambridge, MA 02138
howard@rism.harvard.edu

Abstract

The Répertoire International des Sources Musicales (RISM) has over a period of 30 years catalogued more than 300,000 works preserved in manuscript in both public and private libraries in more than 60 countries. This article explores some experimental work on the relative effectiveness of different kinds of thematic searching conducted both at the Zentralredaktion in Frankfurt (Germany) and at the U.S. RISM Office at Harvard University over the past decade.
Editors and performers of music are often confronted with questions of identity and attribution. Is the material at hand the first or final version of a work? Is it in the original key? Is the scoring original? Are performing indications (e.g., dynamics markings, bowing signs, ornaments) authorial or editorial? Scholars who work with original materials and librarians who attempt to catalogue them are confronted with more basic questions of the same general kind. Is the attribution of the work correct? Is the work complete as it stands? Is it a parody of another work? That is, does it incorporate thematic material used in a different context?

Among large-scale projects in music bibliography, none has been so centrally concerned with the task of collating thematic information from musical materials that are physically remote from each other as the manuscript-indexing project ("A/II") of the Répertoire International des Sources Musicales (RISM). Over a period of 30 years RISM has catalogued more than 300,000 works preserved in manuscript in both public and private libraries in more than 60 countries. Access to partially congruent portions of these holdings is available in two ways: (1) on CD-ROM by annual license from G. K. Saur Verlag and (2) via the World Wide Web site www.rism.harvard.edu/rism/DB.html. Since the original cataloguing was done by hand, retrospective conversion of musical incipits is an ongoing process. Originally, the CD-ROM covered materials from Europe, while an Internet connection to the Harvard University Library Hollis database gave access to listings of European manuscripts found in the U.S. Over the past year there has been extensive coalescence: the Web site now includes a total of 230,000 holdings from Europe and the U.S. Of these, 200,000 are found on the third issue of the CD-ROM.

5.1 Potentials and Practicalities

In principle, one value of such a fund of encoded incipits should be in providing answers to vexing questions of attribution.
Such questions have arisen frequently in the preparation of work lists to be included in the new collected Mozart and Haydn editions. Ten years ago the two projects between them listed 144 pieces of uncertain attribution. A meeting of editors was convened by the Akademie der Wissenschaften und der Literatur to
establish a common list of criteria on which to assess the validity of uncertain attributions. Ludwig Fischer found it "both astonishing and disappointing that there ha[d] been hardly any basic and method-oriented discussion of the whole problem." Georg Feder of the Haydn Institute maintained that investigations should be based on sources and their routes of transmission. Wolfgang Plath of the Mozart Edition hoped for the eventual establishment of a "logical, methodical, systematic approach" but noted that for the present the discovery of conflicting or clarifying information from diverse sources was largely a matter of chance.1

5.2 The Frankfurt Experience

5.2.1 Search Approach

The search for concordances among the 144 uncertain works and between them and the holdings of the RISM database in Frankfurt became a test case for evaluating the efficiency of diverse methods of melodic searching. In the two best known methods (the pitch index and the melodic interval index), "insignificant" information is normally excluded. The RISM A/II repertory, which is drawn largely from the seventeenth and eighteenth centuries, is almost entirely tonal. It is assumed in the case of a pitch index that by transposing all the examples to a common key (C Major or A Minor), concordances will be easily spotted. It is assumed in the case of a melodic-interval index that by concentrating on the defining events of a melodic contour, the cognitively "essential" similarities will surface. Long experience of those working with large collections of material had exposed the weaknesses of both approaches, and the decision was made to take into account instead the richness of "insignificant" elements of notation supported by the Plaine and Easie encoding language used by RISM. These elements include staccato dots, slurs, pauses, ties, grace notes, ornaments, redundant accidentals written in the musical text, and so forth.
1 This paragraph and the following text with the heading "The Frankfurt Experience" are based on Joachim Schlichte's article "Der automatische Vergleich von 83,243 Musikincipits aus der RISM-Datenbank: Ergebnisse – Nutzen – Perspektiven," Fontes artis musicae 37 (1990), 35-46, and are used by permission.
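The pitch-index idea described in Section 5.2.1 can be sketched as follows. This is an illustration only, not RISM's procedure: incipits are assumed to be MIDI-style note numbers with the tonic pitch class supplied by hand, register is discarded (as in a letter-name index), and the source names and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pitch_index_key(pitches: List[int],
                    tonic_pitch_class: int) -> Tuple[int, ...]:
    """Pitch classes relative to the tonic, i.e. the incipit transposed
    to a common key with register ignored."""
    return tuple((p - tonic_pitch_class) % 12 for p in pitches)


# Hypothetical incipits: (note numbers, tonic pitch class).
incipits = {
    "source_a": ([60, 64, 67, 72], 0),  # in C
    "source_b": ([67, 71, 74, 79], 7),  # same opening, in G
    "source_c": ([60, 62, 64, 65], 0),  # a different opening, in C
}

index: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
for name in sorted(incipits):
    pitches, tonic = incipits[name]
    index[pitch_index_key(pitches, tonic)].append(name)

for key in sorted(index):
    print(key, index[key])
```

Entries that share a key sort together, so the two transposed sources fall under one heading. Any notational detail left out of the key, such as a grace note or a rhythm, plays no part in the grouping, which is the kind of exclusion the text goes on to question.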
5.2.2 Procedure

To facilitate comparative searching, the Plaine and Easie Code is first translated into a meta-code in which every musical parameter is connected to each individual note it affects. These related parameters include octave number, note duration, beaming, position within a beamed group, and so forth. The meta-code (formulated by Norbert Böker-Heil) enables the incipits to be sorted efficiently. The sorted material can be retranslated into Plaine and Easie Code, from which incipits may be printed.

Human judgment is still required to interpret search results. For example, in the case of the Mozart mass known by the Köchel number C1.04, a parallel source was determined to exist in manuscript in a collection in Winterthur (Switzerland), and in Veszprém (Hungary) the same work is attributed to someone named "Müller." The Swiss source is considered inconsequential, since it is copied from a print of 1815 or 1821. The Hungarian source was copied in 1806, 15 years after Mozart's death, and its attribution may remain questionable. At least, however, Mozart scholars are now aware of its existence.2

2 This mass for four voices and numerous instruments was published in 1815 and was known in the nineteenth century as "Mozart's Twelfth Mass." Its legitimacy was first questioned in 1826.

5.2.3 Results

At the time this study was made, the RISM control database in Frankfurt contained slightly under 84,000 incipits. From the test database of 144 incipits of works traditionally attributed to Haydn or Mozart which are actually unattributed in "original" sources, concordances were found for 33. Numerous concordances from within the RISM database were also found. These were of three main types:
(1) concordances between works with common attributions
(2) concordances between sources in which the surname is identical but the forename is missing on one source
(3) concordances in which the musical incipits match but the composer attributions conflict

Roughly 5,000 identical works, or 6.17% of the sample, were found to have conflicting attributions (Type 3). At the same time, matches to attributed works were found for 292 works previously indexed locally as "anonymous." This result represented a 2% hit-rate, since within the database the number of anonymous attributions originally stood at 14,000.

From these results, we were then able to determine that when all of the "insignificant" details are included in the search data, musical incipits can be effectively sorted, and deviations from the sought match will be evident by the third, fourth, or (at the most) fifth item of information processed. This finding stands in marked contrast to the very long letter-name incipits given in such finding tools as Parsons (1975) and Barlow and Morgenstern (1948). In these cases, the string of "significant" information for which a match is sought is much longer.

5.2.4 Future Directions

The meta-code underlying RISM searching is malleable. If it were decided that other weightings (the selective inclusion or exclusion of particular parameters for specific repertories) were appropriate, varied sorts could be supported. If a user needed a melodic-interval index, it could be derived from the meta-code.

5.3 The Harvard Experience

Like its colleagues in Frankfurt, the U.S. RISM office found Plaine and Easie Code to be unwieldy for sorting procedures. Instead of using the meta-code adopted in Frankfurt, it pursued a different path: conversion of Plaine and Easie Code to the DARMS encoding language. DARMS encodes the exact
registral and rhythmic value of each musical event, whereas Plaine and Easie describes them contextually. The fact that DARMS represents pitch contextually, that is, by relative position on a musical staff, has both benefits and limitations. This form of representation facilitates transposition but is easily misled by identical melodies written with divergent clef signs.

5.3.1 Procedures

Using DARMS as the basis for the sorting of encoded musical excerpts, various methods of generalizing the encoded musical structure have been tried and the results compared to the known "ideal" result. The levels and types of generalization have been as follows:

(1) the complete encoding with all parameters
(2) the complete encoding transposed to a common pitch register3
(3) the encoding stripped of such features as beaming, bar lines, and fermatas
(4) the encoding stripped of the items given in (3) plus grace notes
(5) the encoding stripped of the items given in (3) and (4) plus rhythmic values, rests, and ties, with transposition to a common register (2) but with preservation of repeated notes

3 The initial DARMS value for pitch register is set to 0 and subsequent values are adjusted accordingly.

5.3.2 Results

The results of the various sorts performed on the sample data have confirmed in part the experience of the RISM Central Office in Frankfurt, but they also bring to light some particular problems that relate to the methodology applied, to repertory, and to the encoding system employed. Among works in the sample that have been attributed to known composers, a sort of Type 1 grouped together identical works transmitted in versions notated at the same pitch register. A sort of Type 2 brought together
the few known transposed versions of pieces. These results are in complete agreement with the Frankfurt experience.

The bulk of pieces in the data sample derive from the general repertory of tunes, however, and the two initial types of sorts were far less effective in recognizing the relationships of pieces presented differently on the page.4 Of the 13 known occurrences of a song called "Roslin Castle," for example, only four sorted together in a Type-1 sort, and only six in a Type-2 sort [see Figure 1]. In both cases, musically similar manifestations of the tune were separated from the most common manifestation by such different tunes as "The White Cockade" and "General Washington's March" [12 versions of the first are shown in Figure 2]. Sorts of Types 3 and 4 both failed to bring together additional concordant melodies. Sorts of Type 5 yielded results similar to Types 1 and 2.

4 The fact that tunes identified by title were often grouped in the sort result with completely unrelated tunes is ignored in the discussion here. The phenomenon is well known to music bibliographers using a wide array of tools.

Figure 1. Thirteen iterations of the incipit of "Roslin Castle" from listings by RISM. Note in particular the variations in (a) key, (b) placement of dotted notes, and (c) presence or absence of grace notes.

Figure 2. Twelve iterations of the incipit of "The White Cockade" from listings by RISM. Note the differences (a) of key, (b) of high versus low beginnings, (c) of shorter versus longer note values, (d) of grace notes, and (e) of beamings [which are represented in Plaine and Easie Code].

5.3.3 Future Goals

The ineffectiveness of all strategies in bringing together different manifestations of pieces from the tune repertory suggests that different methodologies must be developed for determining musical identity for the different repertories represented by the project. Some desirable elements of these new approaches can be suggested by examining the data involved in the U.S. experiments to determine which parameters served to separate variants of the same tune in the sort result. At this juncture, the features that seem most responsible for separating pieces known to be based on common thematic material are these:

(1) details that relate to notational conventions: clefs, bar line placement, two tied notes vs. one dotted note, two eighth rests vs. a quarter rest, etc.
(2) variations in the rhythm of specific figures, such as a dotted eighth/sixteenth rendered as two eighths in another source
(3) small intervallic differences in the initial pitches of the melodic line
(4) the use of rests and repeated notes, particularly in separating vocal and instrumental renditions of the same tune

Although each procedure is conceptually simple, each poses complex problems if the query confronts the broader issue of manipulating particular values to facilitate grouping or separation. For example, if one wanted to match the syncopated version of a tune with a conventionally accented version, one would need to "regularize" durations. While the U.S. RISM office has been successful in its exploration of principles involved in generalizing data for searching and sorting, it has also been mindful that encoding methods can enable or cripple searching strategies, since there is no way to find what is not there.

References

Barlow, Harold, and Sam Morgenstern. A Dictionary of Musical Themes. New York: Crown Publishers, 1948.
Parsons, Denys. The Directory of Tunes and Musical Themes. Cambridge: Spencer Brown, 1975.
6 Signatures and Earmarks: Computer Recognition of Patterns in Music

David Cope
Porter College #88
University of California
Santa Cruz, CA 95064
howell@cats.ucsc.edu

Abstract

In this article I attempt to distinguish between two types of patterns I find extremely important in understanding and analyzing music: signatures and earmarks. Using these patterns, my computer program Experiments in Musical Intelligence (EMI) has created a number of compositions arguably in the style of various classical composers (Cope 1994, 1997). I briefly describe how, using pattern-matching techniques, EMI finds and then uses signatures and earmarks in its compositional processes.
6.1 Signatures

Musical signatures (a term for motives common to two or more works of a given composer) can aid in the recognition of musical style (Cope 1987, 1991a and b, 1996). For example, signatures can tell us what period of music history a work comes from, the probable composer of that work, and so on. Signatures are typically two to five beats in length and are often composites of melodic, harmonic, and rhythmic elements. Signatures usually occur between four and ten times in any given work. Variations often include transposition, diatonic interval alteration, rhythmic refiguring, and registral and voice shifting. With few exceptions, however, such variations do not deter recognition.

Figure 1 shows examples of a musical signature used by Mozart and how signatures can change over time, providing a microcosm of stylistic development. Figure 1a shows a signature in a rudimentary form over a simple harmonic statement. This is followed by a slightly more elaborate version from a sonata composed four years later (Figure 1b). In Figure 1c, the melody has been truncated with a more active version of the accompaniment. Both this and the following version (Figure 1d, which shows a more elegant and developed melody) were composed around the same time: six years after the version shown in Figure 1b. In Figure 1e, the melodic line closely matches the version shown in Figure 1b but is an octave lower, has slight rhythmic differences, and has a more developed accompaniment. The final version shown here (Figure 1f), composed fifteen years after Figure 1a, has a fully developed melody and accompaniment and is by far the most complex of those shown. All of these versions of the signature appear in cadences in fast movements.

6.1.2 Signatures and Stylistic Analysis

Observing a signature change and develop over time can provide valuable insights into how a given style matures and how, to some extent, one can differentiate by ear the various periods in the life of a composer.
While such style analysis cannot supplant other forms of harmonic and melodic or structural analyses, it can augment them in important ways. Interestingly, most forms of standard analysis tend to articulate the ideas and materials composers have in common. Studies of signatures, on the other hand, tend to define what makes each composer unique.
Figure 1. Versions of a Mozart signature from his (a) Piano Sonata K. 280 (1774), mvt. 1, mm. 107-8; (b) Piano Sonata K. 330 (1778), mvt. 3, m. 110; (c) Piano Concerto K. 453 (1784), mvt. 1, mm. 162-3; (d) Piano Concerto K. 459 (1784), mvt. 2, mm. 114-6; (e) Piano Sonata K. 547a (1788), mvt. 1, mm. 63-4; (f) Piano Sonata K. 570 (1789), mvt. 2, m. 4.
Figure 2. Versions of a signature found in Mozart's Piano Sonata K. 284, mvt. 2: (a) m. 16; (b) m. 30; (c) m. 46; (d) m. 69; (e) m. 92.

Placement of signatures can also be an extremely important strategy in classical period structural logic. Figure 2 shows five examples of a Viennese signature used by Mozart which is described generally in my book Computers and Musical Style (Cope 1991, pp. 157-169). It can be analyzed as a premature tonic bass note under a dominant chord or a late-sounding dominant over a tonic pedal point. Each iteration of the signature shown here appears at the end of a period whose first phrase does not cadence with the signature. Note how the spacing and texture (number of dissonant notes) provide differing tensions to the signatures with the final occurrence providing the most prominent weight of the five shown. The tension and location variances help delineate the rondo form in this movement with tonic-function signatures weighted more strongly towards dissonance than the dominant-function signatures. Hence signatures may not only be location-dependent at the local phrase level, but may also be structurally dependent according to section endings. Experienced listeners can hear such subtle balance and know when composers or machine-composing programs misplace or leave out such important signatures in given styles.

6.1.3 Identifying Inter-work Signatures

Pattern-matching for signatures entails discovering musical patterns that occur in more than one work of a composer. This requires the development of a program that not only recognizes that two patterns are exactly the same, a
fairly trivial accomplishment, but also that two patterns are almost the same. EMI pattern-matches by means of controllers that define how closely a pattern must resemble another for it to register as a match. If these controllers are resolved too narrowly, signatures will not pass. If the controllers are resolved too broadly, patterns that do not identify a composer's style will be allowed to pass. If these controllers are set with discrimination, only signatures will pass.

Looking back at Figure 1 provides a simple example of pattern-matching to find signatures. Imagine that a pattern-matching program is attempting to determine whether the melody in Figures 1a and 1c constitutes a signature. It is improbable that a nonmusical pattern-matcher would find these two melodies very similar. They share only two common pitches (D and E). Also, Figure 1c has fewer notes than Figure 1a. To the ear, however, these are easily identifiable as simple variations of the same pattern. Musical pattern-matchers can discover the similarities in these two patterns. This is initially accomplished by reducing pitches to (base-12) intervals. For Figures 1a and 1c this produces [1 -4 -3 -2 -1] and [1 -3 -4 2] respectively. Notice how using intervals shows more similarity in the two patterns than using pitches. Introducing controllers that determine interval accuracy proves the patterns to be similar enough to qualify as a signature. Allowing, for example, an interval to match when it differs by just one step in either direction enables the program to match the first three intervals. Such variations are obviously very common in tonal music, where composers, in order to remain within a diatonic framework when sequencing, often substitute whole steps for half steps and vice versa. Matching Figures 1a and 1c requires a second controller to ignore the direction of one-note motion (i.e., interval of 2 matching interval of -2).
A third controller will allow the extra note in the first pattern. Thus, an allowance for these variations helps to make the pattern-matcher find musical similarities. Figure 3 shows a signature from Chopin's mazurkas with more elaborate rhythmic and pitch variations. Here, few of the examples have either the same number of melodic notes or intervals. Matching such subtle differentiations can be quite difficult even for a sophisticated pattern-matcher because the controller settings necessary to recognize these variants may also produce numerous non-variants. To reduce this noise in the output, the program must factor elements such as the exact placement of the variations. Such precision allows the EMI pattern-matcher to discover signatures which are aurally recognizable but numerically very different.
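
The three controllers described for Figures 1a and 1c can be illustrated with a minimal Python sketch. It is not EMI's actual code: the function names, the parameter names (`tolerance`, `ignore_step_direction`, `max_extra`), and the simple left-to-right alignment are assumptions made for illustration, and only the two interval lists are taken from the text.

```python
def to_intervals(pitches):
    """Reduce pitches (in semitones, e.g. MIDI numbers) to successive intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_matches(x, y, tolerance=1, ignore_step_direction=True):
    """Compare two intervals under the first two (hypothetical) controllers."""
    # Controller 1: accept intervals that differ by at most `tolerance` semitones.
    if abs(x - y) <= tolerance:
        return True
    # Controller 2: for stepwise motion (at most a whole step), ignore direction.
    if ignore_step_direction and abs(x) <= 2 and abs(y) <= 2:
        return abs(abs(x) - abs(y)) <= tolerance
    return False


def patterns_match(a, b, tolerance=1, ignore_step_direction=True, max_extra=1):
    """Controller 3: tolerate up to `max_extra` surplus intervals at the end
    of the longer pattern, then compare the remaining intervals pairwise."""
    longer, shorter = (a, b) if len(a) >= len(b) else (b, a)
    if len(longer) - len(shorter) > max_extra:
        return False
    return all(
        interval_matches(x, y, tolerance, ignore_step_direction)
        for x, y in zip(longer, shorter)
    )


figure_1a = [1, -4, -3, -2, -1]  # interval list quoted in the text
figure_1c = [1, -3, -4, 2]       # interval list quoted in the text

print(patterns_match(figure_1a, figure_1c))               # True
print(patterns_match(figure_1a, figure_1c, tolerance=0))  # False
```

Tightening any one controller (for example `tolerance=0`) makes the two patterns fail to match, which mirrors the remark that controllers set too narrowly let no signatures pass.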
Figure 3. Various forms of a signature in Chopin mazurkas: (a) Op. 6, No. 1, m. 1; (b) Op. 6, No. 4, mm. 9-10; (c) Op. 7, No. 2, m. 3; (d) Op. 17, No. 4, m. 13; (e) Op. 17, No. 4, m. 15; (f) Op. 17, No. 4, m. 29; (g) Op. 50, No. 1, m. 18.

6.2 Earmarks

There are other patterns in music which indicate, at least to the initiated listener, important attributes about their host works besides style. These patterns are more generalized than signatures. I call such patterns earmarks since they are identified most easily by ear and tend to mark specific structural locations in a work. Earmarks can tell us what movement of a work we are hearing. Earmarks can also foreshadow particularly important structural events. Earmarks may even contribute to our expectations of when a movement or work should climax or end and therefore enhance our appreciation or lack of appreciation of that work. In general, earmarks have significant impact on the analysis of structure beyond thematic repetition and variation.
Recognition of such patterns in music is not new. The study of musical signs and symbols, for example, reveals that certain gestures in works can be traced to sources beyond their context and often beyond their composer (Agawu 1991, Gjerdingen 1988). Frequently, however, such analysis takes the form of recognition of quotation or quasi-quotation so that the semantic understanding of a work is enhanced. Earmarks, while not falling outside the scope of the study of signs and symbols, have little such semantic meaning. Earmarks, like signatures, are integrated seamlessly into their immediate environment and have syntactic rather than semantic value. Earmarks are icons holding little interest in themselves but great interest for what they reveal about structure and what they can foretell about what is to follow.

6.2.1 Earmarks as Gestural Information

In general, variations of earmarks point out their gestural nature. They can typically be described in general terms such as a trill followed by a scale or an upward second followed by a downward third, and so on. Trills and scales, however, abound in the works of many composers, even in combination. The distinguishing characteristic of earmarks is their location: they appear at particular points in compositions, just after some important event and just before another important event. Thus, finding earmarks helps pinpoint important nexus points in music.

Figure 4 shows five examples of an earmark found in Mozart's piano concertos Numbers 6 through 27: the tonic 6/4s which precede the trills at the ends of expositions and recapitulations just prior to cadenzas. Note how the first measure in each example varies yet adheres to simple scalar designs. While these examples are simple, possibly obvious, it is clear that with even limited listening experience, the ear can become accustomed to the harmonic and melodic substance of such material and begin to anticipate the eventual culmination of the exposition at their first occurrence and the cadenza at their second.
Figure 4. An earmark from the first movements of Mozart's Piano Concertos: (a) K. 238, mm. 86-7; (b) K. 449, mm. 318-9; (c) K. 450, mm. 277-8; (d) K. 482, mm. 196-7; (e) K. 595, mm. 326-7.
6.2.3 Earmarks as an Aid to Structural Perception

Misplaced earmarks can cause a disruption in an educated listener's perception of the apparent musical structure. For example, earmarks which do not precede anticipated sections, occur out of sequence, or are ill-timed can cause rifts in the antecedent-consequent motion so important to musical structure. There is, for example, an oboe concerto attributed to Haydn (Hoboken VIIg:C1) which has numerous earmarks scattered about the first movement, earmarks which typically foreshadow or simply precede the cadenza. None of the uses of this earmark subsequently moves to the cadenza, which results in a movement which sounds scattered at best, and at worst appears to constantly stumble about, unsure of where it should go. Given that Haydn was acutely aware of the use of this earmark, at least if one can judge by the numerous examples of his concertos which use this earmark correctly, it would seem unlikely that this work's attribution is correct and more likely that it is apocryphal.
Figure 5. An earmark from the fourth movement of EMI's Symphony (mm. 82-85), arguably in the style of Mozart.
Figure 5 shows a fourth-movement earmark found in many of Mozart's symphonies and here found in EMI's symphony in the style of Mozart. Three different fourth movements of Mozart's symphonies were used in the analysis which helped produce this music. Each of Mozart's examples possessed a version of this earmark at roughly the same structural point in the movement. Again, as with the earmark found in the concertos, the music here is pedestrian and not particularly distinguished. However, upon hearing, it does stand out from the surrounding material with enough integrity to make its structural importance known to the ear. The orchestration, inherited in the case of the EMI-Mozart from the original scores upon which its composition was based, so blends with the surrounding material that the earmark seems all but lost in the forest. The character, here of a descending scale rather than the leaps which otherwise bookend it on either side, forms the basis on which one can anticipate, in this case, the recapitulation of the first theme in the original key.

References

Agawu, V. Kofi. Playing with Signs. Princeton: Princeton University Press, 1991.
Cope, David, "An Expert System for Computer-Assisted Music Composition," Computer Music Journal 11/4 (Winter, 1987): 30-46.
Cope, David. Computers and Musical Style. Madison, WI: A-R Editions, 1991.
Cope, David, "Recombinant Music," Computer 24/7 (July, 1991): 22-28.
Cope, David. Bach by Design. Sound Recording. Baton Rouge, LA: Centaur Records, 1994.
Cope, David. Experiments in Musical Intelligence. Madison, WI: A-R Editions, 1996.
Cope, David. Classical Music Composed by Computer. Sound Recording. Baton Rouge, LA: Centaur Records, 1997.
Gjerdingen, Robert. A Classic Turn of Phrase. Philadelphia: University of Pennsylvania Press, 1988.
II. TOOLS AND APPLICATIONS
7 A Multi-scale Neural-Network Model for Learning and Reproducing Chorale Variations

Dominik Hörnel
Institut für Logik, Komplexität und Deduktionssysteme
Universität Karlsruhe (TH)
Am Fasanengarten 5
D-76128 Karlsruhe, Germany
dominik@ira.uka.de

Abstract

This article describes a multi-scale neural-network system producing melodic variations in a style directly learned from musical pieces of baroque composers like Johann Sebastian Bach and Johann Pachelbel. Given a melody, the system invents a four-part chorale harmonization and improvises a variation of any chorale voice. Unlike earlier approaches to the learning of melodic structure, the system is able to learn and reproduce higher-order elements of harmonic, motivic, and phrase structure. Learning is achieved by using mutually interacting neural networks, operating on different time-scales, in combination with an unsupervised learning mechanism to classify and recognize these elements of musical structure. A complementary intervallic encoding allows the neural network to establish relationships between intervals and learned harmony. Musical pieces in the style of chorale partitas written by Pachelbel are the result.
7.1 Background

The investigation of neural information structures in music brings together disciplines such as computer science, musicology, mathematics, and cognitive science. One of its objectives is to find out what determines the personal style of a composer. Neural networks constitute one procedure which has been shown to be able to "learn" and reproduce style-dependent features from given examples. Various alternative techniques have been applied to style identification and simulation. These include signature identification (Cope; see preceding article), back-propagation networks (Ebcioğlu), and the use of neural networks in conjunction with other procedures (e.g., data-compression measures in the work of Witten, Conklin, et al.). The generation of chorale melodies and/or harmonizations in the style of Bach, for example, has been a central focus of the work of Ebcioğlu (1986 and 1992); Conklin and Witten (1990); Witten, Manzara, and Conklin (1994); and Hild, Feulner, and Menzel (1992).

When dealing with longer melodic sequences in, for example, folk melodies, models have considerable difficulties in learning structure. Instead they may produce new sequences that lack coherence (Feulner and Hörnel 1994; Mozer 1994). A principal reason for their failure may be that they are unable to capture higher-order structural features such as harmonies, motifs, and phrases simultaneously occurring on multiple time scales (Hörnel and Ragg 1996b).

7.1.1 Melodic Variation as a Musical Technique

The art of melodic variation has a long tradition in Western music. Many European composers, particularly in the eighteenth and nineteenth centuries, have written variations on a given melody, e.g., Mozart's keyboard variations K. 300e on the folk melody "Ah! Vous dirai-je, Maman" (also known as the children's song "Twinkle, Twinkle Little Star"). Underlying this tradition is the baroque genre of chorale variations. These were written for performance on the organ or harpsichord for use in the Protestant church. Even earlier, the secular keyboard partita had a presence in Italy through the works of Frescobaldi and various other keyboard composers, including his German pupil Johann Jakob Froberger. A prominent representative of this kind of composition in Germany at the end of the seventeenth century was Johann Pachelbel, who wrote seven sets
of variations on chorale melodies. Typically each work included from seven to twelve variations. Pachelbel was also praised for his variations on six secular arias published under the title Hexachordum Apollinis (1699). He subjected chorale melodies to a great many other compositional procedures. In his lifetime Pachelbel was known as "a perfect and rare virtuoso" whose works influenced many other composers such as Bach. Most of Pachelbel's chorale partitas can be seen as improvisations of an organist who invented "real-time" harmonizations and melodic variations on a given chorale melody. This method of composing is very similar to the behavior of the neural-network system presented here.

The problem of learning melodic variations with neural networks has been studied by Feulner and Hörnel (1994) and, for jazz improvisation, by Toiviainen (1995). Although these approaches produce some musically convincing local sections, the results in general lack global coherence.

7.1.2 A Neural Model for Variation Technique

The neural-network model we present here is able to learn global structure from musical examples by using two mutually interacting neural networks that operate on different time-scales. The main idea of the model is a combination of unsupervised and supervised learning techniques to perform the given task. Unsupervised learning classifies and recognizes musical structure; supervised learning is used for prediction in time. The model has been tested on simple children's song melodies (Hörnel and Ragg 1996b). In the following we will illustrate its practical application to a complex musical task: the learning of melodic variations in the style of Pachelbel.

7.2 Task Description

Given a chorale melody, the learning task is achieved in two steps:

(1) A chorale harmonization of the melody is invented.
(2) One of the voices of the resulting chorale is selected and provided with melodic variations.

Both subtasks are directly learned from musical examples composed by J. Pachelbel and performed in an interactive composition process which results in a chorale variation of the given melody. The first task is performed by
HARMONET, a neural-network system which is able to harmonize melodies in the style of various composers such as J. S. Bach. The second task is performed by the neural-network system described below.

7.2.1 Time Resolution of Variations

For simplicity we have considered only melodic variations consisting of four sixteenth notes for each quarter note of the melody. This is the most common variation type used by baroque composers and presents a good starting point for even more complex variation types, inasmuch as there are enough musical examples for training the networks, and because it allows the representation of higher-scale elements in a rather straightforward way.

7.2.2 Harmonet: A Neural System for Harmonization

HARMONET is a system producing four-part chorales in various harmonization styles, given a one-part melody. It solves a musical real-world problem on a performance level appropriate for musical practice. Its power is based on (1) an encoding scheme capturing musically relevant information and (2) the integration of neural networks and symbolic algorithms in a hierarchical system, combining the advantages of both. For a detailed account see Hild, Feulner, and Menzel (1992) or Hörnel and Ragg (1996a). Figure 1 shows the chorale melody "Alle Menschen müssen sterben" and a corresponding harmonization and variation of the soprano voice composed by Pachelbel.

7.3 A Multi-scale Neural-Network Model

The learning goal of this model is two-fold. On the one hand, the results produced by the system should conform to melodic and harmonic constraints such as the correct resolution of dissonances or the appropriate use of successive intervallic leaps. On the other hand, the system should be able to capture unique stylistic features from the learned examples, in this case melodic shapes preferred by Pachelbel. The adherence to musical rules and aesthetic conformance to the learning set can be achieved by a multi-scale neural-network model. The learning task is divided into subtasks. The procedure is illustrated in Figure 2.
Figure 1. The German chorale melody "Alle Menschen müssen sterben" (upper staff) and a chorale variation composed on the same melody by Pachelbel (lower staves).
Figure 2. The organization of the system for composing new chorale variations. Reading from the lower left-hand corner, each note of the melody (which has been harmonized by HARMONET) is passed to the supernet, which predicts the current motif class (MT) from a local window (see the Motif-Prediction window). By a similar procedure performed on a lower time-scale, the subnet predicts, on the basis of MT and the current harmony, the next note of the motif (see the Note-Prediction window). The result is returned to the supernet through the motivic-recognition component (see the right side of the chart) in order to be considered when the net computes the next motif class (MT+1).
(1) A chorale variation is considered on an abstract time-scale as a sequence of note groups (motifs). Each quarter note of the original melody is replaced by a motif (here a motif consisting of four sixteenth notes). Before training the networks, motifs are classified according to their similarity.

(2) One neural network is used to learn the abstract sequence of motivic classes. Motivic classes are represented in a 1-of-n encoding form where n is a fixed number of classes. The question this step solves is this: What kind of motif fits a particular note with respect to the melodic context and the motifs that have occurred previously? No precise pitches are fixed by this network. It works at a more abstract level and is therefore called a supernet in this commentary.

(3) Another neural network learns how to translate motivic classes into concrete note sequences appropriate to a given harmonic context. It produces the actual pitches. Because it works one level of precision below the supernet, it is here called a subnet.

(4) Although the output of the subnet is mainly influenced by the motivic class computed by the supernet, the subnet has to find a meaningful realization according to the harmonic context. Sometimes the subnet invents a sequence of notes that does not belong to the motivic class determined by the supernet. This motif will be considered by the supernet when computing the next motif, however, and should therefore match the notes previously formed by the subnet. The motif is therefore reclassified by the motivic recognition component of the system before the supernet determines the next motif class.

The motivation for this separation into supernet and subnet arose from the following consideration: if each motif had an associated contour (i.e., a sequence of intervallic directions to be produced for each quarter note), the choices for note-generation could be restricted to suit these contours.

The procedure is based on human behavior. When a human organist improvises a melodic variation of a given melody in real time, he must make his decisions in a fraction of a second. In order to find a meaningful continuation of the variation, he must therefore have at least some idea about what kind of variation should be applied to the next note.
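
The interplay of steps (1) to (4) can be sketched as a small control loop. This is only an illustration of the data flow, under stated assumptions: the three callables stand in for the trained supernet, the trained subnet, and the motivic recognition component; their names and signatures (`supernet`, `subnet`, `recognize`) are hypothetical; and the toy stand-ins at the bottom replace the neural networks with trivial rules so that the loop can be run.

```python
def compose_variation(melody, harmonies, supernet, subnet, recognize, motif_length=4):
    """Replace each melody note by a motif, alternating between the two time-scales."""
    variation = []
    previous_class = None
    for T, harmony in enumerate(harmonies):
        # Supernet (slow time-scale): propose an abstract motif class for note T.
        motif_class = supernet(melody, T, previous_class)
        # Subnet (fast time-scale): realize the class note by note.
        motif = []
        for t in range(motif_length):
            previous_note = motif[-1] if motif else melody[T]
            motif.append(subnet(motif_class, harmony, previous_note, t))
        # Recognition: reclassify what the subnet actually produced, so the
        # supernet sees the real motif when it predicts the next class.
        previous_class = recognize(motif)
        variation.append(motif)
    return variation


# Toy stand-ins (NOT neural networks), used only to make the loop runnable.
def toy_supernet(melody, T, previous_class):
    following = melody[T + 1] if T + 1 < len(melody) else melody[T]
    return "up" if following > melody[T] else "down"


def toy_subnet(motif_class, harmony, previous_note, t):
    if t == 0:
        return previous_note  # begin on the melody note itself
    return previous_note + (1 if motif_class == "up" else -1)


def toy_recognize(motif):
    return "up" if motif[-1] > motif[0] else "down"


melody = [60, 62, 60]          # MIDI pitches, one per quarter note
harmonies = ["T", "D", "T"]    # placeholder harmony labels
print(compose_variation(melody, harmonies, toy_supernet, toy_subnet, toy_recognize))
# [[60, 61, 62, 63], [62, 61, 60, 59], [60, 59, 58, 57]]
```

The point of the design is visible in the loop: the class fed back to the supernet is the recognized one, not the one it originally proposed.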
The validity of the concept was established by several experiments in which motivic classes, previously obtained from Pachelbel originals through classification, were presented to the subnet. After training, the subnet was judged to be able to reproduce the Pachelbel originals almost perfectly. Since the motivic contour was shown to be an important element of melodic style, another neural network was introduced at a more general time-scale to accommodate contour. The training of this net greatly improved the overall performance of the system rather than merely shifting the learning problem to another time-scale.

7.4 Motivic Classification and Recognition

In order to coordinate learning at different time scales, we needed a procedure to classify motifs for training the supernet and to recognize motifs for reclassification. This was achieved by unsupervised learning, using agglomerative hierarchical clustering and, alternatively, Kohonen's topological feature maps (1990). We implemented a recursive clustering algorithm based on a distance measure which determines the similarity between motifs by comparing their contours and interval sizes. The result of hierarchical clustering is a dendrogram that allows comparison of classified elements on a distance scale. Figure 3a shows the result of classification for eight motifs. Cutting the classification tree at lower levels yields more and more classes. Another approach is to determine appropriate motivic classes through self-organization within a one- or two-dimensional surface. Figure 3b displays the distribution of motif contours over a 10x10 Kohonen feature map. The algorithm is then applied to all motifs contained in the training set.

7.5 Network Structure

An important aspect of this task is to find an appropriate number of classes for the given learning task. Both the supernet and the subnet are implemented as standard feed-forward networks. The task of the note-prediction subnet is to find a variation of a given melodic note according to the motif class proposed by the supernet and the harmony determined by HARMONET. Because the character of a motif depends on the intervallic relationship between its notes rather than on absolute pitches, we have chosen
an intervallic representation for this procedure. Each note is represented by the interval to the first motif note, the so-called reference note (indicated by the percent sign [%]).
Figure 3a. A dendrogram for the first eight motifs of the Pachelbel chorale variation shown in Figure 1b; below the staff one can see the corresponding base-7 intervallic representation.
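
The kind of distance-based clustering that produces such a dendrogram can be sketched as follows. The text does not give the actual distance measure or linkage rule, so everything specific here is an assumption: the weights `contour_weight` and `size_weight`, the use of average linkage, and the example motifs, which are written as base-7 intervals relative to the reference note.

```python
def sign(x):
    return (x > 0) - (x < 0)


def steps(motif):
    """Successive steps of a motif given as intervals to its reference note."""
    return [b - a for a, b in zip(motif, motif[1:])]


def motif_distance(a, b, contour_weight=1.0, size_weight=0.5):
    """Hypothetical distance: contour disagreements plus differences in step size."""
    sa, sb = steps(a), steps(b)
    contour_diff = sum(sign(x) != sign(y) for x, y in zip(sa, sb))
    size_diff = sum(abs(abs(x) - abs(y)) for x, y in zip(sa, sb))
    return contour_weight * contour_diff + size_weight * size_diff


def cluster_motifs(motifs, n_classes):
    """Agglomerative clustering (average linkage) down to n_classes >= 1 clusters.

    Returns clusters as lists of indices into `motifs`."""
    clusters = [[i] for i in range(len(motifs))]

    def linkage(c1, c2):
        total = sum(motif_distance(motifs[i], motifs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_classes:
        # Merge the two closest clusters; this is one level of the dendrogram.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


motifs = [[0, 1, 2, 3], [0, 1, 2, 4], [0, -1, -2, -3], [0, -1, -2, -4]]
print(cluster_motifs(motifs, 2))  # [[0, 1], [2, 3]]
```

Stopping the merging earlier (a larger `n_classes`) corresponds to cutting the tree at a lower level and yields more classes.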
Figure 3b. A Kohonen feature map developed from all motifs of the chorale variation (initial update area 6x6, initial adaptation height 0.95, decrease factor 0.995). Each cell corresponds to one unit in the feature map. One can see the arrangement of regions responding to motifs having different contours.
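
A self-organizing map of this kind can be sketched in a few lines of plain Python. The sketch makes several assumptions that the caption does not confirm: "adaptation height" is read as the learning rate, "update area" as a square neighborhood around the best-matching unit (approximated here by a radius of 3), and the "decrease factor" as a multiplicative decay applied to both after every training pattern. The contour vectors used as input are invented for the example.

```python
import random


def best_matching_unit(grid, x):
    """Return the (row, col) of the unit whose weights are closest to x."""
    best = None
    for r, row in enumerate(grid):
        for c, unit in enumerate(row):
            d = sum((u - v) ** 2 for u, v in zip(unit, x))
            if best is None or d < best[0]:
                best = (d, r, c)
    return best[1], best[2]


def train_som(samples, rows=10, cols=10, epochs=20,
              height=0.95, radius=3.0, decay=0.995, seed=0):
    """Train a rows x cols Kohonen map on equally long input vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    grid = [[[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(cols)] for _ in range(rows)]
    for _ in range(epochs):
        for x in samples:
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    # Update every unit inside the square neighborhood of the winner.
                    if max(abs(r - br), abs(c - bc)) <= radius:
                        unit = grid[r][c]
                        for k in range(dim):
                            unit[k] += height * (x[k] - unit[k])
            height *= decay   # shrink the learning rate
            radius *= decay   # shrink the neighborhood
    return grid


# Invented contour vectors: direction (+1 up, -1 down) of the three steps in a motif.
contours = [[1, 1, 1], [-1, -1, -1], [1, -1, 1], [-1, 1, -1]]
som = train_som(contours)
for contour in contours:
    print(contour, "->", best_matching_unit(som, contour))
```

After training, each cell's weight vector can be inspected to see which contour it responds to, which is the kind of information the figure displays.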
7.5.1 Harmonic Controls

Since the motivic structure also depends on harmonic information, a special complementary intervallic encoding was developed to allow the network to establish a relationship between melodic intervals and harmonic context.

7.5.2 Subnet Input and Output

The following array depicts all input and output features of the subnet. The network has three fully connected layers: (1) an input layer with 47 units, (2) a hidden layer with about 25 units, and (3) an output layer with 12 units. A corresponding musical example is displayed in the upper right-hand box of Figure 2.

The output feature of the subnet is the note N to be learned at time t. The input features of the subnet are these:

the motif class MT determined by the supernet;
the harmonic field HT determined by HARMONET;
the next reference note mT+1;
one preceding melody note Nt-1;
the position pt within the motif.
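
The subnet's shape can be made concrete with a small sketch that assembles an input vector and pushes it through an untrained 47-25-12 network. The text does not spell out how the 47 input units are divided among the five features; the split used here (12 + 7 + 12 + 12 + 4) is an assumption that merely happens to be consistent with the sizes mentioned elsewhere in the chapter (12 motif classes, a 7-bit harmonic field, 12-bit interval codes, 4 positions per motif). The sigmoid activation, the random weights, and all function names are likewise illustrative.

```python
import math
import random

N_CLASSES = 12       # 1-of-n motif classes
FIELD_BITS = 7       # harmonic field over the diatonic scale
INTERVAL_BITS = 12   # 12-bit interval code
MOTIF_LENGTH = 4     # four sixteenth notes per motif


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_subnet_input(motif_class, harmonic_field, next_reference, previous_note, position):
    """Concatenate the five input features (assumed layout, see lead-in)."""
    assert len(harmonic_field) == FIELD_BITS
    assert len(next_reference) == INTERVAL_BITS
    assert len(previous_note) == INTERVAL_BITS
    return (one_hot(motif_class, N_CLASSES) + list(harmonic_field)
            + list(next_reference) + list(previous_note)
            + one_hot(position, MOTIF_LENGTH))


def init_network(sizes, seed=0):
    """Random weights for fully connected layers; the last weight of each unit is a bias."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(network, inputs):
    """Plain feed-forward pass with sigmoid units."""
    activations = list(inputs)
    for layer in network:
        extended = activations + [1.0]  # constant input for the bias weight
        activations = [
            1.0 / (1.0 + math.exp(-sum(w * a for w, a in zip(unit, extended))))
            for unit in layer
        ]
    return activations


x = build_subnet_input(
    motif_class=3,
    harmonic_field=[1, 0, 1, 0, 1, 0, 0],   # tonic field 1010100
    next_reference=[0.0] * INTERVAL_BITS,   # placeholder interval codes
    previous_note=[0.0] * INTERVAL_BITS,
    position=0,
)
subnet = init_network([47, 25, 12])
print(len(x), len(forward(subnet, x)))  # 47 12
```

The 12 output activations would be read as the interval code of the predicted note once the weights have been trained.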
The supernet learns a sequence of motifs which are given as abstract classes developed during motivic classification. The most important element influencing the choice of a motivic class is the interval between the current and next pitches. To produce motivic sequences that are also coherent on larger time frames, information about the position relative to the beginning or end of a musical phrase (2-4 measures) is added. The motivic classes are represented in a simple 1-of-n encoding form (n is the number of classes).

7.5.3 Supernet Input and Output

The following array summarizes the input and output features of the supernet. The network has a 61-35-12 topology for n = 12. A corresponding musical example is displayed in the lower right-hand box of Figure 2.
The output of the supernet is the motivic class M to be learned at time T. The input features of the supernet are:

a melodic context given by one preceding and two following notes mT-1, mT+1, and mT+2;
one preceding motivic class MT-1;
phrasing information phrT;
information about up- and downbeats ZT within a measure.
7.6 Intervallic Representation

In general one can distinguish two groups of motifs: melodic motifs prefer small intervals, mainly primes and seconds; harmonic motifs prefer leaps and harmonizing notes (chord notes). Both motif groups rely heavily on harmonic information. In melodic motifs dissonances should be correctly resolved; in harmonic motifs notes must fit the given harmony. Small deviations may have a strong effect on the quality of musical results. Thus our idea was to integrate musical knowledge about intervallic and harmonic relationships into an appropriate intervallic representation. Each note was represented by its intervallic relationship to the first note of the motif, the so-called reference note. This is an important element contributing to the success of our system. We have developed and tested various intervallic encodings. The initial encoding of intervals took account of two important relationships:

neighboring intervals were realized by overlapping bits;
octave invariance was represented using an octave bit.

The activation of the overlapping bit was reduced from 1 to 0.5 in order to allow a better distinction of the intervals. This encoding was then extended to capture harmonic properties as well. The idea was to represent in a similar way ascending and descending intervals leading to the same note. This was achieved by using the complementary intervallic encoding shown in Table 1.
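
The rows of Table 1 follow a regular pattern that can be written as a short function. The sketch below reproduces the 11 bits listed in the table (three direction bits, one octave bit, seven size bits); it is a reconstruction from the table values rather than the author's code, and the function name and the convention of passing the interval as a signed number of diatonic steps (prime = 0, second up = 1, octave up = 7, ninth down = -8) are assumptions.

```python
def encode_interval(steps):
    """Complementary intervallic encoding of a diatonic interval.

    `steps` is the signed number of scale steps from the reference note.
    Returns 3 direction bits + 1 octave bit + 7 size bits.
    """
    if steps < 0:
        direction = [1, 0, 0]   # down
    elif steps == 0:
        direction = [0, 1, 0]   # prime
    else:
        direction = [0, 0, 1]   # up
    octave = [1 if abs(steps) >= 7 else 0]
    # The size bits depend only on the note name that is reached, so an
    # ascending third and a descending sixth activate the same position.
    position = steps % 7
    size = [0.0] * 7
    size[position] = 1.0
    size[(position - 1) % 7] = 0.5   # overlapping bit for the neighboring interval
    return direction + octave + size


print(encode_interval(2))    # third up
# [0, 0, 1, 0, 0.0, 0.5, 1.0, 0.0, 0.0, 0.0, 0.0]
print(encode_interval(-5))   # sixth down
# [1, 0, 0, 0, 0.0, 0.5, 1.0, 0.0, 0.0, 0.0, 0.0]
```

The two printed codes differ only in their direction bits, which is the sense in which complementary intervals are treated as harmonically equivalent.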
The complementary encoding used three bits to distinguish the direction of the interval, one octave bit, and seven bits for the size of the interval.

Table 1. Complementary intervallic encoding allowing the numeral 1 to represent letter-name changes

DIRECTION   OCTAVE   INTERVAL SIZE                     INTERVAL
1 0 0       1        0    0    0    0    0    0.5  1   ninth down
1 0 0       1        1    0    0    0    0    0    0.5 octave down
1 0 0       0        0.5  1    0    0    0    0    0   seventh down
1 0 0       0        0    0.5  1    0    0    0    0   sixth down
1 0 0       0        0    0    0.5  1    0    0    0   fifth down
1 0 0       0        0    0    0    0.5  1    0    0   fourth down
1 0 0       0        0    0    0    0    0.5  1    0   third down
1 0 0       0        0    0    0    0    0    0.5  1   second down
0 1 0       0        1    0    0    0    0    0    0.5 prime
0 0 1       0        0.5  1    0    0    0    0    0   second up
0 0 1       0        0    0.5  1    0    0    0    0   third up
0 0 1       0        0    0    0.5  1    0    0    0   fourth up
0 0 1       0        0    0    0    0.5  1    0    0   fifth up
0 0 1       0        0    0    0    0    0.5  1    0   sixth up
0 0 1       0        0    0    0    0    0    0.5  1   seventh up
0 0 1       1        1    0    0    0    0    0    0.5 octave up
0 0 1       1        0.5  1    0    0    0    0    0   ninth up

Complementary intervals such as ascending thirds and descending sixths [beginning from the same pitch class] have similar representations because they lead to the same new note name and can therefore be regarded as harmonically equivalent. A simple rhythmic element was then introduced by a tenuto bit (not shown in Table 1) which is set when a note is tied to its predecessor. This final (3+1+7+1 =) 12-bit interval encoding gave the best results in our simulations.

This intervallic encoding requires an appropriate representation for harmony. It can be encoded as a harmonic field, which is a vector of chord notes of the diatonic scale. The tonic T in C major, for example, contains three chord notes (C, E, and G), which correspond to the first, third, and fifth degrees of the C major scale (1010100). This representation may be further extended, for we can now encode the harmonic field starting with the first motivic note instead of the first degree
of the scale. This is equivalent to rotating the bits of the harmonic field vector. An example is displayed in Figure 4.
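
The harmonic field and its rotation can be checked with a few lines of Python. The sketch uses the tonic field given above and the dominant-seventh example of Figure 4 (chord notes B, D, F, and G in C major, with B as the reference note); the function names are illustrative, and scale degrees are numbered 1 to 7.

```python
def harmonic_field(chord_degrees):
    """Bit vector over scale degrees 1..7 marking the notes of the chord."""
    return [1 if degree in chord_degrees else 0 for degree in range(1, 8)]


def rotate_to_reference(field, reference_degree):
    """Re-read the field so that it starts on the reference note of the motif."""
    shift = reference_degree - 1
    return field[shift:] + field[:shift]


def bits(field):
    return "".join(str(b) for b in field)


tonic = harmonic_field({1, 3, 5})               # C, E, G
dominant_seventh = harmonic_field({2, 4, 5, 7})  # D, F, G, B

print(bits(tonic))                                     # 1010100
print(bits(dominant_seventh))                          # 0101101
print(bits(rotate_to_reference(dominant_seventh, 7)))  # 1010110
```

Starting the field on the seventh degree is the same as rotating the bit string one position to the right, and the set bits then line up with the size bits of the intervals that lead from the reference note to chord notes.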
Figure 4. The relationship between complementary intervallic encoding and a rotated harmonic field is shown. Each note is represented by its interval in relation to the first (reference) note [of the motif]. The harmonic field indicates the intervals leading to harmonizing notes (i.e., B, D, F, G for harmony D7).

The chord given to the motif is the dominant D7 [in Riemann functional notation the equivalent of V7]; the first note of the motif is B, which corresponds to the seventh degree of the C major scale. Therefore the harmonic field for harmony D7 (0101101) is rotated by one position to the right, resulting in the bit-string 1010110. Starting with the first note B, the harmonic field indicates the intervals that lead to harmonizing notes B, D, F and G. On the right side of Figure 4 one can see the correspondence between bits activated in the harmonic field and bits set to 1 in the three intervallic encodings. This kind of representation helps the neural network to directly establish a relationship between intervals and a given harmony.

7.7 System Performance

We carried out several simulations to evaluate the performance of the system. Many methods of improvement were suggested by listening to the improvisations generated by the system. One important problem was to find an appropriate number of classes for the given learning task. Table 2 lists the classification rate on the learning and validation set of the supernet and the subnet using 5, 12, and 20 motif classes. The learning set was automatically built from 12 Pachelbel chorale variations, which produced 2,220 patterns for the subnet and 555 for the supernet. The validation set included six Pachelbel variations, which provided 1,396 patterns for the subnet and 349 for the supernet. The supernet
and subnet were then trained independently with the RPROP learning algorithm ("Resilient backPROPagation"; see Riedmiller and Braun 1993) using the validation set to avoid over-fitting.

Table 2. Classification performance for the supernet and subnet.

network    # of classes   learning set   validation set
supernet    5             91%            50%
supernet   12             87%            40%
supernet   20             88%            38%
subnet      5             86%            79%
subnet     12             94%            83%
subnet     20             96%            87%
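
For readers unfamiliar with RPROP, the core idea is that each weight has its own step size, which grows while the sign of its gradient stays the same and shrinks when the sign flips; the magnitude of the gradient is ignored. The sketch below shows a simplified variant without the weight backtracking of the original algorithm, applied to a toy quadratic rather than to a neural network. The increase and decrease factors 1.2 and 0.5 are the commonly used defaults; the other settings, the function names, and the test problem are assumptions, and nothing here is claimed to reproduce the configuration used for the experiments in Table 2.

```python
def sign(x):
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, weights, iterations=200, eta_plus=1.2, eta_minus=0.5,
                   delta_init=0.1, delta_min=1e-6, delta_max=50.0):
    """Minimize a function given its gradient, using sign-based adaptive steps."""
    w = list(weights)
    deltas = [delta_init] * len(w)
    previous_grad = [0.0] * len(w)
    for _ in range(iterations):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            change = previous_grad[i] * g
            if change > 0:
                # Same direction as before: accelerate.
                deltas[i] = min(deltas[i] * eta_plus, delta_max)
            elif change < 0:
                # Sign flipped, so a minimum was overshot: slow down and
                # skip the update for this weight in this iteration.
                deltas[i] = max(deltas[i] * eta_minus, delta_min)
                g = 0.0
            w[i] -= sign(g) * deltas[i]
            previous_grad[i] = g
    return w


# Toy problem: minimize (w0 - 3)^2 + (w1 + 1)^2, whose minimum is at (3, -1).
def gradient(w):
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


solution = rprop_minimize(gradient, [0.0, 0.0])
print([round(v, 2) for v in solution])
```

In network training, `grad_fn` would return the error gradient summed over the whole learning set, and the validation-set error would be monitored to decide when to stop.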
7.7.1 Optimizing the Number of Classes

The classification rate of both networks depends strongly on the number of classes, especially on the validation set of the supernet. The smaller the number of classes, the better the classification of the supernet, because there are fewer alternatives to choose from. The subnet shows the opposite behavior: the larger the number of classes, the more easily it can determine concrete motif notes for a given motif class. One can imagine that the optimal number of classes lies somewhere in the middle (about 12 classes). This was then confirmed by comparing the results produced by different network versions.
Figure 5. Pachelbel-style variation on the tenor voice of the chorale "O Welt, ich muß dich lassen" composed by the neural-network system.
Figure 6. Melodic variation on "Ah! Vous dirai-je, Maman" composed by the HARMONET neural-network system. The original melody is shown above.
The accompanying examples are composed by the neural-network system with twelve motivic classes. Figure 5 shows an extract of a Pachelbel-style harmonization and chorale variation in the tenor voice based on the melody "O Welt, ich muß dich lassen," which did not belong to the learning or validation set. We have also tested our neural organist on melodies that do not belong to the Baroque era. Figure 6 shows a baroque-style harmonization and variation on the melody "Ah! Vous dirai-je, Maman," used by Mozart in his famous piano variations. The result clearly exhibits global structure and is well-bound to the harmonic context.

7.8 Conclusions

Preliminary results confirm that the HARMONET system is able to reproduce style-specific elements of melodic variation. In future work we may explore the question of whether the global coherence of the musical results may be further improved by adding a third neural network working at a still higher level of abstraction, e.g., at a phrase level. We believe that our overall approach presents an important step towards the learning of complete melodies. More information about our research project (goals, demos) is offered on the WWW page: http://illwww.ira.uka.de/~dominik/neuro_music.html

References

Conklin, Darrell, and Ian H. Witten, contribution on "Predictive Theories" in "Software for Theory and Analysis," Directory of Computer-Assisted Research in Musicology 6 (1990), 122.
Ebcioğlu, Kemal, "An Expert System for Harmonization of Chorales in the Style of J. S. Bach." Ph.D. thesis, State University of New York at Buffalo (Technical Report #86-09), 1986.
Ebcioğlu, Kemal, "An Expert System for Harmonizing Chorales in the Style of J. S. Bach" in Understanding Music with AI: Perspectives on Music Cognition, ed.
<previous page
page_156
next page>
<previous page
page_157
next page>
Page 157
Mira Balaban, Kemal Ebcioglu *, and Otto Laske (Cambridge: AAAI Press/MIT Press, 1992), pp. 294-334. Feulner, J., and Dominik Hrnel, "MELONET: Neural Networks that Learn Harmony-Based Melodic Variations," Proceedings of the 1994 International Computer Music Conference (Aarhus, Denmark: International Computer Music Association, 1994), 121-124. Hild, H., J. Feulner, and W. Menzel, "HARMONET: A Neural Net for Harmonizing Chorales in the Style of J. S. Bach," Advances in Neural Information Processing 4 (1991), ed. R. P. Lippmann, J. E. Moody, D. S. Touretzky, 267-274. Hrnel, Dominik, and T. Ragg 1996a, "A Connectionist Model for the Evolution of Styles of Harmonization" in Proceedings of the 1996 International Conference on Music Perception and Cognition (Montreal, Canada: Society for Music Perception and Cognition, 1996a), 213-218. Hrnel, Dominik, and T. Ragg, "Learning Musical Structure and Style by Recognition, Prediction and Evolution" in Proceedings of the 1996 International Computer Music Conference (Hong Kong: International Computer Music Association, 1996b), 59-62. Kohonen, T., "The Self-Organizing Map," Proceedings of the IEEE Vol. 78/9 (1990), 1464-1480. Mozer, M. C., "Neural Network Music Composition by Prediction," Connection Science 6/2 & 3 (1994), 247-280. Riedmiller, M., and H. Braun, "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm," Proceedings of the 1993 IEEE International Conference on Neural Networks (San Francisco, 1993), ed. H. Ruspini, 586-591. Toiviainen, P., "Modeling the Target-Note Technique of Bebop-Style Jazz Improvisation: An Artificial Neural Network Approach," Music Perception 12/4 (1995), 399-413. Witten, Ian H., Leonard C. Manzara, and Darrell Conklin, "Comparing Human and Computational Models of Music Prediction," Computer Music Journal 18/1 (1994), 70-80. [Calgary, 1990]
8 Melodic Pattern-Detection Using MuSearch in Schubert's Die Schöne Müllerin

Nigel Nettheim
204A Beecroft Road
Cheltenham, NSW 2119, Australia
n.nettheim@unsw.edu.au

Abstract

The MuSearch program described here was written in 1989 in C. It can be used with the commercial music-printing program SCORE, whose data it processes, to pursue analytical questions related to pitch, rhythm, meter, and text underlay. In this article I show its application to a repertory of Schubert Lieder.
Melodic personality. Pitch alone is rather limited for this purpose: rhythm and meter should preferably also be taken into account. Song texts may provide additional evidence of the significance of the musical patterns.

The four variables just mentioned (pitch, rhythm, meter, text) are basic to many investigations in analytical musicology. They are handled by the custom computer program to be described here, as are less basic elements such as dynamics and slurs. Other variables could of course be relevant, and can be handled by fairly straightforward extensions of the method to be described. For instance, although there is apparently no fully successful computer algorithm for the harmonization of music of the period of common practice, the conventional symbols (I, V, ii6, etc.) may be determined by a human, entered as an extra line of "text" underlay, and thus handled similarly.

8.1 The MuSearch Program

I began writing the program MuSearch in 1989 in the "C" language for DOS on IBM PCs, after acquiring the commercial music printing program SCORE, whose data it processes. The starting point for the program design was the requirement that the musical data constituting the input should be viewable, if desired, as conventional music-score images (not just as alphanumeric codes or as a schematic representation), and that the output should also be viewable as music scores, possibly annotated by indicating on the scores the occurrence of specified patterns; the output could also include statistical tables or other lists or reports. That starting point resulted from my desire to have ready access at any time to the score images of the music being studied. Between the input and output lies the searching (or analytical) part of the program. Thus in the trivial case where no searching is performed, the input scores will pass straight through to be viewed again unchanged as the output scores.
This starting point has not always been adopted by others, which is one reason for the difference in character between MuSearch and, for example, Humdrum, whose design did not emphasize music-score images.
8.1.1 Input

The input data consists of plain ASCII text files in SCORE's macro input language.[1] They embody the musical data in a form understandable by a human, by contrast with more elaborate parameter or other files representing the graphical image. The macro files may indicate not only staves, note-names, rests, durations, text underlay, dynamics, etc., but also editing procedures, such as the adjustment of slurs, to be applied in putting together the graphical image. These features of the macro files, as well as the fact that they may contain comments, make them easy to use.[2] One file may contain a number of staves, typically covering one song, and each staff normally contains one phrase, as determined by the encoder. MuSearch first reads a master file containing a list of the desired data files for a particular run of the program. The data to be discussed here (as a small sample) consists of the 20 songs from Schubert's cycle Die schöne Müllerin, D. 795 (1823); see Figures 1 and 2.
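The input stage just described (a master file naming the data files for a run, and macro files that may carry comment lines beginning with "!") can be illustrated with a minimal sketch. It is written in Python for brevity and is not the MuSearch code, which is in C. The master-file layout (one file name per line) and the function names are assumptions made for illustration only.

```python
def read_master_list(master_text):
    """Return the data-file names listed in a master file.

    Assumption: one file name per line; blank lines and lines
    beginning with "!" are ignored.
    """
    names = []
    for line in master_text.splitlines():
        line = line.strip()
        if line and not line.startswith("!"):
            names.append(line)
    return names


def strip_comments(macro_text):
    """Return the lines of a macro file without its "!" comment lines."""
    return [line for line in macro_text.splitlines()
            if not line.lstrip().startswith("!")]


# Hypothetical master file; the file names are invented placeholders.
master = "! songs for this run\nsong01.mac\n\nsong06.mac\n"
print(read_master_list(master))
```

The data lines themselves are passed through untouched here; interpreting SCORE's macro syntax is a separate and much larger step.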
Figure 1. Input phrase from Schubert's "Der Neugierige," D. 795/6.

Figure 2. Input data for the phrase in Figure 1. The syntax is that of the SCORE program; comment lines begin with "!".

8.1.2 Search Routines

Search routines in programs for public use normally require the user to specify the target by using a pre-set minilanguage. Instead, I simply program in the "C" language a function for whatever search criteria I may require, and then recompile the program. This has the advantage of complete freedom from restriction, but it means that, in its present form, my program is not suited to public release.[3] The program forms an internal representation of the music, upon which the desired search function acts. The function itself is conceptually simple: proceeding through the data, if the data match the search criteria, prepare the excerpt for output; but its implementation in program code requires considerable detail.

8.1.3 Output

The output files are again text macro files, as were the input files. The user may choose the scope of the output excerpts: either the whole phrase in which the target occurred, or just the motif, defined here to include along with the target the immediately preceding and following note and barline, if any. For phrases, the program supplies macro commands in the output file to draw a box around portions of the music found in a search, a vital addition.[4] If the user is very familiar with the data, then motifs may suffice to call to mind the indicated excerpt; otherwise whole phrases may be preferred. The resulting annotated scores may be viewed on the computer screen or printed. Naturally, further editing of those files, or recombination of them, may then be carried out by the user. However, it should be noted that the ideal for this kind of musicological purpose, where many different runs of the program may be made in a given study, is that formatting and other control of the graphical musical image should as far as possible be performed automatically by the program via the macro language. Artistic touching-up, when needed on the basis of visual inspection, is then left for the final stage. The user is also informed of the number of phrases searched and the number of matches found.

8.2 Musical Examples

Any pattern expressible in terms of the data elements may be the target of a search, but here just two illustrations will be sketched: (a) a simple text search, and (b) searching for a text/melody pattern reflecting pitch and meter. Song texts are involved in both illustrations. In the case of strophic songs, the human musician should first determine which verse of the poetic text has primarily been set to music by the composer. Usually this is the first verse, the same music fitting the remaining verses slightly less perfectly because of the varying sense and/or accentuation. Accordingly, only the first verse has generally been used in the present study. Exceptions can occur, however: in the last song, "Des Baches Wiegenlied," it appears that Schubert has set primarily the last verse, which of course ends the cycle, and accordingly only the text of that verse has been used.[5]

[1] Data not initially in the required format can often be converted to it. For example, I have written a program EsSCORE (freely available) to convert the well-known Essen database files to SCORE macro files.

[2] Comments begin with a "!", an undocumented feature of the SCORE program.

[3] Yet a comparable situation exists in the Microsoft Word word-processing program, which currently allows programming within it in the Visual Basic language. Further, I might have prepared my program for public release but for a problem of establishing communication with the SCORE personnel. SCORE is apparently the only commercial music-notation program allowing ASCII macro input files, and to that extent made the present approach possible; if a future release also allows nested macros, its scope for automatic manipulation of musical scores for musicological purposes will be greatly increased.

[4] Technically, the automatic placement of the box is facilitated by drawing it as a pair of enlarged "first-time endings" without numerals, the lower one inverted.

[5] At other times, one might even wish to study just the secondary verses of strophic songs, in order to look into the degree of inappropriateness of the music to their texts which the composer tolerated.
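The search-and-excerpt procedure of Sections 8.1.2 and 8.1.3 (proceed through the data, test each candidate against the criteria, and output either the whole phrase or just the motif) can be sketched as follows. This is a Python illustration under stated assumptions, not the C implementation: a phrase is assumed to be a plain list of notes, the target is assumed to have a fixed width in notes, barlines are left out, and the names `search`, `matcher`, and `scope` are invented here.

```python
def search(phrases, matcher, width, scope="phrase"):
    """Scan every phrase for a target of `width` consecutive notes.

    Returns (number of phrases searched, list of excerpts), with one
    excerpt per match. With scope="motif" the excerpt is the target plus
    the immediately preceding and following note, where they exist.
    """
    excerpts = []
    for phrase in phrases:
        for start in range(len(phrase) - width + 1):
            if not matcher(phrase[start:start + width]):
                continue
            if scope == "phrase":
                excerpts.append(phrase)
            else:
                first = max(0, start - 1)
                last = min(len(phrase), start + width + 1)
                excerpts.append(phrase[first:last])
    return len(phrases), excerpts


# Toy data: two invented phrases given as pitch names.
phrases = [["C4", "D4", "E4", "F4", "G4"], ["E4", "D4", "C4"]]


def is_target(window):
    return window == ["D4", "E4"]


searched, motifs = search(phrases, is_target, width=2, scope="motif")
print(searched, "phrases searched,", len(motifs), "match(es):", motifs)
```

Passing the criteria in as a function mirrors the design choice described above: the search condition is ordinary program code rather than an expression in a pre-set minilanguage.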
8.2.1 A Simple Text Search

For a first illustration a search was made for all cases where Schubert set a word containing the string "Lieb" or "lieb" (love). This includes all the German words derived from "lieb": Liebe, lieben, Geliebte, Liebchen (love, to love, beloved, diminutive), etc. In a study of this kind one must be aware of the varied contexts and nuances of meaning for the target words: thus the word may have its most straightforward positive sense, but may on other occasions be used negatively ("Du liebst mich nicht," D. 756) or ironically ("Die Liebe hat gelogen," D. 751). The search function for this illustration is in principle simple: if the given text string is present, append a suitable excerpt including it to the output file. The 22 matches found are shown in Figure 3 (motifs), and four of them in Figure 4 (phrases). The output motifs are positioned by the program in columns in order to include as many as possible on each page, minimizing page-turns in the comparative study of large collections of excerpts. Some remaining touch-up editing has deliberately been left undone, so that the reader may see how little is not handled by the automatic score-generating procedure.

The present example does not constitute a project, but just an illustration of the modus operandi. The motifs found (imagining one had a larger database) would next be processed by the human musicologist. They could be sorted according to (1) the more casual uses of the given text "lieb" and (2) the more intensely expressive ones, here especially D. 795/19, bar 68. The relative lengths of the notes for "lieb", the extent of melisma, the metrical location, and ideally also the harmony, would next be taken into account. Such a study of the settings of various significant words could lead to quite instructive conclusions concerning the compositional approach taken by the given composer, and to the ways in which musical melody may match speech inflections.
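The text criterion of this first illustration amounts to a case-insensitive substring test on each word of the underlay. A minimal Python sketch follows; the data layout (a label plus a list of words per phrase) and the function name are assumptions, and the sample words are taken from song titles mentioned in this article rather than from the encoded phrases.

```python
def find_text(phrases, target="lieb"):
    """Return (label, word) for every underlaid word containing `target`,
    ignoring upper and lower case."""
    hits = []
    for phrase in phrases:
        for word in phrase["words"]:
            if target in word.lower():
                hits.append((phrase["label"], word))
    return hits


# Toy data built from titles cited in the text, not from the SCORE files.
phrases = [
    {"label": "D. 756", "words": ["Du", "liebst", "mich", "nicht"]},
    {"label": "D. 751", "words": ["Die", "Liebe", "hat", "gelogen"]},
    {"label": "D. 795/20", "words": ["Des", "Baches", "Wiegenlied"]},
]
print(find_text(phrases))
```

A plain substring test also accepts unrelated words such as "blieb" (remained), which is one reason the matches still need review by the human musicologist.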
Systematic studies of the relationship between music and text/speech in the period of common practice are surprisingly uncommon; see Fecker (1984), who collected many examples without computer assistance.

8.2.2 A Text/Melody Search

For a second example I searched for cases similar to the setting of the words "mein Bächlein" in "Der Neugierige," D. 795/6, bar 17, expressed as in Figure 5 (the appropriate formulation would of course be dictated by the user's idea of what might prove fruitful for a particular purpose).
Figure 3. Output motifs from a search for the text "lieb".

The resulting four matches are shown in Figure 6. Motifs of this kind appear to express admiration for someone or something that has endeared itself to the speaker; comparison with the inflection of the speaking voice in such circumstances is indicated. Naturally this idea needs to be worked out thoroughly with a much larger database (cf. Fecker, 1984).
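A search that combines text, pitch, and meter can be written as a single predicate over a short window of notes. The Python sketch below only shows the general shape of such a predicate. Its criteria (a given first word, a rise to a note on the first beat, then a fall) are hypothetical placeholders and are not the criteria of Figure 5; the note fields, pitch numbers, and beat positions are likewise invented rather than transcribed from D. 795/6.

```python
def rises_to_downbeat(window):
    """Hypothetical criterion over three consecutive notes."""
    first, second, third = window
    return (first["text"].lower() == "mein"        # text condition
            and second["beat"] == 1                # meter: note on the first beat
            and second["pitch"] > first["pitch"]   # pitch: rise into it
            and third["pitch"] < second["pitch"])  # pitch: fall away from it


def find_windows(phrase, predicate, width=3):
    """Return the start index of every window satisfying the predicate."""
    return [i for i in range(len(phrase) - width + 1)
            if predicate(phrase[i:i + width])]


# Invented toy phrase: pitch as a MIDI-style number, beat counted within the bar.
phrase = [
    {"text": "O", "pitch": 66, "beat": 1},
    {"text": "mein", "pitch": 64, "beat": 2},
    {"text": "Bäch", "pitch": 69, "beat": 1},
    {"text": "lein", "pitch": 66, "beat": 2},
]
print(find_windows(phrase, rises_to_downbeat))
```

Because each condition is one line of the predicate, loosening or tightening the formulation between runs is a small change.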
Figure 4. Output phrases from a search for the text "lieb".

Figure 5. Search function skeleton for a pitch/meter/text search.

Figure 6. Output phrases from the pitch/meter/text search.

8.2.3 An Application to Pulse Analysis

A very different application of MuSearch was made in Nettheim (1993) to the Essen database of German folksong, producing tables of melodic/rhythmic progressions at the barlines for the study of the musical "pulse". That paper, though not discussing details of the computer program, gave results from a complete project. The same approach could well be taken to the music of Schubert and others, depending on the availability of suitable large databases.

8.3 Conclusions

The melodic searching available with MuSearch is virtually unlimited in its specifications. However, the specifications used so far do not constitute sophisticated analysis, for which considerable complexity would be needed. As with the task of natural language translation, it is not reasonable to expect a computer program to carry out the whole of a worthwhile humanistic or artistic task; its role is instead that of an assistant. The assistance provided includes the following:

(1) once the data has been entered, the time-consuming consulting of the volumes of printed scores is not needed again;
(2) all cases searched for will be found without human error, once the program is sound;
(3) the search criteria can be modified and the search re-run far more easily than in work carried out without computer assistance (this is important because often the best criteria for a given purpose can be found only after considerable experimentation);
(4) output can be manipulated or reordered for the next stage of the project with little further effort.

Thus, once the present shortage of large musical databases is overcome, the assistance obtainable from computer searching with graphical input and output can be expected to be substantial, enabling worthwhile research which would otherwise scarcely have been feasible.

References

Fecker, Adolf. Sprache und Musik: Phänomenologie der Deklamation in Oper und Lied des 19. Jahrhunderts. Hamburg: Karl Dieter Wagner, 1984.

Nettheim, Nigel, "The Pulse in German Folksong: A Statistical Investigation," Musikometrika 5 (1993), 69-89. [The proof corrections were not implemented by the editor; a corrected copy is obtainable from the author.]
9 Rhythmic Elements of Melodic Process in Nagauta Shamisen Music

Masato Yako
Kyushu Institute of Design
Acoustic Design Division
4-9-1 Shiobaru, Minami-ku
Fukuoka, 815 Japan
yako@kyushu-id.ac.jp

Abstract

The three-string Japanese shamisen was used in many venues over the preceding three centuries. Techniques for plucking and methods of composing were taught by example and oral tradition. Preliminary research aimed at creating a written catalogue of melodic types reveals that some rhythmic patterns are fixed while others are flexible, that reverse rhythmic patterns may overlap one another, and that some rhythmic patterns can be correlated with particular intervallic profiles.

COMPUTING IN MUSICOLOGY 11 (1997-98), 169-184. Much of the content here is an abridged translation from Musical Acoustics Research Association Material 11/6 (992), with additions from other writings by the author. All material is used by permission.
The shamisen is a three-string lute in use since the seventeenth century. The shamisen is often used to accompany singing, particularly of nagauta, or "long songs" performed in a formal manner. The instrument has been associated with a broad variety of social contexts including forerunners of soap opera, puppet theatre, and Buddhist narrative. In the seventeenth and eighteenth centuries the shamisen was most often used in kabuki theatre, an important venue for nagauta. In the nineteenth century, nagauta became independent of kabuki and its composition gained prestige as an independent art.
Figure 1. Japanese shamisen.

The shamisen's three possible tunings are all based on the "tenth pitch" (B3) of the Japanese traditional system. Until the twentieth century vocal syllables were used to represent finger positions in shamisen notation. A wide variety of plucking techniques has evolved over time, simulating the effect of varied timbres.

Nagauta shamisen melodies do not have much fluctuation in tempo. However, they may be full of latent energy. Some sections of a work may contain elaborate melodic activity, while others may be placid. Preliminary research aimed at cataloguing melodic types suggests that (1) overlapping rhythms can exist; (2) some rhythmic patterns have inflexible configurations while others seem to occur fortuitously; and (3) some rhythmic patterns can be correlated with particular intervallic movements.
9.1 Shamisen Music of the Edo Era

In Japanese music of the Edo era (1603-1868), many compositions are structured as continuous chains of melodically and rhythmically stereotyped configurations. The melodic patterns that exemplify these stereotyped configurations are not specific to particular compositions; they are common to a genre and can also be found across genres. In shamisen music, too, much melodic material has been drawn from the group of melodic patterns peculiar to nagauta, which is close to the common people, but much has also been borrowed from the melodic patterns of other genres. In either case, melodic patterns generally accumulate within genres, and which melodic patterns are singled out and how they are combined is what forms the nature of a musical composition. Consequently, a broad range of knowledge regarding melodic patterns is required for the understanding and appreciation of shamisen music.

Most melodic patterns do not have names. This absence reflects the fact that neither composers nor performers are conscious of melodic patterns. Since there exists no comprehensive literature covering all the melodic patterns which are latent in the genres, the scope of research regarding melodic patterns up to now has been limited. The cataloguing of melodic types could therefore be an invaluable aid to the understanding of nagauta and shamisen music.

9.2 Towards a Catalogue of Melodic Types

9.2.1 Selection of Works

In this study, I have chosen ten nagauta compositions by way of provisional research into the creation of a catalogue of shamisen melodic types. Pieces from the Kojuro scores (first published in 1918) were used as a basis for the analysis. The titles are these:

(1) Musume Dojoji
(2) Oimatsu
(3) Kokaji
(4) Tsurukame
(5) Akenokane
(6) Yoshiwara Suzume
(7) Yoiwamachi
(8) Tokimune
(9) Suehirogari
(10) Hanami Odori
I have sampled and classified the shamisen rhythmic patterns within these ten compositions while at the same time comparing pitch movement in the sample compositions. Direct sampling of melodic patterns is intended as a focus of future research.

One advantage of performing musical analysis with computer assistance is that sampling of patterns used unconsciously by human beings becomes possible. The patterns of which players, composers, and listeners are conscious are only a small percentage of those that appear to exist in this repertory. Human interpretation is necessary to make final judgments, however.

9.2.2 Criteria for Judgment

Rhythmic patterns and pitch are interrelated. Therefore, rhythmically configured parts also have melodic configurations and they must form phrases. Thus the sampling of rhythmic patterns requires that one determine whether there are melodic configurations present. In judging the segmentation of phrases, the points listed below need to be considered:

(1) Is the sound directed toward the beginning and end sounds? [Koizumi postulates that a melody progressing with regularity becomes very irregular as it approaches the break in a phrase.]

If the structural sounds of shamisen melody are listed in sequence from bottom to top, they will be as shown below (octave relationships are omitted):
Figure 2. Structural sounds of the shamisen.
The core sounds of the sound structure are taken as chi and kyu, which are enharmonic readings in the tonic and dominant respectively. Koizumi maintained that nagauta was core-note dominated, in contrast to interval-dominated Western music. In his view a core note has its own "gravitational sphere." The core sound is held together throughout by a perfect fifth. However, there are many instances of the final sound of each phrase which makes up the piece overlapping with the core sounds (kyu and chi).

(2) Is the shape of the rhythmic pattern suggested by the same sound being drawn out or reiterated?
(3) Does the shamisen enter during a phrase break? This complicates the identification and classification of patterns.
(4) Is a measure in duple meter or one filled by a rest to be taken as a phrase break?
(5) Is there disjunct motion in the sample?
(6) Are there instances of sukui (plucking upwards from underneath the string), hajiki (stopping the string without plucking), or the one-string open sound?

The foregoing questions were first studied in the context of four-measure units. When the patterns were volatile in four measures, they were reexamined in two-measure units.

9.2.3 Phrases and Hierarchy

The beat structure of traditional Japanese music is basically constructed in a hierarchical fashion. According to Koizumi (1984), beat units called before-beat and after-beat form one measure, and then preceding and following measures create a two-measure group (motif). Similarly the preceding and following motifs create a four-measure phrase. Two such units may be further combined to form a before-stage and an after-stage. At the juncture there often seems to be a lost beat or a redundant beat. Schematically, the arrangement is shown in Figure 3.
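The nesting just described (two beats to a measure, two measures to a motif, two motifs to a four-measure phrase) can be made concrete with a short Python sketch. It is only an illustration of the pairing scheme, not a procedure used in this study; the function names are invented, and beats are represented by arbitrary labels.

```python
def group(items, size):
    """Split a list into consecutive groups of `size`; the last may be short."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_hierarchy(beats):
    """Nest beats as measures, measures as motifs, motifs as phrases."""
    measures = group(beats, 2)    # before-beat + after-beat
    motifs = group(measures, 2)   # preceding + following measure
    phrases = group(motifs, 2)    # preceding + following motif
    return phrases


# Eight beats form exactly one four-measure phrase.
print(build_hierarchy(list(range(8))))
# A ninth beat leaves an incomplete group at the end.
print(build_hierarchy(list(range(9))))
```

With nine beats the final group is incomplete, which is how a lost or redundant beat at a juncture would surface under strict pairwise grouping.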
Figure 3. Hierarchical structure of phrases.

In a long melody with a succession of phrases, the tension-producing last note of a phrase appears at a point where the melody shifts to a phrase of a different length or where a different type of phrase is inserted.

9.3 Classification of Rhythmic Patterns

Seven hundred kinds of rhythmic patterns were identified in the total of 5,878 measures of the ten pieces. They were identified in the ways described below.

9.3.1 Classification by Component Configuration

Rhythmic patterns were grouped into 39 classifications in accordance with their composition (Figures 4a-c).

9.3.2 Classification According to Common Properties

Rhythmic patterns were investigated for the common properties of their components, and cases in which overlap, connotation, and crossing occur were retrieved (Figure 5).
Figures 4a-c. Classification of (a) four-measure, (b) two-measure, and (c) three-measure patterns. A1 shows the same rhythmic component being repeated four times. New rhythmic components are indicated by the letters b, c, and d.
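The lettering convention of Figures 4a-c (a repeated component keeps its letter, and each new component takes the next letter) can be reproduced mechanically. In the Python sketch below, each measure-level rhythmic component is assumed to be a tuple of note durations in beats; this representation and the function name are illustrative assumptions, and no attempt is made to map the resulting strings onto the 39 class names.

```python
def configuration(components):
    """Label components a, b, c, ... in order of first appearance."""
    letters = {}
    label = []
    for component in components:
        if component not in letters:
            letters[component] = chr(ord("a") + len(letters))
        label.append(letters[component])
    return "".join(label)


# Invented examples: durations are given in beats.
same_four_times = [(1, 1), (1, 1), (1, 1), (1, 1)]
mixed = [(1, 1), (0.5, 0.5, 1), (1, 1), (2,)]
print(configuration(same_four_times))  # aaaa
print(configuration(mixed))            # abac
```

Two patterns with the same letter string share a component configuration even when their actual rhythms differ, which is the sense in which configuration alone is a coarse grouping.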
Figure 4c.
Figure 5.

These arrangements of components, when taken alone, are volatile; thus the patterns were classified according to their components' similarities.

9.3.3 Study of the Melodic Process

In many passages there was rhythmic overlap, but much room for selection was left in the pitch movement inside these overlapping patterns. Here I compared individual rhythmic patterns in each composition and determined whether any rule-like characteristics or particular tendencies could be found.
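Rhythmic overlap of the kind mentioned above can be located mechanically once each pattern occurrence is recorded as a span of measures. The Python sketch below is an illustration only: it assumes a piece is a list of measure-level component labels, treats "connoted" as meaning wholly contained in a longer pattern (an interpretation, not a definition given in the text), and uses invented names and toy data.

```python
def occurrences(measures, pattern):
    """Return every start index at which `pattern` occurs in `measures`."""
    n = len(pattern)
    return [i for i in range(len(measures) - n + 1)
            if tuple(measures[i:i + n]) == tuple(pattern)]


def relation(span_a, span_b):
    """Classify span_b against span_a; spans are half-open (start, end)."""
    a0, a1 = span_a
    b0, b1 = span_b
    if b1 <= a0 or a1 <= b0:
        return "separate"
    if a0 <= b0 and b1 <= a1:
        return "contained"
    return "crossing"


# Toy piece of five measures with invented component labels.
measures = ["x", "y", "x", "y", "z"]
four = occurrences(measures, ("x", "y", "x", "y"))   # four-measure pattern
two = occurrences(measures, ("y", "z"))              # two-measure pattern
print(relation((four[0], four[0] + 4), (two[0], two[0] + 2)))
```

Running the same comparison over all pairs of catalogued patterns would list every contained and crossing case for later inspection of the pitch movement.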
Example 1. Tokimune (samurai proper name)

Example 2. Akenokane ("Bell of the Morning Glow")

Example 3. Kokaji

Example 4. Tsurukame ("Crane and Turtle")
9.4 Observations on Nagauta Shamisen Rhythmic Patterns and Features of Melodic Process

The process of applying the techniques described above revealed some important details of structure. Among them are these points:

(1) The study began with a one-tiered model of musical procedure, but overlapping rhythmic patterns continued to be identified. Thus it seems that the rhythmic patterns in shamisen music should be interpreted not on a single layer but on a multi-layer basis.

(2) The existence of overlapping rhythms complicates the hierarchical phrase model. For example, some rhythmic patterns found in two-measure units may be connoted and found in the rhythmic patterns of four-measure units. (In Example 1, "Tokimune," R34 is connoted in J8 and H14, R9 in H14, and H8 in E5.) Here again there are many instances of the continuous four-measure rhythmic patterns being found in a form which overlaps and crosses the two-measure ones. See, for instance, Example 2, "Akenokane," which contains a large number of A30 and A19 types.

(3) Certain types of patterns could be considered both fixed and fortuitous. Usually, the rhythmic elements seem fixed while the preceding and following melodies could not be deemed fixed. Many such cases were found in the ambiguous passages (the kudoki parts) of the beat structure. Conversely, there were also examples of strong rhythmic-pattern configurations which bore no relation to the character of the melody.

(4) Some patterns clearly belong at particular points or play particular roles in a composition, while other patterns cannot be classified functionally.

(5) When sampled rhythmic patterns were converted to a retrograde arrangement, about 20% could be looked upon as cases of overlap. Nagashi and other retrograde patterns are particularly difficult to identify.
(6) When the pitches of all the samples forming rhythmic patterns were examined, certain correlations of coincident movement of pitch and rhythm could be noted. (In Example 3, "Kokaji," Q26 was the only example of this interval progression in the ten compositions.)

(7) Some rhythmic patterns are characterized by conjunct motion while others contain a lot of disjunct motion.

9.5 Summary of Results

The most important finding is that rhythmic patterns in shamisen music must be considered to exist in a multilayer context. For no piece of those sampled did it seem appropriate to consider that only one rhythmic interpretation was possible. The overlapping patterns considered to be dominant will vary with the style and personality of the performer. Skilled performers may be able to bring out all the overlapping rhythms, but a skilled listener is required to perceive them.

In the future we must come to grips with the problem of how rhythmic patterns are mutually combined, the contribution of differences in timbre and playing methods, and how these relate to rhythmic patterns. Perceptual variation between listeners and performers is also a question of critical importance. Some listeners hear only one rhythmic pattern at a time, while others detect the subtleties of hierarchical organization (but not necessarily in a uniform way). The cultural acceptance of multiple melodic interpretations makes shamisen music a particularly rich area for perceptual investigation.

The work of creating a catalogue of melodic types which includes all the annals of Japanese music of the Edo era, and of describing the rules of configuration of melodic patterns, has just begun. Only with computer assistance can such a goal eventually be achieved.

References

Abe, Junichi, "An Experiment Regarding the Recognition of Tonal Sequential Patterns," Japan Acoustical Society Auditory Research Material H-34-5 (1976).
Asagawa, Gyokuto. Outline of Famous Nagauta Pieces (Tokyo: Nippon Ongaku Sha, 1976).

Alphonce, Bo, "Music Analysis by Computer: A Field for Theory Formation," Computer Music Journal 4/2 (1980), 26-35.

Gamom, Satoaki. Important Shamisen Melodic Types (Kodansha [Tokyo]: Nippon koten ongaku taikei 4 nagauta, 1981).

Koizumi, Fumio. Second Study of Traditional Japanese Music (Tokyo: Ongaku no Tomo Sha, 1984).

Machida, Kasei, "Study of Melodic Types in Shamisen Vocal Compositions," Toyo Ongaku Kenkyu 47 (1982).

Malm, William P. Nagauta: the Heart of Kabuki Music. Tokyo, 1963.

Meyer, Leonard, and Gordon W. Cooper. The Rhythmic Structure of Music, tr. Yoshihiko Tokumaru (Tokyo: Ongaku no Tomo Sha, 1968).

Murao, Tadahiro. Recognition in Musical Grammar, ed. Yoshio Watano, in Ongaku no Ninchi (Tokyo: Tokyo Daigaku Shippan Kai, 1987), 1-40.

Tenny, James C., and Larry Polansky, "Hierarchical Gestalt Perception in Music: A 'Metric Space' Model," typescript, York (Canada) University Music Department (1978).

Yako, Masato, "An Analysis of Special Playing Techniques on the Shamisen in Nagauta Music," typescript, 1992.

Yako, Masato, "How is the Song in Nagauta connected with the Shamisen? Using the Kojuro Score as an example," presentation to the 352nd convocation of the Oriental Music Research Association (1990).

Yako, Masato, "Perception of Decay and the Sound of the Shamisen: A Preliminary Study of the Role of Timbre in Melodic Progression," typescript, 1992.

Yako, Masato, "The Tritone Progression and the Rhythmic Distortion of Metric Structure in Nagauta-Shamisen," typescript, 1992.

Yeston, Murray. The Stratification of Musical Rhythm. New Haven: Yale University Press, 1976.
Appendix

A Note on Typesetting Shamisen Music and Text Using the SCORE Computer Music-Typography System

The CCARH Editorial Staff

The four vocal scores in the article immediately previous were provided in manuscript form, with the text written in the Hiragana syllabary, a thousand-year-old phonetic system derived from Chinese ideographic characters. Many of the symbols in this system are now obsolete. For SCORE to typeset this text as an underlying lyric to the musical notes, (1) a Hiragana PostScript font had to be acquired, (2) the symbols provided by this font matched to the manuscript, and (3) additional Hiragana symbols created as necessary to provide a set complete enough to satisfy the requirements of the manuscript. Then the font's metric information had to be supplied to SCORE by means of a utility which ships with the program.

Craig Sapp, currently a programmer and technical advisor for CCARH, was kind enough to provide the Hiragana font from his own font collection, and the symbol-matching was accomplished through the diligent efforts of Ms. Akiko Orita, a visiting scholar from Keio University in Japan now working at Stanford's Center for Computer Research in Music and Acoustics (CCRMA). The additional symbols required were created in Macromedia's Fontographer 4.1, again with the expert help of Akiko Orita and Craig Sapp. Valuable advice in introducing the Hiragana font to SCORE was provided by Leland Smith, Professor Emeritus of Music at Stanford University and creator and purveyor of the SCORE system through his company the San Andreas Press of Palo Alto, California.
III. HUMAN MELODIC JUDGMENTS
10 Concepts of Melodic Similarity in Music-Copyright Infringement Suits
Charles Cronin
School of Information Management and Systems
South Hall
University of California
Berkeley, CA 94720
chassi@sims.berkeley.edu
Abstract
Over the past century U.S. courts, especially those in New York and California, have published a steady, entertaining body of opinions discussing melodic similarities in musical works that have been the subject of infringement claims under U.S. copyright law. This article discusses the notions of musical similarity that support the idiosyncratic findings of some of these opinions.
Although plaintiffs invariably claim musical rather than melodic infringement, courts pay little attention to rhythm, harmony, or other elements of music. They mention them, if at all, as support for their findings of melodic similarities. The pronouncements in Northern Music v. King Record are typical: rhythmic originality is nearly impossible to achieve, harmony is simply the application of well-known rules, and neither can be the subject of copyright (Northern Music 1952, 400). However, in Tempo Music v. Famous Music (1993), it was held that harmonies added to a song could be sufficiently original to support a copyright claim for a derivative work. Significantly, this case involved jazz, and the Court, noting that this was a case of first impression, made the debatable assertion that in most music the melody dictates the harmony to which it is set. A list of case citations is given at the end of this article.
For federal courts at least, originality (the sine qua non of copyright) in music lies in melody. The melody "requires genius for its construction" (Jollie v. Jacques 1850, 913). It is the "fingerprint" of a musical work, and "a mere mechanic can make the adaptation or accompaniment" (Northern Music 1952, 400).1 As such, it is typically regarded by courts as the main generator of economic value. Because one of the underlying tenets of copyright in the United States is to protect the financial interests of the creators of original works, the overwhelming emphasis in music-copyright infringement cases is on melodic similarity.2
The following discussion reviews several ways in which melodic comparisons have been demonstrated for and by courts, and examines cases from throughout this century and the evidence of melodic similarities underlying their claims and resolutions.
1 Judge Learned Hand observed, however, that "...[t]rue it is the themes which catch the popular fancy, but their invention is not where musical genius lies, as is apparent in the work of all the great masters" (Arnstein v. Edward Marks Music 1936, 277). In other words, "generating musical material is much less difficult than structuring it" (Aaron Keyt, "An Improved Framework for Music Plagiarism Litigation," California Law Review 76 [1988]: 425).
2 Most music infringement charges are brought against hit numbers; money, not honor, motivates these claims. Although the Copyright Statute (U.S. Code, Title 17, 1976) provides for statutory damages of between $500 and $20,000 (up to $100,000 if the court finds willful infringement), these are paltry sums compared to a potential actual damage award of hundreds of thousands when a hit number is involved.
10.1 Fundamentals
The typical plaintiff in a music infringement suit is a songwriter of modest means who asserts that a lucrative hit by Stevie Wonder, Mick Jagger, Michael Jackson, Andrew Lloyd-Webber, or another popular musician is based on musical expression from an earlier work by the plaintiff. Because the financial stakes are high (e.g., $2,000,000 in the case against George Harrison) and highly personal work is involved, plaintiffs readily over-invest in their claims, financially and emotionally. Often they are dazzled by the slight possibility of an unexpected monetary bonanza awarded on evidence of musical similarities that usually amount to no more than coincidence and concurrences of commonplace ideas.
To prove infringement the plaintiff must establish that he owns a valid copyright and that the defendant misappropriated expression protected by it. This seemingly simple formula contains a welter of nuanced issues. Misappropriation means that the defendant copied a significant portion of the plaintiff's protected expression. It is possible, however, to hold a valid copyright for a work that is identical to an existing work as long as the subsequent work is original; that is, as long as it was created without reference to the earlier protected work. To establish misappropriation, the plaintiff must demonstrate that the defendant had a reasonable possibility of access to the earlier work and that there are substantial similarities between it and the defendant's. Generally, the weaker the similarities between the works, the stronger must be the evidence of access. This equation's logical shortcoming is its implication that an unequivocal showing of access renders unnecessary any showing of similarity. The converse is true too, and in some instances courts have inferred access from striking similarities between works.3 After establishing that the defendant copied, the plaintiff still must prove that this copying amounted to improper appropriation.
3 Fortunately, this skirting of the requirement of showing a reasonable possibility of access has been discredited recently. In Selle v. Gibb (1983) the Seventh Circuit held that no matter how strikingly similar the works in question, one cannot infer access when the facts affirmatively indicate otherwise. Robert Osterberg, who represented the Bee Gees in Selle, traces the dubious origins of the concept of inferring access from similarities, and shows how this notion is contrary to copyright's accommodation of even identical independently created works, in "Striking Similarity and the Attempt to Prove Access and Copying in Music Plagiarism Cases," Journal of Copyright Entertainment & Sports Law 2 (1983): 85. But the Second Circuit, in Gaste v. Kaiserman (1988), backed away from Selle, applying a less rigorous standard of proof that permits an inference of access based on striking similarities alone, as long as that inference is reasonable in light of all the evidence.
In other words, for
a plaintiff to recover actual damages he must prove that the defendant has taken a material (i.e., economically valuable) part of the plaintiff's protected expression, and not simply an idea, a trivial motive or rhythmic tattoo, or public-domain material that the plaintiff used himself.4
Because most consumers of popular music are musically uneducated, their ears (not those of musicians) determine whether similarities between the plaintiff's and the defendant's work render the latter an acceptable substitute for the former, and thereby undermine its profitability. But, since the judges and jurors who ultimately decide these cases lack the knowledge needed to tease out de minimis or public-domain elements from copyrightable expression in musical works, courts have permitted expert testimony to assist them in this area. How can a judge or juror hear with lay ears after entertaining analyses from partisan musical experts?
Two cases in particular have examined this issue and have attempted to establish boundaries for expert testimony in infringement actions. The earlier, Arnstein v. Porter (1946), limits expert testimony to the question of copying. (Recall that copying is only the first of two steps in proving infringement.) According to Porter, once copying is established, lay ears, unaided by experts, must determine whether the copying constitutes misappropriation (the second step) of protected expression.5
4 This returns us to the question of originality. If the plaintiff's original expression already exists in a public-domain work, the plaintiff still can prevent another from taking that material from his work, but cannot stop another from taking it from its public-domain source.
5 According to Alan Latman, copying, the first step of Porter, should depend upon "probative," not "substantial" similarity, because even small parallels, like common mistakes, can indicate copying. "'Probative Similarity' as Proof of Copying: Toward Dispelling Some Myths in Copyright Infringement," Columbia Law Review 90 (1990): 1187. The "lay listener" standard was derived from the "reasonable man" of other areas of law, torts in particular. According to Alice Kim, however, the "lay listener" standard of Arnstein is not congruent with the "average reasonable man" standard previously applied in infringement actions, in which the question is whether an ordinary person can detect plagiarism without aid or suggestion or critical analysis by others. "Expert Testimony and Substantial Similarity: Facing the Music in (Music) Copyright Infringement Cases," Columbia-VLA Journal of Law and the Arts 19 (Fall-Winter 1994): 109-128.
Addressing the difficulty of separating non-protectable ideas from protectable expression, the Court in Sid & Marty Krofft v. McDonald's (1977) recast the two-part copying-and-misappropriation test of Porter. The intent of Porter, according to Krofft, was to test for infringement by separating non-protectable similar ideas, determined with expert assistance and with reference
to extrinsic information like evidence of common sources, from protectable similar expression, which is based on the intrinsic response of the intended audience, without input from experts.6
The limits placed by Porter and Krofft on expert testimony in music-infringement cases have generated criticism, including that of Porter's dissenter Judge Clark. In Clark's opinion, these limitations amount to a clear "invitation to exploitation of slight musical analogies...in the hope of getting juries...to divide the wealth of Tin Pan Alley" (Porter 1946, 497).7 Others have argued that the idea-expression test of Krofft (a case involving copyrighted dramatic characters) cannot be meaningfully applied to music cases because it is impossible to separate ideas from their expression in music.8 Moreover, critics claim, expert testimony should be seen as particularly helpful in determining misappropriation under Arnstein, or similarity of expression under Krofft, because these tests only produce meaningful results if they establish both whether the defendant's expression was similar and whether it was illicit. This can be determined only with knowledge of the scope of protected expression in the complaining work.9
Regardless of its appropriate scope, expert testimony in music-copyright cases tends to be seen as hocus-pocus performed by musicologists, who present findings of common pitches and rhythms that may appear impressive and dispositive but are increasingly irrelevant in establishing plagiarism in popular songs that use simple musical ideas in minimally original ways.
6 Krofft (1977) was decided a year before implementation (in 1978) of the 1976 Copyright Act, which specifically excludes ideas from copyright protection. Paul Goldstein has criticized Krofft because, while establishing substantial similarity of ideas under the first part of Krofft's test may prove copying, it does not prove that the defendant appropriated protected expression. Also, though Krofft says the lay audience is to determine misappropriation based upon the "total concept and feel" of the works, in fact it should determine the existence of substantial similarity only between the expression in the two works, and not the works overall. Copyright: Principles, Law and Practice (1989), 7.3.2.
7 Clark does not acknowledge that limits on expert testimony may as likely dissuade as encourage frivolous music-infringement suits that depend upon similarities that might have gone unnoticed, were it not for an expert's ferreting them out and distorting their importance in bringing them to the attention of the court or jury.
8 Keyt, "An Improved Framework," 443.
9 Kim, "Expert Testimony and Substantial Similarity," 125.
10.2 Demonstrating Melodic Similarity
American judges and jurors are generally less musically literate than the popular musicians whose claims they hear. This is evident from erroneous musical terminology that crops up in judicial opinions and from methods used at trial to provide courts and jurors accessible visual representations of music, in particular of melodies.10 The illustrations produced by these methods present quantifiable evidence but, as we will see, they tend to be more misleading than probative on the question of infringement.
Learned Hand's "comparative method" attempts to make music tractable to statistical analysis, but the result is absurdly reductive.11 In Hein v. Harris (1910) Hand put the melodies of "Arab Love Song" and the defending tune "I Think I Hear a Woodpecker" in the same key, aligned one over the other, and based his finding of infringement on the frequent pitch correspondences, ignoring rhythm, harmony, phrasing, and instrumentation.12 Despite its shortcomings, this approach to melodic comparison is still used. Defending John Williams against the charge that he misappropriated the theme of "E.T." from Leslie Baxter's song "Joy," legal counsel offered the jury a chart of ten similar melodic incipits, reduced to strings of letters that do not even indicate octave placement.13
10 In Northern Music (1952, 400), for instance, we learn that "Rhythm is simply the tempo in which the composition is written. It is the background for the melody..."
11 Learned Hand also devised the "abstractions test," which he applied to dramatic works in Nichols v. Universal Pictures (1929). Creative works are built from skeletal structures or "abstractions" that acquire protection only as a result of, and to the extent of, details added by the author. Applied to musical works, Hand's abstractions test conflicts with his comparative method insofar as aspects of rhythm, harmony, instrumentation, and phrasing that the comparative method would strip away are the very attributes ("details added by the author") that render a work copyrightable under the abstractions test.
12 "Arab Love Song," music by Silvio Hein, words by George Hobart, was published by Shapiro in New York in 1908; the defendant, Charles Harris, published "I Think I Hear a Woodpecker."
13 Baxter v. MCA (1987). The chart is reproduced in Maureen Baker's "La[w] - A Note to Follow So: Have We Forgotten the Federal Rules of Evidence in Music Plagiarism Cases?" Southern California Law Review 65 (1992): 1615.
Some visual representations of melodic comparisons are so vague they have no evidentiary value. In Chiate v. Morris (1992) the Court upheld the District Court's ruling that plaintiff's graph, purporting to plot similarities between Stevie Wonder's "I Just Called to Say I Love You" and plaintiff's
"Hello, It's Me," in which the vertical axis represented pitch and the horizontal axis time, was irrelevant. In an earlier case, Baron v. Leo Feist (1948), the plaintiff presented the Court with a similar contour graph, but only to support another exhibit listing the solfège syllables of the notes of the two melodies at issue.14 On the solfège chart, which ignored rhythm and octave placement, each of the seven syllables was given a different color to underscore pitch concordances. In Baxter (1987) the defendant used a similar color-based exhibit (a sort of psychedelic piano-roll) in which rectangles of different colors and sizes represented pitches and their durations. In Selle v. Gibb (1983) the plaintiff used musical notation in his exhibits but stripped it to the most basic representations of melody and rhythm, couching them in universally familiar iconography (including such symbols as arrows and numerals) that is not specific to music.
Almost without exception, music-copyright infringement cases are ultimately judged upon melodic similarities between no more than a few measures. The simple methods of analysis mentioned above have been used to highlight these similarities, and courts have been receptive to this sort of evidence. The purpose of the following brief tour of prominent music cases from this century is to indicate some of the inconsistencies and peculiarities that have resulted from this reliance on primitive notions of melodic similarity as the benchmark for music-copyright infringement.
10.3 Some Case Decisions
In Boosey v. Empire (1915), an early Tin Pan Alley case, the Court found infringement in a similar five-note arched melodic motive that recurred in varied form several times in plaintiff's "I Hear You Calling Me" and in defendant's "Tennessee, I Hear You Calling Me."15
14 Chapter 3 of Louis Nizer's My Life in Court (Garden City, New York: Doubleday, 1961) discusses the evidence used in this case.
15 "I Hear You Calling Me" by Charles Marshall, with words by Harold Harford, was published by Boosey in 1908; John McCormack recorded it several times. Jeff Godfrey wrote "Tennessee, I Hear You Calling Me" to the lyrics of Harold Robe. It was published by Empire Music in 1914, and popularized, in part, through Al Jolson's performances.
The District Court, basing its findings on the musical similarities between the works, noted that even a
brief musical phrase, if important, can be the basis for a finding of substantial similarity. The opinion reflects copyright's economic underpinnings:
"The [musical phrase] 'I hear you calling me' has the kind of sentiment in both cases that causes the audiences to listen, applaud, and buy copies in the corridor on the way out of the theater" (Boosey, 647).
The Court did not acknowledge that the exact phrase on which it based its finding occurs only once in the plaintiff's song, nor did it consider the fact that this motive is akin to many others used to introduce a "yoo hoo" effect like that found in Rudolf Friml's "Indian Love Call" from a few years later (Examples 1a-c).16
Examples 1a-c. From Boosey v. Empire (1915).
Boosey gives short shrift to the possibility that the motives in both songs had a common public-domain origin. This defense is often used in music-copyright infringement actions and can effectively defeat a charge of infringement.
16 Words by Otto Harbach and Oscar Hammerstein II. Published as "The Call," by Harms, in 1924. To make comparisons easier to interpret, most examples have been transposed to C (or A minor).
In Hirsch v. Paramount Pictures (1937), for instance, the plaintiff sought half a million dollars in damages (in 1937!) claiming that Harry Revel appropriated eight bars of her "Lady of Love" in "Without a Word of Warning," which was used in defendant's motion picture "Two for
Examples 2a-j. From Hirsch v. Paramount Pictures (1937).
Tonight."17 Testifying for the defense, Sigmund Romberg convinced the Court that the plaintiff derived her melody from a well-known tune from the operetta Die Fledermaus. His testimony included an exhibit that aligned the melodic incipits of the Fledermaus tune with nine popular numbers, including the plaintiff's and defendant's (Examples 2a-j). The Court attached Romberg's exhibit to its opinion, noting that "[i]t is difficult to describe by words similarities or differences in musical compositions. They can be best illustrated by the music itself" (Hirsch, 818).
As in the case of Boosey, the alleged plagiarism in Hirsch devolved onto a handful of notes. But in Hirsch, these notes, as arranged by the plaintiff, could not meet the minimal standard of originality required for copyright protection because they so closely traced a public-domain melody, one the Court assumed the plaintiff was familiar with. Indeed, one might argue that the defendant's fox trot differs sufficiently from Strauss's waltz to support an independent copyright claim whereas the plaintiff's melody does not.
17 Mack Gordon wrote the lyrics to "Without a Word of Warning." It is extraordinary that Hirsch's suit made it to court. Her song was unpublished and she based access on a claim to have hummed her song in Revel's presence at a Hollywood restaurant (Hirsch, 818).
Examples 3a-c. From Heim v. Universal (1946).
Nearly ten years after Hirsch, the defendant in Heim v. Universal (1946) also successfully asserted a common public-domain source, Dvořák's "Humoresque," for his song "Perhaps," which the plaintiff claimed was based on his "Ma Este Még," a number used in a film that was seen by Hungarians in Europe and the U.S. in the late 1930s (Examples 3a-c). The Court majority's determination that the plaintiff's work was insufficiently original to preclude coincidence as a legitimate explanation for identity reflects the
significance that courts award to pitch similarities alone, which underlies Learned Hand's "comparative method." Judge Clark, though ultimately concurring with the holding of the majority in favor of the defendant, found Universal's common-source defense a disingenuous argument for independent creation of what he called a "Chinese copy" of a vital part of plaintiff's work. As Clark notes, given that the rhythm and harmonization of the Dvořák melody are entirely different from the defendant's, the majority's acceptance of the prior-art defense was extraordinary. Also, how likely is it that the defendant purposely used the ending (i.e., the third and fourth measures) of Dvořák's opening phrase as the basis of his number? Like most four-bar phrases, Dvořák's phrase can be subdivided into two-measure segments which, once heard in sequence, are not readily separated in one's memory.
While Dvořák's number has little in common with "Perhaps" or "Ma Este Még," the two songs have portions that are remarkably similar in pitch, rhythm, and harmony. The diminished chord in the fifth bar of both and the frequent use of chords with an added sixth create a distinctive, impressionistic major-minor ambiguity in a nearly identical harmonic progression over eight measures. The procedure militates against the defendant's claim of influence by Dvořák rather than Heim. This musical evidence, along with a reasonable showing of access, supports an inference of copying regardless of the prior-art defense.18
A few notes, as we saw in Boosey, can be the foundation of a successful infringement claim. Nine years after Boosey, Learned Hand, writing for the same court in Fred Fisher v. Dillingham (1924), held Jerome Kern liable for infringement of an eight-note accompaniment pattern appearing in the "burthen" of his "Ka-lu-a." The plaintiff had used the pattern, a broken chord with a passing note in the upper register, in "Dardanella," published two years earlier (Examples 4a-b).19 In Fisher, Hand recognized that the scope of popular numbers is limited by the narrow musical tastes of their audience. This fact, says Hand, justifies the use of a lower standard of originality to stake a copyright claim in a popular number than in a serious musical work.
18 Heim, after he wrote "Ma Este Még," shared an apartment with the producer who eventually used defendant's "Perhaps" in his film "Nice Girl?".
19 Felix Bernard and Johnny Black wrote the music, and Fred Fisher the words, to "Dardanella," published by McCarthy & Fisher in 1919. Kern's "Ka-lu-a," with words by Anne Caldwell, was written for the musical Good Morning Dearie. Harms published the song as a separate number in 1921.
Even a brief phrase in a
popular number may be protected if it is qualitatively important. Thus, a broken chord, which is an otherwise nonprotectable de minimis idea, satisfies copyright's prerequisite of originality when, as in Fisher, it is used in an original way.20 Finding that the accompaniment pattern in question appeared in such public-domain works as The Flying Dutchman and Schumann's Toccata, Judge Hand supported his finding of originality by asserting that although one finds the same figure in these existing works, the plaintiff was the first to use the pattern as an ostinato.21
Hand conceded it was unlikely that a prominent musician such as Jerome Kern deliberately misappropriated another songwriter's protected expression. The taking was unconscious, suggests Hand, but the plaintiff's right to prevent copying is not curtailed by the defendant's good faith.
Examples 4a ("Dardanella") and b ("Ka-lu-a").
20 Of course the less original the work, the shallower its protection. Keep in mind that copyrights, unlike patents, do not require a showing of "novelty," but rather "originality." A copyright prevents another from taking your protected expression, but not from enjoying the same rights to that expression if created independently of yours.
21 An ostinato can be melodic, harmonic, or rhythmic, or a combination of all three, as long as it is used repeatedly in a work or section of it. For Western music at least, "ostinato" implies greater melodic significance than that of the accompaniment pattern in Fisher, and Hand does not explain how he distinguished between use of the pattern as an ostinato and as an accompaniment. I picture Learned Hand by the Victrola, searching among the 78s for meaningful referents for Fisher, but it is unclear from the opinion whether the Schumann and Wagner examples were his invention or the defendant's suggestions. What was he referring to in Dutchman? The accompaniment to the title character's second-act love duet with Senta, "Was ist's, das mächtig in mir lebet," which reappears in the introduction to the third act, has an arpeggio accompaniment, but it does not resemble that of "Dardanella." Perhaps he had in mind the repetitive pattern in the Spinning Chorus, no doubt included in a 78 rpm "Highlights from Dutchman," but it has nothing in common with the "Dardanella" figure. The recherché reference to Schumann's Toccata is equally odd, since the "Dardanella" accompaniment pattern does not appear there either.
In any event,
because the economic value of "Dardanella" came from the melody, and not the accompaniment, Hand concluded that the claim was a trivial point of honor, with no actual damages involved. This was small comfort to Jerome Kern, mortified, no doubt, not by the nominal statutory damage award against him, but by the tarnishing of his professional reputation.
The financial stakes were higher in Bright Tunes v. Harrisongs (1976) which, like Fisher, found the defendant liable for unconscious infringement. According to the Court, George Harrison misappropriated virtually the entire musical portion of "He's So Fine," a song by Ronald Mack, made popular through a recording by the Chiffons in the early Sixties, and three-quarters of the earnings of Harrison's "My Sweet Lord," about $1,600,000, were attributable to this misappropriated melody.22 The defendant's testimony of independent creation notwithstanding, the Court found that Harrison must have copied the plaintiff's song because, except for one phrase, the melody and harmony of "My Sweet Lord" were "identical" to that of "He's So Fine." Also, the little "Hallelujah-Hare Krishna" phrase for the back-up singers that is periodically superimposed over the principal melody parallels the "Dulang-dulang" riff that introduces "He's So Fine."
Examples 5a ("He's So Fine") and b ("My Sweet Lord").
The melodic and harmonic similarities asserted in Bright Tunes boil down to the repeated use in both songs of a three- and a five-note motive, set to II-V and V harmonies respectively (Examples 5a-b).
22 The plaintiff Bright Tunes published "He's So Fine" (New York, 1962). In the subsequent case on damages, ABKCO v. Harrisongs (1981), the court drastically reduced collectible damages after determining that Allen Klein (ABKCO), Harrison's ex-manager, who had purchased from Bright Tunes the right to collect against his former employer, had breached his fiduciary duty to Harrison.
Even combined, these motives and chords strain the minimal standard of originality required for a
copyright claim in popular music and merit the thinnest layer of protection that prevents only near-literal copying. The plaintiff's two-pitch "Dulang-dulang," which does not appear in the published sheet music, barely contributes to the song's originality. Simple background responsive calls like this are found in innumerable popular songs from throughout the century, and the "Hallelujah" mantra in Harrison's song squares with this tradition. The Court acknowledged the simplicity of the motives but was swayed by the fact that they were repeated and used in the same order in both songs. And, though the complaint did not involve the words of the songs, the vague association between the expression of a teeny-bopper's crush on a cute, wavy-haired boy in "He's So Fine" and George Harrison's dreamy "I really want to see You...Really want to be with You" may have colored the Court's findings.
Verbal similarities have played a more prominent part in Repp v. Webber (1994, 1995, 1996), the ongoing dispute over Lloyd-Webber's Phantom of the Opera. Ray Repp, a writer of simple religious music, has claimed that the theme song of Phantom was derived from "Till You," a song he wrote in the late Seventies. After losing at trial, Repp appealed, prompting Lloyd-Webber to counter with a claim that Repp's "Till You" was taken from Webber's "Close Every Door," from Joseph and the Amazing Technicolor Dreamcoat (1968) (Examples 6a-c). Holding for Repp in the counterclaim, the Court did not subscribe to Webber's suggestion that the door metaphor and the theme of turmoil alleviated through divine intercession, both common to the two songs, indicated copying.
Examples 6a ("Till You"), b (Phantom of the Opera), and c ("Close Every Door").
But, because the Court also spurned suggestions of musical similarities between "Till You" and "Close Every Door," Webber hoped that this rejection might ultimately help to debilitate Repp's similar claims of striking similarity between "Till You" and Phantom.23
In his countersuit Webber emphasized the similar melodic contour of the opening phrases, as well as the consecutive fundamental pitches that occur in the second phrases of both songs. The same order of the phrases within the two songs and the rhythmic and harmonic similarities between the second phrases, according to Webber, indicate misappropriation. The Court thought otherwise, finding that the verbal, metrical, and rhythmic differences between the first phrases made the songs "vastly different," despite their similar intervallic skeletons. The "rising arpeggios and descending tetrachords" of the second phrases are not a solid basis for a finding of infringement because these structures are "among the most common devices used in music" (Repp 1996, 116).
23 The reactions of Webber's legal counsel to this ruling are mentioned in "Who Copied Whom? Ruling Implies Neither," New York Times, 5 December 1996. On December 30, 1997 the Second Circuit Court handed Webber a serious setback when it reopened factual determination by overturning the District Court's decision granting Webber's motion for summary judgment (Repp v. Webber, Nos. 96-9691, 97-7050, U.S. App. LEXIS 36366). According to the Circuit Court, the lower court erred in rejecting the evidence Repp presented in his attempt to establish, by inference, Webber's access to Repp's "Till You" from the striking similarities Repp's experts claimed between "Till You" and Phantom. This evidence included testimony by H. Wiley Hitchcock and James Mack that the basic rhythmic and metrical character of both works was identical, and that although the two pieces were written in different harmonic modes, they reveal an absolute identity in harmonic rhythm. The Court noted that Mack's testimony also took into account his opinion that aural impressions are more significant in commercial music than in symphonic music because the former is designed for the lay listener who derives his impression from what he hears (id. 10-12). Are these assertions pertinent to the question of striking similarity? The rhythmic and metrical character of, say, a waltz by Franz Lehár and one by Johann Strauss Jr. may be identical without the slightest suggestion of misappropriation. The "identical harmonic rhythm" between Repp's and Webber's songs amounts to nothing more than the fact that the opening measures of both songs contain one harmonic change per measure. The Court's mention of Mack's testimony about "aural impressions" in connection with his conclusion that independent creation of the two works would have been impossible is perplexing. The expression "aural impression" brings to mind non-copyrightable "look and feel" elements of a tangible or visible work. Mack is right that aural impressions are what remain with the musically unskilled listener of popular works, but these impressions are typically grounded in non-copyrightable or minimally protectable elements like performance style, instrumentation, and simple motivic and rhythmic formulae.
The rhythmic, harmonic, and structural similarities do not override the different "core personality and character," explicated through the
defendant's "holistic" analysis (which was warmly received by the Court) that identified the different meters, modes, and rhythmic figures of the two songs. Applying the Court's approach to Repp's original claim, one reaches the same conclusion: the personalities of the songs are different and their audible similarities involve no more than commonplace elements. The contours of the opening phrases of Phantom and "Till You" are more similar to each other than those of the first four measures of "Close Every Door" and "Till You" because of the corresponding repeat of the opening motive, one step down, beginning in the third measure of each song. But the rhythmic similarity between Repp's song and Webber's Phantom is nothing more than a pervasive upbeat of three even pulses followed by a longer dotted-note downbeat, a rhythmic cliché from time immemorial. Given the simple chord progressions of both songs, it is not surprising that the District Court opinion does not address harmonic similarities, although it is slightly coincidental that both use a naive-sounding flattened leading-tone chord followed by the tonic within the first eight measures.

In 1990 the much-maligned John Williams was finally exonerated of Leslie Baxter's charge that he misappropriated the signature theme of his "E. T." score from Baxter's song "Joy" (Baxter v. MCA 1987, 1990) (Examples 7a-b).24
Examples 7a ("Joy") and b ("Theme from E. T.").

24 Some years before Baxter, Williams faced a similar infringement charge in Ferguson v. NBC (1978), in which the court determined that Wilma Ferguson failed to establish a reasonable possibility of access by John Williams to her song "Jeannie Michele," which she claimed was infringed by his background music for the television program "A Time to Love." The music of John Williams and Andrew Lloyd-Webber, a slumgullion of popular and classical styles, is a natural target for those who would find in their works elements purportedly derived from existing works. See, e.g., Michael Walsh, "Has Somebody Stolen Their Song?," Time, Oct. 19, 1987, 86.

Williams did not have to rely on the rhythmic and melodic differences between his and the plaintiff's themes, or the fact that strikingly similar motives can be found in prior works, because it was ultimately determined that the creativity of Baxter's motive was too minimal to support a copyright
claim. It is remarkable that Baxter pursued his appeal after the District Court's withering reaction to his claim: "The Court cannot hear any substantial similarity between defendant's expression of the idea and plaintiff's. Until [the expert's] tapes were listened to, the Court could not even tell what the complaint was about..."25

Perhaps the most publicized music copyright case in recent years, Selle v. Gibb (1983) involved the Bee Gees' "Saturday Night Fever" hit "How Deep Is Your Love." Without reaching the issue of misappropriation, the Court overturned a jury verdict for the plaintiff because the Bee Gees affirmatively established at trial that they had no access to Selle's song "Let It End" before writing "How Deep Is Your Love." The plaintiff had tried to compel the Court to infer access, despite contrary evidence, by establishing striking similarities between his and the Bee Gees' song. Plaintiff's expert produced exhibits that purportedly identified numerous melodic and rhythmic identities between the songs (Example 8).26 Looking beyond plaintiff's qualitative findings, which are not probative, we find that even had Selle proved access by the defendants, by examining the songs within a broad context, the Court could have comfortably dismissed the claim, for the similarities devolve onto two commonplace musical ideas which, even when combined, demonstrate little originality. The eight-bar opening theme of "Let It End," which is the melodic kernel of Selle's song and the foundation of his claim, contains four statements of a two-measure motive which suggest a progression from tonic to dominant harmony. The second, or closing, theme of "Let It End," which the plaintiff also claimed was strikingly similar to that of the Bee Gees' tune, is essentially a mirror image of the first. Comprising three iterations of a descending two-measure arched motive that rises a third, falls a fourth and lingers on its final pitch, this segment simply reverses the melodic and harmonic direction of the opening eight measures of the song, returning to the tonic of the first measure.

25 Quoted by the Appeals Court in Baxter v. MCA (1987, 423). In Smith v. Michael Jackson et al. (1996) the unsuccessful plaintiff unabashedly based his claim of infringement on alleged motivic similarities alone.

26 The defense did not counter the plaintiff's expert with their own, claiming that Aarand Parsons, a music theory professor at Northwestern University, was not a qualified expert in popular music.

The prefiguring of the contour of this motive in the earlier theme, and the cadential flavor of the predictable descending sequence, discount any
Example 8. Selle v. Gibb
significance attached to the fact that the allegedly similar themes occur in the same order in both songs.27 If the copyright standard of originality is lower for popular than for serious music, the scope of protection for works that depend on this lower standard should be similarly decreased. Otherwise, extending copyright protection beyond literal copying to tunes of such predictable and generic cut as Selle's "Let It End" would chill innovation in popular music, an industry whose output is already restricted by its consumers' narrow tastes and, increasingly, its creators' limited musical skills.

Not all music-copyright infringement claims have as little merit as the previous paragraphs may suggest. In Baron v. Leo Feist (1948), for example, the evidence convincingly showed that defendant's profitable "Rum and Coca-Cola" was derived from an earlier calypso number called "L'Année Passée" (Examples 9a-b). The plaintiff's proof went beyond melodic coincidences that are the flimsy platforms of most infringement suits, demonstrating both general and specific musical similarities as well as a reasonable inference of access. Previous litigation had established that the lyrics of "Rum and Coca-Cola" were misappropriated from Mohamed Khan's "Rum and Coca-Cola" (Khan v. Leo Feist 1947), and several copies of "L'Année Passée" had been added to the CBS library, which the defending songwriter, who was CBS Music Director at the time, used often. Even acknowledging that the jagged melodic and syncopated rhythmic aspects of both songs reflect generic calypso traits, three factors justify the Court's sweeping conclusion that

...[t]he uninterrupted sequence of identical notes is too great to admit of any other inference but copying. The rhythm, construction and the harmony of both songs are little short of identical. To the lay ear substantial identity is the only inference (Baron, 689).

27 Despite obvious similarities in overall rhythm and contour between the melodies of "Let It End" and "How Deep Is Your Love," the latter is more finely wrought. The slight rhythmic and pitch variations of the basic motive in the Bee Gees' tune avoid the flat-footed sound of the literal motivic repetition of "Let It End"; the fall of a third at the end of the two-measure motive in "How Deep Is Your Love" contributes to a line that is more lyrical than that of "Let It End," with its four plaintive descending skips of a fifth that end each two-measure phrase in the opening eight bars. Even the upward skip of a fourth that is part of the upbeat that opens the Bee Gees' melody, which has no counterpart in Selle's work, contributes to a jaunty effervescence that is absent from plaintiff's number.
Examples 9a ("L'Année Passée") and b ("Rum and Coca-Cola").
These are: the melodic correspondence between the eight-measure phrase that is the cornerstone of both songs, the defendant's use of idiosyncratic harmony (a diminished-seventh chord built on the raised tonic in the third measure) at the same point as plaintiff, and the fact that both works are in the same key.28 The fact that all of defendant's work was derived from plaintiff's musical and verbal expression distinguishes this case from similar claims. By publishing "Rum and Coca-Cola" the defendants preempted potential sales for "L'Année Passée," which could have enjoyed the same success with an identical audience, had it been marketed with the defendant's resources.

28 Courts usually give little importance to the fact that the complaining and defending songs are in the same key, but Maurice Baron argued that C major, while an appropriate key for a sad song like "L'Année Passée," was an odd choice for defendant's upbeat song, which normally would be set (by calypso musicians at least) in a higher key. It is a curious point, and one could argue that C major is even more inappropriate for a sad song like "L'Année Passée" than for the jolly "Rum and Coca-Cola," but consider Maurice Baron's statement in the foreword to the collection in which "L'Année Passée" was first published: "...Even the tragedies of natives are sung in a gay major key, as proof of their optimistic philosophy..." (Calypso Songs of the West Indies [New York: M. Baron, 1943]). (There was no familial connection between plaintiff Maurice Baron, who wrote and published "L'Année Passée," and Paul Baron, the defendant who wrote "Rum and Coca-Cola.") Though he never admitted it, Paul Baron obviously copied "L'Année Passée" but was probably misled by statements in the published text accompanying the number to the effect that it was in the public domain. A principal at CBS, Paul Baron had access to potent marketing and production channels. Minus these forces, and a hit recording by the Andrews Sisters, the commercial success of the plaintiff's "L'Année Passée," tucked away as part of a collection of songs, would have been negligible. Had the defendant admitted copying from the start, the dispute could have been settled early on to the benefit of both parties. Keyt ("An Improved Framework") argues against the ruinous injunctive relief granted in cases like Baron in favor of compulsory licensing orders that would allow both parties to enjoy the success of what was essentially a joint effort.

10.4 Conclusion

If, in evaluating musical copyright infringement claims, courts were to consider, more than they have heretofore, a broader range of elements that contribute to the commercial success of a song, opportunistic pop-star aspirants would have less incentive to bring infringement charges against prominent musical entertainers, knowing that the court will assume that the alleged melodic similarity between the two songs in question is but one of many elements contributing to the defending song's success. Performing
style, visual appeal, verbal content and audio engineering, for instance, are more important factors in the commercial success of a popular number today than they were fifty years ago. Unless courts begin to acknowledge these and other economically valuable contributions to current popular releases, music-copyright decisions favoring plaintiffs will further limit the already narrow stylistic confines of popular genres, and infringement claims will continue to be a lottery in which extravagant inferences may be drawn from a preoccupation with naive concepts of melodic similarity.

Cases Cited

ABKCO Music, Inc. v. Harrisongs Music Ltd., 508 F. Supp. 798 (S.D.N.Y. 1981)
Arnstein v. Edward Marks Music Corp., 82 F.2d 275 (2d Cir. 1936)
Arnstein v. Porter, 154 F.2d 464 (2d Cir. 1946)
Baron v. Leo Feist, Inc., 78 F. Supp. 686 (S.D.N.Y. 1948)
Boosey v. Empire Music Co., 224 F. 646 (S.D.N.Y. 1915)
Bright Tunes Music Corp. v. Harrisongs Music, Ltd., 420 F. Supp. 177
Chiate v. Morris, 972 F.2d 1337 (9th Cir. 1992)
Ferguson v. NBC, 584 F.2d 111 (5th Cir. 1978)
Fred Fisher, Inc. v. Dillingham, 298 F. 145 (S.D.N.Y. 1924)
Gaste v. Kaiserman, 683 F. Supp. 63 (S.D.N.Y. 1988)
Heim v. Universal Pictures Co., 154 F.2d 480 (2d Cir. 1946)
Hein v. Harris, 175 F. 875 (C.C.S.D.N.Y. 1910)
Hirsch v. Paramount Pictures, Inc., 17 F. Supp. 816 (S.D. Cal. 1937)
Intersong-USA v. CBS, Inc., 757 F. Supp. 274 (S.D.N.Y. 1991)
Jollie v. Jacques, 13 F. Cas. 910 (C.C.S.D.N.Y. 1850)
Khan v. Leo Feist, 70 F. Supp. 450 (S.D.N.Y. 1947)
Midler v. Ford Motor Co., 849 F.2d 4690 (1981)
Moore v. Columbia Pictures Co., 972 F.2d 939 (8th Cir. 1992)
Nichols v. Universal Pictures Co., 34 F.2d 145 (S.D.N.Y. 1929)
Northern Music Corp. v. King Record Distribution Co., 105 F. Supp. 393 (S.D.N.Y. 1952)
Repp v. Lloyd-Webber, 858 F. Supp. 1292 (S.D.N.Y. 1994); 892 F. Supp. 552 (S.D.N.Y. 1995); 947 F. Supp. 105 (S.D.N.Y. 1996)
Selle v. Gibb, 567 F. Supp. 1173 (N.D. Ill. 1983)
Sid & Marty Krofft Television Productions, Inc. v. McDonalds Corp., 562 F.2d 1157 (9th Cir. 1977)
Sinatra v. Goodyear Tire & Rubber Co., 435 F.2d 711 (9th Cir. 1970)
Tempo Music, Inc. v. Famous Music Corp., 838 F. Supp. 162 (S.D.N.Y. 1993)
11 Judgments of Human and Machine Authorship in Real and Artificial Folksongs

Ewa Dahlig
Helmut Schaffrath Laboratory
Polish Academy of Sciences
ul. Dluga 16/29
PL-00950 Warszawa skr. 994
Poland
eda@plearn.edu.pl

Helmut Schaffrath
[formerly of the Hochschule für Musik, Essen University, Germany]

Abstract

In a series of experiments in which listeners were presented with natural and artificial folksongs, perceptions of the nature of the composition varied with the reception of the music itself. The qualities which listeners most readily associated with "original folksongs" were rhythmic similarity of phrases, a final cadence on the first degree of the scale, and an intermediate phrase that did not begin on the first degree. Overall, listeners were most likely to judge the songs they liked best to be "original."
In 1992 and 1993 a series of listening experiments was conducted in several socially and geographically diverse venues to determine what role the reception of a melody might play in evaluations of whether the music was natural (i.e., composed by a human being) or artificial (i.e., composed by a computer). Collectively, these experiments were called Kompost, to suggest the investigation of melodic recycling. The 13 melodies used in this test were all derived from the Essen databases of folksongs. Only two were presented as stored, that is, as "original folk melodies." Each of the others was a fabrication made from phrases extracted from multiple songs in the database and recombined to form a new work. The melodies, shown in Figure 1, were all short. They were presented aurally from pre-recorded MIDI files.
Figure 1. Thirteen melodies used in the Kompost experiments.
Some variables that were tracked were the age, gender, national origin, and musical skill (trained or untrained, and if trained, whether in vocal or instrumental music) of the respondents. The effect of the order in which the melodies were presented was also tracked. The questionnaire is shown in Figure 2.

Figure 2. The questionnaire used in the Kompost experiments.
The answers given by respondents were evaluated in two ways. First, the correlation of particular musical characteristics with judgments was evaluated. Then correlations between social factors and judgments were investigated.

11.1 Correlations of Judgment with Musical Features

Three musical influences on judgment, in descending order of effect, were noted:

(1) If both phrases were rhythmically similar, the melody was more likely to be judged "original."
(2) If the last note of the melody was on the first degree of the scale, the melody was more likely to be judged "original."
(3) If the second phrase did not start on the first degree of the scale, the melody was more likely to be judged "original."

With regard to the last, we hypothesize that melodic construction exhibiting some kind of "harmonic tension" is more persuasive than a construction that is harmonically static.

In terms of the features evaluated, we would group the melodies according to their degree of acceptance as follows:

(1) Generally accepted: melodies 2, 8, 1, and 6.
(2) Generally not accepted: melodies 9, 4, 13, 11, 5, and 10.

The reception of melodies 3, 7, and 12 was mixed. The original melodies were Nos. 6 and 10. Thus it can be seen that the overall acceptance was not predictably related to authenticity. In fact, we synthesized two new melodies (Figures 3a-b) incorporating all the features that listeners seem to find most persuasive. These were not used in the experiments.
Figures 3a-b. Two artificial melodies incorporating features found to be most conducive to a judgment of authenticity.

11.2 Correlations of Judgment with Social Factors

Subjects were from highly diverse groups. These included Hauptschüler (students at the beginning level of German secondary education), students in a music-software course at Essen University, statisticians attending an international meeting, musicologists attending an international congress, Chinese music-conservatory teachers and students, and several other categories of individuals.

Our main finding was that, overall, identification of a melody as "original" or artificial correlated more highly with liking the melody than with particular social factors. However, there were some significant differences among selected groups, particularly in the first round of experiments (Figure 4).
Figure 4. Differences of evaluation and reception among diverse educational and professional groups.

Some findings related to particular melodies were these:

(1) No. 1 was found to be "original" and "good," particularly by the 31-64 age group and by those with musical training.
(2) No. 11 was scored highly by Chinese subjects as "original" and "good."
(3) Male subjects were more likely than female subjects to consider the same melodies both "artificial" and "good."
(4) Female subjects particularly rated No. 3 "original" and "good," while rating No. 8 "artificial" and "good."
(5) Young listeners typically rated No. 6 (one of the two authentic melodies) "original" and "not good."
(6) Listeners aged 30 and under typically rated No. 5 "original" and "good."
(7) Those trained as instrumentalists were less likely to rate No. 9 "original" and "good" than those trained in singing.

In general, young and Chinese subjects expressed a more favorable reception of the computer-composed melodies. Instrumentalists also indicated a better reception of these works than non-instrumentalists. Older subjects and musical "experts" were less likely to express dislike, younger subjects more willing to express it.

The songs were presented in two groups and the order of the groups was varied from locale to locale. The first song presented was likely to be considered "original" even if, when placed in a different position, it was generally thought not to be original.

A detailed report on this study is available from Ewa Dahlig.
IV. ONLINE TOOLS FOR MELODIC SEARCHING
12 MELDEX: A Web-based Melodic Locator Service

David Bainbridge
Department of Computer Science, University of Waikato
Private Bag 3105
Hamilton, New Zealand
davidb@cs.waikato.ac.nz

Abstract

In a normal library situation, an online bibliographical service provides the ability to search a text database of authors. In a music library it supports searches for names of composers and titles of works. Finding tools for locating tunes for which the title or composer is not known to the user have generally been based on text representations of melodies [which are discussed elsewhere in this issue], but users often consider these tools to be counter-intuitive. MELDEX, a computer-music service developed at the University of Waikato, New Zealand, is now available over the World-Wide Web. It is designed to allow users to search a database of encoded melodies by singing the item they wish to identify. MELDEX forms part of the New Zealand Digital Library maintained at Waikato University.
MELDEX has been developed by a team of musicians and computer scientists (including Lloyd Smith, Ian Witten, Rodger McNab, and David Bainbridge) over the past three years. The search capabilities that it provides are available free of charge via the World-Wide Web. To access data stored in the MELDEX system (see Figure 1), a user needs to have a sound-input device, with appropriate software support, attached to a computer that provides Web access. Sound-tools for UNIX workstations, PCs, and Macs are available from the same site.1

The user first records himself singing (or humming or playing on an instrument) the remembered fragment of a song and enters the filename of his recording as Step 1. Then the user selects the database or databases he wants to search (Step 2) before pressing the "submit" button at the bottom of the page (Step 3). After a delay of a few seconds a list of possible matches is returned. The user can selectively listen to the "answers" and print them out as sheet music, which is typeset (albeit in a simple fashion). Figure 1 shows the default query page. An advanced query page, which provides the user with greater control over the matching process, also exists. It can use or ignore note durations, allow adaptive tuning, and so on.

To develop this system, three important components were required: (1) a database of melodies, (2) software that transcribes audio into melodic notes, and (3) an algorithm that numerically ranks the similarity of two tunes. For a composite view, see Figure 1.

1 This archive includes MiXViews (MXV) tools for playing and recording on the Solaris, LINUX, Silicon Graphics, and NeXT versions of UNIX; SOund eXchange (SOX) conversion tools for UNIX and DOS; WAVmaker for Windows 3.x, 95, and NT; and three sound utilities for the Mac: AIFF Recorder, SoundApp, and Sound Manager.

12.1 Databases

Currently we have four databases of encoded musical materials. These represent North American/British, German, Chinese, and Irish folksongs. The encodings derive from two sources: the Digital Tradition database (Greenhaus 1994) and the Essen Data Package (Schaffrath 1992; CCARH 1995). Together these resources provide 9,400 melodies. Access to on-line musical databases is scarce, so we are grateful for the cooperation of these two projects. In the
future we hope to build new collections of on-line music using our Optical Music Recognition (OMR) system Cantor (Bainbridge 1997).
Figure 1. A sample MELDEX query session. For enlargements of each page, see Figures 2 and 3.
12.2 Software

The real-time transcription process that facilitates comparisons of sung material to stored data uses amplitude (loudness) to determine where each note starts and ends. The user must therefore provide a sufficient drop in amplitude between notes when singing. It is recommended that each note be sung to the syllable ta.
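The reliance on amplitude can be illustrated with a minimal sketch. It is not the MELDEX transcription code: the function name `segment_notes`, the frame-based amplitude envelope, the single fixed threshold, and the minimum note length are all assumptions chosen for illustration.

```python
def segment_notes(envelope, threshold=0.1, min_frames=2):
    """Split an amplitude envelope into notes.

    envelope   -- list of per-frame loudness values (hypothetical input format)
    threshold  -- loudness below which a frame counts as a gap (assumed value)
    min_frames -- shortest run of loud frames accepted as a note (assumed value)

    Returns a list of (start_frame, end_frame) pairs, end exclusive.
    """
    notes = []
    start = None  # frame index where the current note began, if any
    for i, amp in enumerate(envelope):
        if amp >= threshold and start is None:
            start = i  # amplitude rose above the threshold: note onset
        elif amp < threshold and start is not None:
            if i - start >= min_frames:
                notes.append((start, i))  # amplitude dropped: note ends here
            start = None
    # Close a note that is still sounding when the recording stops.
    if start is not None and len(envelope) - start >= min_frames:
        notes.append((start, len(envelope)))
    return notes


# Two notes separated by a clear drop in loudness (as when singing "ta ta").
separated = [0.0, 0.5, 0.6, 0.5, 0.02, 0.01, 0.4, 0.5, 0.4, 0.0]
# The same duration sung without any drop between the notes.
legato = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

print(segment_notes(separated))
print(segment_notes(legato))
```

In this sketch the second input yields a single long note, which suggests why a consonant such as "t" at the start of each syllable helps: it forces the drop in amplitude that a threshold-based segmenter needs.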
Figure 2. MELDEX search options.
12.3 Search Algorithm

The last component requires the comparison and probability ranking of tunes. This is handled by the MR software. [See the article on "Sequence-Based Melodic Comparison" by Smith, McNab, and Witten elsewhere in this issue for a detailed description of search algorithms.] The first ten of 74 possible matches for a user-provided sample are shown in Figure 3.
Figure 3. Listing of the first ten of 74 potential matches for a sample melody provided by user.
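As a rough illustration of ranking tunes by similarity with dynamic programming, the sketch below computes a unit-cost edit distance between two interval sequences and sorts a small collection by that distance. This is a generic textbook technique, not the MR software: the function names, the choice of semitone intervals as the compared units, the unit costs, the whole-sequence comparison, and the tiny database are all assumptions made for the example.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between two sequences, by dynamic programming.

    prev[j] holds the distance between the items of `a` consumed so far
    and the first j items of `b`.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]  # deleting the first i items of `a` to reach an empty prefix
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank_tunes(query, database):
    """Return tune titles ordered from most to least similar to the query."""
    scored = [(edit_distance(query, tune), title)
              for title, tune in database.items()]
    return [title for _, title in sorted(scored)]


# Hypothetical database: titles mapped to semitone-interval sequences.
database = {
    "tune a": [2, 2, -4],
    "tune b": [2, 1, -4],
    "tune c": [-5, 7, 0, 0],
}
print(rank_tunes([2, 2, -4], database))
```

A real melody index would presumably also match a short query against any position inside a longer tune and weight errors unequally; the sketch compares whole sequences only to keep the recurrence visible.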
MR offers two alternative methods for matching: (1) a simple, fast algorithm based on "state" matching, and (2) a slower, more sophisticated algorithm based on dynamic programming. The default is the simple, fast algorithm, but both are available through the advanced query page (see Figures 2 and 3).

12.4 The New Zealand Digital Library Project

The MELDEX system is part of a digital library project run by Waikato University. The establishment of a sophisticated on-line music library is only one of the goals (see Witten et al. 1998). Goals for the future include: (1) to support the recording of an input query directly in the user interface (most likely to be implemented using Java) rather than require the user to employ a separate application; (2) to expand the currently monophonic database of tunes to include polyphonic music; and (3) to integrate the ability to locate melodies based on a sung fragment with the more traditional text search on composer and title.

The MELDEX system can be accessed through the following URL: http://www.nzdl.org/meldex
References

Bainbridge, David, "Extensible Optical Music Recognition." Ph.D. thesis: University of Canterbury, Christchurch, New Zealand, 1997. See also http://www.nzdl.org/cantor.
Essen Musical Data Package. [4 3.5" PC diskettes.] Created by and under the auspices of Dr. Helmut Schaffrath, Essen University, 1982-94. With index of works by Kurt Kuchein. Menlo Park, CA: Center for Computer Assisted Research in the Humanities, 1995.
Greenhaus, D., "About the Digital Tradition." Webscript. 1994. http://www.deltablues.com/.
McNab, Rodger, "Interactive Applications of Music Transcription." M.Sc. thesis: University of Waikato, Hamilton, New Zealand, 1996.
McNab, Rodger, Lloyd Smith, David Bainbridge, and Ian Witten, "The New Zealand Digital Library MELody inDEX," D-Lib Magazine, May 1997. http://www.dlib.org/dlib/may97/meldex/05witten.html.
Schaffrath, Helmut, "The EsAC Databases and MAPPET Software," Computing in Musicology 8 (1992), 66.
Schaffrath, Helmut, "The Essen Associative Code: A Code for Folksong Analysis" in Beyond MIDI: The Handbook of Musical Codes, ed. E. Selfridge-Field (Cambridge: MIT Press, 1997), 343-361.
Schaffrath, Helmut, "The Essen Databases," Computing in Musicology 7 (1991), 30-31.
Smith, Lloyd, Rodger McNab, and Ian Witten, "Sequence-Based Melodic Comparison: A Dynamic Programming Approach," Computing in Musicology 11 (1997-98), xx-yy.
Witten, Ian, C. Nevill-Manning, Rodger McNab, and S. Cunningham, "A Public Digital Library Based on Full-text Retrieval: Collections and Experience," Communications of the Association for Computing Machinery, 41/4 (1998), 71-75.
13 Themefinder: A Web-based Melodic Search Tool

Andreas Kornstädt
Computer Science Division
Department of Human Interface Design
University of Hamburg
D-22527 Hamburg, Germany
lkornsta@informatik.uni-hamburg.de

Abstract

Themefinder, which is accessible through the CCARH MuseData Website, provides access to a database of over 2,000 monophonic representations of themes that cover a broad range of instrumental works from the seventeenth, eighteenth, and nineteenth centuries. Keyboard works, chamber music, and orchestral works are included. Search results are displayed graphically, with reference information.
CCARH's new Musedata website features a search tool that caters to the needs of musicologists and laypersons alike: the Themefinder. Themefinder offers a simple but powerful interface to a database of over 2,000 monophonic representations of themes that cover a broad range of printed instrumental works from the seventeenth, eighteenth, and nineteenth centuries. Keyboard works, chamber music, and orchestral works are included. Search results are displayed graphically and with reference information. At this writing (August 1998), 559 themes, covering the complete Beethoven repertory, are indexed.
Figure 1. Screenshot of Themefinder.
The search software, developed by David Huron, uses musical data in the Kern format. The user interface and website have been developed by Andreas Kornstädt. The opening user screen is shown in Figure 1.

13.1 Using Themefinder

Themefinder has the following location: http://musedata.stanford.edu/databases/themefinder

You may specify your theme as precisely (or fuzzily) as your memory permits. Let us assume that the only clues that you can come up with are the rough contour of the beginning of the theme (up, up, down, down) and the fact that it is from a work by Beethoven. So you restrict your search by composer (Beethoven), check the box next to "Gross Contour" and enter the contour into the adjacent field where "up" translates to / and "down" to \: thus //\\. After hitting the "Submit Search" button, you are confronted with 233 matches, a uselessly large result. If the option "anchored to the beginning of theme" is selected, only 24 matches are found, but the one you have been looking for is still not on the first page. By way of further refining the query, if a theme were believed to start on an A, that might be indicated next to "Pitch." With all three parameters set, there are now only two matches: (1) from the Violin Sonata No. 1 in D Major and (2) the desired match, which is the opening theme of Symphony No. 6 (Figure 2).

13.2 List of Features

In full, Themefinder offers the following ways of specifying melodies:

(1) by letter-name of pitch (e.g., ABbDCBbAGCFGABbAG), if the exact pitch-sequence, key, and enharmonic spelling are known;
(2) by pitch-class (e.g., 9A20A970579A97), if the exact pitch-sequence and key are known but not the enharmonic spelling;
Page 234
Figure 2. Two matches for the combined query (1) contour = //\\ (UUDD), (2) theme is anchored to start of work, and (3) first note = A.
Page 235
(3) by intervallic name (in tonal nomenclature) (e.g., +m2+M3-M2-M2-m2-M2-P5+P4+M2+M2+m2-m2-M2), if the exact interval-sequence and key are known;
(4) by intervallic size (in semitones) (e.g., +1+4-2-2-1-2-7+5+2+2+1-1-2), if the exact interval-sequence is known but not the key;
(5) by scale degree (e.g., 34654325123432), if the diatonic pitch-sequence and key are known but no chromatic information;
(6) by gross contour (e.g., //\\\\\////\\), if neither the exact nor the diatonic pitch-sequence nor the key is known, but a rough sequence of upward and downward steps is; or
(7) by refined contour (e.g., ^/vvvv\/^^^vv), where upward and downward steps can be subdivided into small (^, v) and big (/, \) steps.

The search software can treat several consecutive unisons as one item (e.g., the first theme from Beethoven's Symphony No. 5, which is 5553 in scale-degree format, will also show up when searching for 53), but in some circumstances the representation of each discrete event is more conducive to an accurate response (see Selfridge-Field, chapter 1 of this issue).

The repertory can be restricted by composer, genre (orchestral, concerto, piano concerto, violin concerto, chamber, string quartet, and solo piano), key (tonic note and mode can each be specified separately or set to "all"), and meter (numerical or literal: simple, compound, duple, triple, quadruple, irregular, mixed, simple duple, simple triple, simple quadruple, compound duple, compound triple, and compound quadruple). As demonstrated in Figure 1, all of the above elements can be arbitrarily combined.

On the results page, the number of matching themes is reported and the first 10 matches are displayed in musical notation, with full reference information (composer, name of work, opus or work number [if appropriate] and genre, movement number, and theme identification). If more than 10 themes have been found, the matches are grouped in sets of ten at the bottom of the page.
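The seven representations above run from most to least detailed, and the coarser ones can be derived from the finer ones. The Python below is a minimal sketch of that derivation, not Themefinder's own code (which this chapter describes only as built with Humdrum tools). It uses the example strings from the list. The function names, the "-" symbol for a repeated pitch, the digit B for pitch-class 11, and the two-semitone boundary between small and big steps are assumptions, chosen so that the chapter's examples are reproduced.

```python
from itertools import groupby

# Semitone offsets of the natural letter names above C.
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
DIGITS = "0123456789AB"  # assumed: A = 10 as in the chapter, B = 11 by extension


def pitch_classes(names):
    """Letter names such as 'Bb' or 'F#' -> one pitch-class digit per note."""
    out = []
    for name in names:
        pc = NATURAL[name[0]] + name.count("#") - name.count("b")
        out.append(DIGITS[pc % 12])
    return "".join(out)


def gross_contour(semitones):
    """Signed semitone intervals -> '/' (up), backslash (down), '-' (repeat, assumed)."""
    return "".join("/" if s > 0 else "\\" if s < 0 else "-" for s in semitones)


def refined_contour(semitones, small=2):
    """Like gross_contour, but steps of at most `small` semitones become ^ or v."""
    symbols = []
    for s in semitones:
        if s == 0:
            symbols.append("-")  # assumed symbol for a unison
        elif abs(s) <= small:
            symbols.append("^" if s > 0 else "v")
        else:
            symbols.append("/" if s > 0 else "\\")
    return "".join(symbols)


def collapse_repeats(sequence):
    """Treat consecutive identical symbols as one item, e.g. '5553' -> '53'."""
    return "".join(symbol for symbol, _ in groupby(sequence))


# Example strings (1) and (4) from the list of features.
names = ["A", "Bb", "D", "C", "Bb", "A", "G", "C", "F", "G", "A", "Bb", "A", "G"]
intervals = [+1, +4, -2, -2, -1, -2, -7, +5, +2, +2, +1, -1, -2]

print(pitch_classes(names))        # 9A20A970579A97
print(gross_contour(intervals))    # //\\\\\////\\
print(refined_contour(intervals))  # ^/vvvv\/^^^vv
print(collapse_repeats("5553"))    # 53
```

The printed strings agree with examples (2), (6), and (7) above, which suggests that a step of one or two semitones counts as "small" in the refined contour. The chapter does not state the threshold.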
A complementary text database currently in preparation by Maria Heifetz and Zoë Chafe facilitates searches by date, by place of publication or performance, by composer's generation (e.g., all composers born in 1685), and by other text parameters.
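The worked query of section 13.1 (contour, then anchoring, then first pitch) amounts to matching short query strings against stored representation strings. The following Python sketch illustrates that idea only. The `Theme` record, the `search` and `pages` functions, and the three toy themes are hypothetical. The real service matches queries against files prepared in advance with Humdrum tools, and its matching rules may differ.

```python
from dataclasses import dataclass

UP, DOWN = "/", "\\"


@dataclass
class Theme:
    label: str          # hypothetical identifier
    pitch: str          # letter-name representation
    gross_contour: str  # stored in advance, not computed at query time


# Toy data. Only the first pitch string comes from the chapter's feature list;
# the other two themes are invented for illustration.
THEMES = [
    Theme("toy-1", "ABbDCBbAGCFGABbAG", UP * 2 + DOWN * 5 + UP * 4 + DOWN * 2),
    Theme("toy-2", "CDEDC", UP * 2 + DOWN * 2),
    Theme("toy-3", "AGABCBA", DOWN + UP * 3 + DOWN * 2),
]


def search(themes, criteria, anchored=False):
    """Return themes whose stored strings match every (field, query) pair.

    anchored=True requires each query to match at the start of the theme;
    otherwise the query may occur anywhere in the stored string.
    """
    hits = []
    for theme in themes:
        for field, query in criteria.items():
            text = getattr(theme, field)
            found = text.startswith(query) if anchored else query in text
            if not found:
                break
        else:
            hits.append(theme)
    return hits


def pages(hits, size=10):
    """Split a result list into consecutive groups of `size` matches."""
    return [hits[i:i + size] for i in range(0, len(hits), size)]


uudd = UP * 2 + DOWN * 2  # the query //\\ from section 13.1
loose = search(THEMES, {"gross_contour": uudd})
anchored = search(THEMES, {"gross_contour": uudd}, anchored=True)
narrowed = search(THEMES, {"gross_contour": uudd, "pitch": "A"}, anchored=True)

print([t.label for t in loose])     # ['toy-1', 'toy-2', 'toy-3']
print([t.label for t in anchored])  # ['toy-1', 'toy-2']
print([t.label for t in narrowed])  # ['toy-1']
```

Each added criterion can only shrink the result set, which mirrors the narrowing from 233 to 24 to 2 matches in the worked example. Storing every representation as a plain string keeps matching to simple substring or prefix tests.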
Page 236
13.3 Technical Background

The themes contained in the database were originally encoded in Humdrum Kern format. Then, using Humdrum tools, each file was translated by Huron into the respective representations required by the individual searches offered by Themefinder (pitch, pitch-class, etc.). In this way, queries from the Web interface can be swiftly matched against those adapted representations without the invocation of various Humdrum commands. The images on the results page were generated using a program from the MuseData package, written by Walter B. Hewlett. Some post-editing and other contributions to the development of Themefinder were provided by Craig Sapp. Jane Singer has provided useful beta-testing and made several suggestions for improvements.

13.4 Future Plans

Due to its simple architecture, the repertory of Themefinder can easily be extended to include an almost unlimited number of themes. The current emphasis is on increasing the quantity of themes. For copyright reasons, only works published before 1925 can be included. Parties that are interested in co-operation should contact CCARH about details (ccarh@ccrma.stanford.edu). Future software extensions may support searches for rhythmic patterns and links to the MuseData database to download whole pieces (where available).

References

Huron, David, "Humdrum and Kern: Selective Feature Encoding" in Beyond MIDI: The Handbook of Musical Codes, ed. E. Selfridge-Field (Cambridge: MIT Press, 1997), 375-401.
Page 237
INDEX
A Abe, Junichi 182 aboriginal song 21 accentuation 33, 38 accompaniment pattern 198 Adams, Charles R. 23-24, 56 Agawu, V. Kofi 135, 138 Aho, A. V. 75, 79-80, 85, 97 Aho-Corasick algorithm 83 Aho-Corasick automaton 79, 84 Aitken, A. C. 70, 72 Akademie der Wissenschaften und der Literatur 120 alphabet 76-77 Alphonce, Bo 183 Altschul, S. 75, 97 Andrews Sisters 208 Apostolico, A. 75, 81-83, 97 appoggiatura 49 artificial intelligence 7 Asagawa, Gyokuto 183 ASCII macro input files 162 text files 161 audio data 77 audio input 24, 92
Bach, J. S. 141-144 cantatas 22-23 chaconne for unaccompanied violin 9 chorale prelude (BWV 669) 91-92 chorale melodies 5 melodic index 19-20 minuet from the B Major Partita 52-53 Orgelbüchlein 35 Prelude in C Major (WTC I) 52-53 Bainbridge, David 15, 223-229 Baker, Maureen 192 Baker, Nancy K. 7, 56 Bakhmutova, I. V. 37, 56 Balaban, Mira 39, 56 Barlow, Harold 14, 27, 56, 128 Baron, Maurice 208 Baron, Paul 208 Baroni, Mario 7, 36-37, 43, 56 Baroque Figuren 54 Bartlett, James C. 29, 56 Baxter, Leslie 192, 202 bebop 23, 41 Bee Gees, the 189, 203 Beethoven, Ludwig van 23, 232-233 Symphony No. 5 27, 235 Symphony No. 6 233 Violin Sonata No. 1 233 Berkman, O. 83, 97 Bernard, Felix 197 Bevil, J. Marshall 33-34, 57 Bharucha, Jamshed J. 57 Bilgen, S. 60
Binford-Walsh, Hilde M. 28, 57 biology, computational 75 biology, molecular 75 Black, Johnny 197 Böker-Heil, Norbert 122 Boolean operators 102 Boroda, Moisei 57 Boyer-Moore algorithm 79 Boyer, R. 79, 97 Brahms, Johannes Ballade (Op. 10, No. 4) 10 Deutsche Volkslieder (No. 15) 91 Braille musical notation 25 Braun, H. 154, 157 Breslauer, Peter 57 Brinkman, Alexander 27, 35, 57 Bruckner, Anton Symphony No. 7 32 Brunetti, R. 56 Bryant, Stephen C. 30-31, 57 Buch, D. 100
Page 238
C C (programming language) 160 cadenza 135, 137 Caldwell, Alice 197 Callegari, Laura 56 calypso 205 Cambouropoulos, E. 75, 78, 88, 97 Camilleri, Lelio 58 cantatas J. S. Bach 22-23 seventeenth-century Italian 36 Telemann 22-23 Carterette, Edward C. 29, 58, 61 Cavalcanti Lutebook 96, 100 CCARH (Center for Computer Assisted Research in the Humanities) 22, 184, 224, 236 Website 231 CCRMA (Center for Computer Research in Music and Acoustics) 184 Chafe, Zoë 235 chamber music 232 chanson 36 chant Ambrosian 29 ancient Greek 21 centonization 28 Liber Usualis 21 melodies 28 Chapman, Gary 30-31, 57 Chatull Gadol 28 Chiffons, the 199
Chomsky, Noam 7 Chopin, Frédéric mazurkas 133-134 chorale harmonization 141-143 melodies 5, 36, 91, 142 partitas 143 variations 142, 146-149, 153 Classical period 132 cluster analysis 102 cognitive science 142 computer science 142 Conklin, Darrell 142, 156-157 Cooper, Gordon W. 183 Cooper, Grover 58 Cope, David 7, 58, 102, 116, 129-138, 142 copyright infringement 192-194, 197, 199, 201-202, 205, 208 copyright search 115 Corasick, M. J. 79, 97 Cormen, T. H. 77, 97 counterpoint, fifth-species 49 Crawford, Tim 12, 73-100 Crerar, Alison 44, 58 Crochemore, M. 75, 81-82, 85, 88-89, 98 Cronin, Charles 187-210 Cunningham, S. 117, 229 Curwen, John 26, 58 Czumaj, A. 75, 98
D Da Milano, Francesco 96 Dahlig, Ewa 211-219
Dalmonte, Rossana 56 dance music, Irish 66 DARMS 25-27, 30, 33, 123-124 databases melodic 224 musical 168 folksong 224 Debussy, Claude 39 dendrogram 148-149 Densmore, Frances 6, 24 Die Fledermaus 196 difference algorithm 65-66, 69, 71 Digital Tradition, the 110, 224 Dillon, M. 102, 115-116 directional profiles 22-24 DNA 103 Dowling, W. Jay 29, 56, 58 Dvořák, A. "Humoresque" 196-197 Slavonic Dances 27, 31 dynamic programming 101, 103-104, 108-109, 111, 114-115 dynamics 160
E earmarks 129, 134-135, 137 Ebcioğlu, Kemal 5, 58, 142, 156 edit distance 85-86 Edo era (Japan) 171, 182 Edworthy, Judy 29, 59
Page 239
Ehrenfeucht, A. 81, 97 Eitan, Zohar 59 Ellis, Mark 59 EMI (Experiments in Musical Intelligence) 102-103, 129, 133 Mozart-style symphony 137-138 EsAC code 25-26, 35 EsSCORE 161 Essen databases 22, 25, 26, 34-35, 40, 46, 110, 161, 166, 212 analysis tools 35 software 18 Essen University 217 ethnomusicology 6, 23-24, 33 expert testimony (infringement suits) 190 exposition (sonata-form) 135
F Farach, M. 82-83, 97 Fecker, Adolf 164-165, 168 Feder, Georg 121 Ferguson, Wilma 202 Ferragina, P. 98 Feulner, J. 142-144, 157 Fibonacci strings 81 Fischer, Ludwig 121 Fisher, Fred 197 folk tune repertories 14 American 33 folksong databases 101, 110, 114-115, 224 Irish 65
melodies 142 original (vs. artificial) 212, 216-219 variants 115 Franzke, Ulrich 59 Frescobaldi, Girolamo 142 Friml, Rudolf 194 Froberger, Johann Jakob 142 Fux, Johann Joseph 4
G G. K. Saur Verlag 120 Galil, G. 87, 98 Galil, Z. 75, 80, 97-98, 111, 116 Gamom, Satoaki 183 Gasieniec, L. 98 Gaultier, Denis 92-94, 100 Giancarlo, R. 80, 98 Gingerich, Lora L. 59 Gish, W. 97 Gjerdingen, Robert 59, 135, 138 Glinka, Mikhail 6 Goad, W. B. 103, 116 Godfrey, Jeff 193 Goldstein, Paul 191 Gordon, Mack 196 grace notes 30-31 Greenhaus, D. 110, 116, 224, 229 Gross, Dorothy 31, 59 Guibas, L. 81, 98 Gusev, V. D. 56 Gustafson, Bruce 34, 59
H Haas, Max 28, 59 Halperin, David 29, 59 Hammerstein II, Oscar 194 hamming distance 85-86 Handel, G. F. Messiah 22 Organ Concerto (Op. 7, No. 1) 27 Harbach, Otto 194 Harford, Harold 193 HARMONET 144, 146, 148, 155-156 harmonic field 150, 152-153 tension 216 Harris, Charles 192 Harrison, George 189, 199-200 Harvard University 119 Harv. Univ. Lib. Hollis database 120 Hauptmann, Moritz 32-33, 59 Haydn, Franz Josef 122 edition 120 Institute 121 melodic index 30 Military Symphony 28 oboe concerto 137 variations on "Gott erhalte" 11 Heifetz, Maria 235 Hein, Silvio 192 Henderson, C. L. 117 Herl, Joseph 63 heuristics 84
Page 240
Hewlett, Walter B. 17-18, 60, 236 Hild, H. 142, 144, 157 Hiragana symbols 184 Hitchcock, H. Wiley 201 Hobart, George 192 Holland, Simon 63 Hopi melody 6 Hörnel, Dominik 141-157 Howard, John 15, 31, 119-128 Hughes, Andrew 28, 60 Humdrum 18, 32, 109, 160, 236 Toolkit 33 Hume, A. 79, 98 Hunter, M. 102, 115-116 Huron, David 24-25, 60, 103, 108-109, 116, 233, 236 hymn tunes 26
I Iliopoulos, Costas S. 73-100 IML-MIR 25-26 implication-realization theory 7, 50-51 incipits 14, 192, 196 instantiations 19-20 intervallic contour 19 intervallic profile 25, 29-30 intervals complementary 152 encoding 29, 150-153 representation 149, 151 inversion 102
İzmirli, Ö. 60
J Jackendoff, Ray 38, 60 Jackson, Michael 189 Jacoboni, Carlo 36-37, 43, 56 Jagger, Mick 189 Java 228 jazz 50, 143, 188 Jesser, Barbara 35, 60 Jolson, Al 193 Jones, Mari Riess 38, 60 Judge Clark 191, 197 Judge Learned Hand 188, 192, 197-198
K Kabuki theatre 170 Kanehisa, M. I. 103, 116 Kern 33, 233, 236 Kern, Jerome 197-199 kernel 43 Keyt, Aaron 188, 191, 208 Khan, Mohamed 205 Kim, Alice 190-191 Klein, Allen 199 Knuth, D. E. 79, 99 Knuth-Morris-Pratt algorithm 79 Koch, Heinrich Christoph 6-8, 60 Kohl, Donald V. 29, 58 Kohonen, T. 148, 149, 157 Koizumi, Fumio 172-173, 183 Kompost 212, 215
Korda, M. 88, 98 Kornstädt, Andreas 15, 231-236 Krumhansl, Carol L. 68, 72 Kruskal, J. B. 104, 117
L Laden, Bernice 60 Landau, G. M. 86-87, 99 LaRue, Jan 14, 27, 30-31, 60 Latman, Alan 190 Lehár, Franz 201 Leiserson, C. E. 97 Leppig, Manfred 41, 45-46, 60, 105, 116 Lerdahl, Fred 38, 60 Leshinskie, Matthew 59 Liber Usualis 21 Lincoln, Harry B. 14, 27, 30, 61 linguistics 7, 43 Lipman, D. 75, 97, 99 List, George 6, 61 Lloyd-Webber, Andrew 189, 201-202 Joseph and the Technicolor Dreamcoat 200 Phantom of the Opera 200 Logrippo, Luigi 102, 116 Lorentz, R. 81, 99 Love, James 26, 61 Lubej, Emil H. 24, 61 Lully, Jean-Baptiste 34 lute ricercare 96 tablature 92-94 transcription 91-92
Page 241
M MacGamut 18 Machida, Kasei 183 Macintosh (PowerPC 8500) 114 Mack, James 201 Mack, Robert 199 madrigals intervallic index 30 Main, G. 81, 99 Malm, William P. 183 Manber, U. 114, 117 Manns, Charles G. 63 Manzara, Leonard C. 142, 157 Marcello, Benedetto and Alessandro 22 Marshall, Charles 193 mathematics 142 McAll, May DeForest 19-20, 22, 61 McCormack, John 193 McNab, Rodger J. 101-118, 224, 227, 229 medieval repertories 28 Meidanis, J. 75, 100 MELDEX 223-226 melisma 164 melodic arch 24 comparison 6, 19, 25, 46, 50, 52, 66, 100-102, 115, 192 contour 24, 30, 43, 201 direction 38 kernel 36
matching 89 patterns 160, 171-172 prototype 46, 49, 52, 54 recycling 212 searching 166 similarity 70, 188, 193, 208 variation 142-143, 147, 155 melody artificial 217-219 chant 28 chorale 5, 36, 91 compound 9, 12, 52 disguised 8, 12 distributed 12, 89-90 Gaelic 21 Hopi 6 jazz 50 Mozartean 13 prototypical 8, 13, 30, 52 roving 11, 12 self-accompanying 9 submerged 10 mensural notation 29 Menzel, W. 142, 144, 157 metrical stress 66, 68, 72 Meyer, Leonard B. 7, 29, 36, 58, 61-62, 183 Microsoft Word 161 MIDI 16, 74, 76 files 212 pitch (key) numbers 25, 68, 76, 105 Miller, W. 97 misappropriation 189-192, 201, 203, 205
Monaghan, Caroline B. 29, 61 Mongeau, Marcel 61, 82-83, 86, 99, 103, 108-109, 115, 117 Moore, D. W. G. 81, 83, 99 Moore, J. 79, 97 Morgenstern, Sam 14, 27, 56, 128 Morris, J. 79, 99 motivic classes 148, 150, 154, 156 Mouchard, L. 81-82, 98 Mozart, W. A. 6, 49, 52, 109, 122, 137 edition 120-121 Piano Sonata (K. 284) 132 Piano Sonata (K. 311) 13, 46, 50 Piano Sonata (K. 331) 5 Piano Sonatas (K. 280, 330, 547a, 570) 131 Piano Concertos (K. 238, 449, 450, 482, 595) 135-136 Piano Concertos (K. 453, 459) 131 String Quartet (K. 575) 32 symphonies 138 Symphony in G Minor (K. 550) 43 Variations (K. 300e) 142, 156 Mozer, M. C. 142, 157 Murao, Tadahiro 183 MuSearch 159, 161, 166 MuseData 18, 22, 33, 236 database 236 music perception 7-8, 31, 43 music theory 8
Page 242
musical similarity (in copyright cases) 187, 189 musicologists 217-218 Muthukrishnan, S. 98 Myers, E. 97 Myers, Lincoln 18
N Nagauta 170-171, 181 Narmour, Eugene 7, 50-51, 61 Nettheim, Nigel 159-168 neumes 28-29 neural network 141-144, 147-148, 153, 156 Nevill-Manning, C. 229 New Zealand Digital Library 228 Nizer, Louis 193 Northwestern University 203 notation Braille 25 mensural 29
O Ó Maidín, Donncha 65-72 Odlyzko, A. 81, 98 orchestration 138 Orita, Akiko 184 Orpen, K. S. 103, 108-109, 117 Osterberg, Robert 189 ostinato 198
Pachelbel, Johann 141-143, 145, 148-149, 153-154, 156 Park, K. 82-83, 87, 97-99, 111, 116 Parsons, Aarand 203 Parsons, Denys 14, 21-22, 61, 128 Partita, secular 142 Pearson, W. R. 75, 99 Pentium processor 84 perception, music 7-8, 31, 43 Perrine 93, 100 Peyote Indian songs 21 phrase contour 24-25 segmentation 172 pitch contour 19, 24, 36 representation 16, 19, 25, 41, 68 string 45 Pitt, Mark A. 29, 58 plagiarism 191, 196 Plaine and Easie Code 33, 121-123, 127 Plath, Wolfgang 121 Polansky, Larry 183 Pont, Graham 21-22, 61 popular music 190 popular songs 191, 200 Pratt, V. R. 79, 99 Princeton University 25, 27 protected expression 191 Protestant church 142 psychology cognitive 7 Gestalt 7
R Ragg, T. 142-144, 157 Raman, Rajeev 73-100 Ratner, Leonard 10, 61 recapitulation (sonata-form) 135 Renaissance diminutions 54 representation music 108 pitch 16, 19, 25, 41, 68 Repp, Ray 200 retrograde 102 retrograde inversion 102 Revel, Harry 194 rhythmic patterns 169-170, 172-174, 176, 181 similarity 211, 216 theory 32 Riedmiller, M. 154, 157 Riemann, Hugo 62 RISM 14, 22, 31, 33, 119-121, 126-127 Central Office (Frankfurt, Germany) 119, 124 U.S. Office 119, 128 Rivest, R. L. 97 Robe, Harold 193 Rollin, M. 100 Romberg, Sigmund 196 rondo form 132 Rosner, Burton S. 29, 62 Royal, Matthew 60 Rytter, W. 75, 85, 88-89, 98
Page 243
S Sankoff, David 61, 82-83, 86, 99, 103-104, 108-109, 115, 117 Sapp, Craig 18, 32, 184, 236 Schaffrath, Helmut 26, 35, 40, 55, 62, 110, 117, 211-219, 224, 229 Schenker, Heinrich 52 Schlichte, Joachim 121 Schmidt, J. P. 86, 100 Schottstaedt, Bill 4, 62 Schubert, Franz 43, 164, 166 "Der Neugierige" 161 Die Schöne Müllerin 159, 161 Impromptu (Op. 142, No. 3) 9 Lieder 159 Rosamunde overture 32 Schumann, Robert 43 Toccata 198 SCORE 33, 159, 162, 184 macro files 161 SCRIBE 29 Seeger, Charles 23, 62 segmentation (of phrases) 172 Selfridge-Field, Eleanor 3-64, 76, 100 semiotics 43 set theory 27 Setubal, J. 75, 100 shamisen 169-170, 172, 181 signatures 129-133, 135, 142 Simil 109 Singer, Jane 236
Sisman, Elaine R. 63 slurs 160-161 Smaill, A. 75, 88, 97 Smith, Leland 184 Smith, Lloyd A. 101-117, 224, 227, 229 Smith, Matt 63 Smyth, W. F. 81, 99 solfegge 193 speech inflections 164 Spitzer, John 6, 63 Stainton, Murray 63 state matching 114, 116 statistics 70 Stech, David A. 63, 102, 117 Steinbeck, Wolfram 40-41, 63 Stepien, Bernard 102, 116 Stinson, John 29, 63 Strauss, Johann Jr. 196, 201 Die Fledermaus 196 stress 37-38 metrical 66, 68, 72 patterns 33 songs aboriginal 21 Hopi 6 Peyote Indian 21 popular 191, 200 strophic 163 style simulation 142 Sun UltraSPARC processor 84 Sunday, D. 79, 98 Swain, Joseph P. 32, 63
syncopation 50
T Tangian, Andranik 32, 63 Tchaikovsky, P. I. Symphony No. 6 12, 89-90 Telemann, G. Ph. cantatas 22-23 Temperley, David 63 Temperley, Nicholas 34, 63 Tenny, James C. 183 tenuto 152 Teton Sioux music 6 text underlay 159, 161 thematic catalogues 14 variations 109, 115 Themefinder 231-236 theory music 8 set 27 Thompson, William Forde 63 Tin Pan Alley 191, 193 Titkova, T. N. 56 Toiviainen, P. 143, 157 Traeff, J. 98 transposition 66, 69 triplets 109 Troost, Jim M. 29, 64 troubadour music 29 Trowbridge, Lynn 63
Page 244
U Ukkonen, E. 80, 85, 100 UNIX 114, 224 Urlinie 52
V Van Den Toorn, Pieter C. 63 Victrola 198 Vishkin, U. 86-87, 99 voice leading 92 Vos, Piet G. 29, 64
W Wagner, Richard 14 The Flying Dutchman 198 Waikato University 228 Walsh, Michael 202 Ward, John M. 64 Weiner, P. 79, 100 Wenk, Arthur B. 39, 64 Williams, J. Kent 23, 41-43, 49-50, 64 Williams, John 192, 202 Witten, Ian H. 101-117, 142, 156-157, 224, 227-229 Wonder, Stevie 189, 192 World-Wide Web 15, 120, 156, 223-224 Wu, S. 114, 117
Yako, Masato 169-183 Yeston, Murray 183 Yi, Suk Won 39, 43-44, 64
Z Zarhipov, R. K. 37-38, 64
Page 245
Computing in Musicology: Style Sheet

All submissions should be made both in hardcopy, following the indications given below, and in ASCII text on a 3.5"/1.44 MB diskette, with fonts stripped unless the text was prepared using WordPerfect. All graphics files should be sent as separate items on diskettes. Please do not send conversions to WordPerfect from other word-processing programs or PostScript files with graphics embedded in text files. Most musical examples are reset in-house to suit page formats.

Italics: Italicize titles of books, journals, and proceedings; titles of major texted musical works, such as operas; email addresses, names of computer directories and files, and titles of programs and specific versions of computer languages (e.g., Turbo Pascal), but not of languages (Pascal) or operating systems (UNIX). Instructions to be entered on a computer screen by the user should be in Courier font.

Titles: Titles of articles within books or journals, of short texted musical works, such as songs, and of nicknames for musical works (e.g., "Moonlight" Sonata) should be placed within double quotation marks. For titles in English, the main words should begin with a capital letter. Titles in other languages follow native style.

Names: In bibliographical references, please include first names of authors and editors as well as volume/issue numbers (in Arabic numerals) and page numbers of articles in journals and collected writings. Please observe the name order indicated below.

(1) Single author, book: Mazzola, Guerino. Geometrie der Töne. Basel: Birkhäuser, 1990.
(2) Single author, article in journal: Bel, Bernard. "Time in Musical Structures," Interface, 19/2-3 (1990), 107-135.
(3) Single author, article in book or proceedings: Morehen, John. "Byrd's Manuscript Motets: A New Perspective" in Byrd Studies, ed. Alan Brown and Richard Turbut (Cambridge: Cambridge University Press, 1991), pp. 51-62.
(4) Single author, thesis or dissertation: Diener, Glendon R. "Modeling Music Notation: A Three-Dimensional Approach." Ph.D. Thesis, Stanford University, 1991.
Page 246
(5) Multiple authors, article in journal: Hill, John Walter, and Tom Ward. "Two Relational Databases for Finding Text Paraphrases in Musicological Research," Computers and the Humanities, 23/4 (1989), 105-111.

Bibliographical listings, which should be limited to eight items, should be given in alphabetical order of the authors' surnames. Multiple references by the same author should be given alphabetically by title. Authors' full names are preferred. Longer bibliographies may be included in review articles.

Citations within the main text may give the author and the year only, e.g., "(Hill 1989)". If multiple writings by the same author occur in the same year, please append designations (Hill 1989a, Hill 1989b, etc.) to appropriate bibliographical citations in the references.