
10

Musical Syntax and Its Relation to Linguistic Syntax

Fred Lerdahl

Abstract

Music is meaningful, but there is no musical counterpart to the lexicon or semantics of language, nor are there analogs of parts of speech or syntactic phrases. This chapter seeks to establish a notion of musical syntax at a more fundamental level, starting from the view that syntax can be broadly defined as the hierarchical organization of discrete sequential objects which generate a potentially infinite set of combinations from a relatively small number of elements and principles (thereby extending not only to linguistic syntax in the usual sense but also to a syntax of phonology). The elementary musical objects in this approach are perceived pitches, chords, and rhythms. Sequences of musical events receive three types of structure: groupings, grids, and trees. Using a Beatles song as illustration, the formation of successive structural levels is described and issues of sequential ordering, the status of global structural levels, contour, and the question of psychological musical universals are discussed. The strongest correspondences between music and language appear to be between musical syntax and linguistic phonology, not musical syntax and linguistic syntax.

Background on Musical Syntax

Music has always been the most theoretically laden of the arts. In the Western tradition, early theorizing largely focused on details of tuning, scales (modes), and rhythmic proportion. In the Renaissance, theorists developed principles to control horizontal and vertical intervallic relations (Zarlino 1558), which gradually coalesced into the pedagogies of counterpoint and harmony still taught to undergraduate music majors today. The specification of scale and chord type and the treatment of dissonance constitute a sort of morphology of a musical style.

In the eighteenth century, Jean-Philippe Rameau (1726) proposed a syntactic treatment of harmonic progression linked to the newly discovered overtone series. Other theorists addressed larger levels of musical form, pursuing an analogy to Classical rhetoric (Mattheson 1739) and articulating phrasal forms (Koch 1793). The nineteenth century focused increasingly on chromatic harmony, culminating in Hugo Riemann's (1893) theory of harmonic function, which bears the seeds of a syntactic constraint system. In the early twentieth century, Heinrich Schenker (1935) developed a comprehensive analytic method that generates structure in a series of self-similar hierarchical levels, starting from a simple underlying form and yielding, through elaborative and transformational operations, the surface variety of a given piece. As such, he anticipated generative linguistics.

Recent music theory has often taken a psychological turn, beginning with Leonard Meyer's (1956) reliance on Gestalt principles and probabilistic methods to account for melodic expectation. His argument that emotion in music arises from denied expectation remains a touchstone for research on musical emotion. Jackendoff and I adopted the methodological framework of generative linguistics to develop a theory of musical cognition in entirely musical terms (generative theory of tonal music, GTTM; Lerdahl and Jackendoff 1983). We proposed four interacting hierarchical components: the rhythmic components of (a) grouping and (b) meter, and two kinds of pitch hierarchy, (c) time-span reduction and (d) prolongational reduction. Time-span reduction provides an interface between rhythm and pitch, whereas prolongational reduction describes nested patterns of departure and return that are experienced as waves of tension and relaxation. Each component is regulated by well-formedness rules, which stipulate possible structures, and preference rules, which assign to given musical passages, in gradient fashion, specific structures that the theory predicts are cognized (see also Jackendoff and Lerdahl 2006).

The growing cognitive science of music has bred fruitful interdisciplinary work, notably Krumhansl's (1990) theoretically informed experiments on the cognitive schematic organization of pitches, chords, and keys. These results underlie the post-GTTM construction of a unified and quantitative hierarchy of pitch relations (tonal pitch space, TPS; Lerdahl 2001b; see also Lerdahl and Krumhansl 2007). In other work, Huron (2006) explored expectation from the perspectives of statistical learning and evolutionary psychology. Tymoczko's (2006) geometrical model of all possible chords, while mathematically sophisticated, may be of limited relevance to music cognition since it sets scales aside and assumes perceptual equivalence of all members of a chord. Most musical idioms, if they have chords at all, build them out of pitches of a musical scale, and chords in most idioms have perceptually salient roots which must be factored into measurements of distance (where spatial distance correlates with cognitive distance).

Current interest in the relationship between music and language has been fueled by the idea that these two uniquely human capacities evolved from protomusical, protolinguistic expressive utterances (Brown 2000; Darwin 1876; Fitch 2010; Rousseau 1760/1852; see also Arbib and Iriki, this volume). Patel (2008 and this volume) provides a thorough review from a neuroscientific perspective of what music and language do and do not share. He advances the hypothesis that while linguistic and musical structures may have different storage areas in the brain, their processing shares brain resources. He approaches the two capacities in areas where they do not mix, instrumental music and nonpoetic speech, so as to maintain a clear comparison. I take the opposite approach by analyzing the sounds of poetry using GTTM's components (Lerdahl 2001a). This analysis presents evidence for where musical and linguistic structures do and do not overlap, and it provides an account of the variables in setting text to music.

Rohrmeier (2011; see also Koelsch, this volume) has implemented a tree-structure version of Riemann's functional theory of harmony in ways that resemble early generative grammar (Chomsky 1965). His trees decompose into the grammatical categories of tonic region, dominant region, and subdominant region, which in turn are spelled out in terms of tonic-functioning, dominant-functioning, and subdominant-functioning chords. He does not, however, address rhythm or melody. GTTM rejected an early and less-developed version of this approach (Keiler 1977), if only because it does not generalize to other musical styles. Indeed, until recently, most of the world's music did not have harmonic progressions. A second issue concerns the status of Riemannian harmonic functions, which derive from Hegelian philosophy via Hauptmann (1853). It is unclear what cognitive claim could lie behind the tripartite functional classification. In contrast to Rohrmeier's approach, GTTM and TPS seek a theoretical framework that can be applied, with suitable modifications, to any type of music; hence their tree structures and treatment of functionality are unencumbered by stylistic restrictions or a priori categories. Rohrmeier's work, however, succeeds well within its self-imposed limits.

Katz and Pesetsky (2011) seek unity between music theory and the minimalist program in current generative linguistics. They reinterpret GTTM's components in pursuit of the claim that the two cognitive systems are identical except for their inputs: phonemes and words in one, pitches and rhythms in the other. It is a highly suggestive approach, although they are forced into the uncomfortable position of positing a single entity, the cadence, as the underlying generative source of all musical structures, including rhythm. Here they follow the Chomsky paradigm in which syntax is the centerpiece from which phonological and semantic structures are derived. Jackendoff's (2002) parallel-architecture linguistic theory is more like GTTM's organization, with its equal and interactive components.

Syntax Abstractly Considered

A recurring pitfall in discussions of musical syntax is the search for musical counterparts of the tripartite division of linguistic theory into syntax, semantics, and phonology (Bernstein 1976). There is no reason to suppose that this division transfers in any straightforward way to music. Music is meaningful, but there is no musical counterpart to the lexicon or semantics, just as there is no analogy to parts of speech and syntactic phrases, or binary phonemic oppositions and moras. Comparisons of musical and linguistic organization must thus begin at a more fundamental level.

If we broadly define syntax here as the hierarchical organization of discrete sequential objects, we can speak not only of linguistic syntax in the usual sense but also of the syntax of phonology. Any syntax constitutes a "Humboldt system" capable of generating a potentially infinite set of outputs from a relatively small number of elements and principles (Merker 2002). In both language and music, there are four abstract aspects in a syntactic hierarchy: a string of objects, nested groupings of these objects, a prominence grid assigned to the objects, and a headed hierarchy of the grouped objects.
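As a way of fixing ideas, the following minimal Python sketch shows one way these four abstract aspects might be represented as data. The container shapes and names (objects, groupings, grid, Node) are my own illustrative assumptions, not notations from GTTM.

```python
# A minimal sketch of the four abstract aspects of a syntactic hierarchy,
# using generic Python containers. All names are illustrative.

from dataclasses import dataclass, field
from typing import List, Optional

# 1. A string of objects: simply a sequence of labels (syllables, words,
#    or pitch events, depending on the component).
objects = ["O1", "O2", "O3", "O4", "O5"]

# 2. Nested groupings: each group spans contiguous objects; higher-level
#    groups are made up of contiguous lower-level groups.
groupings = [
    [["O1", "O2"], ["O3", "O4", "O5"]],      # lower level: two groups
    [[["O1", "O2"], ["O3", "O4", "O5"]]],    # higher level: one group of groups
]

# 3. A prominence grid: one row per level; a mark at a larger level implies
#    a mark at every smaller level for the same position.
grid = [
    [1, 1, 1, 1, 1],   # smallest level: every object gets a mark
    [1, 0, 1, 0, 0],   # larger level: only the relatively prominent positions
    [1, 0, 0, 0, 0],   # largest level: the single most prominent position
]

# 4. A headed hierarchy: each constituent has one head among its children;
#    the remaining children are elaborations of that head.
@dataclass
class Node:
    label: str
    head: Optional["Node"] = None
    elaborations: List["Node"] = field(default_factory=list)

tree = Node("group", head=Node("O1"), elaborations=[Node("O3"), Node("O5")])
```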

Strings and Groupings

The constitution of a string of objects varies according to component and level within a component. In linguistic phonology, the object at the segmental level is a phoneme; at suprasegmental levels, it is the syllable, then the word, then a prosodic unit. In linguistic syntax, the low-level object is a lexical item with discrete features; at higher levels it is an X-bar phrase (i.e., a syntactic phrase whose principal constituent, or head, is an X; e.g., a verb phrase is a constituent headed by a verb), then a sentence. Music has different objects. Music theory tends to ignore the psychoacoustic level, which corresponds more or less to that of phonetics in linguistic studies, and treats perceived pitches, chords, and rhythms as its elementary objects. These can be referred to as “(pitch) events.” At larger levels, units consist of groupings of events.

How do linguistic and musical objects relate? Perhaps the most basic correspondence is between syllable and note. In a text setting, a single syllable is usually set to a single note (the less frequent case is melisma, in which a syllable continues over several notes). At a sub-object level, syllables typically break down into a consonant plus a vowel, corresponding roughly to the attack and sustained pitch of a note. Syllables group into polysyllabic words and clitic phrases, which group into phonological and intonational phrases. These levels correspond more or less to the musical levels of motive, subphrase, and phrase, respectively. Possibly of deeper import than these broad correspondences is that they are made between music and phonology, not music and linguistic syntax.

In both domains, groupings apply to contiguous objects. The general form of grouping structure is illustrated in Figure 10.1. Higher-level groups are made up of contiguous groups at the next lower level. These strictures apply only within a component. Grouping boundaries in one component of language or music often do not coincide with those in another. For example, prosodic and linguistic-syntactic boundaries are often different, as are phrasal and prolongational boundaries in music.

Generative linguistics has long posited movement transformations that reorder syntactic strings in specified ways to account for certain syntactic phenomena. Lately, movement has become "internal merge" (Chomsky 1995). Such phenomena appear not to exist in music and, except for Katz and Pesetsky (2011), music theory has never considered them.

Grids

Here we consider "grids" as they arise in phonology and in the metrical components of poetry and music. An instance of the general form of a grid is illustrated in Figure 10.2, where an X represents stresses or beats. If an X is prominent or strong at a given level, it is also an X at the next larger level. The X's do not group in the grid per se; that is, a grid is not a tree structure.
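The following small sketch, under my own simplifying assumptions (the function name and the set-per-level representation are illustrative), encodes a grid as one set of marked positions per level and checks the basic well-formedness property just described: a mark at a larger level presupposes a mark at every smaller level.

```python
# A minimal sketch (not from GTTM) of a prominence grid and a check of its
# basic well-formedness.

def grid_is_well_formed(grid):
    """grid[0] is the smallest level; each level is a set of marked positions."""
    for lower, higher in zip(grid, grid[1:]):
        if not set(higher).issubset(set(lower)):
            return False
    return True

# Positions 0..4; strong positions carry marks at the larger levels as well.
stress_grid = [
    {0, 1, 2, 3, 4},   # every syllable or beat is marked at the smallest level
    {0, 2},            # relatively strong positions
    {0},               # the strongest position
]

print(grid_is_well_formed(stress_grid))        # True
print(grid_is_well_formed([{0, 2}, {1}]))      # False: position 1 has no mark below
```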

There are two kinds of grids in language and music: a stress grid and a metrical grid. A stress grid in phonology represents relative syllabic stress, confusingly called a "metrical grid" in the literature (e.g., Liberman and Prince 1977). A linguistic metrical grid, in contrast, represents strong and weak periodicities in a poetic line against which stresses do or do not align. Stresses in ordinary speech are usually too irregular to project meter (Patel 2008). The cues for beats in musical meter are more periodic and mutually reinforcing than those of most spoken poetry. Consequently a musical metrical grid often has many levels.

Stress (or psychoacoustic prominence) in music is less rule-governed than in phonology and plays little role in music theory. Much more important in music is another kind of prominence—pitch-space stability—for which there is no linguistic equivalent. The most stable event in music is the tonic: a pitch or chord that is the point of orientation in a piece or section of a piece. Other pitches and chords are relatively unstable in relation to the tonic. The degree of instability of nontonic pitches and chords has been well established empirically and theoretically (Krumhansl 1990; Lerdahl 2001b).

Figure 10.1 Abstract form of grouping structure (a sequence of objects O1 O2 O3 O4 O5 grouped at successive levels).

Figure 10.2 Abstract form of a grid.


Headed Hierarchies

Groupings and grids are nonheaded hierarchies; that is, adjacent groups or X's do not form dominating-subordinating constituencies. Strings of words and musical events, however, form headed hierarchies. The objects of headed hierarchies in linguistic syntax are grammatical categories, represented in simplified form in Figure 10.3a, b, where XP stands for X-bar phrase, X for the head constituent (noun in a noun phrase, verb in a verb phrase, etc.), and Y for any nonhead grammatical constituent within the phrase, either before or after X. The same hierarchy is conveyed in Figure 10.3c, d using GTTM's musical tree notation. Musical trees, however, do not represent X-bar categories but rather noncategorical elaborative relations. In time-span reduction, branching occurs between events in the nested rhythmic structure, with the more stable event dominating at each level. In prolongational reduction, the tree shows tensing-relaxing relations between stable and unstable events, with right branching (Figure 10.3c) for a tensing motion and left branching (Figure 10.3d) for a relaxing motion.
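A minimal sketch of this idea, using illustrative names rather than GTTM's formal notation (Event, right, and left are my own labels), might represent an event together with its tensing (right-branching) and relaxing (left-branching) elaborations:

```python
# A sketch of the two kinds of elaboration in a prolongational tree: a right
# branch marks a tensing motion away from a more stable event, a left branch
# a relaxing motion into one. Illustrative only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    label: str                                            # e.g., a chord symbol
    right: List["Event"] = field(default_factory=list)    # tensing elaborations
    left: List["Event"] = field(default_factory=list)     # relaxing elaborations

tonic = Event("I")
tonic.right.append(Event("ii"))   # motion away from the tonic: tension increases
tonic.left.append(Event("V"))     # motion into the tonic: relaxation, as at a cadence
```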

Syntax Illustrated

Some short examples from the Beatles song Yesterday will illustrate these abstract structures, moving from lyrics to music and beginning with phonology instead of linguistic syntax. The focus will be on representations rather than derivations. Figure 10.4a illustrates a prosodic analysis of the word "yesterday," employing a tree notation with strong (S) and weak (W) nodes (Liberman and Prince 1977). Figure 10.4b conveys the same information using a combination of prosodic grouping and stress grid (Selkirk 1984; Hayes 1989). Figure 10.4c translates Figure 10.4a, b into musical tree notation, in which domination is represented by branching length.

Figure 10.3 Abstract form of an X-bar syntactic tree and an equivalent musical tree. X's dominance is represented symbolically in (a) and (b) by "XP"; in (c) and (d), it is depicted by the longer branch stemmed to X.

A prosodic analysis of the entire first line of the lyric appears in Figure 10.5a, showing a stress grid, an inverted metrical grid, and prosodic grouping. There is a silent metrical beat between "-day" and "all," as suggested by the comma. The result, based on prosodic criteria independent of the Beatles' setting, is triple meter at the largest level, with strong beats on "Yes-," "troub-," and "-way." Somewhat unusually, the stress and metrical grids fully align. Figure 10.5b converts the metrical pattern in Figure 10.5a into standard musical notation. The notes on "Yes-," "troub-," and "-way" are raised slightly to convey the heavy stresses on these syllables (pitch height is a contributor to the perception of stress).

Figure 10.4 Prosodic analysis of the word "yesterday" with mostly equivalent notations: (a) strong-weak (S-W) tree; (b) prosodic grouping and stress grid; (c) time-span tree.

Figure 10.5 (a) Prosodic analysis of the first poetic line of Yesterday; (b) conversion of the metrical pattern in (a) into musical notation.

Figure 10.6 shows the Beatles' musical setting, accompanied by a musical grouping analysis and metrical grid. To the left of the grid are the note durations of the metrical levels. The setting departs in a few respects from the metrical analysis in Figure 10.5. The first two syllables of "yesterday" are shortened, causing syncopation (relative stress on a weak beat) and lengthening on "-day." The stress-metrical pattern of "yesterday" essentially repeats in the rhymed "far away," whereby "far" receives the major metrical accent instead of "-way." This emphasis shifts the semantic focus of the second part of the phrase, conveying distance or "farness" between a "happy yesterday" and "troubled today."

Figure 10.7 displays one interpretation of the syntactic tree of the sentence (Jackendoff, pers. comm.). Figure 10.8 shows the prolongational tree of the corresponding music. Derivational levels are labeled in the tree, and a conventional Roman numeral harmonic analysis appears beneath the music. The linguistically dominating words, "troubles" and "seemed," which are the main noun and verb of the sentence in Figure 10.7, are rather embedded in the musical tree in Figure 10.8. The musically dominating words are "yesterday" and "far away," specifically the rhyming syllables "-day" and "-way."

Figure 10.6 The first phrase from the song Yesterday, with the metrical grid and global grouping added below.

Figure 10.7 Syntactic tree for the first poetic line of Yesterday. S = sentence, NP(Adv) = noun phrase with adverbial function, NP = noun phrase, VP = verb phrase, PP = prepositional phrase, Q = quantifier, NP+poss = possessive pronoun, N = noun, V = verb, QP = quantifier phrase, P = preposition, and Deg = degree.

More basic than these particular divergences, however, is the dissimilarity of the trees themselves: a linguistic-syntactic tree consists of parts of speech and syntactic phrases, but a musical tree assigns a hierarchy to events without grammatical categories. In this respect, musical trees are like phonological trees, as in Figure 10.4c, instead of linguistic-syntactic trees. Furthermore, the phrase categories in Figure 10.7 include word groupings (e.g., the noun phrase "all my troubles") whereas all the leaves of a prolongational tree are single events. (For a contrasting approach, see Rohrmeier 2011.)

In bars 2–3 (Figure 10.8), the music that portrays "far away" progresses melodically to a higher register and harmonically from the tonic (I) F major to D minor. In the next phrase (Figure 10.9), melody and harmony return in bars 4–5, reflecting the sense of the words "Now it looks as though they're here to stay." The tree represents the return by its highest branch at level a, which attaches to the highest branch at level a in Figure 10.8. Thus the tonic F major in bar 5 "prolongs" the tonic in bar 1. The continuation in bars 5–6 in turn prolongs the tonic of bar 5.

Figure 10.8 Prolongational tree for the first musical phrase of Yesterday (bars 1–3).

Figure 10.9 Prolongational tree for the second musical phrase of Yesterday (bars 4–7).


Sequential Ordering in Music

Any consideration of syntax involves not only a hierarchy of elements but also their sequential ordering. Word order is crucial in English, a comparatively uninflected language. In the syntax of various other languages, case markers often take the place of word order. There appears to be no musical equivalent to the syntactic role of case markers.

Are there constraints on the order of musical events? For Western tonal music, the answer is yes, although there is more freedom than in English syntax. At a very local level, dissonance treatment requires specific continuations. For example, the dissonances on "Yes-" and "far" in Figure 10.8 demand and receive resolution on adjacent and consonant scale degrees. At the phrase or double-phrase level, the most restrictive constraint is the cadence, a formulaic way of establishing closure at a phrase ending. In Western tonality, the standard cadence is a two-event process: a melodic step to the tonic pitch over an accompanying dominant-to-tonic harmonic progression. Cadential preparation, for which there are only a few options, usually precedes a cadence. At the onset of a phrase, the norm is to start on the tonic or a member of the tonic chord. Hence a phrase ordinarily begins and ends on a tonic, the point of stability or relaxation. After the beginning, ordering is relatively unrestricted, with the proviso that the continuation induces tension as the music departs from the tonic. The high point of tension occurs somewhere in the middle of the phrase, followed by relaxation into the cadence. In sum, ordering constraints are generally strongest at the phrasal boundaries and weakest in the middle. Figure 10.10 sketches this pattern of stability from tension to closure.
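As a rough schematization of this ordering profile (my own, not a rule stated in the chapter), one could encode the normative phrase as a pattern over coarse functional labels and test an event sequence against it; the labels T, D, P, and X and the regular expression are illustrative assumptions.

```python
# Sketch: ordering is tightly constrained at the phrase boundaries (begin on
# the tonic, end through a cadence) and loose in the middle.

import re

# T = tonic (or tonic-chord member), D = dominant, P = cadential preparation,
# X = any other chord.
NORMATIVE_PHRASE = re.compile(r"^T[TDPX]*P?DT$")

def is_normative(events):
    """Return True if the label sequence fits the normative phrase schema."""
    return bool(NORMATIVE_PHRASE.fullmatch("".join(events)))

print(is_normative(list("TXXDPDT")))  # True: tonic ... preparation, cadence
print(is_normative(list("TXXDX")))    # False: no cadence, no closing tonic
```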

Yesterday manifests this pattern more clearly at the double-phrase level than at the single-phrase level. In the opening phrase (Figure 10.8), the tonic (I) in bar 1 departs and tenses to D minor (vi) in bar 3. In the relaxing answering phrase (Figure 10.9), the subdominant (IV) on the downbeat of bar 4 prepares the dominant-tonic (V–I) cadence in bars 4–5. Bars 6–7 function as a tag, a confirmation of this main action.

Figure 10.10 Normative prolongational pattern of tension and relaxation in phrases: tonic, departure, tension, cadential preparation, cadence.

The trees in Figures 10.8–10.10 refer to musical features—tonic, key, cadence, dissonance, tension, stability—which do not translate into linguistic-syntactic terms. The only area where pitch is shared between the two media is the rise and fall of melodic and speech contour. In other respects, pitch relations are unique to music. A robust tradition represents pitch relations in multidimensional spaces that have no linguistic counterpart (Krumhansl 1990; Lerdahl 2001b; Tymoczko 2011). There is increasing evidence that spatial pitch representations have neural correlates, not only logarithmic pitch height on the basilar membrane and in the auditory cortex (Weinberger 1999) but also cycle-of-fifths key relations among major keys (Janata et al. 2002a).

Figure 10.10 represents a schema applicable to many musical styles. Ordering constraints can also arise from schemas specific to a single style. In the Classical period, there are a number of stock subphrase melodic and harmonic patterns (Gjerdingen 1996). These patterns slot into beginning, middle, or end positions to form functionally ordered units (TPS). Similarly, classical schemas within and above the phrase level coalesce into intermediate levels of form (Caplin 1998). These larger units, composed of phrase groups with characteristic tree patterns, are also susceptible to ordering constraints.

The tightness or laxness of ordering at various musical levels corresponds to the expectation, or predictability, of future events: the tighter the constraint at a given point, the higher the probability of a particular outcome. Huron (2006) argues that predictability is largely a result of exposure to statistical distributions. Yet while distributions are undoubtedly important in learning the syntax of a musical idiom, a statistical account alone is insufficiently explanatory. Why do dissonances resolve to adjacent pitches? Why is dominant-to-tonic the standard cadence in the classical style? Why, and how, are some chords and keys closely related while others are only distantly related? Why does the normative phrase go from relaxation to tension to relaxation? Why do certain style-specific schemas occur at certain positions in a phrase or section? There is no space here to resolve such questions except to say that the answers go beyond statistics to issues involving mental representation, psychoacoustics, and biology.

Structure at Global Musical Levels

Linguistic-syntactic trees apply only up to the level of the sentence. Larger linguistic levels express discourse or information structure. A tradition in music theory, however, carries the logic of prolongational syntax from the smallest detail up to a large movement such as sonata form (Schenker 1935; GTTM). This view is too uniform and needs to be supplemented by methods that incorporate aspects of discourse structure. Music theory has yet to develop such an approach in any detail.

Three related questions arise in this connection. First, to what extent do ordinary listeners hear prolonged pitches or chords, especially the tonic, over long time spans? The empirical literature on this matter, technical flaws aside, is not very encouraging (e.g., Cook 1987b). TPS offers a way to resolve the issue by positing prolongational functions instead of key identity as the operative factor in the perception of long-range connections. A piece may begin in one key and end in another, but if the ending key is well established it can function as a global tonic and thereby provide closure.

Second, what are the units of analysis at global levels? TPS proposes feature reduction of events at underlying levels and a syntactic ordering of small schematic groups, of the kind discussed in Gjerdingen (1996) and Caplin (1998), as the units of analysis at larger levels. Metaphorically, if pitch events are atoms of analysis, in a larger view small schematic groups act as molecules of analysis.

Third, how deep does hierarchical embedding in prolongational structure extend? In principle, embedding could be of indefinite depth, but in practice there are limits to its cognition. (The same holds true in language.) Larger units of analysis will alleviate this problem, for at global levels there are fewer objects to embed.

It is perhaps unnecessary to add that the hypothesis that only the narrow language faculty possesses recursion is not tenable (Hauser et al. 2002). (Recursion is meant as hierarchical self-embedding, as when a sentence includes one or more sentential subordinate clauses.) There is empirical evidence that music is also cognized hierarchically and recursively (Dibben 1994; Lerdahl and Krumhansl 2007). For example, the Bach chorale tested in Lerdahl and Krumhansl (2007) shows recursion of the harmonic progression I → V–I at multiple levels.
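To illustrate what such harmonic recursion amounts to, here is a toy rewrite sketch (not the analysis of the chorale itself; the function name and labels are illustrative) in which each tonic span may be re-expanded as I → V–I at a deeper level:

```python
# Toy sketch of harmonic self-embedding: the progression I -> V-I can be
# re-applied inside its own tonic spans, yielding structure at several levels.

def elaborate(chord, depth):
    """Recursively expand a tonic span into tonic-dominant-tonic."""
    if chord != "I" or depth == 0:
        return [chord]
    # Each embedded tonic may itself be expanded one level further.
    return elaborate("I", depth - 1) + ["V"] + elaborate("I", depth - 1)

print(elaborate("I", 1))  # ['I', 'V', 'I']
print(elaborate("I", 2))  # ['I', 'V', 'I', 'V', 'I', 'V', 'I']
```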

Contour

Intonation in speech and melody in music share the feature of pitch contour. In speech, pitch rises and falls in continuous glides; in music, pitch height is ordinarily steady from one pitch to the next. In most musical systems, intervals between pitches are constant, whereas speech intervals are in constant flux. Slight variations in musical tuning do not undermine interval constancy as long as the intervals are perceived categorically. Fixed intervals and their relative psychoacoustic consonance permit a melody to project degrees of stability and instability for which there is no linguistic analog.

This point holds for tone languages as well. Speech tones are not fixed but relative to each other, the voice of the speaker, and the natural declination of speech utterances (Ladd 2008). Tone languages as well as nontone languages lack fixed pitch intervals, scales, tonics, and other fundamental features of music.

The question to pose in this context is whether speech contour is syntactic. If it is, it is not merely continuous in rise and fall but must consist of semistable objects connected hierarchically. The topic of intonational units has been much debated (Ladd 2008). On one side is the view that intonation is fully continuous and definable only by its shapes (Bolinger 1986). Autosegmental-metrical theory, in contrast, posits intonational objects at the highest peak and phrase boundaries of an utterance. Movement between these points is unconstrained, and the objects are related sequentially but not hierarchically (Pierrehumbert 1980).

In adapting GTTM to the analysis of the sounds of poetry, I have developed a derivational model that assigns each syllable of an utterance to one of four tiered but relative (not absolute) levels of pitch height (Lerdahl 2001a). The syllables are organized hierarchically, using the factors of prosodic grouping, stress, and a few canonical contour shapes. In this view, speech intonation is syntactic. The approach finds provisional backing in the recent technique of prosogram analysis of speech contour, which assigns pitch and timing to each syllable in a sequence (Mertens 2004; Patel 2008). Tierney and Patel (pers. comm.) have applied prosogram analysis to Robert Frost's reading of one of his poems analyzed in Lerdahl (2001a), lending indirect support for the theoretical analysis.

The complementary question, whether contour in a musical line is syntactic, has hardly been raised, no doubt because for tonal music, the syntax of pitch relations—scales, counterpoint, harmony, tonality, relative stability, and event hierarchies—has always been more central. Instead, musical contour theories have developed in the context of nontonal contemporary Western music. With the exception of Morris (1993), these theories are not hierarchical in orientation.

Psychological Universals in Music

Generative linguistics seeks to define universal grammar beyond the particularities of this or that language (Chomsky 1965). The term universal is intended not in a cultural but in a psychological sense. In this view, a feature of universal grammar need not appear in every language; rather, it describes a part of the organization of the language capacity itself. Although far less rigorous comparative work has been done in music than in language, the musical situation is comparable. The structures of grouping, grid, and tree apply to all of music cognition, and the particulars of given musical idioms vary in systematic ways within that framework.

To take a relatively simple case, the stresses in some musical styles are irregular enough that no meter is inferred (e.g., the beginning of a North Indian raga, some Japanese gagaku music, some contemporary Western art music). However, if stresses are sufficiently regular, a mentally inferred metrical grid of beats comes into play against which events are heard and measured. There are only a few types of metrical grid:

1. Multiple levels of beats, with equidistant beats at each level and with beats two or three beats apart at the next larger level (as in Western tonal music; see Figure 10.11a).

2. Multiple levels of beats, with equidistant beats at the smallest level, beats two or three beats apart within the next level, and often equidistant beats at a still larger level (as in much Balkan music; see Figure 10.11b).

3. Multiple beat streams, each with equidistant but often noncoincident beats at the lowest metrical level and equidistant beats at a larger level (as in some sub-Saharan African music, some Indian music, and some Western art music; see Figure 10.11c).

Not all kinds of music invoke metrical grids, so grids are not culturally universal; however, if a grid is inferred, its form is formally constrained as described. Each of the three grid types can combine in different ways within its own type (e.g., different combinations of two and three in the type of Figure 10.11a), and each can combine with the other types. Depending on the regularity of stresses, there can be as few as two and as many as eight metrical levels. The result is a small combinatorial explosion of possible (or well-formed) grids. A given musical style typically utilizes a small subset of possible grids.
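Restricting attention to grids of the type in Figure 10.11a, a short sketch (my own illustration; the function name is an assumption) makes the size of this combinatorial space concrete: with one multiplier of two or three between each pair of adjacent levels, eight levels yield only 2^7 = 128 well-formed grids.

```python
# Enumerate well-formed type-(a) grids: each metrical level groups the beats
# of the level below by two or by three.

from itertools import product

def type_a_grids(num_levels):
    """Return all type-(a) grids as tuples of level multipliers (2 or 3)."""
    # One multiplier per step between adjacent levels.
    return list(product((2, 3), repeat=num_levels - 1))

# With between two and eight metrical levels, the count stays modest:
for levels in range(2, 9):
    print(levels, "levels:", len(type_a_grids(levels)), "possible grids")
# 2 levels: 2 ... 8 levels: 128 -- a small combinatorial explosion.
```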

Beyond well-formedness, metrical grids are constrained by perceptual limits of tempo. Six hundred beats per minute (100 ms between beats) is too fast to distinguish clearly, whereas 10 beats per minute (6 s between beats) is too slow to gauge accurately (London 2004). A middle tempo, from about 70–100 beats per minute (857–600 ms), is perceptually the most salient, and integer multiples or divisions of this tactus tend to be heard in relation to it (GTTM). It has long been hypothesized that the tactus has a biological basis in the human heart rate.

When listening to music, listeners infer particular metrical grids by finding the best match, or fewest violations, between stress patterns in the musical signal and the repertory of possible grids in the style in question. This process happens automatically and quickly. (Next time you turn on the radio and hear music, observe that it takes a moment to find the beat.) GTTM lays out, through its interacting metrical preference rules, how this process happens. Some of the rules appear to be psychologically universal, especially those that incorporate Gestalt principles, whereas others are style specific. No doubt the list of factors is incomplete, if only because of GTTM's orientation to Western tonal music. Comparative study of music from around the globe promises to enrich and correct claims of psychological musical universality with respect not only to meter but also to other musical components.
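The best-match idea can be caricatured in a few lines of code. This sketch is my own simplification, not GTTM's preference-rule system: it considers only a single grid level, candidate periods of two and three, and a binary stress pattern; the function names and parameters are illustrative.

```python
# Score each candidate grid by how badly its strong positions mismatch the
# observed stresses, then pick the least-violating candidate.

def violations(stresses, grid_period, grid_phase):
    """Count mismatches between observed stresses and one periodic grid level.

    stresses: list of 0/1 marks per time point; grid_period and grid_phase
    define where the candidate grid expects strong positions.
    """
    count = 0
    for i, stressed in enumerate(stresses):
        expected = (i % grid_period) == grid_phase
        if expected != bool(stressed):
            count += 1
    return count

def best_grid(stresses, periods=(2, 3)):
    """Return the (period, phase) whose strong beats best match the stresses."""
    candidates = [(p, ph) for p in periods for ph in range(p)]
    return min(candidates, key=lambda c: violations(stresses, *c))

# A stress pattern with accents every third position starting at 0:
print(best_grid([1, 0, 0, 1, 0, 0, 1, 0, 0]))  # (3, 0)
```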

Figure 10.11 Types of metrical grids: (a) beats at level n are two beats apart at level n + 1 and three beats apart at level n + 2; (b) beats at level n that are 2 + 2 + 3 beats apart at level n + 1 and three beats apart at level n + 2; (c) two streams of beats at level n, dividing level n + 1 into five and three parts.


Shared Structures and Evolution

This review of musical syntax and its relation to linguistic syntax has taken a broad view of syntax to include not just linguistic syntax but any hierarchical ordering of sequential objects. From an abstract perspective, music and language share three kinds of syntactic structures: groupings, grids, and trees. Groupings and grids are nonheaded hierarchies whereas trees are headed hierarchies.

The broad definition of syntax permits parts of phonology as well as music to be viewed in syntactic terms. Indeed, phonological trees representing syllabic prominence are formally equivalent to musical trees (Figures 10.4c, 10.8, and 10.9), even though the leaves of the trees are different: syllables on one hand, pitch events on the other. This similarity stands in contrast to linguistic-syntactic trees, which are built of syntactic categories.

Further connections between phonology and music emerge when poetry is considered instead of ordinary speech (Lerdahl 2001a). The inference of a poetic meter from patterns of syllabic stress operates in the same way as the best-match process between musical stress and grid. The departure and return of events, especially of the tonic, which is so important to the sense of tension and relaxation in music, is comparable at local levels of poetry to recurrent sound patterns of alliteration, assonance, and especially rhyme. The intuition in both cases is of a return and connection to the same or similar object: a pitch event in music and a syllable in language.

Table 10.1 suggests a taxonomy of shared and unshared musical and linguistic structures. Fixed pitches and the pitch structures that arise from them belong exclusively to music. The lexicon, which specifies word meanings and parts of speech, belongs only to language. From combinations of lexical items come semantic structures such as truth conditions and reference, for which there is no musical counterpart. Also in the linguistic category are linguistic-syntactic relations and various phonological structures such as distinctive features (Jakobson et al. 1952). Shared structures are mostly in the domain of rhythm, broadly conceived: the relative duration of pitches and syllables; grouping into phrases and sections in music and prosodic phrases in language; patterns of stress (or contextual psychoacoustic salience); and metrical grids. Nonrhythmic features shared by both domains are pitch contour and recurrent patterns of timbre (sound quality). "Metrical grids" and "recurrent sound patterns" are given asterisks because these features, while common in music, appear in language mainly in poetry but not in normal speech.

This classification broadly fits with Peretz and Coltheart's (2003) modular model of music processing based on neuropsychological evidence. They place pitch and rhythm in different brain modules and view contour as being processed prior to fixed-pitch relations.

Table 10.1 can also be considered in an evolutionary light. As mentioned, there is a long-standing view that expressive animal utterances preceded the emergence of the separate faculties of music and language. The structural features that cries and calls display are the shared structures in Table 10.1. Animal sounds consist of long and short elements that group into larger units, albeit with comparatively shallow nesting. Relative stress (contextual salience) is a psychoacoustic feature of animal calls, as is pitch contour. Animal utterances show cyclic patterns of recurrent features, especially birdcalls—a case of convergent evolution rather than direct descent. The exception under "shared structures" is metrical grids; only humans and some songbirds engage in behavior with multiple periodicities (Fitch 2006a). These shared structures give rise to the possibility of a pared-down syntactic analysis of animal utterances using GTTM's components, much as in the analysis of spoken poetry.

The shared structures formed a syntactic foundation for the subsequent specializations of music and language. Fitch (2006a) makes a useful distinction in this regard between unlearned animal signals (e.g., ape or bird calls) and learned complex signals (e.g., bird or whale song). In this scenario, learned complex signals acted as a stage on the way to the full development of the separate faculties of music and language. Music grew in the direction of fixed pitches and intervals and consequent complex pitch structures, and language grew in the direction of word meanings and their combinations.

Table 10.1 Hypothesized organization of musical and linguistic structures (adapted from Lerdahl 2001a).

Exclusively musical structures | Shared structures | Exclusively linguistic structures
Fixed pitches, intervals, and scales | Durational patterns | Lexicon (word meaning and parts of speech)
Harmony | Grouping (prosodic hierarchy) | Semantic structures (truth conditions, reference, entailment)
Counterpoint | Stress (psychoacoustic salience) | Syntactic units
Tonality | *Metrical grids | Phonological distinctive features (and other phonological structures)
Pitch prolongations | Contour |
Tonal tension and attraction | *Recurrent sound patterns |

* Common features in music, but which in language appear primarily in poetry, not in normal speech.

From “Language, Music, and the Brain,” edited by Michael A. Arbib. 2013. Strüngmann Forum Reports, vol. 10, J. Lupp, series ed. Cambridge, MA: MIT Press. 978-0-262-01810-4.

