This article was submitted to Cognition, a section of the journal Frontiers in Psychology.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
It is often claimed that music and language share a process of hierarchical structure building, a mental “syntax.” Although several lines of research point to commonalities, and possibly a shared syntactic component, differences between “language syntax” and “music syntax” can also be found at several levels: conveyed meaning and the atoms of combination, for example. To bring music and language closer to one another, some researchers have suggested a comparison between music and phonology (“phonological syntax”), but here too one quickly arrives at a situation of intriguing similarities and obvious differences. In this paper, we suggest that a fruitful comparison between the two domains could benefit from taking the grammar of action into account. In particular, we suggest that what is called “syntax” can be investigated in terms of goals of action, action planning, motor control, and sensory-motor integration. At this level of comparison, we suggest that some of the differences between language and music can be explained in terms of the different goals reflected in the hierarchical structures of action planning: the hierarchical structures of music arise to achieve goals with a strong relation to the affective-gestural system encoding tension-relaxation patterns as well as to the socio-intentional system, whereas the hierarchical structures of language are embedded in a conceptual system that gives rise to compositional meaning. Similarities between music and language are clearest in the way several hierarchical plans for executing action are processed in time and sequentially integrated to achieve various goals.
Comparative approaches to music and language as cognitive systems have recently gained interest in language and music research, but there seems to be no general consensus about the fundamental nature of this relationship (e.g.,
In this integrative approach, attention should be paid to the fact that different kinds of
“Syntax” can be defined as a set of principles governing the hierarchical combination of discrete structural elements into larger units (
Recent comparative approaches investigate the syntax of music and language on several representational levels. In particular, musical syntax has been compared to the narrow-sense syntax and the phonological syntax of language. This one-to-one comparison, based on theoretical considerations as well as findings from cognitive neuroscience, however, involves a conundrum: on the one hand there are intriguing similarities at both levels of comparison, but there are also important differences. That is, musical syntax does not fit into the ready-made linguistic conception. So, is this distinction between narrow-sense and phonological syntax really useful for investigating the syntax of music? Musical syntactic structures are headed hierarchies like linguistic syntax (
Thus, the aim of this paper is to find an appropriate level of comparison for the combinatorial properties of music and language, ideally in a way that is independent of controversies specific to one or the other field. In comparing music and language, specific concepts developed in linguistics are adapted to music, but they often prove harmful in the context of comparison. As
To find the right level of comparison and an adequate granularity of constituents for the analysis, we introduce another level of comparison, namely action. An action-based comparison of music and language is promising for two main reasons. First, action is a basic component of several cognitive systems, including language and music. For example, developmental studies have demonstrated the parallel development of word combinations (grammar) and manual object combinations, including tool use (e.g., using a spoon to eat food), in children (
In the current paper, based on the ideas introduced by
Ever since
Recently, phonological syntax has been put forward by several authors as a more promising level of comparison between musical and linguistic structures because music and phonology (a domain of speech) make use of combinatorics without meaning (or “syntax without meaning”;
The problem with such a language-based comparison, however, is that musical syntactic structures, in which several mechanisms such as tonal encoding of pitch, meter, and grouping interact, cannot be perfectly aligned with the hierarchical structures of language (see Table
| Level of comparison | Feature | Language | Music |
| --- | --- | --- | --- |
| Narrow-sense syntax | Syntactic categories | Yes | No |
| | Propositional meaning | Yes | No |
| | Lexicon | Yes | No (or very different) |
| Phonological syntax | Pitch: relative pitch | Yes | Yes |
| | Pitch: hierarchy | No | Yes |
| | Pitch: discreteness | No | Yes |
| | Grouping: hierarchy | Yes | Yes |
| | Grouping: large-scale | No (Yes at the text level) | Yes |
| | Meter: hierarchy | Yes | Yes |
| | Meter: isochronicity | No (Yes in poetics) | Yes |
| | Interaction of pitch, grouping, meter | No | Yes |
The similarity between musical and linguistic syntax at the level of narrow-sense syntax comparison boils down to the fact that they can be organized as headed hierarchies (
Second, the relationship between head and elaborations in music defines the tension-relaxation pattern of the sequence and thus encodes affect (
Third, music does not possess a rich lexicon comparable to the linguistic one, defined as a set of linguistic objects stored in long-term memory or a structured list of assemblies of features (e.g., phonological, semantic, and syntactic features;
Similarly, there are many parallels between music and phonology, but differences exist at the same time. Music and speech are learned complex vocalizations involving pitch, grouping, and metrical structures. In both domains, pitch and temporal structures are rule-based systems. Yet there are also significant differences. First, “relative pitch processing,” i.e., encoding and recognizing pitch patterns independently of the absolute frequency level, is an important shared mechanism of pitch processing in music and speech (
Second, at the lower level of perception, the principles of grouping are largely general-purpose gestalt perceptual principles which account for music and speech, as well as visual perception, in a similar way (
Third, in metrical structures,
Finally, differences become clearer when these subcomponents, namely pitch, grouping, and metrical structures, are integrated into more complex hierarchical structures: prolongational structures. Contrary to the linguistic prosodic hierarchy, the prolongational hierarchy is based strongly on the interaction between pitch and temporal organization. Moreover, in contrast to phonological rules determining physical changes, musical combinatorial rules have an effect on structural aspects (
In sum, narrow-sense syntax includes many aspects which do not fit musical syntax, and phonological syntax is not sufficient to capture all relevant structural features of music. One question arises: Does the distinction between narrow-sense syntax and phonological syntax make any sense for investigating musical syntax? Sometimes this issue is discussed in relation to the notion of “duality of patterning.” For language, duality of patterning is considered to be a central design feature (
The difference in the combinatorics at the first level might be not a categorical difference between music and language but rather a difference of degree; it has, however, an important consequence for the latter aspect of the duality of patterning, namely compositionality. For a structure to be compositional, meaningful units should be the primitives of combinatorics. As already discussed above, contrary to language, which possesses a rich, stable lexicon storing largely conventionally determined units of “freestanding” meaning, i.e., lexical elements (e.g., words), it is difficult to find such a unit in music. Moreover, although a stock of musical formulas stored in long-term memory may be thought of as similar to the lexicon in language, such formulas are not the primary primitives of syntactic manipulation (
Concerning small-scale processing, this might be true. However, in large-scale structures of music, the combinatorial primitives are groups (
The hypothesis about shared
Additional evidence for a close connection between music and language at the level of syntax comes from studies of cognitive disorders. Agrammatic aphasics show deficits in harmonic (but not melodic or rhythmic) syntactic processing (
Given that the principles of hierarchical structure building in music and language are very different, as discussed above, what actually is this shared aspect of syntax in music and language? One possible answer is that the shared resources of music and language syntax processing are related to “more general, structural and temporal integration” (
To sum up, the very similarity of syntax in music and language is
To investigate the nature of the similarities and differences between the syntax of music and that of language, we think that a further domain of comparison might be helpful, namely the comparison between music, language, and action.
Recently, the similarity between action and the (narrow-sense) syntax of language has received considerable attention from several fields of cognitive science (
The idea of investigating the syntax of music and language in terms of action was already introduced by
The first one is
One central property of motor programs and plans is often claimed to be their hierarchical organization. In particular, the cognitive representational approach to motor control emphasized the notion of hierarchically organized central plans or mental representations in the control of sequences of behavior (
How does the hierarchy of action relate to the hierarchies of music and language? First, the primitives of combinatorics can be compared. Basic actions, movements that achieve some goal, can be regarded as “action words” or “action constructions,” analogous to words in language, although this does not mean that action words and language words are completely the same (
Another important relationship between music and language (or rather phonology) is the importance of sensory-motor integration (even in the absence of any overt movements). In music and phonology, rule systems are based not only on acoustic features, but also on motor features. For example, the strong coupling of sensory and motor information in speech perception was pointed out by the well-known motor (command) theory of speech perception, which claimed that the invariance in phoneme perception arises from articulatory invariance (
The similarity between music and phonology in terms of sensory-motor integration provides further comparative options. BA 44, which is often regarded as a core region of narrow-sense linguistic syntax computing complex hierarchical structure (
Investigating music and language as parallel to action opens the door to resolving the conundrum of syntax. At this level of comparison, the gray zone in the narrow-sense comparison can be explained in terms of how a hierarchically structured plan is built in order to achieve a certain goal or “meaning” unique to each domain. The hierarchically structured plans of music, built to achieve musical goals, stand in strong relation to the affective-gestural system encoding tension-relaxation patterns as well as to the socio-intentional system, whereas those of language are based on its conceptual structure, with its rich compositionality, and on its communicative or pragmatic system.
First of all, because we consider goals as analogous to meaning, a brief discussion of musical meaning is unavoidable. While investigating musical meaning,
The lack of referential, propositional meaning yields the flexibility of musical meaning; i.e., musical meanings are not required to be made explicit among performers, listeners, and participants: “Musical meaning is fluid” (
Because of the significant role of affective-gestural as well as socio-intentional meaning, we suggest that these constitute musical goals. Contrary to linguistic goals, which relate largely to conceptual structure, musical goals relate more to other features of action, namely sensory-motor features. While
As already mentioned above, musical structure encodes affect. In music, “affect” is understood as the patterning of tension and relaxation that exists widely in human activity and experience (
On the other hand, musical affect also involves partly conventional rules, such as tonal grammar or other rules dependent on particular idioms or styles. For example, in Western tonal music, the tonal hierarchy provides stability conditions serving as a kind of conventional rule that, together with rhythm, determines the structural importance of pitch events, which reflects the tension-relaxation pattern of a musical sequence. Such a joint accent structure is considered to shape the structural expectancies of a musical sequence (what will happen when) and would be a basis of affective dynamics (
Both levels of musical affect have a very strong relation to movement and bodily representation (
Music is a primarily temporal phenomenon and its rhythmic structure is thus an important organizational principle, while linguistic rhythm is more a byproduct of other linguistic phenomena (
Rhythmic syntax is appropriate for investigating the relationship between sensory-motor integration, motor control, and action planning for several reasons. First, as already noted, the sensory-motor connection is mainly reflected in the domain of meter. Second, there is evidence that the structure of action shapes metrical structure. For example, the way a pulse is perceived in African drumming music is tightly connected to the way of dancing to it (
In particular, these aspects can be investigated in relation to the phenomenon of
In language and speech research, our framework can be applied in a bottom-up fashion, namely in terms of sensory-motor integration and motor control. Phonological rules determine how online movements are produced and controlled. As we saw, musical and phonological rule systems are similar in making use not only of sensory information, but also of motor information. The questions of how plans relate to motor programs and how the planning perspective can be aligned with sensory-motor integration and motor control remain open. One suggestion relating sensory-motor integration, motor control, and linguistic units was made by
In this paper, we have attempted to find an adequate level of comparison between music and language to capture the intuition that they share a “syntax.” We saw that many theoretical and experimental investigations tend to focus mainly on the hierarchical aspects of musical and linguistic syntax and face the conundrum that similarities and differences exist simultaneously. Musical headed hierarchies, based on structural importance in terms of rhythmic and harmonic stability, cannot be compared in a one-to-one manner with linguistic headed hierarchies based on syntactic categories, propositional meaning, and lexical combinatorial units. The models suggested to resolve this problem in terms of domain-specific representations and shared syntactic integration resources do not make clear how domain-specific representations are activated and integrated by the same syntactic resources. Even switching to phonological syntax does not quite solve the conundrum. The mechanisms processing pitch, grouping, and metrical structures seem to be similar in music and speech, but in music pitches are discrete, more fine-grained than those in speech, and hierarchically organized; grouping is less restricted; and metrical structures are isochronous. Moreover, the prolongational structure of music is meaningful in a way that phonological structures are not, because it encodes affect. In sum, the core similarity of syntax in music and language is that hierarchical structures bundling different types of information must be mapped onto, and constructed from, linear strings in order to make sense of sequences by building structural expectancy through temporal integration. However, this is not enough to explain syntax in music and language.
As a first step toward resolving the conundrum, we introduced another level of comparison, namely action, whose hierarchical organization can be compared to the narrow-sense syntax of language, phonological syntax, and musical syntax. We claimed that hierarchical plans as well as sensory-motor integration are of particular importance in comparative language-music research. The conceptual framework we developed in terms of action-related components such as goals of action, action planning, motor control, and sensory-motor integration provides a new possibility for comparative research on music and language from theoretical as well as empirical perspectives. Regarding music and language as parallel to action enables us to explore the syntax of music and language independently of any highly specific linguistic concepts. At this level of comparison, some of the differences between language and music can be explained in terms of the different goals reflected in the hierarchical plans: the hierarchical structures of music arise to achieve goals with a strong relation to the affective-gestural system encoding tension-relaxation patterns as well as to the socio-intentional system, whereas the hierarchical structures of language are embedded in a conceptual system that gives rise to compositional meaning. Although we did not explicitly discuss the relationship between syntax and semantics from an action-oriented perspective, to us this is a very important research question to be addressed in comparative research on language and music. Especially for musical semantics, an action-oriented approach seems to open up new research perspectives (
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Special thanks to Ray Jackendoff for several insightful discussions in developing the ideas introduced in this paper, and to three reviewers, Uwe Seifert, Clemens Maidhof, Lüder Schmidt, and the colloquium members of the Department of Systematic Musicology, Cologne, Germany, for helpful comments on an earlier version of this manuscript. We would also like to thank Michael A. Arbib for his critical and constructive comments, on which we will keep working in our future research. The present work was made possible through a Marie Curie International Reintegration Grant from the European Union (PIRG-GA-2009-256413), research funds from the Fundació Bosch i Gimpera, the Generalitat de Catalunya (2014-SGR-200), and the Spanish Ministry of Economy and Competitiveness (FFI2013-43823-P), all associated with CB.
Boeckx (2010) discusses this issue more in detail. A comparative approach of language and music in this line of research was introduced by Fritz et al. (2013).
“[S]yntax consists of a process for progressively merging words into larger units, upon which are superimposed algorithms that determine the reference of items (in various types of structural configuration) that might otherwise be ambiguous or misleading”
In Western tonal music pitch is “the most obvious form- and structure-bearing dimension” (
Some readers familiar with evolutionary research of syntax might think about the distinction between the faculty of language in the narrow-sense (FLN) and in the broad-sense (FLB) introduced by
We note here that the importance of rhythmic aspects beside tonal-harmonic aspects was already pointed out in GTTM and subsequent work by
The existence of the tension and relaxation pattern tends to be regarded as restricted to Western tonal music, but there are good reasons to apply this concept to other musical styles and cultures. For example, in atonal music, in which the tonal center is not clear, salience conditions based on, e.g., registral prominence, relative loudness, and motivic importance play an important role in building the prolongational structure encoding affect (
For example, one particular melody, say “Happy Birthday to You,” played in different keys is still recognized as the same melody, and intonation contours and lexical tones can likewise be identified across speakers with different voice frequencies. Notably, non-human animals, including songbirds, have difficulties with this relative pitch processing (
In the linguistic domain, there is also an example where the rhythmic aspect becomes an organizational principle, namely poetics (
Phonology, especially supra-segmental phonology such as prosody, seems to encode pragmatic meaning. Though there are some approaches investigating the relationship between pragmatics and musical meaning (
Such abstract frameworks are sometimes called schemas.
Some authors (e.g.,
It is worth noting that the left-hemisphere bias in speech/language processing is claimed to have a particular importance (
Importantly, action differs from mere movement in that it includes some sort of goal or intention; i.e., dangling one's arms without a certain goal or intention is not an action, although it is a movement. In this sense, actions (and subactions) comprise more than just movements, which are, however, necessary building blocks of actions.
However, it is also claimed that the evidence for Broca’s area as a shared neuronal substrate for human gesture and language is very weak (
This characterization was developed in the discussion with one of the reviewers and by means of the characterization of “plan” in
A similar idea was already introduced by
A detailed discussion of the goals in language is beyond the scope of the current article. Here, we would briefly like to note the reasons why we chose conceptual and pragmatic goals. Concerning the former,
Extra-musical meaning might share something with the propositional meaning of language, which refers to states of the world. However, it plays a secondary role in musical meaning.
These three kinds of meaning also exist in language, but they play a more peripheral role there than in music because of the existence of referential and propositional meaning.
For more detailed discussions about “musical gesture,” see