Edited by: Gottfried Schlaug, Beth Israel Deaconess Medical Center and Harvard Medical School, USA
Reviewed by: Gottfried Schlaug, Beth Israel Deaconess Medical Center and Harvard Medical School, USA; Shinya Fujii, Beth Israel Deaconess Medical Center and Harvard Medical School, USA
*Correspondence: Floris Tijmen van Vugt, Institute of Music Physiology and Musicians’ Medicine, University of Music, Drama and Media, Emmichplatz 1, Hanover 30175, Germany. e-mail:
This article was submitted to Frontiers in Auditory Cognitive Neuroscience, a specialty of Frontiers in Psychology.
This is an open-access article distributed under the terms of the
We investigated how musical phrasing and motor sequencing interact to yield timing patterns in conservatory students’ playing of piano scales. We propose a novel analysis method that compares the measured note onsets to an objectively regular scale fitted to the data. Subsequently, we segment the timing variability into (i) systematic deviations from objective evenness that are perhaps residuals of expressive timing or of perceptual biases and (ii) non-systematic deviations that can be interpreted as motor execution errors, perhaps due to noise in the nervous system. The former, systematic deviations reveal that the two-octave scales are played as a single musical phrase. The latter, trial-to-trial variabilities reveal that pianists’ timing was less consistent at the boundaries between the octaves, providing evidence that the octave is represented as a single motor sequence. These effects cannot be explained by low-level properties of the motor task such as the thumb passage, and they did not show up in simulated scales with temporal jitter. Intriguingly, this instability in motor production around the octave boundary is mirrored by an impairment in the detection of timing deviations at those positions, suggesting that chunks overlap between perception and action. We conclude that the octave boundary instability in the scale playing motor program provides behavioral evidence that our brain chunks musical sequences into octave units that do not coincide with musical phrases. Our results indicate that trial-to-trial variability is a novel and meaningful indicator of this chunking. The procedure can readily be extended to a variety of tasks to help understand how movements are divided into units and what processing occurs at their boundaries.
Playing music means executing particular motor commands to create an auditory stimulus (Jäncke,
Insight into the organizational structure of motor actions is provided by sequence learning paradigms. Participants learning to type a sequence of numbers divide it into smaller subsequences so as to facilitate learning (Koch and Hoffmann,
On the other hand, the musical material is thought to contain structural cues in timing deviations that are, intriguingly, reminiscent of the timing effects found in the sequence production literature. That is, pianists slow down at the end of musical phrases (Palmer and Krumhansl,
Do these results imply that the structure of musical phrases is that of the motor commands used to create it? In other words, could musical phrases and motor sequences be two sides of the same coin? This idea conflicts with the
Then what causes temporal deviations in musical playing: motor chunking or perceptual deviations? We propose musical scale playing as our paradigm to investigate this question. Scales are the quintessential musical practice materials: the first thing Mozart’s Zauberflöte character Tamino plays on his flute is a scale. Indeed, practising scales is one of the chores that every classical pianist is engaged in for many hours during their professional career. As a result, we can expect that their motor structure has become sufficiently stable. In order to disentangle motor sequence structure and musical phrasing, we present a novel analysis of scale timing. This method segments the unevenness of playing into:
(i) systematic deviations from objective evenness that are perhaps residuals of expressive timing or of perceptual biases;
(ii) non-systematic deviations that can be interpreted as motor execution errors, perhaps due to noise in the nervous system (Harris and Wolpert,
The analysis is described in more detail below.
In this study we first present a validation of our irregularity-instability analysis by showing that it allows one to reconstruct the previously used unevenness measure (Experiment I). We hypothesize that instability of the various notes in a scale will indicate how the motor program is chunked, whereas irregularity reveals its musical structure (Experiment I and II). Finally, in order to gain insight into the link between perception and action, we investigate the relation between auditory perceptual resolution and playing instability (Experiment III).
Furthermore, our study is the first to investigate various types of scales, thus being able to control for differences in motor program (i.e., the fingering) and musical content (for example, major vs. minor scales). In addition to the C-major scale, we will include two other scales as controls. The first is the A-minor scale, which is of interest to us since it is played with a fingering identical to C-major. That means, in terms of low-level motor execution it is exactly the same as C-major (except of course for its being played three tones lower than C-major) whereas in terms of the musical content and the tension-resolution profile it is very different (Krumhansl and Kessler,
Thirty-four right-handed pianists were recruited from the student pool at the Hanover University of Music. Participants (17 female) were on average 24.72 (SD 4.47) years old. They started piano training at 6.4 (SD 2.2) years of age, and had accumulated an average of 14140 (SD 8894) practice hours at their instrument. None of these participants reported any neurological disorder or problems related to performing, such as chronic pain. Participants played a Kawai MP9000 stage piano connected to a Pioneer A109 amplifier. The MIDI data was captured through an M-Audio MIDI to USB converter and fed into a Linux PC running a custom-developed C program that captured the MIDI events. The participants were invited to first play for a few minutes to get used to the set-up and warm up. Then they played the scale exercises, which are explained in detail below. The exercises were presented in note score format with indicated (standard) fingering. The pianists were asked to play as regularly as possible at a comfortable mezzo-forte loudness and in legato style. The entire procedure took about half an hour, and the pianists received a nominal financial compensation (10 Euro). The experiment was performed in accordance with the Declaration of Helsinki.
Participants played two-octave piano scales accompanied by a metronome at 120 BPM. They played four notes within a metronome beat, i.e., eight keystrokes per second. They played blocks of approximately 30 alternating ascending and descending scales with a 9-note rest in between. The scales were played in the following blocks, separated by small breaks: (i) C-major with the right hand, (ii) C-major with the left-hand, (iii) A-minor with the right hand, (iv) F#-major with the right hand, (v) C-major with both hands. The left-hand and both-hand conditions were included as part of our scale playing battery, but will not be reported on in this paper. The C-major and A-minor scales were played with their conventional fingering (123123412312345, where the numbers indicate the fingers from the thumb, 1, to the little finger, 5, and the F#-major with 234123123412312). Following musicological convention, we will refer to the notes by their rank in the scale, in ascending order:
Previous scale playing studies computed the SD of the intervals between the subsequent onsets of the keystrokes as a measure of playing unevenness (Seashore,
However, a shortcoming of this metric is that it cannot be applied to investigate single-note timing deviations relative to an established temporal reference. For example, suppose one note in the scale is played too late, which is referred to as an “event onset shift” (Repp,
First we will describe our analysis of a single played scale. Suppose we have isolated the keystrokes and onsets of one correctly played scale. We then convert the note values to their rank in the scale (so for a C-major scale c would have rank 0, d rank 1, e rank 2, etc., up to c″ with rank 14) and perform a least-squares straight-line fit to this set of pairs of rank and timing. This allows us to compute for each note the expected time according to this fit and then the deviation of the actually measured onset from that expected time (in ms).
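The single-scale step described above can be sketched as follows. This is a minimal illustration, not the authors' original code: the function name `scale_deviations`, the use of NumPy's `polyfit`, and the example onsets (a perfectly regular scale at 8 notes/s with one hypothetical late note) are assumptions made here.

```python
import numpy as np

def scale_deviations(ranks, onsets_ms):
    """Deviation (ms) of each onset from a least-squares straight line.

    ranks     : rank of each note in the scale (0..14 for two octaves)
    onsets_ms : measured onset time of each note, in milliseconds
    """
    ranks = np.asarray(ranks, dtype=float)
    onsets = np.asarray(onsets_ms, dtype=float)
    # Straight-line fit of onset time against rank (degree-1 polynomial).
    slope, intercept = np.polyfit(ranks, onsets, deg=1)
    expected = intercept + slope * ranks   # the fitted, objectively regular scale
    return onsets - expected               # positive = later than the fit predicts

# Hypothetical example: 125 ms between notes, one note played 15 ms late.
ranks = np.arange(15)
onsets = 125.0 * ranks
onsets[7] += 15.0
deviations = scale_deviations(ranks, onsets)
```

Because the line is fitted to the played scale itself, a single late note also shifts the reference slightly, so the other notes come out marginally early relative to the fit.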
Now we turn to our procedure to analyze the entire MIDI recording for a single participant. First we identified correctly played ascending and descending scales. We then performed the analysis described above for each scale separately and grouped the obtained temporal deviation values by playing direction (ascending or descending) and by note, yielding for each note and direction a set of about 30 such deviations, one for each repetition.
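Once the deviations are grouped by note, each note's set of repetitions can be summarized. The sketch below assumes the deviations for one playing direction are stored as an array with one row per repetition and one column per note. Instability is taken as the interquartile range of each note's deviations, as the text states further on. Taking irregularity as the mean signed deviation per note is an assumption of this sketch, and the function name `per_note_profiles` is hypothetical.

```python
import numpy as np

def per_note_profiles(deviations):
    """Summarize deviations of shape (n_repetitions, n_notes), in ms.

    Returns (irregularity, instability), one value per note.
    """
    dev = np.asarray(deviations, dtype=float)
    # ASSUMPTION: irregularity = mean signed deviation of each note.
    irregularity = dev.mean(axis=0)
    # Instability = interquartile range of each note's deviations.
    q75, q25 = np.percentile(dev, [75, 25], axis=0)
    instability = q75 - q25
    return irregularity, instability
```

The interquartile range is less sensitive than the standard deviation to an occasional badly timed keystroke, which may be one reason to prefer it with only about 30 repetitions per note.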
As an illustration of how irregularity and instability were computed, we will present the data of a single participant playing a two-octave C-major scale. Each line in the Figure
Irregularity corresponds to what most previous studies have investigated. We found the irregularity trace is roughly an arc (Figure
In sum, we show that, although the two-octave scale is played as a single musical phrase, it is divided into two motor sequences with higher instability at their transition. Our study is the first that we are aware of to reveal this separation.
One participant was eliminated because he did not follow the instructions to play in a legato style. Scales that were incorrectly played were rejected (3.9% of the note onset events) as were scales for which the least-square fit had an
First, we compare our irregularity-instability analysis to the existing measure of unevenness. For each participant, scale, hand, and playing direction (ascending, descending), we computed unevenness, irregularity, and instability as follows. Unevenness was calculated by taking the SD of the inter-keystroke intervals in each scale run and then averaging across all runs in each playing direction. Irregularity was computed as above for each note in the scale and then averaged across runs in each playing direction. Instability was calculated as the interquartile range of the deviations of each note, and then averaged across the notes in each scale and playing direction. That is, for each participant we obtained six scale conditions (the five scale tasks listed in the methods, one of which was played bimanually) times two playing directions (ascending, descending), i.e., 12 data points, for each of which we had three scalars: the unevenness, irregularity, and instability. We then proceeded with a multiple linear regression to predict the former on the basis of the latter two. Both irregularity and instability emerged as significant predictors (both
First, we performed an overall 3 scales (C-major, A-minor, and F#-major) × 2 directions (ascending, descending) × 15 notes ANOVA with irregularity as the outcome variable and participant as the error term. We found no main effect of scale [
Second, we performed the same ANOVA (3 scales × 2 directions × 15 notes) but with instability as an outcome measure. We found no main effects of scale or direction, but again a main effect of note [
We used trend analysis using orthogonal polynomials to investigate the contributions of the various polynomial degrees to the main effect of note on instability. We found no linear or cubic effect but a quadratic (u-shaped) and quartic (w-shaped) effect [
At this point, one may wonder whether there are systematic timing differences at the octave boundary. That is, does the peak in instability at the middle octave boundary also appear in the irregularity trace? Trend analysis using orthogonal polynomials on the irregularity trace revealed a strong quadratic trend [
Now we turn to the A-minor scale (Figure
In the F#-major scale we found again the instability peak at the octave boundary in the descending scales [
First of all, we have validated our analysis method. We have decomposed the variability from a single variable (unevenness) into two mostly independent and qualitatively different factors (irregularity and instability). This is comparable to the way a vector in the Euclidean plane can be written as a linear combination of two basis vectors.
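The validation can be pictured as an ordinary least-squares regression of unevenness on the two new measures. The sketch below is only an illustration under stated assumptions: the function name `regress_unevenness` is hypothetical, each input is assumed to hold one value per participant-by-condition data point, and no significance testing is included.

```python
import numpy as np

def regress_unevenness(irregularity, instability, unevenness):
    """Fit unevenness ~ b0 + b1 * irregularity + b2 * instability.

    Returns the coefficients (b0, b1, b2) and the proportion of
    variance explained (R squared).
    """
    irregularity = np.asarray(irregularity, dtype=float)
    instability = np.asarray(instability, dtype=float)
    y = np.asarray(unevenness, dtype=float)
    # Design matrix: intercept column plus the two predictors.
    X = np.column_stack([np.ones(len(y)), irregularity, instability])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot
```

A high R squared from such a fit would indicate that the two components together recover most of what the single unevenness score captures.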
The participants’ trial-to-trial variability profiles (instability) show a clear w-shape pattern across the two scales, with greater instability at the beginning and end of the scale, but surprisingly also in the middle, at the boundary between the octaves. To our knowledge, our study is the first to reveal such subtle but robust differences in timing consistency. Of course, the irregularity and instability curves are related: when the mean deviation is high, the variance typically also increases. This could explain how instability peaks at the beginning and end of the two-octave scale are accompanied by irregularity peaks at those locations. This means that our finding of the instability peak at the octave boundary is all the more striking since the u-shaped irregularity curve is at its low-point there.
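To make the u-shaped (quadratic) and w-shaped (quartic) descriptions concrete, a per-note profile can be projected onto orthogonal polynomial contrasts. The sketch below builds the contrasts by QR decomposition of a Vandermonde matrix, which is one common construction. It is a descriptive decomposition of a single profile, not the repeated-measures trend test reported in the text, and the function name `polynomial_trend_power` is hypothetical.

```python
import numpy as np

def polynomial_trend_power(profile, max_degree=4):
    """Sum of squares of a per-note profile attributable to each
    polynomial degree (1 = linear, 2 = quadratic, 3 = cubic, 4 = quartic).
    """
    y = np.asarray(profile, dtype=float)
    x = np.arange(len(y), dtype=float)
    x -= x.mean()                                   # center the note positions
    # Columns 1, x, x^2, ...; QR orthonormalizes them degree by degree.
    vander = np.vander(x, max_degree + 1, increasing=True)
    q, _ = np.linalg.qr(vander)
    scores = q.T @ y                                # sign depends on the QR convention
    # Squaring removes the arbitrary sign of each contrast.
    return {degree: scores[degree] ** 2 for degree in range(1, max_degree + 1)}
```

A profile that is high at both ends and low in the middle loads mainly on degree 2, whereas an additional peak in the middle adds a degree-4 contribution.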
One may argue that these two outer peaks could alternatively be explained by the mechanical effect of inverting the wrist movement, which switches at those locations between left-to-right and right-to-left movement. Against this account, first of all, the inversion of movement direction is not abrupt, since ascending and descending scales in our measurement are separated by a 9-note rest (1.125 s). Secondly, such an explanation could not account for why a comparable peak occurs in the middle of the two octaves.
One other potential explanation for the w-shaped variability pattern is that at the boundary between the two-octaves another event occurs: the thumb passes underneath the fourth finger to be able to play the
Another potential explanation for the w-shaped instability would be that this pattern is related to the metronome. However, note that a metronome click occurs every four notes, that is, at
Then the question arises how the increased instability at the octave boundary is to be explained. If the motor system conceived of the two octaves as a single motor program, there would be no reason for the playing to become more variable in the middle. Rather, our interpretation is that the octave boundary marks the transition between concatenated motor programs. Under this view, the increased instability is a result of having to load the next sequence into the motor buffer (Lashley,
It is interesting to note that the scales under investigation have revealed the octave boundary effect for both ascending and descending scales (except in the F#-major scales), suggesting that the motor system has chunked these in the same way. Ascending and descending scales are essentially the same movement, but mirrored in time. Therefore, our finding suggests that motor program chunking would be mostly invariant to temporal inversion.
At this point it remains possible, at least in principle, that this instability effect is an artifact of our line fitting procedure or another aspect of our analysis. To control for this, we ran the same procedure on simulated data in Experiment II.
We used a Python script to simulate the scale playing of 33 pianists. Each simulated pianist played 30 ascending and 30 descending two-octave C-major scales at 8 notes/s, yielding a total of 900 note onsets that were perfectly regular in time. The timing of each note was then jittered by a time value sampled from a normal distribution with zero mean and a SD of 9 ms. This value was chosen such that the resulting instability profile was on average similar to that found in the real pianists. The same analysis as described for Experiment I was then applied to these data.
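A simulation along these lines can be sketched as follows. This is not the original script: the function names, the random seed, and the choice to simulate a single playing direction are assumptions made here for illustration. The tempo, the number of repetitions and notes, and the 9 ms jitter SD follow the description above.

```python
import numpy as np

def simulate_pianist(rng, n_scales=30, n_notes=15, notes_per_s=8.0, jitter_sd_ms=9.0):
    """Jittered onsets (ms), shape (n_scales, n_notes), for one playing direction."""
    interval_ms = 1000.0 / notes_per_s                 # 125 ms between notes
    regular = np.arange(n_notes) * interval_ms         # perfectly regular scale
    jitter = rng.normal(0.0, jitter_sd_ms, size=(n_scales, n_notes))
    return regular + jitter

def instability_profile(onsets):
    """Per-note interquartile range of deviations from each scale's own fitted line."""
    ranks = np.arange(onsets.shape[1])
    deviations = np.empty_like(onsets)
    for i, scale in enumerate(onsets):
        slope, intercept = np.polyfit(ranks, scale, 1)
        deviations[i] = scale - (intercept + slope * ranks)
    q75, q25 = np.percentile(deviations, [75, 25], axis=0)
    return q75 - q25

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
profile = instability_profile(simulate_pianist(rng))
```

Since the jitter is independent of note position, any structure in the resulting profile would have to come from the analysis itself, which is what this control is meant to expose.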
Overall instability levels were comparable to those of the human pianists reported in Experiment I and are shown in Figure
We ruled out the possibility that the octave boundary instability seen in the recordings of pianists is an artifact of our analysis. If it were, it would also have occurred in the simulated corpus.
Our novel analysis of scale playing has divided the playing unevenness into largely independent components: systematic deviations from regularity and trial-to-trial instability.
In order to test this hypothesis we performed a temporal delay detection experiment in which one note was delayed at various positions in the scale, and participants were asked to detect this. Since Experiment I revealed the octave boundary effect to be present in all three scales (although to a lesser extent in F#-major), we decided to restrict our current investigation to the C-major scale only.
Nineteen music students of various instruments were recruited from the Hanover University of Music. We used a python script to generate two-octave C-major scales (from
The Python script generated MIDI files (using the MXM Python MIDI package), which were then converted offline into WAV format using Timidity and presented using Audacious. Participants indicated on a paper form for each scale whether they heard a timing deviation. As training, they first heard two example scales with a (longer) deviation and two scales without, and received accuracy feedback.
Overall, the participants responded 79% (SD 8.2%) correctly, showing the feasibility of the task. For each of the five delay locations we calculated the hit rate (correct answers/number of presentations). It is not possible to calculate a
Detection rates were above chance level at all locations in the scale (binomial test all
This experiment reveals that detection accuracy varies by note position in the scale. Overall accuracy was good, showing that the task was feasible. Optimal performance was seen in the middle of either of the octaves (at
The crucial case, however, is the transition point between the two octaves. We observed a decreased auditory sensitivity to delays at this point, but pianists’ playing shows no systematic deviation (Experiment I). Thus, one must reject our initial hypothesis that the auditory detection profile mirrors the irregularity trace. This is a tantalizing finding that nuances the way perceptual distortions affect action: a loss of perceptual resolution is reflected in a loss of playing stability in the absence of consistent timing deviations (irregularity). Finally, this result again undermines the interpretation that the instability peak at the octave boundary is related to a low-level motor process such as fingering. Such an interpretation would predict that there is no deterioration of auditory perception at the boundary, whereas our experiment shows there is.
Previous studies have investigated sensitivity to timing changes in regular sequences of events (Hyde and Peretz,
In sum, we have revealed a parallel between the instability trace in pianists’ playing and the detection rate of timing perturbations in listeners.
Our results indicate that it is possible to meaningfully dissociate irregularity as planned by the motor system and instability of the execution of the motor program. Experiment I revealed that these two factors contribute to the SD of the keystroke intervals as investigated in previous studies. The advantage of our analysis is that we tease apart systematic deviations, which can be rooted in perceptual biases (Penel and Drake,
Note-by-note investigation reveals that instability is greater at the boundaries between the octaves. This is true for two-octave C-major scales and the motorically identical A-minor scales, revealing that it is not related to the C-major musical content but related to motor execution. We interpret these results as revealing that at the octave boundary a transition occurs between subsequent motor program chunks. Previous studies have interpreted the chunking as an aid to learning (Sakai et al.,
The octave boundary instability is less strong but still present in the F#-major scales. This can readily be explained by the fact that F#-major scales are much less intensively practised because they are less common in the music literature. For example, one participant did not know it is played with a b rather than a b#. In other words, the F#-major scale may be represented more note-by-note in the motor system because it is played less frequently.
Indeed, we searched the ThemeFinder corpus (
In order to further clarify the processing that occurs at the octave boundary, future investigation could add a weight to the wrist during scale playing, increasing its inertial mass. This means that the preparation for the thumb passage movement would have to be longer (Engel et al.,
The picture that emerges from Experiment I is that the octave boundary instability is mainly a low-level motor sequencing phenomenon. But Experiment III reveals an unexpected parallel in perception: that the detection rate is lower at the boundaries of the two-octaves. Assuming that the two phenomena are causally related, which one is the cause and which the effect?
First we consider the possibility that the lack of auditory resolution at the octave boundary causes the playing to be less precise at those points. Similar hypotheses have been advanced that relate musical production to perception. One is that slowing down at the end is musically appropriate and not perceived as deviant (Repp,
The perceptual hypothesis could be amended to include this possibility. Imagine that due to the lack of perceptual resolution at the octave boundary, the interval does not sound systematically shorter, but sometimes shorter and sometimes longer. As a result, the playing would sometimes compensate by playing it longer and sometimes by playing it shorter, yielding increased playing variability but not systematic deviation, in line with our findings. What is not satisfying about this explanation is that it does not account for why perceptual resolution is lower at such locations that do not seem musically meaningful. However, an even more immediate problem is that it would predict the octave boundary instability to be present equally in the F#-major scales, contrary to our findings.
A second explanation is the inverse causality: a lack of playing precision leads to impaired perception. Indeed, participants in Experiment III were musicians and could therefore be heavily influenced by exposure to musical material. A future study could decide this issue by performing the perception experiment on non-musicians. A limitation of such an investigation will be that even non-musicians have been passively exposed to a great deal of music.
In sum, then, we conclude that the chunks formed in the motor system and in the perceptual system overlap, at least for the materials presently studied. How chunk formation in the two systems is causally related remains yet to be answered.
These questions open the road to further investigation into the relation between music perception and production, which may be more complex than previously accounted for. A future study may use a signal-detection-theoretical framework to tease apart response bias and sensitivity and correlate these to playing irregularity and instability. Our prediction is that the irregularity trace will be mirrored by the detection bias, whereas the instability trace reflects the inverse of the sensitivity.
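For readers unfamiliar with the signal-detection measures mentioned here, sensitivity (d′) and response bias (criterion) could be computed from response counts as sketched below. This illustrates the standard equal-variance Gaussian model and is not an analysis performed in this study. The function name `sdt_measures` and the log-linear correction (adding 0.5 to each cell) are assumptions, and the sketch presupposes a design in which false alarms can be attributed to each scale position.

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    ASSUMPTION: a log-linear correction keeps both rates strictly
    between 0 and 1, so that the z-transform stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf                    # inverse standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)          # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))   # response bias
    return d_prime, criterion
```

Under the prediction stated above, the criterion at each note position would be compared with the irregularity trace and d′ with the inverse of the instability trace.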
In sum, our study points to a dissociation between musical phrases and motor programs. Musical phrases have previously been found to be indicated by systematic slowing at the end (i.e., increased irregularity), whereas our finding is that motor sequences are demarcated by increased playing instability. Perhaps the two reflect the previously discovered dissociation between timing processes and item sequencing (Pfordresher,
One limitation of the current study is that, although unlike previous studies we included scales of different tonalities, all our material consisted of two-octave scales. A future study could investigate scales over three octaves, although care would need to be taken to control for the larger distance the arm needs to cover to reach the three octaves.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the EBRAMUS, European Brain and Music Ph.D. Grant to Floris Tijmen van Vugt (ITN MC FP7, GA 238157). The authors are indebted to Karl Hartmann for assistance in running the experiment. We thank Dr. Michael Grossbach for discussions during the exploration and analysis of the data and Dr. Shinichi Furuya for invaluable comments on an earlier manuscript. Furthermore, the C program used to capture MIDI was developed at our institute (IMMM Hannover) by Klaas Hagemann, Lukas Aguirre and Martin Neubauer. Finally, Felicia Cheng kindly shared with us some of her data, which we used for a pilot analysis (not reported in this article).