Edited by: Peter Neri, University of Aberdeen, UK
Reviewed by: Jessica A. Grahn, Western University (University of Western Ontario), Canada; Carl M. Gaspar, University of Glasgow, UK
*Correspondence: Sonja A. Kotz, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, D-04103 Leipzig, Germany e-mail:
This article was submitted to Perception Science, a section of the journal Frontiers in Psychology.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Auditory scene analysis describes the ability to segregate relevant sounds out from the environment and to integrate them into a single sound stream using the characteristics of the sounds to determine whether or not they are related. This study aims to contrast task performances in objective threshold measurements of segregation and integration using identical stimuli, manipulating two variables known to influence streaming, inter-stimulus-interval (ISI) and frequency difference (Δf). For each measurement, one parameter (either ISI or Δf) was held constant while the other was altered in a staircase procedure. By using this paradigm, it is possible to test within-subject across multiple conditions, covering a wide Δf and ISI range in one testing session. The objective tasks were based on across-stream temporal judgments (facilitated by integration) and within-stream deviance detection (facilitated by segregation). Results show the objective integration task is well suited for combination with the staircase procedure, as it yields consistent threshold measurements for separate variations of ISI and Δf, as well as being significantly related to the subjective thresholds. The objective segregation task appears less suited to the staircase procedure. With the integration-based staircase paradigm, a comprehensive assessment of streaming thresholds can be obtained in a relatively short space of time. This permits efficient threshold measurements particularly in groups for which there is little prior knowledge on the relevant parameter space for streaming perception.
Every day our auditory system is confronted with a wide range of sounds, some of which may be more relevant to us than others, such as the voice of a friend we are listening to at a noisy party. Our auditory system is able to focus on this particular sound information, separating it from all the other auditory objects or streams around us, such as other conversations, music, knives and forks clicking, or the street noise drifting up through the window. Additionally, the sounds belonging to one source can be integrated together into one stream, allowing us to hear a continuous sequence that is the story our friend is telling us rather than unconnected individual noises. This ability, which allows us to perform such a complex organization of our auditory surroundings, is generally known as “Auditory Scene Analysis” (Bregman,
In the laboratory, the phenomena of stream segregation and integration can be examined using two differing sounds, A and B. If these two sounds are alternated in time (ABBABBABB) they may be heard as integrated into one stream. However, under certain conditions, they may also seem to “split” or segregate, so that the listener hears two rather than one stream of sound. These two streams each correspond to the repetitions of one of the two sounds, here A--A--A-- accompanied by -BB-BB-BB (Van Noorden,
Various factors can influence our ability to integrate or segregate two streams, but two factors that are known to have the greatest influence on streaming thresholds are inter-stimulus interval (ISI), and frequency separation (Δf) (Bregman et al.,
Whilst the study of auditory streaming has spawned a large body of literature since it was first described (Ortmann,
There are a variety of approaches by which auditory stream segregation and integration can be measured, both with and without asking for an explicit subjective report from the participants. When measuring streaming thresholds subjectively, subjects are usually asked explicitly whether they perceive the tones as one or two streams. The advantage of this type of measure is that it is easy to set up, and a direct report of the subject's perception of the tones can be recorded. However, the disadvantage is that this type of measurement cannot be used in a paradigm in which subjects are not attending to the stimuli, as can be the case in electroencephalography (EEG) studies. Further, some subjects, especially children or clinical populations, may find it difficult to give a reliable subjective report on whether they perceive one or two streams. This difficulty may be exacerbated by the fact that the tasks are often performed in a rather unnatural situation, and by the fact that subjects may feel a pressure to “do well.”
By asking subjects to complete a perceptual task that is supported either by an integrated or a segregated percept, it becomes possible to measure auditory streaming thresholds objectively, i.e., without explicitly asking about the subject's perception of the streams. For example, it is easier to detect particular details about one stream, such as its regularities and any deviant contained within it, if it is not integrated with the second stream. The second stream may hide the regularities of the first, thus making deviants harder to detect (Sussman et al.,
Looking at the approaches described and used in different studies, it becomes clear that there is a certain lack of uniformity when it comes to how streaming is measured. This makes it difficult to compare the outcomes of different studies directly, because stimuli, paradigms, measurement methods, laboratories, and subjects all differ. Under these circumstances, it is important to examine some of these approaches in a systematic fashion, using the same subjects and stimuli, with data recorded in the same experiment under very similar conditions. This would permit a direct comparison of integration and segregation thresholds, measured both subjectively and objectively. There is, as far as we are aware, only one study that set out to perform a systematic comparison of tasks for the measurement of stream integration and segregation (Micheyl and Oxenham,
One reason for the long duration of the measurement is that in most studies, fixed combinations of ISI and Δf values are determined in advance. Measurements must then be taken at each of these fixed points for each subject, regardless of whether the particular ISI-Δf combination is close to their streaming threshold or not. This may lead to large amounts of time spent measuring ceiling or floor effects. We propose to overcome this problem by setting only one parameter (ISI or Δf) as fixed, while varying the other one according to a staircase protocol. This would render the measurement considerably more time-efficient, allowing the coverage of a larger parameter space. Note that even though staircase procedures have been used in streaming studies before (Cusack and Roberts,
A staircase procedure for directly targeting the streaming threshold has been used in some studies (McAnally et al.,
By using both ISI and Δf as variables in a staircase procedure, the current study attempts to find a paradigm that gives good coverage of the parameter space to be tested by removing the necessity of pre-selecting a certain number of fixed values at which to measure. A further aim of the study is to complete all threshold measurements to be compared in one 75–90 min session for each subject by exploiting the efficiency of the staircase measurement procedure, thereby allowing within-subjects comparisons without the burden of multiple testing sessions.
Two objective tasks were chosen for the current study, one that is supported by hearing two segregated streams, here called the intensity task, and one that is supported by hearing one integrated stream, here called the rhythm task. The intensity task is based on a paradigm used by Sussman et al. (
Twenty healthy adult subjects participated in the first experiment. Of these, 19 subjects (3 male, 16 female; mean age 23.63,
The experiment was set up to map out the influence of ISI and Δf on auditory stream segregation by looking at the stream segregation and integration thresholds. For the objective measures, this was done using a weighted 3 up—1 down staircase procedure (Kaernbach,
Objective threshold measurement, intensity task (easier in two-stream percept), pre-determined ISI levels, variable Δf. Pure sinusoidal tones of 50 ms duration were presented over Sennheiser HD25-1 closed back on-ear headphones. A sequence with a regular ABB pattern consisting of 3 tones (one fixed lower tone at 250 Hz and with a level of 70 dB SPL, and two variable higher tones with variable frequencies and random levels of 65, 75, or 85 dB SPL) was used to determine the Δf threshold. All sound levels were measured using a HEAD acoustics HMS III.0 artificial head measurement system. The overall length of each sequence was 25 tones (i.e., ending in … ABBABBA). The sequences were presented at a fixed ISI pre-set at one of four values (50, 100, 150, or 200 ms, manipulated block-wise). We decided to use values 50 ms apart as this was the difference between the two ISIs at which Micheyl and Oxenham (
The deviant occurred by random choice in position 5, 6, 7, or 8 of the low-tone sequence (i.e., not in the initial positions 1–4, and not in the final position 9). The subjects were asked to press a button during or after each sequence to indicate whether or not they heard a louder deviant tone in the lower sound sequence. In order to make efficient use of time, trials were triggered by the subject's button press, rather than after a set amount of time. The Δf decreased by 1 ST if the subject answered correctly, or increased by 3 ST if the answer was incorrect. As it should be more difficult to identify infrequent within-stream deviants if all tones are perceived as belonging to one stream (Winkler et al.,
Objective threshold measurement, rhythm task (easier in one-stream percept), pre-determined ISI levels, variable Δf. Subjects were asked to listen to sequences made up of the same pattern of pure tones as in the previous task (low-high-high). Sequences for this task were only 12 tones long (4 ABB patterns) as this was found sufficient for rhythm discrimination in a pilot study. Intensity variations from task 1, whilst not relevant to this task, were preserved to make the stimuli more comparable. In this case though, the subjects needed to identify whether the entire sequence of high and low tones had an overall regular ISI pattern. The subject was presented with a sequence, which could either be a regular sequence, in which the tones were all separated by an ISI of equal length, or an irregular sequence, in which the ISI between the low A tone and the first higher B tone was reduced by 80%, whilst the ISI between the 2nd B tone and the next A tone was prolonged accordingly, in order to maintain the same overall pace as in the regular sequences (see Figure
The probability of an irregular sequence being presented was 50%. The subjects were asked to press a button during or after each sequence to indicate whether they had heard a regular or an irregular sequence. The sequences were presented over 4 blocks, with an ISI of 50, 100, 150, or 200 ms. The starting Δf in each block was 2 ST. Results were obtained using a 1-up 3-down staircase procedure, in which Δf increased by 1 ST if the subject answered correctly or decreased by 3 ST if they answered incorrectly. As it should be more difficult to perceive the relationship between the low and high tones when they are segregated, subjects should only be able to reliably determine whether the sequence had a regular or irregular rhythm when the sequence can still be perceived as one integrated stream (Bregman and Campbell,
Objective threshold measurement, rhythm task (easier in one-stream percept), pre-determined Δf levels, variable ISI. Subjects performed the rhythm task as described in task 2. However, this time the ISI was variable and the Δf was fixed. Four blocks were presented with a Δf of 10, 17, 24, and 31 ST, respectively. The starting ISI in each block was set at 200 ms. Subjects were asked to press a button during or after each sequence to indicate whether they had heard a regular or an irregular sequence. Results were obtained using a 1-up 3-down staircase procedure, in which the ISI decreased by 10 ms if the subject answered correctly or increased by 30 ms if they answered incorrectly. The 10 ms step size was chosen as this was the smallest unit used by Helenius et al. (
The first 5 subjects in the study also performed an objective ISI threshold measurement at 4 fixed Δf values for the intensity task to get an ISI threshold for the segregated percept, but these measurements were not continued, as all subjects performed at ceiling and were able to correctly identify the intensity deviant in close to 100% of responses. Even though longer ISIs are known to impede segregation, performance actually improved as the ISI became longer. This was probably because as the ISI became very long, the task became easy to solve cognitively without having to rely on being able to segregate the streams (i.e., by focusing on the intensity of every third tone; see Dowling et al.,
Subjective threshold adjustment, pre-determined ISI levels, variable Δf. Subjects were asked to listen to a continuous sound sequence made up of the same pattern of tones as in the previous tasks. By means of the two response buttons, they were able to adjust the Δf between the tones themselves until they felt they could consistently only hear a two-stream percept. Note that this instruction implies that an integration threshold (i.e., the point at which integration is no longer possible) is measured, rather than a segregation threshold. Subjects completed one self-adjustment task for each of the fixed ISI lengths (50, 100, 150, 200 ms).
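Because all tasks share the same ABB tone pattern, it may help to see how such a sequence could be laid out in time. The following is a minimal sketch, not the authors' stimulus code. The function names, the reading of ISI as the silent gap between the offset of one tone and the onset of the next, and the equal-tempered semitone conversion are assumptions made for illustration. The 50 ms tone duration, the 250 Hz A tone, and the 80% shortening of the A-to-B gap in irregular sequences are taken from the task descriptions.

```python
def semitones_to_hz(base_hz, delta_st):
    """Frequency lying delta_st semitones above base_hz (equal temperament)."""
    return base_hz * 2.0 ** (delta_st / 12.0)


def abb_sequence(n_tones, isi_ms, delta_f_st, irregular=False,
                 tone_ms=50.0, a_hz=250.0, shift=0.8):
    """Return a list of (onset_ms, frequency_hz, label), one entry per tone.

    Assumption: ISI is the silent gap from tone offset to the next onset.
    In an irregular sequence the A -> first-B gap is shortened by `shift`
    (80%) and the second-B -> next-A gap is lengthened by the same amount,
    so every ABB triplet keeps the same total duration.
    """
    b_hz = semitones_to_hz(a_hz, delta_f_st)
    tones, onset = [], 0.0
    for i in range(n_tones):
        pos = i % 3  # 0 = A, 1 = first B, 2 = second B
        label = "A" if pos == 0 else "B"
        tones.append((onset, a_hz if pos == 0 else b_hz, label))
        gap = isi_ms
        if irregular and pos == 0:
            gap = isi_ms * (1.0 - shift)   # shortened gap after A
        elif irregular and pos == 2:
            gap = isi_ms * (1.0 + shift)   # compensating gap before next A
        onset += tone_ms + gap
    return tones


# 12-tone rhythm-task sequences at ISI = 100 ms and delta-f = 10 ST
regular = abb_sequence(12, 100.0, 10.0)
irregular = abb_sequence(12, 100.0, 10.0, irregular=True)
```

Under these assumptions the A tones fall at the same times in regular and irregular sequences, and only the B tones move relative to them, which is why the judgment depends on relating the two frequency ranges to each other.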
Subjective threshold adjustment, pre-determined Δf, variable ISI. Subjects were asked to listen to a continuous sound sequence made up of the same pattern of tones as in the previous tasks. By means of the two response buttons, they were able to adjust the ISI between the tones themselves until they felt they could consistently only hear a two-stream percept. Again, it should be noted that this instruction implies that the integration rather than the segregation threshold is measured. Subjects completed one self-adjustment task for each of the fixed Δf values (10, 17, 24, and 31 ST).
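The weighted up-down rule used in the objective tasks (tasks 1 to 3) can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the procedure as implemented in the study. The function names, the clamping bounds, the number of trials, and the logistic `toy_observer` are hypothetical, and no stopping rule or threshold estimator is modeled because neither is specified in the text above. The step sizes (1 ST after a correct response, 3 ST after an incorrect one, or 10 ms and 30 ms for ISI) and the 2 ST starting value of task 2 come from the task descriptions. With a 1:3 step ratio the track has zero expected drift where p × 1 = (1 − p) × 3, i.e., at 75% correct.

```python
import math
import random


def convergence_point(step_correct, step_incorrect):
    """Proportion correct with zero expected drift:
    p * step_correct = (1 - p) * step_incorrect."""
    return step_incorrect / (step_correct + step_incorrect)


def next_level(level, correct, harder_sign, step_correct=1.0,
               step_incorrect=3.0, lo=0.0, hi=float("inf")):
    """One weighted up-down update of the tracked variable.

    harder_sign = -1 if smaller values are harder (task 1 delta-f, task 3 ISI),
    harder_sign = +1 if larger values are harder (task 2 delta-f).
    The clamping bounds lo/hi are an assumption of this sketch.
    """
    if correct:
        level += harder_sign * step_correct
    else:
        level -= harder_sign * step_incorrect
    return min(max(level, lo), hi)


def toy_observer(delta_f, threshold=12.0, slope=0.5):
    """Hypothetical listener for the rhythm task: near-perfect at small
    delta-f, at chance (0.5) at large delta-f, 0.75 at `threshold`."""
    return 0.5 + 0.5 / (1.0 + math.exp(slope * (delta_f - threshold)))


def simulate(start, harder_sign, p_correct, n_trials=60, seed=0, **kwargs):
    """Run one staircase track and return the level before each trial."""
    rng = random.Random(seed)
    level, track = start, [start]
    for _ in range(n_trials):
        correct = rng.random() < p_correct(level)
        level = next_level(level, correct, harder_sign, **kwargs)
        track.append(level)
    return track


# Task 2 style track: start at 2 ST, delta-f grows after correct responses
track = simulate(2.0, +1, toy_observer)
```

For a task 3 style track the same update would be called with `harder_sign=-1`, `step_correct=10.0`, and `step_incorrect=30.0`, starting from 200 ms.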
We would expect subjects to show an increase in the Δf required for them to perceive the tones as two segregated streams as the ISI increases (task 1). We would also expect an increase in the Δf up to which subjects are able to hold an integrated percept as the ISI increases (task 2), and an increase in the ISI up to which subjects are able to hold an integrated percept as the Δf increases (task 3). Further, we would expect there to be a difference between the thresholds measured for integration and those measured for segregation, as shown by Van Noorden (
Kolmogorov–Smirnov tests show that the data in tasks 1 (100, 150, and 200 ms conditions), 2 (50 and 100 ms conditions), 3 (10 ST condition), and 4 (50 ms condition) have a non-Gaussian distribution with a positive skew (all
For the intensity task (task 1), which should require within-stream deviance detection supported by a segregated percept, an increase in the Δf thresholds with increasing ISI was expected, as subjects should find it more difficult to maintain a segregated percept at lower Δf and slower ISI. This would imply that a greater Δf would be necessary to hear the tones as two separate streams. However, contrary to this hypothesis, no significant difference between the Δf thresholds measured at the different ISI values was found [
For the two rhythm tasks (tasks 2 and 3), which should require an across-stream temporal judgment supported by an integrated one-stream percept, an increase in the Δf thresholds with increasing ISI was expected, as subjects should find it easier to maintain an integrated percept at lower Δf and slower ISI. The ANOVAs show that the Δf threshold was affected significantly by ISI length (task 2) [
The results from the rhythm task with variable Δf (task 2) were checked against the rhythm task with variable ISI (task 3). For both tasks, task performance was expected to improve with longer ISIs or higher Δf values. As they are both measured with the same task within the same parameter space and participants, the resulting thresholds should also fall on the same curve when plotted. In view of the linear trend found for both tasks, this curve would best be approximated by assuming the same linear relation between Δf and ISI for both tasks. This was tested by calculating the correlation between Δf and ISI for each subject using the data from both tasks (8 pairs of values per subject). The resulting 19 correlation coefficients were
For the self-adjustment tasks, the ANOVA showed that the Δf thresholds were significantly affected by ISI length (task 4) [
The self-adjustments in task 4 were always done from an integrated one-stream percept, and subjects were asked to adjust the Δf until they could no longer hear the one-stream percept. This makes the results subjective measures of integration thresholds. To validate the objective measurement as a good predictor of the subjective perception of integration thresholds, the results of the subjective threshold measurement with variable Δf (task 4) can be compared with the objective results of the equivalent rhythm task (task 2).
The correlation between Δf and ISI was calculated for each subject using the data from both tasks (8 pairs of values per subject). The same method was used as above where task 2 and 3 were compared. The mean
The results of the subjective threshold measurement with variable ISI (task 5) and the objective results of the equivalent rhythm task (task 3) were also compared using the same method. The mean
If we assume that the rhythm task (tasks 2 and 3) can only be solved if subjects maintain an integrated percept, then, as expected, the results show that subjects were able to maintain a one-stream percept at higher Δf values when ISI lengths were longer. They were also able to maintain a one-stream percept at shorter ISI lengths when the Δf between the high and low tones was smaller. There was a significant correlation between the ISI length and the Δf up to which subjects were able to successfully solve the task (task 2), and these objective threshold measurements also correlated with the subjective thresholds reported by the subjects (task 4). Further, the thresholds measured in the rhythm task with variable Δf values (task 2) and those measured in the rhythm task with variable ISI values (task 3) were consistent, lending strong support to the assumption that the rhythm task threshold measurements do accurately reflect the perceptual integration thresholds of the subjects.
The results from the subjective variable ISI condition (task 5), however, do not fit the same pattern as the other integration threshold measurements. No significant variation could be found between the ISI thresholds for the four fixed Δf values that were measured. As all the other results from the integration threshold measurements show a similar pattern to each other, it would seem unlikely that the thresholds in task 5 are truly invariant. It seems that, for subjective perceptual judgments, varying the ISI at a constant Δf value is not symmetric to varying the Δf at a constant ISI value. The reasons for this asymmetry remain to be explored. It is possible that the frequency separation change was more salient to the subjects than the ISI change, allowing subjects to better judge where their integration threshold lies. If it is more difficult for some subjects to subjectively determine at what point integration is no longer possible when ISI is varied and Δf is fixed, this may lead them to hold on to either a very stable or a highest possible threshold of integration, thereby obscuring differences between the different Δf levels. As long as the underlying reasons for these difficulties are unclear, we would suggest that subjective self-adjustment procedures for measuring stream segregation and integration thresholds in terms of ISI should be applied and interpreted with caution.
Against the initial expectations, there was no significant difference in performance across the different ISI levels in the objective thresholds measured through the intensity task. The question therefore arises as to why the Δf thresholds do not show the expected increase with ISI in the current measurements in this task.
It could be suggested that there was a ceiling effect in this particular task, in the sense that streaming was still “too easy” at the ISIs measured in the current study. Although early reports would support this, suggesting that the employed ISI range of 50–200 ms may not lead to a large increase in Δf thresholds required for streaming (Van Noorden,
Another possible explanation might be found in the suggestion that the segregated percept requires a certain build-up time. The build-up of streaming is usually associated with studies where sound streams were presented over 10 s or more (Carlyon et al.,
To see whether the current results had been affected by deviants being missed in faster sequences, as might be expected if build-up took longer than the time between the onset of the sound stream and the first deviant, a two-factorial repeated measures ANOVA was performed examining the effect of the deviant position within the tone sequence (4 levels: position 5/6/7/8) and of the ISI length (4 levels: 50/100/150/200 ms) on the percentage of correct responses for sequences containing a deviant. However, no significant effect of deviant position was found [
A third explanation might be that subjects were able to solve the task even when stream segregation was no longer possible. By employing some other strategy, such as tuning into the regular pace of the stimulus configuration and specifically attending every third (task-relevant) tone while ignoring the two intermediate tones (Dowling et al.,
A further explanation might be that, as each sequence was triggered by the subject's response to the previous sequence, consecutive sequences may have followed each other so closely that subjects were able to transfer their previous percept to the next sequence. Stream segregation has been shown to be cumulative (Bregman,
This would also be supported by studies of facilitation or priming effects on auditory stream perception: Some studies have revealed that the auditory system tends to retain a previously dominant percept in spite of parameter changes (Sussman and Steinschneider,
In order to reduce the likelihood of any possible ceiling effect, to counteract facilitation effects of perceptual organization in the previous sequence, and to make it more difficult to apply alternative strategies based on temporal attention, a follow-up experiment focussing on the intensity task (task 1) was conducted.
Ten healthy adult subjects participated in the second experiment (4 males, 6 females, mean age 22.3,
This second experiment was set up to test the alternative explanations for the lack of change in the Δf threshold with increasing ISI during the intensity task in Experiment 1. All conditions employed an objective threshold measurement by means of the intensity task described in task 1 of Experiment 1, with pre-determined ISI levels and variable Δf. The subjects were asked to press one of two buttons on the response keypad with their left and right thumbs to indicate whether or not they heard a louder deviant tone in the lower sound sequence. This could be done either during or after each sequence. However, two manipulations were made to the stimulus setup from Experiment 1, a timing jitter and a base frequency jitter. The four resulting conditions for the study were therefore as follows:
Control. Stimuli and task were identical to those from task 1 of Experiment 1.
Jittered timing. Stimuli were the same as in the control condition, with the exception that the ISI was not kept constant between tones. Instead, the onset of each individual tone was independently jittered by a random amount that ranged equiprobably from −40 to +40% of the nominal ISI value.
Jittered frequency. Stimuli were the same as in the control condition, with the exception that a different frequency was pseudo-randomly chosen for the lower sequence for each trial. The maximum jitter was set to 300 Hz ± 5 ST. The difference between the lower tone sequence frequencies of two consecutive trials was at least 1 ST.
Jittered timing and frequency. Stimuli were the same as in the control condition, with the exception that both the frequency of the lower tone sequence was pseudo-randomly chosen for each sequence as in the jittered frequency condition (condition 3), and the ISI was randomly jittered as in the jittered timing condition (condition 2).
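The two jitter manipulations can be sketched as follows. This is an illustrative reading of the condition descriptions, not the authors' implementation. The function names are hypothetical, as are the treatment of ISI as an offset-to-onset gap, the uniform draw of the base frequency on a semitone scale, and the redraw-until-valid approach used to enforce the 1 ST minimum change between consecutive trials.

```python
import math
import random


def jittered_onsets(n_tones, isi_ms, tone_ms=50.0, jitter=0.4, rng=None):
    """Nominal tone onsets, each shifted independently by a uniform random
    amount between -jitter and +jitter times the nominal ISI.

    Assumption: ISI is the offset-to-onset gap, so the nominal onset-to-onset
    period is tone_ms + isi_ms.
    """
    rng = rng or random.Random()
    period = tone_ms + isi_ms
    return [i * period + rng.uniform(-jitter, jitter) * isi_ms
            for i in range(n_tones)]


def next_base_frequency(prev_hz, rng, center_hz=300.0, max_st=5.0,
                        min_change_st=1.0):
    """Draw the lower-tone frequency for the next trial.

    The value lies within center_hz +/- max_st semitones and differs from the
    previous trial's value by at least min_change_st semitones. Drawing
    uniformly in semitones and redrawing on failure are assumptions.
    """
    while True:
        st = rng.uniform(-max_st, max_st)
        hz = center_hz * 2.0 ** (st / 12.0)
        if prev_hz is None:
            return hz
        if abs(12.0 * math.log2(hz / prev_hz)) >= min_change_st:
            return hz


rng = random.Random(1)
onsets = jittered_onsets(25, 100.0, rng=rng)   # one 25-tone sequence
base_hz = next_base_frequency(None, rng)       # first trial
base_hz = next_base_frequency(base_hz, rng)    # next trial, >= 1 ST away
```

Under these assumptions, a ±40% jitter shortens the gap between neighbouring tones by at most 80% of the nominal ISI, so tones never overlap in time.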
Firstly, by jittering the timing of stimulus presentation within each sequence, subjects should be prevented from being able to predict through the regularity of the sequence when the lower tones were being presented and just focussing on these. This manipulation has been shown to impair task performance in similar protocols (Andreou et al.,
Secondly, by changing the frequency of the lower tones with every sequence, it should be more difficult for subjects to retain the previously experienced segregated percept by just “holding on” to the lower stream. This manipulation should make it more likely that participants will segregate the streams anew with each presented sequence.
Both these manipulations should make it more difficult to apply alternative cognitive strategies to solve the task. As the cognitive strategies become less reliable and the likelihood of a performance limit having been reached (i.e., a floor effect) is reduced, we would expect performance to become more influenced by stream segregation. If this is the case, it should be possible to see an influence of ISI length on the Δf threshold, with longer ISI tone sequences requiring a higher Δf to segregate the streams than shorter ISI tone sequences.
If, in spite of the manipulations to the stimuli, no difference to the thresholds can be found, then it would appear more likely that other factors are influencing the performance in the intensity deviant detection task apart from those actively being manipulated, and that we would have to look at aspects beyond those that usually influence stream segregation to determine why the thresholds remained so stable in the intensity task.
Kolmogorov–Smirnov tests showed that the data in 2 ISI levels of condition 2 (150 and 200 ms conditions) had a non-Gaussian distribution with a positive skew (
In order to be able to examine the effect of jittering the timing and the base frequency as well as the effect of ISI length, a 3-factorial ANOVA with the factors ISI (4 levels: 50/100/150/200 ms), timing jitter (2 levels: absent/present), and base frequency jitter (2 levels: absent/present) was performed. The results show there was a main effect of the base frequency being jittered [
Experiment 2 employed two manipulations to increase difficulty of the objective within-stream intensity deviance detection task (employed to measure segregation thresholds). This was done to investigate whether the Δf thresholds in the manipulated conditions would then vary depending on the ISI of the tone sequence. Although one of the manipulations (base frequency jitter) was successful, task performance was still not influenced by ISI length. This leads back to the question of whether the lack of an increase in thresholds is really just a ceiling effect, or whether the task is dependent on other mechanisms as well.
One possible explanation for the lack of variation in the intensity task thresholds for different ISI values was that subjects might be capable of listening out for every third tone. This might allow them to continue solving the task when they can no longer segregate the streams (Dowling et al.,
Another possible explanation for the lack of variation in the intensity task thresholds for different ISI values would be that there was a transfer effect across trials. If the task was facilitated by the subjects being able to “hold on” to the lower tone from the previous trial, they might be able to solve the task at lower thresholds than would otherwise be expected. As transfer effects have been shown to be drastically reduced by noticeable parameter changes (Anstis and Saida,
Indeed, the base frequency jitter condition, where the lower tone sequence frequency was changed after each trial, did significantly increase the thresholds when compared to the control condition, indicating that the difficulty of the task was increased by the jitter. This would suggest that there was a carry-over effect in the original intensity task of Experiment 1, which should at least be reduced in the jitter condition. However, in spite of the increase in thresholds, there was still no significant effect of ISI on the measured Δf thresholds within this condition. This would suggest that, even with the carry-over effect reduced, the adapted base frequency jitter condition still does not allow the effective use of the intensity task for determining stream segregation thresholds using the staircase procedure. It therefore seems likely that performance in this task is not influenced solely by the factors actively being manipulated (ISI and Δf as determinants of stream segregation). Rather, performance appears to be affected additionally, or even exclusively, by other factors, such as cognitive strategies for solving the task.
We would therefore suggest that if the intensity task is used to examine stream segregation in a staircase procedure, the base frequency should be altered after each step to avoid such carry-over effects. However, we would also question whether the intensity task is actually suitable for use in such a procedure.
The current study sought to examine auditory stream integration and segregation thresholds as measured using staircase versions of two tasks, a within-stream deviance detection task (facilitated by segregation) and an across-stream temporal judgment task (facilitated by integration). Combining these objective measures with subjective streaming judgments, using highly similar stimuli within one study, with the same subjects performing all tasks, allowed us to draw a direct comparison between the performances in the different tasks, as well as between task performance and subjective perception. This made it possible to assess how effective a predictor the objective task was of the subjective perception. Further, by examining both integration and segregation thresholds, a “map” can be drawn of the parameter space tested, describing whether it is possible to hear an integrated or a segregated percept at a particular point, or whether both percepts are possible. This was done efficiently, though some questions remain about the validity of the objective segregation task (based on intensity deviance detection) when combined with the staircase procedure.
A key benefit of the procedure described here is that by mapping ISI thresholds at fixed Δf levels and vice versa, a larger parameter space can be covered in a relatively short space of time, without making prior assumptions about the expected thresholds. For the study detailed in Experiment 1, each subject completed 12 objective tasks and 8 subjective threshold assessments, each taking between 1 and 5 min to run. This meant that, including breaks, the entire session lasted about 60–80 min. This is a feasible length of time for within-subject comparisons. The tasks were chosen with a view to making them transferable to special populations, such as children, cochlear-implant users, and patients with auditory disorders, who may not cope well with longer or multiple sessions. Note that the enhanced efficiency would also be beneficial in other contexts: for instance, prior to neurophysiological measurements of streaming correlates, the relevant subset of the current staircase measurements can be selected for quickly determining optimal combinations of Δf and ISI.
There is, to our knowledge, only one study by Micheyl and Oxenham (
The current study employed an intensity deviance detection task to assess the segregation thresholds. It was not possible to demonstrate an effect of both Δf and ISI on these objective segregation threshold measurements, even though Micheyl and Oxenham found such effects in the 7 subjects who completed all conditions in their study. The most likely reason for this discrepancy lies in the nature of the objective task. Note that Micheyl and Oxenham's within-stream timing task is more similar to the rhythm task of the current study, which proved reliable in the integration threshold measurements. It remains to be investigated whether this within-stream timing task would yield consistent results when combined with a staircase procedure for directly assessing streaming thresholds. The current within-stream deviance detection task proved less suitable for the procedure used here, even though in Experiment 1 it was employed very similarly to how it has been used in the literature. The task itself is nevertheless well established in electrophysiological studies and, more recently, has also been employed in fMRI studies of auditory streaming (Sussman et al.,
Experiment 2 was set up to attempt to alleviate this problem by employing two modifications to the stimulus protocol. Although one of these modifications (changing the base frequency from trial to trial) was successful in increasing task difficulty, we still found no effect of ISI on the Δf threshold required to solve the task. It appears that task performance is not exclusively influenced by stream segregation but also by other, possibly more cognitive factors. The validity of this task as a pure measure of stream segregation thresholds when combined with a staircase procedure must therefore be questioned. Whether this limitation extends to the continuous version of the within-stream deviance detection task is beyond the scope of the current manuscript. In view of the present data, the circumstances under which this task is solvable by cognitive strategies, rather than by relying on stream segregation, should be investigated carefully.
Based on the present results, the rhythm task for objectively measuring stream integration thresholds lends itself much more readily to combination with a staircase procedure for the direct assessment of streaming thresholds. We have shown here that this protocol yields threshold data with high internal consistency at the group level. We currently see the greatest benefit of the suggested method in quickly narrowing down the relevant parameter space for streaming assessments in groups for which there is little prior knowledge as to where their streaming thresholds are expected to lie. Whether this protocol can be extended to streaming threshold measurements at the level of individual subjects must be carefully evaluated in future studies. These should include investigations of re-test reliability at the individual level, which was beyond the scope of the present study. Given the group-level consistency of the present data for the various versions of the rhythm task (tasks 2, 3, and 4 of Experiment 1), we would suggest that it is sufficient to use only one of the versions for assessing stream integration thresholds, which would lead to a further reduction in measurement time. In exchange, it may be necessary to collect more than one staircase threshold measurement for each pre-set level of one of the parameters in order to obtain reliable data at the individual level. This trade-off between measurement time and reliability at the single-subject level must await further examination (see also Micheyl and Oxenham,
In conclusion, the present study shows that a staircase procedure can be combined successfully with across-stream temporal judgments as a classical tool for measuring stream integration. This combination allows fast coverage of the parameter space, resulting in a quick assessment of streaming thresholds. This can be done without the burden of limiting the range of observations a priori by pre-determining the parameter space for the variables affecting the streaming percept. Future studies should identify an objective task for measuring stream segregation that is equally suited for combination with the staircase procedure.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the Max Planck Society (International Max Planck Research School on Neuroscience of Communication: Function, Structure, and Plasticity). The experiment was realized using Cogent 2000 developed by the Cogent 2000 team at the FIL and the ICN.