How and Why Does Spatial-Hearing Ability Differ among Listeners? What Is the Role of Learning and Multisensory Interactions?

Editorial

16 February 2016

Editorial: How, and Why, Does Spatial-Hearing Ability Differ among Listeners? What is the Role of Learning and Multisensory Interactions?

Guillaume Andéol

and

Brian D. Simpson

3,994 views

5 citations

Editors

Guillaume Andéol

Other

Ewan Andrew Macpherson

Western University

Brian Simpson

Wright-Patterson Air Force Base

Impact

Original Research

17 September 2015

The interaction of vision and audition in two-dimensional space

Martine Godfroy-Cooper

, 2 more and

Robert B. Welch

Using a mouse-driven visual pointer, 10 participants made repeated open-loop egocentric localizations of memorized visual, auditory, and combined visual-auditory targets projected randomly across the two-dimensional (2D) frontal field. The results are reported in terms of variable error, constant error, and local distortion. The results confirmed that auditory and visual maps of the egocentric space differ in their precision (variable error) and accuracy (constant error), both from one another and as a function of eccentricity and direction within a given modality. These differences were used, in turn, to make predictions about the precision and accuracy with which spatially and temporally congruent bimodal visual-auditory targets are localized. Overall, the improvement in precision for bimodal relative to the best unimodal target revealed the presence of optimal integration well predicted by the Maximum Likelihood Estimation (MLE) model. Conversely, the hypothesis that accuracy in localizing the bimodal visual-auditory targets would represent a compromise between auditory and visual performance in favor of the most precise modality was rejected. Instead, the bimodal accuracy was found to be equivalent to or to exceed that of the best unimodal condition. Finally, we described how the different types of errors could be used to identify properties of the internal representations and coordinate transformations within the central nervous system (CNS). The results provide some insight into the structure of the underlying sensorimotor processes employed by the brain and confirm the usefulness of capitalizing on naturally occurring differences between vision and audition to better understand their interaction and their contribution to multimodal perception.
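The MLE prediction mentioned in this abstract can be illustrated with a minimal sketch of the standard reliability-weighted cue-combination rule. This is not the authors' analysis code: the function name `mle_combine` and the example numbers are hypothetical, and the sketch assumes independent Gaussian noise in each modality. Note that the abstract reports the precision prediction held, while the weighted-compromise prediction for accuracy (the combined mean below) was rejected.

```python
import math


def mle_combine(mean_a, sigma_a, mean_v, sigma_v):
    """Combine an auditory and a visual location estimate (hypothetical helper).

    Assumes independent Gaussian noise in each modality. Each cue is
    weighted by its reliability (inverse variance), which is the
    standard MLE cue-combination rule.
    """
    var_a = sigma_a ** 2
    var_v = sigma_v ** 2
    # The weight of a cue grows as the variance of the *other* cue grows.
    w_v = var_a / (var_a + var_v)
    w_a = 1.0 - w_v
    mean_av = w_a * mean_a + w_v * mean_v
    # Predicted bimodal variability is never worse than the best unimodal one.
    sigma_av = math.sqrt(var_a * var_v / (var_a + var_v))
    return mean_av, sigma_av


# Illustrative numbers only (degrees); not data from the study.
mean_av, sigma_av = mle_combine(mean_a=10.0, sigma_a=4.0, mean_v=2.0, sigma_v=2.0)
print(f"predicted bimodal mean = {mean_av:.2f}, sd = {sigma_av:.2f}")
```

With these illustrative inputs the more precise (visual) cue receives most of the weight, and the predicted bimodal standard deviation falls below both unimodal values.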

8,007 views

18 citations

Correction

08 August 2016

Corrigendum: Perceptual factors contribute more than acoustical factors to sound localization abilities with virtual sources

Guillaume Andéol

, 1 more and

Anne I. Guillaume

1,631 views

0 citations

Original Research

07 October 2014

Auditory/visual distance estimation: accuracy and variability

Paul W. Anderson

and

Pavel Zahorik

Past research has shown that auditory distance estimation improves when listeners are given the opportunity to see all possible sound sources when compared to no visual input. It has also been established that distance estimation is more accurate in vision than in audition. The present study investigates the degree to which auditory distance estimation is improved when matched with a congruent visual stimulus. Virtual sound sources based on binaural room impulse response (BRIR) measurements made from distances ranging from approximately 0.3 to 9.8 m in a concert hall were used as auditory stimuli. Visual stimuli were photographs taken from the participant's perspective at each distance in the impulse response measurement setup presented on a large HDTV monitor. Participants were asked to estimate egocentric distance to the sound source in each of three conditions: auditory only (A), visual only (V), and congruent auditory/visual stimuli (A+V). Each condition was presented within its own block. Sixty-two participants were tested in order to quantify the response variability inherent in auditory distance perception. Distance estimates from both the V and A+V conditions were found to be considerably more accurate and less variable than estimates from the A condition.

9,678 views

53 citations

Individual judgment position against target position with individual and non-individual HRTFs (black and gray dots, respectively) at the pre-test in the up/down dimension. Each pair of panels is for a different listener (N = 20).

Original Research

29 January 2015

Perceptual factors contribute more than acoustical factors to sound localization abilities with virtual sources

Guillaume Andéol

, 1 more and

Anne Guillaume

7,693 views

8 citations

Original Research

05 September 2014

From ear to body: the auditory-motor loop in spatial cognition

Isabelle Viaud-Delmon

and

Olivier Warusfel

Spatial memory is mainly studied through the visual sensory modality: navigation tasks in humans rarely integrate dynamic and spatial auditory information. In order to study how a spatial scene can be memorized on the basis of auditory and idiothetic cues only, we constructed an auditory equivalent of the Morris water maze, a task widely used to assess spatial learning and memory in rodents. Participants were equipped with wireless headphones, which delivered a soundscape updated in real time according to their movements in 3D space. A wireless tracking system (video infrared with passive markers) was used to send the coordinates of the subject's head to the sound rendering system. The rendering system used advanced HRTF-based synthesis of directional cues and room acoustic simulation for the auralization of a realistic acoustic environment. Participants were guided blindfolded in an experimental room. Their task was to explore a delimited area in order to find a hidden auditory target, i.e., a sound that was only triggered when walking on a precise location of the area. The position of this target could be coded in relationship to auditory landmarks constantly rendered during the exploration of the area. The task was composed of a practice trial, 6 acquisition trials during which they had to memorize the location of the target, and 4 test trials in which some aspects of the auditory scene were modified. The task ended with a probe trial in which the auditory target was removed. The configuration of the search paths made it possible to observe how auditory information was coded to memorize the position of the target. These paths suggested that space can be efficiently coded without visual information in normally sighted subjects. In conclusion, space representation can be based on sensorimotor and auditory cues only, providing another argument in favor of the hypothesis that the brain has access to a modality-invariant representation of external space.

8,251 views

26 citations

(A) (Color on-line) Mean of all subjects' reported location with 50% confidence ellipse linked to source location for the dominant hand condition. Front/back confusion corrected. Good directional pointing accuracy in the median plane, larger compression of reported distances in front than at the side. (B) Mean of all subjects' reported azimuth as a function of the target azimuth for both hand conditions. Error bars show one standard deviation across the subjects. Gray line shows unity. For the sake of readability, results corresponding to the different hand conditions have been slightly horizontally shifted. This plot shows good pointing accuracy in the frontal hemisphere and lower accuracy at the side. (C) Mean of all subjects' reported distance as a function of target distance with mean of linear regression slope for 1st and 2nd hand across all azimuths. Gray line shows unity. Error bars show one standard deviation across the subjects. Reported distance is linear but compressed.

Original Research

02 September 2014

Reaching nearby sources: comparison between real and virtual sound and visual targets

Gaëtan Parseihian

, 1 more and

Brian F. G. Katz

6,379 views

30 citations

Mean lateral error (A), elevation error (B), and front/back confusion rate (C) as functions of azimuth window width. Dashed lines represent means for individual participants. Solid symbols with solid lines represent means across participants. Open symbols with solid lines represent means predicted by the regression, partialling out the effects of duration and the extent of head rotation in elevation.

Original Research

12 August 2014

Sound localization with head movement: implications for 3-d audio displays

Ken I. McAnally

and

Russell L. Martin

Previous studies have shown that the accuracy of sound localization is improved if listeners are allowed to move their heads during signal presentation. This study describes the function relating localization accuracy to the extent of head movement in azimuth. Sounds that are difficult to localize were presented in the free field from sources at a wide range of azimuths and elevations. Sounds remained active until the participants' heads had rotated through windows with widths of 2, 4, 8, 16, 32, or 64° of azimuth. Error in determining sound-source elevation and the rate of front/back confusion were found to decrease with increases in azimuth window width. Error in determining sound-source lateral angle was not found to vary with azimuth window width. Implications for 3-d audio displays: the utility of a 3-d audio display for imparting spatial information is likely to be improved if operators are able to move their heads during signal presentation. Head movement may compensate in part for a paucity of spectral cues to sound-source location resulting from limitations in either the audio signals presented or the directional filters (i.e., head-related transfer functions) used to generate a display. However, head movements of a moderate size (i.e., through around 32° of azimuth) may be required to ensure that spatial information is conveyed with high accuracy.

12,206 views

55 citations

Psychometric functions and measurements of the minimum audible angle. (A) The mean psychometric functions for all listeners are plotted for all four conditions including: (1) head static/source static (solid diamonds), (2) head moving/source static (open diamonds), (3) head static/source dynamic (solid circles), and (4) head moving/source dynamic (open circles). Error bars show ±1 standard error. (B) The results of logistic fits to individual psychometric functions show that the MMAA was larger for source motion (black bar) than for self motion (white bar), and that both of these were larger than for the source static conditions (gray bars). A single asterisk represents differences in means significant at an alpha of 0.05, and a double asterisk an alpha of 0.01.

Original Research

02 September 2014

The moving minimum audible angle is smaller during self motion than during source motion

W. Owen Brimijoin

and

Michael A. Akeroyd

7,721 views

47 citations

Illustration of a hypothetical process of auditory adaptation through continuous sensory experience. First the input sound is decomposed into auditory space cues, then (1) a correspondence is established between the cues and a point in perceptual space. After a correspondence is established (2) a percept is formed. Perceiving auditory sources in space is most often accompanied by feedback. The feedback is compared to the auditory space percept (3). If no differences are found, then there is further tuning of the original cue combination rule. If the feedback is substantially different from the percept, then a new cue combination rule is created.

Review

25 July 2014

A review on auditory space adaptations to altered head-related cues

Catarina Mendonça

17,288 views

59 citations

(A) Filter functions of the left ear of one subject are plotted for the midline cone of confusion before and (B) after passing through a cochlear filter model. The features in (B) indicate that, despite the frequency filtering and spectral smoothing produced by the cochlea, substantial spectral features are preserved within the auditory nervous system. Filter functions for the left ear of a different subject are plotted for the midline [D–F: Azimuth 0°, cf. red line in (C)] and 40° off the midline [G–I: Azimuth −40°; cf. blue line in (C)], without molds (D,G) and with molds (E,H). The data have been smoothed, as above, using the cochlear filter model. The differences between the bare ear and mold conditions for both lateral angles are plotted in (F,I) (Data from Carlile and Blackman, 2013).

Review

06 August 2014

The plastic ear and perceptual relearning in auditory spatial perception

Simon Carlile

9,785 views

49 citations

Response elevation gain for BB stimuli plotted against the azimuth gain. Data from all control listeners (gray crosses), listeners with SSD with 8 kHz thresholds below 40 dB HL (filled circles) and SSD listeners with 8 kHz thresholds higher than 40 dB HL (open circles) are presented when spectral-shape cues were available (A), and when spectral-shape cues were reduced by molds (B). Error bars denote ± 1 SE of the azimuth and elevation regression coefficients. Data points from the two SSD listeners depicted in Figure 1 (P3 and P12), are indicated in the figure. Data are pooled across presentation levels. Note the two clear outliers in the control group. These two listeners demonstrated bilateral high-frequency hearing loss (8 kHz thresholds higher than 40 dB HL).

Original Research

04 July 2014

Single-sided deafness and directional hearing: contribution of spectral cues and high-frequency hearing loss in the hearing ear

Martijn J. H. Agterberg

, 3 more and

Ad F. M. Snik

Direction-specific interactions of sound waves with the head, torso, and pinna provide unique spectral-shape cues that are used for the localization of sounds in the vertical plane, whereas horizontal sound localization is based primarily on the processing of binaural acoustic differences in arrival time (interaural time differences, or ITDs) and sound level (interaural level differences, or ILDs). Because the binaural sound-localization cues are absent in listeners with total single-sided deafness (SSD), their ability to localize sound is heavily impaired. However, some studies have reported that SSD listeners are able, to some extent, to localize sound sources in azimuth, although the underlying mechanisms used for localization are unclear. To investigate whether SSD listeners rely on monaural pinna-induced spectral-shape cues of their hearing ear for directional hearing, we investigated localization performance for low-pass filtered (LP, <1.5 kHz), high-pass filtered (HP, >3 kHz), and broadband (BB, 0.5–20 kHz) noises in the two-dimensional frontal hemifield. We tested whether localization performance of SSD listeners further deteriorated when the pinna cavities of their hearing ear were filled with a mold that disrupted their spectral-shape cues. To remove the potential use of perceived sound level as an invalid azimuth cue, we randomly varied stimulus presentation levels over a broad range (45–65 dB SPL). Several listeners with SSD could localize HP and BB sound sources in the horizontal plane, but inter-subject variability was considerable. Localization performance of these listeners deteriorated strongly when their spectral pinna cues were diminished. We further show that the inter-subject variability among SSD listeners can be explained to a large extent by the severity of high-frequency hearing loss in their hearing ear.

15,327 views

56 citations

Audiometric data for younger (left panel) and older (right panel) participants. See text for details.

Original Research

25 June 2014

Relating age and hearing loss to monaural, bilateral, and binaural temporal sensitivity

Frederick J. Gallun

, 4 more and

Dawn L. Konrad-Martin

Older listeners are more likely than younger listeners to have difficulties in making temporal discriminations among auditory stimuli presented to one or both ears. In addition, the performance of older listeners is often observed to be more variable than that of younger listeners. The aim of this work was to relate age and hearing loss to temporal processing ability in a group of younger and older listeners with a range of hearing thresholds. Seventy-eight listeners were tested on a set of three temporal discrimination tasks (monaural gap discrimination, bilateral gap discrimination, and binaural discrimination of interaural differences in time). To examine the role of temporal fine structure in these tasks, four types of brief stimuli were used: tone bursts, broad-frequency chirps with rising or falling frequency contours, and random-phase noise bursts. Between-subject group analyses conducted separately for each task revealed substantial increases in temporal thresholds for the older listeners across all three tasks, regardless of stimulus type, as well as significant correlations among the performance of individual listeners across most combinations of tasks and stimuli. Differences in performance were associated with the stimuli in the monaural and binaural tasks, but not the bilateral task. Temporal fine structure differences among the stimuli had the greatest impact on monaural thresholds. Threshold estimate values across all tasks and stimuli did not show any greater variability for the older listeners as compared to the younger listeners. A linear mixed model applied to the data suggested that age and hearing loss are independent factors responsible for temporal processing ability, thus supporting the increasingly accepted hypothesis that temporal processing can be impaired for older compared to younger listeners with similar hearing and/or amounts of hearing loss.

7,011 views

57 citations

The Interaural Spectral Difference (ISD) for head orientations of 45 (black, dashed line) and 135° (gray, solid line). The different panels correspond to different HPDs, or to the no-HPD condition (upper-left panel).

Original Research

11 June 2014

Impact of hearing protection devices on sound localization performance

Véronique Zimpfer

and

David Sarafian

5,060 views

19 citations

Pupil response in the four speech reception threshold conditions as function of time relative to the onset of the target speech (time 0 s). The pupil dilation is calculated relative to the baseline pupil size in the interval between 3 s and 2 s prior to the onset of the target speech. Twenty-four participants were tested.

Original Research

29 April 2014

Cognitive processing load during listening is reduced more by decreasing voice similarity than by increasing spatial separation between target and masker speech

Adriana A. Zekveld

, 3 more and

Jerker Rönnberg

We investigated changes in speech recognition and cognitive processing load due to the masking release attributable to decreasing similarity between target and masker speech. This was achieved by using masker voices with either the same (female) gender as the target speech or different gender (male) and/or by spatially separating the target and masker speech using HRTFs. We assessed the relation between the signal-to-noise ratio required for 50% sentence intelligibility, the pupil response and cognitive abilities. We hypothesized that the pupil response, a measure of cognitive processing load, would be larger for co-located maskers and for same-gender compared to different-gender maskers. We further expected that better cognitive abilities would be associated with better speech perception and larger pupil responses as the allocation of larger capacity may result in more intense mental processing. In line with previous studies, the performance benefit from different-gender compared to same-gender maskers was larger for co-located masker signals. The performance benefit of spatially-separated maskers was larger for same-gender maskers. The pupil response was larger for same-gender than for different-gender maskers, but was not reduced by spatial separation. We observed associations between better perception performance and better working memory, better information updating, and better executive abilities when applying no corrections for multiple comparisons. The pupil response was not associated with cognitive abilities. Thus, although both gender and location differences between target and masker facilitate speech perception, only gender differences lower cognitive processing load. Presenting a more dissimilar masker may facilitate target-masker separation at a later (cognitive) processing stage than increasing the spatial separation between the target and masker. The pupil response provides information about speech perception that complements intelligibility data.

6,880 views

67 citations

Computational simulation of inner hair cell receptor potentials. Simulated IHC responses to broadband noises with a flat spectrum and with a 2-kHz wide, 15-dB spectral notch centered at 7 kHz. The noise duration was longer than that used in the psychoacoustical experiments (0.5 vs. 0.2 s) to obtain “smoother” responses. (A) IHC excitation pattern representation of the flat-spectrum (red) and notch noises (blue). Each curve illustrates the average (rms) receptor potential of each IHC as a function of the cell's CF, for a different stimulus level, from 40 to 100 dB SPL, as indicated by the numbers next to each trace. (B) Difference excitation patterns (in dB) normalized to the maximum value across CFs and intensities. The numbers next to each trace indicate stimulus intensity in dB SPL. (C) Spectra of the IHC receptor potential representation of the two noises for the same stimulus levels as in (A). Each curve depicts the frequency-wise summed spectra of individual IHC receptor potential spectra (see main text). (D) Difference receptor potential FFT (in dB) normalized to the maximum value across frequencies and intensities. In (B,D), the curves have been arbitrarily displaced vertically for convenience. Vertical dotted lines in (B,D) indicate the notch frequency band. The middle panels illustrate zoomed views of panels (A,C) over the frequency range of the spectral notch.

Original Research

27 May 2014

Perception and coding of high-frequency spectral notches: potential implications for sound localization

Ana Alves-Pinto

, 1 more and

Enrique A. Lopez-Poveda

7,117 views

9 citations

Original Research

23 April 2014

Acoustic and non-acoustic factors in modeling listener-specific performance of sagittal-plane sound localization

Piotr Majdak

, 1 more and

Bernhard Laback

The ability of sound-source localization in sagittal planes (along the top-down and front-back dimension) varies considerably across listeners. The directional acoustic spectral features, described by head-related transfer functions (HRTFs), also vary considerably across listeners, a consequence of the listener-specific shape of the ears. It is not clear whether the differences in localization ability result from differences in the encoding of directional information provided by the HRTFs, i.e., an acoustic factor, or from differences in auditory processing of those cues (e.g., spectral-shape sensitivity), i.e., non-acoustic factors. We addressed this issue by analyzing the listener-specific localization ability in terms of localization performance. Directional responses to spatially distributed broadband stimuli from 18 listeners were used. A model of sagittal-plane localization was fit individually for each listener by considering the actual localization performance, the listener-specific HRTFs representing the acoustic factor, and an uncertainty parameter representing the non-acoustic factors. The model was configured to simulate the condition of complete calibration of the listener to the tested HRTFs. Listener-specifically calibrated model predictions yielded correlations of, on average, 0.93 with the actual localization performance. Then, the model parameters representing the acoustic and non-acoustic factors were systematically permuted across the listener group. While the permutation of HRTFs affected the localization performance, the permutation of listener-specific uncertainty had a substantially larger impact. Our findings suggest that across-listener variability in sagittal-plane localization ability is only marginally determined by the acoustic factor, i.e., the quality of directional cues found in typical human HRTFs. Rather, the non-acoustic factors, supposed to represent the listeners' efficiency in processing directional cues, appear to be important.

5,423 views

37 citations

Tones in the right ear (red) and left ear (blue and dashed) as functions of time and with particular interaural phase differences (IPD) as indicated on the vertical axis to illustrate different regions of IPD. The boundaries between regions, separated by 90°, are logically and perceptually important in sound localization.

Original Research

28 February 2014

Anatomical limits on interaural time differences: an ecological perspective

William M. Hartmann

and

Eric J. Macaulay

Human listeners, and other animals too, use interaural time differences (ITD) to localize sounds. If the sounds are pure tones, a simple frequency factor relates the ITD to the interaural phase difference (IPD), for which there are known iso-IPD boundaries, 90°, 180°… defining regions of spatial perception. In this article, iso-IPD boundaries for humans are translated into azimuths using a spherical head model (SHM), and the calculations are checked by free-field measurements. The translated boundaries provide quantitative tests of an ecological interpretation for the dramatic onset of ITD insensitivity at high frequencies. According to this interpretation, the insensitivity serves as a defense against misinformation and can be attributed to limits on binaural processing in the brainstem. Calculations show that the ecological explanation passes the tests only if the binaural brainstem properties evolved or developed consistent with heads that are 50% smaller than current adult heads. Measurements on more realistic head shapes relax that requirement only slightly. The problem posed by the discrepancy between the current head size and a smaller, ideal head size was apparently solved by the evolution or development of central processes that discount large IPDs in favor of interaural level differences. The latter become more important with increasing head size.
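The "simple frequency factor" relating ITD to IPD for a pure tone, and the translation of an iso-IPD boundary into an azimuth, can be sketched as follows. This is only an illustration under stated assumptions, not the article's calculation: it substitutes the Woodworth rigid-sphere approximation for whatever spherical head model the authors used, and the head radius (0.0875 m), speed of sound (343 m/s), and all function names are hypothetical choices made here.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed head radius, not taken from the article
SPEED_OF_SOUND = 343.0    # assumed speed of sound in m/s


def ipd_degrees(freq_hz, itd_s):
    """Interaural phase difference of a pure tone: IPD = 360 * f * ITD."""
    return 360.0 * freq_hz * itd_s


def woodworth_itd(azimuth_deg, radius_m=HEAD_RADIUS_M, c=SPEED_OF_SOUND):
    """ITD in seconds from the Woodworth approximation (0 to 90 degrees).

    ITD = (a / c) * (theta + sin(theta)); a stand-in for the article's model.
    """
    theta = math.radians(azimuth_deg)
    return (radius_m / c) * (theta + math.sin(theta))


def azimuth_for_ipd(ipd_deg, freq_hz):
    """Azimuth (degrees) at which a tone reaches the given IPD.

    Returns None if the boundary is never reached within 0 to 90 degrees.
    """
    target_itd = ipd_deg / (360.0 * freq_hz)
    if target_itd > woodworth_itd(90.0):
        return None
    lo, hi = 0.0, 90.0
    for _ in range(60):  # bisection; the ITD grows monotonically with azimuth
        mid = 0.5 * (lo + hi)
        if woodworth_itd(mid) < target_itd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for f in (500.0, 750.0, 1000.0):
    print(f, azimuth_for_ipd(90.0, f), azimuth_for_ipd(180.0, f))
```

Under these assumptions, higher tone frequencies reach the 90° and 180° boundaries at smaller azimuths, which is the kind of frequency dependence the abstract's ecological argument relies on.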

7,873 views

19 citations

Original Research

05 February 2014

Sensitivity to temporal fine structure and hearing-aid outcomes in older adults

Elvira Perez

, 1 more and

Barrie A. Edmonds

4,846 views

19 citations

Original Research

13 February 2014