Correct tonotopic representation is necessary for complex pitch perception

Andrew J. Oxenham*, Joshua G. W. Bernstein, and Hector Penagos

Speech and Hearing Bioscience and Technology Program, Harvard–MIT Division of Health Sciences and Technology, and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139

Edited by Masakazu Konishi, California Institute of Technology, Pasadena, CA, and approved November 25, 2003 (received for review October 27, 2003)

Abstract

The ability to extract a pitch from complex harmonic sounds, such as human speech, animal vocalizations, and musical instruments, is a fundamental attribute of hearing. Some theories of pitch rely on the frequency-to-place mapping, or tonotopy, in the inner ear (cochlea), but most current models are based solely on the relative timing of spikes in the auditory nerve. So far, it has proved to be difficult to distinguish between these two possible representations, primarily because temporal and place information usually covary in the cochlea. In this study, “transposed stimuli” were used to dissociate temporal from place information. By presenting the temporal information of low-frequency sinusoids to locations in the cochlea tuned to high frequencies, we found that human subjects displayed poor pitch perception for single tones. More importantly, none of the subjects was able to extract the fundamental frequency from multiple low-frequency harmonics presented to high-frequency regions of the cochlea. The experiments demonstrate that tonotopic representation is crucial to complex pitch perception and provide a new tool in the search for the neural basis of pitch.
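As an illustration of how temporal and place information can be dissociated, the sketch below builds a simplified transposed tone: a half-wave rectified low-frequency sinusoid is used as the envelope of a high-frequency carrier, so the low-frequency timing is delivered to a high-frequency place. This is only a minimal sketch under stated assumptions. The function name `transposed_tone`, the 100 Hz modulator, the 4000 Hz carrier, and the 48 kHz sample rate are all illustrative choices that do not come from the abstract, and any low-pass filtering of the rectified envelope before multiplication is omitted here.

```python
import math


def transposed_tone(f_mod, f_carrier, duration, sample_rate=48000):
    """Return samples of a simplified transposed tone.

    f_mod      low frequency whose temporal pattern is to be conveyed (Hz)
    f_carrier  high-frequency carrier that sets the cochlear place (Hz)
    All parameter values used below are hypothetical, for illustration only.
    """
    n_samples = int(round(duration * sample_rate))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # Half-wave rectification: keep the positive half-cycles only.
        envelope = max(0.0, math.sin(2.0 * math.pi * f_mod * t))
        carrier = math.sin(2.0 * math.pi * f_carrier * t)
        # Note: a low-pass filter on the envelope is omitted in this sketch.
        samples.append(envelope * carrier)
    return samples


# Two periods of a 100 Hz temporal pattern carried at 4000 Hz.
signal = transposed_tone(100.0, 4000.0, 0.02)
```

In the resulting waveform, energy is concentrated around the carrier frequency, while the envelope repeats at the period of the low-frequency sinusoid.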

Footnotes

  • * To whom correspondence should be addressed. E-mail: oxenham{at}mit.edu.

  • This paper was submitted directly (Track II) to the PNAS office.

  • Abbreviations: F0, fundamental frequency; ITD, interaural time difference; SACF, summary autocorrelation function.

  • See Commentary on page 1114.

  • Threshold values here are somewhat higher than in the most comparable earlier study (29). This may be because threshold in a three-alternative forced-choice procedure corresponds to a higher level of sensitivity than in a two-alternative procedure (d′ = 1.27, as opposed to d′ = 0.77), and because we introduced ITDs only to the ongoing (steady-state) portion of the stimulus and not to the onset and offset ramps.
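The two d′ values in this footnote can be checked numerically. The sketch below assumes the standard equal-variance Gaussian model of an m-alternative forced-choice task and assumes that both procedures track the same proportion correct, taken here to be 70.7%. That tracked level is an assumption and is not stated in the footnote. The function names `percent_correct` and `d_prime_at` are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def percent_correct(d_prime, n_alternatives, n_steps=2000):
    """P(correct) in an m-alternative forced-choice task.

    Equal-variance Gaussian model: the signal interval is chosen when its
    observation exceeds the observations from all m - 1 noise intervals.
    The integral is evaluated with the trapezoidal rule.
    """
    lo, hi = d_prime - 8.0, d_prime + 8.0
    step = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x = lo + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        density = _STD_NORMAL.pdf(x - d_prime)
        total += weight * density * _STD_NORMAL.cdf(x) ** (n_alternatives - 1)
    return total * step


def d_prime_at(target_pc, n_alternatives, tol=1e-6):
    """Invert percent_correct by bisection (it increases with d')."""
    lo, hi = 0.0, 6.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if percent_correct(mid, n_alternatives) < target_pc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


TRACKED_PC = 0.7071  # assumed tracked proportion correct (not from the source)
d_two_afc = d_prime_at(TRACKED_PC, 2)
d_three_afc = d_prime_at(TRACKED_PC, 3)
```

Under these assumptions, `d_two_afc` and `d_three_afc` come out close to the 0.77 and 1.27 quoted in the footnote, which is why thresholds from the two procedures are not directly comparable.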

From the Cover