
NeuroImage 26 (2005) 1119–1127
www.elsevier.com/locate/ynimg

Scanning silence: Mental imagery of complex sounds


Nico Bunzeck a,*, Torsten Wuestenberg b, Kai Lutz c, Hans-Jochen Heinze a, and Lutz Jancke c,*

a Department of Neurology II, Otto von Guericke University, Leipziger Street 44, Magdeburg 39120, Germany
b Georg-August-University Göttingen, Institute for Medical Psychology, Waldweg 37, 37073 Göttingen, Germany
c Department of Neuropsychology, University of Zurich, Treichlerstrasse 10, CH-8032 Zurich, Switzerland

Received 8 October 2004; revised 3 March 2005; accepted 10 March 2005
Available online 11 May 2005

In this functional magnetic resonance imaging (fMRI) study, we investigated the neural basis of mental auditory imagery of familiar complex sounds that did not contain language or music. In the first condition (perception), the subjects watched familiar scenes and listened to the corresponding sounds that were presented simultaneously. In the second condition (imagery), the same scenes were presented silently and the subjects had to mentally imagine the appropriate sounds. During the third condition (control), the participants watched a scrambled version of the scenes without sound. To overcome the disadvantages of the stray acoustic scanner noise in auditory fMRI experiments, we applied a sparse temporal sampling technique with five functional clusters that were acquired at the end of each movie presentation. Compared to the control condition, we found bilateral activations in the primary and secondary auditory cortices (including Heschl's gyrus and planum temporale) during perception of complex sounds. In contrast, the imagery condition elicited bilateral hemodynamic responses only in the secondary auditory cortex (including the planum temporale); no significant activity was observed in the primary auditory cortex. The results show that imagery and perception of complex sounds that do not contain language or music rely on overlapping neural correlates of the secondary but not primary auditory cortex.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Mental imagery; Imagery; Auditory cortex; Perception; Sparse temporal sampling; fMRI

* Corresponding authors. Nico Bunzeck is to be contacted at Department of Neurology II, Otto von Guericke University, Leipziger Street 44, Magdeburg 39120, Germany. Lutz Jancke, Department of Neuropsychology, University of Zurich, Treichlerstrasse 10, CH-8032 Zurich, Switzerland.
E-mail addresses: bunzeck@neuro2.med.uni-magdeburg.de (N. Bunzeck), l.jaencke@psychologie.unizh.ch (L. Jancke).
Available online on ScienceDirect (www.sciencedirect.com).
1053-8119/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.neuroimage.2005.03.013
Introduction

Numerous imaging studies on mental auditory imagery (using functional magnetic resonance imaging [fMRI], magnetoencephalography [MEG], positron emission tomography [PET], or single-photon emission computed tomography [SPECT]) came to the conclusion that imagery and perception of different sorts of auditory stimuli share common neural substrates (for an overview, see Kosslyn et al., 2001). However, there is still a debate about the involvement of the different subdivisions of the auditory cortex during this higher cognitive function. Several studies reported neural responses in secondary auditory cortex (SAC) but not primary auditory cortex (PAC) for imagery of music-related stimuli (Halpern and Zatorre, 1999; Ohnishi et al., 2001; Schurmann et al., 2002; Zatorre et al., 1996, 1998), verbal stimuli (Jancke and Shah, 2004; McGuire et al., 1996; Shergill et al., 2001), and during verbal or musical hallucinations (Griffiths, 2000; Lennox et al., 1999; Shergill et al., 2000; Silbersweig et al., 1995). In contrast, other studies show an increase of the blood oxygen level-dependent (BOLD) signal in both secondary and primary auditory cortices during imagery of musical stimuli (Yoo et al., 2001) and verbal hallucinations (Dierks et al., 1999). Compared to the language and music domains, only a few neuroimaging studies have investigated mental auditory imagery of stimuli that are neither language- nor music-related. Goldenberg et al. (1991) presented names of objects and the subjects imagined the sounds associated with these objects. The performance of the imagery task was correlated with an increased regional cerebral blood flow (rCBF) in the left inferior occipital region, the left thalamus, both hippocampal regions, and the right inferior and superior temporal regions. Because of the low spatial resolution of the employed imaging technique (SPECT), a distinction between primary and secondary auditory cortices was not possible. An fMRI study by Wheeler et al. (2000) also revealed neural activations in secondary but not primary auditory cortex during imagery of sounds; however, the employed stimuli were not exclusively free from music. In contrast, Hoshiyama et al. (2001) found activations in primary auditory cortex when the subjects imagined the sound of a hammer striking an anvil, but not when they just watched the scene without perceiving or imagining the sound. Additionally, this MEG study revealed initial activations in inferior frontal and insular areas prior to the activation seen in the auditory cortex (Hoshiyama et al., 2001).



In this study, we addressed the question of whether mental imagery of complex sounds that do not contain verbal or musical elements is associated with activation of the auditory cortex. Further, we ask which subdivision of the auditory cortex is involved in this imagery process. We hypothesize that imagery of these complex sounds might elicit activation in the secondary rather than the primary auditory cortex, whereas perceptual processing of the same stimuli would activate both primary and secondary auditory areas. Moreover, we predict stronger activation in the auditory cortex during the perception than during the imagery condition.

Materials and methods

Subjects

Eight volunteers (age range: 21–28; mean: 24; SD: 2.13; four female and four male) with normal or corrected-to-normal acuity participated in this study after giving written informed consent. All subjects were recruited from an academic environment and none of them had a history of major medical, neurological, or psychiatric disorders. Seven subjects were consistently right-handed and one was mixed-handed according to the Annett Handedness Questionnaire (AHQ) (Annett, 1970). The study was approved by the local ethics committee.

Experimental design and task

The experimental session comprised three runs. Within each run, three different types (conditions) of 10-s color movies were presented. In the first condition (perception), the movies showed well-known scenes of everyday life together with the appropriate sound. Importantly, the movies did not contain any verbal or musical elements. These scenes were, for example, clapping hands, a glass being filled with water, or a person blow-drying her hair. The subjects were instructed to watch and listen to the presented scenes attentively. In the second condition (imagery), the same movies were presented again; in contrast to the perception condition, however, the sound was omitted. The subjects were instructed to watch the scenes and to imagine the appropriate sound as intensely as possible. Since the depicted actions and the corresponding sounds were highly familiar to the subjects, no preceding imagery training with these scenes was necessary. All subjects verbally reported that they were easily able to imagine the typical sounds accompanying the movies. In the third condition (control), the subjects watched an unrecognizable scrambled version of the movies. The single scrambled frames were taken from all movies of the second and third conditions (two frames from each movie), randomized, and presented for 250 ms per frame. No sound was present in this condition.
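To make the run structure concrete, here is a minimal sketch of one run's presentation list (illustrative only, not the authors' stimulus-delivery code; the scene labels and the contiguous grouping of the three versions of each scene are assumptions based on the description above):

```python
# Illustrative sketch of one run: for every scene the fixed order is
# scrambled (control) -> with sound (perception) -> silent (imagery),
# while the order of the scenes themselves is pseudorandomized.
import random

SCENES = [f"scene_{i:02d}" for i in range(1, 21)]        # 20 everyday scenes (hypothetical labels)
CONDITION_ORDER = ["control", "perception", "imagery"]   # predetermined within each scene

def build_run(seed=0):
    rng = random.Random(seed)
    scenes = SCENES[:]
    rng.shuffle(scenes)                                  # pseudorandomized scene order
    trials = []
    for scene in scenes:
        for condition in CONDITION_ORDER:
            trials.append({"scene": scene,
                           "condition": condition,
                           "movie_s": 10,                # 10-s movie
                           "fixation_s": 10})            # 10-s fixation cross afterwards
    return trials

run = build_run()
assert len(run) == 60                                    # 60 movies per run
print(sum(t["movie_s"] + t["fixation_s"] for t in run) / 60, "min")  # 20.0 min per run
```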

Between the movies, the subjects fixated a black cross presented on a white background square, which was also shown for 10 s (Fig. 1). The presentation order of the movie conditions was predetermined: first, the scrambled movie was presented (control), followed by the movie with the corresponding sound (perception); after that, the same movie was shown without sound (imagery) (Fig. 1). However, the order of the movie scenes (a glass being filled with water, clapping hands, and so on) was pseudorandomized. During each run, 60 movies were shown (20 different scenes with and without sound, respectively, and 20 scrambled movies without sound). Hence, each run lasted 20 min. The films as well as the white square with the fixation cross were projected onto the center of a screen, and the participants watched them through a mirror mounted on the head coil. Sound transmission was realized by a digital playback system including a high-frequency shielded transducer system (Jancke et al., 2002a,b).

fMRI acquisition

Functional magnetic resonance imaging was performed on a 1.5-T whole-body MRI system (GE Signa Horizon LX, General Electric Inc., Waukesha, WI, USA) equipped with echo planar imaging (EPI) capability, using the standard circularly polarized head coil for radio-frequency transmission and signal reception. The anatomical images were collected parallel to the anterior–posterior commissural plane using whole-head T1-weighted inversion recovery prepared EPI (IR-EPI) sequences (matrix: 64 × 64; 18 slices/volume; FoV: 200 × 200 mm; spatial resolution: 3.13 × 3.13 × 8 mm; TE = 17 ms; TI = 1050 ms; TR = 12,000 ms). In the functional session, eleven T2*-weighted echo planar images per volume, with blood oxygenation level-dependent contrast, were obtained in the same orientation (matrix: 64 × 64; 11 slices/volume; FoV: 200 × 200 mm; spatial resolution: 3.13 × 3.13 × 8 mm; slice thickness = 7 mm; gap = 1 mm; TE = 40 ms; TR = 1000 ms; flip angle = 80°). These partial volumes covered the whole temporal lobe (including the auditory cortex), the cerebellum, and parts of the frontal lobe. To avoid interference between the scanner-noise-evoked hemodynamic response function (HRF) and the stimulus-evoked HRF, we used a variation of the sparse temporal sampling technique (Hall et al., 1999). Six seconds after the onset of the movie presentation, five functional partial volumes were acquired (lasting 5 s). The first and second volumes of each cluster were not included in the statistical analysis; they were obtained to ensure an appropriate steady-state magnetization. Each of the last three volume sets had a different temporal distance to the onset of the movie and was analyzed separately (see below). The interval between each clustered volume acquisition was 15 s (Fig. 2). During this time, no scanner noise was present.
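The acquisition timing described above can be summarized in a short sketch (an illustration based on the stated values, not the scanner protocol; variable names are ours):

```python
# Minimal sketch of the sparse temporal sampling schedule: each trial is a 10-s
# movie followed by a 10-s fixation period; 6 s after movie onset a cluster of
# five 1-s volumes is acquired, and only volume sets 3-5 enter the analysis.
TR = 1.0             # s per functional volume
MOVIE_S = 10.0       # movie duration
FIX_S = 10.0         # fixation duration
CLUSTER_DELAY = 6.0  # acquisition starts 6 s after movie onset
N_VOLS = 5           # volumes per cluster (first two discarded as dummies)

def volume_end_times(trial_index):
    """Times (s, relative to run start) at which each volume of a trial's cluster finishes."""
    onset = trial_index * (MOVIE_S + FIX_S)              # movie onset of this trial
    return [onset + CLUSTER_DELAY + (i + 1) * TR for i in range(N_VOLS)]

t0, t1 = volume_end_times(0), volume_end_times(1)
print(t0)                      # [7.0, 8.0, 9.0, 10.0, 11.0] -> sets 3-5 end 9, 10, 11 s after onset
print(t1[0] - TR - t0[-1])     # 15.0 -> silent gap between clustered acquisitions
```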

Fig. 1. Schematic description of the experimental design. Three different types of movies were presented for 10 s each: (1) scrambled scenes (control), (2) scenes with the appropriate sound (perception), and (3) the same scene without sound (imagery). Between each movie, participants gazed at a centrally positioned fixation cross.



fMRI data analysis

fMRI data were preprocessed with SPM99 and statistically analyzed by the general linear model approach using the SPM2 software package (Wellcome Department of Cognitive Neuroscience, University College London, UK) and MATLAB 6.1 (The MathWorks Inc., USA). After discarding the first two volume sets of every cluster, the remaining sets three to five were analyzed separately. Each set of images was corrected for motion artifacts by realignment to the first volume of each functional image set. The functional partial images were spatially normalized by normalizing the subject's anatomical IR-EPI to a standard T1-weighted SPM template and applying these parameters to the functional images. The images were resampled to 4 × 4 × 4 mm and smoothed with an isotropic 8-mm full-width half-maximum Gaussian kernel. The time-series fMRI data were high-pass filtered (cutoff 120 s) and globally scaled over voxels and scans within each session. A statistical model for each subject was computed by applying a boxcar model (first-level analysis). To test regionally specific condition effects, linear contrasts were employed for each subject and the different conditions. The resulting contrast images were used to perform a random-effects second-level analysis. Here, a one-way analysis of variance (ANOVA) was applied to the images obtained for each subject's volume set and the different conditions. The statistical parametric maps of the t statistics [SPM(t)] at each voxel were thresholded at P < 0.001 (uncorrected for multiple comparisons) and the spatial extent threshold was set at k = 10 voxels. The anatomical localization of significant activations was assessed with reference to the standard stereotaxic atlas by superimposing the SPM maps on the standard MNI brain provided by SPM2. Additionally, peaks located in Heschl's gyrus (HG) or the planum temporale (PT) were identified based on probability maps (Rademacher et al., 2001; Westbury et al., 1999) after a correction for differences in the coordinate system between the Talairach and Tournoux atlas (used in the probability maps) and the stereotaxic space employed by SPM2 (MNI space) (Brett et al., 2002b). In an additional region-of-interest (ROI) analysis, activation patterns within HG and PT (averaged over the left and right hemisphere) were examined using the MarsBar software tool (Brett et al., 2002a). Three-dimensional ROIs were defined as spheres with a radius of 4 mm located in the medial portion of HG (center coordinates: 44, 10, 8 and 40, 20, 8) and in PT (center coordinates: 60, 20, 10 and 60, 24, 10). These regions have been described as highly probable parts of the PAC and SAC, respectively (Rademacher et al., 2001; Westbury et al., 1999).
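The sphere-averaging step of the ROI analysis can be illustrated as follows (a minimal sketch, not the MarsBar pipeline actually used; the file name is a placeholder and the centre coordinates are copied as printed above, i.e., without hemisphere signs):

```python
# Mean value of a normalized effect-size image inside 4-mm spheres given in
# world (MNI) coordinates -- an illustration of the ROI-averaging idea only.
import nibabel as nib
import numpy as np

def sphere_mean(img_path, center_mm, radius_mm=4.0):
    """Mean image value inside a sphere defined in world (MNI) coordinates."""
    img = nib.load(img_path)
    data = img.get_fdata()
    affine = img.affine
    # world coordinates of every voxel centre
    ijk = np.array(np.meshgrid(*[np.arange(n) for n in data.shape[:3]], indexing="ij"))
    ijk = ijk.reshape(3, -1)
    xyz = affine[:3, :3] @ ijk + affine[:3, 3:4]
    dist = np.linalg.norm(xyz - np.asarray(center_mm).reshape(3, 1), axis=0)
    mask = (dist <= radius_mm).reshape(data.shape[:3])
    return float(data[mask].mean())

# Centre coordinates as printed in the text
hg_centers = [(44, 10, 8), (40, 20, 8)]    # medial Heschl's gyrus
pt_centers = [(60, 20, 10), (60, 24, 10)]  # planum temporale

# Example call on a hypothetical first-level contrast image:
# hg_effect = np.mean([sphere_mean("con_perception.nii", c) for c in hg_centers])
```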

The signal change, as indicated by the effect sizes within the ROIs, was statistically compared between conditions by two-sample t tests. To test for between-hemispheric differences, the contrast images for the main effects (perception and imagery) were duplicated and horizontally flipped, resulting in a second contrast image data set with reversed transversal orientation. By comparing the original with the flipped images, hemodynamic responses obtained for the left and right hemispheres became comparable within the SPM analysis (Jancke et al., 2002a; Zaehle et al., 2004). The resulting SPM(t) were thresholded at P < 0.001 (uncorrected for multiple comparisons) and the spatial extent threshold was set at k = 5 voxels. To test for gender differences, contrast images of the main effects were compared between males (N = 4) and females (N = 4) in a second-level two-sample t test (thresholded at P < 0.001, uncorrected; k = 5 voxels).

Results

Cluster set analysis

The employed sparse temporal sampling design comprised five functional volumes per cluster (Fig. 2). Since different characteristic values of the hemodynamic response function (HRF) in auditory fMRI experiments have been reported (see Discussion), the time course of the stimulus-evoked and scanner-noise-evoked HRF could not be determined accurately a priori. Therefore, we conducted separate first- and second-level analyses for cluster sets three, four, and five. This analysis revealed the highest signal intensities in auditory cortex for the fourth cluster, which was acquired 10 s after stimulus onset and 3 s after the scanner burst started. Therefore, the following results correspond only to the fourth cluster set.

Activation pattern during perception and mental imagery

Compared to the presentation of scrambled movies without sound (control), perception of scenes with sound (perception) as well as mental imagery of the appropriate sound while watching the scenes silently (imagery) revealed significant bilateral hemodynamic responses in the superior and middle temporal gyrus (Table 1). The bilateral activations in the superior temporal gyrus (STG) during perception could be assigned to the primary and secondary auditory cortices (Fig. 3). This was confirmed on the basis of the probability atlas of the PT and the cytoarchitectonic fields of the HG. According to these maps, the global maximum in the perception condition was located in HG with a probability of 80–90% and one local maximum was located in PT with a probability of 46–65% (Table 1).

Fig. 2. Illustration of the employed sparse temporal sampling design. The lower squares symbolize seconds, the horizontal black and white bars show the conditions or fixation periods, respectively, and the upper vertical bars illustrate the acquisition of one functional volume each. Six seconds after stimulus onset, five functional MR volumes were acquired for 5 s. Between each cluster set, no scanner noise was present for 15 s. Each condition and fixation period lasted 10 s.



Table 1
Peak activations for perception vs. control (A) and imagery vs. control (B)

(A) Perception vs. control

Anatomical structure             Prob. atlas for PT/HG (%)   Hemisphere   P corr. FWE   P corr. FDR   Cluster size (voxels)   t value   Peak coordinates MNI (mm): x, y, z
Temporal transverse gyrus (HG)   80–90                       R            <0.001        <0.001        282                     12.07     48, 16, 8
Planum temporale (PT)            46–65                                    <0.003        <0.001                                8.49      44, 36, 12
Superior temporal gyrus                                                   <0.007        <0.001                                7.84      64, 36, 12
Superior temporal sulcus                                     L            <0.001        <0.001        254                     11.04     60, 56, 8
Superior temporal gyrus                                                   <0.001        <0.001                                10.00     56, 24, 4
Inferior temporal sulcus                                                  <0.001        <0.001                                10.14     52, 64, 0
Fusiform gyrus                                               L            <0.088        <0.001        91                      6.21      44, 64, 12
Parahippocampal gyrus                                        L            <0.131        <0.001        14                      5.98      12, 28, 4

(B) Imagery vs. control

Superior temporal gyrus                                      R            <0.001        <0.001        117                     9.76      60, 56, 8
Inferior temporal gyrus                                                   <0.012        <0.001                                7.47      52, 72, 0
Planum temporale (PT)            26–45                                    <0.745        <0.008                                4.85      68, 36, 16
Inferior temporal sulcus                                     L            <0.001        <0.001        84                      9.36      52, 64, 0
Globus pallidus                                              L            <0.281        <0.006        46                      5.54      16, 4, 4
Putamen                                                                   <0.425        <0.007                                5.31      28, 4, 8
Planum temporale                 46–65                       L            <0.392        <0.006        47                      5.36      60, 36, 12
Planum temporale                 26–45                                    <0.465        <0.007                                5.26      52, 36, 20
Planum temporale                 26–45                                    <0.537        <0.007                                5.18      44, 36, 12
Putamen                                                      R            <0.710        <0.008        19                      4.92      28, 0, 4

t = 3.79 (P < 0.001, uncorrected); k = 10 voxels.

In contrast, the temporal lobe activations in the imagery condition were found more posteriorly, in the SAC. Referring to the probability atlas of the PT, one local maximum was located in the PT with a probability of 46–65% and three local maxima with a probability of 26–45%. Comparing both conditions, higher t values in the auditory cortex were observed in the perception condition.

Additionally, in both experimental conditions, activations outside of the auditory cortex were found. Perception of complex sounds was associated with activations in the left inferior temporal sulcus (ITS), fusiform gyrus, and parahippocampal gyrus. During imagery, bilateral activations in the putamen and left-lateralized activations in the globus pallidus were observed.

Fig. 3. Brain activation pattern during perception (A) and mental imagery (B) superimposed on a T1-weighted MNI brain (P < 0.001, uncorrected; k = 10 voxels). In the perception condition, activations of the primary and secondary auditory cortices were observed. Mental imagery of complex sounds was accompanied by activation of the secondary but not the primary auditory cortex.

Table 2
Peak activations for the conjunction analysis perception vs. control and imagery vs. control

Anatomical structure       Prob. atlas for PT/HG (%)   Hemisphere   P corr. FWE   P corr. FDR   Cluster size (voxels)   t value   Peak coordinates MNI (mm): x, y, z
Superior temporal sulcus                               R            <0.001        <0.001        117                     9.76      60, 56, 8
Inferior temporal sulcus                                            <0.001        <0.001                                7.47      52, 72, 0
Planum temporale           26–45                                    <0.001        <0.001                                4.85      68, 36, 16
Inferior temporal sulcus                               L            <0.001        <0.001        70                      9.36      52, 64, 0
Planum temporale           26–45                       L            <0.001        <0.001        45                      5.36      60, 36, 12
Planum temporale           26–45                                    <0.001        <0.001                                5.26      52, 36, 20
Planum temporale           26–45                                    <0.001        <0.001                                5.18      44, 36, 12

t = 3.79 (P < 0.001, uncorrected); k = 10 voxels.

Since the MR acquisition was focused on the auditory cortex and temporal lobe, these activations will be interpreted with caution.

Overlap between the activation patterns of perception and mental imagery

A second-level conjunction analysis of the contrasts perception vs. control and imagery vs. control revealed brain areas that were activated in both conditions (P < 0.001, uncorrected; k = 10 voxels). These regions were located in the right superior temporal sulcus (STS) and bilaterally in the ITS and PT (Table 2 and Fig. 4). Referring to the probability maps, the local peaks of the activations were in the PT with a probability of 26–45%.

Region-of-interest (ROI) analysis

In order to examine the activation pattern in the auditory cortex (averaged over hemispheres) during imagery and perception in more detail, an ROI analysis was performed.

As shown in Fig. 5, during perception both HG and PT were activated (compared to baseline: P < 0.05, t = 2.1 for HG and P < 0.001, t = 7.12 for PT). During imagery, however, activation was observed only in PT (compared to baseline: P < 0.05, t = 2.9) but not in HG. The marginal deactivation in HG during imagery did not differ significantly from baseline (P > 0.05, t = 1.81). Within HG, the neural activation was significantly stronger during perception than during imagery (P < 0.05, t = 3.7). Comparing PT activations between the two conditions, a significantly weaker response was observed during imagery (P < 0.001, t = 7.6). The ROI analysis therefore confirms the results of the SPM analysis.

Hemispheric and gender differences

To test for between-hemispheric differences, the contrast images for the main effects (perception and imagery) were duplicated and horizontally flipped, resulting in a second contrast image data set with reversed transversal orientation. Comparing the original with the flipped images of the main contrasts revealed similar hemodynamic responses in the left and right hemispheres. Thus, there were no hemispheric differences for either the imagery or the perception condition (P < 0.001, uncorrected; k = 5 voxels). Additionally, gender differences were examined by comparing males (N = 4) vs. females (N = 4) in a second-level two-sample t test. This analysis did not reveal differences between the two groups for perception or imagery of sounds (P < 0.001, uncorrected; k = 5 voxels).
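The flip-and-compare logic behind the lateralization test can be sketched as follows (a simplified illustration, not the SPM implementation used in the study; it assumes that the first array axis of the normalized images is the left–right axis, which a real pipeline would verify from the image affine, and it realizes the comparison as a voxel-wise paired test across subjects):

```python
# Mirror each subject's contrast image about the mid-sagittal plane and test
# original vs. mirrored maps voxel-wise -- a sketch of the flip-and-compare idea.
import numpy as np
from scipy import stats

def lateralization_t(contrast_maps):
    """contrast_maps: array (n_subjects, x, y, z) of normalized contrast images.
    Returns voxel-wise paired t statistics and p values for original vs. flipped maps."""
    flipped = contrast_maps[:, ::-1, :, :]     # left-right mirror of every subject's map
    diff = contrast_maps - flipped             # hemispheric asymmetry per subject
    t, p = stats.ttest_1samp(diff, popmean=0.0, axis=0)
    return t, p

# Toy usage with random data standing in for 8 subjects' contrast images:
rng = np.random.default_rng(0)
maps = rng.normal(size=(8, 40, 48, 38))
t, p = lateralization_t(maps)
print(t.shape)                                 # (40, 48, 38)
```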

Fig. 4. Results of the conjunction analysis perception vs. scrambled and imagery vs. scrambled. During both conditions (perception and imagery), bilateral activations in secondary auditory cortex and bilateral ITS were observed.

Fig. 5. Shown is the neural response (effect-sizes, arbitrary values) within HG and PT for both conditions (averaged over left and right hemisphere). During perception, activation was observed in both ROIs. In contrast, mental imagery was associated with activation only in the PT but not within HG. Neural response within PT was significantly weaker during imagery compared to perception of sounds. Error bars indicate standard errors of the mean.



Discussion

Cluster set analysis

One major problem in auditory studies using fMRI is the stray acoustic scanner noise. It can lead to several problems, such as interference with the auditory stimulation (Hall et al., 2000; Shah et al., 1999, 2000) and masking of the cortical response in the auditory cortex (Amaro et al., 2002; Bandettini et al., 1998). In order to avoid these disadvantages, we chose a paradigm based on the sparse temporal sampling design developed by Hall et al. (1999). It was characterized by the acquisition of a cluster of five functional volumes 6 s after stimulus onset and a silent gap of 15 s between each clustered acquisition (Fig. 2). Different time courses of the BOLD signal in auditory cortex have been reported for fMRI paradigms (Belin et al., 1999; Hall et al., 1999). For example, the reported peak of the HRF in auditory cortex varies between 3 s (Belin et al., 1999), 5 s (Hickok et al., 1997), and about 10 s (Hall et al., 1999) after stimulus onset. The critical characteristic values of the HRF mainly depend on the presented stimulus material as well as on the duration of the MR scanner noise (Hall et al., 1999; Shah et al., 1999). Therefore, in our design, cluster sets three, four, and five were analyzed separately, since the time courses of the stimulus-evoked HRF and the HRF evoked by the scanner noise could only be estimated approximately. This analysis revealed maximum signal intensity in the auditory cortex for the fourth cluster set, which was acquired 10 s after stimulus onset. It can be assumed that the stimulus-evoked HRF reached its peak at this point in time or, alternatively, that it was not yet masked by the scanner-noise-evoked HRF. For the last cluster set (11 s after onset), it is very likely that the stimulus-dependent HRF had already reached its peak and interfered with the scanner-noise-dependent HRF. For the third cluster set (9 s after onset), we presume that the stimulus-evoked HRF had not yet reached its peak, since the MR signal in auditory cortex was stronger for the following cluster set.
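As a rough illustration of why the sampled time point matters, a generic double-gamma HRF (textbook SPM-style parameters, not an estimate from these data) convolved with a 10-s stimulus predicts a response peaking roughly 11 s after stimulus onset, i.e., within the 9–11-s window covered by cluster sets three to five:

```python
# Generic double-gamma HRF convolved with a 10-s boxcar stimulus; prints the
# latencies of the HRF peak and of the predicted-response peak.
import numpy as np
from scipy.stats import gamma

dt = 0.1
t = np.arange(0, 32, dt)
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6     # double-gamma HRF, peak near 5 s
hrf /= hrf.sum()

boxcar = (t < 10).astype(float)                  # 10-s movie as a sustained stimulus
pred = np.convolve(boxcar, hrf)[: t.size]        # predicted BOLD time course
print(t[np.argmax(hrf)], t[np.argmax(pred)])     # ~5 s vs. ~11 s after onset
```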

Activation pattern during perception and mental imagery

During perception and imagery of complex sounds, we observed bilateral activations within the auditory cortex. While the activation pattern in the perception condition could be assigned to the primary (HG) and secondary (PT) auditory cortices, it was located in the secondary (PT) but not the primary auditory cortex during the imagery condition. The conjunction analysis of both conditions revealed a significant increase of the hemodynamic response only in the secondary auditory cortex (PT). Different studies have already shown that perception and mental imagery of music-related stimuli (Halpern and Zatorre, 1999; Shergill et al., 2001) and verbal stimuli (Jancke and Shah, 2004; McGuire et al., 1996) rely on common neural substrates of the secondary auditory cortex. Our findings are consistent with these studies. Furthermore, we show that mental imagery of complex sounds that are free from language and music also depends on structures of the secondary but not the primary auditory cortex. Hence, perception and imagery of these stimuli rely on overlapping neural structures of the secondary auditory cortex.

Two studies using similar stimulus material have been published so far. In contrast to our study, only one rather simple sound made by a hammer striking an anvil had to be imagined in the experiment reported by Hoshiyama et al. (2001). The second study employed an imaging technique (SPECT) whose spatial resolution did not permit conclusions about the exact location of activity within the auditory cortex (Goldenberg et al., 1991).

The finding that only perception but not imagery of the utilized stimuli was associated with activation of the primary auditory cortex is consistent with neuropsychological approaches that favor modular psychological mechanisms (for example, Hebb, 1949). According to these approaches, perceptual inputs such as sounds are processed bottom-up by several perceptual mechanisms that operate in a sequential and hierarchical mode. The cortical processing of the presented auditory stimuli starts at a first elementary level where basic acoustic features are extracted; the PAC and SAC are supposed to be the neural basis of these elementary acoustic analyses. After that, speech, voice, and also complex sound perception is carried out by higher-order processing mechanisms, which also include the influence of modality-independent cognition. Mental auditory imagery, in contrast, is a top-down process characterized by a reversed order of analysis. This process is initiated by neural networks specialized for higher-order cognition, such as the inferior frontal gyrus and insular regions (Griffiths, 2001; Hoshiyama et al., 2001; Jancke and Shah, 2004). These networks seem to activate only secondary but not primary auditory areas during imagery of complex sounds as used in this study.

Three studies reported activations in PAC during auditory imagery (Dierks et al., 1999; Hoshiyama et al., 2001; Yoo et al., 2001). In contrast to the present and most other studies (Halpern and Zatorre, 1999; Jancke and Shah, 2004; McGuire et al., 1996; Ohnishi et al., 2001; Schurmann et al., 2002; Shergill et al., 2001; Wheeler et al., 2000; Zatorre et al., 1996, 1998), Hoshiyama et al. (2001) and Yoo et al. (2001) used rather simple sounds such as computer-generated monotones. Therefore, in accordance with modular psychological approaches (as stated above), it is possible that the observed activation in different subdivisions of the auditory cortex varies as a function of the complexity of the imagined stimulus. A similar hypothesis has been disproved by Thompson et al. (2001) for the visual domain; for the auditory system, the question remains open. The third study, which provides evidence for activations not only in SAC but also in PAC during verbal hallucinations (which can be considered a specific sort of mental auditory imagery), comes from Dierks et al. (1999). According to the authors, this result might be explained by their single-subject analysis, which avoided a possible attenuation of the effect in PAC due to signal averaging across subjects, as seen in other studies (Dierks et al., 1999). Jancke and Shah (2004) argued that auditory hallucinations are fundamentally different from verbal imagery because hallucinations can be seen as the result of a disturbed auditory system and are not self-generated. Irrespective of the underlying mechanisms of hallucinations, most other studies did not find activations in primary auditory cortex during verbal (Lennox et al., 1999; Shergill et al., 2000; Silbersweig et al., 1995) or musical hallucinations (Griffiths, 2000).

Up to now, we have argued that some kind of automatic auditory imagery would drive the activations in the SAC.
However, it might also be that, during movie presentation without concomitant auditory signals, the subjects were trying to remember which auditory signal is associated with the particular movie.



This remembering could be associated with vivid sensory-specific information retrieval that activates the secondary auditory areas, similar to what has been suggested in the context of the reactivation hypotheses of memory (Nyberg et al., 2003; Wheeler et al., 2000). Conceptually, however, it is currently not possible to differentiate auditory imagery from auditory retrieval associated with reactivation of sensory-specific areas. Most likely, retrieval and imagery share many processes and associated neural structures, which makes it difficult to distinguish the two.

Stronger activations during perception than imagery

Another important finding, reported not only in auditory imagery studies (Jancke and Shah, 2004) but also in the visual (Le Bihan et al., 1993), tactile (Yoo et al., 2003), and motor domains (Porro et al., 1996), is that common neural substrates are more strongly activated during perception or movement than during imagery of the corresponding stimulus or movement. As indicated by significantly greater effect sizes in the ROI analysis, we also found stronger activations in PT (SAC) during perception than during imagery of sounds. However, an integrative interpretation is still missing. It is conceivable that the bottom-up analysis of a sensory input requires more neural resources within the corresponding sensory system than the reactivation of the same stimulus during mental imagery. On the other hand, top-down-driven mental imagery (or reactivation) of a stimulus additionally recruits a distributed network that also includes frontal areas. Thus, the peak activations within the core perception areas are relatively weak for imagery, but there is a large spatial extent of activation. It might be that a more distributed activation pattern is a typical feature of imagery.

Lateralization during perception and imagery

Several studies have shown lateralized processing in the auditory system depending on the stimulus type. Congruent with current theoretical assumptions, phonetic processing is predominantly left-lateralized, whereas processing of non-verbal information such as music is predominantly right-lateralized (Jancke et al., 2002b; Tervaniemi et al., 2000; Zatorre, 2003). Additionally, several neuroimaging studies suggested that not only perception but also imagery of the corresponding auditory stimulus should be processed by lateralized neural networks. This has been shown for language (Shergill et al., 2001) as well as for music (Zatorre and Halpern, 1993). In this study, we did not find lateralized processing for complex sounds: during perception and imagery, both hemispheres were almost equally activated. However, one has to be careful when interpreting fMRI data in the context of lateralization questions. For example, Sinai and Pratt (2003), using EEG and LORETA to map the main focus of activation within the auditory cortex during word perception, recently showed that activation switched between the left and right auditory cortices during the first 400 ms after stimulus presentation. Thus, there is no clear or steady-state asymmetry of activation but rather a dynamic lateralization pattern, which is difficult to measure using the very slow BOLD response.

In a recent study, it was shown that not only the behavioral performance but also the underlying neural substrates during a mental rotation task differ between male and female subjects (Jordan et al., 2002). By comparing male and female activation patterns, we did not find any sex differences during imagery and perception of sounds in our study.

However, because of the small sample size (N = 4 per group), possible differences might not have been observed. Therefore, further research is needed to clarify the question of gender differences during mental imagery of sounds.

Neural response outside of the PAC and SAC

As shown in the conjunction analysis (Fig. 4), additional activations were found outside the auditory cortex. The strongest activations were observed bilaterally in the posterior temporal cortex and the adjacent lateral occipital regions. These regions are known to be involved in various aspects of visual perception. The movies used in the present study include several visual features, such as motion and objects, which are not present in the visual control condition (scrambled movie). In the comparison of perception and imagery versus control, the activation in the posterior temporal lobe and the adjacent lateral occipital complex (LOC) might therefore reflect differences in the processing of these visual features (Grill-Spector et al., 2001; Kourtzi et al., 2002; Tootell et al., 1995). In addition, it has been shown that the posterior STS and the posterior part of the middle temporal gyrus (MTG) play an important role in the integration of different types of information within and across modalities. According to this assumption, the activation pattern found in our study could also reflect integration processes (Beauchamp et al., 2004).

We also observed basal ganglia activations associated with the imagery condition. Participation of the basal ganglia has been shown during encoding and retrieval of procedural information of various types (Poldrack and Packard, 2003). Thus, the basal ganglia activations might reflect that the visual–auditory associations studied here are encoded and stored within the implicit (procedural) memory system. This might explain the ability of all subjects to easily imagine (or reactivate) the sounds in the context of the visual cues. A further possibility is that the basal ganglia involvement reflects a reactivation of motor-related activity, in the sense that particular movement patterns are activated during the silent presentation of the movies. In some way, this activation resembles the activation found in a previous study that described an increased neural response of the basal ganglia during imagery of playing the violin (Lotze et al., 2003). However, that activation was only present in amateur musicians and not in professionals. It was argued that it reflects a kind of audio–motor association evident either in less-skilled musicians or in subjects currently establishing an audio–motor association (Bangert and Altenmuller, 2003).

Conclusion

The present results suggest that mental imagery of sounds that do not contain language or music relies on structures of the secondary but not the primary auditory cortex. Importantly, perception of the same stimuli activated both the primary and the secondary auditory cortex. Moreover, by using a sparse temporal sampling technique, we were able to ensure that the acoustic scanner noise did not interfere with the auditory stimulation and imagination. These findings provide evidence that perception and mental imagery of complex sounds rely on overlapping but dissociable neural correlates.

Acknowledgments

We are grateful to A. Schoenfeld and T. Zaehle for helpful discussions and comments on an earlier version of the manuscript.

This work was supported by a grant donated by the Swiss National Science Foundation (SNF) to L.J.

References

Amaro, E., Jr., Williams, S.C., Shergill, S.S., Fu, C.H., MacSweeney, M., Picchioni, M.M., Brammer, M.J., McGuire, P.K., 2002. Acoustic noise and functional magnetic resonance imaging: current strategies and future prospects. J. Magn. Reson. Imaging 16 (5), 497–510.
Annett, M., 1970. A classification of hand preference by association analysis. Br. J. Psychol. 61 (3), 303–321.
Bandettini, P.A., Jesmanowicz, A., Van Kylen, J., Birn, R.M., Hyde, J.S., 1998. Functional MRI of brain activation induced by scanner acoustic noise. Magn. Reson. Med. 39 (3), 410–416.
Bangert, M., Altenmuller, E.O., 2003. Mapping perception to action in piano practice: a longitudinal DC-EEG study. BMC Neurosci. 4 (1), 26.
Beauchamp, M.S., Lee, K.E., Argall, B.D., Martin, A., 2004. Integration of auditory and visual information about objects in superior temporal sulcus. Neuron 41 (5), 809–823.
Belin, P., Zatorre, R.J., Hoge, R., Evans, A.C., Pike, B., 1999. Event-related fMRI of the auditory cortex. NeuroImage 10 (4), 417–429.
Brett, M., Anton, J.-L., Valabregue, R., Poline, J.-B., 2002a. Region of interest analysis using an SPM toolbox. NeuroImage 16 (2) (abstract; presented at the 8th International Conference on Functional Mapping of the Human Brain, June 2–6, 2002, Sendai, Japan; http://marsbar.sourceforge.net/).
Brett, M., Johnsrude, I.S., Owen, A.M., 2002b. The problem of functional localization in the human brain. Nat. Rev., Neurosci. 3 (3), 243–249.
Dierks, T., Linden, D.E., Jandl, M., Formisano, E., Goebel, R., Lanfermann, H., Singer, W., 1999. Activation of Heschl's gyrus during auditory hallucinations. Neuron 22 (3), 615–621.
Goldenberg, G., Podreka, I., Steiner, M., Franzen, P., Deecke, L., 1991. Contributions of occipital and temporal brain regions to visual and acoustic imagery – a SPECT study. Neuropsychologia 29 (7), 695–702.
Griffiths, T.D., 2000. Musical hallucinosis in acquired deafness. Phenomenology and brain substrate. Brain 123 (Pt. 10), 2065–2076.
Griffiths, T.D., 2001. The neural processing of complex sounds. Ann. N. Y. Acad. Sci. 930, 133–142.
Grill-Spector, K., Kourtzi, Z., Kanwisher, N., 2001. The lateral occipital complex and its role in object recognition. Vision Res. 41 (10–11), 1409–1422.
Hall, D.A., Haggard, M.P., Akeroyd, M.A., Palmer, A.R., Summerfield, A.Q., Elliott, M.R., Gurney, E.M., Bowtell, R.W., 1999. Sparse temporal sampling in auditory fMRI. Hum. Brain Mapp. 7 (3), 213–223.
Hall, D.A., Haggard, M.P., Akeroyd, M.A., Summerfield, A.Q., Palmer, A.R., Elliott, M.R., Bowtell, R.W., 2000. Modulation and task effects in auditory processing measured using fMRI. Hum. Brain Mapp. 10 (3), 107–119.
Halpern, A.R., Zatorre, R.J., 1999. When that tune runs through your head: a PET investigation of auditory imagery for familiar melodies. Cereb. Cortex 9 (7), 697–704.
Hebb, D.O., 1949. The Organization of Behavior: A Neuropsychological Theory. Wiley, New York.
Hickok, G., Love, T., Swinney, D., Wong, E.C., Buxton, R.B., 1997. Functional MR imaging during auditory word perception: a single-trial presentation paradigm. Brain Lang. 58 (1), 197–201.
Hoshiyama, M., Gunji, A., Kakigi, R., 2001. Hearing the sound of silence: a magnetoencephalographic study. NeuroReport 12 (6), 1097–1102.
Jancke, L., Shah, N.J., 2004. Hearing syllables by seeing visual stimuli. Eur. J. Neurosci. 19 (9), 2603–2608.
Jancke, L., Wustenberg, T., Scheich, H., Heinze, H.J., 2002a. Phonetic perception and the temporal cortex. NeuroImage 15 (4), 733–746.
Jancke, L., Wustenberg, T., Schulze, K., Heinze, H.J., 2002b. Asymmetric hemodynamic responses of the human auditory cortex to monaural and binaural stimulation. Hear. Res. 170 (1–2), 166–178.
Jordan, K., Wustenberg, T., Heinze, H.J., Peters, M., Jancke, L., 2002. Women and men exhibit different cortical activation patterns during mental rotation tasks. Neuropsychologia 40 (13), 2397–2408.
Kosslyn, S.M., Ganis, G., Thompson, W.L., 2001. Neural foundations of imagery. Nat. Rev., Neurosci. 2 (9), 635–642.
Kourtzi, Z., Bulthoff, H.H., Erb, M., Grodd, W., 2002. Object-selective responses in the human motion area MT/MST. Nat. Neurosci. 5 (1), 17–18.
Le Bihan, D., Turner, R., Zeffiro, T.A., Cuenod, C.A., Jezzard, P., Bonnerot, V., 1993. Activation of human primary visual cortex during visual recall: a magnetic resonance imaging study. Proc. Natl. Acad. Sci. U. S. A. 90 (24), 11802–11805.
Lennox, B.R., Park, S.B., Jones, P.B., Morris, P.G., Park, G., 1999. Spatial and temporal mapping of neural activity associated with auditory hallucinations. Lancet 353 (9153), 644.
Lotze, M., Scheler, G., Tan, H.R., Braun, C., Birbaumer, N., 2003. The musician's brain: functional imaging of amateurs and professionals during performance and imagery. NeuroImage 20 (3), 1817–1829.
McGuire, P.K., Silbersweig, D.A., Murray, R.M., David, A.S., Frackowiak, R.S., Frith, C.D., 1996. Functional anatomy of inner speech and auditory verbal imagery. Psychol. Med. 26 (1), 29–38.
Nyberg, L., Marklund, P., Persson, J., Cabeza, R., Forkstam, C., Petersson, K.M., Ingvar, M., 2003. Common prefrontal activations during working memory, episodic memory, and semantic memory. Neuropsychologia 41 (3), 371–377.
Ohnishi, T., Matsuda, H., Asada, T., Aruga, M., Hirakata, M., Nishikawa, M., Katoh, A., Imabayashi, E., 2001. Functional anatomy of musical perception in musicians. Cereb. Cortex 11 (8), 754–760.
Poldrack, R.A., Packard, M.G., 2003. Competition among multiple memory systems: converging evidence from animal and human brain studies. Neuropsychologia 41 (3), 245–251.
Porro, C.A., Francescato, M.P., Cettolo, V., Diamond, M.E., Baraldi, P., Zuiani, C., Bazzocchi, M., di Prampero, P.E., 1996. Primary motor and sensory cortex activation during motor performance and motor imagery: a functional magnetic resonance imaging study. J. Neurosci. 16 (23), 7688–7698.
Rademacher, J., Morosan, P., Schormann, T., Schleicher, A., Werner, C., Freund, H.J., Zilles, K., 2001. Probabilistic mapping and volume measurement of human primary auditory cortex. NeuroImage 13 (4), 669–683.
Schurmann, M., Raij, T., Fujiki, N., Hari, R., 2002. Mind's ear in a musician: where and when in the brain. NeuroImage 16 (2), 434–440.
Shah, N.J., Jancke, L., Grosse-Ruyken, M.L., Muller-Gartner, H.W., 1999. Influence of acoustic masking noise in fMRI of the auditory cortex during phonetic discrimination. J. Magn. Reson. Imaging 9 (1), 19–25.
Shah, N.J., Steinhoff, S., Mirzazade, S., Zafiris, O., Grosse-Ruyken, M.L., Jancke, L., Zilles, K., 2000. The effect of sequence repeat time on auditory cortex stimulation during phonetic discrimination. NeuroImage 12 (1), 100–108.
Shergill, S.S., Brammer, M.J., Williams, S.C., Murray, R.M., McGuire, P.K., 2000. Mapping auditory hallucinations in schizophrenia using functional magnetic resonance imaging. Arch. Gen. Psychiatry 57 (11), 1033–1038.
Shergill, S.S., Bullmore, E.T., Brammer, M.J., Williams, S.C., Murray, R.M., McGuire, P.K., 2001. A functional study of auditory verbal imagery. Psychol. Med. 31 (2), 241–253.
Silbersweig, D.A., Stern, E., Frith, C., Cahill, C., Holmes, A., Grootoonk, S., Seaward, J., McKenna, P., Chua, S.E., Schnorr, L., et al., 1995. A functional neuroanatomy of hallucinations in schizophrenia. Nature 378 (6553), 176–179.
Sinai, A., Pratt, H., 2003. High-resolution time course of hemispheric dominance revealed by low-resolution electromagnetic tomography. Clin. Neurophysiol. 114 (7), 1181–1188.
Tervaniemi, M., Medvedev, S.V., Alho, K., Pakhomov, S.V., Roudas, M.S., Van Zuijen, T.L., Naatanen, R., 2000. Lateralized automatic auditory processing of phonetic versus musical information: a PET study. Hum. Brain Mapp. 10 (2), 74–79.
Thompson, W.L., Kosslyn, S.M., Sukel, K.E., Alpert, N.M., 2001. Mental imagery of high- and low-resolution gratings activates area 17. NeuroImage 14 (2), 454–464.
Tootell, R.B., Reppas, J.B., Dale, A.M., Look, R.B., Sereno, M.I., Malach, R., Brady, T.J., Rosen, B.R., 1995. Visual motion aftereffect in human cortical area MT revealed by functional magnetic resonance imaging. Nature 375 (6527), 139–141.
Westbury, C.F., Zatorre, R.J., Evans, A.C., 1999. Quantifying variability in the planum temporale: a probability map. Cereb. Cortex 9 (4), 392–405.
Wheeler, M.E., Petersen, S.E., Buckner, R.L., 2000. Memory's echo: vivid remembering reactivates sensory-specific cortex. Proc. Natl. Acad. Sci. U. S. A. 97 (20), 11125–11129.
Yoo, S.S., Lee, C.U., Choi, B.G., 2001. Human brain mapping of auditory imagery: event-related functional MRI study. NeuroReport 12 (14), 3045–3049.
Yoo, S.S., Freeman, D.K., McCarthy, J.J., Jolesz, F.A., 2003. Neural substrates of tactile imagery: a functional MRI study. NeuroReport 14 (4), 581–585.
Zaehle, T., Wustenberg, T., Meyer, M., Jancke, L., 2004. Evidence for rapid auditory perception as the foundation of speech processing: a sparse temporal sampling fMRI study. Eur. J. Neurosci. 20 (9), 2447–2456.
Zatorre, R.J., 2003. Music and the brain. Ann. N. Y. Acad. Sci. 999, 4–14.
Zatorre, R.J., Halpern, A.R., 1993. Effect of unilateral temporal-lobe excision on perception and imagery of songs. Neuropsychologia 31 (3), 221–232.
Zatorre, R.J., Halpern, A.R., Perry, D.W., Meyer, E., Evans, A.C., 1996. Hearing in the mind's ear: a PET investigation of musical imagery and perception. J. Cogn. Neurosci. 8, 29–46.
Zatorre, R.J., Perry, D.W., Beckett, C.A., Westbury, C.F., Evans, A.C., 1998. Functional anatomy of musical processing in listeners with absolute pitch and relative pitch. Proc. Natl. Acad. Sci. U. S. A. 95 (6), 3172–3177.
