Human Central Auditory Plasticity Associated With Tone Sequence Learning

  1. Julie Marie Gottselig1,3,
  2. Daniel Brandeis2,
  3. Gilberte Hofer-Tinguely1,
  4. Alexander A. Borbély1, and
  5. Peter Achermann1
  1. 1Institute of Pharmacology and Toxicology and 2Institute of Child and Adolescent Psychiatry, University of Zürich, CH-8057 Zürich, Switzerland

Abstract

We investigated learning-related changes in amplitude, scalp topography, and source localization of the mismatch negativity (MMN), a neurophysiological response correlated with auditory discrimination ability. Participants (n = 32) underwent two EEG recordings while they watched silent films and ignored auditory stimuli. Stimuli were a standard (probability = 85%) and two deviant (probability = 7.5% each for high [HD] and low [LD]) eight-tone sequences that differed in the frequency of one tone. Between recordings, subjects practiced discriminating the HD or LD from the standard for 6 min. The amplitude of the LD MMN increased significantly across recordings in both groups, whereas the amplitude of the HD MMN did not. The LD was easier to discriminate than was the HD. Thus, practicing either discrimination increased the MMN for the easier discrimination. Learning and changes in the LD MMN amplitude were highly correlated. Source localizations of event-related potentials (ERPs) to all stimuli revealed bilateral sources in superior temporal regions. Compared with the standard ERP, the LD ERP revealed a stronger source in the left superior temporal region in both recordings, whereas the right-sided source became stronger after learning. Consistent with prior studies of auditory plasticity in animals and humans, tone sequence learning induced rapid neurophysiological plasticity in the human central auditory system. The results also suggest that there is asymmetric hemispheric involvement in tone sequence discrimination learning and that discrimination difficulty influences the time course of learning-related neurophysiological changes.

Consolidation of learned information induces changes in the brain that permit memory retrieval over a time scale of years. Memories remain intact despite temporary changes in attentional focus or arousal. If mechanisms of memory encoding were fully understood, it should be possible to physiologically assay the presence of a particular memory without the need to rely on behavioral measures of memory retrieval. For example, animal studies have shown that auditory learning induces receptive field plasticity that exhibits many characteristics of memory, including associativity, high specificity, neural consolidation, and long-term retention (e.g., Bakin and Weinberger 1990; Recanzone et al. 1993; Weinberger 1998; Galvan and Weinberger 2002). Such receptive field plasticity can develop very rapidly, after as few as five trials of auditory classical conditioning (Edeline et al. 1993). Receptive field plasticity can be measured even when performance measures are impossible, such as in anesthetized animals (Weinberger et al. 1993). However, receptive field measurement requires invasive recordings that are not feasible in normal humans. Measures derived from noninvasive scalp EEG recordings offer an alternative approach to assess auditory memory in humans. The mismatch negativity (MMN) is a candidate measure, which is noninvasive and has the advantage that it is relatively independent of attention and can be recorded during altered states of consciousness such as coma and sleep (for review, see Näätänen 2001).

The MMN is an evoked potential measure of the response of the brain to acoustic change in a series of repetitive auditory stimuli (for review, see Näätänen 2001; Näätänen et al. 2001).

The MMN is recorded by using the “oddball paradigm,” which consists of the presentation of a series of high-probability “standard” stimuli and low-probability “deviant” (or oddball) stimuli. With repeated exposure, the brain forms a memory trace for the standard. When the acoustical deviant is introduced, the brain reacts differently if it detects the acoustical change. In the latency range of ∼100 to 250 msec after the onset of the acoustical change, the difference between the response of the brain to the deviant and its response to the standard is known as the MMN, because the mismatch between stimuli elicits a negative potential over frontocentral portions of the scalp. When salient deviance is used and subjects actively attend to stimuli, N1 enhancement and N2b components overlap with the MMN (Näätänen 1990). In human MMN studies, researchers usually use slight deviance and instruct participants to direct their attention away from auditory stimuli, which permits separation of the MMN component.
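The definition above (deviant response minus standard response, measured roughly 100 to 250 msec after the acoustical change) can be illustrated with a minimal Python sketch. Everything numerical here is an assumption for illustration only: the toy Gaussian-shaped epochs, the 4-msec sampling step, the trial counts, and the helper names are hypothetical and are not the authors' analysis pipeline.

```python
import math

def mean_wave(trials):
    """Average equal-length single-trial epochs sample by sample (the ERP)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

def difference_wave(deviant_erp, standard_erp):
    """Deviant ERP minus standard ERP; the MMN appears as a negative deflection."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the wave over an inclusive latency window."""
    vals = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return sum(vals) / len(vals)

# Hypothetical time axis: -100 to 400 msec in 4-msec steps (assumed sampling).
times_ms = list(range(-100, 401, 4))

def toy_epoch(negativity_uv):
    """Synthetic epoch with a negative bump centred near 175 msec (assumed shape)."""
    return [-negativity_uv * math.exp(-((t - 175) / 40) ** 2) for t in times_ms]

standard_trials = [toy_epoch(0.5) for _ in range(3)]
deviant_trials = [toy_epoch(2.5) for _ in range(3)]

mmn = difference_wave(mean_wave(deviant_trials), mean_wave(standard_trials))
mmn_amp = mean_amplitude(mmn, times_ms, 100, 250)  # window taken from the text
print(mmn_amp < 0)  # a mismatch response shows up as a negative mean amplitude
```

In real data the epochs would come from EEG recordings time-locked to the deviant tone segment, and the measurement window would be chosen according to the study's own criteria.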

The amplitude of the MMN to a particular stimulus difference correlates with behavioral discrimination of that difference (e.g., Sams et al. 1985; Lang et al. 1990; Kraus et al. 1996). The MMN revealed long-term changes (i.e., changes that develop over months or years) in brain organization associated with language exposure (Näätänen et al. 1997; Cheour et al. 1998, 2002b; Winkler et al. 1999) and musical experience (Tervaniemi et al. 2001). The MMN has also been used to track auditory plasticity that develops over a time period of hours to weeks. Practice-related improvements in the discrimination of auditory stimuli were accompanied by increases in the MMN (Näätänen et al. 1993; Kraus et al. 1995; Tremblay et al. 1997, 1998; Menning et al. 2000, 2002; Tervaniemi et al. 2001; Atienza et al. 2002).

The MMN thus appears promising as an objective neural measure of learning that could be applied in a range of clinical and experimental settings (e.g., Atienza and Cantero 2001; Kujala et al. 2001). Therefore, it is important to characterize the relationship between learning and the MMN in detail. Many open questions remain. For example, if subjects learn to discriminate one stimulus difference, will the neurophysiological effects be more general, such that the MMN elicited by other stimulus differences will also change? Tremblay et al. (1997) reported that learning to discriminate a labial voice onset time contrast was associated with changes in the MMN not only for this contrast but also for alveolar voice onset time contrasts. Thus, generalization of the neurophysiological effects of learning is possible, but it is unclear if such generalization also occurs for other types of stimuli.

One aim of the present study was to investigate whether learning to discriminate tone sequences would cause stimulus-specific or generalized MMN increases. To answer this question, a standard tone sequence and two deviant tone sequences (Fig. 1) were presented during two recording sessions while subjects directed their attention to a film. In an intervening discrimination session, subjects practiced discriminating one of the two deviants from the standard. We hypothesized that the MMN response to the practiced stimulus would increase from the first to the second recording session. The MMN to the unpracticed stimulus difference was expected to remain constant or to decrease, because the brain may reduce responding to stimulus differences that are irrelevant for behavior, as occurs with habituation.
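A stimulus stream of this kind can be sketched in Python as follows. The probabilities are those stated in the abstract (85% standard, 7.5% for each deviant); the function name, the seed, and the use of independent random draws per trial are assumptions made for illustration, because the text does not describe how the stimulus order was generated or whether constraints (such as a minimum number of standards between deviants) were applied.

```python
import random

STIMULI = ["standard", "HD", "LD"]
PROBABILITIES = [0.85, 0.075, 0.075]  # from the abstract

def oddball_sequence(n_trials, seed=0):
    """Draw a stimulus label for each trial, independently (assumed scheme)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return rng.choices(STIMULI, weights=PROBABILITIES, k=n_trials)

sequence = oddball_sequence(10000)
for label in STIMULI:
    # Observed proportions should be close to the nominal probabilities.
    print(label, sequence.count(label) / len(sequence))
```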

Figure 1

(A) Schematic representation of the tone sequence stimuli. Each stimulus consisted of eight tones of 50-msec duration with overlapping rise and fall times of 5 msec, yielding a total stimulus duration of 365 msec. The first five and the last two tones were identical for all three stimuli. The deviants differed from the standard in their sixth tone, which was 85 Hz higher (high deviant [HD]) or lower (low deviant [LD]). Stimulus probabilities are given in parentheses. The trigger for event-related potential averaging occurred at 225 msec after stimulus onset, when the tone segment of interest occurred. (B) Experimental procedures and their approximate durations.
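The timing values in this caption can be checked with a short calculation. The sketch below assumes that each 5-msec overlap is shared between adjacent tones, so that successive tone onsets are 45 msec apart; that reading is an assumption, but it reproduces both the 365-msec total duration and the 225-msec trigger point given in the caption. The constant and function names are illustrative.

```python
N_TONES = 8        # tones per sequence
TONE_MS = 50       # duration of each tone
OVERLAP_MS = 5     # overlapping rise/fall time between adjacent tones
DEVIANT_TONE = 6   # 1-based position of the tone that differs

def tone_onset_ms(position):
    """Onset of the tone at a 1-based position, relative to stimulus onset."""
    return (position - 1) * (TONE_MS - OVERLAP_MS)

def total_duration_ms():
    """Onset of the last tone plus one full tone duration."""
    return tone_onset_ms(N_TONES) + TONE_MS

print(total_duration_ms())          # 365
print(tone_onset_ms(DEVIANT_TONE))  # 225
```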


The time course of changes in the MMN in relation to learning also merits further investigation. Some previous findings indicated that MMN changes precede improvements in discrimination ability (Sams et al. 1985; Kraus et al. 1995; Tremblay et al. 1998), whereas other results implied that discrimination learning caused concurrent or subsequent alterations in the MMN (Näätänen et al. 1993). We explored the relationship between discrimination performance and the MMN by using correlations. These correlations provide insight about the temporal dynamics of changes in the MMN in relation to behaviorally measured changes in discrimination ability. We hypothesized that initial discrimination performance would be correlated with initial MMN amplitude (e.g., Sams et al. 1985; Lang et al. 1990; Kraus et al. 1996) and that increases in discrimination performance would be correlated with increases in MMN amplitude (cf. Näätänen et al. 1993).

Studies in both humans and animals yielded evidence that the primary cortical generators of the MMN lie in the auditory cortex (Hari et al. 1984; Csepe et al. 1987; Giard et al. 1990; King et al. 1995; Karmos et al. 1997; Alho et al. 1998; Huotilainen et al. 1998; Rinne et al. 2000; Pincze et al. 2001; Waberski et al. 2001; Muller et al. 2002; Opitz et al. 2002), and frontal areas may also play a role (Giard et al. 1990; Rinne et al. 2000; Waberski et al. 2001; Muller et al. 2002; Opitz et al. 2002). Possible learning-related changes in the strength and localization of cortical MMN generators still require investigation. Learning to discriminate differences in synthetic-speech phonemes induced changes in the MMN that were greater in left than in right EEG derivations (Tremblay et al. 1997), suggesting that learning may engage brain regions asymmetrically. However, source localization methods are necessary to determine the sources of the EEG. Using an equivalent current dipole method of source localization, Menning et al. (2002) did not detect changes in pre- versus posttraining localizations of the mismatch field (measured with magnetoencephalography) elicited by Japanese words. We investigated possible changes in MMN topography and source localization after tone sequence learning. To reveal anatomical sources of event-related responses, we conducted source localizations with low-resolution electromagnetic tomography (LORETA; Pascual-Marqui et al. 1994; Pascual-Marqui 1999), which, unlike dipole methods, does not require assumptions about the number of sources. Based on previous evidence that MMN generators are located in auditory cortex and that MMN amplitude increases with learning, we predicted increased current source density in the superior temporal region after learning.

RESULTS

Performance

Figure 2 illustrates subjects' discrimination performance. Mean performance increased significantly over time (main effect of block: F(5, 26) = 8.26, P = 0.0001). Performance in the last block was significantly better than in the first block (t(31) = 4.88, P < 0.0001). In addition, performance was significantly better in the group that discriminated the LD than in the group that discriminated the HD (main effect of group: F(1, 30) = 16.81, P = 0.0003), indicating that the LD was easier to discriminate than was the HD. This difference between groups was already present in the first block (t(30) = 5.42, P < 0.0001), and the LD group still tended to do better in the last block (t(30) = 1.92, P = 0.06). The rate of learning did not differ significantly between groups (group × block interaction: F(5, 26) = 2.03, P = 0.11).

Figure 2

Performance during the discrimination session. Respectively, open squares and solid circles indicate the groups that discriminated the high deviant (HD) and the low deviant (LD) from the standard. Error bars indicate standard errors.


Amplitudes of the MMN Waveform

Figure 3A illustrates the event-related potential (ERP) waveforms from Fz. The mean MMN amplitudes are shown in Figure 3B. These amplitudes were analyzed by using a repeated-measures ANOVA with the between-subjects factor group (HD practice versus LD practice) and the within-subjects factors stimulus (HD versus LD) and recording (1 versus 2). The LD elicited a significantly larger MMN than did the HD (main effect of stimulus: F(1, 30) = 30.11, P = 0.0001). This difference was already present in the first recording (t(31) = 2.98, P = 0.006).

Figure 3

(A) Grand mean of event-related potential (ERP) waveforms at Fz (average reference). The 0-msec time point indicates the onset of the sixth tone segment that differed between the deviants and the standard. The onset of the first tone segment occurred at -225 msec. The left panels show data from recording 1; the right panels show data from recording 2. The top and bottom panels show data from the groups that practiced the high deviant (HD) and the low deviant (LD), respectively. Solid lines, dotted lines, and dashed lines indicate ERPs to the standard, HD, and LD, respectively. (B) Mean mismatch negativity (MMN) amplitudes. Circles and squares represent MMN responses to the LD and HD, respectively. Solid symbols and solid lines designate data from the group that practiced the LD. Open symbols and dashed lines represent data from the group that practiced the HD. Error bars indicate standard errors. The data from the two groups are shifted slightly along the x-axis to permit better visualization.


The stimulus by recording interaction was significant (F(1,30) = 5.64, P = 0.024), because the MMN elicited by the LD increased from recording 1 to recording 2 (t(31) = 2.52, P = 0.02), whereas the MMN elicited by the HD did not change significantly across recordings (t = 0.61, P = 0.54). The mean amplitude of the MMN to the HD increased slightly in the group that practiced discriminating it and decreased in the group that did not.

The main effect of group and interactions involving group were not significant (main effect: F(1, 30) = 0.13, P = 0.72; group × stimulus interaction: F(1, 30) = 0.19, P = 0.66; group × recording interaction: F(1, 30) = 0.82, P = 0.37; group × stimulus × recording interaction: F(1, 30) = 0.314, P = 0.580). In the first recording, the mean MMNs of the two groups were nearly the same for both of the deviants (Fig. 3B, recording 1, solid circle and solid square are superimposed on open circle and open square).

We also investigated the dynamics of the MMN within recordings by splitting the two recordings into halves. We analyzed the MMN amplitudes by using a MANOVA with the same factors as above and the additional factor half (first versus second). The main effect of half showed a nonsignificant trend (F(1, 30) = 3.11, P = 0.088), indicating that the MMN tended to increase within sessions (mean ± SE: first half, -2.25 ± 0.15 μV; second half, -2.42 ± 0.15 μV). None of the interactions involving “half” were significant. Thus, the degree of increase was similar for both recordings, both groups, and both stimuli.

Correlations Between the MMN and Performance

Correlations between performance and the amplitude of the MMN elicited by the practiced deviant are presented in Table 1. In this context, a negative correlation reflects a positive relationship with MMN amplitude, because a more negative MMN indicates a stronger mismatch response.
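The sign convention described here can be made concrete with a small Python sketch. The data values are hypothetical, and the use of a Pearson correlation is an assumption, since this section does not name the coefficient used. The point is only that when better performance goes with a more negative MMN, the correlation with the signed amplitude is negative, while the correlation with the magnitude of the response is positive.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical subjects: proportion correct and MMN amplitude in microvolts.
performance = [0.55, 0.65, 0.75, 0.85, 0.95]
mmn_uv = [-1.0, -1.6, -2.1, -2.4, -3.0]

r_signed = pearson_r(performance, mmn_uv)                    # negative
r_magnitude = pearson_r(performance, [-m for m in mmn_uv])   # same size, positive
print(r_signed < 0, r_magnitude > 0)
```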

Table 1.

Correlations Between MMN Amplitude and Performance


In the group that practiced the low (easier) deviant, the initial MMN amplitude was positively related to initial performance levels and inversely related to learning. A very strong, significant, positive relationship between learning and the change in the MMN was observed in this group, indicating that subjects who learned more showed larger increases in the MMN.

In the group that practiced the high (more difficult) deviant, the initial MMN amplitude was not correlated with initial performance, and there was no clear relationship between learning and the change in the MMN across recordings. Rather, the initial MMN amplitude was the best predictor of subsequent learning.

Objective and Subjective Measures of Attention and Sleepiness

Our assessments indicated no significant differences in attention or arousal state between the two recordings (attention ratings: t(31) = 1.31, P = 0.20, M1 = 23 mm, M2 = 20 mm; sleepiness ratings: t(29) = 0.02, P = 0.98, M1 = 34 mm, M2 = 34 mm; film scores: t(31) = 1.41, P = 0.17, M1 = -0.13 SD, M2 = 0.13 SD; M1 and M2 indicate means for the first and second recordings, respectively). We also correlated changes in the MMN (MMN for recording 2 - MMN for recording 1) to each deviant with changes in attention, sleepiness, and film scores. None of these correlations were statistically significant, indicating that alterations in attention or arousal cannot account for changes in the MMN across recordings.

MMN Topography

Figure 4 shows the topography of the MMN. There was a significant increase in the amplitude of the MMN to the LD from recording 1 to recording 2 (P = 0.004), as can be seen from the additional contour line (central darkest blue area) and the increased size of the lateral and posterior positivity (dark red area). The amplitude of the MMN to the HD did not change significantly (P = 0.413). Also, the topography of the MMN did not change significantly for either deviant (LD: P = 0.262; HD: P = 0.367). Thus, the results of the topographical analysis support the results of the waveform analyses.

Figure 4

The topography of the mismatch negativity (MMN) to the low deviant (LD, top row) and the high deviant (HD, bottom row) in the first (left) and second (right) recording sessions.


Source Localization

LORETA revealed current source density maxima in the left and right superior temporal regions for all standard and deviant ERPs. A representative illustration is shown for the LD ERP in recording 2 (Fig. 5A). Differences in LORETA solutions between standard and deviant stimuli were calculated to reveal the sources of the MMN. These differences are shown for the LD in Figure 5B. In both recordings, the maximal difference in current source density was located in the left superior temporal region. Compared with the first recording, the second recording showed a greater difference in the right superior temporal region (Fig. 5B, cf. upper and lower panels). Statistical comparisons revealed significantly increased current source density in the right insula in the second recording compared with the first (Fig. 5C). For the HD, the sources of the MMN did not change significantly between recordings.

Figure 5

(A) LORETA solution for the grand average low deviant (LD) event-related potential (ERP) in recording 2. The X, Y, and Z values and the triangles at the edge of the image indicate the coordinates of the voxel with the maximum current source density (CSD); this CSD value is given in parentheses. The red color scale indicates CSD values in μA/mm². Structural anatomy is shown in gray scale. (B) Differences in LORETA solutions (LD solution - standard solution) in recording 1 (top panel) and recording 2 (bottom panel). (C) Statistical maps comparing differences in CSD between recordings 1 and 2. The color scale indicates t-values, and the X, Y, and Z values indicate the coordinates of the voxel with the maximum t-value (one-tailed P = 0.0004, uncorrected). Dark red areas are statistically significant (t > 3.49, P < 0.05, after nonparametric correction).


DISCUSSION

The present study demonstrates that neurophysiological plasticity in the human auditory system can develop very rapidly, in this case, after 6 min of auditory discrimination learning. The LORETA source localizations offer the first evidence that MMN sources may change after learning. In combination with previous results, our results also suggest that discrimination difficulty influences the time course of changes in the MMN after auditory learning.

Changes in MMN Amplitude but not Topography

Both waveform and topographical analyses revealed that the amplitude of the MMN increased significantly after learning. Mechanisms that might underlie changes in MMN amplitude include cortical recruitment, changes in the tuning characteristics of auditory-responsive neurons, or greater synchronization of neuronal responses to the deviant tone segment (Gilbert et al. 2001), such as occurs with context-dependent facilitation (Kilgard and Merzenich 2002).

The topography (i.e., the shape of the EEG map contours) of the MMN showed no statistically significant changes after learning. Thus, the observed auditory plasticity differs from electrophysiological plasticity in the visual system, in which primarily topographical alterations were observed after discrimination learning of stereoscopic and vernier stimuli (Skrandies and Jedynak 1999; Skrandies 2001; Skrandies et al. 2001). However, one cannot exclude the possibility of subtle differences in activity distributions that were below the resolution limit of our ERPs.

Changes in MMN Sources After Learning

LORETA revealed bilateral sources in the superior temporal region for all standard and deviant ERPs in the latency range of the MMN, confirming that these potentials share a common origin in the auditory cortices. Compared with the standard ERP, the LD ERP demonstrated a stronger source in the left superior temporal region in both recordings, whereas the right-sided source became stronger in the second recording, after subjects had practiced discriminating the stimuli. The responses of the brain to deviant relative to standard stimuli thus became more bilateral as discrimination ability improved. Shifts in lateralization could reflect changes in perceptual strategy, because the left and right hemispheres are thought to specialize in different aspects of auditory perception (Peretz 1990; Robin et al. 1990). Differences in the source localizations of standard and LD ERPs revealed increased current source density in the right insula after learning. Possible mechanisms that underlie changes in current source density include increased responding of neurons that initially responded only weakly or moderately to the auditory stimuli, responding of neurons that initially did not respond to the stimuli (“cortical recruitment”), and/or increased neural synchronization. These mechanisms have been observed in animal studies of auditory cortical plasticity (e.g., Diamond and Weinberger 1986; Bakin and Weinberger 1990; Recanzone et al. 1993; Kilgard and Merzenich 2002) and are consistent with theoretical accounts of adaptive information processing (e.g., Weinberger et al. 1991). The possibility of insular recruitment would be consistent with physiological data that indicate the presence of neurons in the primate insula that respond to complex auditory signals (Bieser 1998). Its anatomical connections, as well as results from human neuropsychological and neuroimaging studies, also implicate the insula in auditory perception (for review, see Augustine 1996).

Specificity of MMN Changes

Increases in the MMN amplitude after discrimination practice were not limited to the practiced discrimination. The MMN response to the LD increased significantly in the group that practiced discriminating it and in the group that practiced discriminating the more difficult HD. The increased MMN to the LD in the HD practice group provides evidence that practicing difficult discriminations can strengthen MMN-generating mechanisms in a manner that transfers to easier discriminations. Generalization of MMN increases after discrimination training was previously noted for synthetic phoneme stimuli (Tremblay et al. 1997). Our results indicate that generalization also occurs when people learn to discriminate tone sequences; thus, generalized neurophysiological changes may accompany various types of auditory learning.

Difficulty of Discriminations

Both groups showed a significant increase in the MMN to the LD, yet there were no significant changes in the MMN to the HD. The difference in results for the two deviants is most likely attributable to the difference in the difficulty of discriminating them from the standard. Several aspects of our results demonstrated that the LD was easier to discriminate than was the HD. First, in same-different discrimination tests, the group that discriminated the LD from the standard performed significantly better than did the group that discriminated the HD from the standard. This difference was already present in the first block of discrimination. Second, the LD elicited a significantly larger MMN than did the HD, as would be predicted for an easier discrimination (Sams et al. 1985). Finally, the two groups initially had MMNs of almost exactly the same amplitudes (Fig. 3B, recording 1). Because the amplitude of the MMN is related to discrimination ability (e.g., Sams et al. 1985; Lang et al. 1990; Kraus et al. 1996), the initial equivalence of the MMN amplitudes of the two groups suggests that performance differences truly reflected differences in the difficulty of the discriminations, rather than incidental group differences in ability.

Relationship Between Initial MMN and Initial Performance

Previous studies have shown that the MMN response is related to discrimination accuracy (e.g., Lang et al. 1990; Kraus et al. 1996). Based on these studies, one would expect the initial MMN to be correlated with initial discrimination accuracy. Indeed, the amplitude of the MMN in recording 1 was significantly correlated with initial performance levels in the group that practiced the LD. In contrast, there was no correlation between initial MMN and initial discrimination performance in the group that practiced the HD. One explanation for the discrepancy between groups lies in their initial performance levels, which differed strikingly. The LD practice group was already very accurate, whereas performance in the HD practice group was not far above chance. At the individual level, MMN amplitude may correlate with discrimination performance only when stimulus differences are well above discrimination threshold. This interpretation is consistent with the results of Allen et al. (2000), who reported that MMN responses did not differ for stimuli that were just above or below individually determined discrimination thresholds.

Time Course of Changes in the MMN in Relation to Learning

If learning was correlated with changes in the MMN from recording 1 to recording 2, we could conclude that changes in the neural generators of the MMN must have occurred during the discrimination session or the 10-min break after it. For the LD group, this correlation was very strong (r = 0.76), suggesting that for the easier discrimination, MMN generators strengthened during learning. On the other hand, for the HD group, there was no such correlation. Instead, in the HD group, learning tended to be associated with the initial level of the MMN (r = 0.47). Thus, for the difficult deviant, stronger preexisting MMN generators predicted subsequent learning. Collectively, these results suggest that the temporal relationship between learning and neurophysiological plasticity is variable and may depend on the relative difficulty of discriminations. Similarly, studies in guinea pigs revealed a correlation between auditory receptive field plasticity and behavioral learning for an easy discrimination, whereas there was a dissociation between behavior and cortical plasticity for a difficult discrimination (Edeline and Weinberger 1993).

Comparison With Previous Findings

In the HD group, learning showed a small nonsignificant correlation (r = -0.18) with changes in the MMN, and neither group showed a significant change between sessions in the MMN induced by the HD. Superficially, these results appear inconsistent with the results of Näätänen et al. (1993) and Atienza et al. (2002), who used the same HD and standard as we did. In Atienza et al.'s study, the HD MMN increased immediately after training, but this increase was not statistically significant. A statistically significant increase was observed only at 36 and 48 h after training, suggesting that a more extended period of consolidation is required for increases in the MMN to develop after learning difficult discriminations. A related study in guinea pigs showed that learning-induced receptive field plasticity in auditory cortical neurons may require up to 3 d to achieve asymptotic levels (Galvan and Weinberger 2002). At the cellular level, time-consuming processes such as protein synthesis probably govern the necessity for extended consolidation periods (Kraus et al. 2002).

Another difference compared with previous results is that our group of subjects showed a significant MMN to the HD in the first recording (t = 13.60, P < 0.0001), whereas the subjects of Atienza et al. (2002) and most of the subjects of Näätänen et al. (1993) did not. The most notable new methodological feature of our study was the addition of the second, easier deviant. Exposure to the more easily discriminable LD may have cued the brain to the relevant temporal segment of the tone pattern, thereby triggering an MMN to the HD. Such cuing might be associated with passive learning during the recording session. Indeed, an analysis including the factor “half” revealed that the MMN tended to increase from the first to the second half of the recordings, consistent with the notion that some passive, preattentive learning may have occurred. In infants, passive exposure to sounds during sleep resulted in an increase in the MMN to those sounds (Cheour et al. 2002a), suggesting that preattentive auditory learning is possible.

A potential caveat for the interpretation of studies that attempt to relate auditory learning and the MMN is that neurophysiological changes might occur with the passage of time, independent of learning. However, time alone is unlikely to account for the MMN increases we observed. Näätänen et al.'s (1993) control experiment demonstrated that the MMN does not increase over time if subjects do not learn. Furthermore, in the absence of learning one would expect habituation rather than an increase in the MMN (McGee et al. 2001). Also, the strong correlation between learning and the MMN in the LD group supports the notion that learning is causally related to increases in MMN amplitude, as does evidence from numerous prior studies (Näätänen et al. 1993, 1997; Kraus et al. 1995; Tremblay et al. 1997, 1998; Cheour et al. 1998; Winkler et al. 1999; Menning et al. 2000, 2002; Tervaniemi et al. 2001).

Conclusions and Outlook

The present study demonstrates that tone sequence learning is associated with rapid neurophysiological plasticity in the central auditory system, consistent with prior studies of auditory learning in humans (e.g., Näätänen et al. 1993; Atienza et al. 2002) and animals (e.g., Edeline et al. 1993). Neuroimaging studies similarly indicated that discriminatory classical conditioning (Morris et al. 1998) and classical eyeblink conditioning (Molchan et al. 1994; Schreurs et al. 1997) induced changes in blood flow to auditory cortex within one experimental session. The combined results indicate that auditory cortical plasticity can develop within minutes.

Our source localizations provide evidence for asymmetric hemispheric involvement in learning of tone sequence discriminations. After learning, we observed significant increases in current source density only in the right hemisphere. Related findings of Tremblay et al. showed that when subjects learned to discriminate changes in synthetic phoneme stimuli, the MMN showed more pronounced changes in left EEG derivations (Tremblay et al. 1997), whereas P1 and N1 components of auditory evoked potentials showed changes only in right EEG derivations (Tremblay and Kraus 2002). In contrast, Menning et al. (2002) found no learning-related changes in localizations of the mismatch field elicited by Japanese words. Additional source localization studies should investigate possible hemispheric asymmetries in neurophysiological changes that occur with learning of different types of auditory discriminations.

Future investigations should also elucidate the relationship between the plasticity measured with neuroimaging and neurophysiological techniques. By combining EEG and neuroimaging techniques, researchers could determine the extent to which auditory learning induces simultaneous and colocalized changes in current source density and cerebral blood flow or glucose metabolism, and they could track these changes with high temporal and spatial resolution (cf. Vitacco et al. 2002). Both MMN and classical conditioning paradigms should be used in such studies to determine their similarities and differences. Researchers should adapt learning tasks so that the same tasks can be used in both humans and animals, making results comparable across studies. Animal studies permit direct investigations of neural mechanisms using invasive techniques, such as single unit recordings and measures of gene expression. The MMN has been recorded in guinea pigs (Kraus et al. 1994), cats (Csepe et al. 1987), and monkeys (Javitt et al. 1992). One could measure the MMN before and after auditory discrimination learning in one of these animal models while simultaneously recording single units and cortical EEG. Single unit recordings from multiple sites in auditory cortex and insula may serve to clarify how changes in responses of individual neurons or the synchrony of firing among neurons give rise to macroscopic changes in the MMN. Pairing of nucleus basalis stimulation with presentation of an auditory stimulus permits experimental induction of auditory associative memory, including both receptive field plasticity (e.g., Bakin and Weinberger 1996) and behavioral evidence of learning (McLin et al. 2002). Results obtained with this method suggested that learning of tone sequences may involve mechanisms that differ from those involved in learning single tones (Kilgard and Merzenich 2002). Animal researchers could test whether experimentally induced auditory learning results in changes in the MMN.

The present results together with previous findings suggest that the temporal relationship between learning and the MMN may differ depending on the relative difficulty of discriminations. For the easier discrimination, there was a high correlation (r = 0.76, accounting for 58% of the variance) between learning and changes in the MMN, suggesting that MMN generators strengthened as people learned. For the harder discrimination, the MMN amplitude did not show statistically significant changes in our experiment, and the initial MMN amplitude was the best predictor of learning. Thus, for difficult discriminations, MMN amplitude increases may require longer consolidation periods, as reported by Atienza et al. (2002).

One reason that longer consolidation periods may be required for difficult discriminations is that sleep may play an important role in the development of neural changes associated with learning (for review, see Sejnowski and Destexhe 2000; Tononi and Cirelli 2001). The MMN has already been used to study learning during sleep (Cheour et al. 2002a) and to assay the existence of previously stored memories during REM sleep (Atienza and Cantero 2001). The MMN or other neurophysiological measures of learning could provide an important complement to behavioral measures in future studies of the possible effects of sleep and circadian rhythms on learning.

MATERIALS AND METHODS

Subjects

Participants (n = 32; 14 female, 18 male, aged 20 to 26 years) were right-handed nonsmokers with no history of neurologic or psychiatric disease. They had hearing thresholds ≤30 dB HL in the range 500 to 3000 Hz. Two additional individuals were excluded for failure to pass the hearing screen. Subjects were instructed to avoid alcohol and caffeine on the day before and the day of the experiments. They were paid for participation. The local ethics committee approved the study procedures. All subjects gave written informed consent.

Stimuli

Each stimulus consisted of a sequence of eight individual tones (Fig. 1A). Stimuli were presented at 70 dB SPL with an interstimulus interval (ISI, defined as offset to onset) of 610 msec. The standard stimulus (P = 0.85) consisted of tones of the following frequencies: 720, 500, 638, 1040, 1175, 565, 815, and 920 Hz. The high deviant (HD) and low deviant (LD) stimuli (P = 0.075 each) were identical to the standard except for the sixth tone, which was 650 or 480 Hz, respectively. The HD and the standard were based on those used in Näätänen et al.'s (1993) study, and the LD was new. In each recording session, eight blocks of 200 stimuli were presented with 2-min breaks between blocks. Such breaks were included in Näätänen et al.'s (1993) procedure, and we also included them to help prevent habituation (cf. McGee et al. 2001). Stimuli were presented in random order with the constraints that each stimulus block began with four standard stimuli and at least one standard preceded each deviant.
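The stimulus definitions and ordering constraints described above can be illustrated with a short Python sketch. The tone frequencies, probabilities, and block length are taken from the text. The pairing strategy (attaching each deviant to a preceding standard before shuffling) and all function names are assumptions introduced for illustration, because the randomization procedure is not specified beyond the two constraints.

```python
import random

# Tone frequencies (Hz) of the standard sequence, as listed in the text.
STANDARD = [720, 500, 638, 1040, 1175, 565, 815, 920]


def make_deviant(sixth_tone_hz):
    """Return a copy of the standard with the sixth tone replaced."""
    seq = list(STANDARD)
    seq[5] = sixth_tone_hz
    return seq


STIMULI = {"S": STANDARD, "HD": make_deviant(650), "LD": make_deviant(480)}


def make_block(n_stimuli=200, p_deviant=0.075, rng=None):
    """Build one block of stimulus labels that obeys both ordering constraints.

    Assumption: exact counts per block (15 HD, 15 LD, 170 standards) rather
    than independent random draws for each stimulus.
    """
    rng = rng or random.Random()
    n_each = round(n_stimuli * p_deviant)
    n_standard = n_stimuli - 2 * n_each
    # Attach each deviant to a preceding standard so that at least one
    # standard always precedes a deviant.
    pairs = [["S", "HD"]] * n_each + [["S", "LD"]] * n_each
    # Three standards are reserved for the start of the block.
    free = [["S"]] * (n_standard - 2 * n_each - 3)
    units = pairs + free
    rng.shuffle(units)
    # Three fixed standards plus the leading standard of the first unit
    # guarantee that the block opens with at least four standards.
    return ["S"] * 3 + [label for unit in units for label in unit]
```

Calling `make_block()` eight times would yield the labels for one recording session under these assumptions. Other schemes that satisfy the same constraints (for example, rejection sampling of fully random orders) are equally compatible with the description.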

Discrimination Task

On each trial, subjects listened to a pair of stimuli separated by an ISI of 610 msec. The first stimulus of each pair was always the standard. The second stimulus was either the standard or a deviant. For each subject, only one of the two deviants was presented. Subjects were instructed to press the left mouse button if the two sequences were the same or the right button if the sequences were different. The intertrial interval was 1220 msec plus the reaction time of the subject. There were 60 same and 60 different trials in randomized order. The session lasted ∼6 min.

Procedure

The procedure is outlined in Figure 1B. Subjects reported to the laboratory at 8:30 a.m. or 2:00 p.m. Experiments were conducted in a sound-insulated, electrically shielded room. During recordings, subjects watched one of two documentary films with subtitles and no sound. We instructed subjects to ignore the auditory stimuli and focus attention only on the film because they would be asked about the content of the film after the recording. After each recording, subjects rated their attention (attention to film versus attention to tones) and sleepiness (awake versus sleepy) during the preceding recording by using 100-mm visual analog scales. Ratings are expressed in millimeters. Higher ratings reflect greater attention to tones or greater sleepiness. Subjects also answered 25 true/false questions about the film they had viewed. Between the two recordings, subjects completed a same/different discrimination task. Half of the subjects discriminated the HD from the standard (HD practice group, n = 16) and the other half discriminated the LD from the standard (LD practice group, n = 16). The HD and LD groups were balanced in terms of time of recording (morning or afternoon), sex of subjects, and order of film presentation.

Electrophysiology

The EEG was recorded continuously from 48 sintered Ag/AgCl electrodes at a resolution of 0.18 μV. An analog bandpass filter of 0.1 to 70 Hz was used (attenuation, 24 dB/octave), and signals were digitized at 500 Hz. Subjects wore an electrode cap that included all 10-20 system electrodes as well as electrodes Fpz (recording reference), AF1, AF2, FC1, FC2, FC5, FC6, FT9, FT10, TP9, TP10, CP5, CP6, PO9, PO10, and Oz (Klem et al. 1999). In addition, electrodes were placed on the mastoids, earlobes, and nose and below the outer canthus of each eye (EOG channels). Before analysis, data were digitally filtered with a 30-Hz low-pass filter (Butterworth, zero phase distortion, 48 dB/octave), down-sampled to 256 Hz using spline interpolation, and recalculated to an average reference (Lehmann 1987).

ERPs were segmented from -350 to 450 msec, where 0 msec labeled the onset of the tone segment of interest (i.e., 225 msec after stimulus onset, labeled by the trigger in Fig. 1). Segments containing artifacts that exceeded a threshold of ± 100 μV in any channel were excluded from the averages. The MMN was calculated as the average ERP for deviants minus the average ERP for standards. The first four standards of each block were excluded from the averages. An average of 101 deviant trials contributed to the MMN for each recording.
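The averaging and subtraction steps can be sketched as follows. This is a minimal Python illustration, not the analysis software used in the study. The data layout (each epoch as a list of channels, each channel a list of samples in μV) and the function names are assumptions.

```python
def average_erp(epochs, threshold_uv=100.0):
    """Average artifact-free epochs.

    Each epoch is a list of channels; each channel is a list of samples (in µV).
    An epoch is dropped if any sample in any channel exceeds ±threshold_uv.
    Returns the averaged ERP and the number of epochs that contributed.
    """
    kept = [
        epoch for epoch in epochs
        if all(abs(v) <= threshold_uv for channel in epoch for v in channel)
    ]
    if not kept:
        raise ValueError("no artifact-free epochs")
    n_channels, n_samples = len(kept[0]), len(kept[0][0])
    erp = [
        [sum(epoch[c][t] for epoch in kept) / len(kept) for t in range(n_samples)]
        for c in range(n_channels)
    ]
    return erp, len(kept)


def difference_wave(deviant_erp, standard_erp):
    """MMN difference wave: deviant ERP minus standard ERP, per channel and sample."""
    return [
        [d - s for d, s in zip(dev_channel, std_channel)]
        for dev_channel, std_channel in zip(deviant_erp, standard_erp)
    ]
```

In this sketch, the first four standards of each block would be removed from the list of standard epochs before calling `average_erp`.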

Waveform Analysis

Amplitudes and latencies were determined from Fz, where the MMN amplitude was maximal. The MMN peak latency for each subject was defined as the latency of the most negative peak in the time window 100 to 250 msec, and the amplitude was defined as the mean potential difference for 100 msec centered around this peak. Here, only amplitude data are presented, because latencies did not change significantly from recording 1 to recording 2.
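The peak and amplitude measures can be expressed as a short Python sketch. It makes a simplifying assumption: the most negative sample in the search window is taken as the peak, without checking that it is a local extremum. The function name and argument layout are hypothetical.

```python
def mmn_peak_and_amplitude(wave_uv, times_ms, search=(100.0, 250.0),
                           half_width_ms=50.0):
    """Return (peak latency in ms, mean amplitude in µV) for one difference wave.

    wave_uv and times_ms are equal-length lists for a single channel (e.g., Fz).
    """
    # Indices of samples that fall inside the search window.
    candidates = [i for i, t in enumerate(times_ms) if search[0] <= t <= search[1]]
    # Simplification: the most negative sample is treated as the peak.
    peak_index = min(candidates, key=lambda i: wave_uv[i])
    peak_latency = times_ms[peak_index]
    # Mean over the 100-msec window centered on the peak (±50 msec).
    window = [v for v, t in zip(wave_uv, times_ms)
              if abs(t - peak_latency) <= half_width_ms]
    return peak_latency, sum(window) / len(window)
```

Note that the averaging window may extend beyond the 100 to 250 msec search range when the peak lies near one of its edges.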

Mapping Analysis

Topographic maps of the MMN scalp potential distributions were made at the individually determined time point of peak global field power (GFP; for details, see Lehmann 1987) in the latency range 100 to 250 msec. Topographic maps were generated in Brain Vision Analyzer software (Brain Products), using triangulation and linear interpolation. Statistical comparisons between maps were made by using topographical analyses of variance (TANOVAs; Strik et al. 1998), a nonparametric bootstrapping technique implemented in the LORETA software (Pascual-Marqui 2002). To determine if the topography of the maps changed, TANOVAs were conducted with the GFP of each map normalized to one.
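For readers unfamiliar with GFP, the following Python sketch shows the computation for a single time point, together with the normalization applied before the TANOVAs. It assumes the common definition of GFP as the root mean square of the average-referenced potentials across electrodes. The function names are illustrative and do not correspond to the software used.

```python
import math


def average_reference(potentials):
    """Re-reference one map (one value per electrode) to the average reference."""
    mean = sum(potentials) / len(potentials)
    return [v - mean for v in potentials]


def gfp(potentials):
    """Global field power: root mean square of the average-referenced map."""
    referenced = average_reference(potentials)
    return math.sqrt(sum(v * v for v in referenced) / len(referenced))


def normalize_map(potentials):
    """Scale a map to GFP = 1 so that only its topography (shape) remains."""
    referenced = average_reference(potentials)
    strength = gfp(referenced)
    if strength == 0:
        raise ValueError("flat map cannot be normalized")
    return [v / strength for v in referenced]
```

Because normalization removes differences in overall field strength, any remaining difference between two normalized maps reflects a change in the spatial distribution of the potentials.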

Statistics

Performance data were expressed in terms of sensitivity, or d′ (Macmillan and Creelman 1991). Performance data and the MMN amplitude and latency data were analyzed by using multivariate repeated-measures ANOVAs (MANOVA results from SAS proc glm, repeated). Comparisons between recordings or groups were made by using two-tailed paired or unpaired t-tests, respectively.
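As an illustration of the sensitivity measure, d′ is the difference between the z-transformed hit rate and false alarm rate. The Python sketch below treats a "different" response on a different trial as a hit. The log-linear correction for extreme rates is an assumption added here to keep the value finite; the correction actually applied, if any, is not stated in the text.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count and 1 to each
    total) is applied so that rates of 0 or 1 do not give infinite values.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

A subject who responds at chance obtains d′ = 0, and larger values indicate better discrimination of the deviant from the standard.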

The relationship between performance data and the MMN amplitude (at Fz, as defined above) was investigated by using Pearson correlations. (1) To investigate whether the initial MMN amplitude predicted initial performance levels, the MMN elicited in recording 1 by the relevant deviant (i.e., the deviant subsequently practiced in discrimination session 1) was correlated with performance in block 1 of discrimination session 1. (2) To investigate whether the initial MMN amplitude predicted subsequent learning, the MMN elicited in recording 1 by the relevant deviant was correlated with learning in discrimination session 1. For each individual, learning was defined as the slope of the regression line fitted to the performance values for the six blocks of discrimination. (3) To investigate whether MMN amplitude changed as a result of learning, we correlated the change in MMN amplitude across recordings (MMN to relevant deviant in recording 2 - MMN to relevant deviant in recording 1) with learning in discrimination session 1.
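The learning measure and the correlations can be sketched in Python as follows. Block numbers are assumed to be coded 1 through 6 for the regression, and the function names are illustrative.

```python
import math


def learning_slope(block_scores):
    """Least-squares slope of performance (e.g., d') against block number."""
    n = len(block_scores)
    blocks = range(1, n + 1)  # assumption: blocks coded 1, 2, ..., n
    mean_x = sum(blocks) / n
    mean_y = sum(block_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(blocks, block_scores))
    sxx = sum((x - mean_x) ** 2 for x in blocks)
    return sxy / sxx


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)
```

For analysis (3), one would compute `learning_slope` for each subject and then pass the list of slopes and the list of MMN amplitude changes (recording 2 minus recording 1) to `pearson_r`.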

The relationship between attentional measures and the MMN was also investigated by using Pearson correlations. Scores on the two sets of film questions were normalized (expressed as the number of standard deviations from the mean on that particular set of questions) prior to analysis, in case the two sets of questions differed in difficulty.
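The normalization of the film question scores amounts to a z-transformation within each set of questions. In the minimal Python sketch below, the use of the sample standard deviation (n - 1 denominator) is an assumption, as the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Express each score as the number of standard deviations from the set mean.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]
```

Applying `z_scores` separately to the scores from each set of film questions places both sets on a common scale before they are correlated with the MMN.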

Source Localization

We used LORETA to calculate source localizations (Pascual-Marqui et al. 1994; Pascual-Marqui 1999, 2002). LORETA determines the smoothest possible current source density solution that accounts for the observed scalp EEG topography. LORETA does not make assumptions about the number of generators. It produces a blurred solution of focal sources due to the smoothness constraint. LORETA is calculated using the three-shell spherical head model registered to the Talairach human brain atlas (Talairach and Tournoux 1988). Solutions are constrained to gray matter (cortex and hippocampus; Pascual-Marqui 1999). Paired statistical comparisons were made by using statistical nonparametric mapping, which corrects for multiple comparisons at 2394 voxels (Nichols and Holmes 2002). One-tailed comparisons were used because we predicted increases in current source density after learning.

For calculation of source solutions, potentials from EOG electrodes and electrodes on the mastoids, earlobes, and nose were not included, because nonuniform sampling of scalp potentials causes localization errors. By using the LORETA filtering utility, data were subjected to additional digital filtering in the range 1 to 20 Hz. LORETA solutions were calculated individually for each subject for each deviant and standard ERP in each recording. The solutions were calculated for 100 msec centered around the peak GFP in the latency range 100 to 250 msec, as individually determined for each deviant and each recording. The resulting time windows were also used for calculating the standard LORETA solutions that were subsequently subtracted from deviant solutions. Transformation matrices were calculated by using individually measured three-dimensional electrode coordinates, and the amount of over-smoothness was determined objectively by using cross validation with ERP files.

Acknowledgments

We thank Roberto Pascual-Marqui for advice about LORETA, Martin Speck for help with data collection, Reto Huber for assistance with subject recruitment, and Norbert Dillier for lending us equipment. We also thank Urs Maurer and Silvia Brem for their contributions. This project was supported by the Human Frontiers Science Program (RG00131/2000-B R), the Swiss National Science Foundation (3100-053005.97/2 and 3100A0-100567/1), and the NCCR on Neural Plasticity and Repair.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

  • Article and publication are at http://www.learnmem.org/cgi/doi/10.1101/lm.63304.

    • Accepted October 29, 2003.
    • Received May 22, 2003.

References