Project Beep

Personnel: Sahar Akram, Claire Chambers, Connie Cheung, Nai Ding, Ying-Yee Kong, Lakshmi Krishnan, Adrian KC Lee, Matthew Runchey (and his brain), Barbara Shinn-Cunningham

This is unpublished work in progress.

Project Goal

The goal of this project is to measure different auditory attentional biomarkers using EEG. We are specifically interested in 3 neuromarkers that would be modulated by attention: (1) Mismatch Negativity (MMN); (2) Reorientation Negativity (RON) and (3) Auditory Steady-State Response (ASSR).

Recent studies have shown that the ASSR is modulated by streaming and attention, and we want to determine whether the MMN is also modulated by selective attention. Our engineering goal is to ascertain the signal-to-noise ratio and robustness of these markers and to assess their usability in real-time brain-computer interface (BCI) deployment. Current auditory-based EEG designs generally present one stream of sound. If these markers prove reliable, a competing stream can be presented in the auditory display. This would increase the number of items that can be selected at any one moment, and thereby increase the effective BCI information transfer bit-rate.


We wanted to investigate three potential signals that could identify auditory attentional processes.

Attentional modulation of the steady-state response. Attentional modulation of the neural representation of competing streams has previously been investigated using MEG (Xiang, Simon & Elhilali, 2007). That study showed that steady-state power and phase coherence were enhanced at the target rate; that is, attention enhances the neural representation of the attended stream. Here, we re-address attentional modulation of competing auditory streams, consisting of simple auditory stimuli presented at different rates, using EEG. We adopt a similar paradigm, with competing streams, each containing deviants, presented at different frequencies. In the previous study, stimuli were streams of identical pure tones separated by a frequency difference, and deviants were shifted in time relative to the standard tones. In the current study, we used harmonic complexes, which, unlike pure tones, are only partially resolved at the auditory periphery. Additionally, deviants consisted of frequency shifts rather than temporal shifts. As in Xiang et al. (2007), we address the build-up of attentional modulation over time. Because stream segregation requires temporal integration, we predict that attentional modulation builds up over time. In addition, we introduce conditions where one of the two streams is primed by presenting it prior to the onset of the second stream. It has previously been found that priming one stream in this manner causes listeners to segregate concurrent streams (REF); we therefore predict that the build-up of the attentional effect will occur over a shorter timescale when one stream is primed.

Attentional modulation of the mismatch negativity-P300 complex. In addition to attentional modulation of the steady-state EEG signal, we also address the neural response to the deviant stimuli, with the expectation that attention modulates these responses. The response to unexpected stimuli in a stream is commonly known as the mismatch negativity-P300 complex (MMN-P300). It is thought to reflect processes that generate predictions about upcoming sounds, together with attentional orientation toward a change that violates those predictions, which leads to an enhanced response. The MMN has been found to occur in the absence of attention; however, there is an ongoing debate about whether it is modulated by attention. In the current paradigm, we compare responses to deviants in the attended and unattended streams. This provides an additional neural signature of the attentional state of the listener, which could be exploited in a BCI paradigm. Presenting two streams at once and asking the listener to attend to one provides a more efficient way of accessing the content of the listener’s attention than presenting stimuli sequentially, effectively doubling the bit-rate.

Auditory distraction. Priming also allows us to explore potentials due to two kinds of attention switches: exogenous switches, which occur when the listener’s attention is attracted by salient stimuli in the unattended stream, and endogenous switches, where the listener volitionally switches attention from one stream to another.

Experimental Method

EEG Recording: We recorded EEG with a 32-channel BrainVision actiChamp system, together with one bipolar EOG channel (capturing both saccades and blinks) and 2 auxiliary channels fed from the speakers to synchronize the audio with the EEG recording.


Stimuli were harmonic complex tones (harmonics 1 to 10, duration: 125 ms). We synthesized 8-second sequences of tones, each consisting of two concurrent streams with tone blips presented at either 4 or 7 Hz. The fundamental frequency (f0, which determines the pitch) of the lower stream (f01) was between 200 and 300 Hz, and the f0 of the higher stream (f02) was between 400 and 600 Hz. The f0s of the concurrent streams were randomly selected on each trial. Deviants consisted of f0 shifts of 3 semitones, either up or down, with the direction randomized across trials.
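The stimulus description above can be sketched in a few lines of numpy. This is a minimal illustration, not the synthesis code used in the project: the audio sample rate, the equal harmonic amplitudes, the 10 ms onset/offset ramps, and the function names are all assumptions, and the deviant positions are arbitrary.

```python
import numpy as np

FS = 44100          # audio sample rate in Hz (assumed; not stated in the text)
TONE_DUR = 0.125    # tone duration in seconds (from the text)
N_HARMONICS = 10    # harmonics 1 to 10 (from the text)
RAMP = 0.01         # 10 ms raised-cosine on/off ramp (assumed)

def harmonic_complex(f0, fs=FS, dur=TONE_DUR, n_harm=N_HARMONICS, ramp=RAMP):
    """Equal-amplitude sum of harmonics 1..n_harm of f0, with on/off ramps."""
    t = np.arange(int(round(dur * fs))) / fs
    tone = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1))
    n_ramp = int(round(ramp * fs))
    edge = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    env = np.ones_like(t)
    env[:n_ramp] = edge
    env[-n_ramp:] = edge[::-1]
    return tone * env / n_harm          # scaled so that |sample| <= 1

def make_stream(f0, rate, seq_dur=8.0, deviant_idx=(), shift_semitones=3, fs=FS):
    """One tone every 1/rate seconds; tones listed in deviant_idx get a shifted f0."""
    out = np.zeros(int(round(seq_dur * fs)))
    for i in range(int(seq_dur * rate)):
        # A shift of n semitones multiplies f0 by 2**(n/12).
        f = f0 * 2 ** (shift_semitones / 12) if i in deviant_idx else f0
        tone = harmonic_complex(f, fs=fs)
        start = int(round(i / rate * fs))
        out[start:start + len(tone)] += tone[:len(out) - start]
    return out

# One example trial: random f0s in the stated ranges, arbitrary deviant positions.
rng = np.random.default_rng(0)
f01 = rng.uniform(200, 300)             # lower stream
f02 = rng.uniform(400, 600)             # higher stream
low = make_stream(f01, rate=4, deviant_idx={10}, shift_semitones=+3)
high = make_stream(f02, rate=7, deviant_idx={20, 41}, shift_semitones=-3)
mixture = low + high
```

In this sketch the 4 Hz rate is paired with the lower stream; swapping the `rate` arguments gives the counterbalanced pairing.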


The relative onset of the streams was varied. In one condition, the onsets were simultaneous. In the two remaining (priming) conditions, the duration of the entire sequence was 9 s and either the slower or the faster stream started first. When the 4 Hz stream started first, the first 5 tones of the 7 Hz sequence were removed. When the 7 Hz stream started first, the first 3 tones of the 4 Hz sequence were removed.
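To make the priming manipulation concrete, the sketch below lists tone onset times for the delayed stream in each priming condition. It assumes that both streams sit on a regular grid starting at time 0 and that removing tones simply silences the first few grid positions; the function name is hypothetical.

```python
import numpy as np

def tone_onsets(rate, seq_dur, n_removed=0):
    """Onset times (s) on a 1/rate grid, with the first n_removed tones dropped."""
    n_tones = int(round(seq_dur * rate))
    return np.arange(n_removed, n_tones) / rate

# 4 Hz stream primes: the 7 Hz stream loses its first 5 tones.
late_7hz = tone_onsets(rate=7, seq_dur=9.0, n_removed=5)
# 7 Hz stream primes: the 4 Hz stream loses its first 3 tones.
late_4hz = tone_onsets(rate=4, seq_dur=9.0, n_removed=3)

print(late_7hz[0], late_4hz[0])   # onset of the delayed stream in each condition
```

Under these assumptions the delayed stream starts 5/7 s (about 0.71 s) or 3/4 s (0.75 s) after the priming stream, so the head start is similar in the two priming conditions.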


The rate of the tone blips (4 or 7 Hz) was counterbalanced with the f0 of the stream (low or high). Deviants were presented in both the attended and unattended streams (0-3 deviants), so that the attended and unattended streams were statistically identical. Baseline conditions, with either the higher or the lower stream presented alone, were included to allow us to measure the MMN. Conditions were blocked, leading to 14 blocks of 32 trials.


In the primed conditions, listeners were instructed to attend to the sequence that started first. For an equal number of blocks with identical stimuli, listeners were asked to attend to the sequence that started second. To encourage listeners to attend to one stream, they counted the number of deviants in the attended stream. This also allowed us to examine attentional modulation of the MMN-P300 response. In blocks where the onsets of the two streams were simultaneous, the task was to attend to either the lower or the higher stream. For these two conditions, stimuli were identical and only the task changed.

EEG Analysis:

Preprocessing: After collecting the EEG data, we applied a preprocessing step to remove artifacts and noise. To remove muscle activity, we used the EMG channel to mark the time intervals in which muscle activity might mask the brain signal (electromyography, EMG, records the electrical activity produced by skeletal muscles), and then used PCA to project out this activity. The data were then bandpass filtered between 2 and 50 Hz. Finally, we applied Denoising Source Separation (DSS), which partitions recorded activity into stimulus-related and stimulus-unrelated components based on a criterion of stimulus-evoked reproducibility. Components that are not reproducible are projected out to obtain clean data.
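The reproducibility criterion behind DSS can be illustrated with a small numpy sketch. It is not the pipeline used here: the function name, the number of retained components, and the simulated data are assumptions made for illustration only. The idea is to whiten the data with the covariance of all single trials, then rotate so that components are ordered by how much of their power survives trial averaging.

```python
import numpy as np

def dss_evoked(epochs, n_keep):
    """Keep the n_keep most trial-reproducible components.

    epochs: array (n_trials, n_channels, n_samples), time-locked to the stimulus.
    Returns (cleaned epochs, reproducibility score per component).
    """
    x = epochs - epochs.mean(axis=2, keepdims=True)   # remove per-epoch DC offset
    n_trials, n_ch, n_samp = x.shape
    flat = x.transpose(1, 0, 2).reshape(n_ch, -1)
    c0 = flat @ flat.T / flat.shape[1]                # covariance of all single trials
    avg = x.mean(axis=0)
    c1 = avg @ avg.T / n_samp                         # covariance of the trial average
    # Whiten with c0 so that every direction has unit total power.
    evals, evecs = np.linalg.eigh(c0)
    keep = evals > evals.max() * 1e-9
    whiten = evecs[:, keep] / np.sqrt(evals[keep])
    # In whitened space, eigenvalues of the evoked covariance give the
    # fraction of each component's power that is reproducible across trials.
    scores, rot = np.linalg.eigh(whiten.T @ c1 @ whiten)
    order = np.argsort(scores)[::-1]
    to_dss = whiten @ rot[:, order]                   # channels -> components
    from_dss = c0 @ to_dss                            # components -> channels
    comps = np.einsum("ck,tcs->tks", to_dss[:, :n_keep], x)
    clean = np.einsum("ck,tks->tcs", from_dss[:, :n_keep], comps)
    return clean, scores[order]

# Simulated example (not real data): one evoked source plus sensor noise.
rng = np.random.default_rng(0)
fs, n_trials, n_ch, n_samp = 100, 50, 8, 200
t = np.arange(n_samp) / fs
evoked = np.outer(rng.normal(size=n_ch), np.sin(2 * np.pi * 5 * t))
epochs = evoked + rng.normal(size=(n_trials, n_ch, n_samp))
clean, scores = dss_evoked(epochs, n_keep=1)
err_raw = np.mean((epochs - evoked) ** 2)
err_clean = np.mean((clean - evoked) ** 2)
```

In this toy case a single retained component removes most of the noise that is not shared across trials. With real recordings the number of components to keep is a judgment call, usually based on how quickly the scores fall off.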

MMN analysis: To observe the MMN, two ERPs were extracted: the ERP time-locked to the onset of the oddball tone, and the ERP time-locked to the onset of the tone played 2/rate seconds (two tone periods) before the oddball. To correct for any residual drifts in the signal, each ERP was baseline-corrected separately by subtracting the mean activity of the 150 ms prior to the tone of interest. The standard-tone ERP was then subtracted from the oddball ERP to calculate the MMN. MMNs for attended and unattended streams were processed separately. To observe signals driven specifically by the MMN, and not by other factors such as stream rate, we averaged all MMNs together regardless of the rate at which the stream was played. In total, 576 attended and 576 unattended MMN trials were measured.
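The difference-wave computation described above can be sketched as follows. This is a minimal illustration on simulated data, not the analysis code used in the project: the function names, the 500 ms post-onset window, the EEG sample rate, and the simulated deflection are all assumptions.

```python
import numpy as np

def epoch(eeg, fs, onset_s, tmin=-0.15, tmax=0.5):
    """Cut (channels, samples) around an onset and subtract the pre-onset mean."""
    i0 = int(round(onset_s * fs))
    a = i0 + int(round(tmin * fs))
    b = i0 + int(round(tmax * fs))
    seg = eeg[:, a:b]
    n_base = i0 - a                               # samples in the 150 ms baseline
    return seg - seg[:, :n_base].mean(axis=1, keepdims=True)

def mmn_difference(eeg, fs, deviant_onsets, rate):
    """Deviant ERP minus the ERP of the tone two periods (2/rate s) earlier."""
    dev = np.mean([epoch(eeg, fs, t) for t in deviant_onsets], axis=0)
    std = np.mean([epoch(eeg, fs, t - 2.0 / rate) for t in deviant_onsets], axis=0)
    return dev - std

# Simulated example (not real data): a negative deflection 150 ms after each deviant.
rng = np.random.default_rng(1)
fs, rate = 500, 4
eeg = 0.05 * rng.normal(size=(2, 60 * fs))
deviant_onsets = [5.0, 12.25, 20.5, 31.0, 44.75]
tt = np.arange(eeg.shape[1]) / fs
for t0 in deviant_onsets:
    eeg -= 2.0 * np.exp(-0.5 * ((tt - (t0 + 0.15)) / 0.02) ** 2)

mmn = mmn_difference(eeg, fs, deviant_onsets, rate)
peak_latency = np.argmin(mmn[0]) / fs - 0.15      # seconds after tone onset
```

Running the same function separately on deviants from the attended and the unattended stream gives the two difference waves that are compared in this analysis.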

The graph below shows the average recorded ERPs in the two conditions: attended and unattended.

RON analysis: To explore the effects of reorientation, subjects were asked to attend to either the primed or the non-primed stream. For this analysis, two time points were of particular interest: the onset of the non-primed stream, and the onset of the first tone of the primed stream that occurred immediately after the introduction of the non-primed stream. ERPs were extracted at these time points and compared. Again, each ERP was baseline-corrected to the mean activity of the 150 ms prior to the onset. ERPs were also compared between blocks in which subjects were asked to attend to the primed stream and blocks in which they attended to the non-primed stream.

ASSR analysis: We used a standard multi-taper technique to obtain a stable power spectrum estimate of the EEG signals. The multi-taper method was used to compute the power at 4 Hz and 7 Hz over 1-second intervals, and the total power for a given trial was then computed by averaging over all time intervals and channels. Both power and phase-locking values (PLV) were calculated to assess the effect of attention on these two indices.
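The two ASSR indices can be sketched with numpy alone. Note the simplification: this sketch uses a single Hann taper per 1-second window rather than the multi-taper estimate described above, which would average spectra over several DPSS tapers. The function name, the EEG sample rate, and the simulated amplitudes are assumptions for illustration.

```python
import numpy as np

def assr_power_and_plv(trials, fs, freq, win_s=1.0):
    """Power and phase-locking value at freq from 1-s windows.

    trials: array (n_trials, n_channels, n_samples).
    Power is averaged over trials, channels and windows.
    PLV is the length of the mean unit phasor across trials,
    averaged over channels and windows.
    """
    n_win = int(round(win_s * fs))
    n_trials, n_ch, n_samp = trials.shape
    n_seg = n_samp // n_win
    segs = trials[:, :, :n_seg * n_win].reshape(n_trials, n_ch, n_seg, n_win)
    spec = np.fft.rfft(segs * np.hanning(n_win), axis=-1)
    k = int(round(freq * n_win / fs))             # FFT bin of the stream rate
    coef = spec[..., k]                           # (n_trials, n_ch, n_seg)
    power = np.mean(np.abs(coef) ** 2)
    plv = np.abs(np.mean(coef / np.abs(coef), axis=0)).mean()
    return power, plv

# Simulated example (not real data): a 4 Hz response that is larger when attended.
rng = np.random.default_rng(2)
fs, n_trials, n_ch, dur = 250, 20, 4, 8
t = np.arange(dur * fs) / fs
attended = 1.0 * np.sin(2 * np.pi * 4 * t) + rng.normal(size=(n_trials, n_ch, t.size))
unattended = 0.2 * np.sin(2 * np.pi * 4 * t) + rng.normal(size=(n_trials, n_ch, t.size))

p_att, plv_att = assr_power_and_plv(attended, fs, 4)
p_un, plv_un = assr_power_and_plv(unattended, fs, 4)
```

With 1-second windows, both 4 Hz and 7 Hz fall exactly on FFT bins, so no interpolation between bins is needed. Because the PLV discards amplitude and keeps only phase consistency, it is less sensitive to the overall level of the background spectrum than raw power.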


We found significant differences between the attended and unattended conditions for both the ASSR and the MMN, but we could not find a reliable RON signal between the primed and the unprimed stream. Specifically, significant differences (p<0.01) were seen in the MMNs at electrodes commonly believed to reflect auditory neurophysiological signals (e.g. Cz, CP2). These topographically organized differences are highlighted in the figure below, and align well with findings in the previous literature.

MMN (ERP and topography):

The figure below shows the MMN in the attended and unattended conditions. We saw significant modulation of the MMN depending on whether the stream was attended. Note that the topography of the average response difference map peaks around Cz, suggesting that the primary auditory cortices contribute the most to these differences.

ASSR Response: We looked at both the magnitude and the phase of the ASSR, as summarized in the figures below.

ASSR Power analysis: When the subject was primed to attend to the first stream (leftmost graph), there were no significant differences between the attended and unattended streams, most likely because the task was too easy and the subject did not need to attend closely to segregate the sounds. This is in contrast to the simultaneous-onset task (rightmost graph). This condition was the hardest, and it shows a clear power modulation (at 4 and 7 Hz) due to attention.

ASSR Phase analysis: Mirroring the results of the power analysis, there is a strong coherence difference due to the attentional state of the subject. Note that the signal-to-noise ratio for power is better at 4 Hz than at 7 Hz (due to the 1/f SNR degradation in the EEG signal), but the PLV analysis mitigates this 1/f drop-off.

In future work, we will collect data from more subjects and explore whether we can present more than 2 streams at a time to a subject, thereby further increasing the transfer bit-rate of auditory-based BCI.