Development of a method to compensate for signal quality variations in repeated auditory eventrelated potential recordings.  
Jump to Full Text  
MedLine Citation:

PMID: 20407635 Owner: NLM Status: PubMednotMEDLINE 
Abstract/OtherAbstract:

Reliable measurements are mandatory in clinically relevant auditory eventrelated potential (AERP)based tools and applications. The comparability of the results gets worse as a result of variations in the remaining measurement error. A potential method is studied that allows optimization of the length of the recording session according to the concurrent quality of the recorded data. In this way, the sufficiency of the trials can be better guaranteed, which enables control of the remaining measurement error. The suggested method is based on monitoring the signaltonoise ratio (SNR) and remaining measurement error which are compared to predefined threshold values. The SNR test is well defined, but the criterion for the measurement error test still requires further empirical testing in practice. According to the results, the reproducibility of average AERPs in repeated experiments is improved in comparison to a case where the number of recorded trials is constant. The testretest reliability is not significantly changed on average but the betweensubject variation in the value is reduced by 3335%. The optimization of the number of trials also prevents excessive recordings which might be of practical interest especially in the clinical context. The efficiency of the method may be further increased by implementing online tools that improve data consistency. 
Authors:

Antti K O Paukkunen; Miika M Leminen; Raimo Sepponen 
Related Documents
:

8192125  Analyzing preventive trials with generalized additive models. 2925965  A strategy to use soft data effectively in randomized controlled clinical trials. 20693015  Phase 1 of an in vitro neurotoxicological prevalidation trial. 20515215  Erratum of when wait lists are not feasible, nothing is a thing that does not need to b... 11726675  Collecting and managing multisource and multimethod data in studies of pediatric popula... 21868975  Texture discrimination based upon an assumed stochastic texture model. 
Publication Detail:

Type: Journal Article Date: 20100316 
Journal Detail:

Title: Frontiers in neuroengineering Volume: 3 ISSN: 16626443 ISO Abbreviation: Front Neuroeng Publication Date: 2010 
Date Detail:

Created Date: 20100421 Completed Date: 20110714 Revised Date: 20130813 
Medline Journal Info:

Nlm Unique ID: 101477961 Medline TA: Front Neuroeng Country: Switzerland 
Other Details:

Languages: eng Pagination: 2 Citation Subset:  
Affiliation:

Applied Electronics Unit, Department of Electronics, Helsinki University of Technology Espoo, Finland. 
Export Citation:

APA/MLA Format Download EndNote Download BibTex 
MeSH Terms  
Descriptor/Qualifier:


Comments/Corrections 
Full Text  
Journal Information Journal ID (nlmta): Front Neuroengineering Journal ID (publisherid): Front. Neuroeng. ISSN: 16626443 Publisher: Frontiers Research Foundation 
Article Information Download PDF Copyright ? 2010 Paukkunen, Leminen and Sepponen. openaccess: Received Day: 21 Month: 12 Year: 2009 Accepted Day: 25 Month: 2 Year: 2010 epreprint publication date: Day: 11 Month: 1 Year: 2010 Electronic publication date: Day: 16 Month: 3 Year: 2010 collection publication date: Year: 2010 Volume: 3Elocation ID: 2 ID: 2856627 PubMed Id: 20407635 DOI: 10.3389/fneng.2010.00002 
Development of a Method to Compensate for Signal Quality Variations in Repeated Auditory EventRelated Potential Recordings  
Antti K. O. Paukkunen^{1}^{2}*  
Miika M. Leminen^{3}^{4}  
Raimo Sepponen^{1}  
^{1}Applied Electronics Unit, Department of Electronics, Helsinki University of TechnologyEspoo, Finland 

^{2}Graduate School of Electrical and Communications Engineering, Helsinki University of TechnologyEspoo, Finland 

^{3}Cognitive Brain Research Unit, Institute of Behavioral Sciences, Helsinki UniversityHelsinki, Finland 

^{4}Finnish Centre of Excellence in Interdisciplinary Music ResearchHelsinki, Finland 

[editedby] Edited by: Ulrich Hofmann, University of Lubeck, Germany [editedby] Reviewed by: Nabeel Anwar, Tokyo Institute of Technology, Japan; Martin Schurmann, University of Nottingham, UK; Mehrnaz Hazrati, University of Lubeck, Germany; Lisa Marshall, University of Lubeck, Germany Correspondence: *Correspondence: Antti K. O. Paukkunen, Helsinki University of Technology, Department of Electronics, Applied Electronics Unit, Otakaari 7B, Espoo 02150, Finland. email: antti.paukkunen@tkk.fi 
The averaging of the evoked potentials is a convenient way to reduce the level of interference when highlighting eventlocked electroencephalographic (EEG) responses (Picton et al., ^{1995}). Thus, in spite of the increasing interest in singletrial analysis, averaging is still the most commonly used method to recover eventrelated potentials (ERPs) from noise (Davila and Mobin, ^{1992}). The average ERPs, however, are distorted by, for example, latency jitter (e.g. Gibbons and Stahl, ^{2007}), amplitude variations, external interference, physiological artifacts, noise, and measurement errors during the experiment (Leonowicz et al., ^{2005}). In addition to that, the withinsubject and the betweensubject variations in the responses also affect the measurement outcome (e.g. Pikvik et al., ^{1993}; Lang et al., ^{1995}). Even though the average level of interference is reduced in this process (Picton et al., ^{1995}), the testretest reliability of the responses on a singlesubject level is too low for clinical applications (Lang et al., ^{1995}; Sinkkonen and Tervaniemi, ^{2000}; Dalebout and Fox, ^{2001}; Beauchemin and De Beaumont, ^{2005}; Kujala et al., ^{2007}; Duncan et al., ^{2009}). The mismatch negativity, for example, has been estimated to have a testretest reliability of just over 0.5 (Sinkkonen and Tervaniemi, ^{2000}), 0.46?0.71 (FrodlBauch et al., ^{1997}), or 0.56?67 (Pekkonen et al., ^{1995}).
The replicability of the responses depends on the test subject, type and presentation of the stimulus, the response studied, and the electrode location (Pekkonen et al., ^{1995}; FrodlBauch et al., ^{1997}; Sinkkonen and Tervaniemi, ^{2000}). The testretest reliability of the N100 response, for example, is higher than that of the mismatch negativity (MMN) because of the better latency stability (Pekkonen et al., ^{1995}). In addition, the variation in the remaining level of interference also affects the testretest reliability (e.g. Sinkkonen and Tervaniemi, ^{2000}). This explains why the replicability of the response to the standard sound tends to be superior to the less frequent ones (Pekkonen et al., ^{1995}). The testretest reliability can be influenced by careful experiment design. However, as the initial amount of noise and the level of interference will vary anyway, the quality variation should also be compensated.
The basic problem is that the number of trials recorded is typically constant, while the initial quality of the signal changes between and during experiments. Consequently, the remaining measurement error is subject to variation which affects the comparability of the responses and the testretest reliability. Several methods have been suggested for cleaning the data (e.g. Effern et al., ^{2000}; Quiroga, ^{2000}; He et al., ^{2004}) and to adjust the averaging process to cope with the alternating signal quality (e.g. Woody, ^{1967}; Davila and Mobin, ^{1992}; Woldorff, ^{1993}; Jaskowski and Verleger, ^{1999}; Wang et al., ^{2001}; Leonowicz et al., ^{2005}; Gibbons and Stahl, ^{2007}). However, even though the average signal quality could be improved, this does not compensate the insufficiency of the data. Alternatively, it would also be possible to increase the length of the experiments. In practice, however, fatigue, for example, may affect the neuropsychological phenomena (Picton et al., ^{1995}; Ding and Ye, ^{2003}; Boksem et al., ^{2005}; MullerGass et al., ^{2005}; Thornton, ^{2008}) and the risks of equipmentrelated errors also increase with time (M?hler and von Specht, ^{1999}; Rahne et al., ^{2008}). As the amount of artifacts and sudden changes in the signal quality are not predictable, the sufficient number of trials cannot be reliably predefined (M?cks et al., ^{1988}; Sinkkonen and Tervaniemi, ^{2000}). Thus, the length of the experiment should rather be defined online with respect to the concurrent data quality.
In this study, the implementation of a novel autoadaptive recording procedure for auditory eventrelated potentials (AERP) is suggested. It aims at optimizing the length of the experiment with respect to the initial quality of the data recorded. The qualities being monitored are the contribution of noise and the remaining measurement error. The idea is to keep on recording until they meet a predefined threshold. This way, the quality of the results is guaranteed and the length of the experiment is optimized. The objectives of the study are to define appropriate estimators for the quality tests and to study the influence of the compensation on the testretest reliability in a simulated AERP experiment. Feasibility, technical requirements and developmental aspects will be discussed on the basis of the results.
The procedure involved cycling three phases (Figure 1): preprocessing, quality estimation, and decision making. The analysis was performed trialbytrial and the decision as to whether to continue the experiment or not was made at the end of each cycle. First, the trial being analyzed was filtered and tested to detect possible artifacts and the contaminated trials were rejected. Then, the maturity of the accumulating average was estimated on the basis of the contribution of noise and the remaining measurement error. They were compared to predefined threshold values and the experiment was concluded if both the criteria were met.
The data used in the demonstration was taken from an MMN study (Pakarinen et al., ^{2007}). Each trial lasted from ?50 to +450?ms from the stimulus onset. The data was lowpass filtered (0?30?Hz) and the baseline was corrected on the basis of the 50?ms prestimulus interval. These are typical filter settings for this kind of data (Pekkonen et al., ^{1995}; Pakarinen et al., ^{2007}; Duncan et al., ^{2009}). Artifacts were detected by testing the amplitude of the response and trials exceeding???40??V were rejected.
Additionally, variation in the physiological response may also affect the accumulating average. Latency variation, in particular, is harmful because it flattens the amplitude of the peaks in the average waveform (e.g. Thornton, ^{2008}). Several methods have been suggested to correct the error (e.g. Woody, ^{1967}; Woldorff, ^{1993}; Wang et al., ^{2001}). None of these, however, were considered to be feasible online and, thus, were not included into the procedure.
The purpose of the noise test was to evaluate the mean level of interference in the accumulating average waveform. In particular, the test was supposed to indicate the contribution of white noise so that the significance of the responses could be assessed.
Ttest is commonly used to test the statistical significance of noise over the single trials in AERP studies (e.g. Picton et al., ^{1995}). By definition, it estimates the probability of the null hypothesis that the value being tested is a random sample from a normal distribution with a mean of zero and unknown variance, against the alternative that the mean is different from zero. Typically, if the probability of the null hypothesis is smaller than 0.05, it is likely that the contribution of noise is negligible or small (Ewens and Grant, ^{2005}).
In the current context, the onesample ttest could be applied by testing a peak in the average response, the statistical significance of which would represent the general quality of the whole waveform. The auditory N100 peak in the average waveform, for example, is relatively stable and typically easy to obtain. Thus, it could be a wellsuited option for the purpose. On the other hand, the application of the ttest requires exact knowledge of the latency of the peak to be tested and the localization may prove laborious because the latency varies between trials. Thus, a simpler estimate would be preferable in terms of computational efficiency.
Alternatively, the statistical significance of noise may also be estimated from the respective signaltonoise ratio (SNR) (Sackett, ^{2001}). Compared to the onesample ttest, the computation of the SNR is simpler and it can also be applied to test the whole waveform at once. Although the accuracy of the estimation might be worse, it may still be capable of providing the number of trials required to obtain significant responses with reasonable accuracy. Derived from M?cks et al. (^{1988}), the effective SNR can be estimated from the difference of the consecutive trials by eqs 1?3.
(1)
Pavg,n=12(N?1)?i=1N?1P(xi+1?xi) 
(2)
Pavg,s=P(?N)?Pavg,n/N 
(3)
SNR=N?Pavg,sPavg,n 
where
(4)
P(xi)=1T?t0?t=t0t=Txi2(t) 
Here, P_{avg,n} represents the average power of the noise, P_{avg,s} represents the power of the average response, ?_{N} is the average of N trials, x_{i} represents the ith trial, and the time frame of the trial is [t_{0}, T]. Furthermore, the SNR values exceeding one are interpreted to indicate zero significance, because statistical significance may only have values between 0 and 1.
The purpose of testing the remaining measurement error was to be able to estimate the stability of the accumulating results and the magnitude of the change that could still be expected. In particular, the test was supposed to indicate the magnitude of the remaining interference peaks and the uncertainty in the average waveform resulting from variations in the physiological response. In order to estimate the measurement error, two alternative estimators were considered: a convergencebased estimator and a direct estimator of the measurement error. Both alternatives derive from comparing partial averages of the data and the computation of the estimates is simple.
The convergence rate represents the magnitude of change in the average waveform as new trials are included into the sum. It can be calculated online by comparing the consecutive averages. The comparison is made with respect either to the amplitude or the form of the responses. The similarity of the form of the responses can be estimated on the basis of the correlation. The calculation of the parameter is straightforward and it provides a reasonable estimate of the similarity of the whole waveform. On the other hand, the form of the whole curve is often irrelevant and it would be more interesting to estimate the maximal error in the peak amplitudes. This could be assessed better from the differences in the consecutive averages. Thus, the convergence ratebased error estimate ?_{c} was calculated according to eqn. 5.
(5)
?c=max?[1N?i=1N(xi)?1N?1?i=1N?1(xi)] 
Alternatively, the measurement error can also be estimated directly by computing the difference of two independent averages extracted from the data. Following Schimmel (^{1967}), the direct estimate of the measurement error ?_{d} was calculated from the difference of the average of the odd and the even trials according to eqn. 6.
(6)
?d=12?max?{2N?i=1,3,N?1(xi)?2N?i=2,4,N(xi)} 
The methods applied in this study were evaluated by simulating their performance with real data. With the authors? permission, the data used were taken from Pakarinen et al. (^{2007}). In that study, the multifeature MMN paradigm was applied to obtain an auditory discrimination profile by using harmonic sinusoidal sounds. The standard tones were composed of three sinusoidal partials (523, 1046 and 1569?Hz). They were presented at an intensity of 60?dB above the subject's hearing threshold and the length of the stimuli was 75?ms (including 5?ms rise and fall times). The deviant tones differed from the standard tones in frequency, intensity, duration, or perceived soundsource location and the magnitude of the deviation varied across six levels. The probability of the standard tone was 0.5 and the probability of each deviant tone at each level was 0.125/6. The length of each recording session was 90?min and the number of trials recorded was 5472 for the standard stimulus and 225 for each deviant stimulus (Pakarinen et al., ^{2007}).
For the concurrent study, data from eight test subjects was available. These data were used to simulate a series of repeated experiments with each test subject and they were each permuted 100 times to create artificial test runs. However, in order to avoid creating permutations that were too similar, datasets that were too small had to be rejected. The expected length of a single simulated experiment was about 200?600 trials. Therefore, only the standard datasets were applied. In addition, the evaluation was further limited to only the one channel having the largest SNR with respect to the physiological response (F_{z} referenced to the mean of the mastoids). Thus, the test data accepted for the study included eight datasets with over 5000 trials, from each of which 5% to 19% was still rejected because of artifacts.
In short, the quality estimators were tested first in order to verify the plausibility of the tests and to be able to define appropriate thresholds to be applied in the online procedure. Then, the procedure was finalized on the basis of the results and the benefits from using the active compensation were studied by comparing the application with the use of a fixed number of 200 trials. The tests prepared for the evaluation are summarized in Table 1.
The plausibility of the SNR estimator was evaluated by comparing the application with the use of a onesample ttest. The ttest was made for the N1 component of the average AERP waveform. N1 was parameterized using an average time window of 40?ms placed at the peak maximum of an individual waveform at 80?140?ms. First, the SNR was computed as a function of the statistical significance of noise in order to determine a criterion that would correspond to the ttest (p_{noise} <?0.05). Then, both parameters were applied to estimate the required number of trials to obtain a statistically significant response in order to see how they match. In order to justify the use of the SNR test, the correlation was supposed to be high. The limit for approval was chosen to be 0.8, which indicates high correlation.
The measurement error estimators were evaluated by studying their relevance to the validity of the measurement outcome. First, the estimators were computed as a function of the validity of the respective partial averages. Then they were compared on the basis of the sensitivity, distortion and the amount of the calculus required. Validity was estimated by the correlation of the partial averages and the expected outcome. The expected outcome was estimated by computing the sorted average (Rahne et al., ^{2008}) of each dataset. The trials were sorted on the basis of the respective interference level and they were averaged starting from the best one, until the SNR of the average started to decrease. The remaining trials were discarded and the average with the maximum SNR was considered to be the optimal outcome. Sorted averaging maximizes the SNR of the average curve (Rahne et al., ^{2008}). Thus, it was considered to be a wellsuited reference for the test.
The last parts of the evaluation dealt with the actual application of the procedure. First the use of the selected measurement error estimator was studied with respect to the feasibility in order to define two alternative threshold values for the test. Two such values were defined: one that would lead to recording 200 trials on average and one that represented the most stringent requirements still plausible in most cases (p?=?0.6). Then, the application was simulated again with the concluded thresholds and the resulting testretest reliability was compared with the results obtained when the total number of trials was fixed at 200. The feasibility was estimated by the probability of the requirements being reached with fewer than 300 trials on average. Testretest reliability was computed using the oneway intraclass correlation coefficient (ICC), because it provides more accurate results than the simple Pearson's correlation coefficient (Farahat et al., ^{2003}).
According to the results (Figure 2A), in general, the SNR estimation indicated a smaller significance than the ttest and the correlation of the single observations was only moderate (p?? 0.5). A?probable reason for this was that the SNR test assessed the average significance of noise in the whole waveform, while the ttest focused on a single peak in the response. On the other hand, the SNR estimator could still be applied to verify the realization of the significance criterion (p_{noise} <?0.05). Generally, it was met when the SNR exceeded 0.69. Using this value as a detection threshold, the required number of trials only exceeded the results from the ttest by 4???17 (mean ??SD) trials (Figure 2B). The correlation of the estimations was about 0.86, which suggests that the SNR estimator could be considered to be plausible.
According to the results (Figure 3), the convergencebased estimate was a linear function of the validity, while the direct estimate had a quadratic relation. Thus, the convergencebased estimator would be simpler to apply. On the other hand, both estimators were quite distorted and the direct estimator was more sensitive (3.6??V/unit) than the other one (0.42??V/unit). In this respect, the direct estimate would be a better choice. In terms of computation time, the difference between the alternatives was considered to be insignificant. They both included an update operation, difference calculation and finding the maximum. The direct estimator also included an additional absolute value operation which, however, did not cause significant difference to the computation time. Thus, as a whole, the direct estimator was considered to be preferable and was chosen to be implemented into the procedure.
Figure 4A presents the required number of trials, the feasibility of the test, and the testretest reliability with respect to the remaining measurement error as given by the direct error estimate. The criterion that is achieved with 200 trials on average is 1.5??V and the criterion that has a feasibility of 0.6 is 1.2??V. These values were applied in the last phase of the evaluation. The resulting testretest reliabilities in the simulated repeated experiments are presented in Figure 4B.
On the basis of the results (Table 2), the variation in the testretest reliability between test subjects was reduced by 0.1 and the average testretest reliability was improved by 0.01?0.06, depending on the measurement error threshold applied. The required number of trials in the novel procedure was 209???60 (mean ??SD) when the threshold was 1.5??V and about 310???78 (mean?? SD) when the threshold was 1.2??V. Thus, the variation between the test subjects was improved, although the average number of trials recorded was not greatly altered. The increase in the mean testretest reliability, on the other hand, was achieved by increasing the number of trials. Thus, the feasibility dropped with the improved testretest reliability.
The results of this study show that the compensation for the variation in the data quality has significant relevance for the remaining measurement error and the reproducibility of the average AERPs. In the simulated series of repeated experiments, the testretest reliability was improved by 2?9%, on average, and the variation in the value between the test subjects was reduced by 33?35%. In addition, the compensation also improved the measurement efficiency. When the number of trials recorded was a constant 200, the average resulting testretest reliability was about 0.69. In the novel procedure, the number of recorded trials was optimized and similar results could be achieved with about 209???60 (mean?? SD) trials, depending on the data quality. Thus, the length of the experiment could be decreased when the signal quality was high enough (about 40 % of simulation runs).
The findings of the study might be of practical interest particularly in the clinical context. On the one hand, the method suggested will simplify the comparison of results from repeated measurements. The level of distortion is minimized, while the number of trials is optimized. Thus, the differences between repeated measurements are more likely to occur because of the differences in the neuropsychological condition of the test subject. On the other hand, the improvement of the measurement efficiency is also an important factor. If the number of recorded trials is constant and the quality of the outcome is supposed to be high, the number of trials recorded must also be high in order to minimize the possibility of error. Some groups of patients, such as children or acutely ill patients, may not tolerate long experiments. The optimization of the number of trials prevents excessive recording, while the sufficiency of the data to obtain meaningful AERPs is still guaranteed. Thus, application of this kind of active compensation might reduce the negative impact of the investigation on them.
Regarding future developments, the efficiency of the method might be increased if the cleaning procedure and the artifact processing scheme were improved. Although most of the results were in line with each other, the results obtained using dataset 8 seem to deviate from the others. The estimation of the statistical significance was less accurate and the testretest repeatability deviated from the others, particularly when the number of trials included was kept constant. This suggests that the variation in the quality of the data was higher than in the other datasets. Thus, the implementation of advanced means for signal processing might improve the quality of the results.
The findings of the study indicate that the method suggested could be applied to compensate for the changes in the quality between repeated AERP measurements. Practical investigations, however, are needed to confirm the results and to gain further information about application of the method, the possible issues and the impact in practice. From the practical perspective, the predefinition of the measurement error criterion is particularly interesting. The suitable threshold is applicationspecific and depends on the response studied. Thus, the recommendations have to be prepared case sensitively. In general, the successful implementation of the method requires high signal quality, a reliable method for online artifact rejection, and a device capable of both recording and analyzing the EEG data. The efficiency may be further improved by implementing an advanced tool for online data processing.
The authors declare that the research was conducted in the absence of any commercial or financial relationship that could be construed as giving rise to a potential conflict of interest.
We thank M. Linnavuo and P. Eskelinen for valuable discussions. The work was supported in part by the Graduate School of Electrical and Communications Engineering, the Society of Electronics Engineers and the Finnish Centre of Excellence in Interdisciplinary Music?Research.
References
Beauchemin M.,De Beaumont L.. (Year: 2005). Statistical analysis of the mismatch negativity: to a dilemma, an answer. Tutorials Quant. Methods Psychol.1, 18?24  
Boksem M. A. S.,Meijman T. F.,Lorist M. M.. (Year: 2005). Effects of mental fatigue on attention: an ERP study. Cogn. Brain Res.25, 107?11610.1016/j.cogbrainres.2005.04.011  
Dalebout S. D.,Fox L. G.. (Year: 2001). Reliability of the mismatch negativity in the responses of individual listeners. J. Am. Acad. Audiol.12, 245?25311392436  
Davila C. E.,Mobin M. S.. (Year: 1992). Weighted averaging of evoked potentials. IEEE Trans. Biomed. Engry39, 338?34510.1109/10.126606  
Ding H. Y.,Ye D. T.. (Year: 2003). The extraction of MMN modulated by attention. IEEE Int. Conf. neural signal processing2003, Dec 14?17, China.  
Duncan C. C.,Barry R. J.,Connolly J. F.,Fischer C.,Michie P. T.,N??t?nen R.,Polich J.,Reinvang I.,Van Petten C.. (Year: 2009). Eventrelated potentials in clinical research: guidelines for eliciting, recording, and quantifying mismatch negativity, P300, and N400. Clin. Neurophysiol.120, 1883?190810.1016/j.clinph.2009.07.04519796989  
Effern A.,Lehnertz K.,Fernandez G.,Grunwald T.,David P.,Elger C. E.. (Year: 2000). Single trial analysis of event related potentials: nonlinear denoising with wavelets. Clin. Neurophysiol.111, 2255?226310.1016/S13882457(00)00463611090779  
Ewens W. J.,Grant G.. (Year: 2005). Statistical Methods in Bioinformatics. New York, SpringerVerlag10.1007/b137845  
Farahat M. F.,Rohlman D. S.,Storzbach D.,Ammerman T.,Anger W. K.. (Year: 2003). Measures of shortterm testretest reliability of computerized neurobehavioral tests. Neurotoxicology24, 513?52110.1016/S0161813X(03)00079212900064  
FrodlBauch T.,Kathmann N.,M?ller H.J.,Hegerl U.. (Year: 1997). Dipole localization and testretest reliability of frequency and duration mismatch negativity generator processes. Brain Topogr.10, 3?810.1023/A:10222149054529358949  
Gibbons H.,Stahl J.. (Year: 2007). Responsetime corrected averaging of event related potentials. Clin. Neurophysiol.118, 197?20810.1016/j.clinph.2006.09.01117070732  
He P.,Wilson G.,Russel C.. (Year: 2004). Removal of ocular artifacts from electroencephalogram by adaptive filtering. Med. Biol. Eng. Comput.42, 407?41210.1007/BF0234471715191087  
Jaskowski P.,Verleger R.. (Year: 1999). Amplitudes and latencies of singletrial ERP's estimated by a maximumlikelihood method. IEEE Trans. Biomed. Eng.46, 987?99310.1109/10.77540910431464  
Kujala T.,Tervaniemi M.,Schr?ger E.. (Year: 2007). The mismatch negativity in cognitive and clinical neuroscience: Theoretical and methodological considerations. Biol. Psychol.74, 1?1910.1016/j.biopsycho.2006.06.00116844278  
Lang A. H.,Eerola O.,Korpilahti P.,Holopainen I.,Salo S.,Aaltonen O.. (Year: 1995). Practical issues in the clinical application of mismatch negativity. Ear. Hear.16, 118?13010.1097/00003446199502000000097774765  
Leonowicz Z.,Karvanen J.,Shishkin S.. (Year: 2005). Trimmed estimators for robust averaging of eventrelated potentials. J. Neurosci. Methods142, 17?2610.1016/j.jneumeth.2004.07.00815652613  
M?cks J.,Gasser T.,K?hler W.. (Year: 1988). Basic statistical parameters of eventrelated potentials. J. Psychophysiol.2, 61?70  
M?hler R.,von Specht H.. (Year: 1999). Sorted averaging ? principle and application to auditory brainstem responses. Scand. Audiol.28, 145?14910.1080/01050399942471610489863  
MullerGass A.,Stelmack R. M.,Campbell K. B.. (Year: 2005). ??and were instructed to read a selfselected book while ignoring the auditory stimuli?: the effects of task demands on the mismatch negativity?. Clin. Neurophysiol.116, 2142?215210.1016/j.clinph.2005.05.01216029961  
Pakarinen S.,Takegata R.,Rinne T.,Huotilainen M.,N??t?nen R.. (Year: 2007). Measurement of extensive auditory discrimination profiles using mismatch negativity (MMN) of the auditory eventrelated potential (ERP). Clin. Neurophysiol.118, 177?18510.1016/j.clinph.2006.09.00117070103  
Pekkonen E.,Rinne T.,N??t?nen R.. (Year: 1995). Variability and replicability of the mismatch negativity. Electroenceph. Clin. Neurophysiol.96, 546?55410.1016/00134694(95)00148R7489676  
Picton W. T.,Lins O. G.,Scherg M.. (Year: 1995). The recording and analysis of eventrelated potentials. In Handbook of Neuropsychology, Boller F.,Grafman J. eds (Amsterdam, Elsevier Science B.V.), pp. 3?73  
Pikvik R. T.,Broughton R. J.,Coppola R.,Davidson R. J.,Fox N.,Nuwer M. R.. (Year: 1993). Guidelines for the recording and quantitative analysis of electroencephalographic activity in research contexts. Psychophysiology30, 547?55810.1111/j.14698986.1993.tb02081.x8248447  
Quiroga R. Q.. (Year: 2000). Obtaining single stimulus evoked potentials with wavelet denoising. Physica D145, 278?29210.1016/S01672789(00)001160  
Rahne T.,von Specht H.,M?hler R.. (Year: 2008). Sorted averaging ? application to auditory eventrelated responses. J. Neurosci. Methods172, 74?7810.1016/j.jneumeth.2008.04.00618499265  
Sackett D. L.. (Year: 2001). Why randomized controlled trials fail but needn't: 2. Failure to employ physiological statistics, or the only formula a cliniciantrialist is ever likely to need (or understand!). CMAJ165, 1226?123711706914  
Schimmel H.. (Year: 1967). The (?) reference: accuracy of estimated mean components in average response studies. Science157, 92?9410.1126/science.157.3784.926026675  
Sinkkonen J.,Tervaniemi M.. (Year: 2000). Towards optimal recording and analysis of the mismatch negativity. Audio Neurootol.5, 235?24610.1159/000013885  
Thornton A. R. D.. (Year: 2008). Evaluation of a technique to measure latency jitter in eventrelated potentials. J. Neurosci. methods168, 248?25510.1016/j.jneumeth.2007.09.03118006068  
Wang K.,Begleiter H.,Porjesz B.. (Year: 2001). Warpaveraging eventrelated potentials. Clin. Neurophysiol.112, 1917?192410.1016/S13882457(01)00640X11595152  
Woldorff M. G.. (Year: 1993). Distortion of ERP averages due to overlap from temporally adjacent ERPs: analysis and correction. Psychophysiology30, 98?1198416067  
Woody C. D.. (Year: 1967). Characterization of an adaptive filter for the analysis of variable latency neuroelectric signals. Med. Biol. Eng. Comput.5, 539?553 
Figures
Tables
Table 1. Evaluation of the novel procedure and the quality estimators applied. The estimators are evaluated first to justify their use and to define appropriate criteria for the tests. Then, the benefits of using the compensated procedure are studied by comparing the application with the use of a fixed number of 200 trials.
Test  Test method  Tested parameter(s)  Aim 

Evaluation of the SNR test  Comparison of SNR test and onesample ttest  Estimated number of trials to obtain significant N1  1. To define SNR test criterion 2. To estimate plausibility 
Evaluation of the meas. error estimators  Comparison of direct and convergencebased estimator  Sensitivity, validity, distortion, calculus  1. To study their preference with respect to the application in the procedure 
Evaluation of feasibility of requirements  Comparison of the criteria and the required number of trials  Number of trials and remaining meas. error  To define error test criteria: 1. criterion???N???200, on average 2. criterion???feasibility???0.6 
Evaluation of the benefits of compensation  Comparison of the use of the novel procedure and the use of a fixed number of 200 trials  Testretest reliability  To test the influence on: 1. mean testretest reliability 2. variation of testretest reliability 
Results from the simulated series of repeated AERP experiments with eight test subjects. The tests were made by using the novel procedure (SNR >?0.69 and ERR <?1.5??V/1.2??V) and a comparative procedure (N?=?200). Testretest reliability (ttr) was estimated by using ICC. ? represents the mean value, ? represents the standard deviation and ? represents the range of variation.
Procedure  SNR threshold  Measurement error threshold (?V)  Required number of trials (?????)  ? (ttr)  ? (ttr) 

Fixed number of trials  ?  ?  200???0  0.689  0.309 
Novel procedure  0.69  1.5  209???60  0.701  0.206 
Novel procedure  0.69  1.2  310???78  0.753  0.202 
Article Categories:
Keywords: adaptive signal processing, electroencephalography, evoked potentials, measurement error, signaltonoise ratio, testretest reliability. 
Previous Document: Criteria for optimizing cortical hierarchies with continuous ranges.
Next Document: Cholinergic neuromodulation controls directed temporal communication in neocortex in vitro.