Document Detail

Statistical properties of interval mapping methods on quantitative trait loci location: impact on QTL/eQTL analyses.
Jump to Full Text
MedLine Citation:
PMID:  22520935     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
BACKGROUND: Quantitative trait loci (QTL) detection on a huge amount of phenotypes, like eQTL detection on transcriptomic data, can be dramatically impaired by the statistical properties of interval mapping methods. One of these major outcomes is the high number of QTL detected at marker locations. The present study aims at identifying and specifying the sources of this bias, in particular in the case of analysis of data issued from outbred populations. Analytical developments were carried out in a backcross situation in order to specify the bias and to propose an algorithm to control it. The outbred population context was studied through simulated data sets in a wide range of situations.The likelihood ratio test was firstly analyzed under the "one QTL" hypothesis in a backcross population. Designs of sib families were then simulated and analyzed using the QTL Map software. On the basis of the theoretical results in backcross, parameters such as the population size, the density of the genetic map, the QTL effect and the true location of the QTL, were taken into account under the "no QTL" and the "one QTL" hypotheses. A combination of two non parametric tests - the Kolmogorov-Smirnov test and the Mann-Whitney-Wilcoxon test - was used in order to identify the parameters that affected the bias and to specify how much they influenced the estimation of QTL location.
RESULTS: A theoretical expression of the bias of the estimated QTL location was obtained for a backcross type population. We demonstrated a common source of bias under the "no QTL" and the "one QTL" hypotheses and qualified the possible influence of several parameters. Simulation studies confirmed that the bias exists in outbred populations under both the hypotheses of "no QTL" and "one QTL" on a linkage group. The QTL location was systematically closer to marker locations than expected, particularly in the case of low QTL effect, small population size or low density of markers, i.e. designs with low power. Practical recommendations for experimental designs for QTL detection in outbred populations are given on the basis of this bias quantification. Furthermore, an original algorithm is proposed to adjust the location of a QTL, obtained with interval mapping, which co located with a marker.
CONCLUSIONS: Therefore, one should be attentive when one QTL is mapped at the location of one marker, especially under low power conditions.
Authors:
Xiaoqiang Wang; Hélène Gilbert; Carole Moreno; Olivier Filangi; Jean-Michel Elsen; Pascale Le Roy
Related Documents :
22555065 - The development, preliminary validation and clinical utility of a shoe model to quantif...
22673395 - Modeling stochasticity and variability in gene regulatory networks.
22368455 - Attitude determination using a mems-based flight information measurement unit.
22259725 - A comparison of intensive care unit mortality prediction models through the use of data...
19354025 - A long-term forecast of mrsa daily burden using logistic modeling.
19639485 - A finite element study of invariant-based orthotropic constitutive equations in the con...
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2012-04-20
Journal Detail:
Title:  BMC genetics     Volume:  13     ISSN:  1471-2156     ISO Abbreviation:  BMC Genet.     Publication Date:  2012  
Date Detail:
Created Date:  2012-06-29     Completed Date:  2012-09-25     Revised Date:  2013-06-26    
Medline Journal Info:
Nlm Unique ID:  100966978     Medline TA:  BMC Genet     Country:  England    
Other Details:
Languages:  eng     Pagination:  29     Citation Subset:  IM    
Affiliation:
INRA, UMR 1348, PEGASE, 35042 Rennes cedex, France.
Export Citation:
APA/MLA Format     Download EndNote     Download BibTex
MeSH Terms
Descriptor/Qualifier:
Algorithms
Chromosome Mapping*
Computer Simulation
Genetic Linkage
Genetics, Population / methods
Humans
Models, Genetic
Phenotype
Quantitative Trait Loci / genetics*
Regression Analysis
Software
Transcriptome* / genetics
Comments/Corrections

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): BMC Genet
Journal ID (iso-abbrev): BMC Genet
ISSN: 1471-2156
Publisher: BioMed Central
Article Information
Download PDF
Copyright ©2012 Wang et al; licensee BioMed Central Ltd.
open-access:
Received Day: 18 Month: 9 Year: 2011
Accepted Day: 20 Month: 4 Year: 2012
collection publication date: Year: 2012
Electronic publication date: Day: 20 Month: 4 Year: 2012
Volume: 13First Page: 29 Last Page: 29
ID: 3386024
Publisher Id: 1471-2156-13-29
PubMed Id: 22520935
DOI: 10.1186/1471-2156-13-29

Statistical properties of interval mapping methods on quantitative trait loci location: impact on QTL/eQTL analyses
Xiaoqiang Wang12 Email: Xiaoqiang.Wang@rennes.inra.fr
Hélène Gilbert34 Email: Helene.Gilbert@toulouse.inra.fr
Carole Moreno5 Email: Carole.Moreno@toulouse.inra.fr
Olivier Filangi12 Email: Olivier.Filangi@rennes.inra.fr
Jean-Michel Elsen5 Email: Jean-Michel.Elsen@toulouse.inra.fr
Pascale Le Roy12 Email: Pascale.LeRoy@rennes.inra.fr
1INRA, UMR 1348, PEGASE, 35042 Rennes cedex, France
2Agrocampus Ouest, UMR 1348, PEGASE, 35042 Rennes cedex, France
3INRA, UMR 1313, GABI, 78352 Jouy en Josas cedex, France
4INRA, UMR 444, LGC, 31326 Castanet Tolosan cedex, France
5INRA, UMR 631, SAGA, 31326 Castanet Tolosan cedex, France

Background

For the last decade, several studies have shown that a large proportion of QTL are mapped at the markers locations whenever linkage analysis is applied. As to what regards dataset analyses, this bias first raised doubts in Spelman et al. [1], who observed a large proportion of significant test statistics at marker location when looking for QTL in five milk production traits. Walling et al. [2] have described the influence of markers in constructing the confidence intervals of QTL location and questioned whether QTL location was biased towards the location of markers instead of its true position. By applying the regression coefficients on the markers as suggested by Whittaker et al. [3], Walling et al. [4] calculated the proportion of putative QTL located at marker positions in a backcross population. They have reported a systematic bias for the estimated QTL position under the null hypothesis of the test, i.e. the hypothesis of no QTL on the linkage group. Moreover, results from linear regression methods for QTL detection have been reported to behave the same way the results from maximum likelihood methods in interval mapping approaches do [5]. The simulation studies by Walling et al. [4] have confirmed that these two approaches have similar biases on the estimated QTL position in a backcross population.

These previous works have shown that a bias on the QTL location occurs when genetic linkage analysis for QTL mapping is used in a backcross population. However, little research has been devoted to establishing which parameters give rise to that bias. What is more, no study has investigated how it affects linkage analysis applied to outbred populations. In order to address these shortcomings, the present study aims at identifying the sources of that bias, in particular regarding the analysis of data issued from outbred populations.

This question is of critical importance in expression quantitative trait loci (eQTL) mapping. Indeed, a main objective is often to search for eQTL which co localize with QTL which influences agronomical performances. The accuracy of the eQTL locations is thus a fundamental element for experimental design optimization, especially since experimental designs for gene expression analyses are generally of moderate size due to the cost of phenotyping. The pioneer work of eQTL detection can be traced back to the emergence of the concept of genetical genomics [6]. During the past decade, QTL mapping was widely applied to the detection of eQTL, for example in yeast [7], mice [8], human [9,10], maize [11] and pig [12]. Generally, mapping procedures were used to map eQTL considering each transcript expression level as one quantitative trait in a trait by trait analysis.

Recently, we have carried out linkage analyses by interval mapping on high throughput transcriptomic data from several familial QTL detection designs, in pig [13], in poultry [14] and in trout [15]. Because of the high dimensionality of the phenotypes, these eQTL analyses have highlighted to the bias of interval mapping estimation of the QTL location. We observed that the number of eQTL detected at marker locations was consistently higher than between marker locations: for instance, with the analysis of 6 665 gene expressions in a population of 325 pigs, we found 756 eQTL on the chromosome 18 distributed as shown in Figure 1. It appeared that the eQTL were significantly more often mapped on marker locations rather than between marker locations.

Hence, in order to qualify this possible bias on the estimated QTL location in outbred populations and to specify which parameters influence it, this paper presents a study of the QTL location accuracy. Firstly, in order to make things more concrete, we explored the empirical distribution of the LRT along the linkage group under the null hypothesis of "no QTL" on a real dataset. Secondly, analytical developments were carried out so as to identify the parameters which influence the QTL location accuracy. Since they are impossible to realize for outbred populations, because of the test statistic complexity, a more simple case of a backcross type population, i.e. a backcross between inbred lines, was considered at that stage. Thirdly, designs of outbred sib families were simulated and analyzed in order to characterize the bias variability under the null and the alternative ("one QTL segregating on the linkage group") hypotheses. Such parameters as population size and marker density under the null hypothesis, as well as QTL effect and simulated QTL location under the alternative hypothesis were taken into account. Finally, an approach as to how to adjust the QTL location estimation, for a QTL located at the position of one marker, was suggested.


Results
The LRT distribution along the linkage group under the null hypothesis

By using the real pedigree and genotypes structure from an experimental design in pig, 2 000 simulations of phenotypes under the null hypothesis of "no QTL on the linkage group" were performed. The distribution of the estimated QTL location, i.e. the location of the maximum LRT, was obtained (Figure 2) on the chromosome SSC1 which carried 16 microsatellite markers (black points in the X axis). It clearly shows that a large proportion of QTL was found at a marker location. The histograms in Figure 3 show the empirical distributions of the 2000 LRT at some locations on the chromosome, both markers and non-markers. The LRT at each location followed a x2 distribution, with degrees of freedom ranging from 4.05 to 4.55, according to a Kolmogorov-Smirnov (KS) test at α = 0.01. This was generalized to all tested positions on the linkage group in Figure 4, where the distributions of the LRT characterized by the number of degrees of freedom under the null hypothesis at markers was not very different from the distributions obtained for positions between markers.

It should be noted (Figure 2) that the proportion of QTL locations estimated at the two extreme marker positions of the linkage group was higher than for the other markers under the "no QTL" hypothesis. The asymptotic distribution of the LRT process at marker positions is known to be the square of an Ornstein-Uhlenbeck (OU) process as pointed out by Lander and Botstein [16] and proved by Cierco [17]. As indicated in Rabier et al. [18], when test statistics are performed only on markers, the OU process follows an autoregressive process of order 1. After 1 million of simulations for this process, we found that the probability for the maximum of the OU process to be on the bounds is higher than within the interval (Figure 5). This property of the OU process was consistent with the fact that we observed a large proportion of QTL localised at the extreme marker positions in comparison with the other markers.

The QTL location bias expression in a backcross population

In order to investigate the bias on the QTL location under the hypothesis of "one QTL", we considered a linkage group limited to an interval [0,T] between two markers M1 (alleles M1 and m1) at 0 and M2 (alleles M2 and m2) at T flanking a QTL (alleles Q and q) in a backcross population obtained from the cross M1M1QQM2M2 × M1m1QqM2m2.

Let yk be the phenotypic value for the individual k = 1,...,n. Assuming a QTL located at the location t0 , the genetic model for yk is:

[Formula ID: bmcM1]
(1) 
yk=μ+a2gk(t0)+ek,

where μ denotes the overall mean, a denotes the QTL allelic substitution effect, gk(t0) is the genotypic value of k at the QTL position, which takes 1 or -1 value depending on the QTL allele, Q or q respectively, received by k from its heterozygous parent and ek is a random normal variable with mean 0 and variance σ2. In order to simplify calculations, we set μ = 0 and σ2 = 1 and used a linearized likelihood function instead of a mixture of two normal distributions (e.g. [19]). In this case, the interval mapping method is the same as the regression method for QTL detection, and the model (1) is modified as follows:

[Formula ID: bmcM2]
(2) 
yk=a2xk(t0)+ek,

where xk(t0) = E[gk(t0)|Mk] and Mk denotes the genotype of the individual k at markers M1 and M2. Let θt-t' be the recombination rate in the distance |t -t'|, then for all k (see Appendix I):


xk(t)=(1-θt-θT-t)/(1-θT)ifMk=M1M1M2M2(θT-t-θt)/θTifMk=M1M1M2m2(θt+θT-t-1)/(1-θT)ifMk=M1m1M2m2(θT-t-θt)/θTifMk=M1m1M2M2

and the LRT at the position t is calculated as

[Formula ID: bmcM3]
(3) 
LRT(t)=[∑ykxk(t)]2∑xk2(t).

Note that given a position t, {xk(t)}k = 1, ..., n is a sequence of i.i.d. random variables with the same expected value 0 and the same variance (see Appendix III), noted Var(x(t)). Hence, if we replace yk by the model (2) in equation (3), then the LRT(t) consists of three terms as (see Appendix II):


LRT(t)=f(t)+ε1(t)+ε2(t),

where f(t)=14a2[∑xk(t0)xk(t)]2/∑xk2(t) is a function of variable t, the noise ε1(t) is the LRT at t under the no QTL hypothesis and the noise ε2(t) follows approximately a normal distribution with mean 0 and variance na2Var(x(t))ρtt02. Note that:


argmaxt∈[0,T]LRT(t)=argmaxt∈[0,T]1nLRT(t),

i.e. the bias on LRT(t) behaves similarly to the bias on LRT(t)/n. So this property allows to analyze the source of the bias in LRT(t)/n instead of LRT(t). Using the law of large numbers, we have this following decomposition of LRT (t)/n:

[Formula ID: bmcM4]
(4) 
1nLRT(t)≈14a2Var(x(t0))ρtt02+1nε1(t)+1nε2(t),

It can be seen from formula (4) that the first term reaches its maximum at t0 since ρtt0→1 when tt0. As seen in the previous section, the second term, which is proportional to the LRT at t under the "no QTL" hypothesis, reaches its maximum at the position of markers more often than at the positions between markers. As a result, the estimated QTL location will be biased towards the position of markers. However, when n or a2 increase, or when T decreases, or when t0 approaches one marker location (see Appendix III), the deviation between the two first terms in formula (4) increases and the influence of the second term is reduced. Therefore, in our simple backcross population model, under the hypothesis of one QTL, when the population size, marker density, QTL effect increase or when the true QTL location approaches the position of a marker, the bias of the estimated QTL location is expected to be reduced.

Simulations under H0

According to the preceding results, the estimated QTL location cannot be expected to be uniformly distributed on the chromosome under the null hypothesis of no QTL. Familial designs were simulated to test the influence of the population size and the marker density on this bias in outbred populations.

Impact of the population size

Six different population sizes, from 60 to 800 progeny, were simulated with three markers located at 0 M, 0.2 M and 0.4 M on a 0.4 M linkage group. The empirical distributions of the estimated QTL location, obtained from 5 000 simulations for each population size, showed that the probability of mapping a QTL at a marker location was always higher than that of mapping it between markers (Figure 6). As shown in Table 1, the proportion of QTL which co-localized with the markers looked independent from the population size. When comparing the estimated QTL location distributions, which were obtained with the different population sizes, using the Kolmogorov-Smirnov test, large p-values were obtained (> 0.97). This indicates that the population size did not influence the distribution of the estimated QTL location in a significative way under the null hypothesis.

Impact of the marker density

The empirical distributions of estimated QTL locations were compared for five marker, with 2, 3, 5, 7 and 11 markers equally distributed on a 0.6 M linkage group (Figure 7). The bias towards the marker positions was systematic whichever the marker density. Except for 11 markers, the number of markers had little influence on the proportion of QTL located at a marker position (Table 2), i.e. the bias seemed to divide up between markers. However, this criteria was difficult to interpret because the number of markers was different in each of the cases studied. Moreover, in all cases, the two extreme markers concentrated more false locations than intermediate markers did. Since the number of markers was different, the distributions based on the marker density were expected to be different as well. The KS test was thus not suitable to test the influence of the marker density.

Simulations under H1

According to the analytical results obtained in a backcross type population, the population size and the marker density, as well as the QTL effect and location, were parameters which were very likely to influence the bias on the estimated QTL location under the alternative hypothesis.

Impact of the population size

Three sizes of population were simulated: 100, 300 or 800 progeny. The empirical distributions of the estimated QTL location are given in Figure 8. One QTL was simulated at the 0.1 M location (Δ on the X axis) on a 0.4 M linkage group with three markers located at 0 M, 0.2 M and 0.4 M. The Figure 8 clearly shows a variability of the bias of the QTL location depending on the population size. It can be seen from Table 3 that the proportion of putative QTL that co-localized with a marker and the root mean square error (RMSE) of the QTL location became smaller when the population size increased. Table 3 also shows that the bias was correlated to the power of the analysis. However, considering only very significant LRT (α = 0.01), significant (α = 0.05) or all LRT (α = -), led to very similar values of RMSE or of proportion of QTL mapped on a marker location. The KS test indicated that the distributions of the estimated QTL position were significantly different depending on the size of the population studied (p - values = 0). Moreover, the Mann-Whitney-Wilcoxon (MWW) test showed that, when the population size increased, the median of the error of the estimated QTL position significantly decreased (p - values < 2.2e - 16).

Impact of the marker density

Figure 9 reports the QTL location distribution for a marker density from 2 to 11 markers equally distributed on a 0.6 M interval. One QTL was simulated at 0.25 M. It shows the advantage of using a high marker density in the QTL detection: when the marker spacing was minimal (i.e. 6 cM), the location of the QTL was very accurate. Table 4 shows that, when the density increased, the RMSE of the estimated QTL location decreased. So the QTL location was more accurately estimated. The dependency between the power of the analysis and the bias extent was confirmed. On the contrary, there were low variations of RMSE or proportion of QTL mapped at marker location according to α. Finally, as under H0, these tendencies were confirmed for the proportion of QTL located at a marker location, except for 11 markers.

Impact of the QTL effect

The empirical distributions of the estimated QTL location when the QTL effect increased from 0.5 phenotypic standard deviation (σ) to 4σ is shown in Figure 10. One QTL was simulated at 0.1 M on a 0.4 M linkage group with three markers equally spaced. The power of the QTL detection, the RMSE of the estimated QTL location and the proportion of estimated QTL locations at a marker are given in Table 5. Results indicated that, whenever the QTL effect increased, the bias decreased. As seen previously, the bias decreased when the power increased but the RMSE or the proportion of QTL mapped on a marker position were only slightly dependent on the test level. The KS test indicated that the distribution of the estimated QTL position was significantly different when the QTL effect changed (p - values < 2.2e - 16). The MWW test showed that, when the QTL effect increased, the median of the error of the estimated QTL position decreased (p - values < 2.2e - 16).

Impact of the true QTL location

Figure 11 shows the variation in the distribution of the estimated QTL location when the simulated QTL position (Δ on the X axis) moved towards the middle of two flanking markers on a 0.4 M linkage group. Table 6 shows the power of the QTL detection, the RMSE of the estimated QTL location and the proportion of estimated QTL positions at a marker position when the true QTL location changed from 0 M to 0.2 M. When the true QTL location tended towards the flanking markers, the power went up and the proportion of QTL locations at a marker location increased. On the contrary, the RMSE went down. The KS test confirmed the difference between the distributions of the estimated QTL location when the true QTL location varied (p - values < 2.2e - 16). The MWW test, which compared the medians of error of the estimated QTL location, confirmed the increase in accuracy when the true QTL location tended towards a marker location.

An algorithm to adjust the location of QTL mapped on markers

Analytical developments in backcross type population and simulation study in outbred type population demonstrated that the estimated position of the QTL is biased towards marker location under some circumstances. On the other hand, the decomposition of the LRT according to the formula (4) allowed to identify a putative cause of this bias: the residual error ε1 in the LRT both under "no QTL" and the "one QTL" hypotheses. Indeed, according to the decomposition of the LRT in the formula (4), if the QTL is not estimated at its true location, two residual errors may have generated the bias: ε1 and ε2. When the estimated QTL position is at a marker location, argmaxt ε2(t) has a uniform distribution but argmaxt ε1(t) is more often estimated at a marker location than between markers. In such a situation, ε1 is very likely to play a dominant role in the bias. On the contrary, when the estimated QTL location is not at a marker location, argmaxt ε1(t) and argmaxt ε2(t) are unknown for a given argmaxt[ε1(t) + ε2(t)] error. Under these circumstances, it is impossible to predict the relative influence of ε1 and ε2 on the bias. On the basis of this observation, we propose an approach to describe the ε1(t) process and, consequently, adjust the estimated QTL position when a QTL co localizes with one marker, i.e. an approach to correct the "marker effect" on the bias of the estimated QTL location.

1. Obtain the vector which contains the LRT profile along the linkage group, calculated on the phenotypic data, say L0. L0 is maximum at the location of the marker M.

2. Under the "no QTL" hypothesis, simulate phenotypes and obtain LRT profiles until to have 1000 profiles which have their maximum at the position of the marker M, say {Li}i = 1,...,1000

3. Calculate the 1 000 vectors Vi = L0-Li,i ϵ 1,...,1000.

4. Retain the 1000 locations where {Vi}i = 1, ..., 1000 is maximum.

5. Obtain the adjusted position of the QTL as the mean of these positions:


P^=Mean{pi|pi=argmaxVi,i=1,…,1000}.

In order to verify the validity of this proposal, simulations were carried out in R for a backcross type design of 100 progeny. There were six markers equally distributed on a 1 M linkage group. There was one QTL of 0.5σ of effect for which two true locations were envisaged: 0.1 M, i.e. between markers, and 0.2 M, i.e. on a marker. Table 7 shows the comparison of RMSE between "before" and "after" adjusting the estimated QTL location when the estimated QTL position was on one of the markers in the first place. The distributions of estimated QTL location before and after the adjustment of the estimated QTL location are presented in Figure 12. The RMSE was always smaller after the position had been adjusted, even when the true QTL location was in 0.2 M, i.e. on a marker location. In this example, the proportion of false QTL locations on the markers was effectively decreased by the proposed algorithm.


Discussion

In order to study the elements that give rise to the bias on the estimated QTL position, we checked whether the distribution of the test statistic changed along the locations on the linkage group. More precisely, we checked if the significance threshold remained the same at a marker and at a non-marker location. Under the null hypothesis of "no QTL on the linkage group", the asymptotic distribution of the LRT at a given point is well known and identical for all locations. It will be getting closer to the central χ2 distribution with a degree of freedom depending on the number of parameters fixed under the null hypothesis [20], i.e. here the number of sires or dams for which a QTL effect was estimated. Nevertheless, the population size is most often not large enough to make the LRT reach its asymptotic distribution for all the locations on the linkage group. The variability of the marker informativity along the linkage group may actually influence this convergence to asymptotic conditions, resulting in variability of the LRT distributions depending on the tested locations. Here, the differences between the empirical distributions of the LRT at each position along the linkage group were explored using a real example of an outbred type population. It appeared that the variability of informativity along the linkage group did not lead to a significative variability of the empirical distributions of the nominal test statistics. This observation is not contradictive to the bias on the estimated QTL location towards the locations of markers but it means that the bias is due to the process which defines the sup of LRT on the linkage group.

Some analytical results concerning the bias of the estimated QTL location were obtained in a backcross type population, i.e. a backcross between inbred lines. We identified a common source of bias under the "no QTL" and the "one QTL" hypotheses, and also showed the possible influence of several parameters under "one QTL" hypothesis, such as the population size, the marker density, the QTL effect and the true QTL location. Using simulations, we verified the existence of a bias on the estimation of the QTL location using the interval mapping method, under the null and the alternative hypotheses, when family structure are more complex than the backcross design considered by Walling et al. [4]. Simulations of outbred populations confirmed that this bias is influenced by the size of the population and the density of the genetic map, as well as by the QTL effect under the alternative hypothesis. We also demonstrated that the true QTL location, relatively to the flanking markers, had a significant impact on the accuracy of the estimated QTL location. Moreover, we quantified the bias of the estimated QTL location for various values of these parameters and validated the results by applying appropriate test statistics.

We showed that the population size does not affect the estimation of the QTL location under the null hypothesis. Under the alternative hypothesis, very similar values of RMSE or of proportion of QTL detected at marker locations were observed whatever α. On the other hand, a slight reduction in the bias seemed to be obtained when applying α < 0.01. However, the choice of a high significance level also implies a decrease of power and the detection of only few QTL. As a consequence, it cannot be considered an efficient way to correct the bias problem.

Considering these results leads to a first recommendation which would be that the number of animals and/or markers must be adjusted to the desired test power and location accuracy. Figure 13 summarizes the variability of the bias in accordance with the population size and the QTL effect. The proportion of QTL mapped at a marker location and the RMSE of the QTL location present the same tendencies according to the variations in QTL effect and in population size. This confirms that the bias on the estimated QTL location is essentially due to the location of QTL on markers.

Secondly, concerning the particular case of eQTL detection, when the marker information is relatively sparse, for example when microsatellite markers are used for genotyping, it is necessary to measure several hundred of animals for transcriptomic data to obtain an accurate eQTL location. Finally, a population size of 300 progeny seems to be a good compromise in the detection of eQTL, even if only those which have relatively large effects will be detected.

Thirdly, it is clear that significant QTL detection located at a marker position should be considered with caution, especially when the population size, the marker density or the QTL effect are low. Hence, the approach proposed above is efficient to remedy the bias on the estimated QTL location in such situation.


Conclusions

When we apply the interval mapping method on an outbred design to map QTL, the QTL is often incorrectly mapped at the position of a marker. In this work, this bias was studied by using analytical developments in backcross type population and simulated data in outbred populations. In the absence of QTL, adjusting the thresholds at the location of markers cannot reduce the bias, and the population size does not affect the bias. Under the hypothesis of having one QTL, the impact of some parameters on the bias was confirmed: when the population size and/or the QTL effect and/or the marker density are large enough, the bias is reduced. Moreover, the closer the QTL is to a marker location, the more accurate the estimation is. Therefore, caution should be taken when the QTL is mapped at a position of a marker, in particular for low power designs. In such cases, a method is proposed to correct the bias on the estimated QTL location. Simulations carried out in a backcross type population demonstrated that this method is valid to limit the bias.


Methods
Analyses on a real data set in pig

A real data set was used to illustrate some aspects of the present work. It is a porcine outbred population of 325 progeny issued from 4 sires. One example of eQTL analysis using the QTLMap software [21], i.e. the analysis of the chromosome SSC18 for 6 665 gene expression traits, was given. The linkage analysis method was applied according to Le Roy et al. [22] with a gene by gene procedure. For each gene, when the LRT was significant at the 5%0 level at the chromosome level, the estimated eQTL location was the location where the LRT was maximum on the linkage group.

The same familial structure was used to study the empirical distribution of the LRT along the linkage group. Two thousand simulations were performed under the null hypothesis of "no QTL" on the chromosome SSC1 which carried 16 microsatellite markers. A polygenic heritability coefficient of 0.5 was assumed for the trait (see http://www.inra.fr/qtlmap).

Simulations of an Ornstein-Uhlenbeck process

The asymptotic distribution of the LRT process at marker positions was shown as being the square of an OU process [16]. Let's Xt denote the value of this OU process at the t location. Xt could be described as:


dXt=2Xtdt+2dWt

where Wt denotes the Brownien movement.

In a backcross type population, the mean of this process is 0 and the autocovariance is: cov(Xt, Xt') = e-2|t-t'| with t and t' in the Haldane distance unit [16].

To simulate this process, we considered a linkage group with mk markers. We generated mk independent random numbers z0, z1, ..., zmk from a normal distribution with mean 0 and variance 1 with the function rnormin R. We defined X0 = z0. Then, a discrete analog of the OU process [23] was given by:


Xs=e-2τXs-1+1-e-4τzs

with s = 1, ..., mk, where τ denotes the spacing of two adjacent markers in Morgan. This sequence is a first-order autoregressive sequence.

Simulations of outbred type population

The QTLMap software [21] was used to simulate and analyse the data sets. QTLMap allowed the simulation of complete experimental designs with pedigree, genetic map, genotypes and phenotypes http://www.inra.fr/qtlmap. The population structure was a mixture of full and half sib families for given numbers of sires (s), of dams per sire (d) and of progeny per dam (p). Most often, 3 markers were equally distributed on a 0.4 M linkage group. Each marker had 6 alleles with equal frequencies in the parental population. The QTL was simulated at 0.1 M and all sires and dams were heterozygous for the QTL. The phenotypes of the progeny were simulated as follows:

[Formula ID: bmcM5]
(5) 
yijk=ui+uij+agijk(t0)+eijk

yijk is the phenotype of the progeny ijk of the sire i and of the dam ij. ui and uijk denote the polygenic effects, of the sire i and of the dam ij respectively, which follow a normal distribution with mean 0 and variance σu2. a denotes the QTL allelic substitution effect and gijk (t0) is the genotypic value of ijk at the QTL location t0. gijk takes value 1, 0 or -1 depending onthe QTL genotype, QQ, Qq or qq, respectively. eijk is a random normal variable with mean 0 and variance σe2. The variance within QTL genotype is σ2=2σu2+σe2 and a is expressed in σ unit. The heritability coefficient, equal to 4σu2/σ2, was fixed at 0.25.

For each of the cases studied, the results were based on 5 000 simulations, either under the null hypothesis (H0: there is no QTL segregating on the linkage group, i.e. a = 0) or under the alternative hypothesis (H0: there is one QTL segregating on the linkage group, i.e. most often a = 1σ). For each simulated dataset, the estimated QTL position was the location of the linkage group where the LRT was maximum.

Under the null hypothesis, simulations were carried out so as to compare the influence of the population size with 6 levels: 60 (3s,1d,20p), 80 (4s,1d,20p), 100 (5s,1d,20p), 300 (5s,2d,30p), 400 (5s,2d,40p), 800 (5s,4d,40p) progeny. Under the H1 hypothesis, only 3 of these population sizes were considered: 100, 300 and 800 progeny.

To understand how the QTL effect affects the estimation of the QTL location, a population of 100 progeny was simulated with a QTL effect ranging from 0.5 σ to 4 σ.

Other simulations were performed in a population of 300 progeny. Firstly, to check the bias extent depending on the marker density, samples with 2, 3, 5, 7 or 11 markers equidistant in a linkage group of 0.6 M were simulated, under the null and under the alternative hypotheses. Under H1, one QTL was simulated at 0.25 M (a = 1σ). Secondly, to test how the true QTL location may affect the bias, we performed simulations under H1 with a QTL (a = 1σ) lying at 0 M, 0.05 M, 0.10 M, 0.15 M, 0.2 M on a linkage group of 0.4 M with two flanking markers at 0 M and 0.4 M.

Criteria

In each of the cases studied, the empirical distribution of the estimated QTL location was obtained from the 5000 locations of maximum of LRT of the simulations. The proportion of simulations for which the QTL location was estimated at one marker position was retained to quantify the variability of the bias depending on the parameters. Under the alternative hypothesis, beside the proportion of the QTL which co-localized with a marker, the root mean squared error (RMSE) of the QTL location was chosen to describe the bias variability which depends on the parameters. What is more, the power of the QTL detection was calculated for a first type error α = 0.01 and 0.05. For all simulations and for simulations with a significant QTL at the level α = 0.01 or 0.05, the proportion of QTL that were estimated at a marker location and the RMSE of the estimated QTL location were computed. The RMSE computation was given by the following formula


RMSE=1L∑l=1L(t^l-t0)2,

where L is the number of simulations (all, significant at the level 0.05 or at the level 0.01), t^l is the lth estimated QTL position and t0 is the true QTL position.

Hypothesis test

Appropriate statistical tests are needed to evaluate which parameters affect the bias of the estimated QTL position. ANOVA was not adequate to test the equality of the average QTL position in two different conditions (e.g. 2 population sizes) because of the non normality of the QTL position estimator. Therefore, two nonparametric tests were combined in order to test which parameters affect the bias, and how they influence the variation of the QTL location estimation. This was performed in two steps: (1) the parameters which influence the accuracy of the estimated QTL location were identified. This step was carried out with a Kolmogorov-Smirnov test; (2) for the parameters identified in the first step, a description of their effect on the accuracy of the estimated QTL position was made. This step was performed with a Mann-Whitney-Wilcoxon test [24].

1. Kolmogorov-Smirnov test (KS): this test was applied in order to check whether a parameter affected the estimation of the QTL position. For each value of the parameter, an empirical distribution of the estimated QTL location was obtained using 5 000 simulations. The two hypotheses compared by the KS test were:


H0:Fa=FbH1:Fa≠Fb,

where Fa, Fb denote the distribution of the estimated QTL position under the conditions a and b, respectively. For a given parameter, all the distributions were compared by pairs with the function ks.test in R. If all pair comparisons concluded to accept the null hypothesis, it means that the value of this parameter did not influence the estimation of the QTL position.

Mann-Whitney-Wilcoxon test (MWW): when the null hypothesis in the first step was rejected, the MWW test was used to understand how the parameter affected the estimation with the function wilcox.test in R. The hypotheses compared were:


H0:Da=DbH1:Da≤Db,

where Da, Db denote the absolute values of the deviations between the estimated QTL position and the assumed, i.e. the true position, under the condition a and b, respectively. A smaller median D corresponds to a more accurate position estimation.


Appendix
Appendix I

Let us denote Mk the genotype of the markers at 0 and T for individual k and pkt = ℙ (gk(t) = 1|Mk). Then using a linearized likelihood function instead of the mixture of two normal distributions, the LRT can be written as: LRT(t)=-2lnL(y1,…,yn)maxaL(y1,…,yn;a)≈-2ln∏ϕ(yk;0,1)maxa∏ϕ(yk+a2(1-2pkt);0,1)=∑yk(1-2pkt)2∑(1-2pkt)2,

where ϕ(x;μ,σ)=12πσe-(x-μ)22σ2 and the distribution of pkt is given as:

Note that xk(t) = E(gk(t)|Mk) = 2pkt - 1, so the LRT at each position t can be described as LRT(t)=[∑ykxk(t)]2∑xk2(t), and the distribution of xk (t) for each individual k is

Appendix II

Replacing yk=a2xk(t0)+εk in LRT (3), we have LRT(t)=[∑(a2xk(t0)+εk)xk(t)]2∑xk2(t)=14a2[∑xk(t0)xk(t)]2∑xk2(t)+[∑εkxk(t)]2∑xk2(t)+a∑xk(t0)xk(t)∑xk2(t)∑εkxk(t)=f(t)+ε1(t)+ε2(t),

where in the case of large sample, we have

f(t)=14a2[∑xk(t0)xk(t)]2∑xk2(t) and f(t)/n→14a2Var(x(t0))ρtt02 when n → ∞ according to the law of large numbers.

ε1(t)=[∑εkxk(t)]2∑xk2(t) is the LRT under the no QTL hypothesis.

ε2(t)=a∑xk(t0)xk(t)∑xk2(t)∑εkxk(t)~aCov(x(t),x(t0))Var(x(t))∑εkxk(t) is a residual error, linear combination of gaussian random variables. Its distribution is approximated as N(0,na2Var(x(t))ρtt02).

Appendix III

Considering the two first terms in the expression of 1nLRT(t) (4), when n tends to infinity, 1nε1(t) will converge to 0 at each position t. So the amplitude of the curve representing the term 1nε1(t), with respect to that of the first term, is reduced. In the same way, as a2 increases, the amplitude of the first term becomes larger with respect to that of the second term.

The proof of the influence of t0 and T will use the results in this following lemma:

Lemma 1. Given two markers at the location 0 and T in a linkage group of length T and assuming a QTL located at t0, from the distribution of Var(x(t)) and applying the Taylor series expansion in case of small T, we have:

1. Var(x(t0))=(1-θt0-θT-t0)21-θT+(θt0-θT-t0)2θT≈4Tt02-4t0+1.

2. Cov(x(t),x(t0))=(1-θt-θT-t)(1-θt0-θT-t0)1-θT+(θt-θT-t)(θt0-θT-t0)θT≈4Ttt0-2(t+t0)+1.

Now let us assume T < 1 and consider the peak-to-peak amplitude of the first term of 1nLRT(t), g(t)=14a2Var(x(t0))ρtt02, i.e., the deviation between highest amplitude value and lowest amplitude value:


δ=maxt∈[0,T]g(t)-mint∈[0,T]g(t).

If t0∈[T2,T], then the maxt∈[0, T]g(t) is reached at t = t0 and the mint∈[0, T]g(t) is reached at 0. Hence, we have:


δ=a24Var(x(t0))-Cov2(x(0),x(t0))≈a2(1T-1)t02.

It can be seen that when t0 T from T2 and/or when T decreases, δ will become larger.

If t0∈[0,T2], then the maxt∈[0, T]g(t) is reached at t = t0 and the mint∈[0, T]g(t) is reached at T. Hence, we have:


δ=a24Var(x(t0))-Cov2(x(T),x(t0))≈a2(1T-1)(t0-T)2.

It can be seen that when t0 → 0 from T2, δ will become larger. Likewise, we set a constant c for the distance between the argmaxt∈[0, T]g(t) and argmint∈[0, T]g(t). Then, when T changes, the position of QTL is t0 = T -c and


δ≈a2(1T-1)c2.

Therefore, when T becomes larger, δ decreases.

In conclusion, the amplitude of g(t) will be greater as the QTL position tends to one marker and/or the distance between the markers decreases.


Competing interests

The authors declare that they have no competing interests.


Authors' contributions

PLR conceived this study. XW carried out the study and drafted the manuscript, supervised by PLR and JME. XW, HG, CM and OF wrote updated versions of the QTLMap software. All authors read and revised and approved the final manuscript.


Acknowledgements

These results are part of the SABRE research project that has been co-financed by the European Commission, within the 6th Framework Programme, contract No. FOOD-CT-2006-016250. XW is a Ph.D fellow supported by the SABRE research project and by the Animal Genetics division of INRA.


References
Spelman RJ,Coppieters W,Karim L,van Arendonk J,Bovenhuis H,Quantitative trait loci for five milk production traits on chromosome six in the Dutch Holstein-Friesian populationGeneticsYear: 1996144179918088978065
Walling GA,Visscher PM,Haley CS,A comparison of bootstrap methods to construct confidence intervals in QTL mappingGenetical ResearchYear: 19987117118010.1017/S0016672398003164
Whittaker JC,Thompson R,Visscher PM,On the mapping of QTL by regression of phenotype on marker-typeHeredityYear: 199677233210.1038/hdy.1996.104
Walling GA,Haley CS,Perez-Enciso M,Thompson R,Visscher PM,On the mapping of quantitative trait loci at marker and non-marker locationsGenetical ResearchYear: 2001799710611974607
Perez-Enciso M,Fernando RL,Bidanel JP,Le Roy P,Quantitative Trait Locus analysis in crosses between outbred lines with dominance and inbreedingGeneticsYear: 200115941342211560915
Jansen RC,Nap JP,Genetical genomics: the added value from segregationTRENDS in GeneticsYear: 20011738839110.1016/S0168-9525(01)02310-111418218
Brem RB,Yvert G,Clinto R,Kruglyak L,Genetic Dissection of Transcriptional Regulation in Budding YeastScienceYear: 200229675275510.1126/science.106951611923494
Schadt EE,Monks SA,Drake TA,Lusis AJ,Che N,Colinayo V,Ruff TG,Milligan SB,Lamb JR,Cavet G,Linsley PS,Mao M,Stoughton RB,Friend SH,Genetics of gene expression surveyed in maize, mouse and manNatureYear: 200342229730210.1038/nature0143412646919
Monks SA,Leonardson A,Zhu H,Cundiff P,Pietrusiak P,Edwards S,Phillips JW,Sachs A,Schadt EE,Genetic inheritance of gene expression in human cell linesAm J Hum GenetYear: 2004751094110510.1086/42646115514893
Morley M,Molony CM,Weber TM,Devlin JL,Ewens KG,Spielman RS,Cheung VG,Genetic analysis of genome-wide variation in human gene expressionNatureYear: 200443074374710.1038/nature0279715269782
Shi C,Uzarowska A,Ouzunova M,Landbeck M,Wenzel G,Lubberstedt T,Identification of candidate genes associated with cell wall digestibility and eQTL (expression quantitative trait loci) analysis in a Flint × Flint maize recombinant inbred line populationBMC GenomicsYear: 200782210.1186/1471-2164-8-2217233901
Ponsuksili S,Murani E,Schwerin M,Schellander K,Wimmers K,Identification of expression QTL (eQTL) of genes expressed in porcine M. longissimus dorsi and associated with meat quality traitsBMC GenomicsYear: 20101157210.1186/1471-2164-11-57220950486
Cherel P,Glenisson J,Damon M,Vincent A,Liaubet L,Lobjois V,Hatey F,Milan D,Le Roy P,Colocalization of quantitatitve trait loci for meat pH and differentially expressed genes in skeletal muscles in pigsProceedings of the 8th World Congress on Genetics Applied to Livestock Production: 13-18 August 2009; Belo Horizonte, MG, BrazilYear: 2006 06:18.
Le Mignon G,Desert C,Pitel F,Leroux S,Demeure O,Guernec G,Abasht B,Douaire M,Le Roy P,Lagarrigue S,Using transcriptome profiling to characterize QTL regions on chicken chromosome 5BMC GenomicsYear: 2009822
Le Bras Y,Dechamp N,Montfort J,Cam AL,Krieg F,Quillet E,Prunet P,Le Roy P,Acclimation to seawater in rainbow trout: QTL/eQTL approach for plasmatic ions and gill tissueProceedings of the 9th World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig, GermanyYear: 2010638
Lander E,Botstein D,Mapping Mendelian factors underlying quantitative traits using RFLP linkage mapsGeneticsYear: 19891211851992563713
Cierco C,Asymptotic distribution of the maximum likelihood ratio test for gene detectionStatisticsYear: 19983126128510.1080/02331889808802639
Rabier CE,Azais JM,Delmas C,Likelihood ratio test process for quantitative trait loci detectionJournal of the Royal Statistical SocietyYear: 2009
Elsen JM,Mangin B,Goffinet B,Boichard D,Le Roy P,Alternative models for QTL detection in livestock: I. General introductionGenetics Selection EvolutionYear: 19993121322410.1186/1297-9686-31-3-213
Goffinet B,Rebai A,Mangin B,Construction confidence intervals for QTL locationGeneticsYear: 1994138130113087896108
Filangi O,Moreno C,Gilbert H,Legarra A,Le Roy P,Elsen JM,QTLMap, a software for QTL detection in outbred populationsProceedings of the 9th World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig, GermanyYear: 2010787
Le Roy P,Elsen JM,Boichard D,Mangin B,Bidanel JP,Goffinet B,An algorithm for QTL detection in mixture full and half sib familiesProceedings of the 6th World Congress on Genetics Applied to Livestock Production: 11-16 January 1998; Armidale, AustraliaYear: 1998257260
Feller W,An Introduction to Probability Theory and Its ApplicationsYear: 1968Volume 22Wiley
Lehmann EL,Nonparametrics: Statistical Methods Based on RanksYear: 1975Mcgraw-Hill

Figures

[Figure ID: F1]
Figure 1 

eQTL mapping in pigs: example of distribution of the eQTL locations (chromosome 18). 756 eQTL were mapped on the chromosome 18 which have 5 microsatellite markers located at 0 M, 0.08 M, 0.39 M, 0.54 M and 0.83 M, respectively (black points on the X axis).



[Figure ID: F2]
Figure 2 

QTL mapping in pigs: empirical distribution of the estimated QTL location under H0 (chromosome 1). Results are based on 2000 simulations in a population of 4 sires and 325 offspring. There are 16 markers on the chromosome 1 (black points in the X axis).



[Figure ID: F3]
Figure 3 

QTL mapping in pigs: empirical distribution of the LRT at various locations. Results are based on 2000 simulations in a population of 4 sires and 325 offspring. The line represents the density obtained with a Kernel method, the dotted line represents the density of a χ2 with 4 d.f.



[Figure ID: F4]
Figure 4 

QTL mapping in pigs: empirical degrees of freedom of the χ2 distribution. M indicates the locations of the markers.



[Figure ID: F5]
Figure 5 

Empirical distribution of the maximum of the Ornstein-Uhlenbeck process. Results are based on 1 million of simulations.



[Figure ID: F6]
Figure 6 

Empirical distribution of the estimated QTL location according to the population size under H0. Num. Offsp. indicates the number of offspring in the outbred population. Results are based on 5000 simulations per case. There were three markers at 0 M, 0.2 M and 0.4 M. The line represents the density obtained with a Kernel method.



[Figure ID: F7]
Figure 7 

Empirical distribution of the estimated QTL location according to the marker density under H0. Results are based on 5000 simulations per case of in an outbred population of 300 individuals. The number of markers varied from 2 to 11 across along a linkage group of 0.6 M. The line represents the density obtained with a Kernel method.



[Figure ID: F8]
Figure 8 

Empirical distribution of the estimated QTL location according to the population size under H1. Num. Offsp. indicates the number of offspring in the outbred population. Results are based on 5000 simulations per case. There were three markers at 0 M, 0.2 M and 0.4 M. Δ indicate the true QTL location. The line represents the density obtained with a Kernel method.



[Figure ID: F9]
Figure 9 

Empirical distribution of the estimated QTL location according to the marker density under H1. Results are based on 5000 simulations per case of an outbred population of 300 individuals. The number of markers varied from 2 to 11 along a linkage group of 0.6 M. Δ indicates the true QTL location. The line represents the density obtained with a Kernel method.



[Figure ID: F10]
Figure 10 

Empirical distribution of the estimated QTL location according to the QTL effect under H1. Results are based on 5000 simulations per case in an outbred population of 100 progeny. There were three markers at 0 M, 0.2 M and 0.4 M. The QTL effect varied from 0.5 σ to 4σ. Δ indicates the true QTL location. The line represents the density obtained with a Kernel method.



[Figure ID: F11]
Figure 11 

Empirical distribution of the estimated QTL location according to the true QTL location under H1. Results are based on 5000 simulations per case of an outbred population of 300 progeny. There were two flanking markers at 0 M and 0.4 M. The true QTL location varied from 0 M to 0.2 M. Δ indicates the true QTL location. The line represents the density obtained with a Kernel method.



[Figure ID: F12]
Figure 12 

Empirical distribution of the estimated QTL locations before and after adjustment. Results are based on 5000 simulations per case in a backcross population of 100 progeny. There were six markers equally distributed on a linkage group of 1 M. The true QTL location indicated as Δ was set at 0.1 M or 0.2 M.



[Figure ID: F13]
Figure 13 

Distribution of the bias depending on the population size and the QTL effect. a. Proportion of estimated QTL locations on marker positions. b. RMSE of QTL location. Results are based on 5000 simulations per case. There were three markers at 0 M, 0.2 M and 0.4 M. The true QTL location was set at 0.1 M. 5*1*20 indicates that there were 5 sires, 1 dam per sire and 20 progeny per dam in the design.



Tables
[TableWrap ID: T1] Table 1 

Proportion of estimated QTL locations at marker locations according to the population size under H0


Number of individuals (s × d × p)1

60 (3 × 1 × 20) 100 (5 × 1 × 20) 300 (5 × 2 × 30) 400 (5 × 2 × 40) 800 (5 × 4 × 40)
Proportion2 (%) 64.2 63.8 65.3 67.1 66.0

1The population structure was a mixture of full and half sib families for given numbers of sires (s), dams per sire (d), and progeny per dam (p).

2Proportion of QTL located at a marker position (3 markers at 0 M, 0.2 M and 0.4 M).


[TableWrap ID: T2] Table 2 

Proportion of estimated QTL locations at marker locations according to the marker density under H0


Number of markers1

2 3 5 7 11
Proportion2 (%) 58.6 58.5 52.6 57.8 69.5

1The markers were evenly distributed on a linkage group of 0.6 M.

2Proportion of QTL located at a marker location for a population of 300 individuals.


[TableWrap ID: T3] Table 3 

Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the population size under H1


Number of individuals (s × d × p)1

α2 100 (5 × 1 × 20) 300 (5 × 2 × 30) 800 (5 × 4 × 40)
Power3 (%) 0.05 39 93 100
0.01 15 81 100

RMSE4 (cM) - 13.9 8.7 4.2
0.05 12.2 8.4 4.2
0.01 11.4 8.0 4.2

Proportion5 (%) - 39.9 15.2 1.6
0.05 29.7 14.0 1.6
0.01 26.8 12.9 1.6

1The population structure was a mixture of full and half sib families for given numbers of sires (s), dams per sire (d), and progeny per dam (p).

2Significance level for the LRT.

3Power of the QTL detection at significance level α.

4RMSE of the estimated QTL location, the true QTL being located at 0.1 M on a 0.4 M linkage group.

5Proportion of QTL located at a marker location (3 markers at 0 M, 0.2 M and 0.4 M).


[TableWrap ID: T4] Table 4 

Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the marker density under H1


Number of markers1

α2 2 3 5 7 11
Power3 (%) 0.05 62.2 91.6 94.4 96.5 97.5
0.01 37.2 79.0 85.2 88.9 91.8

RMSE4 (cM) - 16.6 10.5 8.8 7.4 5.7
0.05 14.9 10.1 8.4 7.2 5.6
0.01 14.1 9.7 8.0 6.9 5.2

Proportion5 (%) - 19.0 15.4 12.5 11.8 35.0
0.05 13.0 14.6 11.5 11.5 34.5
0.01 10.8 13.7 10.7 10.5 34.0

1The markers were evenly located in a linkage group of 0.6 M.

2Significance level for the LRT.

3Power of the QTL detection at significance level α in a population of 300 progeny.

4RMSE of the estimated QTL location, the true QTL being located at 0.07 M on a 0.6 M linkage group.

5Proportion of QTL located at a marker location.


[TableWrap ID: T5] Table 5 

Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the QTL effect under H1


QTL effect

α1 0.5 σ 1 σ 1.5 σ 2 σ 4 σ
Power2 (%) 0.05 10.5 41.8 81.0 97.4 100

0.01 2.9 19.5 58.1 89.8 100

RMSE3 (cM) - 17.0 13.6 10.3 8.0 4.4

0.05 15.6 11.9 9.8 7.8 4.4

0.01 15.4 10.9 9.1 7.6 4.4

Proportion4 (%) - 56.6 37.9 24.6 13.6 2.4

0.05 37.1 28.8 22.2 13.2 2.4

0.01 31.0 24.6 20.3 12.9 2.4

1Significance level for the LRT.

2Power of the QTL detection for significance level α.

3RMSE of the estimated QTL location in a population of 100 progeny with a true QTL located at 0.1 M.

4Proportion of QTL located at a marker location (3 markers at 0 M, 0.2 M and 0.4 M).


[TableWrap ID: T6] Table 6 

Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the true QTL location under H1


QTL location (M)

α1 0 0.05 0.1 0.15 0.2
Power2 (%) 0.05 96.1 92.3 87.2 82.0 81.2
0.01 87.4 80.1 69.7 61.9 60.2

RMSE3 (cM) - 7.5 7.6 9.5 10.7 11.2
0.05 7.2 7.2 9.0 10.1 10.7
0.01 7.0 7.0 8.6 9.7 10.3

Proportion4 (%) - 51.5 38.6 26.6 19.0 16.2
0.05 51.3 37.8 24.8 15.8 13.6
0.01 50.9 37.2 24.3 13.8 11.2

1Significance level for the LRT.

2Power of the QTL detection for significance level α.

3RMSE of the estimated QTL location in a population of 300 progeny with a true QTL located at 0.1 M on a 0.4 M linkage group.

4Proportion of QTL located at a marker location (2 markers located at 0 M and 0.4 M).


[TableWrap ID: T7] Table 7 

RMSE of the QTL location before or after adjustment


True QTL location (M)

0.1 0.2
Before1 28.6 23.6
After2 26.6 21.3

Results based on 5000 simulations per case in a backcross population of 100 progeny. There were 6 markers equally distributed on 1 M.

1RMSE for the estimated QTL location.

2RMSE for the estimated QTL location after adjusting the estimated QTL location by the proposed algorithm.


[TableWrap ID: T8]  
pkt Probability Mk
(1-θt)(1-θT-t)/(1-θT) (1-θT)/2 M1M1M2M2

(1-θt)θT-tT θT/2 M1M1M2m2

θT-tθtT (1-θT)/2 M1m1M2m2

θt(1-θT-t)/θT θT/2 M1m1M2M2

[TableWrap ID: T9]  
xk (t) Probability Mk
(1-θt-θT-t)/(1-θT) (1-θT)/2 M1M1M2M2

(θT-t-θt)/θT θT/2 M1M1M2m2

t+θT-t-1)/(1-θT) (1-θT)/2 M1m1M2m2

(θt-θT-t)/θT θT/2 M1m1M2M2


Article Categories:
  • Methodology Article

Keywords: QTL, linkage analysis, QTL location, bias.

Previous Document:  Independent and Incremental Value of Severely Enlarged Left Atrium in Risk Stratification of Very El...
Next Document:  The myopic vitreous. Why is its examination so difficult?