FlexQTL™ - Quantitative Genetics on Pedigreed Populations
Introduction
Genetic linkage is the dependent cosegregation of genes at different loci on the same chromosome. Linkage detection and linkage analysis on the basis of data observed on related individuals require the computation of multilocus probabilities of observed phenotypic data on pedigree structures.
Relatives share common ancestors. A single gene in such an ancestor may therefore descend via repeated segregations to each of the relatives. Such genes, which are copies of a single ancestral gene within a defined pedigree, are said to be identical by descent (IBD). An implicit assumption in linkage analysis is that a trait is genetically determined. That is, individuals of similar phenotype have higher probabilities of sharing genes IBD at trait loci and hence at linked marker loci.
The probabilities of patterns of gene identity by descent are determined by the pedigree structure, and in turn determine the probability distribution of observed data on individuals of the pedigree.
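To illustrate how a pedigree alone determines IBD probabilities, the following minimal sketch computes the classic kinship coefficient (the probability that two alleles, one drawn at random from each of two individuals, are IBD) by recursion over the pedigree. It is not the method used inside FlexQTL, which conditions on marker data as well; the toy pedigree and the names `PEDIGREE` and `kinship` are hypothetical, and founders are assumed to be unrelated and non-inbred.

```python
from functools import lru_cache

# Hypothetical toy pedigree: id -> (father, mother); founders have (None, None).
# Parents are listed before their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of a full-sib mating
}
ORDER = {name: rank for rank, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that a random allele of i and a random allele of j are IBD."""
    if i is None or j is None:
        return 0.0  # assumption: unknown (founder) parents are unrelated
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always recurse through the parents of the younger individual.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


print(kinship("C", "D"))            # kinship of the full sibs C and D
print(2.0 * kinship("E", "E") - 1)  # inbreeding coefficient of E
```

The inbreeding coefficient of an individual equals the kinship between its parents, which is why the second line recovers it from the self-kinship of E.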
The software package FlexQTL™ is based on Bayesian theory and is implemented via Markov chain Monte Carlo simulation. This software may be used for
- QTL mapping
  - To estimate the number and locations of QTL given any pedigree and a marker linkage map.
- IBD probabilities
  - To estimate the IBD probability, conditional on known pedigree and marker data.
The Bayesian statistical approach may be represented by the following scheme:
- Population Structures
- Recombination Fraction
- Genetic Models
- Nuisance Variables
- Missing Values
- Multiple Traits
How to use FlexQTL™? Go to input files
Marco Bink
BIOMETRIS, Wageningen UR
PO Box 100, 6700 AC Wageningen, The Netherlands
www.biometris.nl wwww.flexqtl.nl
Version 0.98, 10 AUG 2004
This software greatly benefitted from original work by:
Pekka Uimari, University of Helsinki, Finland
Publication: Genet Epidem 21:224-242 (2001)
Population Structure
The data set is assumed to contain a fully specified pedigree structure. That implies that for every individual either zero or both parents are known (and that these parents precede the individual in the data file). The software was originally developed for human genetics, assuming outbred populations with little or no inbreeding or pedigree loops. Currently, the software is designed primarily to analyse pedigreed populations in animals and plants, allowing high degrees of inbreeding in outbred populations. Furthermore, the software has been made flexible enough to analyse fully inbred plant populations. Note that FlexQTL™ can also analyse data from very simple pedigree designs such as BackCross, F2 and Cross-Pollinator Full Sib populations, or combinations thereof.
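The two pedigree requirements stated above (zero or both parents known, and parents listed before their offspring) can be checked before an analysis is started. The sketch below is only an illustration: the record layout `(id, father, mother)` with `None` for an unknown parent and the function name `check_pedigree` are assumptions, not the FlexQTL data file format.

```python
def check_pedigree(records):
    """Return a list of problems found in an ordered pedigree.

    records: list of (id, father, mother) tuples in file order,
             with None marking an unknown parent (hypothetical layout).
    """
    seen = set()
    problems = []
    for ind, father, mother in records:
        known = [p for p in (father, mother) if p is not None]
        # Rule 1: either zero or both parents must be known.
        if len(known) == 1:
            problems.append(f"{ind}: only one parent known")
        # Rule 2: every known parent must precede the individual.
        for parent in known:
            if parent not in seen:
                problems.append(f"{ind}: parent {parent} does not precede it")
        seen.add(ind)
    return problems


pedigree = [
    ("P1", None, None),
    ("P2", None, None),
    ("F1", "P1", "P2"),
    ("S1", "F1", "F1"),  # selfing: both parents known and identical
]
print(check_pedigree(pedigree))
```

A selfed individual passes the check because both parent fields are filled in, even though they name the same individual.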

Recombination Fraction
Recombination events are assumed to occur independently, i.e., the Haldane mapping function is applied to transform genetic distances between markers and QTLs into recombination fractions. In the case of fully selfed or inbred individuals, such as Recombinant Inbred Lines (RIL), FlexQTL™ assumes that these individuals are fully selfed via single seed descent, since the process of selfing has implications for the expected recombination frequencies in these inbred populations.
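As a small worked example, the Haldane mapping function converts a map distance d (in Morgans) into a recombination fraction r = 0.5 (1 - exp(-2d)). For lines fully inbred by selfing, the textbook result of Haldane and Waddington gives the expected proportion of recombinant lines as R = 2r / (1 + 2r). The sketch below assumes distances are given in centiMorgans; whether FlexQTL applies exactly this RIL formula internally is an assumption, and the function names are hypothetical.

```python
import math


def haldane_r(distance_cm):
    """Recombination fraction for a map distance in centiMorgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def ril_selfing_fraction(r):
    """Expected recombinant proportion among RILs developed by selfing
    (Haldane & Waddington), given the single-meiosis fraction r."""
    return 2.0 * r / (1.0 + 2.0 * r)


for d in (1.0, 10.0, 50.0):
    r = haldane_r(d)
    print(f"{d:5.1f} cM  r = {r:.4f}  R(RIL) = {ril_selfing_fraction(r):.4f}")
```

Because several rounds of meiosis occur before fixation, R exceeds r for linked loci, while both remain bounded by 0.5.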
Genetic Models
FlexQTL™ assumes that QTLs are diallelic, allowing FlexQTL™ to estimate additive, dominance and imprinting effects of each QTL. In the case of multiple founder populations, the QTL may have different allelic frequencies in each population. In that case the user has to assign founders to different base populations (first column of the data file). The genetic model in FlexQTL™ can easily be extended to allow for Major Genes (unlinked QTL) and polygenes. The Major Genes have similar assumptions as the QTL; for the polygenes a variance component is estimated.
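For a diallelic QTL, additive, dominance and imprinting effects can be combined into one genotypic value. The sketch below uses one common parametrisation, which is an assumption and not necessarily the one used in FlexQTL: alleles are coded 0 or 1, the two homozygotes take the values -a and +a, heterozygotes take d, and an imprinting effect i separates the two heterozygotes according to the parental origin of allele 1. The function name `genotypic_value` is hypothetical.

```python
def genotypic_value(paternal, maternal, a, d=0.0, i=0.0):
    """Genotypic value of a diallelic QTL (alleles coded 0 or 1).

    a: additive effect, d: dominance effect, i: imprinting effect.
    Illustrative parametrisation only.
    """
    additive = a * (paternal + maternal - 1)        # -a, 0 or +a
    dominance = d if paternal != maternal else 0.0  # heterozygotes only
    imprinting = i * (paternal - maternal)          # +i or -i in heterozygotes
    return additive + dominance + imprinting


for pat, mat in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((pat, mat), genotypic_value(pat, mat, a=2.0, d=0.5, i=0.3))
```

With i = 0 the two heterozygotes are indistinguishable and the model reduces to the usual additive and dominance effects.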
Nuisance Variables
Next to multiple genetic sources of variation, the software also allows non-genetic or so-called nuisance variables to be included. By default, FlexQTL™ always includes an overall mean for a quantitative trait (it is therefore not represented by a column in the data file). Other nuisance variables, such as treatments, years, environments or locations, may be included in the model. In that case, these nuisance variables need to be specified in the data file. Note that at this moment it is not possible to include interactions between the different components, so Genotype by Environment interactions cannot be estimated directly. The nuisance variables may be included with different priors, i.e., assuming a Uniform prior (the Frequentist 'fixed' effect) or assuming a Normal prior ('random' effect). In the latter case, the variance of this Normal prior is estimated as well during the analysis.
Missing Values
The analysis of real-life data sets will likely imply that some records are missing for trait phenotypes and/or marker genotypes. FlexQTL™ can easily accommodate these situations since it is based on Bayesian analysis (thereby naturally treating uncertainties) and Markov chain Monte Carlo simulation. In the case of bivariate trait analysis with one phenotype missing, the trait value is imputed conditionally on the other (observed) trait value and the correlation structures among the two traits. In the near future this may be extended to situations with more than two traits. In the case of missing marker genotypes, FlexQTL™ performs a pre-processing imputation step to infer a putative unique genotype for an unscored individual conditional on its direct neighborhood (its neighbors being the two parents and all offspring and mates). Imputation of marker genotypes only occurs when exactly one genotype remains that is consistent with the pedigree and the observed markers. NOTE that this approach is different from the packages LOKI (and SIMWALK), which fully augment the missing marker genotypes. Our rationale is that the imputation step (employed by LOKI) utilizes information from flanking marker loci that have been genotyped (HORIZONTAL INFO) and from the relatives of an individual (VERTICAL INFO). The majority of the vertical information is utilized in FlexQTL™ during the pre-processing step, whereas the horizontal information is utilized directly in FlexQTL™, since the software screens for the nearest informative loci when gathering recombination information for the sampling distributions for the positions or genotypes of the QTLs.
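The conditional imputation of a missing phenotype in a bivariate analysis can be illustrated with the standard conditional distribution of a bivariate Normal. The sketch below is a simplification under stated assumptions: it collapses the correlation structure into a single correlation `rho`, whereas in FlexQTL the correlation is built up from several model components; all function and parameter names are hypothetical.

```python
import math
import random


def conditional_moments(y_obs, mu_miss, mu_obs, sd_miss, sd_obs, rho):
    """Mean and standard deviation of the missing trait given the observed one,
    assuming the two traits are bivariate Normal with correlation rho."""
    mean = mu_miss + rho * (sd_miss / sd_obs) * (y_obs - mu_obs)
    sd = sd_miss * math.sqrt(1.0 - rho ** 2)
    return mean, sd


def impute_missing_trait(y_obs, mu_miss, mu_obs, sd_miss, sd_obs, rho, rng):
    """Draw one imputed value, as would be done in one MCMC cycle."""
    mean, sd = conditional_moments(y_obs, mu_miss, mu_obs, sd_miss, sd_obs, rho)
    return rng.gauss(mean, sd)


rng = random.Random(2004)  # fixed seed so the sketch is reproducible
mean, sd = conditional_moments(12.0, 5.0, 10.0, 2.0, 4.0, 0.5)
print(mean, sd)
print(impute_missing_trait(12.0, 5.0, 10.0, 2.0, 4.0, 0.5, rng))
```

In a sampler the draw would be repeated in every cycle with the current parameter values, so that the uncertainty about the missing value is carried through to the final inferences.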
Multiple Traits
FlexQTL™ can analyse any number of traits simultaneously, providing the opportunity to explore pleiotropic behaviour of QTL and to estimate correlations of polygenic and residual components of the quantitative traits. Caution is necessary with respect to the completeness of the trait values in multiple-trait analyses, since strongly unbalanced data may distort the Markov chain simulation process and may result in spurious inferences on correlations.