# **MODELS AND ESTIMATION OF GENETIC EFFECTS**

**Topic Editors José M. Álvarez-Castro and Rong-Cai Yang**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2015 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-444-5 **DOI** 10.3389/978-2-88919-444-5

Topic Editors:

**José M. Álvarez-Castro,** Universidade de Santiago de Compostela, Spain **Rong-Cai Yang,** University of Alberta, Canada

Authors: Ernesto González Torterolo and Ignacio Castro (Náchok)

Ronald Fisher needed to develop elaborate models of genetic effects in order to set the foundations of Quantitative Genetics in his 1918 paper "The correlation between relatives on the supposition of Mendelian inheritance". Since then, many significant advances have been made in modeling genetic effects. However, almost a century after Fisher's kick-off, models of genetic effects are still being discussed and developed. Indeed, the relatively recent advent of QTL analyses challenged the state of the art of this field by giving researchers the opportunity to obtain and analyze estimates of genetic effects from real data. In this context, the development of the field has not been free from controversy, such as the debate about the merits of the functional and the statistical approaches to epistasis. This Research Topic is meant to present recent developments in models and estimation of genetic effects and to enrich the discussion about how and why models of genetic effects must be further developed and applied.

The articles in this Research Topic shall thus extend, refine and/or provide a fresh look at Fisher's original models of genetic effects and their application to the estimation of genetic effects, and improve our understanding of evolutionary processes and breeding programs.

## One century later: dissecting genetic effects for looking over old paradigms

#### *José M. Álvarez-Castro<sup>1</sup>\* and Rong-Cai Yang<sup>2</sup>*

*<sup>1</sup> Department of Genetics, Universidade de Santiago de Compostela, Lugo, Spain*

*<sup>2</sup> Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada*

*\*Correspondence: jose.alvarez.castro@usc.es*

#### *Edited and reviewed by:*

*Samuel A. Cushman, United States Forest Service Rocky Mountain Research Station, USA*

**Keywords: genetic effects, mathematical modeling, statistical estimation, genetic architecture, environmental effects**

The foundation of genetics as a scientific field at the beginning of the twentieth century was not free from controversy. It brought no resolution that the advocates of the Biometric and the Mendelian schools agreed on one thing: the inheritance laws Mendel inferred by studying meristic (discrete) traits did not seem to be compatible with the findings the biometricians had been reporting for continuous (quantitative) variation since the nineteenth century (see Provine, 1971). To provide conclusive evidence against that paradigm, Fisher (1918) developed the foundations of the mathematical models of genetic effects that remain pertinent today, an endeavor in which he developed statistical tools that soon became broadly used beyond genetics.

Genetic effects were at the core of that theory, but they initially entered its expressions as parameters that were neither to be estimated nor to take any defined numerical values. The most parsimonious hypothesis about genetic effects at that time proposed that the genetic basis of quantitative traits is dominated by the effects of large numbers of genes at which allele substitutions have very small (infinitesimal) and independent (additive) effects on phenotype. This was eventually called the infinitesimal model (see e.g., Bulmer, 1980). Despite the accumulation of evidence suggesting more complex genetic architectures (e.g., Dobzhansky, 1970), the infinitesimal model proved to be a useful paradigm to guide research in practical quantitative genetics.

Now that mapping genetic architectures has moved out of the domain of pure fiction (see e.g., Rifkin, 2012), new possibilities for reassessing the adequacy of the infinitesimal model not only reawaken our thirst for knowledge but shall also enable a leap in applicability. It is thus not surprising to witness an increased research effort in updating mathematical and statistical tools for analysing genetic effects, aiming to typify all possible kinds of genetic architectures and their evolutionary implications. We feel grateful for having been able to gather a stimulating account of that update within the current Frontiers Research Topic Issue on Models and Estimation of Genetic Effects.

In the first work in this volume, Gjuvsland et al. (2013) analyse epistasis in genetic networks by focusing on monotonicity as a (correlated) alternative to additivity. Their approach further illustrates that population-referenced (statistical) and non-population-referenced (physiological, functional) genetic parameters are complementary tools in quantitative genetics analyses. The next work, by Le Rouzic (2014), stresses that the evolutionary implications of epistasis are conditioned on whether the interactions follow patterns. He uses the multilinear model to provide practical tools for the detection of such patterns (particularly, directionality) in real data, as well as conceptual keys for aiding the interpretation of the results.

We then move to imprinting, through a work by Álvarez-Castro (2014), who extends the NOIA model to account for that phenomenon and discusses the mathematical properties of the resulting theory in comparison with previous models of imprinting. Further, general procedures for advanced implementation of models of genetic effects are presented in that work. NOIA is also used by Álvarez-Castro and Yang (2012) in the next communication for clarifying the interpretation of the genetic effects defined as average excesses by Ronald Fisher. The interest raised by the publication of that work in Frontiers in Genetics actually triggered the current Research Topic Issue.

A group of papers follows that explicitly account for the environment. Yang (2014) analyses experimental datasets with non-linear functions and addresses some common constraints of linear models applied to gene by environment interactions. He shows that even under largely linear genotypic responses, strong gene by environment interactions occur because of differences in positions and effects of quantitative trait loci (QTL) between poor and good environments. Marigorta and Gibson (2014) perform simulation studies to tackle the particularities of human genome wide association (GWA) studies. They show that for a wide range of scenarios, cumulative risk of alleles is highly significant despite the lack of evidence for gene by environment interactions, and that increased phenotypic variance after environmental perturbation lowers the statistical power to detect risk alleles in mixed cohorts. The environment of one species may be conditioned by the genome of another, as in the following study by Kodaman et al. (2014) on host-pathogen interactions. They illustrate how pathogens and their human hosts have interacted and coevolved to reduce antagonism, and they recommend incorporating such information into genetic models to account for the heterogeneity of disease pathology and to avoid dubious conclusions about disease etiology.

The last two communications offer new insights into statistical issues commonly encountered in QTL mapping and GWA studies. Loredo-Osti (2014) provides a bootstrapping procedure to estimate the *p*-values under the mixed-model framework that is applied to QTL mapping when the mapping population consists of recombinant congenic strains, which overcomes a problem concerning the Type I error that had been pointed out in previous approaches. To conclude our compilation, Dai et al. (2014) address the classic issue of multiple hypothesis tests in the current era of high throughput genomics. They advocate a new (modified Lancaster) procedure that improves the control of the Type I error as compared to the Fisher's combination test as well as to the original Lancaster procedure, whilst maintaining statistical power to detect signals related to biomarkers in pathways.

We also find it worth noting that a couple of interesting works addressing genetic effects have been released during the preparation of this editorial. Wang (2014) provides new developments leading to the same genetic variance decomposition of multiallelic loci under departures from the Hardy-Weinberg proportions that we obtained using NOIA (Álvarez-Castro and Yang, 2011; incidentally, we hereby thank Dr. Wang for pointing out a misprint in one of the values of the applied case we provided in our paper). Varona et al. (2014) also use NOIA for dissecting genetic covariances between individuals in the context of genomic selection. Although this kind of analysis was originally developed under the paradigm of the infinitesimal model, and was specifically designed to account for any putative infinitesimal additive genetic signal, it is encouraging that it effectively utilizes innovative models of genetic effects. Finally, we commend the forthcoming publication of a volume devoted to a specific (and important) instance of genetic effects, "Epistasis. Methods and Protocols", edited by Jason H Moore and Scott M Williams, which can be viewed as a new instalment of the already classical "Epistasis and the Evolutionary Process" (Wolf et al., 2000) and whose author list overlaps with that of this Frontiers Research Topic Issue on Models of Genetic Effects.

We hope the papers in this volume provide a useful compendium of theoretical and statistical developments, data analyses, simulation studies, conceptual contributions and discussion that collectively advance knowledge of genetic architectures and environmental interactions, and their broad implications in evolutionary and population genetics. To better contextualize the significance of this volume, we recall that the recent Frontiers Specialty Grand Challenge Article of Evolutionary and Population Genetics identifies the integration of genomics, modeling and experimentation as both the most critical challenge and the most exciting opportunity in advancing our field (Cushman, 2014). We feel that the papers presented in this volume, by showing strong linkages and synergies among modeling, experimentation, genomics and bioinformatics, demonstrate the importance of this kind of integrative research. Updating models of genetic effects is critical to take advantage of the stunning burst of molecular techniques and computing capabilities we are witnessing. Obtaining more general formulations of those models shall enable us to characterize genetic architectures more efficiently and to formulate hypotheses that could better guide experimental and simulation studies. Ultimately, evolutionary and population genetics benefits from the integration of different perspectives, methodologies and scopes of research within it, which in turn accelerates its integration into a fully-fledged science of evolutionary quantitative genetics.

#### **ACKNOWLEDGMENTS**

José M. Álvarez-Castro has been supported by the Autonomous Administration Xunta de Galicia through project EM2014/024 to edit this Research Topic Issue. The authors thank Specialty Chief Editor Samuel A. Cushman for helpful comments.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 October 2014; accepted: 27 October 2014; published online: 12 November 2014.*

*Citation: Álvarez-Castro JM and Yang R-C (2014) One century later: dissecting genetic effects for looking over old paradigms. Front. Genet. 5:396. doi: 10.3389/fgene.2014.00396*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Álvarez-Castro and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Monotonicity is a key feature of genotype-phenotype maps

#### *Arne B. Gjuvsland<sup>1</sup>\*, Yunpeng Wang<sup>2</sup>, Erik Plahte<sup>1</sup> and Stig W. Omholt<sup>2,3</sup>*

*<sup>1</sup> Centre for Integrative Genetics (CIGENE), Department of Mathematical Sciences and Technology, Norwegian University of Life Sciences, Ås, Norway*

*<sup>2</sup> Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås, Norway*

*<sup>3</sup> Department of Biology, Centre for Biodiversity Dynamics, NTNU Norwegian University of Science and Technology, Trondheim, Norway*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Arnaud Le Rouzic, Centre National de la Recherche Scientifique, France*

*Ovidiu Dan Iancu, Oregon Health & Science University, USA*

#### *\*Correspondence:*

*Arne B. Gjuvsland, Centre for Integrative Genetics (CIGENE), Department of Mathematical Sciences and Technology, Norwegian University of Life Sciences, PO Box 5003, N-1432 Ås, Norway e-mail: arne.gjuvsland@umb.no*

It was recently shown that monotone gene action, i.e., order-preservation between allele content and corresponding genotypic values in the mapping from genotypes to phenotypes, is a prerequisite for achieving a predictable parent-offspring relationship across the whole allele frequency spectrum. Here we test the consequential prediction that the design principles underlying gene regulatory networks are likely to generate highly monotone genotype-phenotype maps. To this end we present two measures of the monotonicity of a genotype-phenotype map, one based on allele substitution effects, and the other based on isotonic regression. We apply these measures to genotype-phenotype maps emerging from simulations of 1881 different 3-gene regulatory networks. We confirm that in general, genotype-phenotype maps are indeed highly monotonic across network types. However, regulatory motifs involving incoherent feedforward or positive feedback, as well as pleiotropy in the mapping between genotypes and gene regulatory parameters, are clearly predisposed for generating non-monotonicity. We present analytical results confirming these deep connections between molecular regulatory architecture and monotonicity properties of the genotype-phenotype map. These connections seem to be beyond reach by the classical distinction between additive and non-additive gene action.

**Keywords: genotype-phenotype map, gene regulatory networks, epistasis, variance component analysis, genetic modeling, systems genetics, genetic variance, monotonicity**

### **INTRODUCTION**

Quantitative genetics is the major theoretical foundation for genetic studies in production biology, evolutionary biology, and biomedicine. A core concept in quantitative genetics is the genotypic value, the mean observed phenotype for a given genotype. It constitutes the basis for the genotype-to-phenotype (GP) map concept. The shape of a given GP map is typically described by the classical gene action terms: additivity, dominance, and epistasis. Together with genotype frequencies in a given population, the GP map is the basis for decomposing observed phenotypic variance into environmental variance and genetic variance components including additive variance, dominance variance and epistatic variance. This provides the basis for a very successful theory when it comes to predicting selection response and breeding values (Falconer and Mackay, 1996; Lynch and Walsh, 1998) and more recent statistical genetics methods for mapping Quantitative Trait Loci (QTL) (Neale et al., 2008). Quantitative genetics thus provides a mature machinery for predicting the population level consequences of a given GP map, but in order to understand several generic genetic phenomena there is a stated need for new tools for disclosing how the shape of the GP map is determined by underlying biology (Jaeger et al., 2012; Moore, 2012; Gjuvsland et al., 2013).

One such phenomenon is the resemblance between parents and offspring. An explanation in quantitative genetic terms is that the additive variance (*VA*) makes up a substantial part of the phenotypic (*VP*) and genetic variance (*VG*). Hill et al. (2008) showed that in populations with extreme allele frequencies, high *VA*/*VG* ratios will arise regardless of the shape of the GP map. However, for populations with intermediate allele frequencies a much wider range of *VA*/*VG* ratios is observed (Wang et al., 2013). In such populations, high *VA*/*VG* ratios cannot be fully accounted for without considering properties of the GP map. Gjuvsland et al. (2011) showed that a key feature of GP maps that give high ratios of additive to genotypic variance (*VA*/*VG*), is a monotone (or order-preserving) relation between gene content (the number of alleles of a given type) and phenotype. This led to the hypothesis that the regulatory circuitry of sexually reproducing organisms predominantly predisposes for highly monotone genotype-phenotype maps.

Here we address the above hypothesis by a two-step approach. First we provide methods and software tools for measuring monotonicity of generic GP maps (i.e., sets of genotypic values). Then we use these tools on the data generated by an extensive simulation study of a broad collection of gene regulatory network models. In these network models the steady state expression levels serve as phenotypes and genetic variation is introduced through parameters describing maximal production rates and the shape of the gene regulation function. Such *causally cohesive genotype-phenotype (cGP) models* [see Gjuvsland et al. (2013) and references therein] allow us to identify relationships between regulatory network architecture and properties of the resulting GP maps.

Our results confirm the prediction that the GP maps arising from a wide range of gene regulatory network motifs are in general highly monotone. In addition we show through numerical as well as mathematical analysis that regulatory motifs involving incoherent feed-forward or positive feedback stand out in their capacity to generate non-monotonicity. These relationships between molecular regulatory architecture and properties of the genotype-phenotype map, which are of substantial relevance to functional genomics in general, are beyond the reach of the standard distinction between additive and non-additive gene action.

Our approach can be applied to cGP models of a wide range of biological systems at any level of model complexity. It opens the way for a systematic study of the monotonicity properties of molecular regulatory structures underlying the whole spectrum of physiological regulation. This suggests that the concept of monotonicity of GP maps can be used to build theory about heredity phrased in terms of molecular mechanism, something that standard genetic concepts and approaches appear to be incapable of.

### **MODELS AND METHODS**

#### **BACKGROUND ON MONOTONICITY OF GP MAPS**

To ease understanding we provide a brief recapitulation of the concept of monotonicity (or order-preservation) in GP maps introduced in Gjuvsland et al. (2011). We consider a diploid genetic model with $N$ biallelic loci (alleles indexed 1 and 2) underlying a quantitative phenotype. A genotype at a single locus $k$ is denoted by $g_k \in \{11, 12, 22\}$. In the case of two loci $k$ and $l$ there are 9 possible genotypes $g_{kl} = g_k g_l \in \{1111, 1112, 1122, 1211, \ldots, 2212, 2222\}$. The general $N$-locus genotype space $\Gamma$ contains $3^N$ genotypes $g_1 g_2 \cdots g_N$ (in condensed notation $g_{1:N}$) constructed by concatenating single-locus genotypes, $\Gamma = \{g_1 g_2 \cdots g_N \mid g_k \in \{11, 12, 22\},\ k = 1, 2, \ldots, N\}$. For any locus $k$, *the genotypic background*, i.e., the allele composition of all loci *except* $k$, is $g_{(k)} = g_1 g_2 \ldots g_{k-1} g_{k+1} \ldots g_N = g_{1:k-1} g_{k+1:N}$. For example, if $N = 4$ then $g_{(2)} = 112212$ means that the genotypes of loci 1, 3, and 4 are 11, 22, and 12, respectively. We use the straightforward notation $g_1 g_2 \ldots g_{k-1}\,11\,g_{k+1} \ldots g_N = g_{1:k-1}\,11\,g_{k+1:N}$ to indicate a genotype where $g_k = 11$ while the background genotype is arbitrary. We will also use the compressed notation $11g_{(k)}$ (or generally $g_k g_{(k)}$).
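The notation above can be made concrete with a minimal sketch. It assumes genotypes are represented as tuples of single-locus strings; the helper names `genotype_space` and `background` are illustrative and not taken from the authors' software.

```python
from itertools import product

LOCUS_GENOTYPES = ("11", "12", "22")


def genotype_space(n_loci):
    """Return all 3**n_loci multilocus genotypes as tuples of single-locus genotypes."""
    return list(product(LOCUS_GENOTYPES, repeat=n_loci))


def background(genotype, k):
    """Genotypic background of locus k (1-based): the genotype with locus k left out."""
    return genotype[:k - 1] + genotype[k:]


space = genotype_space(2)
print(len(space))                                # 9 two-locus genotypes
print("".join(space[1]))                         # 1112
print(background(("11", "12", "22", "12"), 2))   # ('11', '22', '12')
```

The last line reproduces the example in the text: for $N = 4$, the background of locus 2 lists the genotypes at loci 1, 3, and 4.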

We use the 2-allele content (i.e., the number of 2-alleles) of genotypes to define a partial order on the genotype space $\Gamma$ (see **Figure 1**, left panel for an illustration). For a particular locus $k$ we order the three genotypes sharing the same background genotype $g_{1:k-1}g_{k+1:N}$ as follows,

$$g_{1:k-1}\,11\,g_{k+1:N} \prec g_{1:k-1}\,12\,g_{k+1:N} \prec g_{1:k-1}\,22\,g_{k+1:N} \tag{1}$$

We call this the *partial genotype order relative to locus k*, and it defines a strict partial order on the genotype space $\Gamma$.

A genotype-phenotype map is a mapping $G$ that assigns to each genotype $g$ in the genotype space $\Gamma$ a real-valued genotypic value $G(g)$ (the mean trait value for a given genotype). We define monotonicity of $G$ in terms of how it transforms the partial genotype order to the algebraic order of the genotypic values $G(g)$. Without loss of generality we assume that the allele indexes at each locus have been chosen such that $G(1111 \cdots 11)$ is the smallest of all homozygote genotypic values. We call a genotype-phenotype map $G$ *monotone or order-preserving with respect to locus k* if it preserves the partial genotype order relative to locus $k$, i.e., if,

$$G(g_{1:k-1}\,11\,g_{k+1:N}) \le G(g_{1:k-1}\,12\,g_{k+1:N}) \le G(g_{1:k-1}\,22\,g_{k+1:N}) \tag{2}$$

**FIGURE 1 | Examples of partial genotype order and genotype-phenotype maps. Left panel:** The allele content defines a partial order on genotype space. A two-locus example is shown. The plot at the top displays the genotype at locus 1 (x-axis) and locus 2 (color) vs. the total number of 2-alleles (y-axis) in the two-locus genotype. The resulting partial ordering of genotypes is shown below. **Right panel:** Each line plot shows the 9 genotypic values (y-axis) for a single GP map; the coding of genotypes is the same as in the left panel. GP maps that preserve the partial order of genotypes are called monotone. Examples shown are an intra- and interlocus additive map (A), a map showing partial dominance at both loci (PD), and duplicate dominant (DD) epistasis (see Table 1 in Phillips, 1998). GP maps that break the partial order of genotypes are called non-monotone; examples shown are pure overdominance at both loci (OD), additive-by-additive epistasis (A × A) and dominance-by-dominance epistasis (D × D). The rightmost plot shows a GP map that is monotone w.r.t. locus 1, but non-monotone w.r.t. locus 2.

for all genetic backgrounds of locus *k*. By allowing non-strict inequalities we include GP maps showing complete dominance and complete magnitude epistasis (Weinreich et al., 2005) in the class of order-preserving GP maps. Conversely we call a GP map *non-monotone* or *order-breaking with respect to locus k* if it does not preserve the partial genotype order relative to locus *k* for all backgrounds. **Figure 1** (right panel) shows classical dominance and epistasis patterns, categorized into monotone and non-monotone GP maps.
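The definition of monotonicity with respect to a locus can be checked directly by looping over all backgrounds. The sketch below is an illustration under stated assumptions, not the authors' software: a GP map is assumed to be a dictionary from genotype tuples to genotypic values, and the three example maps are hypothetical stand-ins for the additive, overdominant, and mixed patterns of Figure 1.

```python
from itertools import product

LOCUS_GENOTYPES = ("11", "12", "22")


def is_monotone(gp_map, n_loci, k):
    """True if G(..11..) <= G(..12..) <= G(..22..) at locus k (1-based)
    for every genotypic background of that locus."""
    for bg in product(LOCUS_GENOTYPES, repeat=n_loci - 1):
        values = [gp_map[bg[:k - 1] + (g,) + bg[k - 1:]] for g in LOCUS_GENOTYPES]
        if not (values[0] <= values[1] <= values[2]):
            return False
    return True


space = list(product(LOCUS_GENOTYPES, repeat=2))
# Additive map: the genotypic value is the total number of 2-alleles.
additive = {g: float(sum(s.count("2") for s in g)) for g in space}
# Pure overdominance: only heterozygous loci raise the genotypic value.
overdominant = {g: float(sum(s == "12" for s in g)) for g in space}
# Additive at locus 1, overdominant at locus 2.
mixed = {g: float(g[0].count("2") + (g[1] == "12")) for g in space}

print(is_monotone(additive, 2, 1), is_monotone(overdominant, 2, 1))  # True False
print(is_monotone(mixed, 2, 1), is_monotone(mixed, 2, 2))            # True False
```

Because the inequalities are non-strict, maps with complete dominance (equal values for two adjacent genotypes) pass the check, in line with the text.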

#### **STATISTICAL DECOMPOSITION OF GENOTYPE-PHENOTYPE MAPS**

Given a genotype-phenotype map $G$ as described above and a corresponding vector of genotype frequencies $f$ in a population, quantitative genetics provides methods for orthogonal decomposition of genotypic values and the resulting genetic variance in the population into additive and non-additive (dominance and epistasis) components (Lynch and Walsh, 1998). We performed such statistical decomposition with the function linearGPmapanalysis in the R package noia (http://cran.r-project.org/package=noia; Le Rouzic and Alvarez-Castro, 2008) version 0.94.1. We assumed an idealized population where all genotype frequencies are equal ($1/3^N$). In such a hypothetical population the NOIA (Alvarez-Castro and Carlborg, 2007) statistical and functional formulations and the unweighted regression model proposed by Cheverud and Routman (1995) are equivalent. Furthermore, the decomposition of genotypic values is equivalent to decomposing $G$ into a sum of additive and non-additive GP maps, and the genetic variance in this case is simply the variance of the $3^N$ genotypic values in $G$. We used the NOIA statistical formulation to decompose a GP map $G$ into its additive and non-additive components, and computed the ratio of additive to total genetic variance $V_A/V_G$ as a measure of how well the additive component describes the original GP map. In the case of the illustrative GP maps depicted in **Figure 1**, this gives $V_A/V_G = 1$ for the fully additive GP map A, and $V_A/V_G = 0$ for the pure overdominance (OD) and the pure epistasis (Cheverud and Routman, 1996) maps A × A and D × D.
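For the equal-frequency case, the ratio $V_A/V_G$ can be sketched without the noia package. The code below is not the package's algorithm; it is an unweighted-regression illustration that assumes equal genotype frequencies, under which the centred 2-allele contents of different loci are uncorrelated and each has variance 2/3, so the additive variance is the sum of single-locus regression variances. The function name `additive_ratio` and the example maps are hypothetical.

```python
from itertools import product

LOCUS_GENOTYPES = ("11", "12", "22")


def additive_ratio(gp_map, n_loci):
    """V_A / V_G for a GP map when all 3**n_loci genotypes are equally frequent."""
    space = list(product(LOCUS_GENOTYPES, repeat=n_loci))
    n = len(space)
    mean_g = sum(gp_map[g] for g in space) / n
    v_g = sum((gp_map[g] - mean_g) ** 2 for g in space) / n  # must be non-zero
    v_a = 0.0
    for k in range(n_loci):
        # Centred 2-allele content at locus k: -1, 0 or 1 (the mean content is 1).
        x = [g[k].count("2") - 1 for g in space]
        cov = sum(xi * (gp_map[g] - mean_g) for xi, g in zip(x, space)) / n
        v_a += cov ** 2 / (2.0 / 3.0)  # var(x) = 2/3 under equal frequencies
    return v_a / v_g


space = list(product(LOCUS_GENOTYPES, repeat=2))
additive = {g: float(sum(s.count("2") for s in g)) for g in space}
overdominant = {g: float(sum(s == "12" for s in g)) for g in space}
a_by_a = {g: float((g[0].count("2") - 1) * (g[1].count("2") - 1)) for g in space}

for name, gp in (("A", additive), ("OD", overdominant), ("AxA", a_by_a)):
    print(name, round(additive_ratio(gp, 2), 6))
```

For these three maps the ratio comes out at 1 for the additive map and 0 for the overdominance and additive-by-additive maps, matching the values quoted in the text.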

#### **GENE REGULATORY NETWORK MODELS**

Gene expression in eukaryotes is controlled through gene regulatory networks involving numerous regulatory mechanisms [see e.g., Latchman (2005), for details]. Modeling of such gene regulatory networks is well-established, and available modeling frameworks range from coarse-grained descriptions of the topology of genome-wide networks to very detailed mechanistic models describing the dynamics of small networks (De Jong, 2002; Schlitt and Brazma, 2007; Karlebach and Shamir, 2008). In line with a large number of authors we used ordinary differential equations (ODEs) to study a family of generic gene regulatory network models containing three diploid genes $X_1$, $X_2$, and $X_3$, organized as a regulatory system where the rate of expression of each gene can be regulated by the expression level of one or both of the other genes. The wiring of the system is described by a 3 × 3 connectivity matrix $A$ with elements $A_{kl} \in \{-1, 0, 1\}$. The signs of the elements of $A$ describe the mode of regulation: $A_{kl} = 0$ indicates that $X_l$ is not a regulator of $X_k$; if $A_{kl} = 1$ then $X_l$ is an activator of $X_k$; and if $A_{kl} = -1$ then $X_l$ is a repressor of $X_k$. Gene regulatory systems are often laid out visually as signed directed graphs. There is a one-to-one correspondence between a connectivity matrix and a signed directed graph; two examples are illustrated in **Figure 4**. We used the sigmoid formalism (Mestl et al., 1995; Plahte et al., 1998) in the diploid form (Omholt et al., 2000), where the expression of the two alleles of gene $k$ is described by the following ODEs,

$$
\begin{aligned}
\dot{x}_{k1} &= \alpha_{k1} R_{k1}(y_1, y_2, y_3) - \gamma_{k1} x_{k1}, \\
\dot{x}_{k2} &= \alpha_{k2} R_{k2}(y_1, y_2, y_3) - \gamma_{k2} x_{k2}, \\
y_k &= x_{k1} + x_{k2}, \quad k = 1, 2, 3.
\end{aligned} \tag{3}
$$

Here α*ki* is the maximal production rate for allele *i* of gene *Xk*, γ*ki* is the decay rate, while *Rki* is the gene regulation function (dose-response function). If *Xk* has no regulators, we assume production is always switched on, i.e., *Rki* = 1. If *Xk* has a single regulator *Xl*, the gene regulation function is given as *Rki*(*yl*) = *S*(*yl*, θ*lki*, *plki*), where *S*(*y*, θ, *p*) = *y<sup>p</sup>*/(*y<sup>p</sup>* + θ<sup>*p*</sup>) if *Xl* is an activator and *S*(*y*, θ, *p*) = 1 − *y<sup>p</sup>*/(*y<sup>p</sup>* + θ<sup>*p*</sup>) if it is a repressor. In both cases the parameter θ*lki* gives the amount of regulator needed to get 50% of maximal production rate, and *plki* determines the steepness of the response. In the case of two regulators *Xl* and *Xj* we set *Rki*(*yl*, *yj*) = *S*(*yl*, θ*lki*, *plki*)*S*(*yj*, θ*jki*, *pjki*), corresponding to the Boolean AND function. Modeling transcription regulation by means of Hill functions and Boolean composition has a long tradition in modeling of gene regulation and is widely used.
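To make Equation 3 concrete, the sketch below integrates one simple instance of the model: a chain of activators *X*1 → *X*2 → *X*3 in which *X*1 has no regulator. It is a minimal illustration and not the cgptoolbox implementation; the function names, the forward Euler scheme, the step size and all parameter values are assumptions made for this example. Only the decay rate of 10 is taken from the simulation setup in the text.

```python
def hill(y, theta, p, activator=True):
    """Sigmoid dose-response S(y, theta, p); a repressor uses 1 - S."""
    s = y ** p / (y ** p + theta ** p)
    return s if activator else 1.0 - s


def steady_state_chain(alpha, theta, p, gamma=10.0, dt=1e-3, n_steps=20000):
    """Euler-integrate the activator chain X1 -> X2 -> X3 to steady state.

    alpha[k][i] is the maximal production rate of allele i of gene k+1.
    theta[k][i] and p[k][i] describe how that allele responds to its single
    regulator (ignored for X1, which has no regulator, so R = 1).
    Returns the total expression levels [y1, y2, y3].
    """
    x = [[0.0, 0.0] for _ in range(3)]
    for _ in range(n_steps):
        y = [sum(alleles) for alleles in x]  # y_k = x_k1 + x_k2
        new_x = []
        for k in range(3):
            row = []
            for i in range(2):
                r = 1.0 if k == 0 else hill(y[k - 1], theta[k][i], p[k][i])
                row.append(x[k][i] + dt * (alpha[k][i] * r - gamma * x[k][i]))
            new_x.append(row)
        x = new_x
    return [sum(alleles) for alleles in x]


# Hypothetical parameter set for one diploid genotype.
alpha = [[100.0, 120.0], [150.0, 150.0], [200.0, 180.0]]
theta = [[1.0, 1.0], [20.0, 25.0], [30.0, 30.0]]
p = [[1.0, 1.0], [2.0, 4.0], [3.0, 3.0]]
y1, y2, y3 = steady_state_chain(alpha, theta, p)
print(y1, y2, y3)  # y3 would be recorded as the phenotype
```

For a gene without regulators the steady state is simply the summed production rates divided by the decay rate, which gives a quick sanity check on the integration.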

With three genes and up to two regulators per gene the number of possible connectivity matrices is 6859. We further required that the system is connected, and that *X*<sub>3</sub> is downstream of both *X*<sub>1</sub> and *X*<sub>2</sub>, so either *X*<sub>1</sub> and *X*<sub>2</sub> both regulate *X*<sub>3</sub> directly (*A*<sub>31</sub>*A*<sub>32</sub> ≠ 0), or one of them regulates *X*<sub>3</sub> directly and the other one indirectly (*A*<sub>31</sub>*A*<sub>12</sub> ≠ 0 or *A*<sub>32</sub>*A*<sub>21</sub> ≠ 0). This reduces the number of distinct connectivity matrices to 3724. Finally, we identified pairs of matrices that are symmetric with respect to interchanging *X*<sub>1</sub> and *X*<sub>2</sub> and picked just one matrix from each pair. The resulting 1881 connectivity matrices were used for our gene regulatory simulations.
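The first count can be checked by brute-force enumeration. The short Python sketch below covers only the restriction to at most two regulators per gene; the subsequent connectedness, downstream and symmetry filters are not reproduced here, and the variable names are chosen for this example.

```python
from itertools import product

# Row k of A lists the possible regulators of gene k; each entry is -1, 0 or 1
# and at most two entries per row may be non-zero.
allowed_rows = [row for row in product((-1, 0, 1), repeat=3)
                if sum(entry != 0 for entry in row) <= 2]

n_unrestricted = 3 ** 9                    # every entry chosen freely
n_max_two_regulators = len(allowed_rows) ** 3  # rows are chosen independently

print(len(allowed_rows), n_max_two_regulators, n_unrestricted)
```

Each row has 27 − 8 = 19 admissible sign patterns, and the three rows are independent, which is where the cube comes from.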

#### **IDENTIFYING FEEDBACK LOOPS AND FEEDFORWARD MOTIFS**

Feedback and feedforward motifs appear recurrently as regulatory building blocks in transcription networks across all living organisms. These network motifs have several characteristic features (Alon, 2007): negative feedback can, for example, accommodate fast transcriptional responses and homeostasis, while positive feedbacks are utilized as biological switches. We went through all 1881 gene regulatory models and extracted information about their feedback and feedforward loop characteristics from their connectivity matrices. For each system we computed three autoregulatory feedback loop products, *FL*<sub>1</sub> = *A*<sub>11</sub>, *FL*<sub>2</sub> = *A*<sub>22</sub>, *FL*<sub>3</sub> = *A*<sub>33</sub>; three two-gene feedback loop products, *FL*<sub>12</sub> = *A*<sub>21</sub>*A*<sub>12</sub>, *FL*<sub>13</sub> = *A*<sub>31</sub>*A*<sub>13</sub>, *FL*<sub>23</sub> = *A*<sub>23</sub>*A*<sub>32</sub>; and two three-gene feedback loop products, *FL*<sub>123</sub> = *A*<sub>32</sub>*A*<sub>21</sub>*A*<sub>13</sub>, *FL*<sub>213</sub> = *A*<sub>31</sub>*A*<sub>12</sub>*A*<sub>23</sub>. Non-zero loop products indicate that the system contains the corresponding feedback loop, and the sign of the loop product gives the sign of the feedback loop. We also computed the products for two feedforward motifs: *FFL*<sub>32</sub> = *A*<sub>32</sub>(*A*<sub>31</sub>*A*<sub>12</sub>) and *FFL*<sub>31</sub> = *A*<sub>31</sub>(*A*<sub>32</sub>*A*<sub>21</sub>). Again, non-zero products indicate that the system contains the corresponding feedforward motif; a positive value corresponds to a coherent feedforward while a negative value indicates incoherent feedforward. **Figure 4** depicts the connectivity matrix and the signed digraphs of a system with a positive feedback loop as well as a system with incoherent feedforward. Spreadsheet S1 contains adjacency matrices and loop products for all 1881 motifs.
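The loop products translate directly into code. The following Python sketch evaluates them for a connectivity matrix stored as a nested list; the function name and the two example matrices are hypothetical (they are not the matrices shown in **Figure 4**), chosen so that one contains a positive feedback loop and the other an incoherent feedforward motif.

```python
def loop_products(A):
    """Feedback (FL) and feedforward (FFL) products of a 3x3 connectivity matrix.

    A[k-1][l-1] is the effect of X_l on X_k: 1 activator, -1 repressor, 0 none.
    """
    def a(k, l):
        return A[k - 1][l - 1]  # 1-based access, matching the notation in the text

    return {
        "FL1": a(1, 1), "FL2": a(2, 2), "FL3": a(3, 3),
        "FL12": a(2, 1) * a(1, 2),
        "FL13": a(3, 1) * a(1, 3),
        "FL23": a(2, 3) * a(3, 2),
        "FL123": a(3, 2) * a(2, 1) * a(1, 3),
        "FL213": a(3, 1) * a(1, 2) * a(2, 3),
        "FFL32": a(3, 2) * (a(3, 1) * a(1, 2)),
        "FFL31": a(3, 1) * (a(3, 2) * a(2, 1)),
    }


# Hypothetical example: X1 and X2 activate each other, X1 activates X3.
positive_feedback = [[0, 1, 0],
                     [1, 0, 0],
                     [1, 0, 0]]

# Hypothetical example: X1 activates X2 and X3, X2 represses X3.
incoherent_feedforward = [[0, 0, 0],
                          [1, 0, 0],
                          [1, -1, 0]]

print(loop_products(positive_feedback)["FL12"])        # positive two-gene loop
print(loop_products(incoherent_feedforward)["FFL31"])  # negative: incoherent
```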

#### **GENE REGULATORY NETWORK SIMULATIONS**

The simulations were performed with the Python package cgptoolbox (http://github.com/jonovik/cgptoolbox), using the sigmoidmodel submodule, which contains an implementation of the gene regulatory network model (Equation 3) and the connectivity matrix *A*. A similar simulation setup is found in Gjuvsland et al. (2011) together with a discussion of gene regulation functions and the genotype-parameter map in molecular terms. We compared two different types of genotype-to-parameter maps: one without pleiotropy, where genetic variation at a locus affects only the maximal production rate, and one with pleiotropy, where genetic variation at a locus also affects the threshold and steepness of the dose-response curve.


All decay rates γ*ki* were set equal to 10. We assembled parameter sets for all 27 diploid genotypes, and for each genotypic parameter set the system of Equation 3 was integrated numerically until convergence to a stable state. The equilibrium value of *y*<sup>3</sup> was recorded as phenotype. Datasets where the system failed to converge for one or more genotypes were discarded. For each of the 1881 motifs we performed 1000 Monte Carlo simulations.

Some Monte Carlo simulations led to very little phenotypic variation, in the sense that the span between the largest and smallest of the 27 genotypic values was small. In order to avoid artifacts arising from the numeric ODE solver tolerance, these essentially flat GP maps were discarded. Further analysis of monotonicity and variance components was only performed on GP maps where the absolute range (maximum genotypic value − minimum genotypic value) and relative range (absolute range/mean genotypic value) were both > 0.01.
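The filtering rule for essentially flat GP maps can be written in a few lines. The Python sketch below is one possible reading of the criterion; the function name is hypothetical, and the early return for a very small absolute range is an added guard against division by a zero mean rather than part of the stated rule.

```python
def has_usable_variation(values, threshold=0.01):
    """True if both the absolute and the relative range exceed the threshold."""
    absolute_range = max(values) - min(values)
    if absolute_range <= threshold:
        return False  # also covers an all-zero map, where the mean is 0
    mean_value = sum(values) / len(values)
    relative_range = absolute_range / mean_value
    return relative_range > threshold


print(has_usable_variation([1.0, 2.0, 3.0]))      # clear variation
print(has_usable_variation([1.0, 1.005, 1.0]))    # absolute range too small
print(has_usable_variation([1000.0, 1000.5]))     # relative range too small
```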

### **RESULTS**

#### **MEASURING MONOTONICITY OF GP MAPS**

In the following we present two numerical measures for quantifying monotonicity in a GP map *G* with *N* biallelic loci. The first quantifies the monotonicity for individual loci by comparing negative and positive allele substitution effects before weighting the individual loci into an overall measure. The second utilizes isotonic regression to quantify the distance between *G* and the closest fully monotone GP map.

#### *Measure 1: quantifying non-monotonicity by substitution effects*

We first develop a measure of monotonicity based on the effects of substituting a single allele at locus *k,*

$$s^1(g^{(k)}) = G(g\_{1:k-1}\,22\,g\_{k+1:N}) - G(g\_{1:k-1}\,12\,g\_{k+1:N}), \tag{4}$$

$$s^2(g^{(k)}) = G(g\_{1:k-1}\,12\,g\_{k+1:N}) - G(g\_{1:k-1}\,11\,g\_{k+1:N}),$$

while keeping the background genotype *g*<sup>(*k*)</sup> = *g*<sub>1:*k*−1</sub>*g*<sub>*k*+1:*N*</sub> fixed. Monotonicity as defined by Equation 2 is equivalent to *s<sup>i</sup>*(*g*<sup>(*k*)</sup>) ≥ 0 for *i* = 1, 2 across all genetic backgrounds of locus *k*. By also taking into account the magnitude of the substitution effects we can quantify the deviation from strict monotonicity. We start with the set *S<sup>k</sup>* = {*s<sup>i</sup>*(*g*<sup>(*k*)</sup>)} of single allele substitution effects for locus *k* for *i* = 1, 2 and across all genotypic backgrounds *g*<sup>(*k*)</sup>. The total number of elements in *S<sup>k</sup>* thus becomes 2 · 3<sup>*N*−1</sup>, and we split the set into two disjoint subsets reflecting their sign: *S<sup>k</sup>*<sub>+</sub> = {*s<sup>i</sup>*(*g*<sup>(*k*)</sup>) ∈ *S<sup>k</sup>* | *s<sup>i</sup>*(*g*<sup>(*k*)</sup>) > 0} and *S<sup>k</sup>*<sub>−</sub> = {*s<sup>i</sup>*(*g*<sup>(*k*)</sup>) ∈ *S<sup>k</sup>* | *s<sup>i</sup>*(*g*<sup>(*k*)</sup>) < 0}. We compute the sum of positive substitution effects and the sum of absolute values of negative substitution effects,

$$P\_k = \sum\_{S\_+^k} s^i(g^{(k)}),\tag{5}$$

$$N\_k = \sum\_{S\_-^k} \left| s^i(g^{(k)}) \right|,$$

and let *Tk* = *Pk* + *Nk* denote the overall sum of absolute substitution effects. We then define the degree to which the GP map *G* is monotone with respect to locus *k* by,

$$m\_k = \frac{|P\_k - N\_k|}{T\_k} = \frac{\left| \sum\_{g^{(k)}} \left( s^1(g^{(k)}) + s^2(g^{(k)}) \right) \right|}{\sum\_{g^{(k)}} \left( |s^1(g^{(k)})| + |s^2(g^{(k)})| \right)}. \tag{6}$$

The absolute value in the numerator ensures that the measure *mk* is invariant with respect to the choice of indexes for the two alleles of locus *k*. Interchanging the numbering of the alleles leads to the mappings *s*<sup>1</sup>(*g*<sup>(*k*)</sup>) → −*s*<sup>2</sup>(*g*<sup>(*k*)</sup>), *s*<sup>2</sup>(*g*<sup>(*k*)</sup>) → −*s*<sup>1</sup>(*g*<sup>(*k*)</sup>), which leaves the value of *mk* unchanged. By the triangle inequality *mk* ≤ 1. If *mk* = 1, then *G* is monotonic with respect to locus *k*, whereas *mk* < 1 implies that *G* is order-breaking w.r.t. locus *k*. If *mk* = 0, then the positive substitution effects equal the negative substitution effects in magnitude and we say that *G* is completely order-breaking w.r.t. locus *k*. This measure distinguishes well between the monotone and non-monotone maps in **Figure 1**. Clearly *m*<sub>1</sub> = *m*<sub>2</sub> = 1 for the additive map (A) and GP maps showing partial dominance and duplicate dominance epistasis. In contrast, *m*<sub>1</sub> = *m*<sub>2</sub> = 0 for the maps showing pure OD and pure epistasis (A × A and D × D).

In order to quantify the overall monotonicity of the GP map *G* we introduce the *degree of monotonicity (m)* which is a weighted mean of all *mk*, where the weights reflect the relative effect size of the loci in terms of *Tk*,

$$m = \frac{\sum\_{k=1}^{N} m\_k T\_k}{\sum\_{k=1}^{N} T\_k}. \tag{7}$$

As shown in **Figure 3A**, the *degree of monotonicity* is accordingly 1 for the monotone maps in **Figure 1** while it is 0 for the pure OD and pure epistasis maps. This definition of *degree of monotonicity* allows us to establish a vocabulary that is analogous to the classification of single locus dominance; i.e., a GP map is called *monotone* if *m* = 1, *(partially) non-monotone* if *m* < 1 and *purely non-monotone* if *m* = 0.

For example, the degree of monotonicity of the GP map published by Cheverud and Routman (1995), with two loci underlying body weight (in grams) at 10 weeks in a mouse *F*<sub>2</sub> cross, may be computed as follows. After renaming the two loci (B → 1, A → 2) and indexing alleles to conform to our notation, the nine genotypic values [Table 1 in Cheverud and Routman (1995)] are *G*(1111) = 31.23, *G*(1112) = 34.13, *G*(1122) = 33.82, *G*(1211) = 34.89, *G*(1212) = 35.90, *G*(1222) = 36.53, *G*(2211) = 34.12, *G*(2212) = 37.95, and *G*(2222) = 36.84. From the line plot of this GP map (**Figure 2**, left panel) we find that the GP map is non-monotone with respect to both loci. Locus 1 shows marginal OD for the 11 genotype of locus 2 and locus 2 shows marginal OD for the 11 and 22 genotypes of locus 1. To compute the degree of monotonicity, we start with the set of single allele substitution effects for locus 1, *S*<sup>1</sup> = {3.66, −0.77, 1.77, 2.05, 2.71, 0.31}, and divide this into sets of negative effects *S*<sup>1</sup><sub>−</sub> = {−0.77} and positive effects *S*<sup>1</sup><sub>+</sub> = {3.66, 1.77, 2.05, 2.71, 0.31}. The sum *P*<sub>1</sub> of elements in *S*<sup>1</sup><sub>+</sub> is 10.50 and *N*<sub>1</sub>, the sum of absolute values of elements in *S*<sup>1</sup><sub>−</sub>, is 0.77, which gives *T*<sub>1</sub> = *P*<sub>1</sub> + *N*<sub>1</sub> = 11.27. From Equation 6 it follows that *m*<sub>1</sub> = 0.86. Similarly, the sets of substitution effects for locus 2 are *S*<sup>2</sup><sub>−</sub> = {−1.11, −0.31} and *S*<sup>2</sup><sub>+</sub> = {3.83, 0.63, 1.01, 2.90}. This gives *N*<sub>2</sub> = 1.42, *P*<sub>2</sub> = 8.37, *T*<sub>2</sub> = 9.79, and *m*<sub>2</sub> = 0.71. Inserting values for both loci into Equation 7, the degree of monotonicity (*m*) of this GP map is calculated to be 0.79. This value agrees well with the visual observation (**Figure 2**, left panel) that the map does not deviate substantially from a purely monotone map.
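The worked example can be reproduced with a short script. The Python sketch below implements Equations 4 to 7 for a GP map stored as a dictionary; it is an independent illustration rather than the gpmap package code, and the function name and the 0/1/2 coding of the genotypes 11/12/22 are assumptions made here. Loci without any variation (*Tk* = 0) are not handled.

```python
from itertools import product


def degree_of_monotonicity(G, n_loci):
    """Return (m, [m_1, ..., m_N]) following Equations 4-7.

    G maps tuples of per-locus genotype codes (0, 1, 2 for 11, 12, 22)
    to genotypic values.
    """
    m_k, t_k = [], []
    for k in range(n_loci):
        pos = neg = 0.0
        for background in product((0, 1, 2), repeat=n_loci - 1):
            for low in (0, 1):  # substitutions 11 -> 12 and 12 -> 22
                g_low = background[:k] + (low,) + background[k:]
                g_high = background[:k] + (low + 1,) + background[k:]
                effect = G[g_high] - G[g_low]
                if effect > 0:
                    pos += effect   # contributes to P_k
                else:
                    neg -= effect   # contributes to N_k (absolute value)
        total = pos + neg           # T_k
        m_k.append(abs(pos - neg) / total)
        t_k.append(total)
    m = sum(mk * tk for mk, tk in zip(m_k, t_k)) / sum(t_k)
    return m, m_k


# Genotypic values from the text; keys are (locus 1, locus 2) genotype codes.
mouse = {(0, 0): 31.23, (0, 1): 34.13, (0, 2): 33.82,
         (1, 0): 34.89, (1, 1): 35.90, (1, 2): 36.53,
         (2, 0): 34.12, (2, 1): 37.95, (2, 2): 36.84}

m, (m1, m2) = degree_of_monotonicity(mouse, 2)
print(round(m1, 2), round(m2, 2), round(m, 2))
```

Rounded to two decimals, the script returns the per-locus and overall values given in the worked example.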

For random GP maps [randomly sampled genotypic values as in Gjuvsland et al. (2011)] there is a strong positive correlation between the degree of monotonicity and the size of the additive component (*VA*/*VG*) (**Figure 3A**). A similar relationship was observed for three-locus random GP maps (**Figure A1A**). All GP maps in **Figure 3A** with *m* < 0.1 have *VA*/*VG* < 0.1. At the other end of the spectrum there is much more variation; for instance, the most extreme completely monotone map (the duplicate dominant factors, DD) has *VA*/*VG* as low as 0.375.

#### *Measure 2: quantifying monotonicity by isotonic regression*

This measure quantifies the monotonicity of a particular GP map *G* in terms of the least-squares distance to the closest monotone map. We build on the mathematical notation introduced in section "Background on monotonicity of GP maps", where the genotype space contains all genotypes for *N* biallelic loci and a GP map is a function that assigns a real-valued genotypic value *G*(*g*) to each genotype *g* in the genotype space.

**FIGURE 3 | Measures of monotonicity vs. additivity of GP maps.** Scatterplots showing *VA*/*VG* from unweighted regression vs. **(A)** degree of monotonicity (*m*) and **(B)** *R*<sup>2</sup><sub>mono</sub> from isotonic regression. Black dots correspond to the maps shown in **Figure 1** together with additive-by-dominance epistasis (A × D), a map with two loci showing complete dominance (CD) and two classical epistasis types from Table 1 in Phillips (1998): duplicate recessive genes (DR) and recessive epistasis (RE). Red dots show 1000 random two-locus GP maps, while blue dots show the same 1000 GP maps after rearranging genotypic values to introduce order-preservation for 1 locus [see Model and Methods in Gjuvsland et al. (2011)].

Genotype-phenotype map *G* for two loci underlying body weight at 10 weeks in a mouse *F*<sub>2</sub> cross. The GP map shown here is equivalent to the one in the original publication [see Figure

GP map *G* is decomposed with isotonic regression into a **(middle panel)** monotone component *GM* and a **(right panel)** non-monotone component *GN*.

For any particular GP map *G*, we identify the *monotone component* of *G* as the map *GM* which minimizes the residual variance var(*G* − *GM*), i.e., *GM* is the monotone GP map which is closest to *G* in the least-squares sense. For a given *G* the monotone component *GM* is unique (Barlow and Brunk, 1972) and can be computed numerically by isotonic regression (Leeuw et al., 2009) of *G* subject to the partial ordering of genotypes defined in Equation 1. Furthermore, the residual *GN* = *G* − *GM* is orthogonal to *GM* in the sense that the sum of *GM*(*g*)*GN*(*g*) over all genotypes *g* is zero. This allows the orthogonal decomposition,

$$G = G\_M + G\_N,\tag{8}$$

of a genotype-phenotype map into a *monotone component GM* and a *non-monotone component GN* such that var(*G*) = var(*GM*) + var(*GN*). The orthogonality property allows us to measure monotonicity of *G* in terms of the coefficient of determination *R*<sup>2</sup><sub>mono</sub> of the isotonic regression, given by the ratio *R*<sup>2</sup><sub>mono</sub> = var(*GM*)/var(*G*). In the case that *G* itself is monotone for all loci we have *R*<sup>2</sup><sub>mono</sub> = 1, while order-breaking for one or more loci will result in *R*<sup>2</sup><sub>mono</sub> < 1.

The isotonic regression approach can be illustrated in a straightforward way on the two-locus GP map provided by Cheverud and Routman (1995) (see text above and left panel of **Figure 2**). The partial ordering of genotypes defined by Equation 1 is illustrated in **Figure 1** (left panel). By isotonic regression (Leeuw et al., 2009) on this partial genotype ordering, the original GP map is decomposed into a monotone and a non-monotone component (**Figure 2**, middle and right panels), and the coefficient of determination (*R*<sup>2</sup><sub>mono</sub>) is 0.97.
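For a two-locus map, the decomposition can be sketched in plain Python. The code below does not use the isotone package or its algorithm. Instead it swaps in Dykstra's alternating projection scheme, applying the pool-adjacent-violators algorithm along rows and columns in turn. The function names are hypothetical, and the sketch assumes that genotypic values are to be non-decreasing in the number of copies of allele 2 at both loci, so it does not search over allele orientations.

```python
def pava(values):
    """Least-squares non-decreasing fit to a sequence (pool adjacent violators)."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def monotone_component(G, n_iter=200):
    """Closest map (least squares) that is non-decreasing along rows and columns."""
    n, m = len(G), len(G[0])
    x = [row[:] for row in G]
    p = [[0.0] * m for _ in range(n)]  # Dykstra correction for the row step
    q = [[0.0] * m for _ in range(n)]  # Dykstra correction for the column step
    for _ in range(n_iter):
        y = [pava([x[i][j] + p[i][j] for j in range(m)]) for i in range(n)]
        p = [[x[i][j] + p[i][j] - y[i][j] for j in range(m)] for i in range(n)]
        cols = [pava([y[i][j] + q[i][j] for i in range(n)]) for j in range(m)]
        x_new = [[cols[j][i] for j in range(m)] for i in range(n)]
        q = [[y[i][j] + q[i][j] - x_new[i][j] for j in range(m)] for i in range(n)]
        x = x_new
    return x


def variance(M):
    flat = [v for row in M for v in row]
    mean = sum(flat) / len(flat)
    return sum((v - mean) ** 2 for v in flat) / len(flat)


# Rows: locus 1 genotypes 11, 12, 22; columns: locus 2 genotypes 11, 12, 22.
mouse = [[31.23, 34.13, 33.82],
         [34.89, 35.90, 36.53],
         [34.12, 37.95, 36.84]]
G_M = monotone_component(mouse)
r2_mono = variance(G_M) / variance(mouse)
print(round(r2_mono, 2))
```

For this map the three order violations are resolved by pooling pairs of neighbouring genotypes, and the resulting ratio rounds to the reported value of 0.97.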

Our simulation results for random GP maps show that *R*<sup>2</sup><sub>mono</sub> is positively correlated with the size of the additive component (**Figure 3B** for two-locus GP maps and **Figure A1B** for three-locus GP maps) and that for a given *VA*/*VG* the lower bound for *R*<sup>2</sup><sub>mono</sub> is close to a straight line from (0, 0.2) to (1, 1). However, due to the search for the closest monotone GP map, *R*<sup>2</sup><sub>mono</sub> will not become zero even for purely overdominant or purely epistatic maps. As shown in **Figure A2**, the two monotonicity measures are highly correlated.

#### *An R package for studying monotonicity in GP maps*

We developed an R package gpmap for studying functional properties of GP maps. The package takes GP maps in the form of vectors of genotypic values as input, and provides functions for (i) determining whether the map is order-breaking or order-preserving w.r.t. any given locus, (ii) the degree of monotonicity *m*, (iii) *R*<sup>2</sup><sub>mono</sub> using isotonic regression from the isotone package (Leeuw et al., 2009), and (iv) plots of the original and decomposed GP maps. Code example 1 (**Box 1**) below illustrates the usage and functionality of the gpmap package. The package is available from CRAN (http://cran.r-project.org/package=gpmap) under GPLv3.

#### **MONOTONICITY IN GP MAPS ARISING FROM GENE REGULATORY NETWORKS**

To search for generic relationships between monotonicity and regulatory network structure, we used the above measures of monotonicity to characterize GP maps emerging from the gene regulatory network models (see Models and Methods). Based on earlier results (Gjuvsland et al., 2007, 2011; Wang et al., 2013) we hypothesized that incoherent feedforward (**Figure 4**, right panel) or positive feedback (**Figure 4**, left panel) would be necessary in order to obtain highly order-breaking GP maps, and we characterized all 1881 networks in terms of these two properties. **Table 1** shows the number of motifs falling into the resulting four categories. We summarized the number of Monte Carlo simulations where all genotypic parameter sets gave convergence to a stable steady state, and where the resulting GP maps were not essentially flat (see Models and Methods for details). Motifs with fewer than 100 usable GP maps were discarded from further analysis. For the genotype-to-parameter maps without pleiotropy (in the sense that genetic variation at one locus influences only a single parameter, see Model and Methods) 868 motifs were discarded, while for the genotype-to-parameter map with pleiotropy (genetic variation at one locus influences three parameters) 791 motifs were discarded. All (but one) discarded motifs contained at least one positive feedback loop (**Table 1**). A plausible explanation for this is that many motifs with positive feedback loops have a stable steady state at, or very close to, 0 for one or more state variables regardless of parameter values, and this leads to essentially flat GP maps.

#### **Box 1 | Code example 1.**

Code example for quantifying and visualizing monotonicity for the two-locus GP map published by Cheverud and Routman (1995) using the R package gpmap.

```
library(gpmap)        # load package
data(GPmaps)          # load dataset
gp <- mouseweight     # GP map from Cheverud and Routman (1995)

## Tabulate genotypic values
cbind(gp$genotype, gp$values)

## Plot the GP map
plot(gp)

## Compute degree of monotonicity
gp <- degree_of_monotonicity(gp)
gp$degree.monotonicity.locus
print(gp)

## Quantify monotonicity by isotonic regression
gp <- decompose_monotone(gp)
print(gp)

## Plot decomposed GP map
plot(gp, decomposed=TRUE)
```

**FIGURE 4 | Connectivity matrices and signed directed graphs.** Connectivity matrix *A* and the corresponding signed directed graph for two of the 1881 systems in the simulation study. The **left panel** depicts the connectivity matrix and the signed digraph of a system with a positive feedback loop between *X*<sub>1</sub> and *X*<sub>2</sub> while the **right panel** shows a system with incoherent feedforward from *X*<sub>1</sub> to *X*<sub>3</sub>.

**Table 1 | Frequencies (proportion of row total in parenthesis) of incoherent feedforward and positive feedback loops in subsets of the 1881 studied motifs.**

The introduction of pleiotropy in the genotype-to-parameter map has a marked effect on the monotonicity characteristics of the associated GP map (**Figure 5**). When genetic variation at a locus *Xi* affects only its maximal production rate the GP maps come out as highly monotone (**Figure 5A**), with a large majority being fully monotone or order-breaking for just a single locus. When genetic variation at locus *Xi* affects the threshold and steepness of the dose-response curve in addition to the maximal production rate (pleiotropy in the genotype-to-parameter map), the majority of GP maps still show order-breaking either for no loci or just one locus (**Figure 5B**), but a considerable number of GP maps are in this case order-breaking for two or three loci. Furthermore, by dividing the motifs into the four groups given in **Table 1** it is evident that the regulatory anatomy of a network determines its predisposition for non-monotonicity in its associated GP map. The presence of incoherent feedforward or positive feedback loops appears to be a prerequisite for the majority of the observed non-monotonic GP maps.

The class of motifs lacking both incoherent feedforward and positive feedback contains very few order-breaking GP maps, and with no pleiotropy in the genotype-to-parameter map we observe only fully order-preserving GP maps for this class (cyan in **Figure 5A**). In the Appendix we generalize this to an arbitrary number of nodes and formally prove that without pleiotropy in the genotype-to-parameter map, the presence of incoherent feedforward or positive feedback is indeed a necessary condition for non-monotone GP maps to arise from networks with monotone gene regulation functions.

The introduction of pleiotropy in the genotype-to-parameter map increases the frequency of order-breaking GP maps substantially (**Figure 5B**). Motifs lacking both incoherent feedforward and positive feedback may in this case lead to GP maps that are order-breaking for one or two loci, but never for all three loci. Using isotonic regression to quantify the overall monotonicity of the GP maps reinforces the finding that incoherent feedforward and positive feedback predispose for non-monotonicity (**Figure 6**). **Figure 6** also shows that for all classes of motifs the majority of GP maps are fully monotone, while the most non-monotone GP maps (lowest *R*<sup>2</sup><sub>mono</sub> values) are observed for motifs with positive feedback. The differences between classes of motifs are also evident when inspecting the additivity of GP maps (**Figure A3**), but since monotone GP maps can still be non-additive, the patterns are much more blurred than for monotonicity.

**FIGURE 5 | Order-breaking in motifs containing a single feedforward loop.** Summary of order-breaking for all motifs for which at least 100 (out of 1000) Monte Carlo simulations led to GP maps with non-negligible variation (see Models and Methods section "Gene regulatory network simulations," for detailed criteria). Results are shown for 1013 motifs with a genotype-to-parameter map without pleiotropy **(A)** and 1090 motifs with a genotype-to-parameter map with pleiotropy **(B)**. Colors indicate classes of motifs based on the presence/absence of incoherent feedforward and positive feedback loops, see **Table 1** for the number of motifs in each class. A single boxplot summarizes, for all motifs in the given class, the proportion of the GP maps (y-axis) that are order-breaking with respect to a given number of loci (x-axis). For example, consider the red box at *x* = 0 in panel **(A)**. This boxplot contains results for motifs with both incoherent feedforward and positive feedback and from **Table 1** we find that the red boxplot summarizes results for 135 motifs. From the y-axis we find that at least half (box median at *y* = 1) of these 135 motifs result in only monotone GP maps, while for the most extreme (end of whisker) of the 135 motifs only 25% of the GP maps are monotone. Similarly, the cyan box is compressed into a line at *x* = 0, *y* = 1 indicating that all 251 motifs that lack both incoherent feedforward and positive feedback result in only monotone GP maps.

*R*<sup>2</sup><sub>mono</sub> values from isotonic regression for all motifs for which at least 100 (out of 1000) Monte Carlo simulations led to GP maps with non-negligible phenotypic variation (see Models and Methods section "Gene regulatory network simulations," for detailed criteria). Results are shown for 1013 motifs with a genotype-to-parameter map without pleiotropy **(A)** and 1090 motifs with a genotype-to-parameter map with pleiotropy **(B)**. Each panel is divided into 4 subplots containing classes of motifs based on the presence/absence of incoherent feedforward and positive feedback loops, see **Table 1** for the number of motifs in each class. Each curve shows, for a single motif, the empirical distribution function value (y-axis) of *R*<sup>2</sup><sub>mono</sub> for all GP maps (x-axis).

### **DISCUSSION**

Fisher's (1918) regression on gene content and the concepts derived from this, such as additive effects and dominance deviation, provide the theoretical basis for most of quantitative genetics (Falconer and Mackay, 1996; Lynch and Walsh, 1998). By regressing on gene content, including the extensions by Cockerham (1954), the genotype-phenotype map is decomposed into additive, dominant, and epistatic components. The use of gene content or the number (0, 1, or 2) of alleles with a particular index in a genotype implies the same partial ordering of genotype space as defined in Equation 1. Thus, our proposed definition of monotonicity of GP maps, and in particular the use of isotonic regression to quantify monotonicity, may be viewed as a relaxation of the linearity assumption underlying current quantitative genetics theory. In this perspective the positive correlation between monotonicity and additivity (**Figure 3**) is expected.

We have addressed GP maps with 2 and 3 loci as we considered an in-depth study of the properties of GP maps with a higher number of loci to be outside the scope of this study. Some general observations can be made, though. Since *m* is a weighted average, the *mk* of major loci (i.e., those for which *Tk* is large relative to the sum of *Tk* over all loci) will tend to dominate. For instance, in a case with a single major locus showing monotone gene action and several minor loci showing order-breaking, the GP map will overall be close to monotone (*m* close to 1). Conversely, order-preservation in a number of minor loci would have little influence on *m* if major loci have strongly non-monotone gene action. Isotonic regression gives an overall measure of monotonicity of a GP map, but provides no locus-specific measures corresponding to *mk*. Similar to the case for *m*, the gene action of major loci will have high influence on the value of *R*<sup>2</sup><sub>mono</sub>.

The observation that monotonicity is an important property of GP maps is in principle not new. For a single locus, non-monotone gene action appears in the form of over- or under-dominance, while complete and partial dominance as well as additivity exemplify monotone gene action. Weinreich et al. (2005) distinguished between *sign epistasis* and *magnitude epistasis* and showed that sign epistasis limits the number of mutational trajectories to higher fitness. As sign epistasis reflects a non-monotone GP relationship and magnitude epistasis reflects a monotone one, this insight agrees with our results. A similar distinction has been proposed (Wang et al., 2010) for statistical interactions where *removable interactions* are those that can be removed by a monotone transformation of the phenotype scale, while non-monotonicity in the GP map leads to *essential interactions*. Wu et al. (2009) developed a method to screen for and test the significance of essential interaction in genome-wide association studies. Isotonic regression has also recently been applied to link genotype and phenotype data (Beerenwinkel et al., 2011; Luss et al., 2012). Our treatment of monotonicity is more general than these earlier works in three major ways. First, we deal with monotonicity of the GP map as a whole rather than either intra-locus (dominance vs. overdominance) or inter-locus (magnitude vs. sign epistasis and removable vs. essential interactions). Second, where the earlier treatments have focused on classifying the type of gene action, we make use of quantitative measures of monotonicity. Third, our approach combining the concept of monotonicity with cGP models opens a direct link between genetics and the theory of dynamical systems in the wide sense.

Monotonicity is a property of the GP map separate from the allele frequencies, making it a physiological (Cheverud and Routman, 1995) or functional (Hansen and Wagner, 2001) descriptor rather than a statistical one. The distinction between physiological and statistical epistasis has led to much debate (Phillips, 2008). Zeng et al. (2005) argued the distinction was unnecessary and potentially misleading. Although their arguments around orthogonality and variance components are valid, our results demonstrate very clearly that describing the properties of the GP map without reference to any particular study population is essential if we want to connect quantitative genetics with regulatory biology.

It is clear from our results that positive feedback and incoherent feedforward promote non-monotonicity. The clear-cut differences in monotonicity between different classes of regulatory networks, combined with the strong correlation between monotonicity and additivity of GP maps, appear therefore to explain the findings that regulatory systems with positive feedback give considerably more statistical epistasis than those without (Gjuvsland et al., 2007; Wang et al., 2013). Even though both incoherent feedforward and positive feedback predispose for non-monotone GP maps, the underlying mechanisms are different for the two regulatory motifs. In the case of incoherent feedforward the sum of direct and indirect effects may result in a non-monotone dose-response relationship (Kaplan et al., 2008). That positive feedback loops can give non-monotonicity is intuitively less clear, but in the Appendix we show both results analytically. Positive feedback predisposes for multiple steady states, and order-breaking might also emerge from different genotypes corresponding to different states. It should be noted, however, that positive feedback is only a necessary condition for multistationarity (Plahte et al., 1995), and a positive loop in the connectivity matrix *A* of a system is not necessarily active at any point during the time course of the system.

### **REFERENCES**


*Evolution* 50, 1042–1051. doi: 10.2307/2410645


Without any restrictions on the connectivity of a threegene system there are 3<sup>9</sup> <sup>=</sup> <sup>19</sup>, 683 possible distinct networks. The main restriction we imposed (see Models and Methods for details) was a maximum of two regulators per gene, which allowed us to use Boolean gene regulation functions already established in the sigmoid formalism (Plahte et al., 1998). Other model formalisms allowing an arbitrary number of regulators are also available (Wagner, 1994, 1996; Siegal and Bergman, 2002) and could be extended to diploid forms and used in later studies.

Although this study has focused on gene regulatory networks, the concept of monotone gene action applies to the propagation of genetic variation across the whole physiological hierarchy. One may therefore systematically use the concepts and methods presented here to study the orderpreserving and order-breaking properties of genotype-phenotype mappings that are associated with any regulatory structure amenable for mathematical modeling. Through this it will be possible to make a wide-ranging survey of which regulatory anatomies promote monotonicity and which promote nonmonotonicity. We foresee that this classification may become instrumental for predicting how phenotypic effects of genetic variation propagate across generations in sexually reproducing populations.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fgene.2013.00216/abstract

**Spreadsheet S1 | Excel spreadsheet with connectivity matrices and loop products for all 1881 gene regulatory networks.**

genotype-phenotype gap: what does it take? *J. Physiol.* 591, 2055–2066. doi: 10.1113/jphysiol.2012.248864


feed-forward loop can generate non-monotonic input functions for genes. *Mol. Syst. Biol.* 4, 203. doi: 10.1038/msb.2008.43


to gene–gene interaction search. *Ann. Appl. Stat.* 6, 253–283. doi: 10.1214/11-AOAS504


*Sci. U.S.A.* 91, 4387–4391. doi: 10.1073/pnas.91.10.4387


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 June 2013; accepted: 07 October 2013; published online: 07 November 2013.*

*Citation: Gjuvsland AB, Wang Y, Plahte E and Omholt SW (2013) Monotonicity is a key feature of genotype-phenotype maps. Front. Genet. 4:216. doi: 10.3389/fgene.2013.00216*

*This article was submitted to Genetic Architecture, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Gjuvsland, Wang, Plahte and Omholt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### **APPENDIX**

In this appendix we complement the simulation studies in the main text with some analytic results for GP maps emerging from ODE models of gene regulatory networks. We study a generalization of the gene network model in Equation (3) with an arbitrary number of loci and monotone gene regulation functions, but restrict the analysis to genotype-parameter maps without pleiotropy. In particular, we show that (i) if there are no positive feedback loops and no incoherent feedforward loops in the network, the resulting GP maps are always monotone, (ii) a positive feedback loop or an incoherent feedforward loop may lead to non-monotone GP maps. The results hold for phenotypes given as the stable concentration of the product of one of the genes, and under certain restrictions also for phenotypes given as a function of one or several stable gene product concentrations that is monotonic with respect to each of its arguments.

### **GENE NETWORK MODEL**

We consider a dynamic system consisting of $n$ mutually interacting diploid loci $X\_j$, $j \in N = \{1, \ldots, n\}$, regulating each other's expression. The time-dependent output of $X\_j$ is denoted $z\_j$, and we define $\mathbf{z} = [z\_1, z\_2, \ldots, z\_n]$. It goes without saying that $z\_j$ in general depends on the genotypes of all the genes, even though we will not always state this explicitly.

For a given genotype $g = g\_j g^{(j)} = a\_j b\_j g^{(j)}$, where $g\_j \in \{11, 12, 22\}$ denotes the genotype of locus $X\_j$, $g^{(j)}$ the genotypic background, and $a\_j, b\_j \in \{1, 2\}$ the indexes of the two alleles of $X\_j$, the equations of motion for $X\_j$ are

$$\begin{aligned} \dot{z}\_j^1 &= \alpha\_j^{a\_j} r\_j^{a\_j}(\mathbf{z}) - \gamma\_j^{a\_j} z\_j^1, \\ \dot{z}\_j^2 &= \alpha\_j^{b\_j} r\_j^{b\_j}(\mathbf{z}) - \gamma\_j^{b\_j} z\_j^2, \\ z\_j &= z\_j^1 + z\_j^2, \end{aligned} \tag{A1}$$

where $z\_j^1$ and $z\_j^2$ are the time-dependent outputs of the two homologous copies of $X\_j$. The two allele rate functions $r\_j^1(\mathbf{z})$ and $r\_j^2(\mathbf{z})$ have range $[0, 1]$, so that $\alpha\_j^1$ and $\alpha\_j^2$ represent the maximum production rates of the two alleles. We assume that all dose-response functions in Equation (A1) are differentiable and monotonic with respect to each of their arguments, and that for each $j$, $k$, the signs of $\partial r\_j^1/\partial x\_k$ and $\partial r\_j^2/\partial x\_k$ at the stable point $\mathbf{x}$ are equal. This model generalizes Equation (3) to an arbitrary number of loci and a broader class of gene regulation functions.

In the following we are only concerned with the steady states of Equation (A1), and assume for simplicity that the system has just a single stable equilibrium point. Solving the equilibrium conditions of Equation (A1) with respect to $z\_j^1$ and $z\_j^2$ and adding gives

$$f\_j(\mathbf{x}) = \mu\_j^{a\_j} r\_j^{a\_j}(\mathbf{x}) + \mu\_j^{b\_j} r\_j^{b\_j}(\mathbf{x}) - x\_j = 0, \quad j \in \mathcal{N}, \tag{A2}$$

where $\mathbf{x} = [x\_1, \ldots, x\_n]$ is the stable point, $\mu\_j^{a\_j} = \alpha\_j^{a\_j}/\gamma\_j^{a\_j}$ and $\mu\_j^{b\_j} = \alpha\_j^{b\_j}/\gamma\_j^{b\_j}$. Since our definition of monotonicity of GP maps does not depend on the numbering of alleles, we will without loss of generality assume $\mu\_j^1 \leq \mu\_j^2$ for all $j$.
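The step from Equation (A1) to Equation (A2) can be checked numerically. The following is a minimal sketch, not the simulation code used in the study: it assumes a two-locus system in which the first locus is unregulated and activates the second through a Hill function, and it uses an explicit Euler scheme with arbitrary illustrative parameter values. All function names and numbers are hypothetical.

```python
def hill(x, theta=1.0, p=2.0):
    """Increasing dose-response function with values in [0, 1) (assumed form)."""
    return x**p / (x**p + theta**p)


def simulate(alpha, gamma, t_end=200.0, dt=0.01):
    """Explicit Euler integration of Equation (A1) for two diploid loci.

    alpha[j][c] and gamma[j][c] are the production and decay rates of
    copy c of locus j. Locus 0 is unregulated (rate function 1);
    locus 1 is activated by the total output of locus 0.
    """
    z = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(int(t_end / dt)):
        r = [1.0, hill(z[0][0] + z[0][1])]
        z = [[z[j][c] + dt * (alpha[j][c] * r[j] - gamma[j][c] * z[j][c])
              for c in range(2)] for j in range(2)]
    return z


alpha = [[1.0, 2.0], [1.5, 3.0]]   # heterozygous at both loci
gamma = [[1.0, 1.0], [0.5, 2.0]]

z = simulate(alpha, gamma)
x = [z[0][0] + z[0][1], z[1][0] + z[1][1]]            # total outputs
mu = [[alpha[j][c] / gamma[j][c] for c in range(2)] for j in range(2)]
r = [1.0, hill(x[0])]

# Residuals of Equation (A2); they should vanish at the stable point.
residual = [mu[j][0] * r[j] + mu[j][1] * r[j] - x[j] for j in range(2)]
print(x, residual)
```

With these values the unregulated locus settles at the sum of its two allelic parameters, as in Equation (A2) with a rate function equal to 1.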

The network architecture can be read out from the structure of the system's Jacobian matrix in the stable state *x*. We define the elements of the Jacobian *J* for the set of functions *fj* defined in Equation (A2) by

$$J\_{jk} = J\_{jk}(g) = \frac{\partial f\_j(\mathbf{x})}{\partial x\_k}, \quad j, k \in \mathcal{N}. \tag{A3}$$

To the Jacobian $J$ it is customary to assign a signed directed graph $G$ in which each locus $X\_k$ is represented by a node $X\_k$, and in which there is an arc from $X\_j$ to $X\_k$ if and only if $J\_{kj} \neq 0$, its sign given by the sign of $J\_{kj}$. A chain from $X\_j$ to $X\_k$ is a set of arcs in $G$ leading from $X\_j$ to $X\_k$ in which all intermediate nodes are visited only once. The sign of a chain is equal to the product of the signs of the $J\_{ij}$ corresponding to the arcs in the chain. If there is a chain from $X\_i$ to $X\_j$ and also a chain from $X\_j$ to $X\_i$ through a disjoint set of nodes, the two chains constitute a proper feedback loop (FBL). To each FBL is associated a loop product $L$, which is the product of the Jacobian elements corresponding to all the arcs in the loop. The sign of the loop is given by the sign of $L$. Two chains from $X\_j$ to $X\_i$, $i \neq j$, with only the endpoint nodes in common, constitute a feedforward loop (FFL). If the two chains have opposite signs, the FFL is incoherent (IFFL); otherwise it is coherent (CFFL).
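The graph definitions above translate directly into a small routine. The sketch below is an illustration under stated assumptions rather than code from the study: the Jacobian is given as a nested list with made-up entries, nodes are numbered from 0, and the helper names (`chains`, `chain_sign`, `classify_ffl`) are hypothetical. It enumerates the chains between two distinct nodes and labels a feedforward loop as coherent or incoherent.

```python
def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def chains(J, src, dst):
    """All chains (as node lists) from src to dst, for src != dst.

    An arc a -> b exists iff J[b][a] != 0 (row = target, column = source).
    No node is visited twice, so diagonal entries are never used.
    """
    found = []

    def extend(path):
        last = path[-1]
        if last == dst:
            found.append(path)
            return
        for nxt in range(len(J)):
            if nxt not in path and J[nxt][last] != 0:
                extend(path + [nxt])

    extend([src])
    return found


def chain_sign(J, path):
    """Product of the signs of the Jacobian elements along a chain."""
    s = 1
    for a, b in zip(path, path[1:]):
        s *= sign(J[b][a])
    return s


def classify_ffl(J, src, dst):
    """Classify the first pair of chains sharing only their endpoints."""
    cs = chains(J, src, dst)
    for i, c1 in enumerate(cs):
        for c2 in cs[i + 1:]:
            if not set(c1[1:-1]) & set(c2[1:-1]):
                same = chain_sign(J, c1) == chain_sign(J, c2)
                return "coherent" if same else "incoherent"
    return None  # fewer than two such chains: no FFL from src to dst


# X0 activates X1 and X2; X1 represses X2 (illustrative values).
J = [[-1.0, 0.0, 0.0],
     [0.8, -1.0, 0.0],
     [0.5, -0.3, -1.0]]
print(chains(J, 0, 2))        # [[0, 1, 2], [0, 2]]
print(classify_ffl(J, 0, 2))  # incoherent
```

Changing the sign of the entry for the arc from X1 to X2 turns the same motif into a coherent feedforward loop.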

The system's phenotype could be any scalar quantity defined by its equilibrium value $\mathbf{x}$. In the following we assume the genotype-phenotype map $G(g) = x\_q(g)$, $q \in N$, for a given and fixed $q$, and investigate the monotonicity properties of $G(g\_k g^{(k)})$ with respect to genetic variation in any locus $X\_k$ for different backgrounds $g^{(k)}$. In the following sections we analyse the causes of order-breaking in $G$ in the restricted case in which there is only genetic variation in $\mu\_k^1$ and $\mu\_k^2$, not in the shape of the dose-response functions $r\_k^1$ and $r\_k^2$, implying $r\_k^1(\mathbf{x}) = r\_k^2(\mathbf{x}) = r\_k(\mathbf{x})$. This is what we mean by a genotype-to-parameter map without pleiotropy.

In the next sections we prove the following result:

**Proposition 1.** *Assume all rate functions in Equation* (A1) *are monotonic and that G is the mapping from g to* $x\_q$ *for some fixed q, so that* $x\_q(g)$ *is the phenotype. If there is no feedback loop (FBL) and no feedforward loop (FFL) anywhere in the network corresponding to the system Equation* (A1)*, then necessarily* $m\_k = 1$ *for all k. If the system contains either a single FFL or a single FBL, then G may be non-monotone for some* $x\_k$ *if the FBL is positive or the FFL is incoherent, but if the FBL is negative or the FFL is coherent, no order breaking can occur for any* $x\_k$*.*

At the end of this note we show that under some reasonable conditions this result is also valid for more general phenotypes depending on more than one *xq*.

#### **NETWORKS WITHOUT LOOPS**

We first consider networks containing no feedforward loop and no feedback loop. In these networks there is at most one chain from one node to another, and of course no autoregulatory loops. If there is a chain from *Xj* to *Xk*, there is no chain from *Xk* to *Xj*. Any node is either unregulated (constitutively expressed) or regulated by one or several other nodes.

We first prove a useful lemma.

**Lemma 1.** *If* $x\_l(11g^{(j)}) \leq x\_l(12g^{(j)}) \leq x\_l(22g^{(j)})$ *for any j and l, and there is an arc* $X\_l \to X\_m$ *with positive sign and no other chain from* $X\_l$ *to* $X\_m$*, then also* $x\_m(11g^{(j)}) \leq x\_m(12g^{(j)}) \leq x\_m(22g^{(j)})$*. If the sign of the arc is negative, then* $x\_m(11g^{(j)}) \geq x\_m(12g^{(j)}) \geq x\_m(22g^{(j)})$*.*

*Proof.* Suppressing the explicit dependence on other genes that are not affected by genetic variation in *Xj*, we have

$$\begin{aligned} x\_m(11g^{(j)}) &= 2\mu\_m^1 r\_m(x\_l(11g^{(j)})), \\ x\_m(12g^{(j)}) &= (\mu\_m^1 + \mu\_m^2)r\_m(x\_l(12g^{(j)})), \\ x\_m(22g^{(j)}) &= 2\mu\_m^2 r\_m(x\_l(22g^{(j)})). \end{aligned} \tag{A4}$$

Now, $r\_m$ is monotonic by assumption. If it is monotonically increasing,

$$\begin{aligned} x\_m(12g^{(j)}) &\geq (\mu\_m^1 + \mu\_m^2)r\_m(x\_l(11g^{(j)})) \geq x\_m(11g^{(j)}),\\ x\_m(22g^{(j)}) &\geq 2\mu\_m^2r\_m(x\_l(12g^{(j)})) \geq x\_m(12g^{(j)}),\end{aligned} \tag{A5}$$

from which the assertion follows. If *rm* is monotonically decreasing, we find the same relations with the inequality signs reversed.

If there is no chain from $X\_j$ to $X\_q$, genetic variation in $X\_j$ will not be reflected in $G$, i.e. $G(11g^{(j)}) = G(12g^{(j)}) = G(22g^{(j)})$, and by definition does not give order-breaking. Then assume $X\_j$ is upstream relative to $X\_q$ and that the chain from $X\_j$ to $X\_q$ is positive. We first let $X\_j$ be an unregulated node with no predecessor. Then

$$\begin{aligned} x\_j(11g^{(j)}) &= 2\mu\_j^1, \\ x\_j(12g^{(j)}) &= \mu\_j^1 + \mu\_j^2, \\ x\_j(22g^{(j)}) &= 2\mu\_j^2, \end{aligned} \tag{A6}$$

because $r\_j^1 = r\_j^2 = 1$. From this it follows that $x\_j(11g^{(j)}) \leq x\_j(12g^{(j)}) \leq x\_j(22g^{(j)})$.

Repeated use of Lemma 1 leads eventually to $x\_q(11g^{(j)}) \leq x\_q(12g^{(j)}) \leq x\_q(22g^{(j)})$, irrespective of the genotypic background of $X\_j$. If the chain from $X\_j$ to $X\_q$ is negative, the argument goes in the same way, but then $x\_q(11g^{(j)}) \geq x\_q(12g^{(j)}) \geq x\_q(22g^{(j)})$. The above argument can be carried out in the same way if $X\_j$ is not top-stream. It follows that in a network without FBLs and FFLs and where genetic variation is restricted to $\mu\_k^1$ and $\mu\_k^2$, the genotype-phenotype map $G(g) = x\_q(g)$ cannot be order-breaking.
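The argument for loop-free networks can be illustrated on a linear chain of three loci. This is a hedged sketch only: the saturating activation and repression functions, the parameter values and the helper names are all assumptions made for the example. The steady state is obtained by applying the equilibrium condition node by node along the chain, and the ordering of the phenotype over the three genotypes of the first locus is checked in every genotypic background.

```python
from itertools import product


def act(x, theta=1.0):
    """Monotonically increasing dose-response function (assumed form)."""
    return x / (x + theta)


def rep(x, theta=1.0):
    """Monotonically decreasing dose-response function (assumed form)."""
    return theta / (x + theta)


# Allele indexes carried by each diploid genotype (0 -> allele 1, 1 -> allele 2).
GENOTYPES = {"11": (0, 0), "12": (0, 1), "22": (1, 1)}


def phenotype(genotype, mu, regs):
    """Steady-state output of the last locus in the chain X0 -> X1 -> ...

    genotype: one genotype label per locus; mu[j] = (mu_j^1, mu_j^2);
    regs[j]: dose-response function of locus j (unused for the unregulated X0).
    """
    x = None
    for j, label in enumerate(genotype):
        a, b = GENOTYPES[label]
        total = mu[j][a] + mu[j][b]
        x = total if j == 0 else total * regs[j](x)
    return x


def values_over_locus0(background, mu, regs):
    """Phenotypes for genotypes 11, 12, 22 at X0 in a fixed background."""
    return [phenotype((g0,) + background, mu, regs) for g0 in ("11", "12", "22")]


mu = [(0.5, 2.0), (1.0, 3.0), (0.8, 1.2)]      # mu_j^1 <= mu_j^2 for every locus
backgrounds = list(product(GENOTYPES, repeat=2))

positive = [values_over_locus0(b, mu, [None, act, act]) for b in backgrounds]
negative = [values_over_locus0(b, mu, [None, rep, act]) for b in backgrounds]

print(all(v[0] <= v[1] <= v[2] for v in positive))  # True
print(all(v[0] >= v[1] >= v[2] for v in negative))  # True
```

In both cases the ordering is the same in all nine backgrounds: a positive chain preserves it and a negative chain reverses it, so neither breaks it.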

#### **NETWORKS WITH A FEEDBACK LOOP**

In this section we investigate the effects of feedback loops on the degree of monotonicity. Assuming monotonic dose-response functions and non-pleiotropic genetic variation, we show that a positive feedback loop may lead to order breaking, while negative feedback loops never do. We consider a network in which there is no FFL and a single FBL with *Xq* as one of its members and *Xk* is upstream of the loop.

**Lemma 2.** *Consider a network with n nodes for which all dose-response functions are monotonic and there is only genetic variation in* $\mu\_k^1$ *and* $\mu\_k^2$*. Assume there is a chain from* $X\_k$ *to* $X\_1$*, that* $X\_1$*, but not* $X\_k$*, is a member of an FBL with m nodes, and that there is no other FBL and no FFL in the system. If* $X\_q$ *is in the loop, let the loop be* $X\_1 \to X\_2 \to \ldots \to X\_q \to \ldots \to X\_m \to X\_1$*. If the FBL is positive, there may be order-breaking in* $X\_q$ *due to genetic variation in* $X\_k$*, but no order-breaking can occur if the loop is negative. If* $X\_q$ *is downstream of the loop, the same result applies.*

*Proof.* With a single FBL and no FFL there is at most one directed path from any node $X\_i$ to any other node $X\_j$, and if there is a path from $X\_i$ to $X\_j$, there is no return path from $X\_j$ to $X\_i$ if either $X\_i$ or $X\_j$ is not part of the FBL. We first consider the dependence of $x\_1$ on $x\_k$. The direct regulators of node $X\_1$ are $X\_m$ and $X\_l$, the latter being the last but one node in the chain from $X\_k$ to $X\_1$. In Plahte et al. (2013) we introduced the *propagation functions* $x\_j = p\_{jk}(x\_k)$, which express the effect on $x\_j$ of genetic variation in $X\_k$. An important property of $p\_{jk}$ is that it can be derived from all the equilibrium conditions in Equation (A2) except the equation for $f\_k$. This implies that the effects on $X\_j$ of genotypic variation in $X\_k$ are only expressed in terms of the variations in $x\_k$, while the parameters expressing the genotype of $X\_k$ do not enter into the function $p\_{jk}$.

We then have $x\_l = p\_{lk}(x\_k)$ and $x\_m = p\_{m1}(x\_1)$. To make it easier to use the results in Plahte et al. (2013) we rewrite the equilibrium condition Equation (A2) as

$$R\_j(\mathbf{x}) - \gamma\_j x\_j = 0,\tag{A7}$$

where $\gamma\_j > 0$. In the following, the Jacobian refers to this set of equations, which has the same root and the same functional dependencies between the variables as the original set. The signs of the partial derivatives of $R\_j$ are the same as for $r\_j^{a\_j}$ and $r\_j^{b\_j}$. The equilibrium condition for $X\_1$ is then

$$
\gamma\_1 x\_1 = R\_1(p\_{lk}(x\_k), p\_{m1}(x\_1)).\tag{A8}
$$

This equation defines $x\_1$ as a function of $x\_k$ in an open domain around the equilibrium point, with a derivative that can be computed by implicit differentiation, i.e.

$$
\gamma\_1 \frac{dx\_1}{dx\_k} = \frac{\partial R\_1}{\partial x\_l} q\_{lk} + \frac{\partial R\_1}{\partial x\_m} q\_{m1} \frac{dx\_1}{dx\_k},\tag{A9}
$$

where $q\_{ij} = p'\_{ij}$ is the derivative of $p\_{ij}$ for all $i$, $j$.

From Lemma 1 it follows that there is no order breaking in $X\_l$; in other words, $q\_{lk}$ has a fixed sign. Consider then $q\_{m1}$. There is just a single chain from $X\_1$ to $X\_m$, and Equation (13) in Plahte et al. (2013) gives

$$q\_{m1}(x\_1) = (-1)^{m-1} \frac{D\_{VV}C\_U}{D^{(11)}}.\tag{A10}$$

Here $U$ is the set of nodes in this chain, $C\_U$ is its chain product, i.e. the product of the Jacobian elements corresponding to the arcs in the chain, $V = N \setminus U$, $D^{(11)}$ is the subdeterminant of $J$ with row 1 and column 1 deleted, and $D\_{VV}$ is the subdeterminant of $J$ composed of the rows and columns $V$. Because there is no feedback loop among the nodes represented in $D^{(11)}$ and $D\_{VV}$, only the diagonal degradation terms contribute to these two determinants. Hence $D^{(11)} = (-1)^{n-1} \prod\_{i \neq 1} \gamma\_i$. Similarly, $D\_{VV} = (-1)^{n-m} \prod\_{i \in V} \gamma\_i$, giving $q\_{m1} = \gamma\_1 C\_U / \Gamma\_U$, where $\Gamma\_U = \prod\_{i \in U} \gamma\_i$. Finally, we note that $P = (\partial R\_1/\partial x\_m) C\_U$ is the loop product of the loop.

Solving Equation (A9) with respect to $dx\_1/dx\_k$ and using all these expressions leads to

$$
\gamma\_1 \frac{dx\_1}{dx\_k} = \frac{\Gamma\_U}{\Gamma\_U - P} \frac{\partial R\_1}{\partial x\_l} q\_{lk}.\tag{A11}
$$

The sign of $\partial R\_1/\partial x\_l$ is independent of the genotype of $X\_k$ and the sign of $q\_{lk}$ is fixed. Genotypic variation in $X\_k$ may change the magnitude of $P$, but its sign is fixed because all Jacobian elements have fixed sign independent of the system parameters. Since $\Gamma\_U > 0$, genotypic variation in $X\_k$ does not alter the sign of $dx\_1/dx\_k$ if the loop is negative ($P < 0$), while for a positive loop the sign of $\Gamma\_U - P$ may switch. In the latter case, an increase in $x\_k$ due to genetic variation in $X\_k$ may increase $x\_1$ in some cases and decrease it in others, leading to order breaking. As there is only a single chain from $X\_1$ to $X\_q$, no order breaking in $X\_1$ implies no order breaking in $X\_q$, while order breaking in $X\_1$ may propagate to $X\_q$. The same result follows if $X\_q$ is downstream of a node in the loop, because order breaking in this node may propagate to $X\_q$.
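The sign argument rests entirely on the factor Γ<sub>U</sub>/(Γ<sub>U</sub> − P) in Equation (A11). A minimal numerical sketch, with made-up decay rates and loop products and a hypothetical function name, shows how this factor behaves for negative and positive loops.

```python
from math import prod


def feedback_factor(gammas_in_loop, loop_product):
    """Factor Gamma_U / (Gamma_U - P) from Equation (A11).

    gammas_in_loop: the positive decay rates gamma_i of the loop nodes;
    loop_product: the loop product P (negative or positive loop).
    Raises ZeroDivisionError when P equals Gamma_U, where the factor is undefined.
    """
    gamma_u = prod(gammas_in_loop)
    return gamma_u / (gamma_u - loop_product)


gammas = [1.0, 2.0]                   # Gamma_U = 2 (illustrative values)
print(feedback_factor(gammas, -3.0))  # 0.4  : negative loop, factor stays positive
print(feedback_factor(gammas, 1.0))   # 2.0  : weak positive loop, still positive
print(feedback_factor(gammas, 3.0))   # -2.0 : strong positive loop, sign flips
```

For any negative loop product the factor lies between 0 and 1, so the sign of the response cannot change; only a positive loop product larger than the product of the decay rates reverses it.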

#### **FEEDFORWARD LOOPS (FFLS)**

A feedforward loop (FFL) is a motif in the network in which there are two different chains $C\_1$ and $C\_2$ from one particular node to another particular node. To each chain $C\_i$ is associated a chain product $P\_i$, defined as the product of the Jacobian elements corresponding to the arcs in $C\_i$. If $P\_1$ and $P\_2$ have equal signs, the FFL is coherent; otherwise it is incoherent.

In a network with a single feedforward loop and no feedback loops we now investigate the effect on $G(g) = x\_q(x\_k(g))$ of genetic variation in $X\_k$ for varying background $g^{(k)}$. Our starting point is again Equation (A7). We first let $X\_k$ and $X\_q$ be the initial and terminal nodes in the FFL. The two chains $C\_1$ and $C\_2$ leading from $X\_k$ to $X\_q$ comprise $\rho\_1$ and $\rho\_2$ nodes, respectively, including $X\_k$ and $X\_q$. Let the sets of nodes in $C\_1$ and $C\_2$ be $X\_{U\_1}$ and $X\_{U\_2}$, respectively, where $U\_1$ and $U\_2$ are the corresponding subsets of $N$, and let $V\_1$ and $V\_2$ be their complements.

Roughly speaking, the derivative of the propagation function $p\_{qk}(x\_k)$ can be expressed as a sum of terms, each term corresponding to one of the chains leading from $X\_k$ to $X\_q$ (Plahte et al., 2013). To the chain $C\_i$ is assigned the chain weight $w\_i$ given by

$$w\_i = (-1)^{\rho\_i - 1} \frac{D\_{V\_i V\_i}}{D^{(kk)}}, \quad i = 1, 2,\tag{A12}$$

where $D\_{V\_i V\_i}$ is the Jacobian subdeterminant for the nodes not included in $C\_i$, and $D^{(kk)}$ is the Jacobian subdeterminant for all nodes except $X\_k$. Because there are two chains from $X\_k$ to $X\_q$, the derivative of $p\_{qk}$ is a sum of two terms:

$$\frac{\mathrm{d}p\_{qk}}{\mathrm{d}x\_k} = w\_1 P\_1 + w\_2 P\_2,\tag{A13}$$

where $P\_1$ and $P\_2$ are the two chain products, and $w\_1$ and $w\_2$ their weights (Plahte et al., 2013). When there is no feedback loop in the system, only the diagonal elements in $J$ stemming from the term $-\gamma\_i x\_i$ in Equation (A7) contribute to the determinants $D\_{V\_i V\_i}$ and $D^{(kk)}$:

$$\begin{aligned} D\_{V\_i V\_i} &= (-1)^{n - \rho\_i} \prod\_{j \in V\_i} \gamma\_j, \\ D^{(kk)} &= (-1)^{n - 1} \prod\_{j \neq k} \gamma\_j. \end{aligned} \tag{A14}$$

Altogether this gives

$$\frac{\mathrm{d}x\_q}{\mathrm{d}x\_k} = \frac{\mathrm{d}p\_{qk}}{\mathrm{d}x\_k} = \frac{\gamma\_k}{\Gamma\_1}P\_1 + \frac{\gamma\_k}{\Gamma\_2}P\_2,\tag{A15}$$

where $\Gamma\_1$ and $\Gamma\_2$ are the products of the $\gamma\_j$ in the two chains, respectively. The chain products $P\_1$ and $P\_2$ depend on the genotype $g\_k$ of $X\_k$ as well as on the genotypic background $g^{(k)}$, but their signs $S\_1$ and $S\_2$ are invariant under genotypic variation. It is easy to see that a negative autoregulatory loop, which is a common feature in gene regulatory networks, would not invalidate the conclusion, but a positive autoregulatory loop might.

If the FFL is incoherent, $P\_1$ and $P\_2$ have opposite signs, implying that the sign of $\mathrm{d}x\_q/\mathrm{d}x\_k$ may vary. If the FFL is coherent, however, both terms in Equation (A15) have the same sign, so no order-breaking can occur.

If $X\_k$ is upstream relative to the initial node $X\_{\text{init}}$ of the FFL, it follows from the above section on networks without loops that there will be no order-breaking in $X\_{\text{init}}$, and the above argument is still valid.
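A concrete feedforward loop makes the contrast between the two cases visible. The sketch below is an assumption-laden illustration, not a model taken from the study: the two inputs of the terminal node are combined as a product of one-input Hill functions, the parameter values are arbitrary, and all names are hypothetical. The initial node is unregulated, so its steady-state output for each genotype is the sum of its two allelic parameters.

```python
def act(x, theta=1.0, p=1.0):
    """Increasing Hill function (assumed dose-response form)."""
    return x**p / (theta**p + x**p)


def rep(x, theta=1.0, p=1.0):
    """Decreasing Hill function (assumed dose-response form)."""
    return theta**p / (theta**p + x**p)


def x_q(x_k, coherent):
    """Steady-state output of X_q in the FFL X_k -> X_q, X_k -> X_i -> X_q.

    The direct arm activates X_q. The indirect arm runs through X_i, which is
    activated by X_k and then either activates (coherent) or represses
    (incoherent) X_q. Total production parameters: 10 for X_i, 1 for X_q.
    """
    x_i = 10.0 * act(x_k, theta=2.0, p=4.0)
    indirect = act(x_i) if coherent else rep(x_i)
    return act(x_k) * indirect


# Genetic variation in the unregulated locus X_k, with mu_k^1 <= mu_k^2.
mu_k = (0.125, 0.625)
x_k = {"11": 2 * mu_k[0], "12": mu_k[0] + mu_k[1], "22": 2 * mu_k[1]}

incoherent = [x_q(x_k[g], coherent=False) for g in ("11", "12", "22")]
coherent = [x_q(x_k[g], coherent=True) for g in ("11", "12", "22")]
print(incoherent)
print(coherent)
```

With these values the incoherent loop gives the heterozygote the highest phenotype, so the ordering of the genotypes is broken, whereas the coherent loop keeps the phenotype increasing from 11 to 22.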

#### **MORE GENERAL PHENOTYPES**

In real life, relevant phenotypes are not direct gene products, but rather functions of the concentrations of one or several gene products. Let the phenotype $G(g)$ be a function of $x\_U(g)$, $G = h(x\_U(g))$, where $U$ is a subset of $N$, and assume that for any $u \in U$, $\partial h/\partial x\_u$ has fixed sign for all genotypes. To analyse this case we extend the original system Equation (A2) to

$$\begin{cases} \mu\_i^{a\_i} r\_i^{a\_i}(\mathbf{x}(\mathbf{g})) + \mu\_i^{b\_i} r\_i^{b\_i}(\mathbf{x}(\mathbf{g})) - \mathbf{x}\_i(\mathbf{g}) = 0, \quad i = 1, \dots, n, \\\\ h(\mathbf{x}\_U(\mathbf{g})) - \mathbf{x}\_{n+1} = 0, \end{cases} \tag{A16}$$

and apply our above results to this system, in which $G(g) = x\_{n+1}(g)$, i.e. $q = n + 1$. If there are two nodes among $X\_U$ which have a common predecessor $X\_k$, then there will exist two chains from $X\_k$ to $X\_{n+1}$. These two chains constitute a feedforward loop with $X\_{n+1}$ as final node. If this FFL is incoherent, order breaking due to genetic variation in $X\_k$ may occur even if there is no order breaking in the original system comprising the nodes $X\_1, \ldots, X\_n$. If the FFL is coherent, order breaking only occurs if it occurs in the original system.

### **REFERENCES**

Plahte, E., Gjuvsland, A. B., and Omholt, S. W. (2013). Propagation of genetic variation in gene regulatory networks. *Phys. D* 256–257, 7–20. doi: 10.1016/j.physd.2013.04.002

degree of monotonicity (*m*) and **(B)** *R*<sup>2</sup><sub>mono</sub>. Red dots show 1000 random three-locus GP maps, blue dots show the same 1000 GP maps after sorting to introduce order-preservation for 1 locus, while green dots show the same 1000 GP maps after sorting to introduce order-preservation for 2 loci [see Model and Methods in Gjuvsland et al. (2011)].

Scatterplots showing degree of monotonicity (*m*) vs. *R*<sup>2</sup><sub>mono</sub>. Black dots correspond to the maps shown in **Figure 1**. Red dots show 1000 random two-locus GP maps, while blue dots show the same 1000 GP maps after sorting to introduce order-preservation for 1 locus [see Model and Methods in Gjuvsland et al. (2011)].

## Estimating directional epistasis

### *Arnaud Le Rouzic\**

*Centre National de la Recherche Scientifique, Laboratoire Évolution, Génomes, et Spéciation, UPR 9034, Gif-sur-Yvette, France*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Michael Kopp, Aix-Marseille University, France Janna Lynn Fierst, University of Oregon, USA*

#### *\*Correspondence:*

*Arnaud Le Rouzic, Centre National de la Recherche Scientifique, Laboratoire Évolution, Génomes, et Spéciation, Avenue de la Terrasse, Bâtiment 13, 91198 Gif-sur-Yvette, France e-mail: arnaud.le-rouzic@legs.cnrs-gif.fr*

Epistasis, i.e., the fact that gene effects depend on the genetic background, is a direct consequence of the complexity of genetic architectures. Despite this, most of the models used in evolutionary and quantitative genetics pay scant attention to genetic interactions. For instance, the traditional decomposition of genetic effects models epistasis as noise around the evolutionarily-relevant additive effects. Such an approach is only valid if it is assumed that there is no general pattern among interactions—a highly speculative scenario. Systematic interactions generate directional epistasis, which has major evolutionary consequences. In spite of its importance, directional epistasis is rarely measured or reported by quantitative geneticists, not only because its relevance is generally ignored, but also due to the lack of simple, operational, and accessible methods for its estimation. This paper describes conceptual and statistical tools that can be used to estimate directional epistasis from various kinds of data, including QTL mapping results, phenotype measurements in mutants, and artificial selection responses. As an illustration, I measured directional epistasis from a real-life example. I then discuss the interpretation of the estimates, showing how they can be used to draw meaningful biological inferences.

**Keywords: epistasis, genetic effects, estimation, statistics, evolution, multilinear model**

### **1. INTRODUCTION**

An ability to understand and predict how genes affect morphological, physiological, and behavioral characteristics is of crucial importance in biology. This also poses a considerable challenge, given the complexity of the genetic architecture of quantitative traits (Flint and Mackay, 2009). This complexity is not only due to the large number of genetic, environmental, and physiological factors involved, but also to their multiple and nonlinear interactions. In particular, it was noticed very early in the history of genetics that the same genetic change often produces differing effects depending on the genetic background of the experimental species, population, or individual (Phillips, 1998; Wade et al., 2001; Phillips, 2008). The biological consequences of this phenomenon, known as "epistasis," have triggered a considerable amount of discussion. A whole century of active research in genetics and molecular biology has revealed the ubiquity of epistatic interactions associated with the organization of biological systems as networks of interacting molecules (Omholt et al., 2000). However, we are still far from being able to integrate epistasis into a consensual, explicit, and predictive theoretical framework.

In the classical analysis of genetic variance (Fisher, 1918), epistasis is considered as a source of noise. Most epistatic effects are not transmitted from parent to offspring, and therefore, are not involved in the response to natural or artificial selection. Epistatic variance—the contribution of epistasis to genetic variance in a population—can be calculated (Cockerham, 1954; Kempthorne, 1954; Lynch and Walsh, 1998; Álvarez-Castro and Carlborg, 2007; Gjuvsland et al., 2007), but is almost meaningless in terms of predicting the genetic properties of a population (Barton and Turelli, 2004; Hansen, 2013; Álvarez-Castro and Le Rouzic, 2014), and may be negligible compared to evolutionarily-relevant additive genetic variance (Hill et al., 2008; Hemani et al., 2013).

Another idea, which has become popular only in recent decades, is that epistasis matters because of its capacity to affect additive variance rather than because of its contribution to interaction variance (Cheverud and Routman, 1995). In an epistatic genetic architecture, the effects of alleles on the phenotype depend on the genetic background. Accordingly, changes in the genetic background promoted by genetic drift (Goodnight, 1987, 1988; Barton and Turelli, 2004; Turelli and Barton, 2006; Álvarez-Castro et al., 2009; Jarvis and Cheverud, 2009) or by selection (Carter et al., 2005; Hansen et al., 2006; Hallander and Waldmann, 2007; Le Rouzic et al., 2013) may reveal, hide, or revert allelic effects, and thus significantly affect the genetic variance.

### **1.1. DIRECTIONAL EPISTASIS**

Epistasis can only exert a significant long-term influence on populations if individual epistatic effects do not tend to cancel each other out, i.e., if a general pattern emerges. The most obvious pattern is the directionality of epistasis, the fact that genetic interactions can be biased toward either high or low phenotype values. Estimates of directional epistasis make it possible to predict the evolutionary potential of populations: if additive genetic variance is a measure of evolvability (Houle, 1992; Hansen et al., 2011), then the directionality of epistasis is a measure of genetic architecture asymmetry, i.e., how evolvability is influenced by the direction of evolution. When epistasis is positive, evolution is easier in the direction of high, rather than low, phenotypic values (because additive genetic variance tends to increase with the phenotypic value). In contrast, negative epistasis favors evolution toward low phenotypic values.

In spite of its predictive and descriptive value, directional epistasis is rarely reported for quantitative characters (Pavlicev et al., 2010). This can be attributed to two main factors: (i) many (if not most) quantitative geneticists are used to measuring epistasis via epistatic genetic variance, in spite of its marginal interest, and (ii) very few statistical or computational tools have been devised for measuring directional epistasis. The aim of this article is to present several methods for estimating directional epistasis from genetic and phenotypic data, and to propose accessible statistical procedures for computing epistasis. Several such methods will be illustrated with a real-life biological example, the genetic architecture of body weight in chickens, which displays a clear and consistent signal of positive epistasis. The data are based on a long-term artificial selection experiment on chicken body weight, and feature (i) time series of the phenotypic response to selection, (ii) Quantitative Trait Locus (QTL) mapping data from a cross between the divergent lines, and (iii) minimal line-cross information (means of *F*<sub>1</sub> and *F*<sub>2</sub> populations) from the QTL setting.

#### **1.2. GENETIC MODELS**

In general, measuring the directionality of epistasis requires a model of genetic effects, i.e., a mathematical description of the relationships between the data (for instance, individual genotypes or phenotypes) and parameters to be estimated. The desirable properties for a "good" model of genetic effects depend on both the biological question and the nature of the data, and have resulted in rewarding (and sometimes conflictual) discussions (Cheverud and Routman, 1995; Hansen and Wagner, 2001b; Kao and Zeng, 2002; Yang, 2004; Zeng et al., 2005; Wang and Zeng, 2006; Álvarez-Castro and Carlborg, 2007; Aylor and Zeng, 2008; Hansen, 2014).

Genetic models can be conveniently divided into physiological and statistical models (Cheverud and Routman, 1995). In physiological (or functional: Hansen and Wagner, 2001b) models, genetic effects are described relative to a reference genotype, which can be arbitrary (for instance, one of the parental strains in an intercross) or conventional (typically, the wild genetic background). Functional models are generally rooted in traditional Mendelian genetics, in which a limited number of genotypes are experimentally generated and compared to reference strains. In contrast, statistical models quantify genetic effects in polymorphic populations across multiple genotypes. They are derived from the classical decomposition of genetic variance. Statistical genetic effects depend on allelic frequencies, and thus change when populations evolve; they provide a population-specific description of the genotype-to-phenotype map. In spite of obvious historical and conceptual divergences, it is sometimes possible to express both functional and statistical models in common mathematical frameworks, and to transform functional into statistical estimates (and *vice versa*) by means of "change of reference" operations (Hansen and Wagner, 2001b; Álvarez-Castro and Carlborg, 2007; Le Rouzic and Álvarez-Castro, 2008).

With respect to epistasis, another useful distinction can be made between unidimensional and multidimensional models (Kondrashov and Kondrashov, 2001; de Visser et al., 2011). Unidimensional epistasis describes the general curvature of the genotype-phenotype map, and can be interpreted as the average effect of allelic substitutions that would be observed if all loci were exchangeable. Multidimensional epistasis accounts for the complexity of the genotype-phenotype relationship, by characterizing all pairs of loci that have a specific epistatic effect. While directional epistasis is unidimensional by definition, it can be measured based on either unidimensional or multidimensional models.

Several models of directional epistasis will be reviewed below, starting from the multilinear model of epistasis, originally functional and multidimensional, which has been extended toward statistical and unidimensional formulations. I will then present and discuss alternative functional unidimensional models that are commonly used to measure epistasis for fitness, and show how they can be applied to quantitative characters.

#### **2. MULTILINEAR EPISTASIS**

### **2.1. THE MULTILINEAR MODEL OF GENETIC INTERACTIONS**

### *2.1.1. General framework*

The multilinear model of genetic interactions developed by Hansen and Wagner (2001b) extends and makes explicit the concept of directional epistasis in quantitative genetics, and makes it possible to build genotype-to-phenotype maps implementing directional epistasis. In its original multidimensional form, the model expresses the phenotype *z* as a multilinear function of the genotype *G* of an individual. For two loci, labeled "1" and "2" respectively,

$$z\_G = z\_R + y\_{1R} + y\_{2R} + y\_{1R}\, y\_{2R}\, \varepsilon\_{12}.\tag{1}$$

Genetic effects are measured relative to an arbitrary reference genotype for which *y*<sub>1</sub> = *y*<sub>2</sub> = 0, associated with a reference phenotype *z*<sub>*R*</sub>. The effect of substituting the genotype of interest at locus 1 in the reference genotype *R* is *y*<sub>1*R*</sub>, and conversely, *y*<sub>2*R*</sub> is the effect at locus 2. When introducing the genotype of interest at both loci, in the absence of epistasis, the phenotype is expected to change by *y*<sub>1*R*</sub> + *y*<sub>2*R*</sub>. Any deviation from this expected additive outcome is attributable to epistasis. The originality of the multilinear model is to assume that this deviation is proportional to the product of allelic effects, the proportionality coefficient ε<sub>12</sub> quantifying the strength and directionality of epistasis between loci 1 and 2.

The multilinearity arises from the fact that any change in the genotype of a locus when keeping the genetic background constant leads to a proportional change in the phenotype. For instance, Equation (1) can be reformulated as *z*<sub>*G*</sub> = *a* + *f* *y*<sub>1*R*</sub> (with *a* = *z*<sub>*R*</sub> + *y*<sub>2*R*</sub> and *f* = 1 + *y*<sub>2*R*</sub>ε<sub>12</sub>), illustrating that the genotype-phenotype map is always linear with respect to single genotypes (**Figure 1**).

The epistatic coefficient, ε<sub>12</sub>, is expressed in inverse phenotypic units (e.g., if the trait is measured in cm, ε will be in cm<sup>−1</sup>), which is not intuitive and does not allow comparisons between traits. Hansen and Wagner (2001b) suggest measuring epistasis by computing epistatic factors, *f*<sub>1</sub> = 1 + *y*<sub>2</sub>ε<sub>12</sub> and *f*<sub>2</sub> = 1 + *y*<sub>1</sub>ε<sub>12</sub>, which quantify how much locus 1 is affected by locus 2, and *vice versa*; *f* = 1 implies no epistasis, *f* < 1 negative (antagonistic) epistasis, and *f* > 1 positive (synergistic) epistasis.
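As a minimal numerical sketch of Equation (1) and of the epistatic factors, the following Python snippet may help. The function names and the numerical values are illustrative assumptions, not quantities taken from the chicken data.

```python
def multilinear_phenotype(z_ref, y1, y2, eps12):
    """Two-locus multilinear genotype-phenotype map of Equation (1)."""
    return z_ref + y1 + y2 + eps12 * y1 * y2


def epistatic_factors(y1, y2, eps12):
    """Return (f1, f2): by how much each locus is rescaled by the other one."""
    return 1 + y2 * eps12, 1 + y1 * eps12


# Hypothetical values: phenotypes and effects in cm, epsilon in cm^-1.
z_ref, y1, y2, eps12 = 10.0, 2.0, 3.0, 0.1
z = multilinear_phenotype(z_ref, y1, y2, eps12)
f1, f2 = epistatic_factors(y1, y2, eps12)

# Linearity with respect to locus 1: z = a + f1 * y1, with a = z_ref + y2.
a = z_ref + y2
assert abs(z - (a + f1 * y1)) < 1e-12
print(z, f1, f2)
```

With these hypothetical values both factors exceed 1, which corresponds to positive (synergistic) epistasis in the terminology above.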

### *2.1.2. Statistical formulation*

The multilinear model is built as a functional model, since it defines genetic effects relative to a reference genotype, but a "change of reference" tool can be used to recompute genetic effects in any genotype or weighted combination of genotypes. When genetic effects are calculated relative to the average genotype of a population, the marginal contributions of individual loci coincide with additive effects, and the model can be considered to be statistical.

The multilinear model can also be used as a local approximation on a non-multilinear genotype-phenotype map. There are various ways of generating genotype-phenotype maps, which are multidimensional mathematical functions *g*(*y*<sub>1</sub>, *y*<sub>2</sub>, ..., *y*<sub>*n*</sub>) that provide a deterministic phenotypic value for a series of genotypic values *y*<sub>*i*</sub> at *n* loci. Such mathematical maps are often defined in theoretical work intended to explain the evolution of populations in complex genetic landscapes. Furthermore, even if the lack of large empirical genotype-phenotype data sets means that it is not yet realistic to attempt to do so, it is in principle possible to fit smooth surfaces (such as multidimensional splines) to experimental measurements, and thus generate models of genetic landscapes that could be analyzed mathematically (and tested empirically).

In any case, the multidimensional directional epistasis coefficient ε<sub>*ij*</sub>, which measures the curvature of the genotype-phenotype function between loci *i* and *j*, can be directly quantified as ε<sub>*ij*</sub> = *D*<sup>2</sup><sub>*ij*</sub>/(*D*<sub>*i*</sub>*D*<sub>*j*</sub>), where *D*<sub>*i*</sub> = ∂*g*/∂*y*<sub>*i*</sub> is the value of the first partial derivative of function *g* taken at the reference point, and *D*<sup>2</sup><sub>*ij*</sub> = ∂<sup>2</sup>*g*/∂*y*<sub>*i*</sub>∂*y*<sub>*j*</sub> is the mixed partial derivative (the curvature of the function *g* across both loci). This result illustrates the fact that the multilinear model is similar to a Taylor expansion of the genotype-phenotype map that ignores intra-locus curvature (Hansen and Wagner, 2001b) (see Appendix I and **Figure 2**).
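The ratio of derivatives above can be approximated numerically for any smooth map. The sketch below uses central finite differences; the function names, the step size `h`, and the example map are assumptions made for illustration only.

```python
def directional_epistasis(g, y_ref, i, j, h=1e-3):
    """Approximate eps_ij = D2_ij / (D_i * D_j) at y_ref (requires i != j)."""
    def shifted(di, dj):
        # Evaluate g after moving loci i and j away from the reference point.
        y = list(y_ref)
        y[i] += di
        y[j] += dj
        return g(y)

    d_i = (shifted(h, 0.0) - shifted(-h, 0.0)) / (2 * h)
    d_j = (shifted(0.0, h) - shifted(0.0, -h)) / (2 * h)
    d2_ij = (shifted(h, h) - shifted(h, -h)
             - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return d2_ij / (d_i * d_j)


def example_map(y):
    """Hypothetical two-locus multilinear map with eps_12 = 0.2."""
    return 5.0 + y[0] + y[1] + 0.2 * y[0] * y[1]


eps_hat = directional_epistasis(example_map, [0.0, 0.0], 0, 1)
print(eps_hat)
```

For a map that is exactly multilinear, as in this example, the finite-difference estimate recovers the coefficient up to rounding error; for other maps it is only a local approximation at the chosen reference point.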

### *2.1.3. Composite directional epistasis*

**approximation of the interlocus curvature in a complex genotype-phenotype map.** When the average genotype is chosen as the reference (red point), the multilinear approximation is able to predict the evolutionary properties of the population in a more precise way than the additive model.

The original multilinear model is multidimensional, as it involves as many ε<sub>*ij*</sub> parameters as pairs of loci. A unidimensional (and statistical) version of the model was proposed in Carter et al. (2005), with the composite directional epistasis coefficient ε<sub>*c*</sub> calculated as the average ε<sub>*ij*</sub> coefficient weighted by the additive genetic variance explained by each pair of loci:

$$\varepsilon\_{c} = \frac{\sum\_{i} \sum\_{j \neq i} V\_{A\_i} V\_{A\_j} \varepsilon\_{ij}}{\sum\_{i} \sum\_{j \neq i} V\_{A\_i} V\_{A\_j}}.\tag{2}$$

Both uni- and multi-dimensional versions of the model can be extended to higher orders of interactions and to multiple traits (Hansen and Wagner, 2001b).
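Equation (2) is a weighted average and can be computed directly. The following sketch assumes that the additive variances and the pairwise coefficients are already available; the function name and all numbers are hypothetical.

```python
def composite_epistasis(var_a, eps):
    """Composite directional epistasis of Equation (2).

    var_a: additive variances V_A per locus.
    eps:   square matrix (list of lists) of pairwise eps_ij; the diagonal is ignored.
    """
    num = 0.0
    den = 0.0
    n = len(var_a)
    for i in range(n):
        for j in range(n):
            if i != j:
                weight = var_a[i] * var_a[j]
                num += weight * eps[i][j]
                den += weight
    return num / den


# Hypothetical three-locus example.
var_a = [1.0, 2.0, 3.0]
eps = [[0.0, 0.1, -0.2],
       [0.1, 0.0, 0.3],
       [-0.2, 0.3, 0.0]]
eps_c = composite_epistasis(var_a, eps)
print(eps_c)
```

Pairs of loci that both carry a large additive variance dominate the composite value, so that a few negative coefficients among minor loci need not change its sign.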

### **2.2. DIRECTIONAL EPISTASIS FROM PHENOTYPIC DATA**

#### *2.2.1. Response to artificial selection*

Directional epistasis affects evolution, as it changes the amount of genetic variation available depending on the direction of phenotypic change (Hansen et al., 2006). For instance, selection in the direction of positive epistasis tends to increase the frequency of synergistic genetic interactions, thus enhancing the effect of selection. In contrast, selection in an antagonistic system decreases the genetic variance, and thus decreases the selection response. These effects can be experimentally observed, especially with bidirectional artificial selection responses, since they are expected to generate asymmetric responses in up- and down-selected lines.

*2.2.1.1. Theoretical framework.* It is possible to model the expected impact of directional epistasis on genetic variance and to predict the difference between up- and down-selected lines as a function of the epistatic coefficients. Using a series of simplifying assumptions detailed in Appendix II, the selection response under a constant selection gradient after *t* generations is expected to be:

$$
\mu\_t \simeq \mu\_0 - \frac{\log\left(1 - 2\Delta\_{\mu\_0}\varepsilon t\right)}{2\varepsilon}
$$

$$
\approx \mu\_0 + \Delta\_{\mu\_0}t + \varepsilon\Delta\_{\mu\_0}^2 t^2 + \dots,\tag{3}
$$

where μ<sub>0</sub> is the initial mean phenotype, Δ<sub>μ<sub>0</sub></sub> is the initial selection response (after the first generation), and ε is the directionality of epistasis. The second part of the equation is the second-order Taylor approximation around *t* = 0, illustrating the linear selection response expected by the traditional breeder's equation (Δ<sub>μ<sub>0</sub></sub>*t*), and how directional epistasis appears as a quadratic term. Here, ε is the unidimensional directional epistasis, and thus corresponds to ε<sub>*c*</sub> in Equation (2).

A convenient way to estimate directional epistasis from bidirectional selection responses is to compute the up/down asymmetry through the average selection response, *A*(*t*) = ½(up(*t*) + down(*t*)) (**Figure 3**). If epistasis is directional and relatively weak (Δ<sub>μ<sub>0</sub></sub>ε ≪ 1), *A*(*t*) changes approximately with *t*<sup>2</sup>, such that *A*(*t*) ≈ εΔ<sup>2</sup><sub>μ<sub>0</sub></sub>*t*<sup>2</sup>. It is thus possible to estimate Δ<sub>μ<sub>0</sub></sub> as the slope at origin of the selection response, and then ε through a quadratic regression on the average up/down response. Including the effects of, e.g., inbreeding, linkage disequilibrium, or canalization is possible, but requires numerical maximization of the likelihood of complex models. This can be done with the software package sra for R, described in Le Rouzic et al. (2011).
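The quadratic-regression idea can be sketched on synthetic data generated from Equation (3). This is only an illustration under stated assumptions: the responses are simulated without noise, the up and down lines share the same absolute initial response, the average is expressed as a deviation from the initial mean, and all names and parameter values are hypothetical. It is not the likelihood machinery of the sra package.

```python
import math


def expected_mean(mu0, delta0, eps, t):
    """Expected mean phenotype after t generations, first line of Equation (3)."""
    return mu0 - math.log(1 - 2 * delta0 * eps * t) / (2 * eps)


def estimate_epsilon(up, down, mu0, delta0):
    """Least-squares fit of A(t) - mu0 = c * t^2 (no intercept); returns c / delta0^2."""
    num = 0.0
    den = 0.0
    for t, (u, d) in enumerate(zip(up, down)):
        a_t = 0.5 * (u + d) - mu0   # average up/down response, as a deviation
        num += t**2 * a_t
        den += t**4
    return (num / den) / delta0**2


# Hypothetical parameters: initial mean (g), initial response (g/generation), epsilon (1/g).
mu0, delta0, eps_true = 800.0, 20.0, 5e-5
generations = range(0, 51)
up = [expected_mean(mu0, +delta0, eps_true, t) for t in generations]
down = [expected_mean(mu0, -delta0, eps_true, t) for t in generations]

eps_hat = estimate_epsilon(up, down, mu0, delta0)
print(eps_true, eps_hat)
```

Because the quadratic term is only the leading part of the expansion, the estimate is expected to be slightly biased when the product of ε, the initial response, and the number of generations is not small.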

*2.2.1.2. Example: artificial selection on body weight.* For more than 50 years, two chicken (*Gallus gallus*) lines were selected for high and low body weight at 56 days, respectively (Siegel, 1962; Liu et al., 1994; Dunnington and Siegel, 1996). The experiment is still ongoing; here, I consider the latest phenotypic results available (54 generations, Dunnington et al., 2013). For simplicity, only the time series of mean phenotypes are considered, although some variance estimates were also available in this case.

The impact of artificial selection was considerable (**Figure 4**). In the high-selected line, the body weight at 8 weeks rose from 800 g (male-female average) to 1650 g. In the low-selected line, the average body weight decreased to around 150 g, leading to an impressive order-of-magnitude difference between high- and low-selected lines, well beyond the differences usually observed between closely-related species, and spanning more than one third of the relative weight diversity in the entire 20 Myr-old Galliformes order. The selection response was asymmetric: although the selection strength was identical in both lines, progress was slower in the low line. This can easily be attributed to epistasis, given the expected differences in the genetic backgrounds of 1500 vs. 150 g birds.

in the high line, and −19.6 g per generation in the low line. **Bottom:** quadratic regression on the up- and down-selection average, illustrating the cumulative effect of directional epistasis. The quadratic coefficient (which is an approximation of Δ<sup>2</sup><sub>μ<sub>0</sub></sub>ε), estimated by a non-linear, least-square regression, was 0.033 g per generation squared.

Using the procedure described in Equation (3), the strength of directional epistasis could be estimated from a quadratic regression over the high-low asymmetry. Estimating the initial selection response at around |Δ<sub>μ<sub>0</sub></sub>| = 22.6 g per generation on average, directional epistasis is ε ≈ +6.6 × 10<sup>−5</sup> g<sup>−1</sup>. Although apparently small, this figure is statistically significant and generates cumulative effects on genetic architectures: any phenotypic change corresponding to the initial (first-generation) selection response induces an increase in allelic effects of 0.15% in the high line, and a corresponding decrease in the low line. The same allele is thus expected to display a >10% difference in the two extreme genetic backgrounds, representing weak, but non-negligible, epistasis.
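The arithmetic behind these figures can be retraced from the two rounded quantities reported in the text (the quadratic coefficient of 0.033 g per generation squared and the initial response of 22.6 g per generation). The variable names are illustrative; with these rounded inputs the result is about 6.5 × 10<sup>−5</sup> g<sup>−1</sup>, close to the reported value, and the small gap is presumably due to rounding of the published inputs (an assumption).

```python
quadratic_coefficient = 0.033   # g per generation^2, approximates eps * Delta^2
initial_response = 22.6         # g per generation, |Delta_mu0|

# Directional epistasis, in 1/g.
eps = quadratic_coefficient / initial_response**2

# Relative change in allelic effects after a phenotypic shift of one
# first-generation response, i.e. the epistatic factor minus one.
relative_change = eps * initial_response

print(f"eps ~ {eps:.2e} per g")
print(f"change in allelic effects ~ {100 * relative_change:.2f} %")
```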

Of course, this estimate relies on major assumptions about the underlying process. Several genetic or non-genetic factors other than epistasis could affect the available genetic variance, and thus bias ε. For instance, the quadratic approximation relies on the hypothesis that the selection gradient is constant over the entire time series, whereas in fact we know from e.g., Dunnington et al. (2013) that the selection intensity actually increases with time. Meanwhile, the reduced population size in the experiment necessarily generated a significant amount of inbreeding (even with a carefully-designed breeding scheme), which decreases the variance due to genetic drift. However, these mechanisms are unlikely to generate misleading estimates of ε, since (i) they affect both the up and down lines in the same way, and so cannot generate any asymmetry, and (ii) they tend to offset each other, as the selection strength increases while the genetic variance decreases.

More worrisome is the possibility of uncontrolled natural selection in the low line. A fraction of the smallest birds appeared to be sterile or unviable, which could contribute to the slowing down of the response. Such a mechanism could generate an asymmetric response, and thus spurious positive estimates of the epistatic coefficient. Nevertheless, this seems rather unlikely, given the behavior of the twelve relaxed selection lines presented in Dunnington et al. (2013). Indeed, when selection was stopped in both lines, the populations did not tend to evolve back to the original phenotype, as would have been expected if natural selection was preventing the population from responding to artificial selection. The phenotypic data therefore seem to be compatible with a genetically-driven asymmetry, due to smaller allelic effects in low-weight chickens (i.e., positive epistasis).

#### *2.2.2. Line-cross analysis*

With the improvement in sequencing and genotyping technologies, the phenotype-based methods developed and used by quantitative geneticists for most of the 20th century to investigate genetic architectures without resorting to genotype data are currently losing popularity. However, they are still both elegant and informative, especially when used to estimate general properties of populations such as unidimensional directional epistasis. One of the most powerful (and simple) of these biometric methods consists of crossing individuals or strains of interest in order to generate hybrid and backcross populations, from which the phenotypic means and variances can be determined. The knowledge of the transmission mechanisms of genetic factors from parents to offspring makes it possible to disentangle the impact of additive, dominance, and epistatic effects on the genetic differences between the original individuals (Lynch and Walsh, 1998, p. 205).

A set of equations that can be used to compute additive, dominance, and directional epistatic effects from parental, intercross, and backcross populations is provided in Hansen and Wagner (2001b) (see Demuth and Wade, 2005, for an alternative model). Directional epistasis is unidimensional, and thus corresponds to the ε<sub>*c*</sub> parameter of Equation (2). Below, a slightly different parameterization will be used, in which both parental populations are separated by four additive effects, so that the model is identical to a 2-locus QTL effect model in a diploid species. The model was set up so that genetic effects cancel out in the *F*<sub>2</sub> population, but a different reference point can be chosen (using the genetic effect matrices provided in, e.g., Álvarez-Castro and Carlborg, 2007). Average phenotypes for both parental populations (*P*<sub>1</sub> and *P*<sub>2</sub>) and the first two intercross populations *F*<sub>1</sub> and *F*<sub>2</sub> can be expressed as functions of four parameters: a reference μ (arbitrarily, the mean *F*<sub>2</sub>), additive and dominance effects *A* and *D*, and the directional epistasis coefficient ε.

$$P\_1 = \mu - 2A - D + \varepsilon (A^2 + AD + \frac{1}{4}D^2)$$

$$P\_2 = \mu + 2A - D + \varepsilon (A^2 - AD + \frac{1}{4}D^2)$$

$$F\_1 = \mu + D + \frac{1}{4}\varepsilon D^2 \tag{4}$$

$$F\_2 = \mu.$$

This simple model can be illustrated by the data from the experimental cross between the two chicken strains (Dunnington and Siegel, 1996; Marquez et al., 2010). In this experiment, the two generations of crossing necessary to generate a polymorphic F2 population for QTL mapping make it possible to sketch a minimal line-cross analysis. Both parental populations as well as F1 and F2 individuals were raised in the same location, with the same food, and at the same density; their average weights at 8 weeks were 170 g and 1412 g for the two parental chicken populations respectively, 650 g for the F1, and 624 g for the F2. Both F1 and F2 are below the parental arithmetic average (791 g), suggesting the presence of dominance and/or epistatic effects (Álvarez-Castro et al., 2012).

Although not perfect, this setting makes it possible to estimate up to four genetic parameters. Two models, with and without dominance, were tested, and gave very similar results (Equation 4 and **Table 1**). The dominance effect, when estimated, was an order of magnitude below the additive contribution. Epistasis was positive, and of similar magnitude in both models.
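With four means and four parameters, the full model of Equation (4) can be solved exactly. The sketch below is one hedged way of doing so in Python: it eliminates μ, *A*, and ε algebraically and finds *D* by bisection. The function names and the bracketing interval are assumptions (the bracket presumes that F1 lies above F2 and that the parental average lies above F2, as in the means quoted above); it is not the fitting procedure used for **Table 1**, and the values it returns need not match that table digit for digit.

```python
def predict(mu, A, D, eps):
    """Expected means (P1, P2, F1, F2) under Equation (4)."""
    p1 = mu - 2 * A - D + eps * (A**2 + A * D + D**2 / 4)
    p2 = mu + 2 * A - D + eps * (A**2 - A * D + D**2 / 4)
    f1 = mu + D + eps * D**2 / 4
    return p1, p2, f1, mu


def solve_full_model(p1, p2, f1, f2, iters=200):
    """Solve Equation (4) for (mu, A, D, eps) by bisection on D."""
    mu = f2
    s = f1 - mu              # equals D + eps * D^2 / 4
    h = (p1 + p2) / 2 - mu   # equals -D + eps * (A^2 + D^2 / 4)
    d = (p2 - p1) / 4        # equals A * (1 - eps * D / 2)

    def given_d(D):
        eps = 4 * (s - D) / D**2
        A = d / (1 - eps * D / 2)
        return A, eps

    def residual(D):
        A, eps = given_d(D)
        return -D + eps * (A**2 + D**2 / 4) - h

    # Assumed bracket: between the singular point D = 2s/3 and D = s.
    lo, hi = (2 * s / 3) * (1 + 1e-9), s * (1 - 1e-9)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    D = (lo + hi) / 2
    A, eps = given_d(D)
    return mu, A, D, eps


# Means (g) quoted in the text: P1, P2, F1, F2.
observed = (170.0, 1412.0, 650.0, 624.0)
mu, A, D, eps = solve_full_model(*observed)
print(mu, A, D, eps)
print(predict(mu, A, D, eps))
```

Since the system is non-linear, other solutions may exist outside the assumed bracket; the one returned here has a small positive dominance effect and a positive ε.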

#### **2.3. DIRECTIONAL EPISTASIS FROM QTL DATA**

Nowadays, data sets often consist of individuals in which both the phenotype and the genotype at loci of interest are known. This is for instance the case after the mapping of Quantitative Trait Loci (QTLs), either by linkage or association methods. Such data sets represent a valuable source of information about epistasis, and in particular about multidimensional epistasis, which can hardly be estimated from phenotypic data.

*The full model (involving dominance) has no degree of freedom, so that statistical errors cannot be estimated.*

#### *2.3.1. Linear and multilinear models of genetic effects*

In most cases, QTL mapping procedures only focus on marginal (additive and dominance) effects, and do not explicitly consider genetic interactions (Carlborg and Haley, 2004). However, epistasis may be of major interest, both for improving QTL detection (Carlborg et al., 2003, 2004, 2006), and for the biological interpretation of the genotype-phenotype relationship (Malmberg and Mauricio, 2005; Le Rouzic et al., 2007, 2008). Mapping procedures accounting for epistasis generally rely on components of the interaction variance (Cockerham, 1954; Kempthorne, 1954; Lynch and Walsh, 1998), which makes it necessary to estimate four genetic effects for each pair of loci (additive-by-additive, additive-by-dominant, dominant-by-additive, and dominant-by-dominant statistical effects). More recently, "variance QTL" approaches have been proposed to map loci involved in various kinds of interactions, including gene-gene and gene-environment interactions (Rönnegård and Valdar, 2012). Until recently, there was no QTL mapping method based on directional epistasis (Slatkin and Kirkpatrick, 2012), and estimation from genotype-phenotype data usually relied on model fitting on a predefined set of candidate loci (Cheverud et al., 2001; Le Rouzic et al., 2008; Shao et al., 2008; Pavlicev et al., 2010; Jarvis and Cheverud, 2011).

The traditional genetic regression model, ignoring dominance (and dominance-related epistatic components), can be written as:

$$P\_{y\_1, y\_2} = \mu + \alpha\_1 S\_1 + \alpha\_2 S\_2 + \alpha\alpha\_{12} S\_{12}.\tag{5}$$

This model has 4 parameters for a pair of loci: μ is the intercept of the model (reference point), α<sub>1</sub> and α<sub>2</sub> are the additive effects for both loci, and αα<sub>12</sub> (a traditional, and probably unfortunate, notation, not to be confused with the product α × α<sub>12</sub>) is the additive-by-additive effect. The *S* coefficients determine the genetic model, i.e., the weights of the genetic effects for each genotype. For instance, consider a haploid two-locus two-allele system with the reference genotype (arbitrarily) set to *A*<sub>1</sub>*B*<sub>1</sub>. In the reference genotype, all *S* coefficients are set to 0 (μ, the reference point, thus corresponds to the intercept of the model). For genotype *A*<sub>1</sub>*B*<sub>2</sub>, *S*<sub>1</sub> = 0, *S*<sub>2</sub> = 1 (because one effect α<sub>2</sub> has been added to the model, given the substitution of a *B*<sub>2</sub> allele), and *S*<sub>12</sub> = 0. In genotype *A*<sub>2</sub>*B*<sub>2</sub>, *S*<sub>1</sub> = 1, *S*<sub>2</sub> = 1, and *S*<sub>12</sub> = 1, reflecting the possibility of an interaction between *A*<sub>2</sub> and *B*<sub>2</sub> alleles. Of course, different reference points can be chosen, including mixtures of genotypes in specific frequencies (such as in the *F*<sub>2</sub> model, considering even allelic frequencies and Hardy-Weinberg proportions). The model becomes more complex with diploid genotypes (which include dominance effects), but the principle remains the same. Below, I used the model "NOIA" proposed by Álvarez-Castro and Carlborg (2007), which has some interesting statistical features. In particular, the model is orthogonal (provided there is no linkage disequilibrium) even if the population is not at Hardy-Weinberg proportions. In "NOIA," the *S* coefficients are stored as a genetic design matrix, and the model can be extended (to include more alleles and/or more loci) using simple matrix algebra.
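The haploid two-locus example can be written out as a small design matrix. The sketch below is a plain-Python illustration and does not use the NOIA software; the row for *A*<sub>2</sub>*B*<sub>1</sub> is filled in by symmetry with the *A*<sub>1</sub>*B*<sub>2</sub> case described above, and the genotypic values are hypothetical.

```python
# Weights of (mu, alpha1, alpha2, alphaalpha12) for each haploid genotype,
# with A1B1 as the reference genotype: columns are (1, S1, S2, S12).
S = {
    "A1B1": (1, 0, 0, 0),
    "A2B1": (1, 1, 0, 0),
    "A1B2": (1, 0, 1, 0),
    "A2B2": (1, 1, 1, 1),
}


def genetic_effects(pheno):
    """Solve Equation (5) exactly for the four genotypic values in pheno."""
    mu = pheno["A1B1"]
    alpha1 = pheno["A2B1"] - mu
    alpha2 = pheno["A1B2"] - mu
    aa12 = pheno["A2B2"] - mu - alpha1 - alpha2
    return mu, alpha1, alpha2, aa12


def predict(effects, genotype):
    """Genotypic value as the S-weighted sum of genetic effects."""
    return sum(w * e for w, e in zip(S[genotype], effects))


# Hypothetical genotypic values.
pheno = {"A1B1": 10.0, "A2B1": 12.0, "A1B2": 13.0, "A2B2": 16.0}
effects = genetic_effects(pheno)
print(effects)
```

With four genotypes and four parameters the decomposition is exact; with real data the same design matrix would be used in a regression instead.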

It is possible to modify the above framework to estimate directional epistasis. The strategy proposed by Le Rouzic and Álvarez-Castro (2008) is based on a non-linear, least-square regression, very similar to the framework proposed in Equation (4) for the analysis of line crosses: the model explicitly decomposes the epistatic parameter as a multilinear combination of additive effects, assuming that αα<sub>*ij*</sub> = α<sub>*i*</sub> × α<sub>*j*</sub> × ε<sub>*ij*</sub>:

$$P\_{y\_1, y\_2} = \mu + \alpha\_1 S\_1 + \alpha\_2 S\_2 + \alpha\_1 \alpha\_2 \varepsilon\_{12} S\_{12}.\tag{6}$$

This setting can easily be extended to account for dominance and higher-order epistasis (Álvarez-Castro and Carlborg, 2007; Le Rouzic and Álvarez-Castro, 2008; Pavlicev et al., 2010). When ε<sub>*ij*</sub> is estimated for each pair of loci, the model describes multidimensional epistasis. There are two distinct ways to estimate unidimensional epistasis from this setting. The first method is to assume that ε is identical between loci, i.e., replacing ε<sub>*ij*</sub> by a constant ε in Equation (6). The second strategy is to estimate independent ε<sub>*ij*</sub> values for each pair of loci, and to compute the composite epistasis ε<sub>*c*</sub> using Equation (2). This last strategy is more theoretically-grounded than the former, but it rapidly becomes impractical when the number of loci increases: the number of interactions increases quadratically with the number of loci, which reduces the precision of pairwise interaction estimates.
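For a single pair of loci, Equations (5) and (6) are two parameterizations of the same model, so a pairwise coefficient can be recovered from additive and additive-by-additive estimates whenever both additive effects are non-zero. The sketch below shows only this conversion; the function name and values are hypothetical, and it does not reproduce the non-linear least-square fit itself.

```python
def multilinear_epsilon(alpha_i, alpha_j, aa_ij):
    """Pairwise eps_ij implied by aa_ij = alpha_i * alpha_j * eps_ij."""
    if alpha_i == 0 or alpha_j == 0:
        # The multilinear coefficient is undefined without marginal effects.
        raise ValueError("eps_ij is undefined when an additive effect is zero")
    return aa_ij / (alpha_i * alpha_j)


# Hypothetical estimates: additive effects in g, interaction effect in g.
alpha1, alpha2, aa12 = 2.0, 3.0, 1.2
eps12 = multilinear_epsilon(alpha1, alpha2, aa12)
print(eps12)   # in 1/g
```

The division by the product of additive effects also shows why pairwise estimates become imprecise when marginal effects are small.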

### *2.3.2. Application to QTLs for body weight*

Individuals from both the high and low chicken lines were intercrossed at generation 46, to form the F1 and F2 populations described above. The 795 surviving individuals from the F2 population were phenotyped for various characters and genotyped for 145 genetic markers on 25 chromosomes. The QTL mapping analysis identified 6 significant loci (four major loci and two of lesser effect). These significant loci combined explained around 10% of the phenotypic variance, and strong epistatic interactions have been reported among them (Carlborg et al., 2006; Le Rouzic et al., 2007; Álvarez-Castro et al., 2012). For the sake of both simplicity and statistical power, only the four major QTLs are considered in the subsequent analyses.

There are 24 second-order epistatic interactions between four loci (6 additive-by-additive, 6 dominance-by-dominance, and 12 additive-by-dominance interactions). It is possible to estimate all of them using a model performing the traditional decomposition of genetic effects (here, I used the software package noia for R, Le Rouzic and Álvarez-Castro, 2008), but interpreting these 24 independent epistatic estimates is complicated: in spite of the large sample size (around 800 individuals), only 4 (out of 24) epistatic estimates reached the 5% *p*-value threshold, and none remained statistically significant after correction for multiple-testing. There were no obvious signs of directional epistasis (11 positive estimates out of 24), even when focusing on additive-by-additive epistasis (3 positive estimates out of 6).
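The count of 24 interactions follows from the number of pairs of loci, each pair contributing one additive-by-additive, one dominance-by-dominance, and two additive-by-dominance terms. A short sketch (the function name is illustrative) makes the quadratic growth with the number of loci explicit:

```python
from math import comb


def second_order_interactions(n_loci):
    """Number of pairwise epistatic parameters in a diploid model, by type."""
    pairs = comb(n_loci, 2)
    return {
        "additive-by-additive": pairs,
        "dominance-by-dominance": pairs,
        "additive-by-dominance": 2 * pairs,
    }


counts = second_order_interactions(4)
print(counts, sum(counts.values()))
```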

Fitting a unidimensional multilinear model of epistasis leads to a much more conclusive analysis. The estimated constant ε coefficient is positive (ε = +0.057 g<sup>−1</sup>). The weighted composite parameter, calculated from Equation (2), is also positive and of the same order of magnitude (ε<sub>*c*</sub> = +0.020 g<sup>−1</sup>). The multilinear model fits better than the traditional genetic-effects model with pairwise epistasis, outperforming it by 13.5 AIC units (AIC differences >10 can be considered to be conclusive, Burnham and Anderson, 2002). The multilinear model is also considerably better than models without epistasis (ΔAIC = 18.5). The indisputable statistical superiority of the multilinear model translates into a substantial gain in explanatory power: the four-locus model without epistasis explains only 5.4% of the total phenotypic variance, while the multilinear model explains 7.8%.

### **3. REGRESSIONS AGAINST THE NUMBER OF MUTATIONS**

While it is particularly rare to find estimates of directional epistasis for quantitative characters in general (Pavlicev et al., 2010), the sign of epistasis has been frequently estimated for fitness. The importance of directional epistasis for the logarithm of fitness has now been fully acknowledged by evolutionary biologists, as it affects the evolution of sex, recombination, mutation rates, and other related phenomena (Phillips et al., 2000). Here I will review two models frequently used in this context, and show how they can be modified to fit other quantitative traits. According to the previous definitions, these models are both functional and unidimensional, as they estimate directional epistasis with reference to the "wild type" with no mutations.

#### **3.1. MODEL DESCRIPTION**

A common way to estimate directional epistasis for (log) fitness is a "power" (or "multiplicative") model *W* = α*n*<sup>β</sup> (illustrated in **Figure 6**), where *W* stands for the log-fitness, α is the effect of a single mutation, *n* is the number of mutations, and β measures directional epistasis. The model is based on the fact that the fitness of the reference individual or strain (*n* = 0) is 1, so that the intercept of the model is log (1) = 0 by construction. Fitness in single mutants (*n* = 1) is not affected by epistasis, which makes it possible to estimate α. Epistasis appears for *n* ≥ 2, generating deviations from linearity. β > 1 represents positive epistasis, while β < 1 stands for negative epistasis. The parameters of the model are usually estimated through non-linear regressions (least squares) or by non-linear generalized model approaches (maximum likelihood).

An alternative setting is the quadratic model *W* = −(α*n* + ½β*n*<sup>2</sup>) (Elena and Lenski, 1997; Kouyos et al., 2007) (for consistency with the literature, I have retained the same notation, although it should be noted that the β of the quadratic model and the β of the power model have different units, and that in the quadratic model β > 0 means positive epistasis). This latter model has some interesting theoretical properties associated with the Gaussian fitness function, and is more firmly grounded in classical population genetics theory (Charlesworth, 1990; Otto, 2007).

Alternative parameterizations of the above models appear in the literature (e.g., estimating −α instead of α, or β − 1 instead of β, which provides a more straightforward interpretation of "positive" and "negative" epistasis). This framework is generally used in two different experimental contexts: estimating the directionality of deleterious mutations (in which case, α < 0, and negative epistasis means that the deleterious mutations act synergistically to decrease fitness), or estimating epistasis among the beneficial mutations accumulated during an artificial evolution experiment (α > 0, and negative epistasis represents the antagonistic effects of mutations) (Lenski et al., 1999; Wilke and Adami, 2001; Maisnier-Patin et al., 2005). These symmetric interpretations are arguably confusing, and the literature is not always consistent with regard to the association between the sign of directional epistasis and the synergistic or antagonistic properties of mutations (e.g., Szathmáry, 1993).

#### **3.2. MODEL FITTING**

These models are clearly not suited for fitting traditional quantitative genetics data, in which there are no "wild type" or "mutants." However, it is still possible to define the following continuous function for a phenotype *P*, which behaves in a similar fashion as the power model:

$$P(m) = \begin{cases} \mu + \alpha m^{\beta}, & \text{if } m > 0 \\ \mu, & \text{if } m = 0 \\ \mu - \alpha |m|^{1/\beta}, & \text{if } m < 0, \end{cases} \tag{7}$$

where *m* is a real number analogous to the "number of mutations" compared to the reference genotype, α and β have the same meaning as in the power model (α is the average effect of the first mutation, and β is the epistatic coefficient, with β = 1 standing for no epistasis). μ is the intercept of the model, i.e., the phenotype of the "reference genotype." This function is not differentiable at *m* = 0, but this is unlikely to affect the estimates. In order to obtain a proper analogy with traditional quantitative genetics, the mean F2 (same number of alleles from both parental lines) was chosen as the reference. *m*, the "number of mutations" parameter, thus stands for the number of additional "high-line" (H) alleles in a genotype compared to the reference. Considering the 4 significant QTLs, *m* = 0 for the reference (mean F2) genotype (which has 4 low-line alleles and 4 high-line alleles), *m* = −4 in the full low-line genotype (8 alleles from the low-line), and *m* = +4 in the full high-line genotype. An equivalent formulation (*P*(*m*) = μ + α*m* + ½β*m*<sup>2</sup>) can also be defined for the quadratic model.

Fitting the "continuous power model" of Equation (7) to the data by a non-linear, least-square procedure leads to the following estimates (estimate ± std. err.): α = 13.0 ± 5.8 g; β = 2.18 ± 0.41 (**Figure 5**). This is indicative of strong (and statistically significant) positive epistasis. The first allelic substitution in the reference background (average F2 individual) is thus expected to have an effect of 13 g, whereas the second substitution will affect the phenotype by 45.9 g (two "high" substitutions) or 4.9 g (two "low" substitutions). The epistatic effect is extreme for the fourth substitution, which is predicted to have an effect of 124 g in the "high" direction (i.e., 10 times the estimated effect in the average genetic background) but only 3 g in the "low" direction. The estimate of directional epistasis in the power model is heavily influenced by the few "extreme" genotypes: the 7 individuals with eight "H" alleles are all far above the average, which contributes to the excessive curvature of the genotype-phenotype relationship (**Figure 5**). Yet, epistasis is still present when all extreme genotypes (full homozygotes LL and HH) are removed, with an estimate of β = 1.83 ± 0.50.
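The substitution effects quoted above follow directly from Equation (7) and the point estimates of α and β. A short sketch (function names are illustrative; the intercept μ cancels out of the differences and is set to 0) retraces them:

```python
def continuous_power(m, mu, alpha, beta):
    """Continuous power model of Equation (7)."""
    if m > 0:
        return mu + alpha * m**beta
    if m < 0:
        return mu - alpha * abs(m)**(1 / beta)
    return mu


def substitution_effect(k, alpha, beta, direction):
    """Absolute effect of the k-th substitution (k >= 1) away from the reference.

    direction is +1 for "high" substitutions and -1 for "low" substitutions.
    """
    p_k = continuous_power(direction * k, 0.0, alpha, beta)
    p_prev = continuous_power(direction * (k - 1), 0.0, alpha, beta)
    return abs(p_k - p_prev)


alpha, beta = 13.0, 2.18   # point estimates reported in the text
for k in (1, 2, 4):
    high = substitution_effect(k, alpha, beta, +1)
    low = substitution_effect(k, alpha, beta, -1)
    print(f"substitution {k}: high {high:.1f} g, low {low:.1f} g")
```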

**to the chicken QTL data.** The reference genotype contains as many "low" (L) alleles as "high" (H) alleles. The x-axis scales from −4 (LL genotype at all loci) to +4 (HH genotype at all loci). Intermediate numbers of mutations are due to genotype uncertainties when QTLs are not in total linkage disequilibrium with markers.

Estimates from the quadratic model are α = 23.1 ± 4.7 g, and β′ = 8.3 ± 4.0 g. In spite of the similar notation, β′ is not on the same scale as β, and directional epistasis, although significantly positive, is smaller here (the first two allelic substitutions in the direction of higher phenotypes have an effect of 27.3 and 35.6 g respectively, vs. 19.0 g and 10.7 g for one and two substitutions toward lower phenotypes).
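The same arithmetic can be sketched for the quadratic model, taking it as the phenotype μ + α·m + ½·β′·m² with the point estimates quoted above. The function and parameter names (`beta_q` for the quadratic epistatic coefficient) are illustrative choices, and μ is again set to zero.

```python
# Sketch: effect of successive substitutions under the quadratic model
# P(m) = mu + alpha * m + 0.5 * beta_q * m**2 (beta_q is the quadratic
# epistatic coefficient; the name is an illustrative assumption).

def quadratic_model(m, mu, alpha, beta_q):
    """Phenotype for m additional high-line alleles relative to the reference."""
    return mu + alpha * m + 0.5 * beta_q * m ** 2


def substitution_effect_quadratic(k, direction, mu=0.0, alpha=23.1, beta_q=8.3):
    """Effect (in g) of the k-th substitution, counted in its own direction.

    A positive value means a high-line substitution raises the phenotype
    (direction=+1) or a low-line substitution lowers it (direction=-1).
    A negative value means the substitution acts against its usual direction.
    """
    far = quadratic_model(direction * k, mu, alpha, beta_q)
    near = quadratic_model(direction * (k - 1), mu, alpha, beta_q)
    return direction * (far - near)


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        high = substitution_effect_quadratic(k, +1)
        low = substitution_effect_quadratic(k, -1)
        print(f"substitution {k}: high {high:6.2f} g, low {low:6.2f} g")
```

With these point estimates, the computed effect of a fourth low-line substitution comes out negative, that is, the fitted parabola passes its minimum within the range of the data.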

#### **4. DISCUSSION**

#### **4.1. MODEL COMPARISONS**

Although they all provide an estimate of unidimensional directional epistasis, the models reviewed in this paper were designed to address different questions, and are rooted in different sub-fields of population and quantitative genetics.

The multilinear model provides an explicit description of epistasis between a set of loci, as in classical quantitative genetics models, and can be extended to fit phenotypic data. In contrast, both "regression" models suppose that epistatic patterns follow a general function. This incompatibility between models of directional epistasis for fitness and traditional quantitative genetics models is probably an important factor in the lack of experimental measurements of directional epistasis for quantitative traits (Hansen and Wagner, 2001a; Pavlicev et al., 2010).

In addition to the fact that models are not designed to be applied to the same kind of data (the need to compare genotypes to an arbitrary wild type or the assumption of constant mutational effect size are difficult to overcome for quantitative genetics data), models also carry conceptual differences about the nature of epistatic interactions. For instance, the power model necessarily involves highly complex epistatic interactions (Hansen and Wagner, 2001a). Quantitative genetics relies on linear models of genetic effects, in which interactions are calculated iteratively as the deviation between mutant phenotypes and the sum of lower-order effects. The multilinear model follows this tradition, and is built as a sum of effects involving one locus (marginal effects), two loci (pairwise interaction effects), three loci, etc. For instance, second-order epistasis is the difference between the double mutant and twice the single mutant effect (**Figure 6**). In contrast, in the power model, there are as many interaction effects as there are mutations, which leads to very complex epistasis. For most realistic values of β (0 < β < 2), the second- and third-order interactions have opposite effects: if combining two mutations has antagonistic effects, combining three of them will have synergistic effects (the triple mutant is closer to additivity than predicted by the sum of second-order interactions). Moreover, the magnitude of high-order epistatic effects can represent a substantial fraction of lower-order effects (**Figure 6**), suggesting that combined mutant phenotypes are heavily impacted by the emergent properties of specific combinations of allelic substitutions, and are thus difficult to predict from experimental results.
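The alternation of signs can be illustrated with a minimal sketch. It assumes that "calculated iteratively" means that each order of interaction is the deviation of the mutant phenotype from the prediction built with all lower orders; for mutations of equal effect this is the *k*th forward difference of the phenotype function at zero mutations. The function names are illustrative, and the phenotype is expressed as a deviation α·n^β from the reference.

```python
from math import comb

# Sketch: interaction effects of increasing order implied by a power model.
# ASSUMPTION: the k-th order interaction is taken as the k-th forward
# difference of the phenotype function at n = 0 (equal-effect mutations).

def power_phenotype(n, alpha, beta):
    """Deviation from the reference phenotype for an n-fold mutant."""
    return alpha * n ** beta


def interaction_effect(order, alpha, beta):
    """Interaction effect of a given order (order 1 is the single-mutant effect)."""
    return sum(
        (-1) ** (order - j) * comb(order, j) * power_phenotype(j, alpha, beta)
        for j in range(order + 1)
    )


def predict(n, max_order, alpha, beta):
    """Prediction for an n-fold mutant using interactions up to max_order."""
    return sum(
        comb(n, k) * interaction_effect(k, alpha, beta)
        for k in range(1, max_order + 1)
    )


if __name__ == "__main__":
    alpha, beta = 0.1, 0.8  # hypothetical values giving negative epistasis
    for order in (1, 2, 3, 4):
        print(f"order {order}: {interaction_effect(order, alpha, beta):+.4f}")
    print("true 4-fold mutant :", round(power_phenotype(4, alpha, beta), 4))
    print("additive prediction:", round(predict(4, 1, alpha, beta), 4))
    print("2nd-order prediction:", round(predict(4, 2, alpha, beta), 4))
```

Truncating `predict` at a low order shows how much of the multiple-mutant phenotype is left to higher-order terms; only the full expansion (`max_order` equal to `n`) recovers the power-model value exactly.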

This issue is avoided with the quadratic model, which is limited to interactions between pairs of loci. However, this quadratic model implies that mutational effects can switch signs depending on the genetic background (sign epistasis). This property, which is sometimes perceived as undesirable when considering epistasis for fitness (Wilke and Adami, 2001), could explain the persistence of alternative models. Another side effect of most unidimensional models of epistasis for fitness is that mutations are assumed to be of constant size. Relaxing this assumption significantly alters the evolutionary properties of the system (Butcher, 1995; Otto and Feldman, 1997), casting doubts on the operational meaning of the β (or β′) parameters.

**model (here with negative epistasis,** $\alpha n^{\beta}$ **with** $\alpha = 0.1$ **and** $\beta = 0.8$**).** The second-order epistatic effect is negative (the power model is always below the additive prediction), but the third-order effect is positive (the power model is always above the quadratic model). The sign of the interactions thus alternates when β < 2, and their relative size does not decrease rapidly. As a result, the effect of combining several mutants cannot be properly inferred from simpler combinations; for instance, the prediction for four mutants is not much better for the second-order epistatic model than for the additive model, and can even be worse with more substitutions.

#### **4.2. FULL-GENOME EPISTASIS**

For most of the 20th century, the concept of genotype-to-phenotype map was mostly virtual, and mainly used for theoretical purposes. The possibility of accessing complete individual genomes for a reasonable price was not really anticipated by quantitative geneticists, and we are now in the uncomfortable situation of not being able to properly translate the massive amount of data collected experimentally into ground-breaking theoretical insights. Indeed, it is widely acknowledged that the revolutionary improvement in the quality and quantity of genotypic information has not generated a proportional improvement in our ability to describe the genetic architecture of quantitative traits from genome-wide association studies. This "missing heritability" problem might be partly due to our inability to properly detect epistatic interactions (Maher, 2008; Zuk et al., 2012; Hemani et al., 2013).

Identifying interacting pairs of loci from a genotype-phenotype dataset schematically follows one of two strategies: (i) combine epistatic and marginal effects while mapping loci, in the hope of increasing the genetic signal (Carlborg and Haley, 2004), or (ii) first map loci based on their marginal effects, and estimate epistasis *a posteriori* between pairs of significant loci. Although theoretically elegant, the first strategy generally collapses with high-quality sequencing data because there are so many pairwise combinations to be tested that statistical noise overcomes the genetic signal by orders of magnitude. So far, the second strategy is thus unavoidable for estimating epistasis from high-throughput sequencing data. On the one hand, some epistatic loci will not be detected (in particular, those involved in sign epistasis, which may have no marginal effect). On the other hand, we know from Equation (2) that the impact of loci on the composite epistatic coefficient is weighted by their (marginal) genetic variance, meaning that the loci with no additive effects will not affect directional epistasis. Consequently, estimating epistatic noise in general remains a complex task, and may require further statistical development. When it comes to directional epistasis, focusing on major loci is much less problematic and ensures a proper estimation of this biologically meaningful parameter.

#### **4.3. CONSISTENCY ACROSS ESTIMATES**

This paper illustrates the estimation of epistasis directionality by several methods, using independent data describing the same biological system. The various estimates are reported in **Table 2**. The units and the meaning of the epistatic coefficients differ according to the method. In order to facilitate the comparison, an epistatic factor $f_{100}$ is provided. This factor corresponds to the coefficient by which genetic effects change when body weight increases by (arbitrarily) 100 g.


**Table 2 | Summary of the directional epistasis estimates from different sources of data and different methods.**

*Estimates can be compared with the* $f_{100}$ *factor.*

Directional epistasis estimates are consistently positive, and in most cases statistically significant. This provides strong confirmation that the genetic architecture of the weight differences between the high and low chicken lines is characterized by positive epistasis. However, the epistatic coefficients vary by several orders of magnitude in the different experiments; two categories of estimates can be defined: epistasis is strong when measured from the genotype data (increasing the phenotype by 100 g multiplies the allelic effects by 2 to almost 7), but weaker when measured from phenotype data (increasing the phenotype by 100 g increases allelic effects by 0.7 to 19%).

These measures are not necessarily contradictory, because epistasis can be restricted to a specific subset of the genetic architecture. As the epistatic coefficient measures the "average" curvature of the genotype-phenotype map, it is strongly affected by the nature of the data (and more specifically, the span of the data in terms of number of loci and phenotype range), as seems to be the case for chicken body weight (**Figure 7**). The extreme epistatic factors measured from the QTL data can be attributed to several factors. The four large-effect QTLs are not a random sample of loci, their effect is statistically inflated by detection bias (the Beavis effect: Beavis, 1994; Xu, 2003), and their strong epistatic interactions remain atypical (Carlborg et al., 2006). Their interaction pattern involves sign epistasis (Le Rouzic et al., 2007), so that additive effects vanish in some genetic backgrounds: increasing a small effect by a large factor does not necessarily mean that the absolute interaction effect is huge. In any case, even if positive epistasis is very strong for the 4 major loci, these QTLs only explain 7% of the total phenotypic variance, and the F2 population covers only 50% of the phenotype range of the parental lines. If directional epistasis is not a property of the whole genetic architecture, but merely reflects specific interactions between a few loci, data involving more loci and more genetic backgrounds would be expected to reveal less directional epistasis, which seems to be the case here with a striking regularity among the three independent data sources (**Figure 7**).

#### **5. CONCLUDING REMARKS**

Unidimensional directional epistasis measures how the properties of genetic architectures change with the phenotype. It has often been confused with scaling. Scale transformation is a common operation in biology, often motivated by the need to make the data suitable for a particular statistical analysis (e.g., enforcing normality). Changing the scale of the phenotype measurement affects directional epistasis (Pavlicev et al., 2010), and it is possible to find an arbitrary scale transformation on which directional epistasis becomes negligible (or is even canceled out) in a data set. Applying such *ad hoc* mathematical operations to phenotypes prior to analysis could hardly be considered good practice. First, it has been repeatedly pointed out to biologists that, according to measurement theory, scales do actually have a meaning, and are thus not interchangeable (Wagner et al., 1998; Houle et al., 2011). One of the best examples is fitness, which is essentially multiplicative (Wagner, 2010). Epistasis on fitness thus has to be measured as the deviation from log-linearity, which justifies the models of directional epistasis presented above. Obviously, directional epistasis following the power model cancels out on a log scale, but such a double log transformation would be meaningless, and should not be seriously considered. A second reason why scale change does not solve the problem of directional epistasis is that one should not necessarily expect consistent directionality. As exemplified by the chicken example, and illustrated in **Figure 2**, directionality is a local measure of the interlocus curvature of the genotype-phenotype map. It is thus likely that directionality could itself evolve as the phenotype changes (in the presence of third-order epistasis and higher-order interactions, directionality could even change when the phenotype remains constant). Therefore, comparing the properties of genetic architectures across populations or species requires measuring directional epistasis on a common scale.

Recent conceptual and theoretical advances have convincingly demonstrated that what matters in epistasis is not its direct contribution to genetic variation (interaction variance), but rather its propensity to (indirectly) influence the evolution of additive genetic variance. This propensity can be estimated by looking for specific patterns among epistatic interactions. The directionality of epistasis may be the most obvious, but other patterns are also emerging as candidate contributors to the evolvability of genetic architectures, such as the monotonicity of the genotype-phenotype relationship (closely linked to sign epistasis) (Gjuvsland et al., 2011, 2013), and the robustness or canalization of genetic architectures (Hermisson and Wagner, 2004; Draghi et al., 2010; Fraser and Schadt, 2010; Le Rouzic et al., 2013).

In quantitative genetics and breeding, correctly describing epistasis can improve the prediction of selection responses. In evolutionary genetics, epistasis determines the structure of genetic diversity and variability. At the phylogenetic scale, directional epistasis could contribute to biased anagenesis patterns and affect evolutionary trajectories. Most molecular mechanisms do not simply add up, and the genotype-phenotype relationship has to be curved to some extent. Is the observed curvature (quantified with one or several of the methods described here) consistent with predictions from system-biology models? To what extent is it constrained by the physical properties of the phenotypic trait? Does it vary depending on the trait, on the species? Does it evolve rapidly? The importance of determining directional epistasis for a wide diversity of traits in many organisms has probably been underestimated in the past, but now appears to be a key toward obtaining a better understanding of the general properties of genetic architectures.

#### **ACKNOWLEDGMENTS**

I am grateful to Thomas F. Hansen, Estelle Rünneburger, and two reviewers for their careful reading and constructive comments on the manuscript. I acknowledge Paul Siegel, Örjan Carlborg, and Leif Andersson for allowing liberal use of their phenotypic and genetic data on the chicken experiment. Sincere gratitude is expressed to colleagues for advice and discussion, especially José M. Álvarez-Castro. The English text was reviewed by Monika Ghosh.

#### **REFERENCES**


Wagner, G. P. (2010). The measurement theory of fitness. *Evolution* 64, 1358–1376.


**Conflict of Interest Statement:** The Guest Associate Editor, Dr. José M. Alvarez-Castro, declares that, despite having collaborated on a publication with the authors in the last 2 years, the review process was handled objectively. The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 April 2014; paper pending published: 04 May 2014; accepted: 13 June 2014; published online: 14 July 2014.*

*Citation: Le Rouzic A (2014) Estimating directional epistasis. Front. Genet. 5:198. doi: 10.3389/fgene.2014.00198*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Le Rouzic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### **APPENDIX I: MULTILINEAR EPISTASIS ON A CONTINUOUS GENOTYPE-PHENOTYPE MAP**

#### **TWO LOCI**

The multilinear model of Hansen and Wagner (2001b) is defined based on a reference genotype, and proposes a change-of-reference operation to recompute the genetic effects in a different genotype, assuming a multilinear genotype-phenotype map. In an arbitrary genotype-phenotype relationship, the multilinear model can be considered to be a local approximation of the multilocus curvature, and epistatic coefficients can be calculated from Taylor polynomial coefficients.

Let $g(y_1, y_2)$ be a continuous and (at least twice) differentiable two-dimensional genotype-phenotype function associating a phenotype value $P$ to any genotype combination $(y_1, y_2)$ at two loci. The gradient vector at a particular genotype $\Gamma = (\Gamma_1, \Gamma_2)$ is $\mathbf{D}$, with $D_i = \partial g(y_1, y_2)/\partial y_i \,|_{\Gamma_1,\Gamma_2}$, and the Hessian matrix is $\mathbf{D}^2$, with $D^2_{i,j} = \partial^2 g(y_1, y_2)/\partial y_i \partial y_j \,|_{\Gamma_1,\Gamma_2}$. The second-order Taylor series around this genotype is:

$$
\begin{aligned}
P(y_1, y_2) \simeq{} & g(\Gamma_1, \Gamma_2) + D_1 (y_1 - \Gamma_1) + D_2 (y_2 - \Gamma_2) \\
& + \frac{1}{2} D^2_{1,1} (y_1 - \Gamma_1)^2 + \frac{1}{2} D^2_{2,2} (y_2 - \Gamma_2)^2 \\
& + D^2_{1,2} (y_1 - \Gamma_1)(y_2 - \Gamma_2).
\end{aligned}
\tag{A1}
$$

Rescaling as $y'_1 = D_1(y_1 - \Gamma_1)$ and $y'_2 = D_2(y_2 - \Gamma_2)$ and neglecting the quadratic terms leads to a multilinear approximation taking the genotype $\Gamma$ as a reference point:

$$P(y'_1, y'_2) \simeq g(\Gamma_1, \Gamma_2) + y'_1 + y'_2 + y'_1 y'_2 \frac{D^2_{1,2}}{D_1 D_2}, \tag{A2}$$

where it appears clearly that the directionality coefficient of Hansen and Wagner (2001b) is $\varepsilon_{ij} = D^2_{i,j}/(D_i D_j)$. The quadratic terms $\tfrac{1}{2} D^2_{1,1} y'^{2}_1$ and $\tfrac{1}{2} D^2_{2,2} y'^{2}_2$ disappear from the equation as a consequence of the multilinear approximation.

#### **SEVERAL LOCI**

The previous approximation can be extended to several loci in a straightforward way:

$$\varepsilon_{ij} = \left.\frac{\partial^2 g}{\partial y_i \partial y_j}\right|_{\Gamma} \Bigg/ \left( \left.\frac{\partial g}{\partial y_i}\right|_{\Gamma} \left.\frac{\partial g}{\partial y_j}\right|_{\Gamma} \right). \tag{A3}$$

Developing the third-order Taylor series and neglecting all quadratic terms, the third-order epistatic coefficients can be written as follows:

$$\varepsilon_{ijk} = \left.\frac{\partial^3 g}{\partial y_i \partial y_j \partial y_k}\right|_{\Gamma} \Bigg/ \left( \left.\frac{\partial g}{\partial y_i}\right|_{\Gamma} \left.\frac{\partial g}{\partial y_j}\right|_{\Gamma} \left.\frac{\partial g}{\partial y_k}\right|_{\Gamma} \right). \tag{A4}$$

The multilinear approximation can thus be easily extended to any number of loci and any order of epistasis, with the *n*th-order epistasis coefficient being the *n*th mixed partial derivative of the genotype-phenotype function scaled by the product of the first-order derivatives of this function for all loci involved in the interaction.
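As an illustration, the pairwise coefficient (the mixed second derivative divided by the product of the two first derivatives) can be approximated numerically for any smooth genotype-phenotype function. The sketch below uses central finite differences, which is a choice made here for simplicity rather than part of the derivation above; the function name, the step size `h` and the example map are all hypothetical.

```python
# Sketch: local pairwise directionality coefficient at a reference genotype,
# approximated with central finite differences (an illustrative choice).

def pairwise_epistasis(g, ref, i, j, h=1e-3):
    """Approximate (d2g/dyi dyj) / (dg/dyi * dg/dyj) at genotype `ref`.

    g   : function taking one numeric argument per locus
    ref : sequence with the reference genotypic value at each locus
    i, j: indices of two distinct loci
    """
    if i == j:
        raise ValueError("i and j must index two different loci")

    def shifted(deltas):
        y = list(ref)
        for locus, delta in deltas:
            y[locus] += delta
        return g(*y)

    d_i = (shifted([(i, h)]) - shifted([(i, -h)])) / (2 * h)
    d_j = (shifted([(j, h)]) - shifted([(j, -h)])) / (2 * h)
    d_ij = (
        shifted([(i, h), (j, h)]) - shifted([(i, h), (j, -h)])
        - shifted([(i, -h), (j, h)]) + shifted([(i, -h), (j, -h)])
    ) / (4 * h * h)
    return d_ij / (d_i * d_j)


def example_map(y1, y2, eps=0.3):
    """Hypothetical bilinear map: y1 + y2 + eps * y1 * y2."""
    return y1 + y2 + eps * y1 * y2


if __name__ == "__main__":
    print(pairwise_epistasis(example_map, (0.0, 0.0), 0, 1))
    print(pairwise_epistasis(example_map, (1.0, 0.0), 0, 1))
```

For the bilinear example the coefficient at the origin equals the `eps` used to build the map, while at another reference genotype it is rescaled by the local first derivatives, which is the change-of-reference behavior described above.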

### **APPENDIX II: EFFECT OF DIRECTIONAL EPISTASIS ON ARTIFICIAL SELECTION RESPONSE**

The impact of directional epistasis on the response to directional selection is rather complex to predict precisely for arbitrary time periods (Carter et al., 2005). Nevertheless, useful approximations can still be derived by making realistic assumptions about the properties of genetic architectures. For instance, Le Rouzic et al. (2011) proposed a model that can be simplified as:

$$\mu_{t+1} = \mu_t + V_{A_t} \beta_t \tag{A5a}$$

$$V_{A_{t+1}} = V_{A_t} + 2\beta_t \varepsilon V_{A_t}^2 \tag{A5b}$$

Equation (A5a) is the traditional breeder's equation, formulated as in Lande and Arnold (1983), where $V_A$ is the additive genetic variance, and β the selection gradient, i.e., the slope of the regression between phenotype and relative fitness. Equation (A5b) approximates the impact of directional epistasis on additive variance, summarized by the directionality coefficient ε.

This model requires three parameters: the initial phenotype $\mu_0$, the initial additive variance $V_{A_0}$, and the epistatic parameter ε. Fitting the model by maximizing its likelihood for phenotypic time series that include means and variances provides convincing estimates of epistasis, especially when the data include bidirectional artificial selection (Le Rouzic et al., 2011).

Unfortunately, variance time series are not always available from historical data, because either they were measured but not reported in the corresponding publications, or simply because they were not computed, as only the mean phenotype was the center of interest. Moreover, fitting such a complex multidimensional non-linear model can be tricky, and requires significant computer programming input (and possibly having to solve numerical convergence issues). Proposing simpler formulas could therefore be helpful, as they may allow any biologist with basic statistical knowledge to report the strength of directional epistasis based on average phenotype data.

The following calculation is based on several approximations, the main ones being that selection is expected to be constant ($\beta_t = \beta$), and that linkage disequilibrium can be ignored. If directional epistasis is the only phenomenon affecting the selection response, the additive genetic variance is expected to change as in Equation (A5b). Approximating the discrete process by a continuous function leads to the ordinary differential equation $\mathrm{d}V_A/\mathrm{d}t = 2\beta\varepsilon V_A^2$, which can be solved as:

$$V_{A_t} = \frac{V_{A_0}}{1 - 2\beta V_{A_0} \varepsilon t}. \tag{A6}$$

Assuming that directional epistasis is not very strong ($\varepsilon\beta V_{A_0} \ll 1$), the expected phenotype at time *t* results from the product of the (supposedly constant) selection gradient β and the cumulative change in $V_A$, which can be calculated as:

$$\mu_t = \mu_0 + \beta \int_0^t V_{A_\tau} \, \mathrm{d}\tau = \mu_0 - \frac{\log\left(1 - 2\beta V_{A_0} \varepsilon t\right)}{2\varepsilon}. \tag{A7}$$
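A minimal numerical sketch can be used to compare the discrete recursion (A5a, A5b) with the continuous approximations (A6, A7) under a constant selection gradient. The parameter values are hypothetical and the function names are illustrative; this is a consistency check of the formulas, not the likelihood-fitting procedure of Le Rouzic et al. (2011).

```python
import math

# Sketch: discrete recursion (A5a, A5b) versus continuous approximations
# (A6, A7), assuming a constant selection gradient beta.

def iterate_response(mu0, va0, beta, eps, generations):
    """Return the list of (mean, additive variance) pairs for t = 0..generations."""
    mu, va = mu0, va0
    trajectory = [(mu, va)]
    for _ in range(generations):
        # Both updates use the variance of the current generation.
        mu, va = mu + va * beta, va + 2.0 * beta * eps * va ** 2
        trajectory.append((mu, va))
    return trajectory


def variance_closed_form(t, va0, beta, eps):
    """Equation (A6); only valid while 2 * beta * va0 * eps * t < 1."""
    return va0 / (1.0 - 2.0 * beta * va0 * eps * t)


def mean_closed_form(t, mu0, va0, beta, eps):
    """Equation (A7), with the additive limit handled separately for eps = 0."""
    if eps == 0:
        return mu0 + beta * va0 * t
    return mu0 - math.log(1.0 - 2.0 * beta * va0 * eps * t) / (2.0 * eps)


if __name__ == "__main__":
    mu0, va0, beta, eps = 0.0, 1.0, 0.1, 0.05  # hypothetical values
    trajectory = iterate_response(mu0, va0, beta, eps, 20)
    for t in (0, 5, 10, 20):
        mu_t, va_t = trajectory[t]
        print(
            f"t={t:2d}  mean {mu_t:.3f} vs {mean_closed_form(t, mu0, va0, beta, eps):.3f}"
            f"  variance {va_t:.3f} vs {variance_closed_form(t, va0, beta, eps):.3f}"
        )
```

When the product of ε, β and the initial variance is small, the two descriptions stay close over tens of generations; the closed forms break down as the denominator of (A6) approaches zero.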

## Dissecting genetic effects with imprinting

### *José M. Álvarez-Castro<sup>1,2</sup>\**

*<sup>1</sup> Department of Genetics, University of Santiago de Compostela, Lugo, Spain*

*<sup>2</sup> Quantitative Organism Biology, Instituto Gulbenkian de Ciência, Oeiras, Portugal*

#### *Edited by:*

*Rong-Cai Yang, University of Alberta, Canada*

#### *Reviewed by:*

*Bin He, The University of Chicago, USA*

*Rebekah L. Rogers, University of California, Irvine, USA*

#### *\*Correspondence:*

*José M. Álvarez-Castro, Department of Genetics, Veterinary Faculty, University of Santiago de Compostela, Avda Carvalho Calero, s/n, ES-27002 Lugo, Galiza, Spain e-mail: jose.alvarez.castro@usc.es*

Models of genetic effects are mathematical representations of a genotype-to-phenotype (GP) map that, rather than accounting for a raw map assigning phenotypes to genotypes, rely on parameters with deliberate evolutionary meaning—additive and interaction effects. In this article, the conceptual particularities of genetic imprinting and their implications on models of genetic effects are analyzed. The molecular mechanisms by which imprinted loci affect the relationship between genotypes and phenotypes are known to be singular. Despite its epigenetic nature, the (parent-of-origin-dependent) way in which the alleles of imprinted genes are modified and segregate in each generation is precisely determined, and thus amenable to be represented through conventional models of genetic effects. The Natural and Orthogonal Interactions (NOIA) model framework is here extended to account for imprinting as a tool for a more thorough analysis of the evolutionary implications of this phenomenon. The resulting theory improves and generalizes previous proposals for modeling imprinting.

**Keywords: imprinting, individual-referenced models of genetic effects, population-referenced models of genetic effects, NOIA, genetic variance decomposition**

### **INTRODUCTION**

Classical models of genetic effects were established almost one century ago for assembling biometric observations with Mendelian genetics (Fisher, 1918; Provine, 1971). This way, mechanistic explanations were provided for interesting properties of quantitative traits that had been revealed in the nineteenth century, particularly the regression toward mediocrity (Galton, 1886). A key concept in this theory is the split of effects of allele substitutions into additive and non-additive components, since the population variance of the additive components was shown to determine the resemblance between relatives within that population (see e.g. Falconer and Mackay, 1996).

That rule remains of great practical importance today. By assessing the resemblance between relatives for a trait within one generation of a population (which requires tracking relatedness and phenotype scores) it is possible to estimate the additive variance of that trait in that population. That estimate may in turn be used to predict the resemblance between parents and their offspring and hence the response to selection in the forthcoming generation. Thus, although the underlying theory relies on genetic effects, no direct information about the genes underlying a trait in a population is necessary in practice for estimating parameters with convenient predictive power.

With time, molecular, statistical and computational tools have enabled mapping experiments to be performed even in non-model species (see e.g. Rifkin, 2012). The need to update models of genetic effects for making the most of this new source of information was soon pointed out (Cheverud and Routman, 1995), leading to the development of models of genetic effects depicting the GP map as effects of allele substitutions from individual genotypes (Hansen and Wagner, 2001). This is the context in which the Natural and Orthogonal Interactions (NOIA) model of genetic effects was developed (Álvarez-Castro and Carlborg, 2007; Álvarez-Castro and Yang, 2011).

NOIA is a generalization of models of genetic effects that unifies the individual-based formulations mentioned above with the classical approaches, which depict the GP map in terms of effects of allele substitutions averaged over populations. As an example, this approach has enabled analyses of the role of epistatic interactions during the artificial selection process leading to the domestication of chicken (Álvarez-Castro et al., 2008). The classical population-referenced models are convenient for obtaining genetic effects of growth rate from the data generated in quantitative trait loci (QTL) experiments. Those effects then have to be transformed into individual-based genetic effects for analyzing how allele substitutions could have occurred in genes underlying growth rate, taking the genotype of the wild ancestors of current domestic chicken as the reference. In general, being able to transform between the individual- and the population-referenced approaches opens new opportunities for analyses of gene effects and interactions, as reviewed by Álvarez-Castro (2012).

QTL analyses eventually focussed also on the quest for imprinted genes and the estimation of imprinting effects (Knott et al., 1998). The traditional scheme of either maternal or paternal allele-effect silencing is known not to be universal, the callipyge phenotype in sheep being a remarkable counterexample (Cockett et al., 1996). Indeed, several alternative patterns of imprinting have been described more recently (e.g. Wolf et al., 2008; Xiao et al., 2013). In general, a gene is imprinted for a trait when heterozygotes with different parent-of-origin of their alleles are associated with different phenotypes. Hence, imprinting always involves some kind of dominance (since at least one of the two cases will depart from the mid-homozygote expectation).

New models of genetic effects, involving also epistasis, have recently been proposed to detect and analyze imprinted genes (Wolf and Cheverud, 2009). Here, the discussion on how to model genetic effects in the presence of imprinting is resumed with emphasis on the conceptualization (and thus the biological meaning) of all genetic effects involved. Two different options of extending NOIA to imprinting are developed and pondered in order to stress that the meaning of the genetic effects with imprinting must be considered with particular caution.

### **INDIVIDUAL- AND POPULATION-REFERENCED GENETIC EFFECTS**

First, let us recall the most basic expressions and facts of NOIA (from Álvarez-Castro and Carlborg, 2007; Álvarez-Castro et al., 2012). The effects of allele substitutions can be expressed in terms of additive (*a*) and dominance (*d*) effects in matrix notation as **G** = **SE**, which, for one non-imprinted locus with two alleles ($A_1$, $A_2$) and using the homozygote for the first allele as reference, expands to:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 0 \end{pmatrix} \begin{pmatrix} R \\ a \\ d \end{pmatrix} \tag{1}
$$

In this expression, **E** is the vector of genetic effects (including also the reference point *R*), **G** is the vector of genotypic values (accounting for the expected phenotype for each of the genotypes), and **S** is the genetic-effect design matrix, which determines how the genetic effects are defined as a reparameterization of the genotypic values. This point is easier to visualize through the equivalent expression $\mathbf{E} = \mathbf{S}^{-1}\mathbf{G}$:

$$
\begin{pmatrix} R \\ a \\ d \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{2} & 1 & -\frac{1}{2} \end{pmatrix} \begin{pmatrix} G_{11} \\ G_{12} \\ G_{22} \end{pmatrix} \tag{2}
$$

Since $a = (G_{22} - G_{11})/2$ is half the distance between the genotypic values of the two homozygotes, adding two additive effects to the genotypic value of the reference genotype $A_1A_1$ ($G_{11}$) brings us to the genotypic value of the other homozygote ($G_{22}$). Thus, adding a single additive effect brings us to the midpoint between the two homozygotes, from which further adding the dominance effect brings us to the genotypic value of the heterozygote ($G_{12}$). Indeed, the dominance effect $d = G_{12} - (G_{11} + G_{22})/2$ measures the deviation of the heterozygote from its additive expectation.
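The reparameterization between genotypic values and genetic effects can be sketched in a few lines, using the design matrix of expression (1) and its inverse from expression (2). Exact rational arithmetic is used so that the halves are not rounded; the genotypic values in the example are hypothetical, and the function names are illustrative.

```python
from fractions import Fraction as F

# Sketch: individual-referenced effects (R, a, d) for one biallelic locus,
# with A1A1 as the reference genotype. Row order: G11, G12, G22.
S = [[1, 0, 0],
     [1, 1, 1],
     [1, 2, 0]]

S_INV = [[F(1), F(0), F(0)],
         [F(-1, 2), F(0), F(1, 2)],
         [F(-1, 2), F(1), F(-1, 2)]]


def matvec(matrix, vector):
    """Product of a matrix (list of rows) and a vector, in exact arithmetic."""
    return [sum(F(m) * F(v) for m, v in zip(row, vector)) for row in matrix]


def effects_from_genotypic_values(g11, g12, g22):
    """Apply E = S^-1 G and return (R, a, d)."""
    return tuple(matvec(S_INV, [g11, g12, g22]))


def genotypic_values_from_effects(r, a, d):
    """Apply G = S E and return (G11, G12, G22)."""
    return tuple(matvec(S, [r, a, d]))


if __name__ == "__main__":
    r, a, d = effects_from_genotypic_values(10, 14, 16)  # hypothetical values
    print("R =", r, " a =", a, " d =", d)
    print("G =", genotypic_values_from_effects(r, a, d))
```

In this hypothetical example the heterozygote lies above the midpoint of the two homozygotes, so the dominance effect is positive.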

More general expressions, enabling the use of any genotype as reference point, have been developed. In any case, the split of effects of allele substitutions from the reference of an individual genotype into additive and interaction components has direct evolutionary meaning. Indeed, assuming that the genotypic values reflect fitness, a quick comparison of the additive and dominance effects provides the equilibrium properties of the system (either one stable or one unstable polymorphic equilibrium, or fixation of a particular allele, which may occur asymptotically with complete dominance). For the simple case of one locus with two alleles, this information can also be retrieved visually from the representation of the raw genotypic values—the genetic effects become more useful for systems of increasing complexity.

On the other hand, the classical additive and interaction population-referenced genetic effects are useful for analyzing properties of particular populations, with given genotype frequencies ($p_{ij}$, with $p_i = p_{ii} + \tfrac{1}{2}p_{12}$ being the allele frequencies and μ the phenotype mean). They are average effects of allele substitutions over populations and they can be obtained by a regression of the genotypic values on the allele content. The general expression for two alleles can be written as:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & -2p_2 & -\frac{p_{12}p_{22}}{2p_1p_2 - \frac{1}{2}p_{12}} \\ 1 & p_1 - p_2 & \frac{p_{11}p_{22}}{p_1p_2 - \frac{1}{4}p_{12}} \\ 1 & 2p_1 & -\frac{p_{11}p_{12}}{2p_1p_2 - \frac{1}{2}p_{12}} \end{pmatrix} \begin{pmatrix} \mu \\ \alpha \\ \delta \end{pmatrix} \tag{3}
$$

The parameters of this model are summarized in **Table 1**. The link between expression (3) and the previous ones follows directly from the fact that the genotypic values remain the same. From any two expressions of this kind, **G** = **S**<sub>1</sub>**E**<sub>1</sub> and **G** = **S**<sub>2</sub>**E**<sub>2</sub>, the genetic effects can be transformed into each other directly as:

$$\mathbf{E}_2 = \mathbf{S}_2^{-1} \mathbf{S}_1 \mathbf{E}_1 \tag{4}$$
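As an illustration of expression (4), the sketch below transforms individual-referenced effects (*R*, *a*, *d*) into population-referenced effects (μ, α, δ). It is a minimal example under stated assumptions: the helper names (`solve`, `population_design`, `transform`) are hypothetical, the design matrix is a hand transcription of expression (3), and the numerical inputs are chosen only for illustration. The inverse in (4) is applied by solving a linear system with exact fractions rather than by inverting the matrix explicitly.

```python
from fractions import Fraction as F

def solve(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [list(map(F, row)) + [F(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def population_design(p11, p12, p22):
    """Design matrix of expression (3); requires a polymorphic population."""
    p1, p2 = p11 + p12 / 2, p22 + p12 / 2
    v = 4 * p1 * p2 - p12  # twice the denominator 2*p1*p2 - p12/2
    return [
        [1, -2 * p2, -2 * p12 * p22 / v],
        [1, p1 - p2, 4 * p11 * p22 / v],
        [1, 2 * p1, -2 * p11 * p12 / v],
    ]

def transform(s1, e1, s2):
    """Expression (4): E2 = S2^-1 S1 E1, computed as a solve, not an inverse."""
    g = [sum(F(c) * F(e) for c, e in zip(row, e1)) for row in s1]
    return solve(s2, g)

# Individual-referenced design from the reference of A1A1 (inverse of expression (2)).
S_INDIVIDUAL = [[1, 0, 0], [1, 1, 1], [1, 2, 0]]

p1 = F(5, 8)
hw = (p1 * p1, 2 * p1 * (1 - p1), (1 - p1) ** 2)   # Hardy-Weinberg proportions
mu, alpha, delta = transform(S_INDIVIDUAL, [2, -1, 4], population_design(*hw))
print(mu, alpha, delta)
```

With purely additive input effects (*d* = 0), the same function can be called with arbitrary polymorphic genotype frequencies to inspect whether α = *a* and δ = 0 are returned.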

#### **INTERACTIONS MAKE A DIFFERENCE**

Using expression (4), it is easy to derive that a GP map in which *d* = 0 fulfills δ = 0 and α = *a*, regardless of the genotypic frequencies. However, the presence of interactions makes the relationship between individual- and population-referenced genetic effects far from trivial and, indeed, far more interesting (Álvarez-Castro and Le Rouzic, 2014). This is illustrated by two simple examples in **Figure 1**. These graphs show the linear regression (solid line) of the genotypic values (discs) on the allele content (horizontal axis) for a particular population (with specific allele frequencies), as well as the decomposition of the genetic variance (curves) for any allele frequencies.

*G<sub>ij</sub> are the genotypic values (expected phenotype of each genotype), with G<sub>12</sub> for the only heterozygote without imprinting and for one of the two heterozygote options with imprinting (in which case G<sub>21</sub> stands for the other option). The genotype frequencies (whose subscripts follow the same logic) are p<sub>ij</sub> and, following the standard notation, the allele frequencies not included in the table are p<sub>i</sub> = p<sub>ii</sub> + 1/2 p<sub>12</sub>, i = 1, 2. The parameters p<sub>ij</sub> can also stand as indexes of individual genotypes in the individual-referenced formulation, when one of them equals one and the others equal zero. In the individual-referenced formulation, R stands for the reference point (which is an individual genotype), a for the additive genetic effect and d for the dominance genetic effect. With imprinting, there is an additional imprinting effect, i (in the imprinting-effect model), or two alternative dominance effects, d<sup>12</sup> and d<sup>21</sup> (in the two-dominance model; for a justification of the use of the superscripts see Álvarez-Castro and Yang, 2011). In the population-referenced formulation (last row), the corresponding parameters are taken from the Greek alphabet instead of the Latin one (e.g. μ is the population phenotype mean).*

The first example (**Figure 1A**) shows a case in which the individual-referenced additive genetic effect is nil (the genotypic values of the homozygotes are equal) whereas the dominance effect is not (the genotypic value of the heterozygote is different from them). The slope of the weighted regression of the genotypic values on the allele content provides the population-referenced additive genetic effect, α. In that figure, such regression is shown for a Hardy–Weinberg population with *p*<sub>1</sub> = 0.625, marked with a vertical dashed line. Since the slope of the regression is positive, so is α. The second example (**Figure 1B**) still shows a case of overdominance (the genotypic values of the homozygotes are lower than that of the heterozygote, i.e., *d* > |*a*|), although in this case the individual-referenced additive effect is not nil. However, the regression at *p*<sub>1</sub> = 0.625 has a slope of zero, indicating that this is a (polymorphic) equilibrium point.

**FIGURE 1 | Genotypic values (discs) and variance decomposition (curves) of one-locus, two-allele (***A***<sub>1</sub> and** *A***<sub>2</sub>), non-imprinted genetic systems with overdominance assuming Hardy–Weinberg proportions for all possible allele frequencies (represented by the frequency of** *A***<sub>2</sub>,** *p***<sub>2</sub>).** The variances (black solid curve for additive, gray dashed curve for dominance) are actually plotted as trait units squared. The size of the discs marking the genotypic values is scaled according to *p*<sub>1</sub> = 0.625 (approximately, *p*<sub>11</sub> = 0.39, *p*<sub>12</sub> = 0.47, *p*<sub>22</sub> = 0.14). **(A)** The genotypic values are *G*<sub>11</sub> = 0, *G*<sub>12</sub> = 5, *G*<sub>22</sub> = 0, leading to individual-referenced genetic effects (from the reference of *A*<sub>1</sub>*A*<sub>1</sub>) *a* = 0, *d* = 5. At *p*<sub>2</sub> = 0.375 (*p*<sub>1</sub> = 0.625, marked by the vertical dashed line), the regression of the genotypic values on the proportional allele content (solid line) is an increasing function with slope (and thus population-referenced additive effect) α = 2.5, indicating that *p*<sub>2</sub> would increase under directional selection (toward the equilibrium point, *p*<sub>1</sub> = *p*<sub>2</sub> = 0.5). **(B)** The genotypic values are the same as in **(A)** but for *G*<sub>11</sub> = 2, leading to individual-referenced genetic effects of *a* = −1, *d* = 4. At *p*<sub>2</sub> = 0.375 (*p*<sub>1</sub> = 0.625, marked by the vertical dashed line), the regression of the genotypic values on the proportional allele content (solid line) has slope α = 0, indicating a polymorphic equilibrium point.

In the context of a population, the decomposition of the genotypic values into additive and interaction effects has its parallel at the level of variances. Indeed, in the second example (**Figure 1B**), the additive variance is nil at *p*<sub>1</sub> = 0.625. Coming back to the first example (**Figure 1A**), the additive variance is not nil at *p*<sub>1</sub> = 0.625 (where the regression slope is not nil either) and, more generally, the additive variance, which determines the selection response, dominates the extremes of the graph (40% of the possible frequencies), indicating a very efficient selection response of those populations (toward the equilibrium point, with *p*<sub>1</sub> = 0.5, where the additive variance is nil).

Thus, these examples make evident that interaction makes it possible both to have nil individual-referenced with non-nil population-referenced additive effects and vice versa. Overall, the presence of interactions reveals that individual- and population-referenced genetic effects have different meanings. The latter reflect properties of populations (the additive effect and the additive variance are nil at equilibrium frequencies) whereas the former are effects of allele substitutions from individual references (the additive effect is nil when the homozygotes have equal genotypic values). Keeping this in mind aids interpretation of the subsequent developments and discussion.
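The regression argument used in the two examples can be reproduced numerically. The sketch below is only illustrative: the function names are hypothetical, Hardy–Weinberg proportions are assumed, and the slope is computed per copy of allele *A*<sub>2</sub> (allele counts 0, 1, 2). On the proportional allele-content scale (0, 1/2, 1) used on the horizontal axis of **Figure 1**, the same slope is doubled.

```python
from fractions import Fraction as F

def hardy_weinberg(p1):
    """Genotype frequencies (p11, p12, p22) under Hardy-Weinberg proportions."""
    p2 = 1 - p1
    return (p1 * p1, 2 * p1 * p2, p2 * p2)

def regression_slope(values, freqs, counts=(0, 1, 2)):
    """Weighted least-squares slope of genotypic values on the A2 allele content."""
    mean_g = sum(p * g for p, g in zip(freqs, values))
    mean_n = sum(p * n for p, n in zip(freqs, counts))
    cov = sum(p * (g - mean_g) * (n - mean_n)
              for p, g, n in zip(freqs, values, counts))
    var_n = sum(p * (n - mean_n) ** 2 for p, n in zip(freqs, counts))
    return cov / var_n

freqs = hardy_weinberg(F(5, 8))                 # p1 = 0.625
alpha_a = regression_slope((0, 5, 0), freqs)    # genotypic values of Figure 1A
alpha_b = regression_slope((2, 5, 0), freqs)    # genotypic values of Figure 1B
print(alpha_a, alpha_b)
```

Passing `counts=(0, F(1, 2), 1)` gives the slope on the proportional scale, and evaluating the first example at `hardy_weinberg(F(1, 2))` gives the slope at its equilibrium frequency.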

### **MODELING IMPRINTING: HOW MANY ADDITIVE AND DOMINANCE EFFECTS?**

When considering one imprinted locus with two alleles, we could be tempted to try to fit it into a one-locus four-allele genetic model, since each of the two alleles (with different nucleotide sequences) may be expressed at the level of the phenotype in two ways (each has two possible methylation stages), thus leading to a total of four variants with potentially different effects on the phenotype. One evident issue with this scheme arises when considering how segregation is assumed in a one-locus four-allele model, which does not at all consider transformations of the variants into one another through generations (as is the case for alleles in imprinted genes). Moreover, even if we dismissed any analyses involving segregation, we could not possibly use the multiallelic model for depicting the differences between phenotypes due to allelic variants, as explained below.

Let the two alleles be *A*<sub>1</sub> and *A*<sub>2</sub>, just as in the cases without imprinting above. Due to imprinting there now also exist the modified variants *Ā*<sub>1</sub> and *Ā*<sub>2</sub>, summing up to a total of four variants as mentioned just above. In a four-allele model of genetic effects, there are six additive effects, three of which can be retrieved from the other three (see e.g. Álvarez-Castro and Yang, 2011). These parameters account for effects of allele substitutions between any possible pair of homozygotes, which in our case would be *A*<sub>1</sub>*A*<sub>1</sub>, *A*<sub>2</sub>*A*<sub>2</sub>, *Ā*<sub>1</sub>*Ā*<sub>1</sub>, and *Ā*<sub>2</sub>*Ā*<sub>2</sub>. However, none of these genotypes will be present in any of the individuals of our analyses. More to the point, we cannot easily think of those genotypes as putative artificial constructs, since imprinted loci preclude viability under unbalanced dosages of modified alleles (Kono et al., 2004; Kawahara et al., 2007).

Indeed, the two "homozygotes" of our imprinted biallelic locus actually are *A*<sub>1</sub>*Ā*<sub>1</sub> and *A*<sub>2</sub>*Ā*<sub>2</sub>: they are allele-wise homozygotes, although not variant-wise homozygotes. Only substitutions involving the pairs *A*<sub>1</sub>-*A*<sub>2</sub> and *Ā*<sub>1</sub>-*Ā*<sub>2</sub> are allowed. Thus, only one additive effect of allele substitutions makes sense in this genetic system, involving substitutions of alleles *A*<sub>1</sub> and *A*<sub>2</sub> in each of their variants. In the context of the individual-referenced framework, that effect can be measured in a way analogous to the non-imprinted loci as *a* = (*G*<sub>22</sub> − *G*<sub>11</sub>)/2, just considering that with imprinting the "homozygotes" bear two differently modified allelic variants.

Thus, although properly conceptualizing the additive effects of an imprinted locus may require some reflection, they can in the end be modeled in a way that brings no additional complexity as compared to modeling the non-imprinted case. It is the modeling of the dominance effects that will make the difference. It has been discussed just above that from genotype *A*<sub>1</sub>*Ā*<sub>1</sub> there is only one way of performing two allele substitutions, which leads to genotype *A*<sub>2</sub>*Ā*<sub>2</sub>. There are however two possible ways of performing only one allele substitution from that genotype, leading to either *A*<sub>1</sub>*Ā*<sub>2</sub> or *A*<sub>2</sub>*Ā*<sub>1</sub>. Consequently, considering two possible dominance effects (one for each parent-of-origin of the two alleles in the heterozygote) emerges as a sensible solution.

To begin with the development of this two-dominance setting, an expression of the genotypic values as a sum of genetic effects of allele substitutions from one reference genotype is firstly provided, as was done in expression (1) above for a non-imprinted locus. This way (following the same logic as in Álvarez-Castro and Carlborg, 2007; Álvarez-Castro and Yang, 2011), the expression of NOIA from the reference of homozygote *A*<sub>1</sub>*Ā*<sub>1</sub> can be obtained as:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{21} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 2 & 0 & 0 \end{pmatrix} \begin{pmatrix} R \\ a \\ d^{12} \\ d^{21} \end{pmatrix} \tag{5}
$$

All parameters are summarized in **Table 1**. The genotypic value of *A*<sub>2</sub>*Ā*<sub>2</sub> is here expressed as the sum of two additive effects from the reference whilst the genotypic values of the heterozygotes involve one additive plus one dominance effect each. The difference between (5) and (1) is that in (5) each heterozygote involves a different dominance effect. By solving (5) for the vector of genetic effects we obtain an extension of expression (2) to imprinting, which shows how each of the genetic effects is defined in terms of the genotypic values:

$$
\begin{pmatrix} R \\ a \\ d^{12} \\ d^{21} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\tfrac{1}{2} & 0 & 0 & \tfrac{1}{2} \\ -\tfrac{1}{2} & 1 & 0 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & 0 & 1 & -\tfrac{1}{2} \end{pmatrix} \begin{pmatrix} G_{11} \\ G_{12} \\ G_{21} \\ G_{22} \end{pmatrix} \tag{6}
$$

Thus, for instance, the second dominance effect is defined as *d*<sup>21</sup> = *G*<sub>21</sub> − (*G*<sub>11</sub> + *G*<sub>22</sub>)/2. Expression (6) also entails the general individual-referenced formulation of NOIA for one biallelic imprinted locus, by just replacing the first row of the matrix by (*p*<sub>11</sub>, *p*<sub>12</sub>, *p*<sub>21</sub>, *p*<sub>22</sub>), so that any genotype may be chosen as reference (e.g. *A*<sub>2</sub>*Ā*<sub>2</sub> is the reference when *p*<sub>22</sub> = 1 and the remaining *p*<sub>*ij*</sub> = 0).

For describing the potential response of the imprinted genetic system to a one-generation step of selection, a population-referenced formulation [as expression (3) for a non-imprinted locus] is required. Following the same approach as by Álvarez-Castro and Carlborg (2007, Appendix C; see Supplementary Material), such expression can be obtained as:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{21} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & -2p_{2} & -\frac{2p_{12}p_{22}}{(p_{11}+p_{22})(p_{12}+p_{21})} & -\frac{2p_{21}p_{22}}{(p_{11}+p_{22})(p_{12}+p_{21})} \\ 1 & p_{1} - p_{2} & \frac{(4p_{11}+p_{21})(p_{11}+p_{22})-4p_{11}^{2}}{(p_{11}+p_{22})(p_{12}+p_{21})} & -\frac{p_{21}}{p_{12}+p_{21}} \\ 1 & p_{1} - p_{2} & -\frac{p_{12}}{p_{12}+p_{21}} & \frac{(4p_{11}+p_{12})(p_{11}+p_{22})-4p_{11}^{2}}{(p_{11}+p_{22})(p_{12}+p_{21})} \\ 1 & 2p_{1} & -\frac{2p_{11}p_{12}}{(p_{11}+p_{22})(p_{12}+p_{21})} & -\frac{2p_{11}p_{21}}{(p_{11}+p_{22})(p_{12}+p_{21})} \end{pmatrix} \begin{pmatrix} \mu \\ \alpha \\ \delta^{12} \\ \delta^{21} \end{pmatrix} \tag{7}
$$

Using the procedure for inspecting orthogonality of models of genetic effects, also conveyed by Álvarez-Castro and Carlborg (2007, Appendix C; see the Supplementary material), it follows that expression (7) entails an orthogonal decomposition of the genotypic values into additive and dominance components, thus leading to an orthogonal decomposition of the genetic variance. The two dominance effects are however not orthogonal to each other. Overall, it is possible to model a biallelic imprinted locus using one additive and two dominance genetic effects, which makes it straightforward to keep track of the biological meaning of the parameters, in analogy with the non-imprinted case.
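The orthogonality properties just described can be inspected numerically. The following sketch is an assumption-laden illustration rather than the procedure of the cited appendix: the design matrix is a hand transcription of expression (7) and should be checked against the original, the function names are hypothetical, and the genotype frequencies are arbitrary values that depart from Hardy–Weinberg proportions. Two columns are taken as orthogonal when their frequency-weighted inner product is zero.

```python
from fractions import Fraction as F

def two_dominance_design(p11, p12, p21, p22):
    """Design matrix transcribed from expression (7); rows are G11, G12, G21, G22."""
    s, h = p11 + p22, p12 + p21          # homozygote and heterozygote totals
    p1 = p11 + h / 2
    p2 = p22 + h / 2
    own_12 = ((4 * p11 + p21) * s - 4 * p11 ** 2) / (s * h)
    own_21 = ((4 * p11 + p12) * s - 4 * p11 ** 2) / (s * h)
    return [
        [1, -2 * p2, -2 * p12 * p22 / (s * h), -2 * p21 * p22 / (s * h)],
        [1, p1 - p2, own_12, -p21 / h],
        [1, p1 - p2, -p12 / h, own_21],
        [1, 2 * p1, -2 * p11 * p12 / (s * h), -2 * p11 * p21 / (s * h)],
    ]

def weighted_inner(design, freqs, j, k):
    """Frequency-weighted inner product of columns j and k of the design matrix."""
    return sum(p * row[j] * row[k] for p, row in zip(freqs, design))

freqs = (F(3, 10), F(1, 5), F(1, 10), F(2, 5))   # hypothetical, not Hardy-Weinberg
design = two_dominance_design(*freqs)
for k in (1, 2, 3):
    print("weighted mean of column", k, "=", weighted_inner(design, freqs, 0, k))
print("additive x dominance-12:", weighted_inner(design, freqs, 1, 2))
print("additive x dominance-21:", weighted_inner(design, freqs, 1, 3))
print("dominance-12 x dominance-21:", weighted_inner(design, freqs, 2, 3))
```

Under this transcription the additive column is orthogonal to both dominance columns, whereas the inner product between the two dominance columns does not vanish.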

#### **IMPRINTING AS A GENETIC EFFECT**

The previous setting can be used for detecting imprinting by just developing a procedure for testing whether the two dominance effects are significantly different. To this aim, however, it seems more convenient to design a model in which a parameter accounts for the difference between the two heterozygotes, thus leading to a more direct test for imprinting, which consists of just checking whether that parameter is significantly different from zero. Actually, this is in general terms the approach commonly chosen to model imprinting (see e.g. Wolf et al., 2008). Hereafter, NOIA is extended following that approach and thus implemented with a parameter to account for the putative difference between the heterozygotes with different parent-of-origin. As in the previous section, an expression of effects of allele substitutions from the reference of homozygote *A*<sub>1</sub>*Ā*<sub>1</sub> is here provided in the first place, as:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{21} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 0 & 0 \end{pmatrix} \begin{pmatrix} R \\ a \\ d \\ i \end{pmatrix} \tag{8}
$$

This model is designed to use the midpoint between the two heterozygotes to define the dominance effect and the deviations of the two heterozygotes from that point as the imprinting effect. A graphical comparison explaining how the three models shown in this article (the non-imprinted model, the two-dominances model and the imprinting-effect model) decompose the genotypic values is shown in **Figure 2**. Solving (8) for the vector of genetic effects gives:

$$
\begin{pmatrix} R \\ a \\ d \\ i \end{pmatrix} = \begin{pmatrix} 1 & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ -1/2 & \mathbf{0} & \mathbf{0} & 1/2 \\ -1/2 & 1/2 & 1/2 & -1/2 \\ \mathbf{0} & -1/2 & 1/2 & \mathbf{0} \end{pmatrix} \begin{pmatrix} G\_{11} \\ G\_{12} \\ G\_{21} \\ G\_{22} \end{pmatrix} \tag{9}
$$

From this expression it immediately follows that indeed *d* = (*G*<sub>12</sub> + *G*<sub>21</sub>)/2 − (*G*<sub>11</sub> + *G*<sub>22</sub>)/2 (i.e., the dominance effect measures the distance between the midpoint of the two heterozygotes and the additive expectation) and *i* = (*G*<sub>21</sub> − *G*<sub>12</sub>)/2 (i.e., the imprinting effect measures the distance of the heterozygotes from the midpoint between them). Expression (9) provides a general individual-referenced formulation, analogously to (6) for the two-dominances model in the previous section. Also in an analogous way as in that section, an orthogonal population-referenced formulation of the imprinting-effect model can be obtained as:

$$
\begin{pmatrix} G_{11} \\ G_{12} \\ G_{21} \\ G_{22} \end{pmatrix} = \begin{pmatrix} 1 & -2p_2 & -\frac{p_{22}(p_{12} + p_{21})}{2p_1p_2 - \frac{1}{2}(p_{12} + p_{21})} & 0 \\ 1 & p_1 - p_2 & \frac{p_{11}p_{22}}{p_1p_2 - \frac{1}{4}(p_{12} + p_{21})} & -\frac{2p_{21}}{p_{12} + p_{21}} \\ 1 & p_1 - p_2 & \frac{p_{11}p_{22}}{p_1p_2 - \frac{1}{4}(p_{12} + p_{21})} & \frac{2p_{12}}{p_{12} + p_{21}} \\ 1 & 2p_1 & -\frac{p_{11}(p_{12} + p_{21})}{2p_1p_2 - \frac{1}{2}(p_{12} + p_{21})} & 0 \end{pmatrix} \begin{pmatrix} \mu \\ \alpha \\ \delta \\ \iota \end{pmatrix} \tag{10}
$$

In this case, the three genetic (additive, dominance and imprinting) effects are fully orthogonal. The independence of the parameters makes this expression resemble expression (3). Indeed, the decomposition of the genotypic values of the homozygotes into additive and dominance effects in (3) holds in (10), since *p*<sub>12</sub> in (3) is equivalent to (*p*<sub>12</sub> + *p*<sub>21</sub>) in (10). Concerning the heterozygotes, in the imprinted case we have two instead of one, leading to an extra row in the genetic-effects design matrix in (10), and there is an extra (imprinting) term in the decomposition, coming from the fourth column of that matrix. That term actually makes the only difference between the decomposition of the genetic effects of the heterozygotes and the decomposition of the heterozygote in the non-imprinted case (3).

**FIGURE 2 | Individual-referenced genetic effects proposed in the text for a one-locus, two-allele, imprinted genetic system.** As in the previous figure, the alleles are *A*<sub>1</sub> and *A*<sub>2</sub>, with the variants due to imprinting being *Ā*<sub>1</sub> and *Ā*<sub>2</sub>. Although no population frequencies are considered in this figure, the notation of the axes is kept consistent with the other figures, with *p*<sub>2</sub> = 0.5 indicating the heterozygotes. Also the genotypic values (black discs) are mostly kept, with the one of the heterozygote of **Figure 1** (*A*<sub>1</sub>*A*<sub>2</sub>, *G*<sub>12</sub> = 5) being the midpoint between the ones of the two heterozygotes in this figure (*A*<sub>1</sub>*Ā*<sub>2</sub> and *A*<sub>2</sub>*Ā*<sub>1</sub>, *G*<sub>12</sub> = 4 and *G*<sub>21</sub> = 6, respectively). **(A)** The two-dominances model is a natural extension of the non-imprinted case that consists in introducing two dominance parameters, one for each heterozygote. As in the non-imprinted case, the dominance effects measure departures of the heterozygotes from their additive expectation (gray disc). **(B)** The imprinting-effect model keeps only one dominance effect, which accounts for the departure of the midpoint between the two heterozygotes (upper gray disc) from the midpoint between the homozygotes (lower gray disc), and adds an imprinting effect accounting for the distance between the heterozygotes and their midpoint (upper gray disc). Thus, the dominance effect in this model coincides with that of a non-imprinted model (i.e., the one that would be obtained if imprinting was just disregarded).
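A numerical sketch of the imprinting-effect model may help to see how the four population-referenced effects are recovered and why the model is fully orthogonal. Everything in it is illustrative: the helper names are hypothetical, the design matrix is a hand transcription of expression (10), and both the genotypic values and the genotype frequencies are made-up numbers that do not follow Hardy–Weinberg proportions.

```python
from fractions import Fraction as F

def solve(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [list(map(F, row)) + [F(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def imprinting_design(p11, p12, p21, p22):
    """Design matrix transcribed from expression (10); rows are G11, G12, G21, G22."""
    h = p12 + p21                         # total heterozygote frequency
    p1, p2 = p11 + h / 2, p22 + h / 2
    v = 4 * p1 * p2 - h                   # twice the denominator 2*p1*p2 - h/2
    dom_het = 4 * p11 * p22 / v
    return [
        [1, -2 * p2, -2 * h * p22 / v, 0],
        [1, p1 - p2, dom_het, -2 * p21 / h],
        [1, p1 - p2, dom_het, 2 * p12 / h],
        [1, 2 * p1, -2 * h * p11 / v, 0],
    ]

def weighted_inner(design, freqs, j, k):
    """Frequency-weighted inner product of columns j and k of the design matrix."""
    return sum(p * row[j] * row[k] for p, row in zip(freqs, design))

freqs = (F(3, 10), F(1, 5), F(1, 10), F(2, 5))   # hypothetical genotype frequencies
values = (10, 15, 13, 12)                        # hypothetical G11, G12, G21, G22
design = imprinting_design(*freqs)
mu, alpha, delta, iota = solve(design, values)
print(mu, alpha, delta, iota)
```

In this example the recovered imprinting effect equals the individual-referenced one, *i* = (*G*<sub>21</sub> − *G*<sub>12</sub>)/2, and every pair of distinct columns of the design matrix has a zero weighted inner product.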

#### **VARIANCE DECOMPOSITION WITH IMPRINTING**

The previous expressions and arguments can be extended to the decomposition of the genetic variance with an imprinting variance component, which can easily be obtained from the model in matrix notation above (10) by following the formulae provided by Álvarez-Castro and Yang (2011). In expressions (12) and (13) of that article, the additive and the dominance variance have been obtained as *V*<sub>*A*</sub> = **P**<sub>*G*</sub><sup>T</sup>(α<sub>*G*</sub> ∘ α<sub>*G*</sub>) and *V*<sub>*D*</sub> = **P**<sub>*G*</sub><sup>T</sup>(δ<sub>*G*</sub> ∘ δ<sub>*G*</sub>), respectively. In an analogous way (by means of analogous intermediate definitions; see Supplementary Material), a general expression for the imprinting variance can be provided simply as:

$$V_O = \mathbf{P}_G^{\mathrm{T}}(\boldsymbol{\iota}_G \circ \boldsymbol{\iota}_G) \tag{11}$$

Since *V*<sub>*I*</sub> traditionally stands for the epistatic variance, the subscript *O* is here chosen for the imprinting variance, ultimately coming from a differential effect of the alleles depending on their parent-of-origin. In any case, it is also possible to obtain the decomposition of the genetic variance by getting all three variance components at the same time, by just following expressions (14) and (15) in Álvarez-Castro and Yang (2011). Indeed, the imprinting variance component emerges from those formulae as a new term when they are fed with expression (10).

By obtaining the variance decomposition in any of the ways described above (each individually or all simultaneously), it is easy to check that the additive and the dominance variances actually remain the same as for a non-imprinted biallelic locus. Assuming for simplicity the Hardy–Weinberg proportions, they are *V*<sub>*A*</sub> = 2*p*<sub>1</sub>*p*<sub>2</sub>[*a* + *d*(*p*<sub>1</sub> − *p*<sub>2</sub>)]<sup>2</sup> and *V*<sub>*D*</sub> = (2*dp*<sub>1</sub>*p*<sub>2</sub>)<sup>2</sup> (see e.g. Falconer and Mackay, 1996), whilst the imprinting variance component can be expressed simply as:

$$V\_O = 2i^2 p\_1 p\_2 \tag{12}$$
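As a quick consistency check of these closed-form components, the sketch below computes *V*<sub>*A*</sub>, *V*<sub>*D*</sub> and *V*<sub>*O*</sub> under Hardy–Weinberg proportions and compares their sum with the total genetic variance obtained directly from the genotypic values. It is only a sketch: the function names are hypothetical, and it is assumed that under Hardy–Weinberg proportions the two heterozygotes are equally frequent (*p*<sub>12</sub> = *p*<sub>21</sub> = *p*<sub>1</sub>*p*<sub>2</sub>). The example uses the genotypic values listed for Figure 3A.

```python
from fractions import Fraction as F

def variance_components_hw(g11, g12, g21, g22, p1):
    """Return (V_A, V_D, V_O) under Hardy-Weinberg proportions."""
    p2 = 1 - p1
    a = F(g22 - g11) / 2                       # additive effect
    d = F(g12 + g21) / 2 - F(g11 + g22) / 2    # dominance effect
    i = F(g21 - g12) / 2                       # imprinting effect
    v_a = 2 * p1 * p2 * (a + d * (p1 - p2)) ** 2
    v_d = (2 * d * p1 * p2) ** 2
    v_o = 2 * i ** 2 * p1 * p2                 # expression (12)
    return v_a, v_d, v_o

def total_variance_hw(values, p1):
    """Genetic variance computed directly from (G11, G12, G21, G22)."""
    p2 = 1 - p1
    freqs = (p1 * p1, p1 * p2, p1 * p2, p2 * p2)   # assumed: p12 = p21 = p1*p2
    mean = sum(p * g for p, g in zip(freqs, values))
    return sum(p * (g - mean) ** 2 for p, g in zip(freqs, values))

values = (2, 4, 6, 0)                  # G11, G12, G21, G22 as listed for Figure 3A
p1 = F(5, 8)
v_a, v_d, v_o = variance_components_hw(*values, p1)
print(v_a, v_d, v_o, total_variance_hw(values, p1))
```

Because the imprinting effect enters expression (12) squared, the check does not depend on which of the two heterozygotes is labeled 12 and which 21.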

**Figure 3** shows the decomposition of the genetic variance for two cases of imprinting. The genotypic values in **Figure 3A** are the same as in **Figure 2**, and thus they also fit the non-imprinted case in **Figure 1B**, in which the genotypic value of the heterozygote (*A*<sub>1</sub>*A*<sub>2</sub>, *G*<sub>12</sub> = 5) is the midpoint between the genotypic values of the two heterozygotes in **Figure 3A** (*A*<sub>1</sub>*Ā*<sub>2</sub> and *A*<sub>2</sub>*Ā*<sub>1</sub>, *G*<sub>12</sub> = 4 and *G*<sub>21</sub> = 6, respectively). Therefore, the additive effects coincide in both cases and the dominance effect of the imprinting-effect model in **Figure 3A** coincides with that of the simpler non-imprinted model in **Figure 1B**. Hence, the additive and the dominance variances coincide in both graphs. In **Figure 3A** there is, though, an extra (imprinting) term of the genetic variance decomposition.

As is the case for dominance, the imprinting variance is higher for intermediate frequencies. In **Figure 3A**, the relatively small imprinting effect (relatively short distance between the two heterozygotes) leads to a small imprinting variance for all allele frequencies. In **Figure 3B**, however, it is shown that with larger differences between the two heterozygotes the imprinting variance may dominate the variance decomposition at almost any allele frequencies. And this actually occurs in practice, since this case fits the callipyge pattern mentioned above (with equal or similar phenotype values of the two homozygotes and one of the heterozygotes, relative to a higher value of the remaining heterozygote). Imprinting is thus, like other allele interactions (Álvarez-Castro and Le Rouzic, 2014), a phenomenon that may by itself lead to small responses to selection in the face of high genetic variances.

**FIGURE 3 | Genotypic values (discs) and variance decomposition (curves) of one-locus, two-allele, imprinted genetic systems assuming Hardy–Weinberg proportions.** The notation is in accordance with the previous figures (with the addition of a black dashed curve for the imprinting variance). **(A)** The genotypic values are *G*<sub>11</sub> = 2, *G*<sub>12</sub> = 4, *G*<sub>21</sub> = 6, *G*<sub>22</sub> = 0, leading to individual-referenced genetic effects (from the reference of *A*<sub>1</sub>*Ā*<sub>1</sub>) *a* = −1, *d* = 4, *i* = −1. Since the additive and the dominance effects are the same as in **Figure 1B**, the additive and the dominance variances coincide and the equilibrium point also remains at *p*<sub>2</sub> = 0.375. **(B)** The genotypic values are the same as in **(A)** but for *G*<sub>21</sub> = 1 (at the midpoint between the two homozygotes), leading to individual-referenced genetic effects of *a* = −1, *d* = 2.5, *i* = −2.5. The equilibrium point occurs here at *p*<sub>2</sub> = 0.3.

Incidentally, this particular claim could not be supported using the two-dominances model alone. Indeed, that model does not provide a separate term accounting for the variance explained by the difference between the two heterozygotes. Instead, it leads to a dominance variance that is different from the one of this imprinting-effect model (and thus also from the one of the non-imprinted case), and it actually equals the sum of the classical dominance variance *VD* and the imprinting variance *VO* as expressed above (11, 12).

### **COMPARISONS TO PREVIOUS MODELS**

Xiao et al. (2013) have recently proposed a model of imprinting based on the (non-imprinted) NOIA model. They take the option of implementing an explicit imprinting parameter, which in their mathematical construction is closely related to the additive effect, rather than to the dominance effect as in the imprinting-effect model developed above (8–10). Since this article acknowledges that modeling imprinting requires some improvisation as compared to other aspects of genetic architecture, several different solutions are possible; no criticism of that choice in itself is intended here.

The developments by Xiao et al. (2013) are indeed inspired by the NOIA model and they provide both statistical (i.e., population-referenced) and functional (which are not population-referenced) formulations. However, their models are difficult to regard as pure extensions of the NOIA model. A very simple counterexample can be shown through their expression (12), from which it follows that they define the functional additive effect as *r*<sub>1</sub> = *G*<sub>22</sub> − *G*<sub>11</sub>, whereas in the NOIA model it is defined as *a* = (*G*<sub>22</sub> − *G*<sub>11</sub>)/2. This can be easily derived e.g. from (2) for the non-imprinted case, and also from (6) and (9) for the extensions to imprinting provided in this article.

Xiao et al. (2013) carried out simulations to prove that their statistical models are more appropriate (due to orthogonality) for detecting allelic effects than their functional developments. This effort seems to be rather futile since the functional formulations are in general not developed with that motivation in mind, but mainly for representing the GP map as effects of allele substitutions from individual references (Hansen and Wagner, 2001; Álvarez-Castro and Carlborg, 2007; Álvarez-Castro, 2012; and also summarized above). In any case, the statistical models of imprinting by Xiao et al. (2013) are admittedly not fully orthogonal in general, unlike the imprinting-effect model provided above (10), but only under certain conditions, e.g. (but not only) under the Hardy–Weinberg proportions.

Wolf and Cheverud (2009, Appendix 2) had also provided a model with an explicit imprinting parameter that is orthogonal under the Hardy–Weinberg proportions. As well as Xiao et al. (2013), they make the point that, also with imprinting, extensions to multiple loci with epistasis come naturally using the Kronecker product of genetic-effect design matrices (following Tiwari and Elston, 1997), which incidentally applies directly also to the models of imprinting provided in this article. However, Wolf and Cheverud (2009) do not provide explicit expressions for performing variance decompositions.
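The Kronecker-product extension mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `kronecker` is hypothetical, the single-locus matrix is the individual-referenced design of expression (8), and the ordering of loci (and therefore of the two-locus genotypes and effects) is a convention chosen here, not one taken from the cited articles.

```python
def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

# Single-locus design matrix of expression (8): columns R, a, d, i;
# rows G11, G12, G21, G22.
S_LOCUS = [
    [1, 0, 0, 0],
    [1, 1, 1, -1],
    [1, 1, 1, 1],
    [1, 2, 0, 0],
]

# Two imprinted loci: 16 two-locus genotypes by 16 genetic effects
# (reference, marginal effects of each locus and pairwise epistatic effects).
S_TWO_LOCI = kronecker(S_LOCUS, S_LOCUS)
print(len(S_TWO_LOCI), len(S_TWO_LOCI[0]))
print(S_TWO_LOCI[-1])
```

In the last row (the genotype that carries two substitutions at both loci), the coefficient 4 multiplies the additive-by-additive term, i.e., the product of the coefficients of *a* at each locus.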

Neither do they discuss an explicit link of their statistical setting to a functional formulation, although their expressions (4) and (5) fit an extension of the physiological model (Cheverud and Routman, 1995, which is an alternative to statistical formulations with the unweighted population mean as reference point) rather than the *F*<sub>2</sub> model they initially follow in their developments. More to the point, in their previous work on imprinting (Wolf et al., 2008) they made an extension of the *F*<sub>∞</sub> model, another alternative to the classical statistical formulations.

There is also a previous work in which a two-dominance strategy has been chosen to model imprinting, by Santure and Spencer (2011). They have adapted several standard quantitative approaches to derive quantitative genetics parameters in the presence of imprinting, which is implemented as in this article, in the form of one dominance effect for each heterozygote. The different approaches considered in that article lead to different results, but none of them enables an orthogonal decomposition of the genetic variance into additive and dominance (due to the two dominance effects) components. For several of those approaches, expressions of the covariances due to lack of orthogonality could not be derived.

### **DISCUSSION**

Since models of genetic effects are mathematical expressions aimed to enable the estimation of parameters with particular biological interpretations, their development is often directed to a predefined target. The difficulties of these developments often consist in reaching the mathematical properties that are in accordance with the desired biological meanings. With imprinting, there appears an extra layer of issues to be solved, ultimately coming from the fact that many combinations of alleles or allele variants will never occur (not even artificially). For solving that issue, modeling that *A*2*A*¯ <sup>2</sup> can be reached by performing two equal allele substitutions from *A*1*A*¯ <sup>1</sup> entails a very sensible and practical solution (even acknowledging that this is not in reality the case).

Standing from this point, and facing the presence of two different heterozygotes (and their genotypic values), it appears natural to think of accounting for two different dominance effects, analogous to the one dominance effect in the non-imprinted case. This solution, here called the two-dominances model, is not only feasible but, as shown in **Figure 2A**, rather clean by construction. It indeed leads naturally to an orthogonal variance partition into additive and interaction components. However, with this setting it may not be completely straightforward to detach imprinting as an effect either to test or to analyze in terms of evolutionary properties.

Traditional models of imprinting have embraced the option of implementing an explicit imprinting effect, which is here called the imprinting-effect model. Dominance is modeled as a departure from an additive (non-dominance) expectation. For modeling imprinting in an analogous way, a non-imprinting reference has to be considered. Due to the particularities of imprinting, this reference has to be a construct. Indeed, as explained above, we cannot just remove imprinting effects from our alleles and expect that the resulting genotypes exist or could even be viable, and there seems to be no biological justification for choosing one of the heterozygotes as the non-imprinted reference against the other one. Hence, the midway between the two of them is in this article set as a non-imprinted fictitious reference. In **Figure 2B** it can be seen that this leads for instance to a definition of the dominance effect in terms of points (gray discs) that are not genotypic values (black discs). In any case, several advantages come from this choice.

The imprinting-effect model provided here leads to a fully orthogonal setting, which entails a clear advantage over previous models. This is optimal in the first place for testing the statistical significance of the imprinting parameter. Furthermore, this setting can be described as a pure extension of a non-imprinting case with the heterozygote at the midpoint between the two imprinted heterozygote options. The variance partition, in particular, remains equal to that of the non-imprinting case for all variance components except for the imprinting variance, which is of course absent in the non-imprinting case. This enables extremely convenient comparisons: the equilibrium points of the two cases will be the same, with a slower rate of phenotype change along generations for the imprinted case, which will be more noticeable for increasing proportions of the imprinting variance component in the genetic variance partition (since the proportion of the additive component of the phenotypic variance decreases accordingly).

Besides population-referenced orthogonal expressions, individual-based formulations are in this article provided. When using any expressions in this article, the choice of a formulation and a reference point must be based on the mathematical properties and/or biological meaning that fits the particular question to be addressed. Each choice leads to different numerical values of at least some of the parameters in an applied case and thus not paying enough attention to picking the correct expression may be misleading. An illustration of such requisite of awareness on the specific kind of genetic effects used in each case follows.

In their article on imprinting and epistasis, Wolf and Cheverud (2009) claim, based on a previous work (Cheverud, 2000), that "additive-by-dominance indicates that the additive effect of the first locus depends on (i.e., changes as a function of) the genotype present in the second locus, while the dominance effect of the second locus depends on the genotype present at the first locus." This is true when analyzing a genetic system with the physiological model (that is, for physiological additive-by-dominance genetic effects). Functional formulations are meant to express genetic effects from the reference of individual genotypes, i.e., as individual-based formulations. Mathematically, it is straightforward to use those expressions also from other reference points and, when doing so, it can be shown that they then coincide with statistical (population-referenced) formulations under certain conditions [Álvarez-Castro and Carlborg, 2007, expression (7)]. The *F*<sub>∞</sub>, the *F*<sub>2</sub> and the physiological models are all instances of this situation: they may thus fit both functional and statistical interpretations, and this is why the afore-cited sentence holds true within its particular context.

However, it is worth noting that the referred sentence is not true for additive-by-dominance genetic effects of any model or formulation, and in particular it cannot be applied if the genetic effects are orthogonal (in the context of a population under study) and conditions (7) of Álvarez-Castro and Carlborg (2007) do not hold. Indeed, in those instances it may well be that dominance-by-dominance interactions generate statistical additive-by-dominance interaction in genetic systems for which the latter equals zero under the physiological model. Such a phenomenon is analogous to the simpler instance shown in **Figure 1**, where the presence of dominance interaction is shown to generate additive variance in a genetic system where there is no difference between the homozygotes (i.e., nil functional additive effects). Interestingly, this hierarchical behavior works in a different way when it comes to imprinting. Indeed, the imprinting-effect model developed above is structured such that functional imprinting alone (with neither functional dominance nor functional additive effects) generates neither dominance nor additive variance, as can be seen from the fact that these variances do not depend on the imprinting effect.

Overall, it is crucial to mind the biological meaning of the models when choosing the particular expression to be used in each particular case. In relation to this, NOIA conveniently provides expressions that work as a change-of-reference tool so that the genetic effects required for a particular question can be obtained from any others. The scope of that tool applies to transformations between the two-dominances and the imprinting-effect models developed above, which differ in the presence/absence of an explicit genetic imprinting effect. The choices of formulations are therefore not mutually exclusive, but potentially informative about different aspects in the analysis of a particular situation under study, as long as the resulting values of the genetic effects (or variance decompositions) are interpreted in the light of the particular form of the genetic model used.

This article stands on recent advances in genetic modeling for carrying out new theoretical developments to the aid of the analysis of genetic imprinting. The models here developed improve previous proposals by providing both functional and statistical formulations that enable an orthogonal partition of the genotypic values and the genetic variance with a separate component for imprinting, which enables both better estimation of, and insight on, imprinted genes. Besides, imprinting may here be conceived also as an excuse or a challenge in order to elaborate on the logics behind the development of models of genetic effects—what are they intended for, which difficulties condition their stage of development, how to face them. Overall, one more step in the generalization of models of genetic effects is here provided, as well as keys about the way models of genetic effects may keep on being developed.

#### **ACKNOWLEDGMENT**

The author acknowledges the two reviewers for their suggestions, which improved this manuscript. The Autonomous Administration Xunta de Galicia provided funding for this research through project EM2014/024.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fevo.2014.00051/abstract

#### **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 May 2014; accepted: 05 August 2014; published online: 08 September 2014.*

*Citation: Álvarez-Castro JM (2014) Dissecting genetic effects with imprinting. Front. Ecol. Evol. 2:51. doi: 10.3389/fevo.2014.00051*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Ecology and Evolution.*

*Copyright © 2014 Álvarez-Castro. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### **Appendix**

### *Regression approach for developing orthogonal genetic effects*

This approach consists in computing the genetic effects from the regression of genotypic values on the allele content, as Fisher (1918) proposed [see, e.g., Falconer and Mackay (1996)]. Álvarez-Castro and Carlborg (2007) expressed the genotypic values as *G*(*N*)=E(*G*)+*N*, where *N* stands for the number of *A*<sub>2</sub> alleles. The intercept of the regression, E(*G*), is the expectation of the genotypic values and the regression coefficient is Cov(*G*, *N*)/Var(*N*). The additive effects come from the linear regression itself, whereas the interaction terms come from the departures, i.e., the distances between the regression and the original genotypic values (for further details, see Álvarez-Castro and Carlborg, 2007). In the two-dominances model, each heterozygote determines one dominance effect (as represented in Figure 2A), whereas in the imprinting-effect model, the dominance effect is defined as in the non-imprinting case by taking the midpoint between the two heterozygotes as the one heterozygote required for making that definition. The imprinting effect is defined afterwards from the departures of the real heterozygotes from that midpoint (as represented in Figure 2B).
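As a minimal sketch of this regression approach for the simpler non-imprinted one-locus case, the following Python snippet computes the regression coefficient Cov(*G*, *N*)/Var(*N*) and the departures from the regression. The genotype frequencies, genotypic values and variable names are made-up illustration values and are not taken from the article.

```python
# Illustrative (hypothetical) one-locus, two-allele population.
freqs = {"A1A1": 0.25, "A1A2": 0.50, "A2A2": 0.25}   # genotype frequencies
values = {"A1A1": 1.0, "A1A2": 3.0, "A2A2": 2.0}     # genotypic values G
n_a2 = {"A1A1": 0, "A1A2": 1, "A2A2": 2}             # N: number of A2 alleles


def expectation(x):
    """Frequency-weighted mean of a per-genotype quantity."""
    return sum(freqs[g] * x[g] for g in freqs)


mean_g = expectation(values)   # intercept of the centered regression, E(G)
mean_n = expectation(n_a2)

cov_gn = sum(freqs[g] * (values[g] - mean_g) * (n_a2[g] - mean_n) for g in freqs)
var_n = sum(freqs[g] * (n_a2[g] - mean_n) ** 2 for g in freqs)
alpha = cov_gn / var_n         # regression coefficient Cov(G, N) / Var(N)

# Additive parts are the regression predictions (centered at E(G));
# interaction (dominance) parts are the departures from the regression.
additive = {g: alpha * (n_a2[g] - mean_n) for g in freqs}
departure = {g: values[g] - mean_g - additive[g] for g in freqs}

print(alpha, additive, departure)
```

By construction the departures average to zero over the population, which is what allows them to be treated as a separate interaction component.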

A genetic-effect design matrix, **S**, is orthogonal for a set of genotype frequencies when **S**<sup>T</sup>**DS** is diagonal, with **D**=Diag[(*p<sub>ij</sub>*)], i.e., the diagonal matrix with the genotypic frequencies on its diagonal.
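The orthogonality criterion can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `weighted_gram` and `is_orthogonal` are hypothetical, and the design matrix uses the familiar *F*<sub>2</sub> coding for a non-imprinted locus rather than any of the imprinting matrices developed in the article.

```python
def weighted_gram(S, p):
    """Return S^T D S as nested lists, with D = Diag(p)."""
    n_rows, n_cols = len(S), len(S[0])
    return [[sum(p[g] * S[g][a] * S[g][b] for g in range(n_rows))
             for b in range(n_cols)]
            for a in range(n_cols)]


def is_orthogonal(S, p, tol=1e-12):
    """True when every off-diagonal entry of S^T D S vanishes."""
    M = weighted_gram(S, p)
    return all(abs(M[a][b]) <= tol
               for a in range(len(M))
               for b in range(len(M))
               if a != b)


# Rows: A1A1, A1A2, A2A2. Columns: mean, additive, dominance (F2 coding).
S = [[1.0, -1.0, -0.5],
     [1.0,  0.0,  0.5],
     [1.0,  1.0, -0.5]]

p_f2 = [0.25, 0.50, 0.25]      # F2 genotype frequencies
p_other = [0.50, 0.30, 0.20]   # arbitrary frequencies, for comparison

print(is_orthogonal(S, p_f2))     # orthogonal at the F2 frequencies
print(is_orthogonal(S, p_other))  # not orthogonal at other frequencies
```

The same matrix is thus orthogonal for one population and not for another, which is why orthogonality is always stated relative to a set of genotype frequencies.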

### *Variance components from the decomposition of the genotypic values*

Following Álvarez-Castro and Yang (2011), the decomposition of genotypic values into additive and interaction terms that is implicit in an expression of the type **G**=**SE** can be made explicit as **G**<sub>*dec*</sub>=**S**Diag[**E**]. From here, the decomposition of the genetic variance takes the form of a vector (with the variance components) by just computing **V** = (**G**<sub>*dec*</sub> ∘ **G**<sub>*dec*</sub>)<sup>T</sup>**P**<sub>*G*</sub>, with **P**<sub>*G*</sub><sup>T</sup> = (*p<sub>ij</sub>*) and ∘ denoting the element-wise product. For computing the imprinting variance separately in the context of a one-locus two-allele model, the imprinting vector can be defined as **ι**<sub>*G*</sub>=**S**Diag[(0,0,0,1)]**E**, to then apply expression (11).
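A small numerical sketch may help to see what the decomposition **G**<sub>*dec*</sub>=**S**Diag[**E**] contains. It assumes a non-imprinted locus with the *F*<sub>2</sub> coding at *F*<sub>2</sub> frequencies and hypothetical effect values, and it computes each variance component as the frequency-weighted sum of squares of one column of **G**<sub>*dec*</sub>. That reading is an assumption of this example, valid here because the contrast columns have zero mean and the design is orthogonal.

```python
# Rows: A1A1, A1A2, A2A2. Columns: mean, additive, dominance (F2 coding).
S = [[1.0, -1.0, -0.5],
     [1.0,  0.0,  0.5],
     [1.0,  1.0, -0.5]]
p = [0.25, 0.50, 0.25]    # genotype frequencies for which S is orthogonal
E = [2.25, 0.5, 1.5]      # hypothetical mean, additive and dominance effects

# G_dec = S Diag[E]: column k holds the contribution of effect k to each genotype.
G_dec = [[S[g][k] * E[k] for k in range(3)] for g in range(3)]

# Summing each row recovers the genotypic values, G = S E.
G = [sum(row) for row in G_dec]

# Variance component of effect k: weighted sum of squares of column k.
# The mean column (k = 0) is skipped because it carries no variance.
components = [sum(p[g] * G_dec[g][k] ** 2 for g in range(3)) for k in (1, 2)]

# Total genetic variance computed directly, for comparison.
mean_g = sum(p[g] * G[g] for g in range(3))
total = sum(p[g] * (G[g] - mean_g) ** 2 for g in range(3))

print(G, components, total)
```

With an orthogonal design the components add up to the total genetic variance, so no covariance terms between effects are left over.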

## Corrigendum for "Dissecting genetic effects with imprinting"

### *José M. Álvarez-Castro\**

*Department of Genetics, Universidade de Santiago de Compostela, Lugo, Spain \*Correspondence: jose.alvarez.castro@usc.es*

#### *Edited and reviewed by:*

*Rong-Cai Yang, University of Alberta, Canada*

**Keywords: imprinting, NOIA, individual-referenced models of genetic effects, population-referenced models of genetic effects, genetic variance decomposition**

#### **A corrigendum on**

**Dissecting genetic effects with imprinting** *by Álvarez-Castro, J. M. (2014). Front. Ecol. Evol. 2:51. doi: 10.3389/fevo.2014.00051*

### Corrigendum:

In the article of the Frontiers Research Topic Issue on Models and Estimation of Genetic Effects "Dissecting genetic effects with imprinting," by Álvarez-Castro (2014), the citation of the work in press by Álvarez-Castro and Le Rouzic is no longer correct since its publication has in the end been postponed to 2015 (Álvarez-Castro and Le Rouzic, 2015). The same holds for the citation of that reference in press in the article "Estimating directional epistasis," by Le Rouzic (2014), also within the Frontiers Research Topic Issue on Models and Estimation of Genetic Effects.

### **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 November 2014; accepted: 20 November 2014; published online: 08 December 2014.*

*Citation: Álvarez-Castro JM (2014) Corrigendum for "Dissecting genetic effects with imprinting." Front. Genet. 5:427. doi: 10.3389/fgene.2014.00427*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Álvarez-Castro. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Clarifying the relationship between average excesses and average effects of allele substitutions

### *José M. Álvarez-Castro<sup>1</sup>\* and Rong-Cai Yang<sup>2,3</sup>*

*<sup>1</sup> Department of Genetics, University of Santiago de Compostela, Lugo, Spain*

*<sup>2</sup> Research and Innovation Division, Alberta Agriculture and Rural Development, Edmonton, Alberta, AB, Canada*

*<sup>3</sup> Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta, AB, Canada*

#### *Edited by:*

*Jason Wolf, University of Bath, UK*

#### *Reviewed by:*

*Chen-Hung Kao, Academia Sinica, Taiwan*

*Alan Templeton, Washington University, US Minor Outlying Islands*

#### *\*Correspondence:*

*José M. Álvarez-Castro, Department of Genetics, Veterinary Faculty, University of Santiago de Compostela, Avda Carvalho Calero, s/n, ES-27002 Lugo, Galiza, Spain. e-mail: jose.alvarez.castro@usc.es*

Fisher's concepts of average effects and average excesses are at the core of the quantitative genetics theory. Their meaning and relationship have regularly been discussed and clarified. Here we develop a generalized set of one locus two-allele orthogonal contrasts for average excesses and average effects, based on the concept of the effective gene content of alleles. Our developments help understand the average excesses of alleles for the biallelic case. We dissect how average excesses relate to the average effects and to the decomposition of the genetic variance.

**Keywords: average effects, average excesses, effective gene content, models of genetic effects, non-equilibrium populations**

### **INTRODUCTION**

Since Fisher (1918), partitioning of the genotypic values at a locus into additive and dominance effects has been used for conventional quantitative genetic analyses and recently for mapping quantitative trait loci (QTL; see, e.g., Lynch and Walsh, 1998). Numerous statistical models have been proposed for such partitioning. Some of them are restricted to populations under Hardy–Weinberg equilibrium (HWE; see, e.g., Falconer and MacKay, 1996), including a special case of gene frequency being one half (Mather and Jinks, 1982). Others also adequately account for Hardy–Weinberg disequilibrium (HWD; e.g., Cockerham, 1954; Yang, 2004; Álvarez-Castro and Carlborg, 2007). Regardless of whether a population is in HWE or HWD, Fisher (1918) and others have shown that the additive and dominance genetic effects are simply the coefficient of a linear regression of the genotypic values on the gene content and the deviation from that regression, respectively. The regression coefficient is commonly known as the average effect of substituting one allele by the other in a diploid genotype (Falconer and MacKay, 1996).

As another measure of the additive effect, Fisher (1941) defined the average excess of an allele as the difference by which the average of genotypes carrying that allele exceeds the average of genotypes carrying the alternative allele. Fisher (1941) also pointed out that the average effect is equal to the average excess if the population is in HWE, but it is less than the average excess if inbreeding occurs. Such relationships between average effect and average excess have been subsequently confirmed and elaborated (e.g., Kempthorne, 1957; Falconer, 1985; Templeton, 1987; Lynch and Walsh, 1998).

In this note, we further clarify the relationship between the average effect and the average excess of a gene substitution based on a new set of general contrasts that entail both the average effects and the average excesses as particular cases. We provide a common conceptual and graphical interpretation for both parameters and further dissect how they are related to the decomposition of the genetic variance.

### **MODEL**

Additive and dominance contrasts are commonly used to build and interpret models of genetic effects (e.g., Cockerham, 1954; Li, 1976; Zeng et al., 2005). Such contrasts enter the regression model as:

$$G\_{ij} = \mu + \tilde{\alpha} w\_{ij} + \tilde{\delta} v\_{ij},\tag{1}$$

where *Gij* are the genotypic values, μ is the population mean, α̃ and δ̃ are the additive and dominance genetic effects, and *wij* and *vij* are, respectively, the coefficients for the additive and dominance contrasts.

In this context, the values 0 and 1 can naturally be used to indicate the presence of alleles A1 and A2 in the genotypes, leading to the genotype indicator variable *zij* taking the values *z*<sub>11</sub> = 0, *z*<sub>12</sub> = 1, and *z*<sub>22</sub> = 2 for A1A1, A1A2, and A2A2, respectively, and to the coefficients for the additive effects through *wij* = *zij* − *E*(*z*), where *E*(*z*) is the expectation of *z* (see, e.g., Zeng et al., 2005). This indicator variable thus has a clear biological meaning: the gene content of one of the alleles, A2. When using this indicator variable, the additive parameter is the average effect, i.e., α̃ = α, and the dominance parameter is the dominance genetic effect, δ̃ = δ.

On the other hand, the average excesses of alleles in a population under HWD were proffered to further entail the effects of alleles due to correlations with other alleles in that population (Fisher, 1941). Aiming to allow for such correlations in our derivations, we here consider more general indexes. In particular, we introduce a constant *c* as the ratio of the average effect over the average excess (cf. Eq. 3 of Fisher, 1941). Multiplying *zij* by this constant leads to a new genotype indicator variable with *z*<sub>11</sub> = 0, *z*<sub>12</sub> = *c*, and *z*<sub>22</sub> = 2*c*. This new genotype indicator variable will serve to indicate the effective content of allele A2 in the three genotypes, as will be further illustrated below.

The use of effective gene contents for obtaining orthogonal contrasts under HWD is summarized in **Table 1**. Obtaining the coefficients for the orthogonal additive contrast, *wij*, as *zij* − *E*(*z*), warrants that Σ*pijwij* = 0, where *pij*, *ij* = 11, 12, 22, are the genotypic frequencies of the population (see, e.g., Cockerham, 1954). The coefficients for the orthogonal dominance contrasts, *vij*, are obtained to fulfill Σ*pijvij* = 0 and Σ*pijwijvij* = 0 (Álvarez-Castro and Carlborg, 2007). These are the deviations of the observed genotypic values from the expected values as predicted from the regression of the genotypic values on the effective gene contents.

Additive and dominance contrasts (e.g., the ones built in **Table 1**) can be conveniently expressed in matrix notation. This allows for a straightforward extension of the one locus model to an arbitrary number of loci with arbitrary epistasis under linkage equilibrium (LE; Tiwari and Elston, 1997). It has also been shown that the matrix notation enables straightforward transformations between parameters that have previously been expressed using appropriate contrasts (Álvarez-Castro and Carlborg, 2007).

Let **G** thus be the vector of genotypic values, **E** be the vector entailing the population mean and the additive and dominance parameters and **S** be the genetic-effect design matrix entailing the contrasts that allow for a transformation between vectors **G** and **E**. Then, just using the contrasts in **Table 1** we obtain the matrix expression **G** = **S**·**E** as:

$$
\begin{pmatrix} G\_{11} \\ G\_{12} \\ G\_{22} \end{pmatrix} = \begin{pmatrix} 1 & -2p\_2c & -\frac{p\_{12}p\_{22}}{2p\_1p\_2 - \frac{1}{2}p\_{12}} \\ 1 & (p\_1 - p\_2)c & \frac{p\_{11}p\_{22}}{p\_1p\_2 - \frac{1}{4}p\_{12}} \\ 1 & 2p\_1c & -\frac{p\_{11}p\_{12}}{2p\_1p\_2 - \frac{1}{2}p\_{12}} \end{pmatrix} \cdot \begin{pmatrix} \mu \\ \tilde{\alpha} \\ \tilde{\delta} \end{pmatrix}, \tag{2}
$$

where *pi*, *i* = 1, 2, are the frequencies of the alleles, *pi* = *pii* + 1/2*pij*, *j* ≠ *i*.
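To make expression (2) concrete, the following Python sketch builds the design matrix **S** for given genotype frequencies and a given *c*, and then recovers the vector **E** from the genotypic values. The function names `noia_design` and `solve_effects`, as well as the frequencies and genotypic values, are assumptions made for this illustration. Recovering **E** by frequency-weighted projection, rather than by matrix inversion, relies on the columns of **S** being orthogonal under the genotype frequencies.

```python
def noia_design(p11, p12, p22, c=1.0):
    """Design matrix S of expression (2); rows are A1A1, A1A2, A2A2."""
    p1 = p11 + p12 / 2
    p2 = p22 + p12 / 2
    d = 4 * p1 * p2 - p12   # common denominator of the dominance column
    return [
        [1.0, -2 * p2 * c,    -2 * p12 * p22 / d],
        [1.0, (p1 - p2) * c,   4 * p11 * p22 / d],
        [1.0,  2 * p1 * c,    -2 * p11 * p12 / d],
    ]


def solve_effects(S, p, G):
    """Recover E = (mu, alpha~, delta~) by frequency-weighted projection.

    Valid because the columns of S are mutually orthogonal under Diag(p).
    """
    E = []
    for k in range(3):
        num = sum(p[g] * S[g][k] * G[g] for g in range(3))
        den = sum(p[g] * S[g][k] ** 2 for g in range(3))
        E.append(num / den)
    return E


p = [0.50, 0.30, 0.20]   # hypothetical genotype frequencies (HWD)
G = [1.0, 3.0, 2.0]      # hypothetical genotypic values

S = noia_design(*p, c=1.0)
E = solve_effects(S, p, G)

# Check that G = S . E reproduces the genotypic values.
G_back = [sum(S[g][k] * E[k] for k in range(3)) for g in range(3)]
print(E, G_back)
```

Changing *c* rescales the additive column of **S** and, inversely, the additive parameter, while the mean and the dominance parameter are unaffected.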

### **A UNIFIED FRAMEWORK FOR AVERAGE EFFECTS AND AVERAGE EXCESSES**

**Table 1 | Coefficients of orthogonal contrasts for the average effects and the average excesses for two alleles at a locus.**

*The non-zero constant c is introduced for accounting for effective gene contents.*

As mentioned above, the contrasts in **Table 1** provide the average effects of allele substitutions when *c* = 1. It is thus not surprising that in this case Eq. 2 reduces to Eq. 8 of Álvarez-Castro and Carlborg (2007) for the average (additive and dominance) effects. For analyzing how (2) relates to the average excesses, we first recall their definition for one biallelic gene (following Fisher, 1941; Kempthorne, 1957):

$$\begin{cases} \alpha\_1^\* = \frac{p\_{11}}{p\_1} G\_{11} + \frac{1}{2} \frac{p\_{12}}{p\_1} G\_{12} - \mu\\ \alpha\_2^\* = \frac{1}{2} \frac{p\_{12}}{p\_2} G\_{12} + \frac{p\_{22}}{p\_2} G\_{22} - \mu \end{cases} \tag{3}$$

By inverting expression (2), it is easy to see that α̃ = α\*<sub>2</sub> − α\*<sub>1</sub> when *c* = 1/(1 + *F*), with *F* = 1 − *p*<sub>12</sub>/(2*p*<sub>1</sub>*p*<sub>2</sub>) being Wright's (1965) fixation index. *F*, with the range −1 ≤ *F* ≤ 1, reflects any departure from HWE, toward either an excess or a deficiency of heterozygotes. We can thus rename α̃ = α\*, δ̃ = δ\* when *c* = 1/(1 + *F*). That is to say, Eq. 3 restores the definition of average excesses of the alleles for a biallelic locus. We will consequently refer to (2) with *c* = 1/(1 + *F*) as the average-excess formulation of NOIA.
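The relationship between the average excesses of Eq. 3 and the regression on the effective gene content can be checked numerically. The sketch below is illustrative only: the function names and the genotype frequencies and values are assumptions of this example. It uses the fact that regressing on a variable scaled by *c* divides the slope by *c*.

```python
def average_effect(p, G):
    """Regression coefficient of G on the A2 gene content (the case c = 1)."""
    n = [0, 1, 2]
    mu = sum(pg * g for pg, g in zip(p, G))
    mean_n = sum(pg * ng for pg, ng in zip(p, n))
    cov = sum(pg * (g - mu) * (ng - mean_n) for pg, g, ng in zip(p, G, n))
    var = sum(pg * (ng - mean_n) ** 2 for pg, ng in zip(p, n))
    return cov / var


def average_excesses(p, G):
    """Average excesses of alleles A1 and A2 following Eq. 3."""
    p11, p12, p22 = p
    p1 = p11 + p12 / 2
    p2 = p22 + p12 / 2
    mu = sum(pg * g for pg, g in zip(p, G))
    a1 = (p11 * G[0] + 0.5 * p12 * G[1]) / p1 - mu
    a2 = (0.5 * p12 * G[1] + p22 * G[2]) / p2 - mu
    return a1, a2


def fixation_index(p):
    """Fixation index F = 1 - p12 / (2 p1 p2)."""
    p11, p12, p22 = p
    p1 = p11 + p12 / 2
    p2 = p22 + p12 / 2
    return 1 - p12 / (2 * p1 * p2)


p = [0.50, 0.30, 0.20]   # hypothetical genotype frequencies (A1A1, A1A2, A2A2)
G = [1.0, 3.0, 2.0]      # hypothetical genotypic values

a1_star, a2_star = average_excesses(p, G)
alpha = average_effect(p, G)
F = fixation_index(p)
c = 1 / (1 + F)

# Slope of the regression on the effective gene content c * N.
alpha_tilde = alpha / c

print(alpha_tilde, a2_star - a1_star)
```

For these frequencies *F* is positive (a deficiency of heterozygotes), so the average-excess slope is larger in absolute value than the average effect.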

From the general expression (2), we have thus retrieved both the average effects and the average excesses as particular cases of the contrasts in **Table 1**, specifically with *c* = 1 and *c* = 1/(1 + *F*), respectively. Therefore, by implementing the effective gene content *c* we have actually made our model capture the correlation between alleles that the average excesses account for. Further, using the relationship between the two values of *c* (1 and 1/(1 + *F*)) we are also retrieving the relationship between average effects and average excesses reported by Kempthorne (1957), α<sub>*i*</sub> = α\*<sub>*i*</sub>/(1 + *F*), which actually applies to the case of multiple alleles (see also Templeton, 1987).

Evidently, the possible values of the function 1/(1 + *F*) depend on those of the fixation index, *F*. In particular, *c* = 1/(1 + *F*) must always be positive and within the range 1/2 ≤ *c* < ∞ for the allowable values of *F* ranging from complete homozygosity (*F* = 1) to complete heterozygosity (*F* = −1). When *F* = 0 (i.e., *c* = 1) we have the well-known case where the average effect and average excess are the same, that is, under HWE. Since *c* = 1/(1 + *F*) must always be positive, α and α\* will always have the same sign and will verify |α| = *c*|α\*|. Taking all this into account, **Table 2** summarizes how the fixation index affects the relationships between average excesses and additive genetic effects under three situations: heterozygote excess (*F* < 0), HWE (*F* = 0) and heterozygote deficiency (*F* > 0). Within that table, we also stress that the mathematical relationship between average excesses and average effects does not depend upon which one(s) of all potential biological features is (are) underlying a particular set of observed genotype frequencies.

**Table 2 | Summary of some relevant mathematical and biological features associated to different statuses of the heterozygosity of a population.**


### **PARTITIONING THE GENOTYPIC VALUES AND THE GENETIC VARIANCE**

The average-excess formulation [expression (2) with *c* = 1/(1 + *F*)] comes from a linear regression (1) and it can thus be expressed by means of its intercept, μ, and its regression coefficient, α\*, as:

$$
\hat{G}(w) = \mu + \alpha^\* w \tag{4}
$$

This regression entails a decomposition of the genotypic values in which the predictions from the regression are the additive components and the deviations from the regression, due to dominance interactions, are the dominance components. For instance, the predicted [by (4)] value for genotype A1A1 is α\*<sub>11</sub> = *Ĝ*(−*c*). Now, both **Table 1** and expression (2) show that the dominance contrasts, *vij*, do not depend upon the scaling factor *c* and, hence, they are equal for the statistical and the statistical excess formulations. This implies that the dominance deviations are the same in both cases, i.e., δ\*<sub>*ij*</sub> = δ<sub>*ij*</sub> and that, therefore, α\*<sub>*ij*</sub> = α<sub>*ij*</sub> = α<sub>*i*</sub> + α<sub>*j*</sub>. That is to say, both formulations lead to the same decomposition of genotypic values,

$$G\_{ij} = \mu + \alpha\_{ij} + \delta\_{ij} \equiv \mu + \alpha\_{ij}^{\*} + \delta\_{ij}^{\*}.\tag{5}$$

This is illustrated in **Figure 1**, where we show the graphical interpretation of the decomposition of genotypic values coming from the average excesses and compare it with the classical decomposition coming from the average effects (Fisher, 1918). Note, particularly, that the decomposition of genotype A1A1 into additive and dominance parts is the same regardless of which linear regression is used. Interestingly, although for the average effects formulation (with *c* = 1) the predictions of the regression can be obtained by just summing up the appropriate average effects (see, e.g., Álvarez-Castro and Carlborg, 2007), this does not hold for the average-excess formulation [with *c* = 1/(1 + *F*)], i.e., α\*<sub>*ij*</sub> ≠ α\*<sub>*i*</sub> + α\*<sub>*j*</sub>, unless the genotypic frequencies are under HWE. The reason for this is also noted in **Figure 1**, where it can be seen that α\*<sub>*i*</sub> and α\*<sub>*ij*</sub> are associated with different values of the regression independent variable (α\*<sub>11</sub> = *Ĝ*(−*c*) whereas α\*<sub>1</sub> + α\*<sub>1</sub> = *Ĝ*(−1)). The exact relationship between these values under HWD follows directly from α\*<sub>*ij*</sub> = α<sub>*ij*</sub> = α<sub>*i*</sub> + α<sub>*j*</sub> and α<sub>*i*</sub> = *c*α\*<sub>*i*</sub> (Kempthorne, 1957), which lead to:

$$
\alpha\_{ij}^\* = c \left( \alpha\_i^\* + \alpha\_j^\* \right). \tag{6}
$$

The decomposition of genotypic values being the same for the average effects and the average excesses (5) necessarily implies that they also lead to the same decomposition of the genetic variance. We have confirmed this result by substituting the average-excess additive contrasts (**Table 1**, with *c* = 1/(1 + *F*)) in the equation for the additive variance (see, e.g., Cockerham, 1954). When doing so, a common factor *c*<sup>2</sup> can be simplified from both the numerator and the denominator of that expression so that the original expression for the additive variance is retrieved.
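The cancellation of the factor *c*<sup>2</sup> can also be seen numerically. The following sketch uses the genotype frequencies implied by the Figure 1 example (*p*<sub>1</sub> = *p*<sub>2</sub> = 1/2 and *F* = −2/5, hence *p*<sub>12</sub> = 0.7) and its genotypic values; the function name `additive_variance` is an assumption of this illustration.

```python
# Genotype order: A1A1, A1A2, A2A2 (values from the Figure 1 example).
p = [0.15, 0.70, 0.15]
G = [1.0, 3.0, 2.0]

p1 = p[0] + p[1] / 2
p2 = p[2] + p[1] / 2
F = 1 - p[1] / (2 * p1 * p2)               # fixation index
mu = sum(pg * g for pg, g in zip(p, G))    # population mean


def additive_variance(c):
    """Additive variance and slope from regressing G on the effective gene content."""
    z = [0.0, c, 2.0 * c]                          # effective content of A2
    mean_z = sum(pg * zg for pg, zg in zip(p, z))
    w = [zg - mean_z for zg in z]                  # additive contrast
    cov = sum(pg * (g - mu) * wg for pg, g, wg in zip(p, G, w))
    var_w = sum(pg * wg ** 2 for pg, wg in zip(p, w))
    slope = cov / var_w                            # alpha-tilde for this c
    return slope ** 2 * var_w, slope


var_effects, alpha = additive_variance(1.0)                   # average effects
var_excess, alpha_star = additive_variance(1.0 / (1.0 + F))   # average excesses

print(var_effects, var_excess, alpha, alpha_star)
```

The slope changes with *c* whereas the additive variance does not, since scaling the contrast by *c* scales Var(*w*) by *c*<sup>2</sup> and the squared slope by 1/*c*<sup>2</sup>.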

The additive variance coming from the average excesses is the variance of the values α\*<sub>*ij*</sub>. Thus, the average excesses of the alleles enter the computation of the additive variance by just applying (6). Although a common way to express and compute the additive variance under HWD entails both the average (additive) effects and the average excesses [see, e.g., expression (4.23a) in Lynch and Walsh, 1998], here we have shown that either formulation alone suffices to provide the additive variance under HWD. We recall that this is true as long as the formulations are built using contrasts that are appropriate to HWD, such as the ones we are providing in this communication for the biallelic case.

**FIGURE 1 | Graphical interpretation of the decomposition of the genotypic values (5) through the statistical excess (in black) and the statistical (in gray) formulations of NOIA for one locus with two alleles.** For simplicity, a case with equal allele frequencies (*p*<sub>1</sub> = *p*<sub>2</sub> = 1/2) is shown. The specific genotypic values (circles; *G*<sub>11</sub> = 1, *G*<sub>12</sub> = 3, *G*<sub>22</sub> = 2) displaying overdominance and a fixation index (*F* = −2/5) have been chosen for facilitating the visualization of the parameters of interest. The size of the circles represents the frequency of the genotypes. Horizontal dashed lines emphasize coincident arrow edges, the upper one corresponding to the population mean phenotype, μ = 2.55. The regression independent variable of the statistical formulation is the gene content, whereas the one of the statistical excess formulation is scaled by *c* = 1/(1 + *F*) = 5/3 and it works as an effective gene content. For both cases, the independent variable, *w*, is rescaled by its expectation as shown in **Table 1**.

#### **EFFECTIVE GENE CONTENT**

Hardy–Weinberg disequilibrium implies that alleles become (either positively or negatively) correlated in zygotes as compared to the expected genotype frequencies under HWE. A deficiency of heterozygotes, for instance, causes alleles to become positively correlated, leading to their effective additive contribution to the genotypes of a population to be more extreme (i.e., further away from their expectation) than under HWE. Fisher (1941) noted that this is accounted for by the average excesses. We note that this is not in contradiction with the interpretation of the average excesses of one allele as the conditional average genotypic deviation of the individuals that received that allele from at least one parent (see, e.g., Templeton, 2006).

For the biallelic case, we can trace Fisher's (1941) remark in our graphical interpretation (**Figure 1**). We first recall that although both the average effects and the average excesses are linear regressions of the genotypic values (the regression dependent variable) as expressed in (1), each of them is regressed on a different independent variable. The independent variable of the formulation of average effects is the actual content of allele A2 (shown in **Figure 1** as rescaled by its expectation) whereas the independent variable of the average-excess formulation is the effective content of allele A2, scaled by a factor *c*. This factor being greater than one in our example (*c* = 5/3) reflects an excess of heterozygotes (particularly with *F* = −2/5) and makes the slope of the regression for the average excess, α\*, less steep than the one on the actual gene content, α, as noted in **Table 2**. Conversely, a deficiency of heterozygotes would make the slope of the average-excess regression steeper than the one of the regression for the average effects. Thus, the effective gene content *c* leads the average excesses to reflect the effective contributions of the alleles to the genotypes of a population.

### **CLOSING PERSPECTIVE**

In conclusion, we have shown here that Fisher's (1941) definition of average excesses can be phrased within a new regression framework that also generalizes the average effects. This has enabled us to clarify the significance of the average excesses in different ways. First, we have expressed the average excesses in terms of matrix notation within the NOIA framework, which entails the extension of that theory to multiple loci with arbitrary epistasis under LE and allows us to easily transform between average excesses and other genetic parameters. Second, we have fully integrated the average excesses into the theory for the decomposition of the genotypic values and the genetic variance into additive and dominance components. Third, we have provided a graphical interpretation of the average excesses that is analogous to the one of the average effects. Finally, we interpret the factor determining the relationship between average effects and average excesses as the effective gene content of individuals, accounting not only for the effects of their alleles but also for how pairs of alleles are correlated in a particular population.

### **ACKNOWLEDGMENTS**

José M. Álvarez-Castro acknowledges funding by an "Isidro Parga Pondal" contract from the autonomous administration Xunta de Galicia. This research has been supported by project BFU2010-20003 from the Spanish Ministry of Science (José M. Álvarez-Castro) and the Natural Sciences and Engineering Research Council of Canada, Grant OGP0183983 (Rong-Cai Yang).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 21 December 2011; accepted: 17 February 2012; published online: 09 March 2012.*

*Citation: Álvarez-Castro JM and Yang R-C (2012) Clarifying the relationship between average excesses and average effects of allele substitutions. Front. Gene. 3:30. doi: 10.3389/fgene.2012.00030*

*This article was submitted to Frontiers in Genetic Architecture, a specialty of Frontiers in Genetics.*

*Copyright © 2012 Álvarez-Castro and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits noncommercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

## Analysis of linear and non-linear genotype × environment interaction

### *Rong-Cai Yang<sup>1,2</sup>\**

*<sup>1</sup> Alberta Agriculture and Rural Development, Edmonton, AB, Canada*

*<sup>2</sup> Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Keyan Zhao, University of California, Los Angeles, USA Urko M. Marigorta, Georgia Institute of Technology, USA*

#### *\*Correspondence:*

*Rong-Cai Yang, Department of Agricultural, Food and Nutritional Science, University of Alberta, 410 Agriculture/Forestry Centre, Edmonton, Alberta T6G 2P5, Canada e-mail: rong-cai.yang@ales.ualberta.ca*

The usual analysis of genotype × environment interaction (G × E) is based on the linear regression of genotypic performance on environmental changes (e.g., classic stability analysis). This linear model may often lump together the non-linear responses to the whole range of environmental changes, from suboptimal to super-optimal conditions, thereby lowering the power to detect G × E variation. On the other hand, G × E is present when the magnitude of the genetic effect differs across the range of environmental conditions, regardless of whether the response to environmental changes is linear or non-linear. The objectives of this study are (i) to explore the use of four commonly used non-linear functions (logistic, parabola, normal and Cauchy functions) for modeling non-linear genotypic responses to environmental changes and (ii) to investigate the difference in the magnitude of estimated genetic effects under different environmental conditions. The use of non-linear functions is illustrated through the analysis of one data set taken from barley cultivar trials in Alberta, Canada (Data A), and the change in effect sizes is examined through the analysis of another data set taken from the North American Barley Genome Mapping Project (Data B). The analysis of Data A showed that the Cauchy function captured an average of >40% of total G × E variation whereas the logistic function captured less G × E variation than the linear function. The analysis of Data B showed that genotypic responses were largely linear and that strong QTL × environment interaction existed, as the positions, sizes and directions of the QTL detected differed in poor vs. good environments. We conclude that (i) non-linear functions should be considered when analyzing multi-environmental trials with a wide range of environmental variation and (ii) QTL × environment interaction can arise from the difference in effect sizes across environments.

**Keywords: barley, environmental index, estimation, genotype × environment interaction, non-linear functions, quantitative trait loci**

### **INTRODUCTION**

Inconsistent performance of genotypes over different environments, known as genotype × environment interaction (G × E), remains a major impediment to genetic improvement of biological species in Canada and elsewhere. G × E is particularly important for plant species (e.g., agricultural crops and forest trees) because they spend their entire life at the same locality. Over the past decades, the assessment of G × E has been done with data obtained from testing the same genotypes over multiple environments (locations or years), i.e., multi-environmental trials (Yang, 2007).

The G × E effect has been incorporated into quantitative genetic models (Falconer and Mackay, 1996) through the use of genetic correlations within and between individual genotypes (e.g., Crossa et al., 2004; Burgueño et al., 2008). The basic idea behind such an approach is to predict genetic values by borrowing information among individuals from genetic relationships, and within individuals (across environments) from genetic and environmental correlations. The analysis of such correlation structure has been performed to obtain a parsimonious description of G × E variation using different versions of linear-bilinear models based on a mathematical technique known as singular value decomposition (SVD) (Golub and Reinsch, 1970). One popular use of the SVD technique is the biplot analysis of G × E based on the two commonly used rank-two linear-bilinear models: the additive main effects and multiplicative interaction (AMMI) model and the genotype main effects and genotype × environment interaction effects (GGE) model (i.e., fitted to residuals after removal of environment main effects) (for review, see Yang et al., 2009). Recently, Burgueño et al. (2008) and Cullis et al. (2010) described a similar biplot analysis under a mixed-model framework using a series of rank-two factor-analytic (FA) models. Apart from the adequacy of the rank-two models and other statistical issues, Yang et al. (2009) pointed out that the biplot analysis has contributed little to our understanding of the nature of G × E variation because it is a descriptive analysis with little predictive power.

Baker (1988) and others (e.g., Scheiner, 1993; Lindgren and Ying, 2000) have suggested the use of predictive models based on linear and non-linear response functions for studying G × E. The classic stability analysis based on the simple linear regression model, as pioneered by Yates and Cochran (1938), is a special case of the general non-linear predictive models. In addition, linear functions would usually account for only a small portion of G × E variation if a wide range of environmental conditions is tested. On the other hand, for quantitative traits such as crop yield or human complex diseases (Franks et al., 2013), G × E is manifested when the magnitude of the genetic effect differs across the range of environmental conditions, regardless of whether the response to environmental changes is linear or non-linear. For this reason, many recent genome-wide association studies (GWAS) in humans (Kilpelainen et al., 2011; Qi et al., 2012) have focused on determining the effect sizes of causal variants (e.g., SNPs) over different environmental conditions (e.g., different lifestyle behaviors).

The objectives of this paper are twofold. First, we investigate the use of different non-linear functions for modeling genotypic response to environmental changes or gradients. In this case, G × E is present when the response curves fail to be parallel (Baker, 1988). A similar concept has been used in evolution and ecology but under different names [e.g., phenotypic plasticity (robustness), reaction norm] (e.g., Via et al., 1995). Second, we examine whether there are differences in estimated genetic effects under different environmental conditions. It is generally expected that a larger effect is more likely to be found in an environmental condition where the expression of a gene is facilitated than in one where it is not.

### **MATERIALS AND METHODS**

#### **DESCRIPTION OF NON-LINEAR FUNCTIONS**

As a starting point, we provide a brief description of the classic stability analysis that is based on a linear regression function (Yates and Cochran, 1938; Finlay and Wilkinson, 1963; Eberhart and Russell, 1966; Perkins and Jinks, 1968):

$$
y_{ij} = a_i + b_i x_j \tag{1}
$$

where *y<sub>ij</sub>* is the performance (say yield) of the *i*th genotype tested in the *j*th environment, *x<sub>j</sub>* is the mean yield of all genotypes tested in the *j*th environment (known as the environmental index), the intercept *a<sub>i</sub>* is the yield of the *i*th genotype at the worst environment, and the slope *b<sub>i</sub>* measures the stability of the *i*th genotype.

According to Finlay and Wilkinson (1963), all genotypes can be conveniently classified into three groups: (i) genotypes with average stability (*b<sub>i</sub>* = 1.0); (ii) genotypes with low stability or high sensitivity to environmental changes (*b<sub>i</sub>* > 1.0) and (iii) genotypes with high stability or low sensitivity to environmental changes (*b<sub>i</sub>* < 1.0). Eberhart and Russell (1966) further refined this definition by suggesting that a stable genotype would be the one with average stability, low variance due to deviations from regression and high mean yield.
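The regression in Equation (1) and the slope-based grouping can be sketched in a few lines of Python. This is only a minimal illustration, assuming a complete genotype × environment table of means and ordinary least squares; the function name `stability_regression` and the toy yields are hypothetical and are not taken from the data analyzed in this paper.

```python
from statistics import mean


def stability_regression(yields):
    """Fit y_ij = a_i + b_i * x_j for each genotype by ordinary least squares.

    yields[i][j] is the performance of genotype i in environment j
    (a complete table is assumed). Returns the environmental index x_j
    and a list of (a_i, b_i) pairs, one per genotype.
    """
    n_env = len(yields[0])
    # Environmental index: mean of all genotypes in each environment.
    x = [mean(row[j] for row in yields) for j in range(n_env)]
    x_bar = mean(x)
    sxx = sum((xj - x_bar) ** 2 for xj in x)
    fits = []
    for row in yields:
        y_bar = mean(row)
        sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, row))
        b = sxy / sxx          # slope b_i (stability)
        a = y_bar - b * x_bar  # intercept a_i
        fits.append((a, b))
    return x, fits


# Hypothetical yields: 3 genotypes x 4 environments, made up for illustration.
toy_yields = [
    [2.0, 3.0, 4.0, 5.0],  # tracks the index: average stability
    [2.5, 3.0, 3.5, 4.0],  # flatter response: high stability
    [1.5, 3.0, 4.5, 6.0],  # steeper response: low stability
]
x_index, fits = stability_regression(toy_yields)
for i, (a, b) in enumerate(fits, start=1):
    print(f"genotype {i}: a = {a:.2f}, b = {b:.2f}")
```

Because the environmental index is itself the mean over genotypes, the fitted slopes average to one across genotypes, which is why *b<sub>i</sub>* = 1.0 serves as the reference for average stability.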

However, linear response usually accounts for only a small portion of the G × E variation and the responses are most often non-linear in practice (Knight, 1973; Jinks and Pooni, 1988). This occurs because when individuals of the same genotype are evaluated at different levels of an environmental factor ranging from suboptimal, optimal to super-optimal levels, their performance (i.e., yield) often shows a continuous non-linear relationship with the environment. The response curve can rise from near zero performance at extreme suboptimal levels of the environmental factor to some asymptotic value at optimal levels, and then decrease to near zero value at extreme super-optimal levels. If a small portion of the environmental range is evaluated, only the linear response could possibly be observed within this limited range of environmental conditions.

Here we briefly describe some well-known non-linear functions that have been used to model relationships of yield or growth with a single, more clearly defined environmental variable (for details, see Baker, 1988; Ratkowsky, 1993). The most obvious non-linear function is the quadratic (parabola) function, which is often used to describe the relationship between grain yield and field water availability (e.g., McKenzie et al., 2004):

$$y_{ij} = a_i + b_i x_j + c_i x_j^2 \tag{2}$$

The quadratic function has been also used to describe the genetic response to climate variables in forest trees (Rehfeldt et al., 1999). Another non-linear function is the reciprocal of the quadratic function used to describe the relationship between yield and planting density (Baker, 1988):

$$y_{ij}^{-1} = a_i + b_i x_j + c_i x_j^2 \tag{3}$$

This general expression can take several special forms, one of which is known as the Cauchy function,

$$y_{ij} = \frac{k_i}{1 + \dfrac{\left(x_j - x_{\text{max}}\right)^2}{r_i^2}} \tag{4}$$

where *k<sub>i</sub>* is a parameter that scales yield from zero to one (i.e., 0 ≤ *k<sub>i</sub>* ≤ 1), *x*<sub>max</sub> is the *x* value at which the maximum yield is achieved and *r<sub>i</sub>* is the scale parameter, which measures the range of genotypic response to environmental changes. This Cauchy function has been used to delineate breeding zones in forest trees (Raymond and Lindgren, 1990; Lindgren and Ying, 2000). The logistic curve:

$$y_{ij}^{-1} = a_i + b_i c_i^{x_j} \tag{5}$$

is often used to describe plant growth with age, but it can also be useful for describing the response to environmental changes (Baker, 1988; West et al., 2001; Zuo et al., 2012). Roberds and Namkoong (1989) proposed the use of the Gaussian function to model the genotypic response to an environmental gradient:

$$y_{ij} = \frac{k_i}{\sqrt{2\pi r_i^2}} \exp\left[-\frac{\left(x_j - x_{\text{max}}\right)^2}{2r_i^2}\right] \tag{6}$$

When *k<sub>i</sub>* = 1, Equation (6) becomes the normal probability density function. These non-linear functions are graphed in **Figure 1**.
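For readers who wish to experiment with these response curves, the functions in Equations (2)–(6) can be written as plain Python functions. The sketch below is illustrative only: the function names and parameter values are hypothetical, and the helper `cauchy_as_reciprocal_quadratic` simply expands Equation (4) algebraically to show which coefficients of Equation (3) reproduce the Cauchy form.

```python
import math


def quadratic(x, a, b, c):
    """Equation (2): y = a + b*x + c*x^2."""
    return a + b * x + c * x ** 2


def reciprocal_quadratic(x, a, b, c):
    """Equation (3): 1/y = a + b*x + c*x^2."""
    return 1.0 / (a + b * x + c * x ** 2)


def cauchy(x, k, x_max, r):
    """Equation (4): y = k / (1 + (x - x_max)^2 / r^2)."""
    return k / (1.0 + ((x - x_max) / r) ** 2)


def logistic(x, a, b, c):
    """Equation (5): 1/y = a + b * c^x."""
    return 1.0 / (a + b * c ** x)


def gaussian(x, k, x_max, r):
    """Equation (6): scaled normal density centered at x_max with scale r."""
    return k / math.sqrt(2.0 * math.pi * r ** 2) * math.exp(
        -((x - x_max) ** 2) / (2.0 * r ** 2)
    )


def cauchy_as_reciprocal_quadratic(k, x_max, r):
    """Coefficients (a, b, c) of Equation (3) that reproduce Equation (4).

    Obtained by expanding 1/y = (1/k) * (1 + (x - x_max)^2 / r^2).
    """
    a = (1.0 + x_max ** 2 / r ** 2) / k
    b = -2.0 * x_max / (k * r ** 2)
    c = 1.0 / (k * r ** 2)
    return a, b, c


# Hypothetical parameter values chosen only for illustration.
k, x_max, r = 0.8, 2.0, 1.5
a, b, c = cauchy_as_reciprocal_quadratic(k, x_max, r)
for x in (0.0, 1.0, 2.0, 3.5):
    print(x, cauchy(x, k, x_max, r), reciprocal_quadratic(x, a, b, c))
```

The two printed columns agree at every *x*, which is one way to see that the Cauchy function is a special case of the reciprocal quadratic.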

It should be noted that the y-axis and x-axis in **Figure 1** are rescaled in standardized units. For example, the standardized Cauchy function is given by:

$$y'_{ij} = \frac{1}{1 + x'^{2}_{ij}} \tag{7}$$

where *y′<sub>ij</sub>* = *y<sub>ij</sub>* / *k<sub>i</sub>* and *x′<sub>ij</sub>* = (*x<sub>j</sub>* − *x*<sub>max</sub>) / *r<sub>i</sub>*. Thus, *y′<sub>ij</sub>* becomes a relative measure of the performance within the range of 0 (0%)–1 (100%). All non-linear functions are indistinguishable at or near the optimum *x′<sub>ij</sub>* = 0. For example, the Cauchy function can be well approximated by a quadratic function on the rescaled axes because of the following mathematical relationship:

$$\frac{1}{1 + x'^{2}_{ij}} \rightarrow 1 - x'^{2}_{ij} \quad \text{when} \quad x'_{ij} \rightarrow 0 \tag{8}$$

but the approximation becomes less desirable at the extreme environmental conditions (i.e., *x′<sub>ij</sub>* >> 0).
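The size of the approximation error in Equation (8) can be made explicit. The short derivation below is added for illustration and uses only the geometric series; the numerical values are worked examples rather than results from the data.

$$\frac{1}{1 + x'^{2}_{ij}} = 1 - x'^{2}_{ij} + x'^{4}_{ij} - \cdots \quad \left(|x'_{ij}| < 1\right), \qquad \frac{1}{1 + x'^{2}_{ij}} - \left(1 - x'^{2}_{ij}\right) = \frac{x'^{4}_{ij}}{1 + x'^{2}_{ij}}$$

At *x′<sub>ij</sub>* = 0.2 the two functions differ by 0.0016/1.04 ≈ 0.0015, whereas at *x′<sub>ij</sub>* = 1 the Cauchy function equals 0.5 while the quadratic approximation has already dropped to zero, and at *x′<sub>ij</sub>* = 2 the values are 0.2 and −3, respectively.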

#### **ANALYSIS OF EMPIRICAL DATA**

We describe the analysis of two empirical data sets. The first data set (Data A) is taken from Yang et al. (2006), who analyzed 324 replicated barley cultivar trials sown at 84 sites across three provinces (Alberta, Saskatchewan and Manitoba) in the Canadian prairies during 1995–2003. Here we illustrate the use of non-linear G × E analysis with the data taken from the trials in the province of Alberta only. The data set is briefly recapitulated as follows. In each year, between 16 (1995) and 22 (2000) trials were planted at different locations across Alberta. Each trial consisted of 39–44 barley cultivars. It should be pointed out that in a given year, the same cultivars were usually included in each and every trial, but over different years at least some cultivars differed within and between test sites, either because of a turnover to newly registered cultivars or because pedigree seed of older cultivars was unavailable. The same check cultivars were used across the different years. All trials were conducted using a randomized complete block design with three or four replications. Cultural practices such as fertility, tillage and pest control varied from site to site but were considered to be the most appropriate for the individual sites.

Following the procedure of Yang et al. (2006), the usual analysis of variance partitioned the total sum of squares in each year into components due to the site effects (E), the cultivar effects (G) and the interaction between cultivar and site effects (G × E) using SAS PROC MIXED (SAS Institute Inc., 2012). Further partitioning of the G × E variation under different non-linear functions was carried out using appropriate data transformations that enabled the analysis of non-linear G × E under the mixed-model framework. The different non-linear functions were compared in terms of their ability to capture the amount of G × E variation.
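The idea of asking how much of the G × E variation a response function captures can be illustrated for the linear case with a small least-squares sketch. This is not the SAS PROC MIXED analysis used here: it assumes a complete table of genotype-by-environment means, handles only the linear function of Equation (1), and uses a hypothetical function name (`ge_partition`) and made-up yields.

```python
from statistics import mean


def ge_partition(y):
    """Split the G x E sum of squares into a part explained by linear
    regression on the environmental index and the remainder.

    y[i][j] is the mean of genotype i in environment j (complete table).
    Returns (ss_ge, ss_linear, fraction_explained).
    """
    n_gen, n_env = len(y), len(y[0])
    g_mean = [mean(row) for row in y]
    # Environmental index x_j: mean over genotypes in environment j.
    x = [mean(y[i][j] for i in range(n_gen)) for j in range(n_env)]
    grand = mean(x)
    sxx = sum((xj - grand) ** 2 for xj in x)
    ss_ge = 0.0
    ss_linear = 0.0
    for i in range(n_gen):
        # Interaction residual after removing genotype and environment means.
        ge = [y[i][j] - g_mean[i] - x[j] + grand for j in range(n_env)]
        ss_ge += sum(v ** 2 for v in ge)
        # Slope of the residuals on the index, which equals b_i - 1.
        slope_dev = sum(v * (xj - grand) for v, xj in zip(ge, x)) / sxx
        ss_linear += slope_dev ** 2 * sxx
    return ss_ge, ss_linear, ss_linear / ss_ge


# Hypothetical yields (3 genotypes x 4 environments), for illustration only.
toy = [
    [2.0, 3.5, 4.0, 5.0],
    [2.5, 2.5, 3.5, 4.0],
    [1.5, 3.0, 4.5, 6.0],
]
ss_ge, ss_linear, fraction = ge_partition(toy)
print(f"G x E SS = {ss_ge:.3f}, linear part = {ss_linear:.3f}, "
      f"fraction = {fraction:.1%}")
```

For a non-linear function, the same comparison would be made after transforming the data so that the function becomes linear in its parameters, or by fitting the curve directly and comparing the explained sum of squares.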

The second data set (Data B) is a publicly available data set that we previously analyzed using single-marker analysis (Ham et al., 2010) and genome-wide prediction (Yang and Ham, 2012). The data set consisted of 150 doubled haploid (DH) lines that were developed from a cross between two malting barley varieties (Steptoe × Morex) for the North American Barley Genome Mapping Project (NABGMP) (http://wheat.pw.usda.gov). These DH lines were tested in 16 environments over North America for yield and seven other agronomic and malt quality traits. A total of 223 restriction fragment length polymorphism (RFLP) markers were mapped over the seven chromosomes of the barley genome, with 37, 37, 31, 33, 29, 22, and 34 markers being mapped on chromosomes 1, 2, 3, 4, 5, 6, and 7, respectively. The effects of these RFLP markers were estimated using an R package, GLMNET/R, at three representative environments: poor (minimum environmental index), average (mean environmental index) and good (maximum environmental index) environments. GLMNET/R implements an efficient procedure for fitting the entire elastic-net regularization path for super-saturated linear regression, as in genome-wide association studies (GWAS) (Friedman et al., 2010; R Core Team, 2012). The elastic-net penalty (*P*<sub>α</sub>) is a compromise between the ridge-regression penalty (α = 0) and the LASSO penalty (α = 1), where α is related to the degree of shrinkage of marker effects. Two shrinkage methods, elastic net with α = 0.5 and α = 1 (i.e., LASSO), were used for genome-wide estimation of marker effects on response at poor, average and good environments.
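To make the role of α concrete, the sketch below writes out the elastic-net penalty in the form given by Friedman et al. (2010), together with the single-coefficient coordinate-descent update for a standardized predictor. It is a pure-Python illustration and not the GLMNET/R code used in the analysis; the function names and the numerical inputs (`z`, `lam`) are hypothetical.

```python
def elastic_net_penalty(beta, alpha):
    """P_alpha(beta) = (1 - alpha)/2 * sum(b^2) + alpha * sum(|b|).

    alpha = 0 gives the ridge penalty, alpha = 1 the LASSO penalty.
    """
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return 0.5 * (1.0 - alpha) * l2 + alpha * l1


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def coordinate_update(z, lam, alpha):
    """Elastic-net update of one coefficient for a standardized predictor.

    z is the simple least-squares coefficient of the partial residual on
    that predictor; lam is the overall penalty strength.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


# Hypothetical inputs: a moderate and a weak marker signal, same penalty.
lam = 0.5
for z in (0.8, 0.3):
    lasso = coordinate_update(z, lam, alpha=1.0)
    enet = coordinate_update(z, lam, alpha=0.5)
    print(f"z = {z}: LASSO -> {lasso:.3f}, elastic net (0.5) -> {enet:.3f}")
```

In this toy setting the weak signal is set exactly to zero under the LASSO but is kept, heavily shrunken, under the elastic net with α = 0.5, which is the qualitative behavior expected when comparing the two shrinkage methods at the same penalty strength.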

### **RESULTS**

### **DATA A**

We (Yang et al., 2006) previously partitioned the total variability into components due to genotypes (G), environments (E) and G × E, and G × E accounted for 6.6% (2003)–23.9% (2000) of the total variability across different years. Here we further partitioned the G × E variability into a component that could be explained by the different linear and non-linear models described above and a residual (**Table 1**). This further partitioning was based on linear or non-linear regression of yield on the environmental index (calculated as the mean of all cultivars at each and every test location). It is evident from **Table 1** that different non-linear models captured different amounts of the total G × E variation, ranging from an average of 10.2% for the logistic model to 40.3% for the Cauchy model. It is somewhat surprising that some non-linear models (e.g., the logistic model) actually captured less G × E variation than the linear model. For a given model, there was also a large amount of year-to-year variation in the percentage of the G × E variation being captured. For example, the Cauchy model captured 12.5% in 1997 and 86.5% in 2001. This result suggests that G × E variation is more predictable in some "good" years than in other "poor" years. In good years, stable and non-extreme weather or other agroclimatic conditions are available for optimal performance of individual genotypes, whereas in poor years such conditions do not exist.

**Table 1 | Percentages of genotype × environment interaction variation explained by linear function and four non-linear functions**

| Year | Linear | Logistic | Quadratic (parabola) | Normal | Cauchy |
|---|---|---|---|---|---|
| 2003 | 17.71 | 14.06 | 22.51 | 18.88 | 37.69 |
| Average | 11.26 | 10.17 | 17.23 | 19.13 | 40.28 |

#### **DATA B**

Responses of the DH lines to the environmental index were examined under different linear and non-linear models. The responses of most DH lines were linear (**Figure 2**). Furthermore, the variation in such linear response was greater in "good" environments (i.e., the locations with higher environmental index values) than in "poor" environments (i.e., the locations with lower environmental index values). It is evident from **Figures 3**, **4** that the elastic net (α = 0.5) detected more marker effects than LASSO (α = 1.0) but LASSO gave much sharper resolution of marker effects. Under both estimation methods, marker effects were more pronounced in the good environment than in the poor environment.

#### **DISCUSSION**

Differential responses of genotypes to environmental conditions (G × E interactions) can be linear or non-linear. Most current analyses of such responses are limited to the use of linear models. In this study, we explore the use of different non-linear models for characterizing and dissecting G × E interaction. This was done by extending the linear regression on environmental indexes (the means of all genotypic values at individual environments) or the classic stability analysis (Yates and Cochran, 1938; Finlay and Wilkinson, 1963; Eberhart and Russell, 1966; Perkins and Jinks, 1968) to the non-linear regression analysis. In the past, several non-linear functions including logistic, quadratic (parabola), Cauchy and normal functions have been individually used to describe genotypic responses to environments (e.g., Knight, 1973; Jinks and Pooni, 1979; Roberds and Namkoong, 1989; Raymond and Lindgren, 1990; Van Tienderen and Koelewijn, 1994; Lindgren and Ying, 2000). For example, Van Tienderen and Koelewijn (1994) found that the quadratic function was "statistically significantly better" than the linear function. In this study, our comparison of these representative non-linear functions (**Figure 1**) reveals the following characteristics. First of all, when the parameters are appropriately chosen or rescaled, the response curves of different non-linear functions near the optimum are indistinguishably similar, but their differences become increasingly evident when the environmental condition is not good (suboptimal) or too good (super-optimal). Second, should the true response be non-linear but be treated as linear, it would be difficult to tell the difference between non-linear responses to suboptimal and super-optimal conditions because in the linear analysis, both suboptimal and super-optimal conditions are lumped together to represent a deteriorated environment (**Figure 5**). Thus, the linear analysis would effectively reduce the range of environmental variation when a non-linear response is present but is unknown to the researcher or simply ignored. Third, including responses to both suboptimal and super-optimal conditions provides more opportunities to characterize the nature of G × E interaction. For example, differences in the rate of increase in response at suboptimal levels would reflect differences in efficiency, but differences in the rate of decrease in response at super-optimal levels would reflect differences in tolerance.

**cross between two malting barley cultivars (Steptoe × Morex) for the North American Barley Genome Mapping Project (NABGMP).** The range of the environmental index values runs from low (poor environment) to high (good environment).
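The lumping of suboptimal and super-optimal conditions under a linear analysis can be seen in a small numerical sketch. It is purely illustrative and rests on simplifying assumptions that are not part of the analysis in this paper: three hypothetical genotypes follow the Cauchy response of Equation (4), they share the same optimum, and the environmental index is taken as the mean performance at each level of a single environmental factor.

```python
def cauchy_response(x, k, x_max, r):
    """Equation (4): performance at level x of an environmental factor."""
    return k / (1.0 + ((x - x_max) / r) ** 2)


# Hypothetical genotypes as (k, x_max, r); all share the optimum x_max = 10.
genotypes = [(1.0, 10.0, 2.0), (1.0, 10.0, 4.0), (0.8, 10.0, 3.0)]

# Levels of the environmental factor, from suboptimal to super-optimal.
levels = [4.0, 7.0, 10.0, 13.0, 16.0]


def environmental_index(x):
    """Mean performance of all genotypes at factor level x."""
    return sum(cauchy_response(x, *g) for g in genotypes) / len(genotypes)


index = [environmental_index(x) for x in levels]
for x, idx in zip(levels, index):
    print(f"factor level {x:5.1f} -> environmental index {idx:.3f}")
```

Levels equally far below and above the optimum (4 and 16, or 7 and 13) receive the same environmental index, so a regression on the index alone treats them as the same "poor" environment, even though one is suboptimal and the other super-optimal.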

It may not be totally surprising from this study that the Cauchy function is the best at capturing the G × E variation, because it may best represent how different genotypes respond to the whole range of environmental conditions. Each genotype would have its own optimal growing environment. Any deviation from such an optimum, toward either super-optimal or sub-optimal conditions, would cause reduced performance or adaptation. The reduction must be very gentle for relatively mild super-optimal or sub-optimal conditions. For the extremely poor environments, the reduction asymptotically approaches a nonzero minimum. This scenario is best described by the Cauchy function, which has a gentle decline in the regions close to the optimum (the center) and has very long, flat tails on either side of the center but never converges. Compared with the other non-linear functions, the Cauchy function is more sensitive to the values close to the optimum but less sensitive to the values at extreme environments, which are of little practical interest (Raymond and Lindgren, 1990; Lindgren and Ying, 2000). Thus the Cauchy function should be considered in future plant and animal breeding and evolution studies.

Our analysis of Data A shows that different non-linear functions captured different amounts of G × E interaction variation, with the Cauchy function capturing an average of 40% of the total G × E variation, which is twice the amount captured by the second best model (normal function). This striking capability of the Cauchy function was also observed in Raymond and Lindgren (1990) and Lindgren and Ying (2000). It is evident from **Figure 1** that all non-linear functions are similar and indistinguishable when environmental conditions are close to the optimum, but they become markedly different when environmental conditions move toward the extremes. Our results suggest that the actual range of environmental conditions, as represented by all test locations over the years, is too extended to be accommodated by any of the functions except the Cauchy function, which can accommodate environmental conditions at some distance from the optimum. Thus, in practical applications, a non-linear function should be chosen after examining the actual distribution of environmental conditions, either from previous experience or from empirical data. It should also be noted that a sufficient number of environments (e.g., ∼40 locations in our study) is needed so that the true distribution of environmental conditions can be well approximated by the empirical data.

The results from the analysis of Data B reveal that responses of 150 DH lines to environmental indexes were largely linear (**Figure 2**). The 16 environments (essentially 12 locations in 2 years) at which these DH lines were tested would hardly be considered sufficient for covering the whole environmental range. Thus, the linear responses may reflect the response to a limited range of environmental indexes. The possibility of non-linear responses could not be ruled out, particularly if the whole environmental range were available. Even within this limited environmental range, our analysis revealed some interconnected and interesting features. First of all, the variation in the responses of DH lines was greater in the good environment than in the poor environment. Second, the contrast between good and poor environments correspondingly led to differences in the estimated positions, sizes and directions of QTL effects between these environments, and this occurred irrespective of which method was used (**Figures 3**, **4**). Third, inconsistency in the positions, sizes and directions of QTLs across the environmental range is direct evidence of strong QTL × environment interaction.

As mentioned above, there is an increase in the effect size of detected QTLs in the good environment in comparison to the poor environment (**Figures 3**, **4**). Similar observations have recently been made in many human GWAS, particularly with respect to GWAS-discovered causal SNPs controlling the susceptibility to obesity. For example, Kilpelainen et al. (2011) showed that the risk effect of FTO (fat mass and obesity associated) alleles was about 100% larger in physically inactive individuals than in active individuals from North America. A similar increase in the effect size was observed when individuals with ≥1 serving of sugar-sweetened beverage per day were compared to those with sugary beverage intake of <1 serving per month (Qi et al., 2012). Such an increase in the effect size occurs because there are causal variants that lead to more phenotypic variation in the inactive lifestyle than in the active lifestyle. Although this has generally been ignored in the past, our study and those other recent studies raise an important point: genetic effects must be defined and estimated not only under a reference population but also under an appropriate environment.

In conclusion, this paper calls attention to the use of non-linear functions for studying G × E interaction. We illustrate that the portion of G × E variation due to non-linear responses can be substantial if the correct non-linear function is used. We also emphasize that the correct identification of non-linear functions depends critically on how close the estimated environmental range is to the true range.

#### **ACKNOWLEDGMENTS**

I thank Dr. Zhiqiu Hu for computational and technical assistance, and two anonymous reviewers for helpful comments. This research is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC OGP0183983).

#### **REFERENCES**


Finlay, K., and Wilkinson, G. (1963). The analysis of adaptation in a plant-breeding programme. *Aust. J. Agric. Res.* 14, 742–754. doi: 10.1071/AR9630742


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 May 2014; accepted: 30 June 2014; published online: 22 July 2014.*

*Citation: Yang R-C (2014) Analysis of linear and non-linear genotype × environment interaction. Front. Genet. 5:227. doi: 10.3389/fgene.2014.00227*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A simulation study of gene-by-environment interactions in GWAS implies ample hidden effects

### *Urko M. Marigorta\* and Greg Gibson*

*Center for Integrative Genomics, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Rong-Cai Yang, University of Alberta, Canada Chirag Patel, Harvard Medical School, USA*

#### *\*Correspondence:*

*Urko M. Marigorta, Center for Integrative Genomics, School of Biology, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA 30332, USA e-mail: urko.martinez@biology.gatech.edu*

The switch to a modern lifestyle in recent decades has coincided with a rapid increase in the prevalence of obesity and other diseases. These shifts in prevalence could be explained by the release of genetic susceptibility for disease in the form of gene-by-environment (GxE) interactions. Yet, the detection of interaction effects requires large sample sizes, little replication has been reported, and a few studies have demonstrated environmental effects only after summing the risk of GWAS alleles into genetic risk scores (GRSxE). We performed extensive simulations of a quantitative trait controlled by 2500 causal variants to inspect the feasibility of detecting gene-by-environment interactions in the context of GWAS. The simulated individuals were assigned either to an ancestral or a modern setting that alters the phenotype by increasing the effect size by 1.05–2-fold at a varying fraction of perturbed SNPs (from 1 to 20%). We report two main results. First, for a wide range of realistic scenarios, highly significant GRSxE is detected despite the absence of individual genotype GxE evidence at the contributing loci. Second, an increase in phenotypic variance after environmental perturbation reduces the power to discover susceptibility variants by GWAS in mixed cohorts with individuals from both ancestral and modern environments. We conclude that a pervasive presence of gene-by-environment effects can remain hidden even though it contributes to the genetic architecture of complex traits.

**Keywords: gene-by-environment, environmental perturbation, modern lifestyle, complex disease, genetic risk score, decanalization, GWAS, obesity**

### **INTRODUCTION**

Diseases such as diabetes, cardiovascular disease, and obesity have become highly prevalent in the developed world in a period of just a few generations. For example, more than one third of U.S. citizens are obese (Ogden et al., 2007). The incidence of these "modern" diseases is now also rising in developing countries (Abegunde et al., 2007). Recent changes in lifestyle are thought to be the main drivers of the emergence of these diseases, because genetic changes at the population level only occur after many generations.

Paradoxically, the rapid increase in prevalence of these diseases coincides with large heritability values. There is increasing evidence that the heritability of several traits has increased in the last 50 years. Obesity serves to illustrate this point. An analysis of Swedish military conscripts born from 1951 to 1983 showed an increase in the heritability along with a marked increase in the genetic variance for obesity (Rokholm et al., 2011b). A further study of Danish twins showed that one percentage point increase in the prevalence of obesity accompanies a ∼3.3% increase in the genetic variance for the trait (Rokholm et al., 2011a). Thus, the increased influence of the current "obesogenic" environment exerts its effects through a large alteration in the overall contribution of genetic factors to the susceptibility for obesity. The two most likely explanations for this phenomenon consist of (i) uncovering of new cryptic susceptibility variants that did not previously participate in the genetic architecture of the trait (Gibson and Dworkin, 2004), or (ii) an increase in the effect size of variants already associated with obesity before the emergence of the current "obesogenic" environment (Hermisson and Wagner, 2004).

In the last 5 years, thanks to the detection of genetic variants robustly associated by GWAS, the presence of gene-by-environment interactions (GxE) has been confirmed for several traits. However, the discovered GxE effects explain just a minor fraction of variance, suggesting that most interaction effects remain hidden. The poor availability of reliable environmental data constitutes one of the major hurdles to detecting GxE interactions. Common genetic variation can be interrogated systematically with commercial genotyping arrays, but the availability of counterpart environmental information is often patchy and inconsistent, impeding a systematic interrogation of GxE effects (Patel et al., 2010, 2013). Moreover, the lack of high-throughput environmental data makes it difficult to replicate GxE findings consistently across datasets (Patel and Ioannidis, 2014). A second obstacle lies in the large sample size that is required to discover interaction effects unequivocally. For example, an early report observed that physical activity and diet modulate the effects of FTO variants on obesity (Demerath et al., 2011), but the evidence remained unclear in subsequent studies (Hubacek et al., 2011; Van Vliet-Ostaptchouk et al., 2012) until a large meta-analysis of 45 studies of ∼240,000 samples confirmed this interaction. Specifically, this meta-analysis established that the risk effect of FTO alleles was ∼100 and 40% larger in physically inactive relative to active individuals from North America and Europe, respectively [Odds Ratio: 1.43 vs. 1.22 and 1.27 vs. 1.21, respectively (Kilpelainen et al., 2011)].

Additionally, synergistic interactions between causal alleles and environmental factors are being detected through genetic risk scores (Franks et al., 2013). The calculation of a GRS combines the risk due to several variants into a single weighted sum, thus overcoming the limited statistical power of individual SNPs. For example, the interaction between risk alleles and sugar-sweetened beverage intake has been confirmed by means of a predisposition score for obesity based on 32 GWAS-discovered obesity variants. Specifically, the risk in BMI per 10 risk alleles increased by 77% in individuals with ≥1 serving per day compared with those whose sugary beverage intake was <1 serving per month (Qi et al., 2012). Similar examples of GRSxE detection have been described for fried food consumption and adiposity (Qi et al., 2014), cigarette use polygenic risk and neighborhood social cohesion (Meyers et al., 2013) and Western dietary patterns and type 2 diabetes (Qi et al., 2009; Nettleton et al., 2013).

In order to quantify how prevalent this GRS-by-environment (GRSxE) contribution may be, we have performed a simulation study of a quantitative trait under "ancestral" and "modern" environments. Our main aim was to define the range of realistic conditions in which GRSxE interaction effects can be detected in the absence of evidence for individual GxE for the contributing alleles. The environmental perturbation and genetic architecture of the trait are based on recent inferences from human GWAS data. We demonstrate that a wide range of perturbation effects is consistent with current observations from GxE studies, although our investigations also show that these effects may heavily reduce the power to detect causal alleles by GWAS.

### **MATERIALS AND METHODS**

#### **GENETIC ARCHITECTURE OF THE SIMULATED TRAIT**

We performed simulations of a polygenic quantitative trait to study the feasibility of detecting gene-by-environment effects in the context of GWAS. We considered a trait partially controlled by genetic variants in the context of a total phenotypic variance of 1 (*VP* = 1). In all simulations, we approximate the genetic architecture based on two recent inferences regarding the genetic basis of complex traits in humans. First, the trait is controlled by 2500 causal SNPs of common nature (minor allele frequency >5%). This number of variants resembles the number of susceptibility variants inferred for several complex traits [e.g., from ∼1700 to 2900 for myocardial infarction and type 2 diabetes, respectively (Stahl et al., 2012)]. Second, we assign the percentage of variance explained by each causal SNP (genetic variance of the trait, *gv*) based on the inferences from a large meta-analysis on normal height variation (Lango Allen et al., 2010). This study discovered 180 loci associated with height, each explaining from 0.012 to 0.28% of the variance in the trait. The contribution of 701 variants of similar effect size (accounting for 16% of the VP) was inferred. We thus assigned the inferred distribution to 701 randomly selected variants from the 2500 simulated SNPs (gathered from Supplementary Table 4 in Lango Allen et al., 2010). Each of the remaining 1799 alleles was assigned to explain 0.012% of the variance. Hence, the 2500 simulated common SNPs individually explain from 0.012 to 0.28% of the variance, and the total genetic component of the trait accounts for 36% of the VP (heritability = 36%). Importantly, note that we assign the allelic effects as a percentage of variance that each SNP explains, with the corollary that the actual effect size per allele will depend on the frequency of the causal allele (see next paragraph).

The number of SNPs and the genetic variance (*gv*) of the trait are fixed. Then, for each simulation we re-assign the effect allele frequencies (EAF) and effect sizes (β) at each of the 2500 causal SNPs. To mimic the ascertainment bias of GWAS arrays, EAF values were drawn from a uniform distribution with boundaries 0.05 and 0.95 [*U*(0.05,0.95)]. Genotypes were simulated assuming Hardy-Weinberg equilibrium. For example, for a SNP with EAF = 0.4 in a simulation of 10,000 samples, we would assign a value of 0, 1, and 2 phenotype-increasing alleles to ∼3600, 4800, and 1600 individuals, respectively. At this point of each simulation, we know the number of alleles that every individual carries at each site, as well as the total genetic variance each SNP explains. We can then easily calculate the effect size (β) of each SNP. The effect of the *i* th SNP on the trait is given by its contribution to the genetic variance of the trait (Park et al., 2010):

$$\text{gv}_{i} = 2 \ast \beta_{i}^{2} \ast \text{EAF}_{i} \ast (1 - \text{EAF}_{i})$$

For example, a variant that explains 0.28% of the VP with an effect allele frequency of 0.4 would increase the simulated phenotype by 0, 0.076, and 0.153 in individuals with 0, 1, and 2 causal alleles at that position, respectively. We consider an additive polygenic architecture. Thus, for each simulated individual the effects add up per allele copy, and are summed independently across all 2500 causal loci. After assigning the effects to all SNPs, the additive genetic variance component (VA) equals ∼0.36. To achieve the desired phenotypic variance (*VP* = 1), we assigned a random environmental component (VE) to every individual, drawn from a normal distribution with mean 0 and variance 0.64 (*VE* = 1 − VG). In summary, we simulated a quantitative trait with heritability 36% that results from the additive gene action over 2500 independent causal SNPs of common frequency.
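The steps described above can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the code used for the study (the analyses were run in R); the function names `beta_from_gv` and `simulate_trait`, the seed, and the array layout are assumptions made for this example.

```python
import numpy as np

def beta_from_gv(gv, eaf):
    """Per-allele effect size implied by gv = 2 * beta^2 * EAF * (1 - EAF)."""
    gv = np.asarray(gv, dtype=float)
    eaf = np.asarray(eaf, dtype=float)
    return np.sqrt(gv / (2.0 * eaf * (1.0 - eaf)))

def simulate_trait(gv, n_individuals, ve=0.64, seed=0):
    """Simulate additive phenotypes for independent SNPs under Hardy-Weinberg equilibrium.

    gv holds the fraction of phenotypic variance explained by each SNP.
    """
    rng = np.random.default_rng(seed)
    gv = np.asarray(gv, dtype=float)
    eaf = rng.uniform(0.05, 0.95, size=gv.size)      # EAF ~ U(0.05, 0.95)
    beta = beta_from_gv(gv, eaf)
    # Binomial(2, EAF) draws reproduce Hardy-Weinberg genotype frequencies.
    dosage = rng.binomial(2, eaf, size=(n_individuals, gv.size))
    genetic = dosage @ beta                          # additive across allele copies and loci
    noise = rng.normal(0.0, np.sqrt(ve), size=n_individuals)
    return dosage, beta, eaf, genetic + noise

# Worked example from the text: gv = 0.28% of VP at EAF = 0.4.
example_beta = float(beta_from_gv(0.0028, 0.4))
```

With these inputs `example_beta` is close to 0.076, matching the per-allele increment quoted in the text.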

#### **MODELING A SHIFT IN ENVIRONMENT THAT PERTURBS THE GENETIC EFFECT SIZES**

The genetic architecture explained above assumes that all individuals experience the same environment. This study investigates the consequences of a change in the environment that also modifies genetic contributions to disease or traits. For convenience, we call the baseline situation the "ancestral" environment, and postulate a new "modern" environment in which genetic effects are perturbed at some fraction of the 2500 causal SNPs. We also suppose that in contemporary society some individuals have a lifestyle closer to the "ancestral" one (simplistically, low caloric intake and high activity), while others have a more "modern" lifestyle (they consume sugary beverages and engage in other obesogenic behaviors). In reality there will be a gradation, but the dichotomous model serves to illustrate the potential consequences for disease in contemporary societies of the transition to a western lifestyle, which may have altered the genetic susceptibility to disease by inducing GxE through scaling effects (Gibson, 2009). We therefore considered the situation in which some or all individuals in the population live in a new environment that provokes a scaling effect (perturbation) in the genetic effect size at a fraction of the 2500 causal SNPs, so that simulated individuals can be classified as "unperturbed" or "perturbed" according to the environment they live in. Specifically, the "modern" environment alters the genetic architecture of the trait by multiplying the effect size (β) by a constant factor (e.g., with a 1.5-fold change, a SNP with βANCESTRAL = 0.06 transforms to βMODERN = 0.09).

The strength of the GxE interaction is proportional to, first, the factor of perturbation and, second, the proportion of SNPs that become perturbed in the "modern" environment. For example, physical activity was shown to attenuate the association between rs9939609 in *FTO* and body mass index (BMI) by ∼30 to 95% (Andreasen et al., 2008; Kilpelainen et al., 2011). Another recent study on the interaction of sugar-sweetened beverages and BMI described an increase of 77% in the genetic risk per 10 causal alleles for individuals who drink >1 beverage serving per day, which would translate into an ∼8% increment in the effect size per variant under the "modern" environment (Qi et al., 2012). In our simulations, we explored the parameter space that ranges from 5 to 100% increase in the genetic effect size (1.05–2-fold change, respectively). Regarding the proportion of SNPs perturbed, we explored the outcomes after perturbing from a minimum of 1% to a maximum of 20% of the causal variants (25 and 500 of the 2500 simulated SNPs, respectively).

#### **SELECTION OF SNPs THAT BECOME PERTURBED IN THE "MODERN" ENVIRONMENT**

Not all causal SNPs account for the same proportion of genetic variance in the simulated trait. Therefore, the degree of GxE we induce also depends on the actual effect size of the perturbed SNPs. We explored two different models for selecting the SNPs that become perturbed. In model 1, the SNPs were chosen at random, whereas in model 2 they were those explaining the most variance (e.g., the 250 SNPs with the highest explained variance in simulations in which 10% of the variants were perturbed). Importantly, the random environmental component (VE) was drawn from the same distribution in both the "ancestral" and "modern" environments. In other words, the increase in VP that the "modern" environment induces after perturbation is entirely due to the genetic component of the trait, thus increasing the VG and heritability. Models entailing an increase in VE could be similarly explored, but we do not do so here. Moreover, we note that although we only simulate scaling effects (at the SNP level), since only a small portion of variant effects is perturbed, there are also rank effects at the phenotype level.
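The two selection models and the scaling of effect sizes can be sketched as follows. This is an illustrative Python/NumPy sketch under stated assumptions, not the authors' implementation: the names `select_perturbed` and `perturb_effects` are hypothetical, and ties in explained variance under model 2 are broken by original SNP order, a choice the text does not specify.

```python
import numpy as np

def select_perturbed(gv, fraction, model, rng=None):
    """Return sorted indices of the SNPs perturbed in the "modern" environment.

    model 1: SNPs chosen at random.
    model 2: SNPs with the largest explained variance (ties keep original order).
    """
    gv = np.asarray(gv, dtype=float)
    n_perturbed = int(round(fraction * gv.size))
    if model == 1:
        rng = np.random.default_rng() if rng is None else rng
        return np.sort(rng.choice(gv.size, size=n_perturbed, replace=False))
    if model == 2:
        order = np.argsort(-gv, kind="stable")
        return np.sort(order[:n_perturbed])
    raise ValueError("model must be 1 or 2")

def perturb_effects(beta, perturbed_idx, factor):
    """Multiply the effect sizes of the selected SNPs by a constant factor."""
    beta_modern = np.array(beta, dtype=float)   # copy, so ancestral effects are untouched
    beta_modern[perturbed_idx] *= factor
    return beta_modern
```

For instance, `perturb_effects([0.06, 0.02], [0], 1.5)` scales only the first effect, reproducing the 0.06 to 0.09 example given for a 1.5-fold change.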

#### **THREE SCENARIOS OF SNP DISCOVERY IN A GWAS SETTING**

For both perturbation models 1 and 2 explained above, we set up three different scenarios to perform a "SNP discovery" process to ascertain the variants that were subsequently tested for the presence of GxE effects (see a workflow summary in **Figure 1**). In the first scenario, "scenario A," we act as if all perturbed SNPs were known, and forward them directly to GxE analysis (see next section). "Scenario A" avoids the GWAS discovery step and thus constitutes an ideal situation to establish an upper bound for the range of perturbation effects that can be detected under models 1 and 2.

However, in reality we do not know in advance which SNPs may have undergone environmental perturbation in effect size. Usual practice consists of testing GxE effects for variants that have been previously associated by GWAS. To mimic this situation, we developed two further scenarios in which we added a preliminary GWAS step to discover SNPs. In "scenario B," we performed a GWAS in which 100% of the samples were selected from the "modern" perturbed environment. In "scenario C" we performed GWAS upon a sample that is drawn equally from each of the two environments (50% of the individuals come from each of the "ancestral" and "modern" settings). In other words, "scenario C" corresponds to a situation in which half of the society lives in an "ancestral" environment (e.g., extensive physical activity in daily life and low fat diet), whilst the other half follows a "modern" lifestyle that increases the effect size of perturbed alleles. Importantly, we do not "know" which environment each individual lives in, in the sense that this information is not included in the discovery GWAS. For both scenarios, we performed a two-stage genome-wide screen in which the quantitative phenotype is regressed against the allele dosage at each SNP. In the discovery screen, we assay the 2500 simulated SNPs in a sample of 50,000 individuals. SNPs that achieve *P* < 10<sup>−5</sup> in the discovery GWAS are then assayed in a meta-analysis with 100,000 individuals after joining the 50,000 samples from the discovery GWAS with a new simulated replication sample of 50,000 individuals. Finally, SNPs associated with the quantitative trait at *P* < 5 × 10<sup>−8</sup> in the meta-analysis are forwarded to a novel sample of 40,000 individuals for the GxE analysis described in the next section.
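The two-stage screen can be sketched as below. This is a simplified Python/NumPy illustration rather than the procedure actually run in R: the helper names `snp_pvalues` and `two_stage_screen` are hypothetical, *p*-values use a normal approximation to the *t* statistic (reasonable at these sample sizes), and the second stage simply pools the discovery and replication samples, which is one possible reading of "joining" the two samples.

```python
import math
import numpy as np

def snp_pvalues(dosage, phenotype):
    """Two-sided p-values from regressing the phenotype on each SNP's allele dosage.

    Assumes every SNP is polymorphic in the sample (non-zero dosage variance).
    """
    x = dosage - dosage.mean(axis=0)
    y = phenotype - phenotype.mean()
    sxx = (x ** 2).sum(axis=0)
    slope = (x * y[:, None]).sum(axis=0) / sxx
    resid_var = ((y ** 2).sum() - slope ** 2 * sxx) / (len(y) - 2)
    z = slope / np.sqrt(resid_var / sxx)
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

def two_stage_screen(disc, repl, p_discovery=1e-5, p_joint=5e-8):
    """disc and repl are (dosage, phenotype) tuples; returns indices passing both stages."""
    candidates = np.flatnonzero(snp_pvalues(*disc) < p_discovery)
    if candidates.size == 0:
        return candidates
    # Second stage: pool discovery and replication samples for the candidate SNPs only.
    dosage = np.vstack([disc[0][:, candidates], repl[0][:, candidates]])
    phenotype = np.concatenate([disc[1], repl[1]])
    return candidates[snp_pvalues(dosage, phenotype) < p_joint]
```

The returned indices are the SNPs that would be carried forward to the GxE analysis in an independent sample.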

#### **TESTING FOR GENE-BY-ENVIRONMENT EFFECTS AFTER PERTURBATION**

A central focus of our study lies in the evaluation of the power to detect the GxE effects in our simulated trait. We aimed to evaluate the performance of two different approaches, namely detection (i) through the examination of individual SNPs and (ii) by means of unweighted genetic risk scores (GRS) that sum up the number of causal alleles for each individual (without weighting each allele by its effect size). To do so, for each scenario we simulated two cohorts of 20,000 individuals each, sampled from the "ancestral" and "modern" environments, respectively. For each simulated individual, we know its phenotype, the number of causal alleles at each SNP (coded as "0," "1," and "2"), the total number of causal alleles over all selected loci (GRS) and the environment it belongs to (coded as "0" and "1" for "ancestral" and "modern" environments, respectively). In each simulation of 40,000 individuals, we tested the interaction between genetic component and environment by means of a multiple linear regression:

$$Y_{j} = \beta_{0} + \beta_{G} \ast \chi(G)_{j} + \beta_{E} \ast \chi(E)_{j} + \beta_{(G \ast E)} \ast \chi(GE)_{j}$$

to estimate the regression coefficient β(G∗E), with Y*j*, χ(G)*j*, and χ(E)*j* recording the phenotype, allele dosage (or GRS) and environment of individual *j*, and χ(GE)*j* the corresponding interaction term (the product of the two), for individuals *j* = 1, ..., 40,000.
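The interaction regression can be illustrated with ordinary least squares. The sketch below is written in Python/NumPy for illustration only (the study used the R *lm* function); `interaction_test` and `unweighted_grs` are hypothetical names, and the *p*-value relies on a normal approximation that is adequate for cohorts of this size.

```python
import math
import numpy as np

def unweighted_grs(dosage):
    """Unweighted genetic risk score: total count of causal alleles per individual."""
    return np.asarray(dosage).sum(axis=1)

def interaction_test(genetic, environment, phenotype):
    """Fit Y = b0 + bG*G + bE*E + bGE*(G*E) by ordinary least squares.

    `genetic` is either a single-SNP allele dosage or a GRS; `environment` is coded 0/1.
    Returns the interaction estimate and its two-sided p-value.
    """
    g = np.asarray(genetic, dtype=float)
    e = np.asarray(environment, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    design = np.column_stack([np.ones_like(g), g, e, g * e])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    z = coef[3] / math.sqrt(cov[3, 3])
    return coef[3], math.erfc(abs(z) / math.sqrt(2.0))
```

The same function serves both approaches: pass one column of the dosage matrix for a SNP-level test, or `unweighted_grs(dosage)` for the single GRSxE test per simulation.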

In summary, we explored two different ways to select SNPs that undergo perturbation and three different procedures to choose the actual variants upon which we test for gene-by-environment interactions. For each of the six resulting combinations (models 1 or 2, and scenarios A, B, or C), we explored 400 combinations of parameters. Specifically, the percentage of SNPs that experienced perturbation ranged from 1 to 20% (20 steps of 1%), and the factor of perturbation ranged from a 1.05–2-fold change in effect size (20 steps of 0.05-fold increments). We performed five different replications for each of the 400 combinations, and thus 2000 simulations for each of the six combinations. Results are summarized as heat maps that interpolate relevant parameters across a continuous range of values (**Figures 2**, **4**–**7**, and Supplementary Table 1).
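The size of the simulation design follows directly from these counts. A minimal sketch of the parameter grid, with variable names chosen here for illustration:

```python
from itertools import product
import numpy as np

percent_perturbed = np.arange(1, 21)          # 1% to 20% in steps of 1%
fold_changes = np.linspace(1.05, 2.0, 20)     # 1.05 to 2.00 in steps of 0.05
models = (1, 2)
scenarios = ("A", "B", "C")
n_replicates = 5

grid = list(product(percent_perturbed, fold_changes))
runs_per_combination = len(grid) * n_replicates            # per model/scenario pair
total_runs = runs_per_combination * len(models) * len(scenarios)
```

The grid holds 400 parameter pairs, giving 2000 simulations per model and scenario pair, and 12,000 simulations across the six pairs.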

#### **STATISTICAL ANALYSIS**

All the analyses were performed using the R software v.3.0 (R Core Team, 2013). Associations between the simulated phenotype and allele dosage, as well as the GxE interactions, were tested with the *lm* function. Heatmap plots were generated using the *fields* and *akima* R packages.

[Figure caption fragment: … by each variant in the "ancestral" environment (from left to right). Gray dots correspond to the effect size in the "ancestral" environment. The scaling …]

### **RESULTS**

We simulated an environmental perturbation in genetic effect sizes to explore the feasibility of detecting gene-by-environment interactions. In the "ancestral" environment, the 2500 causal variants explained from 0.012 to 0.28% of the phenotypic variance. In the "modern" setting a percentage of variants ranging from 1 to 20% underwent perturbation, and their effect sizes increased by a constant factor that ranged from 1.05 to 2-fold. We applied two different models to select the causal SNPs that become perturbed in the second "modern" environment, and built three scenarios to select the SNPs upon which we investigated the feasibility of detecting gene-by-environment interactions following the workflow in **Figure 1**. A detailed summary of the results for each simulation is available in Supplementary Table 1.

[Figure caption fragment: … perturbation. The curve in the background represents the histogram of phenotypes if the two simulated samples are joined into a cohort of 20,000 individuals.]

#### **EFFECTS OF THE "MODERN" ENVIRONMENT ON THE DISTRIBUTION OF EFFECT SIZE AND PHENOTYPES**

The actual effect size of each causal allele depends on the frequency and variance explained by the causal variant. For example, we set the strongest contribution in the "ancestral" environment at ∼0.3% of the variance explained. If that allele has a frequency of 0.5, it would present an effect size of 0.075 (βANC), increasing the phenotype by 0, ∼0.075 and 0.15 in individuals with zero, one and two causal alleles, respectively. If it becomes perturbed in the "modern" environment by the strongest perturbation possible (2-fold change; βMOD = 2 ∗ βANC), the effect size would increase from ∼0.075 to 0.15. Thus, the variant would increase by 4-fold the percentage of phenotypic variance it accounts for, hiking from ∼0.3 to 1.2% (see Materials and Methods).
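The fourfold increase follows from the variance formula given in Materials and Methods. Writing *f* for the factor of perturbation (a symbol introduced here for convenience) and keeping the allele frequency unchanged:

$$\text{gv}_{\text{MOD}} = 2 \ast (f \ast \beta_{\text{ANC}})^{2} \ast \text{EAF} \ast (1 - \text{EAF}) = f^{2} \ast \text{gv}_{\text{ANC}}$$

With *f* = 2 and gv<sub>ANC</sub> ≈ 0.3%, this gives gv<sub>MOD</sub> ≈ 4 × 0.3% = 1.2%, expressed relative to the "ancestral" phenotypic variance (*VP* = 1).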

The differences in the distribution of phenotypes under each environment depend not only on the strength of the perturbation but also on the proportion of variants that become perturbed in the "modern" setting. The same 2-fold perturbation of the effect size, when acting upon 20% of the SNPs, would result in distributions of phenotypes that do not overlap extensively. We illustrate the resulting distribution of phenotypes under the "ancestral" and "modern" environments for a range of perturbation effects in **Figure 2** (black and red lines, respectively). For instance, the average phenotype under "modern" conditions after perturbing 20% of the causal SNPs by 1.2-fold is two standard deviations above the average phenotype under the "ancestral" environment. Overall, perturbation leads to a flattened distribution of phenotypes when individuals from both environments are combined, and the increase of phenotypic variance is proportional to the percentage of people that live in the "modern" environment. The differences are strengthened under model 2, because the SNPs that already present the largest effect sizes in the "ancestral" environment are chosen for perturbation in the "modern" setting. Indeed, the most extreme simulated perturbations, such as multiplying the effect of 20% of the variants by two, result in bimodal distributions that can be easily distinguished and are probably biologically unrealistic. However, the differences are much subtler for most of the parameter space, and in the next sections we refer to the parameter space that results in a change in the distribution of phenotypes that resembles that of typical traits such as contemporary BMI (see **Figure 3** for a real example based on the change in BMI shown by North American males).
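The shift in the average phenotype can be derived from the simulation model itself. Because every effect allele is phenotype-increasing and genotypes follow Hardy-Weinberg equilibrium, each SNP contributes 2 × EAF × β to the mean, so scaling β at the perturbed SNPs moves the mean by a predictable amount. The sketch below is illustrative Python/NumPy; the function name is hypothetical, and it assumes the environmental component has mean 0 in both settings, as in the simulations described.

```python
import numpy as np

def expected_mean_shift(beta, eaf, perturbed_idx, factor):
    """Expected difference in mean phenotype, "modern" minus "ancestral".

    Each SNP adds 2 * EAF * beta to the mean, so multiplying beta by `factor`
    at the perturbed SNPs adds (factor - 1) times their ancestral contribution.
    """
    beta = np.asarray(beta, dtype=float)
    eaf = np.asarray(eaf, dtype=float)
    idx = np.asarray(perturbed_idx, dtype=int)
    return (factor - 1.0) * np.sum(2.0 * eaf[idx] * beta[idx])
```

Since the ancestral phenotypic variance is 1, the returned value can be read directly in ancestral standard deviation units.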

The perturbation in genetic effect sizes prompted by the "modern" environment leads to an increase in the heritability of the quantitative trait (**Figure 4**). The phenotype presents a basal heritability of 36% in the "ancestral" environment, but this value rises readily in the "modern" setting. For instance, a 1.2-fold increase in the effect size of 20% of the causal SNPs leads to a heritability of ∼80%, and a similar effect is achieved with a 1.3 and

We illustrate the effects of the "modern" environment on (i) the genetic effect sizes of perturbed SNPs (major graphs in **Figure 2**), (ii) the differences in the distribution of phenotypes between the "ancestral" and "modern" lifestyles (small graphs in **Figure 2**), and (iii) the heritability of the simulated trait (**Figure 4**). We next describe the ability to detect gene-by-environment interaction effects induced by the "modern" setting. We compare the ability to detect GxE interactions at the SNP level with that of GRSxE analyses. Overall, we consider three different scenarios to ascertain candidate SNPs, and test for GxE effects in cohorts of 40,000 individuals in which half of the samples come from each of the "ancestral" and "modern" environments.

#### **DETECTION OF GxE EFFECTS WHEN ALL PERTURBED VARIANTS ARE KNOWN (SCENARIO A)**

Even if the analyses include all variants that are perturbed (that is, known from the model, without a GWAS discovery step), GxE effects tend to remain undetected at the SNP level (see **Figure 5**). Specifically, under Model 1 only 32 out of 2000 simulations (1.6%) achieved genome-wide significance (*P* < 5 × 10<sup>−8</sup>) for any SNP in the GxE analyses, and all of these required a >1.5-fold change in the effect size (**Figure 5B**). Indeed, at most a single variant was detected in each simulation, even though we tested for GxE individually for all perturbed SNPs (e.g., 500 tests for GxE when 20% of the variants were perturbed). Furthermore, only 14% of the 100 simulations with a 2-fold change in the effect size harbored a variant that passed the threshold for genome-wide significance (**Figure 5B**). Conversely, there was a wide range of perturbation parameters for which genetic risk scores, the sum of the total number of causal alleles each individual carries, constituted a powerful tool to detect interaction effects induced by the "modern" environment (**Figure 5A**). For instance, GRSxE interaction terms using GRS calculated over 250 perturbed SNPs (10% of causal variants) showed extremely low *p*-values (*P* < 10<sup>−10</sup>) for the whole range from 1.3 to 2-fold change in the genetic effect size. Indeed, tiny increments in the effect size, such as a 1.2-fold change, resulted in ∼100% of the simulations detecting GRSxE effects at the *P* < 0.05 significance level (notice that we performed a single GRSxE test per simulation, because the allelic counts of all tested variants were collapsed into a single number). Only the parameter space corresponding to <1.1-fold changes for <5% of the causal variants consistently resulted in non-significant GRSxE interaction terms (**Figure 5A**).

[Figure caption fragment: … under model 1 **(A)** and model 2 **(B)**, according to the percentage of SNPs perturbed (x-axis) and the factor of perturbation in effect size (y-axis).]

[Figure 5 caption, partially recovered.] Color maps showing the results of the gene-by-environment interaction analyses according to the percentage of SNPs perturbed (x-axis) and the factor of perturbation in effect size (y-axis). **(A)** *P*-value of the GRSxE interaction under model 1. **(B)** Number of SNPs at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>) for GxE under model 1. **(C)** *P*-value of the GRSxE interaction under model 2. **(D)** Number of SNPs at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>) for GxE under model 2. Panels **(B,D)** record the largest number observed out of five permutations.

The same patterns were observed under the environmental perturbations of Model 2, although an overall increased ability to detect interaction effects was noticed (**Figures 5C,D**). Specifically, 12.8% of the simulations (255 out of 2000) led to significant GxE effect at the SNP level, although 74.1% of those showed a single variant being genome-wide significant (189 out of 255). It was necessary to perturb genetic effects by 1.8–2-fold to achieve several variants being significant at the SNP level (**Figure 5D**). The interaction effects induced by the "modern" environment are almost universally detected through GRSxE analyses (**Figure 5C**).

#### **DETECTION OF CAUSAL ALLELES BY GWAS AFTER MODERN ENVIRONMENTAL PERTURBATION**

In "scenario A," the environmental perturbation in effect sizes can be easily detected with GRSxE analyses. These results establish an upper bound for the ability to detect gene-by-environment effects induced by the "modern" lifestyle, because the analyses are restricted to the truly perturbed variants. Yet, for real traits it is uncertain which SNPs may present GxE effects. Usual practice consists of prioritizing variants unequivocally associated with the trait of interest, such as the alleles discovered by GWAS. To mimic this procedure, we perform a preliminary GWAS to ascertain variants for GxE analyses.

GWAS meta-analyses of 100,000 individuals entirely drawn from the "ancestral" environment detected ∼90 genome-wide significant variants, accounting for ∼15% of the heritability (data not shown). GWAS on cohorts with 100% of the individuals being "perturbed" ("scenario B") under model 1 led to an increased ability to detect variants associated with the trait (**Figure 6A**). The number of detected variants ranged from 100 to 150 for the most realistic range of perturbation parameter space, and rose to ∼300 when GWAS was performed upon 100,000 very heavily "disturbed" individuals (e.g., 2-fold change in the effect size for ∼20% of the causal variants). A progressively larger number of the associated variants that are detected correspond to perturbed variants (**Figure 6B**). The tendency to detect increasing proportions of perturbed variants becomes exacerbated under model 2. Specifically, even though similar numbers of significant variants are detected by GWAS (**Figure 6C**), the increment in SNP detection corresponds to perturbed variants (**Figure 6D**).

Highly divergent patterns were observed when we performed a preliminary GWAS upon a mixed sample of individuals drawn equally from the "ancestral" and "modern" environments ("scenario C"). Under Model 1, the number of variants detected by GWAS remained close to ∼90 only if the 50% of GWAS individuals coming from the "modern" environment had been perturbed slightly (e.g., <1.2-fold for <5% of the causal SNPs, bottom-left corner in **Figure 6E**). The ability to detect causal alleles dropped when more extensive perturbations were simulated. For instance, ∼60 variants were detected at genome-wide significance levels when 7% of the variants had their effect size multiplied by 1.3-fold, and almost no variants were discovered if the same percentage of SNPs underwent a 1.8-fold change in the effect size, or with a 1.3-fold increase for 20% of the causal SNPs. Interestingly, the increasingly reduced number of variants discovered by GWAS under "scenario C" corresponded to perturbed SNPs (top-right corner in **Figure 6F**). Similar patterns were observed for "scenario C" under model 2 of perturbation (**Figures 6G,H**). As discussed below, we attribute these effects to the increase in phenotypic variance outweighing the increase in the genetic effect of each individual SNP.

[Figure 6 caption, partially recovered.] Color maps showing the results of the GWAS upon cohorts of 100,000 individuals with (i) 100% of the samples drawn from the "modern" environment (scenario B; top panels, **A–D**) and (ii) 50% of the samples drawn from each of the "ancestral" and "modern" environments (scenario C; bottom panels, **E–H**). Specifically: **(A,E)** Under model 1, number of variants discovered by GWAS at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>). **(B,F)** Under model 1, percentage of the genome-wide significant variants that have undergone perturbation. **(C,G)** Under model 2, number of variants discovered by GWAS at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>). **(D,H)** Under model 2, percentage of the genome-wide significant variants that have undergone perturbation.

#### **DETECTION OF GENE-BY-ENVIRONMENT INTERACTIONS WITH SNPs DETECTED BY GWAS**

The enhanced power to discover SNPs under "scenario B" resulted in patterns of GxE interaction detection that are similar to those observed for "scenario A," in which only perturbed variants were used (**Figures 7A–D**). SNP-by-SNP tests rarely resulted in significant GxE interaction coefficients (**Figure 7B**). By contrast, a wide range of the parameter space led to significant GRSxE evaluations, starting from a ∼1.4-fold change for ∼5% of the variants and extending to any stronger perturbation (**Figure 7A**). Similarly, under model 2 the tendency toward significant GRSxE detection was exacerbated (**Figure 7C**), and GRSxE interactions were significant for the whole range of simulated parameters. In these analyses, only GWAS performed upon strongly perturbed individuals (1.8–2-fold change in β) permitted detection of perturbed SNPs that were consistently significant at the individual level in the GxE analysis (**Figure 7D**).

A reversed pattern was observed under "scenario C." The proportion of perturbed SNPs among the detected variants was higher as perturbation strengthened, but it became negligible in absolute terms because almost no variants were detected by GWAS. Thus, the overall poor performance of mixed GWAS in detecting perturbed SNPs rendered the task of detecting GxE effects with GWAS SNPs almost impossible, even at the GRSxE level (**Figures 7E,F**). The compromised detection power under "scenario C" does not, however, preclude the detection of gene-by-environment effects through GRSxE analyses under model 2 (**Figures 7G,H**).

[Figure 7 caption, partially recovered; scenarios B and C.] Color maps showing the results of the gene-by-environment interaction analyses according to the percentage of SNPs perturbed (x-axis) and the factor of perturbation in effect size (y-axis). Results for scenario B are shown in top panels **(A–D)**. Specifically: **(A)** *P*-value of the GRSxE interaction under model 1. **(B)** Number of SNPs at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>) for GxE under model 1. **(C)** *P*-value of the GRSxE interaction under model 2. **(D)** Number of SNPs at genome-wide significance levels (*P* < 5 × 10<sup>−8</sup>) for GxE under model 2. The corresponding results for scenario C are shown in bottom panels **(E–H)**. Panels **(B,D,F,H)** record the largest number observed out of five permutations. White areas in top right corners in panels **(E,G)** correspond to parameter space with no SNPs detected by GWAS and thus missing GRSxE analyses.

### **DISCUSSION**

In this study we performed a series of simulations to inquire under what conditions gene-by-environment effects can be detected. We applied an environmental perturbation upon cohorts of individuals that live in either an "ancestral" environment, or a "modern" setting that leads to an increment in the genetic effect sizes of a percentage of the causal alleles. For a wide range of the explored parameter space, gene-by-environment effects mostly remain unnoticed when interaction is examined at the SNP level. Conversely, GxE analyses are well powered to detect significant interactions when the genetic component of each individual is summarized through genetic risk scores (GRS) that sum up the total number of causal alleles in a single figure. Moreover, we find that the ability to detect perturbed SNPs in a GWAS preliminary to the GxE analysis depends on the mixture of samples coming from each environment. Genome-wide screens performed upon homogeneous cohorts of perturbed individuals show increased power to detect significant gene-by-environment interaction effects. In contrast, GWAS upon heterogeneous mixtures of "unperturbed" and "perturbed" individuals present a decreased ability to detect significant SNPs, thus inhibiting the task of detecting GxE effects.
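One way to see why mixed cohorts lose power is the law of total variance: when two groups with different mean phenotypes are pooled, the difference in means adds to the phenotypic variance, so a SNP of fixed effect explains a smaller share of it. The following Python sketch is illustrative only; the function names are hypothetical and the numbers in the usage note are example inputs, not results from the simulations.

```python
def mixed_cohort_variance(var_ancestral, var_modern, mean_shift, frac_modern=0.5):
    """Phenotypic variance of a cohort mixing two environments (law of total variance)."""
    w = frac_modern
    within = (1.0 - w) * var_ancestral + w * var_modern   # average within-group variance
    between = w * (1.0 - w) * mean_shift ** 2              # variance between group means
    return within + between

def share_of_variance(gv, var_total):
    """Fraction of the total phenotypic variance attributable to a SNP with variance gv."""
    return gv / var_total
```

For example, with a mean shift of two ancestral standard deviations (the value reported in Results for a 1.2-fold perturbation of 20% of the SNPs) and, as a simplifying assumption, a within-group variance of 1 in both environments, `mixed_cohort_variance(1.0, 1.0, 2.0)` returns 2.0. An unperturbed SNP explaining 0.28% of the variance within either environment would then explain 0.14% of the variance in the equally mixed cohort.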

### **FEASIBILITY OF THE ENVIRONMENTAL PERTURBATION UNDER THE "MODERN" ENVIRONMENT**

The validity of the insights gained from this study depends on the plausibility of our model of environmental perturbation, and the extent to which we mimic the reality faced by current GWAS. Certainly, it is difficult to evaluate the consequences of the "modern" perturbation in the case of actual human phenotypes because the heritability and phenotype distributions corresponding to the "ancestral" lifestyle are unknown. However, there is increasing evidence that the switch to a western lifestyle may have been coupled with a change in the genetic effects of causal alleles (Gibson, 2009). Human complex traits result from the assemblage of multiple physiological dimensions, which may lead to a canalization of phenotypes whereby genetic effects are minimized following long-term stabilizing selection (McGrath et al., 2011). Under such a theoretical model, the "modern" human standard of living may have uncovered the activity of previously silent, or almost silent, cryptic genetic variability (Hermisson and Wagner, 2004). For example, this could have been the case for polymorphisms lying in genes that participate in pathways involved in neural regulation of appetite (Heber and Carpenter, 2011). These variants may have played a small role in the genetic etiology of weight throughout the history of our species, but may explain a larger proportion of the individual susceptibility to obesity in the modern environment of unrestricted access to processed food. A variety of other similar situations could be imagined, such as the interplay between addiction, tobacco use and lung cancer (Amos et al., 2008). In our simulations, we explore a range of parameter space in which the "modern" environment perturbs from 1 to 20% of the causal variants. Such a change can be easily framed in a pathway perspective. Specifically, one or several physiological pathways participating in the genetic architecture of complex traits may respond differently under the "modern" environment. In the context of a common disease, the environmental perturbation we explore would plausibly amount to an increase in the proportion of the population at risk (as in **Figure 3** for real BMI).

Our model postulates one of the simplest instances of GxE in which individuals are assigned to a binary environmental state that would roughly correspond to "ancestral" and "modern" lifestyles. A more realistic scenario of environmental perturbation should summarize the varying fraction of "modern lifestyle" followed by each person into an individual-specific measure, or "exposome" (Patel and Ioannidis, 2014). More complex simulations could be tuned to incorporate more realistic settings. For instance, the extent of exposure to modern lifestyle could be more finely determined (e.g., degree of sedentary behavior, diet patterns, stress at work... ) to explore threshold-dependent models of GxE. Our simulations are necessarily a simplification of the almost infinite array of GxE interactions that could arise in the presence of multi-layered and continuous environments that can perturb the genetic effects of causal variants (Luan et al., 2001; Wong et al., 2003). However, the qualitative environmental states in our simulations resemble the practice of recent studies that have confirmed GxE effects after categorizing the environment into binary categories, as has been the case for example in studies of sugar-sweetened beverage consumption and overall diet patterns (Do et al., 2011; Qi et al., 2012).

In addition to the mechanism of perturbation and the binary nature of the simulated environment, the realism of our perturbation model also depends on the likelihood that the explored parameter space is realistic. We chose to approximate this by checking whether the range of simulated effects results in phenotypic distributions that approximate real observations. In the context of BMI, for instance, western urban women have been shown to present an average BMI value that is ∼4 standard deviations larger than the corresponding figure for Hadza hunter-gatherer women (see Table 1 in Pontzer et al., 2012). These differences are similar to the average horizontal shift between "ancestral" and "modern" environment that we observe in our simulations (e.g., depending on the percentage of perturbed SNPs, changes in effect sizes by <1.4-fold lead to ∼1 to 4 standard deviations of difference in the average phenotype). Furthermore, we also examined the shape of the phenotype distributions. Indeed, we observe significant GRSxE analyses for simulations with parameter combinations that result in more flattened but unimodal distributions of phenotypes, such as those observed in U.S. men (**Figure 3**). Nonetheless, the actual phenotypic variance of a combined population depends on the mixture proportions, and even extreme situations in which half of the individuals are raised in each environment do not lead to a bimodal phenotypic distribution in a combined simulation population. The heritability of the trait is also kept within a reasonable range. It can rise to as much as 90% under the most severe perturbations, but the actual heritability would lie between 36 and 80% according to the exact proportion of "unperturbed" and "perturbed" individuals.

### **DETECTION OF GENE-BY-ENVIRONMENT EFFECTS WITH GENETIC RISK SCORES**

We observe a substantial parameter space in which gene-by-environment effects can be easily detected with genetic risk scores while remaining hidden in individual SNP analyses, even after testing exclusively those variants that were detected in populations perturbed by the "modern environment." SNP-by-SNP analyses provide only anecdotal evidence for significant GxE, and only when extreme perturbations are assayed (e.g., >400 SNPs perturbed by 2-fold in the effect size are necessary to detect a single genome-wide significant variant). Conversely, GRSxE analyses are always significant when the β values are multiplied by 1.3-fold or more, or for the whole range of perturbation parameters when the "modern" environment affects the SNPs that explain most of the variance in the trait (i.e., model 2). These results confirm that a widespread presence of GxE effects is not at odds with the lack of evidence when individual variants are assayed, despite a substantial presence of interaction effects.
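The contrast between score-level and SNP-level tests can be illustrated with a small, self-contained Python sketch. It is not the simulation pipeline used in this study: all settings (200 causal SNPs with equal effects, allele frequency 0.5, 4,000 individuals per environment, 20% of SNPs perturbed by 2-fold, unit residual variance) are illustrative assumptions, and a two-sample comparison of regression slopes stands in for the interaction term of a joint regression model.

```python
import math
import random


def slope_and_se(x, y):
    """Least-squares slope of y on x and the standard error of that slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def interaction_z(x_anc, y_anc, x_mod, y_mod):
    """z statistic for a difference in slope between the two environments."""
    b_anc, se_anc = slope_and_se(x_anc, y_anc)
    b_mod, se_mod = slope_and_se(x_mod, y_mod)
    return (b_mod - b_anc) / math.sqrt(se_anc ** 2 + se_mod ** 2)


def simulate_cohort(rng, n, betas):
    """Draw genotypes (assumed allele frequency 0.5) and phenotypes."""
    genotypes, phenotypes = [], []
    for _ in range(n):
        g = []
        for _ in betas:
            bits = rng.getrandbits(2)           # two fair coin flips, one per allele
            g.append((bits & 1) + (bits >> 1))  # allele count: 0, 1 or 2
        genotypes.append(g)
        genetic_value = sum(b * a for b, a in zip(betas, g))
        phenotypes.append(genetic_value + rng.gauss(0.0, 1.0))
    return genotypes, phenotypes


def run(seed=1, n=4000, m=200, frac=0.2, fold=2.0, beta=0.1):
    """Return (GRSxE z, largest single-SNP |z|) under the assumed settings."""
    rng = random.Random(seed)
    n_pert = round(frac * m)
    betas_anc = [beta] * m
    # "Modern" environment: the first n_pert SNPs have their effect multiplied.
    betas_mod = [beta * fold] * n_pert + [beta] * (m - n_pert)
    g_anc, y_anc = simulate_cohort(rng, n, betas_anc)
    g_mod, y_mod = simulate_cohort(rng, n, betas_mod)
    # Unweighted GRS: total count of causal alleles per individual.
    z_grs = interaction_z([sum(g) for g in g_anc], y_anc,
                          [sum(g) for g in g_mod], y_mod)
    # SNP-by-SNP interaction tests on the same data.
    cols_anc, cols_mod = list(zip(*g_anc)), list(zip(*g_mod))
    z_snps = [interaction_z(cols_anc[j], y_anc, cols_mod[j], y_mod)
              for j in range(m)]
    return z_grs, max(abs(z) for z in z_snps)


z_grs, max_abs_snp_z = run()
print(f"GRSxE z = {z_grs:.2f}; largest single-SNP |z| = {max_abs_snp_z:.2f}")
```

For reference, a two-sided *P* of 5 × 10<sup>−8</sup> corresponds to |z| of roughly 5.45. Under these assumed settings the score-level statistic is expected to exceed that threshold, whereas single-SNP statistics are expected to fall short of it; other parameter choices will shift both.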

An important aspect of our simulations lies in the choice of variants that are perturbed by the "modern" environment. We observe that it is easier to detect GxE effects when the variants that are perturbed coincide with the alleles that explain most of the genetic basis of the trait, as in model 2. This makes sense considering that these perturbed variants not only present the largest effect sizes, but also have them multiplied in the "modern" environment. The same mechanism explains the increment in the number of variants detected by GWAS when the genome-wide screen is performed entirely upon perturbed individuals, as in "scenario B." For real traits with widespread GxE effects, it may be key to perform GWAS selecting for perturbed individuals. The selection of those individuals following a "modern" lifestyle would unravel specific pathways that respond poorly in the face of perturbation, thus enabling a more detailed understanding of the etiology of the diseases of affluence. Nonetheless, it may be inherently complex to design "perturbed-only" GWAS, owing to the difficulty in defining what exactly constitutes the perturbed environment. The sampling of individuals could also be confounded by the fraction of cases that are due to purely environmental causes without any major role of gene-by-environment interactions linked to "modern" life.

### **MIXTURE OF ENVIRONMENTS COMPROMISES GWAS DISCOVERY POWER**

The simulations in which the preliminary GWAS is performed upon cohorts with a mixed environmental exposure ("scenario C") show a remarkable trend regarding SNP discovery. The combination of "ancestral" and "modern" environments does not compromise the detection of causal variants when perturbation effects are tiny or restricted to a small fraction of the causal SNPs. However, larger perturbations decrease the ability to detect new variants, and statistical power eventually collapses for the strongest range of effects in our simulations. This result makes sense because gene-by-environment interactions add variance and heterogeneity in the estimates of SNP effects. We show the results for a causal variant that explains 0.3% of the variance in an "ancestral" population (**Figure 8**). This allele achieves *P* < 10<sup>−12</sup> when assayed in a GWAS with 20,000 individuals that follow the "ancestral" lifestyle. In contrast, the significance worsens (*P* < 10<sup>−7</sup>) when this variant is assayed upon a mixture of 10,000 "ancestral" individuals and 10,000 individuals in which 10% of the SNPs have increased their effect size by 1.5-fold. Eventually, the variant remains completely unnoticed in a mixed GWAS when the effect size increases by 2-fold in the individuals following "modern" lifestyle (*P* ∼ 10<sup>−4</sup>). As a consequence, these variants are not found among the top candidate list in our simulated meta-analysis GWAS.
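The ancestral-only figure quoted above can be checked with a standard back-of-the-envelope approximation: the expected 1-df association statistic for a variant explaining a fraction r² of the phenotypic variance in n individuals is roughly n·r²/(1 − r²). The Python sketch below applies this to a variant explaining 0.3% of the variance in 20,000 individuals, and then shows how the expected statistic shrinks when the same allelic effect is diluted by a larger total phenotypic variance. The `variance_inflation` factors are hypothetical placeholders rather than values taken from the simulations, and the sketch deliberately ignores any heterogeneity in the variant's own effect across environments.

```python
import math


def expected_chi2(n, r2):
    """Approximate expected 1-df statistic for a SNP explaining r2 of the variance."""
    return n * r2 / (1.0 - r2)


def p_value_1df(chi2):
    """Upper-tail P-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def diluted_r2(r2, variance_inflation):
    """Variance explained once total phenotypic variance grows by the given
    factor while the SNP's own contribution stays fixed (an assumption)."""
    return r2 / variance_inflation


n = 20000   # cohort size quoted in the text
r2 = 0.003  # 0.3% of variance explained in the "ancestral" population

for inflation in (1.0, 1.5, 2.0, 3.0):  # hypothetical inflation factors
    chi2 = expected_chi2(n, diluted_r2(r2, inflation))
    print(f"variance x{inflation:.1f}: expected chi2 = {chi2:5.1f}, "
          f"P at that value = {p_value_1df(chi2):.1e}")
```

With no inflation, the expected statistic is about 60, which is consistent with the reported *P* < 10<sup>−12</sup>. The inflated rows only indicate the direction of the effect; they are not intended to reproduce the mixed-cohort *P*-values in **Figure 8**.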

It is difficult to evaluate the extent to which pervasive gene-by-environment effects have compromised the power to discover associated variants by GWAS. The number of discovered variants correlates with sample size (Visscher et al., 2012), but some other differences among studies can be remarked upon. For instance, a large meta-analysis of ∼180,000 individuals reported 180 different loci associated with height, whereas a similarly powered study with >250,000 individuals only described 32 loci for BMI (Lango Allen et al., 2010; Speliotes et al., 2010). This may be explained simply by a difference in narrow-sense heritability. On the other hand, the SNP-based heritability in these studies explains a notably greater proportion of the total heritability for height, implying a reduced missing heritability concern. We propose that this difference might be attributed to environmentally-induced heterogeneities in genetic effect size being more prevalent in the case of BMI, in turn explaining the lack of power to detect obesity-related loci. Arguably, this limitation can be avoided in real GWAS through the inclusion of covariates (e.g., variables that capture nutrition and physical activity levels per individual in a GWAS for obesity). However, the potential covariates to be included are often unknown or not available for all the cohorts, as for example in the largest meta-analyses for height and BMI (Lango Allen et al., 2010; Speliotes et al., 2010).

We explore a genetic architecture and a range of perturbation parameters that are based on empirical data, which strengthens the validity of our observations. However, the present study is not devoid of weaknesses. Among others, we have used the same sample size in all the simulated GWAS and GxE studies. This comes at a price, since the range of perturbations that result in significant GRSxE would certainly change if larger studies were performed. Second, we performed simulations of random mating populations with genotypic proportions following strict Hardy-Weinberg equilibrium (HWE). This procedure follows the usual practice consisting of screening polymorphisms for HWE. Nonetheless, confounding of population structure with environmental variability, further complicating the detection of GxE in real studies, remains a possibility. Third, we explored the presence of interactions through unweighted GRS that do not take into account the effect size of each variant. Since only a few variants present notably large effects (**Figure 2**), in reality weighted and unweighted risk scores are very highly correlated once more than a few dozen loci are incorporated, which minimizes the loss of power to detect GRSxE effects compared to weighted risk scores. Finally, it should be noted that we only simulate causal variants instead of tagging SNPs, which effectively over-estimates effect sizes relative to those discovered in true GRS.

In summary, the present study constitutes a preliminary evaluation of a realistic mechanism by which gene-by-environment interactions may have altered the genetic etiology of human traits. A widespread presence of realistic GxE effects could only be detected by genetic risk scores calculated upon all variants discovered by GWAS. The extent to which these effects have shaped real human traits remains an open question, and should be studied in future research.

### **AUTHOR CONTRIBUTIONS**

GG conceived the original idea. UMM and GG designed the study. UMM performed the simulations. UMM and GG interpreted the data and wrote the paper.

### **ACKNOWLEDGMENTS**

We acknowledge Kevin Lee and other colleagues from Gibson's lab and Isabel Mendizabal for their helpful comments during this work. Urko M. Marigorta and Greg Gibson are supported by Project 3 of NIGMS P01 GM099568 (B. Weir, U. Washington, PI).

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fgene.2014.00225/abstract

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 April 2014; paper pending published: 03 June 2014; accepted: 28 June 2014; published online: 21 July 2014.*

*Citation: Marigorta UM and Gibson G (2014) A simulation study of gene-by-environment interactions in GWAS implies ample hidden effects. Front. Genet. 5:225. doi: 10.3389/fgene.2014.00225*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Marigorta and Gibson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Disrupted human–pathogen co-evolution: a model for disease

### *Nuri Kodaman<sup>1,2</sup>, Rafal S. Sobota<sup>1,2</sup>, Robertino Mera<sup>3</sup>, Barbara G. Schneider<sup>3</sup> and Scott M. Williams<sup>1</sup>\**

*<sup>1</sup> Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA*

*<sup>2</sup> Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, TN, USA*

*<sup>3</sup> Division of Gastroenterology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Dana Crawford, Vanderbilt University, USA*

*Sebastien Gagneux, Swiss Tropical and Public Health Institute, Switzerland*

#### *\*Correspondence:*

*Scott M. Williams, Department of Genetics, Geisel School of Medicine, Dartmouth College, HB-6044, Hanover, NH 03755, USA e-mail: scott.williams@dartmouth.edu*

A major goal in infectious disease research is to identify the human and pathogenic genetic variants that explain differences in microbial pathogenesis. However, neither pathogenic strain nor human genetic variation in isolation has proven adequate to explain the heterogeneity of disease pathology. We suggest that disrupted co-evolution between a pathogen and its human host can explain variation in disease outcomes, and that genome-by-genome interactions should therefore be incorporated into genetic models of disease caused by infectious agents. Genetic epidemiological studies that fail to take both the pathogen and host into account can lead to false and misleading conclusions about disease etiology. We discuss our model in the context of three pathogens, *Helicobacter pylori*, *Mycobacterium tuberculosis* and human papillomavirus, and generalize the conditions under which it may be applicable.

**Keywords: host–pathogen co-evolution, human disease,** *Helicobacter pylori***,** *Mycobacterium tuberculosis***, human papillomavirus, genome–genome interactions**

### **INTRODUCTION**

Human response to infectious agents is known to be highly heritable, but identifying the genetic variants responsible for differences in disease susceptibility has proven difficult. Pathogenic variation has, in some cases, become a better predictor of disease outcome, but it too does not sufficiently predict whether a given individual or class of individuals will present with disease. Thus far, genetic epidemiological studies of infectious disease have typically sought to explain the inter-individual variation in disease phenotypes by assessing genetic factors in humans or pathogens alone, under the implicit assumption that these factors have effects that are essentially independent of each other. Here, we argue that genome-by-genome interactions between host and pathogen are likely to play a major role in infectious disease etiology, and as such, should be incorporated into genetic epidemiological models. In short, insofar as host and pathogen jointly determine disease phenotypes, no genetic variant in either should be considered harmful without taking the context of the other into account.

The term "interaction" has two related but distinct meanings in the context of infectious disease, one molecular, and one statistical. Here we refer mainly to the statistical meaning of the term. At the individual level, all aspects of pathogenesis involve molecular interactions of varying importance, e.g., between a pathogenic epitope and a host receptor. Such interactions can be detected statistically, however, only when multiple variants exist in a population and when specific pairings lead to different effects. In some cases, pathogenic variants may function independently of host variation, and vice versa. However, because many pathogens have co-existed with their human hosts for millennia and have likely co-evolved with them, we argue here that statistical interactions, where appropriately sought, will often be found, with profound biomedical implications.

Recent advances in genomics have provided both the impetus and the means to evaluate human–pathogen coevolutionary hypotheses directly. Whole-genome sequencing of many pathogenic species has substantially improved the resolution with which we classify strains, and facilitated the detection of potentially virulent genetic variants. A clearer picture of microbial evolution has also emerged, marked by selective mechanisms such as rapid gene gain/loss and horizontal gene transfer (Pallen and Wren, 2007). Overlaying human genetic variation onto this emerging evolutionary picture of microbial diversity offers the potential to make the pathogenic process more transparent.

The past few decades have also seen an explosion in studies seeking to identify human susceptibility loci for infectious diseases (Rowell et al., 2012). Candidate gene and family-based linkage studies have identified several common polymorphisms with clinical significance at the population level, such as the *CCR5* deletion that protects against HIV (Samson et al., 1996; Picard et al., 2006; Casanova and Abel, 2007). However, most human susceptibility is in fact polygenic, with individual polymorphisms conferring small marginal effects (Hill, 2001). Where infectious disease phenotypes deviate from the "one susceptibility locus – one infection" model, elucidating the genetic architecture underlying inter-individual variation has proven elusive.

While genome-wide association studies (GWAS) may be better designed to accommodate multifactorial phenotypes, those performed thus far on infectious diseases have typically been less informative than GWAS performed on complex noncommunicable diseases (Jallow et al., 2009; Hill, 2012; Ko and Urban, 2013). A major challenge facing the GWAS of infectious disease has been the recruitment of a sufficient number of cases and matched controls to achieve adequate statistical power (Hill, 2012; Ko and Urban, 2013). Another potential drawback, and the one that concerns us here, is the fact that many infectious disease phenotypes depend on complex interactions between host and pathogen genomes. In such cases, the pooling together of human samples infected with even subtly different pathogenic strains can obscure genetic associations (Hill, 2012; Ko and Urban, 2013). A problem common to all GWAS is that the statistical effect sizes of biologically meaningful polymorphisms are often too small to pass significance thresholds after correction for multiple testing. This problem is exacerbated, however, when human polymorphisms (or networks of polymorphisms) (Wilfert and Schmid-Hempel, 2008) confer variable, or even opposite effects in the context of different pathogenic strains within the same study cohort. In this regard, it is perhaps telling that the most successful GWAS performed on infectious disease susceptibility to date have been on leprosy; the signal-to-noise ratios in these association studies may be higher because *Mycobacterium leprae* exhibits substantially less genetic heterogeneity than many other pathogens (Monot et al., 2009; Hill, 2012).

There is in fact strong empirical and theoretical justification for the hypothesis that the effects of susceptibility and virulence alleles in the respective gene pools of humans and pathogens are often contingent upon each other. The evolution of virulence is a dynamic process, easily perturbed by extrinsic variables over space and time, and therefore unlikely to follow the same trajectory in every population. For example, a spike in the density of hosts available for transmission can select for increased virulence, by reducing the cost of lethal harm (Anderson and May, 1982). If a pathogen is transmitted vertically (parent to child), the genetic factors that affect pathogenicity are "co-inherited" by host and pathogen, often promoting commensalism (Frank, 1996; Messenger et al., 1999). Even in these cases, the adventitious introduction of a microbial competitor can induce a commensal species to evolve a defensive toxin that harms the host, if only incidentally (Blaser and Kirschner, 2007; Frank and Schmid-Hempel, 2008). The evolution of defenses against pathogenic harm must also navigate fitness tradeoffs that vary with population, including tradeoffs pertaining to the correlated nature of complex traits (Lambrechts et al., 2006). As pathogens evolve rapidly, exerting strong selective pressures on different human populations, host phenotypes will respond in the *ad hoc* manner typical of evolution, limited by the available genetic variation at hand (Jacob, 1977). Whether the result is a steady-state equilibrium due to a perpetual "arms race" or a commensal detente, the same genes and pathways are unlikely to be involved in every population. As a consequence, when humans and pathogens migrate to new environments or admix, the ensuing disruption of co-evolutionary equilibria and loss of complementarity between host and pathogen genotypes may yield unpredictable and potentially deleterious biomedical consequences.

Our emphasis on the significance of mismatched traits is consistent with the genetic mosaic theory of co-evolution, which aims to account for why virtually all co-evolutionary interactions observed in natural populations show spatial variation in outcomes (Thompson et al., 2002; Thompson, 2014). The theory posits that co-evolution occurs in the context of geographically distinct "selection mosaics," each characterized by a unique genetic and environmental profile, where environmental variables can include both biotic and abiotic factors. Every selection mosaic progresses toward its own co-evolutionary equilibrium, while gene flow between selection mosaics ensures that patterns of maladaptation will be common and detectable where properly studied (Thompson et al., 2002; Ridenhour and Nuismer, 2007).

Despite the likely etiological importance of human–pathogen co-evolution, attempts at empirical confirmation have been rare. Indeed, "proof" of co-evolution poses a formidable challenge, requiring a demonstration of increased reproductive fitness in each species driven by reciprocal changes in two genomes over time (Woolhouse et al., 2002). Although these criteria have been met in laboratory studies and in some natural populations (Lenski and Levin, 1985; Little, 2002; Little et al., 2006), a similarly rigorous assessment of human–pathogen co-evolution must accommodate long generation times and the genetic and phenotypic complexity of the human traits under selection. Nonetheless, substantial phenomenological evidence consistent with human–pathogen co-evolution now exists, including evidence of spatial patterns of parallel genetic variation between species, and of correlated functional changes at the molecular level (Kraaijeveld et al., 1998; Lively and Dybdahl, 2000; Funk et al., 2000; Woolhouse et al., 2002). The collection of high-density genomic data in paired human–pathogen samples and improvements in phenotypic data, as well as advances in pathogen genomics, should soon enable more explicit tests of the concept.

Our aim here is to summarize the growing body of evidence in favor of the hypothesis that genetic interactions driven by host and pathogen co-evolution can have significant implications for genetic epidemiological studies and biomedicine. While this is not a novel hypothesis, it remains understudied. We also underscore how recent advances in genomic technology provide new opportunities to test for genome-by-genome interactions, and offer suggestions on how to incorporate them into more accurate genetic models of disease.

### *HELICOBACTER PYLORI*

Studies of *Helicobacter pylori* provide perhaps the best evidence in favor of human–pathogen co-evolution, and distinctly illustrate the power of the modern genetic toolkit to investigate it. *H. pylori* chronically infects the gastric epithelia of half the world's population, causing peptic ulcers in 10–20% of those infected, and distal gastric carcinoma in ∼1% (Peek and Blaser, 2002; Jemal et al., 2011). The majority of individuals infected, however, suffer only from superficial gastritis in adulthood, while likely gaining protection against diseases such as esophageal cancer and reflux esophagitis, and more controversially, childhood asthma and diarrhea (Rothenbacher et al., 2000; Vaezi et al., 2000; Blaser et al., 2008). That *H. pylori* should have a largely innocuous and potentially symbiotic relationship with its host follows from coevolutionary theory, based on its vertical mode of transmission, its long-term colonization of a single host, and its ∼50,000 year association with *Homo sapiens* (Rothenbacher et al., 2002; Moodley et al., 2012). Why a fraction of individuals develop life-threatening clinical disease, on the other hand, requires explanation, with one possibility being the disruption of long-standing co-evolutionary relationships.

Although *H. pylori*-mediated diseases often advance to the clinical stage in late adulthood, their onset typically occurs during reproductive years (Correa et al., 1976; Susser and Stein, 2002). Importantly, a disease need not have an especially large selection coefficient to shape allele frequency distributions in populations, especially over thousands of years (Ewald and Cochran, 2000). In fact, the historical fitness load of peptic ulcers, obtained by multiplying prevalence by selection coefficient, has been estimated to be similar to those for infectious diseases such as meningitis and rubella (Cochran et al., 2000). Also consistent with co-evolutionary theory is the fact that *H. pylori*-mediated gastric diseases occur disproportionately in men (Susser and Stein, 2002; Engel et al., 2003); *H. pylori* is usually, but not necessarily, transmitted by the mother, such that female fitness has likely exerted a stronger constraint against *H. pylori* virulence.

Some *H. pylori* virulence factors appear to increase the risk of serious clinical outcome regardless of host genotype. The *cag* pathogenicity island, present in some strains, encodes a type IV secretion system, and *VacA* encodes a pore-forming cytotoxin. Both have been implicated as carcinogenic risk factors, though neither is a necessary nor sufficient one (Wroblewski et al., 2010). Other virulence factors released by *H. pylori* include urease, which facilitates neutralization of the otherwise forbidding acidity of the gastric mucosa; NAP, which enables iron uptake; and arginase, which helps *H. pylori* subvert host macrophages. These, like most *H. pylori* virulence factors, operate to create a basal inflammatory state without generating an excessive immune response. Serious clinical disease reflects a disturbance of this balance (Baldari et al., 2005; Blaser and Kirschner, 2007; Salama et al., 2013).

The maintenance of this balance also depends partly on human genetic factors (Lichtenstein et al., 2000; Chiba et al., 2006; Mayerle et al., 2013a). Candidate gene studies on *H. pylori*-mediated diseases have implicated several gene polymorphisms that appear to affect risk, most notably in the interleukin-1 (IL-1) family of cytokines (Schneider et al., 2008). Recently, two GWAS assessing susceptibility to gastric cancer and *H. pylori* infection identified SNPs with odds ratios ranging from 1.3 to 1.4, mostly of uncertain biological function (Shi et al., 2011; El-Omar, 2013; Mayerle et al., 2013b, **Table 1**). These polymorphisms account for only a small proportion of the estimated heritability of disease phenotypes.

Studies of human or *H. pylori* genetics in isolation have generally failed to explain why populations with similar rates of *H. pylori* infection exhibit strikingly different susceptibilities to gastric cancer. For example, in many African and South Asian countries, the low incidences of gastric cancer in the presence of almost universal rates of *H. pylori* infection remain a source of much speculation, and have been referred to collectively as the "African enigma" and the "Asian enigma" (Holcombe, 1992; Campbell et al., 2001; Ghoshal et al., 2007). In Latin America, where *H. pylori* strains native to Amerindian populations have been largely displaced by European strains (Dominguez-Bello et al., 2008; Correa and Piazuelo, 2012), the predominantly Amerindian populations living at high altitudes suffer disproportionately from gastric cancer relative to other populations with similar infection rates (de Sablet et al., 2011; Torres et al., 2013). These and other points of evidence raise the possibility that the pathogenicity of a given *H. pylori* strain may vary with human genomic variation, and that some individuals may be better adapted to their infecting strains than others.

Modern genomic techniques have made the assessment of such hypotheses feasible. Over the past two decades, a comprehensive phylogeography of *H. pylori* has been constructed using multilocus sequence typing (MLST), a procedure by which polymorphisms in fragments from housekeeping genes are used to characterize bacterial isolates (Maiden et al., 1998). Analyses of samples from around the world have revealed a strong concordance between *H. pylori* phylogenetic clusters and the geographical locations from which they are derived (Falush et al., 2003; Moodley and Linz, 2009; Moodley et al., 2009). Ancestral *H. pylori* sequences inferred using MLST data also correspond to geographically defined human populations (Falush et al., 2003; Moodley et al., 2012). The typical modern *H. pylori* chromosome is now understood to be an amalgam of fragments from multiple ancestral sequences, a consequence of *H. pylori's* high recombinogenicity (Suerbaum et al., 1998; Falush et al., 2003). The genome of an *H. pylori* isolate can thus be quantitatively resolved into ancestral proportions, which correlate with proportions of human ancestry in admixed populations (Kodaman et al., 2014). In some cases, the ancestries of *H. pylori* isolates outperform human mitochondria in differentiating ethnic groups (Wirth et al., 2004).

These shared patterns of ancestry are unlikely to have arisen merely from parallel divergence due to founder effects or neutral drift. Certainly, the well-documented evolvability of functional loci within *H. pylori* strains, even within single individuals over a 6-year span, argues for the importance of adaptive microevolution (Israel et al., 2001; Dorer et al., 2009). Furthermore, at least 25% of known genes, including genes involved in mucosal adherence and the evasion of host immunity, are absent in some *H. pylori* strains isolated from different ethnic groups (Salama et al., 2000; Gressmann et al., 2005). In at least one case, variants of an *H. pylori* gene (*babA2*) encode adhesion proteins that exhibit host-specific effects, a hallmark of co-evolution. BabA binds to blood group antigens, triggering the release of proinflammatory cytokines. Notably, Amerindians, who almost all carry blood group O, harbor strains with a BabA variant that has up to a 1500-fold greater binding affinity to blood group O (Aspholm-Hurtig et al., 2004).

If we conclude from these patterns of genetic covariation that co-evolution between humans and *H. pylori* has occurred and that it has promoted commensalism, then we may ask whether individuals who develop serious clinical disease have inherited mutually ill-adapted sets of host and pathogen alleles. Under this hypothesis, we should expect to find significant interactions between specific pairs of host and pathogen loci in disease models. Toward this end, candidate pairs of loci can be tested based on biochemical evidence of protein–protein interactions, such as those between the adhesin BabA and the Lewis(b) antigen, its epithelial receptor (Backstrom et al., 2004). However, the effect size of any single two-locus interaction may be relatively small, as gastric disease etiology is phenotypically heterogeneous, and likely to be influenced by a large number of human and *H. pylori* genes (El-Omar, 2013). Thus, characterizing the relevant loci in a biologically meaningful way will ultimately require a systems biological approach.
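One way to make the expected host-by-pathogen interaction concrete is a logistic disease model with a cross-genome product term. The formulation below is a minimal sketch and not a model taken from the studies cited here; the symbols $D$ (disease status), $g_h$ (coded genotype at a host locus), $g_p$ (coded genotype at a pathogen locus), and the coefficients are notation assumed for illustration only.

$$\operatorname{logit} P(D = 1 \mid g_h, g_p) = \beta_0 + \beta_h g_h + \beta_p g_p + \beta_{hp}\, g_h g_p$$

Under this sketch, the hypothesis of mutually ill-adapted alleles corresponds to rejecting $H_0\colon \beta_{hp} = 0$, whereas the marginal terms $\beta_h$ and $\beta_p$ capture whatever effect each genome has when considered in isolation.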


**Table 1 | Genetic variants identified by GWAS for phenotypes related to infection by** *H. pylori***,** *M. tuberculosis***, and human papillomavirus.**

<sup>1</sup>*OR, odds ratio.*

<sup>2</sup>*CI, confidence interval.*

We recently took a broad-based view to assess the impact of human – *H. pylori* co-evolution on gastric disease, using ancestry estimates from both humans and their *H. pylori* isolates in the absence of knowledge of specific interacting loci (Kodaman et al., 2014). Our study participants were recruited from two Colombian populations with highly different rates of gastric cancer, despite a nearly universal prevalence of *H. pylori* infection in both. We found that the low-risk, coastal human population was of admixed African, European, and Amerindian ancestry, whereas the high-risk, Andean population was mainly of Amerindian ancestry, with a minority of European ancestry. Severity of gastric disease correlated with the proportion of African *H. pylori* ancestry in patients with primarily Amerindian ancestry. On the other hand, patients with a large proportion of African human ancestry infected by African *H. pylori* strains had the best prognoses, consistent with ancestral coadaptation, and likely pertinent to the "African enigma." The interaction between Amerindian human ancestry and African *H. pylori* ancestry accounted for the difference in disease risk between mountain and coastal populations, whereas even the well-known virulence factor, CagA, did not. These findings are thus consistent with the idea that neither human nor *H. pylori* genetic variation confers susceptibility or virulence *per se*, but only in context (**Figure 1**).

These findings also bring to light how understanding coevolutionary interactions can inform and improve public health measures. It has been suggested that because *H. pylori* dominates the gastric microbiome in infected persons and has been shown to confer some beneficial effects, large-scale antibiotic eradication programs may not be warranted (Bik et al., 2006; Hung and Wong, 2009). Simply estimating ancestry from human samples and *H. pylori* isolates may help to identify individuals at greatest risk for gastric cancer, for whom antibiotic treatment may be most appropriate.

### *MYCOBACTERIUM TUBERCULOSIS* **COMPLEX**

Another interesting candidate to study from a co-evolutionary perspective is *Mycobacterium tuberculosis* (Mtb) and closely related species, believed to have co-existed with anatomically modern humans for ∼70,000 years (Comas et al., 2013). Since the advent of antibiotics, tuberculosis (TB) has ceased to be as common a cause of human mortality as it once was, but it remains among the most deadly infectious diseases worldwide, with immunocompromised individuals at particularly high risk (Dye and Williams, 2010; Fenner et al., 2013). As with *H. pylori*, the majority of Mtb infections do not develop into clinical disease: 90% of cases are asymptomatic with only latent infection. However, 10% of individuals with latent infections develop TB over their lifetime, for mostly unknown reasons (Barry et al., 2009).

In contrast to *H. pylori,* Mtb is transmitted horizontally, and must cause active disease to be transmitted (e.g., *via* coughing or sneezing). Because Mtb transmission increases with virulence, evolutionary theory predicts that strong selective pressures should favor increased virulence until the number of transmissions per infected host reaches a fitness-reducing limit (Knolle, 1989; Frank and Schmid-Hempel, 2008). Such a limit necessarily depends on population-specific parameters, of which host density is probably the most important (Comas et al., 2013). Thus, the limited pathogenicity and chronicity of Mtb likely reflect its historical adaptation to isolated, low-density human populations. These historical conditions remain relevant in part because Mtb reproduces clonally and without lateral gene transfer; evolution only through point mutations and irreversible gene deletions limits a pathogen's ability to shift virulence strategies rapidly in response to changing population parameters (Achtman, 2008; Galagan, 2014).

Before advances in genotyping technology improved strain classification, the apparent genetic homogeneity of Mtb led investigators to believe that variation in disease outcome depended primarily on environmental and human genetic factors (Galagan, 2014). Twin and adoption studies provided compelling evidence for the involvement of human genetic variation as a risk modifier (Comstock, 1978). The most recent analyses have calculated the heritable component of Mtb-related immune response phenotypes to range from 30 to 71% (Moller and Hoal, 2010). These findings have motivated a large number of linkage and candidate gene association studies seeking to identify relevant susceptibility loci, but results have often been inconclusive or, worse, contradictory. Many biologically plausible genes, such as those that encode vitamin-D-binding protein (Lewis et al., 2005; Gao et al., 2010), the phagolysosomal membrane protein NRAMP/SLC11A1 (Hoal et al., 2004; Velez et al., 2009), and the dendritic adhesion molecule DC-SIGN (Barreiro et al., 2006; Olesen et al., 2007), appear to associate with TB in some human populations, but not others. Inconsistent replication across ethnic groups has also beset the handful of GWAS performed on TB (Chimusa et al., 2014). The few loci that have passed genome-wide significance thresholds also lack clear biological interpretability and fail to explain more than a trivial portion of the estimated heritable component of TB susceptibility (Thye et al., 2010, 2012, **Table 1**).

Since the advent of PCR-based genotyping techniques, it has become increasingly clear that Mtb genetic variation is non-trivial and clinically consequential (Malik and Godfrey-Faussett, 2005; Nicol and Wilkinson, 2008). Most notably, strains now recognized as part of the "Beijing family," first genotyped in the 1990s following several drug-resistant outbreaks, have been found to exhibit greater efficiency of transmission and to cause more severe disease phenotypes in many animal models (Glynn et al., 2002; Reed et al., 2004; Parwati et al., 2010). Whole-genome sequencing of a large number of clinical Mtb isolates has since revealed over 30,000 Mtb SNPs, a large proportion of which are non-synonymous (Comas et al., 2013; Stucki and Gagneux, 2013). It has been shown that even a few such SNPs can shift a strain from avirulent to virulent (Reiling et al., 2013).

High-throughput sequence data have also enabled the construction of a robust phylogenetic tree, the major branches of which parallel human mitochondrial phylogeny (Comas et al., 2013). Seven major human-adapted Mtb lineages have now been identified, which can be classified as "ancient" or "modern" (Hershberg et al., 2008; Comas et al., 2013). The Beijing family of strains, which causes 50% of infections in East Asia and 13% worldwide, belongs to the most modern lineage. In contrast, *Mycobacterium africanum*, which causes up to half of TB cases in West Africa, belongs to the most ancient Mtb clade, its divergence predating the human migration out of Africa (de Jong et al., 2010). Although strains within all major Mtb lineages induce an overlapping range of immune responses, clade-specific patterns of virulence are emerging. For example, evolutionarily modern lineages appear to induce a less severe early inflammatory response, which possibly increases the efficiency of transmission (Moller and Hoal, 2010; Portevin et al., 2011). A large number of studies in experimental models have also confirmed that diverse Mtb strains reflect substantial functional diversity (Coscolla and Gagneux, 2010).

It is thus likely that genetic factors in both Mtb and humans influence a wide range of TB phenotypes, including those pertaining to infectivity, progression from latent to active disease, and effectiveness of treatment (de Jong et al., 2008; Comas and Gagneux, 2011). However, whether Mtb genetic variation influences disease outcome independently of human genetic variation, and vice versa, is a question that has only recently been addressed (Gagneux, 2012). The mirrored pattern of human and Mtb phylogeography indicates that co-evolution has likely occurred, and consequently, that genome-by-genome interactions may be significant. However, identifying these interactions and assessing their clinical relevance requires the demonstration of heterogeneous outcomes in paired human and Mtb samples of multiple genotypic backgrounds. A small number of published studies to date have met this criterion, assessing previously implicated loci (e.g., in immunogenicity pathways). A study in a Vietnamese cohort found that a variant of the Toll-like receptor 2 (TLR2), known to trigger a cytokine cascade upon recognition of Mtb, increased TB susceptibility only in patients infected with a Beijing strain (Caws et al., 2008). In a Ghanaian cohort, a polymorphism in the immunity-related GTPase M (*IRGM*) gene conferred protection against the European lineage of *M. tuberculosis*, but not *M. africanum* (Intemann et al., 2009). Perhaps of consequence, a gene deletion in the European Mtb strains increases their vulnerability to the autophagy pathway, mediated by IRGM. Thus, the high frequency of the human *IRGM* polymorphism in West Africa has been proposed to explain the competitive advantage of *M. africanum* there (Intemann et al., 2009). The innate immunity-related genes *ALOX5* and *MBL* have also been shown to influence the infectivity of *M. africanum*, but not other strains, in Ghanaian populations (Herb et al., 2008; Thye et al., 2011).

Despite being an ancient strain with ample opportunity to spread beyond West Africa, *M. africanum* has not done so, possibly indicating host-specific adaptation (de Jong et al., 2010; Gagneux, 2012). Other Mtb lineages also appear to associate preferentially with particular human populations, though not as exclusively. A study of ethnically diverse, US-born patients in San Francisco showed that such preferential associations with Mtb lineages persisted even in a cosmopolitan setting (Gagneux et al., 2006). Interestingly, when TB transmission in non-sympatric populations did occur, patients were significantly more likely to be immunocompromised, indicating that non-sympatric Mtb lineages may require some degree of host immunosuppression to compete with sympatric lineages. Mechanisms of Mtb immune evasion, therefore, may have been shaped by population-specific variation in human immune response.

While the above discussion has focused mainly on pulmonary TB, we note here that extra-pulmonary TB, a less common and more severe form of disease, may be especially amenable to analyses guided by co-evolutionary hypotheses. This form of the disease leads more quickly to fatality and results in fewer transmissions than the pulmonary form (Sharma and Mohan, 2004), which probably represents a non-optimal outcome in terms of Mtb fitness. However, data on extra-pulmonary TB to support co-evolutionary hypotheses – especially historical data pre-dating the antibiotic era and the HIV epidemic – are at present lacking (Tiemersma et al., 2011).

### **HUMAN PAPILLOMAVIRUS**

Human papillomavirus (HPV) is the most common sexually transmitted infectious agent in the world, and the second most common infectious cause of cancer after *H. pylori* (de Martel et al., 2012). Cervical cancer is the major source of mortality associated with HPV, but the virus also causes cancers of the anus, vagina, penis, and oropharynx (zur Hausen, 1989; zur Hausen, 1991; Carter et al., 2001; de Martel et al., 2012). Although over 100 types of papillomaviruses infect humans, only a fraction of them are carcinogenic (Bernard et al., 2010). Infection with two specific types, HPV 16 and HPV 18, accounts for approximately 70% of cervical cancer cases worldwide, with the remainder of cases largely attributable to 14 other types (Bernard et al., 2010). Nevertheless, the great majority of infections with even carcinogenic HPV types are ultimately benign, demonstrating that HPV infection, although necessary, is not sufficient to cause cervical cancer (Schiffman et al., 2005; Plummer et al., 2007).

Papillomaviruses (PVs) are notable for their slow rate of evolution relative to other pathogens – only an order of magnitude higher than humans, in the case of HPV (Ong et al., 1993; Rector et al., 2007; Shah et al., 2010). This is commonly attributed to their use of high-fidelity host replication mechanisms (Van Doorslaer, 2013). A slow evolutionary rate precludes rapid adaptation to new hosts, and PV strains correspondingly show little evidence of interspecies transmission or related horizontal gene transfer (Herbst et al., 2009; Shah et al., 2010; Van Doorslaer, 2013). All carcinogenic types of HPV belong to a single genus of papillomaviruses that diverged from a common ancestor about 75 million years ago, predating the primate lineage (Rector et al., 2007; Van Doorslaer, 2013). By the emergence of *H. sapiens*, the common ancestor of HPV 16 and HPV 18 had diverged into separate species, and in fact HPV 16 and HPV 18 had already diverged from all other HPV types within their respective species clades (Lewin, 1993; Ong et al., 1993). Given this combination of early divergence, slow evolution, and strict host specialization, we would expect variants within HPV types independently to have similar phylogeographic patterns to that of *H. sapiens*. Global data collected for the two most frequently sexually transmitted types, HPV 16 and 18, reflect such a pattern (Bernard, 1994). The subtypes and variants of HPV 16 cluster into five major branches of a phylogenetic tree: European (E), Asian/American (AA), East Asian (As), and two African (Af1 and Af2) (Ho et al., 1993; Ong et al., 1993). Subtypes and variants of HPV 18 cluster into three major branches: African (Af), European (E), and Asian + American Indian (As+AI) (Ong et al., 1993).

Biochemical and bioinformatic analyses indicate that HPV evolution has not been entirely neutral. Viral genes expressed early during a PV infection, for example, appear to have evolved at different rates than those expressed late (Garcia-Vallve et al., 2005; Rector et al., 2007). Although most PV genes show signs of strong purifying selection, the exceptions appear to be important (DeFilippis et al., 2002; Chen et al., 2005; Carvajal-Rodriguez, 2008). Two genes under diversifying selection, *E6* and *E7*, are essential for viral replication. They induce cell cycle progression in host cells, and encode proteins that, in the high-risk HPVs, are oncogenic (White et al., 1994; Doorbar, 2006; Klingelhutz and Roman, 2012). Of note, E6 and E7 interfere with the human tumor suppressor proteins, pRB and p53 (Dyson et al., 1989; Huibregtse et al., 1993a,b; Storey et al., 1998; Munger et al., 2004; Doorbar, 2006). In turn, polymorphisms in the human p53 gene were shown to modulate the tumorigenicity of HPV 16 and 18 (Storey et al., 1998). Patients homozygous for the p53Arg mutation were seven times more likely to develop cervical cancer than individuals with 1 or 2 p53Pro alleles (Storey et al., 1998). Other human polymorphisms, such as those in the genes *RPS* and *TYMS*, influence HPV transmissibility. In a study of high-risk HPV infections in Nigerian women, variants in these genes were shown to modulate risk of infection with HPV 16 and 18. Despite the effects described above, genetic variation in neither the host nor the pathogen has been successful in explaining most heritable risk of HPV-associated disease, when considered in isolation (Magnusson et al., 2000; Hildesheim and Wang, 2002; Wheeler, 2008; Chen et al., 2013; Shi et al., 2013, **Table 1**).

Because the integration of the HPV genome within the human genome is permanent, death of the host ends all possibility of viral multiplication and transmission. Even strains that damage the health of the host sufficiently to reduce human-to-human sexual contact can suffer a competitive disadvantage. Therefore, both host and pathogen should cooperate to prevent severe disease. As with *H. pylori* and Mtb, there is some empirical evidence supporting the idea that humans and HPV types co-evolved to limit tumorigenesis, and that evolutionarily mismatched strains may be driving severe clinical outcomes. A study of high-grade cervical intraepithelial neoplasia (CIN) and invasive cervical cancer in an Italian cohort of Caucasian women demonstrated that non-European variants of HPV 16, Af1 and AA, were found at an increased frequency in invasive lesions (Tornesello et al., 2004). A separate study of mostly Caucasian (81%) female university students in the United States showed that those infected with non-European HPV 16 variants were 6.5 times more likely to develop high-grade CIN than those with European variants (Xi et al., 1997). The same study demonstrated a similar HPV 16-related risk profile (4.5 relative risk) in a predominantly Caucasian (79%) population of women presenting at a sexually transmitted disease clinic (Xi et al., 1997). Finally, at the molecular level, there is some evidence that variants of the HPV 16 E6 protein, described above, may be better adapted for replication within specific hosts (DeFilippis et al., 2002).

### **DISCUSSION**

Taken together, the three examples above illustrate how coevolution can promote a reduction in antagonism between pathogen and host, and in doing so leave discernible signatures on the genomes of both species. If, as we argue here, the disruption of historical co-evolutionary relationships can explain many differences in disease outcomes, knowledge of the conditions under which such relationships arise and dissolve will be helpful in defining genetic architecture of disease etiology. The applicability of this model depends, to a large extent, on the degree of integration between host and pathogen genomes, which can take many forms.

A long-standing association between humans and pathogens may be a necessary factor for cross-genomic integration, as with the three pathogens we have discussed. In contrast, many infectious diseases that occur epidemically are caused by zoonotic pathogens for which the human host is an evolutionary dead end, such as *Salmonella enterica* and *Borrelia burgdorferi* (Sokurenko et al., 2006; Falush, 2009). Other pathogens have had limited occasion to co-evolve with humans, because they cause disease primarily on an opportunistic basis (e.g., *Streptococcus pneumoniae* or *Clostridium difficile*) or over a broad range of hosts (e.g., *Toxoplasma gondii*) (Ajzenberg et al., 2004; Sokurenko et al., 2006). The epidemic outbreaks caused by these pathogens may leave detectable signatures on the human genome, but reciprocal evolution in the pathogen need not occur.

For human-specific pathogens that cause endemic diseases and are not recent, the likelihood that severe disease is the outcome of a co-evolutionary mismatch should increase with the overlap between host and pathogen fitness. The pathogenicity of vertically transmitted pathogens, for example, should decrease over time, because such pathogens often depend on host survival (and possibly reproduction) for transmission. However, a strong overlap between host and pathogen fitness can also exist in the absence of vertical transmission. A horizontally transmitted pathogen, such as HPV, can evolve to be largely benign insofar as it depends on a healthy host for transmission.

When a pathogen's fitness depends on its ability to cause damage to its human host, as with Mtb, attenuated antagonism becomes a special case, and its disruption becomes more difficult to detect and requires more evidence to confirm. While Mtb strains that increase the duration of a transmissible state will generally have a competitive advantage, the optimal duration can be expected to vary based on many population-level parameters, such as host density. This probably explains why modern Mtb lineages that are more common in high-density urban populations exhibit greater virulence. On the other hand, if horizontal transfer is confined to small, isolated populations, it may be considered effectively vertical. With such pathogens, a better understanding of the co-evolutionary history will be necessary to infer whether severe disease is caused by disrupted co-evolution or by another factor, such as infection by a universally more virulent strain or an opportunistic infection in an immunosuppressed patient.

The life history of the pathogen is also important in assessing the possibility and nature of co-evolution. A pathogen typically faces a tradeoff between fecundity and longevity. Increased fecundity within a host increases the probability (or rate) of transmission, but may negatively affect host lifespan or mobility (Frank and Schmid-Hempel, 2008). Therefore, a pathogen's position on the continuum between greater fecundity and increased longevity will often reflect the degree to which its fitness depends on the health of the host. The case of HPV is somewhat of an exception in this regard. Host immune responses can induce diverse strategies, creating HPV types that are highly fecund, or less fecund with few virions per host. Whereas highly fecund types are more likely to transmit, they are also more likely to induce a vigorous immune response leading to clearance. Low-fecundity types, on the other hand, are more likely to persist as subclinical infections that can lead to prolonged inflammation and eventually cancer (DeFilippis et al., 2002). However, human populations that co-evolved with specific variants of these persistent types may be less likely to develop cancer, as described above.

Another factor influencing the applicability of the model we propose is a pathogen's recombinogenicity. In theory, a pathogen that recombines freely is more likely to be panmictic, and hence less likely to co-evolve with a particular human host population (Bull et al., 1991). In fact, epidemic disease outbreaks often follow recombination events, and the pathogens responsible for the epidemics often appear superficially clonal, likely reflecting the rapid proliferation of especially successful recombinant strains (Grigg et al., 2001; Heitman, 2006). Cases in point are *Neisseria meningitidis* (Falush, 2009) and the eukaryotic parasites *Toxoplasma gondii* and *Plasmodium falciparum*, which, though able to recombine sexually, exhibit surprisingly limited genetic diversity (Grigg et al., 2001). On the other hand, the strict clonality of Mtb and HPV has likely favored co-evolution, leading to reduced antagonism, while recombination in *H. pylori* can disrupt the co-evolutionary relationship favored by vertical transmission.

Recombination can also occur via horizontal gene transfer, as among species within the microbiome (Smillie et al., 2011; Ravel et al., 2011; Liu et al., 2012). This would suggest that co-evolution might be a relatively weak force in shaping microbiotal genetic variation. However, data possibly supporting human–microbiome co-evolution exist; for example, the strongest correlate of an individual's microbiotal identity is ethnicity (Benson et al., 2010; Human Microbiome Project Consortium, 2012). The extent to which this correlation is driven by mutual genetic factors is unclear, as recurring environmental exposure and frequent vertical transmission may also account for most, if not all, of it (Turnbaugh et al., 2009). Assessing whether the genomes of the microbiome and humans are integrated will be a key area of research, as it relates to co-evolution and disease risk (McFall-Ngai et al., 2013).

### **CONCLUSION**

While the prospect of introducing co-evolutionary interactions into genetic epidemiology models may appear to add a new layer of complexity to an already difficult problem, a co-evolutionary perspective should help us construct more precise and accurate hypotheses, improving our ability to find real and reproducible results. Importantly, co-evolved genes will not be neutral in either species, which may make their identification easier. Although many methods exist to find loci that are candidates to have evolved under selection (Aguileta et al., 2009; Karlsson et al., 2014), and these methods can assess the strength, timing, and direction of selection (e.g., balancing or positive), they are not at present well adapted to the study of joint patterns of selection.

If the ultimate goal is to find interacting genes that have co-evolved to be benign and are subsequently disrupted in disease, we will need to identify differential patterns of concerted selection in paired human and pathogenic loci from different populations. The limiting factor to the development of appropriate methods toward this end has probably been the lack of prospectively collected paired genetic data for humans and pathogens. Once these data are available, existing methods to detect epistasis within a species can be adapted for cross-species analyses in the absence of *a priori* biological hypotheses. Where evidence for selection exists, genetic variants can be filtered prior to analyses to detect epistasis. Framing hypotheses in the context of biochemical and bioinformatic functional evidence or pre-existing evidence for association can hone study design even further. For example, using paired data and pathogenic genetic variation as the outcome variable, novel epitopes have been discovered in association studies (Bartha et al., 2013). Such data can be used to mitigate the immense multiple testing burden incurred by a hypothesis-free approach to detecting genetic interactions.
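The size of the multiple testing burden mentioned above can be illustrated with simple arithmetic. The sketch below assumes an exhaustive scan in which every host locus is paired with every pathogen locus and a Bonferroni correction is applied; the function name and all panel sizes are hypothetical and chosen only for illustration.

```python
def bonferroni_threshold(n_host_loci, n_pathogen_loci, alpha=0.05):
    """Per-test significance threshold when every host locus is paired
    with every pathogen locus and a Bonferroni correction is applied."""
    n_tests = n_host_loci * n_pathogen_loci
    return alpha / n_tests


# Hypothetical panel sizes, not taken from any study cited in the text.
unfiltered = bonferroni_threshold(500_000, 1_000)  # all cross-genome pairs
filtered = bonferroni_threshold(1_000, 50)         # after filtering on prior evidence
print(f"unfiltered: {unfiltered:.1e}, filtered: {filtered:.1e}")
```

Under these assumed numbers, filtering both genomes on prior evidence relaxes the per-test threshold by four orders of magnitude, which is the kind of gain the filtering strategies described above are meant to deliver.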

Finally, we should note that the ultimate impact of this approach may extend beyond infectious diseases to what are traditionally considered non-communicable diseases. For example, we now recognize that both gastric and cervical cancers, as well as atherosclerosis, may have origins in infection (Libby et al., 2002; Porta et al., 2011). The number of such examples will certainly expand.

### **ACKNOWLEDGMENTS**

This study was supported by the National Center for Research Resources, grant UL1 RR024975-01, which is now at the National Center for Advancing Translational Sciences; National Cancer Institute Grant P01 CA28842, the Vanderbilt-Ingram Cancer Center, the Wendy Dio family and the T.J. Martell Foundation. Scott M. Williams was partially supported by P20 GM103534.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 June 2014; paper pending published: 30 June 2014; accepted: 05 August 2014; published online: 25 August 2014.*

*Citation: Kodaman N, Sobota RS, Mera R, Schneider BG and Williams SM (2014) Disrupted human–pathogen co-evolution: a model for disease. Front. Genet. 5:290. doi: 10.3389/fgene.2014.00290*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Kodaman, Sobota, Mera, Schneider and Williams. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A cautionary note on ignoring polygenic background when mapping quantitative trait loci via recombinant congenic strains

### *J Concepción Loredo-Osti\**

*Department of Mathematics and Statistics, Memorial University, St. John's, NL, Canada*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Hongying Dai, Children's Mercy Hospital, USA*

*Carl Nettelblad, Uppsala University, Sweden*

#### *\*Correspondence:*

*J Concepción Loredo-Osti, Department of Mathematics and Statistics, Memorial University, Henrietta Harvey Building, St. John's, NL A1C 5S7, Canada e-mail: jcloredoosti@mun.ca*

In gene mapping, it is common to test for association between the phenotype and the genotype at a large number of loci, i.e., the same response variable is used repeatedly to test a large number of non-independent and non-nested hypotheses. In many of these genetic problems, the underlying model is a mixed model consisting of one or very few major genes together with a genetic background effect, usually thought of as polygenic in nature and, consequently, modeled through a random effects term with a well-defined covariance structure dependent upon the kinship between individuals. Either because the interest lies only in the major genes or to simplify the analysis, it is customary to drop the random effects term and use a simple linear regression model, sometimes complemented with testing via resampling in an attempt to minimize the consequences of this practice. Here, it is shown that dropping the random effects term not only has extreme negative effects on the control of the type I error rate, but is also unlikely to be fixed by resampling because, whenever the mixed model is correct, this practice does not allow some basic requirements of resampling in a gene mapping context to be met. Furthermore, simulations show that the type I error rates when the random term is ignored can be unacceptably high. As an alternative, this paper introduces a new bootstrap procedure to handle the specific case of mapping by using recombinant congenic strains under a linear mixed model. A simulation study showed that the type I error rates of the proposed procedure are very close to the nominal ones, although they tend to be slightly inflated for larger values of the random effects variance. Overall, this paper illustrates the extent of the adverse consequences of ignoring the random effects term due to polygenic factors while testing for genetic linkage and warns us of potential modeling issues whenever simple linear regression for a major gene yields multiple significant linkage peaks.

**Keywords: misspecified genetic models, bootstrapping mixed models, recombinant congenic strains, ignoring random effects, mapping quantitative trait loci**

### **1. INTRODUCTION**

For more than four decades, linear mixed models have been used in a wide range of applications because of their conceptual simplicity and flexibility to accommodate correlated sources of variation as well as fixed regressors. A generic linear mixed model can be written as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \mathbf{e} \tag{1}$$

where **X** and **Z** are known incidence matrices, *β* is a vector of unknown fixed regression coefficients, *γ* is a vector of random effects, and **e** is the vector of errors. It is also common to assume that *γ* and **e** are independent and both have null expectation and finite variances. In many situations, either intentionally or unintentionally, the statistical analysis is carried out ignoring the term **Z***γ* in the model. This practice, although recognized as inefficient, has been thought to be harmless whenever the interest resides solely in a subset of the regression coefficients, with the remaining parameters of the model deemed nuisance parameters. This belief seems to be mostly based on the fact that $\boldsymbol{\beta}_o = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$ is still an unbiased and consistent estimator of *β*. However, it is well known that ignoring **Z***γ* and using ordinary least squares results in an estimator of $\text{Var}(\boldsymbol{\beta}_o)$ that is biased and inconsistent as well as non-independent of $\boldsymbol{\beta}_o$ (Dhymes, 1978). Of course, this will affect the distributional properties associated with $\boldsymbol{\beta}_o$ under normality or, otherwise, the asymptotic properties of its distribution. It has been suggested that this problem can be mitigated if testing is done through resampling. However, the adverse consequences of dropping the random term from the mixed model are unlikely to be fixed by the use of resampling methods. In this paper, a specific application to genetic mapping via recombinant congenic strains (RCS) of experimental animals is used to illustrate this. Briefly, genetic mapping can be seen as a problem in which the association of one dependent variable (the phenotype) with a large number of potential explanatory variables (the marker genotypes) is tested one by one or by taking a very small number of markers at once. An RCS panel is a replicable mapping population for which animals within the same strain are considered to be genetically identical and related to different degrees to animals from other strains. Such an inter-strain relationship results in what is known as the genetic background effect and, whenever this effect is understood as the result of the addition of many components of minuscule effect, the inclusion of a random effects term in the model would be the natural way to account for it.
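The claim about the ordinary least squares variance estimator can be checked numerically. The following is a minimal simulation sketch, assuming a hypothetical design of 10 strains with 5 animals each, a strain-level 0/1 regressor with true slope zero, and normally distributed strain effects and errors; the function names and all numerical settings are illustrative and are not taken from the paper.

```python
import numpy as np

def ols_slope_and_se(x, y):
    """Return the OLS slope and its usual (naive) standard error."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)          # naive residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # valid only if errors are iid
    return beta[1], np.sqrt(cov[1, 1])

def compare_se(n_strains=10, per_strain=5, var_gamma=1.0, n_sim=2000, seed=0):
    """Empirical SD of the slope versus the average naive standard error."""
    rng = np.random.default_rng(seed)
    strain = np.repeat(np.arange(n_strains), per_strain)
    # Hypothetical strain-level regressor: identical for animals of a strain.
    x = (np.arange(n_strains) % 2).astype(float)[strain]
    slopes, ses = [], []
    for _ in range(n_sim):
        gamma = rng.normal(0.0, np.sqrt(var_gamma), n_strains)  # ignored term
        y = gamma[strain] + rng.normal(0.0, 1.0, len(strain))   # true slope 0
        b, se = ols_slope_and_se(x, y)
        slopes.append(b)
        ses.append(se)
    return np.std(slopes), np.mean(ses)

empirical_sd, mean_naive_se = compare_se()
print(f"empirical SD of slope: {empirical_sd:.3f}, mean naive SE: {mean_naive_se:.3f}")
```

Under these assumptions the slope estimates remain centered at zero, but their actual spread is noticeably larger than the naive standard error suggests, which is the mechanism behind inflated test statistics.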

A mouse panel of RCS is obtained by mating mice from two genetically distinct inbred strains (a donor strain and a recipient strain) followed by two or more rounds of backcrossing to the recipient strain and subsequent sister × brother mating without selection for particular markers or phenotypes for a minimum of 20 generations. The genetic resolution of the panel is controlled by the number of backcrossing rounds. Because of this construction, each strain of an RCS panel can be thought of as an inbred strain in which segments of random length from the genome of a recipient strain have been replaced with the corresponding segments from a donor strain. The main consequence of this breeding scheme is that non-linked genes controlling the same trait are separated and fixed in haplotypes of different strains, allowing the possibility of studying them individually. The standard RCS panel uses two backcross generations and, consequently, the segments from the recipient strain constitute on average 87.5% of the genome of each strain; the remaining 12.5% represents the total expected length of the replaced genome segments. Without loss of generality, this is the type of RCS considered in this paper. For a more comprehensive description of the RCS and their use in gene mapping see Démant and Hart (1986), Moen et al. (1992), and Fortin et al. (2001b, 2007). Once the RCS panel has been established, the whole panel is genotyped to obtain a full characterization of the genome of each strain. Each genotype data set can then be used for the analysis of all individuals of the same strain; this is an important money-saving feature of the design since it does not require regenotyping each individual because, except for *de novo* mutations, all pups from the same strain are genetically identical.
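The 87.5% and 12.5% figures follow from halving the expected donor share at each backcross. A small arithmetic sketch, assuming no selection so that subsequent inbreeding leaves the expected proportions unchanged (the function name is illustrative):

```python
from fractions import Fraction

def expected_recipient_fraction(n_backcrosses: int) -> Fraction:
    """Expected recipient-genome fraction after the F1 cross plus n backcrosses."""
    # The F1 carries a donor share of 1/2; each backcross to the recipient
    # strain halves the expected donor share again.
    return 1 - Fraction(1, 2) ** (n_backcrosses + 1)

recipient = expected_recipient_fraction(2)   # standard RCS panel
donor = 1 - recipient
print(recipient, donor)                      # 7/8 and 1/8, i.e., 87.5% and 12.5%
```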

Although most mouse geneticists agree that RCS are a powerful resource to map loci associated with complex traits, there is some disagreement on how to do the analysis. Originally, when the use of RCS for genetic mapping was proposed, the core idea was to look into the strain distribution pattern with respect to a phenotype of interest, identify the strain that exhibited the largest deviation from the other strains in the RCS panel, and subsequently cross it with the recipient strain to obtain *F*<sup>1</sup> and *F*<sup>2</sup> progenies to be analyzed by standard methods (Démant and Hart, 1986; Fortin et al., 2001b). Two examples of the application of this approach are reported in Fortin et al. (2001a) and Müllerová and Hozák (2004). The problem is that contrasting phenotypes from *F*<sup>1</sup> mice versus the ones from the recipient strain will only be effective for dominant traits, while the power will be diminished for additive traits and lost completely for recessive traits. On the other hand, the analysis of the *F*<sup>2</sup> mice requires new genotyping, which not only defeats the economic advantages of having developed RCS but, more importantly, because every *F*<sup>2</sup> individual has a different genotype, makes this approach unsuited for complex quantitative traits when a single measurement may not be reliable enough to determine the phenotype (Moen et al., 1992). Alternatively, there is a design consisting of taking a sample of mice from each strain and analyzing the whole panel together. Although this approach does not require additional genotyping and has the potential for making more efficient use of the phenotypic variation, it also opens more room for analysis pitfalls if the proper model is not used. For example, Joober et al. (2002) use a QTL mapping procedure equivalent to simple linear regression at the markers ignoring genetic background which, as pointed out by Palmer and Airey (2003), may result in false positive rates far in excess of the nominal value, even when Bonferroni corrections are used. Another common way to address the problem is to use strain averages as the phenotype and treat the panel of means as a backcross dataset for analysis purposes. This is essentially the "interval mapping" procedure proposed by Shao et al. (2010) and equivalent to the one used by Thifault et al. (2008). This approach may substantially reduce the power for RCS panels with a small number of strains, and it does not deal with the fact that the strains, related because of their background, may not have the same degree of kinship at the genomic level and consequently the phenotype means may be not only non-independent but heteroscedastic as well. Lee et al. (2006) and Camateros et al. (2010) extend the simple linear regression to account for the genetic background by adding a fixed factor ("background proportion" in the first paper; "background indicator" in the second). Although this is better than ignoring the background, from the genetics standpoint it is difficult to justify the plausibility of a fixed effects model under the assumption that the background effect is the result of the additive action of many genes of minuscule effect. In fact, I argue that the natural way to model such a background effect consistent with the principles outlined by Fisher (1919) is through the inclusion of a random effects term in the model as implemented in Di Pietrantonio et al. (2010). In this paper, I describe in detail a procedure for the analysis of a quantitative trait locus (QTL) that models the genetic background (assumed to be of polygenic nature) as a random effects term and use this to show how the omission of such a term in the model leads to conclusions that are wrong and inconsistent with the data.

### **2. MODELS**

#### **2.1. THE NAIVE QTL MODEL FOR AN RCS PANEL**

In its simplest form, at each marker position *m*, *m* = 1, 2,..., *M*, the RCS/QTL model for the *i*th individual, *i* = 1, 2,..., *n*, can be written as

$$
y_i = \mu + q_{im}\xi_m + e_i \tag{2}
$$

where *yi* denotes the phenotype for the *i*th individual, ξ*<sup>m</sup>* denotes the major locus effect associated with the *m*th marker, *qim* is the indicator of the BB genotype at the *m*th position, which is determined by the RCS data, and the *ei*s are a set of independent random variables with distribution $N(0, \sigma^2)$ (AA and BB are the genotypes of the donor and recipient parental strain, respectively). Of course, under an oligogenic model, at most a handful of ξ*m*s should be different from zero. In fact, it is common practice that at the first screening, the estimation is carried out by regression at each marker under the assumption of only one major gene. When the presumption of a dense enough genotyping marker panel is not correct, procedures like modified interval mapping can be used instead. Variations of the problem include conditioning on a given set of markers. The salient feature of this design is that, at the *m*th marker position, one looks across the RCS panel and classifies each strain as either AA or BB, since under the model (Equation 2), this is the only source of genetic variation when estimating ξ*m*. However, this model ignores the fact that individuals from the same strain are genetically identical (assuming no new mutation at the locus under scrutiny), and strains with the same ancestral background share large portions of their genome so that, even without the involvement of a major gene, the variation within strains is likely to be reduced. In a nutshell, regression mapping works by testing the association of the phenotype with the observed genotype at each marker location so that finding significant linkage at any position implies testing the *M* null hypotheses, ξ*<sup>m</sup>* = 0. Clearly, most of these hypotheses as well as their test statistics are not independent. This may lead to problems in the control of the type I error rate if multiple testing is not addressed properly. Another irregularity results from the fact that with a dense genotyping panel the number of tested hypotheses can far exceed the sample size. Because of these considerations, *p*-value estimation by resampling of residuals has been seen as a plausible alternative. For this paper, the problem is addressed through the bootstrap.

#### *2.1.1. Computation of p-values*

The estimation of genome-wide corrected *p*-values by resampling requires that under the null hypothesis: (i) each resample is taken from an exchangeable distribution, (ii) the variation of the original sample is preserved through all resamples, and (iii) the genome-wide baseline for the test statistics at each position is the same. The first two requirements are standard for resampling in regression (Davison and Hinkley, 1997; Anderson and Ter Braak, 2003). The last requirement is imposed to ensure that the uncorrected *p*-values across the genome are comparable (this is particularly important when there are missing genotype data). One way to estimate corrected *p*-values is to select an ensemble of test statistics whose marginal distribution is the same when the model does not contain any major locus.

Since under model (Equation 2) and the hypothesis of no major gene, the distribution of **y** = (*y*1, *y*2,..., *yn*) is exchangeable, resampling from the raw observations will also preserve the variation through the pseudo-observations. This means that in the absence of non-genetic regressors or other non-oligogenic factors, resampling the raw phenotypes either by permutation or through bootstrap will produce similar results. Furthermore, under these premises, basic sampling and hypothesis testing principles indicate that a permutation-based procedure will be more efficient and powerful. However, this is not necessarily the case when the premises are removed. Should the model also contain fixed non-genetic regressors, resampling from the leverage-adjusted residuals under the null hypothesis would be a procedure that approximates exchangeability while preserving the original variation of the data. However, in this situation, resampling from leverage-adjusted residuals results in a procedure with acceptable properties only in the bootstrap case (Davison and Hinkley, 1997), while this is no longer guaranteed when resampling via permutation. The main issue is that sampling without replacement magnifies the effects of modest departures from exchangeability. Then, permuting leverage-adjusted residuals may not be good enough (even worse, it may not be valid) and we would require a much more elaborate and computer-intensive procedure to obtain residuals guaranteed to be at least weakly exchangeable so that permutation works properly (see, for example, Kherad-Pajouh and Renaud, 2010). To complete the requirements listed above regarding the possibility of missing genotypes, we propose to use the test statistic defined by the expression

$$z_m = t_m \left( 1 - \frac{1}{4\nu_m} \right) \left( 1 + \frac{t_m^2}{2\nu_m} \right)^{-\frac{1}{2}} \quad \text{where} \quad t_m = \frac{|\hat{\xi}_m|}{\hat{\sigma}_{\hat{\xi}_m}} \tag{3}$$

and $\hat{\xi}_m$ is the ordinary least squares estimate of $\xi_m$, *m* = 1, 2,..., *M*, i.e., $z_m$ is just $t_m$, our familiar *t*-statistic with $\nu_m$ degrees of freedom, transformed into a *z*-score ($\nu_m$ may vary slightly from marker to marker due to missing data). Another option would be a modified *t*-statistic $t'_m$ in which the *m*th estimate of variance $s_m^2$ used to compute $\hat{\sigma}^2_{\hat{\xi}_m}$ is replaced by $s_0^2$, the estimate under the null hypothesis. With no missing genotypes the use of any of $z_m$, $t'_m$, and $t_m$ would yield approximately the same *p*-value estimates.
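The transformation in Equation (3) is straightforward to compute. The sketch below is a direct transcription, with the hypothetical function name `t_to_z` and arbitrary example inputs; it illustrates that the *z*-score approaches the *t*-statistic as the degrees of freedom grow.

```python
import numpy as np

def t_to_z(t, nu):
    """Equation (3): map |t| with nu degrees of freedom to an approximate z-score."""
    t = np.abs(np.asarray(t, dtype=float))
    nu = np.asarray(nu, dtype=float)
    return t * (1.0 - 1.0 / (4.0 * nu)) / np.sqrt(1.0 + t**2 / (2.0 * nu))

z_small = t_to_z(2.5, 20)      # few degrees of freedom: shrunk noticeably
z_large = t_to_z(2.5, 2000)    # many degrees of freedom: close to 2.5
print(z_small, z_large)
```

Because the inputs are converted to arrays, the same function can be applied to all markers at once, with a vector of marker-specific degrees of freedom when genotypes are missing.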

#### *2.1.2. Bootstrap procedure for simple linear regression at the markers*

The following bootstrap procedure computes the genome-wide corrected *p*-values for model (Equation 2) with the test statistic (Equation 3):


This resampling scheme can be seen as an adaptation of a regular regression residuals bootstrapping procedure (Davison and Hinkley, 1997), coupled with Roy's union-intersection principle (Roy, 1953) to control the genome-wide type I error rate. When applied to the analysis of the RCS panel, this procedure is valid when there is only one observation per strain or when the within-strain variation is negligible. Otherwise, a random term in the model has been neglected and, regardless of $\hat{\xi}_m$ being an unbiased estimator of $\xi_m$, the exchangeability requirement cannot be met and the most likely consequence would be an inflated type I error rate. In fact, as per the arguments given by Churchill and Doerge (1994, 2008), this statement is correct not only for the bootstrap and RCS, but also for permutation test procedures applied to any study design involving replicable mapping populations because, as for the bootstrap, the Fisher (1935) principle of permutation also relies on exchangeability. For simple experimental designs such as an intercross or a backcross mating, the individual units can safely be assumed to be exchangeable. However, it would be wrong to assume exchangeability for more complicated designs, like advanced intercross, heterogeneous stocks and RCS.
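One way a residual bootstrap combined with a maximum statistic could look in code is sketched below. This is an illustration under stated assumptions, not necessarily the exact algorithm used in the paper: it assumes no missing genotypes, no monomorphic markers, an intercept-only null model (so the leverage adjustment is a constant factor and is omitted), and a genome-wide *p*-value defined as the proportion of bootstrap maxima that reach the observed statistic. The function names are hypothetical.

```python
import numpy as np

def marker_z_scores(y, Q):
    """Equation (3) z-scores from simple regressions of y on each 0/1 column of Q."""
    n, _ = Q.shape
    yc = y - y.mean()
    Qc = Q - Q.mean(axis=0)
    sxx = (Qc**2).sum(axis=0)                 # assumed > 0 for every marker
    xi = Qc.T @ yc / sxx                      # OLS slope at each marker
    rss = (yc**2).sum() - xi**2 * sxx         # residual sum of squares
    nu = n - 2
    t = np.abs(xi) / np.sqrt(rss / nu / sxx)
    return t * (1 - 1 / (4 * nu)) / np.sqrt(1 + t**2 / (2 * nu))

def bootstrap_genomewide_p(y, Q, n_boot=1000, seed=0):
    """Genome-wide adjusted p-values from the bootstrap distribution of max z."""
    rng = np.random.default_rng(seed)
    z_obs = marker_z_scores(y, Q)
    resid = y - y.mean()                      # residuals under the null model
    counts = np.zeros(Q.shape[1])
    for _ in range(n_boot):
        y_star = y.mean() + rng.choice(resid, size=len(y), replace=True)
        # Compare the genome-wide maximum of the resample with each observed z.
        counts += marker_z_scores(y_star, Q).max() >= z_obs
    return (counts + 1) / (n_boot + 1)

# Toy usage with simulated genotypes and a phenotype unrelated to any marker.
rng = np.random.default_rng(1)
Q_demo = rng.integers(0, 2, size=(40, 15)).astype(float)
y_demo = rng.normal(size=40)
p_demo = bootstrap_genomewide_p(y_demo, Q_demo, n_boot=200)
```

Taking the maximum over markers within each resample is what turns the marker-wise statistics into a single genome-wide test in the spirit of the union-intersection principle.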

#### **2.2. THE QTL MIXED MODEL FOR AN RCS PANEL**

The previous simple linear model (Equation 2) generalizes to a model of the form:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \mathbf{q}_m \xi_m + \mathbf{e} \tag{4}$$

where **y** represents the phenotype vector, $\mathbf{q}_m$ is a vector with each entry being an indicator variable of the genotype BB at the marker position *m* with $\xi_m$ being its associated effect (major gene effect), *γ* is a random effects vector associated with the genetic background with $\text{E}(\boldsymbol{\gamma}) = \mathbf{0}$ and $\text{Var}(\boldsymbol{\gamma}) = \sigma_\gamma^2 \boldsymbol{\Delta}_1$, with $\sigma_\gamma^2 > 0$ and $\boldsymbol{\Delta}_1$, a positive-definite matrix, both assumed to be constant, although unknown, **X** is a matrix of fixed covariates with corresponding parameter vector *β*, and **e** is a vector of independent and identically distributed random variables representing the error term with $\text{E}(\mathbf{e}) = \mathbf{0}$ and $\text{Var}(\mathbf{e}) = \sigma^2 \mathbf{I}$. Up to a multiplicative constant, $\boldsymbol{\Delta}_1$ is a function of the length of the segments identical by descent shared amongst strains. For an established RCS panel there are only two possible identity states between pairs of strains at a given locus: either (i) all four alleles are identical by descent ($\boldsymbol{\Delta}_1$ is the matrix holding the pairwise probabilities for this state), or (ii) the strains have different allelic forms and thus are identical by descent only amongst themselves. So an estimator of $\boldsymbol{\Delta}_1$ with "a high degree of precision" can be reached. Such an estimator uses only genomic information and does not involve **y**, so when estimating the parameters, one can assume that $\boldsymbol{\Delta}_1$ is given. Another option is to take the entries of $\boldsymbol{\Delta}_1$ as the expected value of the proportion of the genome shared identical by descent between the respective strains under the RCS panel construction described above, i.e.,

$$
\delta_{1ij} = \begin{cases} 1 & \text{if } i = j \\ \frac{15}{16} & \text{if } i \text{ and } j \text{ have the same background} \\ \frac{1}{16} & \text{if } i \text{ and } j \text{ have different background.} \end{cases} \tag{5}
$$

This option, although not the most efficient, does capture the main features of the design and yields a variance structure for the random effects vector that can be exploited in the implementation of the resampling algorithm. For example, if all the strains in the panel under scrutiny have the same background and the simplified expectation-based $\boldsymbol{\Delta}_1$ is used, then the distribution of the vector of random effects is exchangeable. Nonetheless, replacing a genomic-based estimate of $\boldsymbol{\Delta}_1$ by its theoretical expectation (Equation 5) implies ignoring important information regarding the correlation of the additive polygenic effects associated with the genetic background.
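The expectation-based matrix of Equation (5) can be assembled directly from the background label of each strain. A minimal sketch, with the hypothetical function name `expected_delta1` and made-up labels `"A"` and `"B"`:

```python
import numpy as np

def expected_delta1(background):
    """Equation (5): expected sharing matrix given one background label per strain."""
    b = np.asarray(background)
    same = b[:, None] == b[None, :]            # pairwise same-background indicator
    delta = np.where(same, 15 / 16, 1 / 16)
    np.fill_diagonal(delta, 1.0)               # a strain is identical to itself
    return delta

delta1 = expected_delta1(["A", "A", "B"])
print(delta1)
```

When all labels coincide the result is a compound-symmetric matrix, which is the exchangeable case mentioned in the text.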

#### *2.2.1. Estimation*

The estimation for the mixed linear model has been extensively discussed in the literature (Harville, 1977; Henderson, 1986). Here we develop an application of these standard methods to the RCS design. Without loss of generality, let us consider the linear mixed model (Equation 1) with $\text{Var}(\boldsymbol{\gamma}) = \sigma_\gamma^2 \boldsymbol{\Delta}_1$ and $\text{Var}(\mathbf{e}) = \sigma^2 \mathbf{I}$. Thus

$$\text{E}(\mathbf{y}) = \mathbf{X}\boldsymbol{\beta} \qquad \text{and} \qquad \text{Var}(\mathbf{y}) = \sigma^2 \left(\mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{I}\right) = \sigma^2 \boldsymbol{\Sigma}$$

where $\mathbf{G} = \lambda \boldsymbol{\Delta}_1$ and $\lambda = \sigma_\gamma^2 / \sigma^2$, i.e., λ represents the signal-to-noise ratio. Under the assumption of no major gene and only polygenic background, λ is related to the heritability coefficient. When **G** is known, the best linear unbiased estimator of *β* and the best linear unbiased predictor of *γ* (also known as a shrinkage estimator) can be written as

$$
\tilde{\boldsymbol{\beta}} = (\mathbf{W}'\mathbf{W})^{-} \mathbf{W}'\mathbf{v} \quad \text{and} \quad \hat{\boldsymbol{\gamma}} = \mathbf{G} \mathbf{Z}' \boldsymbol{\Sigma}^{-\frac{1}{2}} (\mathbf{v} - \mathbf{W}\tilde{\boldsymbol{\beta}}),
$$

respectively, where $\mathbf{W} = \boldsymbol{\Sigma}^{-\frac{1}{2}} \mathbf{X}$ and $\mathbf{v} = \boldsymbol{\Sigma}^{-\frac{1}{2}} \mathbf{y}$. Also

$$
\hat{\sigma}^2 = \frac{1}{N - \text{rank}(\mathbf{W})} (\mathbf{v} - \mathbf{W}\tilde{\boldsymbol{\beta}})'(\mathbf{v} - \mathbf{W}\tilde{\boldsymbol{\beta}})
$$

$$
\hat{\sigma}_{\gamma}^2 = \frac{1}{\text{rank}(\mathbf{G})} \left(\hat{\boldsymbol{\gamma}}'\mathbf{G}^{-1}\hat{\boldsymbol{\gamma}} + \hat{\sigma}^2 \text{tr}(\mathbf{G}^{-1}\mathbf{C})\right)
$$

with

$$\mathbf{C} = (\mathbf{Z}'\mathbf{M}\mathbf{Z} + \mathbf{G}^{-1})^{-1} \quad \text{and} \quad \mathbf{M} = \mathbf{I} - \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'.$$

Notice that the previous expressions cannot be computed unless the signal-to-noise ratio, λ, is known. A situation of more practical interest is an iterative procedure in which λ is replaced by its estimate and, once the estimates of $\sigma^2$ and $\sigma_\gamma^2$ have been updated, a refinement of the estimate of λ is obtained, and so on. This iterative procedure will result in a $\tilde{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\gamma}}$ that are no longer linear; nonetheless, they preserve most of the desirable properties present in their linear counterparts (Jiang, 1998).
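For a fixed value of λ, the estimation step amounts to whitening the data and running least squares. The sketch below illustrates only that single step, under the assumption that the relationship matrix (here the argument `delta1`) and λ are supplied by the user; the iterative refinement of λ and the update of the background variance are deliberately left out, and the function names are hypothetical.

```python
import numpy as np

def inv_sqrt_sym(S):
    """Symmetric inverse square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T              # V diag(w^{-1/2}) V'

def fit_given_lambda(y, X, Z, delta1, lam):
    """BLUE of beta, BLUP of gamma and error variance estimate for known lambda."""
    G = lam * delta1
    Sigma = Z @ G @ Z.T + np.eye(len(y))       # Var(y) = sigma^2 * Sigma
    S_inv_half = inv_sqrt_sym(Sigma)
    W, v = S_inv_half @ X, S_inv_half @ y      # whitened design and response
    beta = np.linalg.pinv(W.T @ W) @ W.T @ v   # generalized inverse, as in the text
    eps = v - W @ beta                         # whitened residuals
    gamma = G @ Z.T @ S_inv_half @ eps         # shrinkage predictor of gamma
    sigma2 = eps @ eps / (len(y) - np.linalg.matrix_rank(W))
    return beta, gamma, sigma2, eps
```

In a typical RCS layout, `Z` would be the strain incidence matrix (one row per animal, one column per strain). The returned `beta` coincides with the usual generalized least squares estimate, which gives a simple way to check an implementation.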

#### *2.2.2. Mixed model resampling scheme*

Let us now focus our attention toward a resampling scheme appropriate for RCS data under a mixed model. By now, it is obvious that the bootstrap procedure described in the previous section will not work for the mixed model (Equation 4). A crude extension to this procedure would consist of computing

$$
\hat{\mathbf{e}} = \mathbf{y} - \mathbf{X}\tilde{\boldsymbol{\beta}} - \mathbf{Z}\hat{\boldsymbol{\gamma}}
$$

and resampling from $\hat{\boldsymbol{\gamma}}$ and $\hat{\mathbf{e}}$ to obtain $\boldsymbol{\gamma}^{*}$ and $\mathbf{e}^{*}$ so that the pseudo-observation $\mathbf{y}^{*}$ could be recovered as

$$\mathbf{y}^{*} = \mathbf{X}\tilde{\boldsymbol{\beta}} + \mathbf{Z}\boldsymbol{\gamma}^{*} + \mathbf{e}^{*}.$$

However, it is straightforward to see that these residuals are not exchangeable and they are biased toward zero. Thus, they may not adequately represent the hypothesis tested nor reflect the true variation of the model.

Alternatively, note that when *β* and λ are known, it follows from the model under the null hypothesis that $\text{E}(\mathbf{v}) = \mathbf{W}\boldsymbol{\beta}$ and $\text{Var}(\mathbf{v}) = \sigma^2 \mathbf{I}$, which implies that the distribution of the vector of residuals, $\boldsymbol{\epsilon} = \mathbf{v} - \mathbf{W}\boldsymbol{\beta}$, is exchangeable. This suggests the following residuals resampling scheme:

(1) given $\tilde{\lambda}$ and $\tilde{\boldsymbol{\beta}}$ obtained under the mixed model without a major gene, i.e., under the null hypothesis, compute $\tilde{\boldsymbol{\Sigma}}$ and $\tilde{\mathbf{W}}$ by replacing λ with $\tilde{\lambda}$ and $\boldsymbol{\Delta}_1$ with its genomic-based estimate; then, obtain the leverage-adjusted residuals

$$
\tilde{\boldsymbol{\epsilon}} = \mathbf{D} (\tilde{\boldsymbol{\Sigma}}^{-\frac{1}{2}} \mathbf{y} - \tilde{\mathbf{W}} \tilde{\boldsymbol{\beta}}),
$$

where **D** is a diagonal matrix with each of the non-zero elements given by $(1 - h_{ii})^{-1}$ and $h_{ii}$ is the *i*th leverage coefficient;

(2) with replacement, resample from $\tilde{\boldsymbol{\epsilon}} \in \mathbb{R}^n$ to obtain $\boldsymbol{\epsilon}^{*} \in \mathbb{R}^n$, its bootstrapped replica, and construct the vector of pseudo-observations as

$$\mathbf{v}^{*} = \tilde{\mathbf{W}} \tilde{\boldsymbol{\beta}} + \boldsymbol{\epsilon}^{*}.$$
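Steps (1) and (2) can be sketched in a few lines once the whitened response and design are available. The code below is a hedged illustration: it takes the whitened vectors as given (for example, computed with estimated λ), uses the adjustment factor $(1 - h_{ii})^{-1}$ as stated in the text through the hypothetical `power` argument (Davison and Hinkley's modified residuals use the power −1/2 instead), and does not recenter the residuals. The function name is an assumption.

```python
import numpy as np

def bootstrap_pseudo_observations(v, W, n_boot=1000, power=-1.0, seed=0):
    """Resample leverage-adjusted residuals of the whitened null model.

    Returns an (n_boot, n) array whose rows are pseudo-observation vectors v*.
    """
    rng = np.random.default_rng(seed)
    H = W @ np.linalg.pinv(W.T @ W) @ W.T      # hat matrix of the whitened fit
    fitted = H @ v                             # W beta_tilde
    h = np.diag(H)                             # leverage coefficients h_ii
    eps = (v - fitted) * (1.0 - h) ** power    # leverage-adjusted residuals
    # Draw indices with replacement, one row of n indices per bootstrap replica.
    idx = rng.integers(0, len(v), size=(n_boot, len(v)))
    return fitted + eps[idx]

# Toy usage with an arbitrary whitened design (intercept-like column plus covariate).
rng = np.random.default_rng(2)
W_demo = np.column_stack([np.ones(30), rng.normal(size=30)])
v_demo = W_demo @ np.array([7.0, 0.0]) + rng.normal(size=30)
v_star = bootstrap_pseudo_observations(v_demo, W_demo, n_boot=50)
```

Each row of `v_star` would then be regressed on the whitened design augmented with the whitened marker column to obtain the bootstrap test statistics.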

If instead of a bootstrap procedure based on leverage-adjusted residuals we want to use a residuals-based permutation procedure, then we need to extend the method of Kherad-Pajouh and Renaud (2010) to get weak exchangeability of residuals. However, when λ is estimated from the data, such an extension is not possible and we would have to rely on approximations. More research is needed to explore this direction.

Outside of a genetics context, there are a number of permutation and bootstrap procedures for mixed models whose objective is testing the components of variance (for example, Fitzmaurice et al., 2007; Sinha, 2009; Lee and Braun, 2012; Samuh et al., 2012). However, they cannot be applied in our case because we are interested in the regression coefficients (or a subset of them) and the variance of the random effects is just a nuisance parameter. Incidentally, when testing the components of variance, the bootstrap has the edge over most permutation procedures (Samuh et al., 2012).

#### *2.2.3. Bootstrap procedure for the mixed linear model*

According to the foregoing argument, a generalization of the previous bootstrap procedure to compute the genome-wide corrected *p*-values for the mixed model (Equation 4) goes as follows:


$$
\tilde{\mathbf{v}} = \begin{pmatrix} \tilde{\mathbf{W}} & \tilde{\boldsymbol{\Sigma}}^{-\frac{1}{2}} \mathbf{q}_m \end{pmatrix} \begin{pmatrix} \boldsymbol{\beta} \\ \xi_m \end{pmatrix} + \boldsymbol{\epsilon}. \tag{6}
$$

Of course, this model is equivalent to model (Equation 4), the RCS/QTL mixed model, with λ replaced by λ˜. Compute the model parameter estimates with the outlined mixed model procedure as well as the test statistic set *Z* = {*zm*, *m* = 1, 2,..., *M*} by using Equations (6) and (3); set the acceptance count vector to zero.


To my knowledge, this bootstrap procedure for analyzing a panel of RCS had not been proposed before Di Pietrantonio et al. (2010), and this paper contains the first detailed derivation and study of its properties. In fact, the resampling methods (mostly conditional permutation) applied to analyze RCS have not used mixed models; they either consider the strain effect as fixed, which is inconsistent with the hypothesis of a genetic background of polygenic nature, or discard information by using only the estimated strain means (for example, Gill and Boyle, 2005; Thifault et al., 2008; Camateros et al., 2010).

### **3. RESULTS**

One straightforward way to show the effect of ignoring the random effects term in a mixed model is by simulation. The idea is to generate a dataset from a model that includes a random term for genetic background and noise, but is free of any major locus, and then compare the *p*-value profiles (actually, $-\log_{10} p$ profiles) obtained with the naive model (Equation 2) and with the mixed model (Equation 4). For this simulation study, the genotypes of an RCS panel of 36 strains that were described in Fortin et al. (2001b) were used. The panel originally had 37 lines and 625 microsatellite markers; since then, one line has died out and six markers were removed for reliability reasons. Although a much larger set of single nucleotide polymorphism markers for this RCS panel is also available, I think that this set of 619 markers is enough to show the harmful effects of fitting the wrong model on the inference. Of course, more markers will only exacerbate the problem. For this simulation experiment, six different values for the signal-to-noise ratio parameter λ were chosen (0, 1/8, 1/4, 1/2, 1, and 2). Under a standard additive polygenic model, i.e., a model without major genes, the signal-to-noise parameter is a function of the heritability coefficient (the chosen values correspond to the heritability proportions of 0, 1/9, 1/5, 1/3, 1/2, and 2/3, respectively). In every simulation run, a sample of seven individuals from each strain was simulated under the assumption of no major gene, i.e., under model (Equation 4) with $\xi_m = 0$ for all markers, *m* = 1, 2,..., *M*. The value of $\sigma^2$ was fixed for all simulations to 1.175, while **X***β* was fixed as a vector with 7 in all its entries. Simulations for each value of λ were run 1000 times and both methodologies, the mixed model as well as the bootstrapped naive regression at the markers, were applied to the simulated datasets with 10,000 as the number of resamples for every dataset. In gene mapping studies, a significant peak is defined as the most extreme point of a region beyond the *p*-value threshold according to some pre-specified genome-wide type I error rate (Churchill and Doerge, 1994). For this study, we use a value of 0.01 or, equivalently, a threshold value of 2 on a $-\log_{10} p$ scale. **Tables 1**–**3** summarize the results of these simulations. As expected, whenever there is no polygenic term in the model (i.e., λ = 0), both methodologies produce identical results. However, the picture changes when λ > 0. In this case, it is quite obvious that ignoring the random effects term has pernicious consequences even for modest levels of λ, the signal-to-noise ratio, while the proposed mixed model method keeps the genome-wide type I error rate relatively close to the nominal value. However, the empirical type I error rates obtained by the proposed procedure seem to increase slightly with λ (**Table 3**). This phenomenon may be due to the fact that the markers used for mapping purposes are also used to estimate the probability of identity by descent between strains and, to a lesser extent, the fact that the bootstrap procedure is based on residuals computed with λ and *β* estimated from the same data. Nonetheless, the moral of this exercise is that whenever simple regression of a major gene model produces many significant peaks, a warning flag about the model validity should be raised.
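The correspondence between λ and heritability used above is $h^2 = \lambda/(1+\lambda)$, and the null data-generating step can be sketched in the same breath. In the code below, the settings of seven animals per strain, an error variance of 1.175, and a mean of 7 come from the text; the normal distributions, the function name, and the use of an expectation-based relationship matrix in the usage example are assumptions made for illustration only.

```python
import numpy as np
from fractions import Fraction

lambdas = [Fraction(0), Fraction(1, 8), Fraction(1, 4), Fraction(1, 2),
           Fraction(1), Fraction(2)]
# With sigma_gamma^2 = lambda * sigma^2, heritability is lambda / (1 + lambda).
heritabilities = [lam / (1 + lam) for lam in lambdas]

def simulate_null_phenotypes(delta1, lam, per_strain=7, sigma2=1.175,
                             mean=7.0, seed=0):
    """Simulate phenotypes with polygenic background only (no major gene)."""
    rng = np.random.default_rng(seed)
    n_strains = delta1.shape[0]
    cov_gamma = float(lam) * sigma2 * delta1        # Var(gamma), assumed normal
    gamma = rng.multivariate_normal(np.zeros(n_strains), cov_gamma)
    strain = np.repeat(np.arange(n_strains), per_strain)
    e = rng.normal(0.0, np.sqrt(sigma2), size=strain.size)
    return mean + gamma[strain] + e, strain

# Usage with a hypothetical same-background panel of 36 strains.
delta_demo = np.full((36, 36), 15 / 16)
np.fill_diagonal(delta_demo, 1.0)
y_sim, strain_sim = simulate_null_phenotypes(delta_demo, Fraction(1, 2))
```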

**Table 1 | Percentage of declared significant peaks with a bootstrap genome-wide adjusted significance level of 0.01 when the proposed mixed model methodology is used.**


*Estimates based on 1000 simulated datasets for each* λ*.*

**Table 2 | Percentage of declared significant peaks with a bootstrap genome-wide adjusted significance level of 0.01 when a naive regression at the markers is used.**


*Estimates based on 1000 simulated datasets for each* λ*.*

The histogram of a typical dataset obtained by simulation from a model with polygenic effects only would look like the one shown in **Figure 1**. Nonetheless, for this histogram I chose a dataset for which simple linear regression produces a very large number of significant peaks. If a major locus were at play, one would expect to have a well-defined bimodal distribution, so this histogram seems consistent with the generating model of no major gene. However, when we look into the *p*-value profiles obtained through the model that ignores the genetic background term, instead of profiles consistent with the model we get something as extreme as shown by the dashed lines in **Figure 2**. According to the profiles in this figure, one might conclude that all chromosomes have at least one significant peak, a fact that does not appear to be supported by the histogram of the data and, more conclusively, is in conflict with the generating model. If anything, it can be argued that the data distribution may seem a bit skewed, but one may expect that estimation of *p*-values via bootstrapping of residuals should not be too sensitive to this. Of course, as for bimodality, skewness may also be caused by a mixture of distributions. However, a very strong peak, like any of the ones spotted on every chromosome, is difficult to conceive without a conspicuous bimodal distribution. Even with the use of robust regression estimates instead of those obtained by regular least squares to minimize the potential impact of outliers on the estimation, these profiles change very little (data not shown). When the missing random effects term is introduced into the model (solid blue lines in **Figure 2**), *p*-value profiles become consistent with the generating model. Repetition of this exercise on any other simulated dataset yields similar results, although the specific resulting profiles will most likely not be the same.

**Table 3 | Empirical genome-wide type I error rates obtained via bootstrap in the simulation study (0.01 is the nominal value and the number of simulated datasets for each** *λ* **is 1000).**


the data on this histogram were computed and plotted in **Figure 2**.

### **4. DISCUSSION**

This paper proposes a bootstrapping procedure to estimate the *p*-values under a mixed model applied to gene mapping when RCS are used. The method can be easily adapted for other replicable mapping populations/designs. This procedure is a generalization of the linear regression bootstrap of residuals coupled with the union-intersection principle aimed to control the genome-wide type I error rate. A simulation study with different values of the signal-to-noise ratio unequivocally shows that when a panel of RCS is used for mapping, ignoring one random effects term in a mixed linear model can have pernicious consequences, resulting in inflated type I error rates and leading to the declaration of significant linkage peaks where no such peaks should be found. The simulation study also shows that the proposed bootstrap procedure seems to produce slightly inflated type I error rates as the signal-to-noise ratio increases. This problem is likely due to the fact that the markers used for mapping are also used to estimate the length of the segments shared identical by descent, but it may also be associated with a stronger departure from exchangeability as the ratio increases. In any case, the problem deserves further scrutiny. The proposed bootstrap procedure for mixed models is quite general and can easily be adapted to non-genetic problems.

### **FUNDING**

This work has been supported by the Canadian Institutes of Health Research.

### **ACKNOWLEDGMENTS**

The author expresses gratitude to M. Fujiwara, E. Schurr, T. di Pietrantonio, and K. Morgan for the discussion and comments that substantially improved the manuscript.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 October 2013; accepted: 17 March 2014; published online: 02 April 2014. Citation: Loredo-Osti JC (2014) A cautionary note on ignoring polygenic background when mapping quantitative trait loci via recombinant congenic strains. Front. Genet. 5:68. doi: 10.3389/fgene.2014.00068*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Loredo-Osti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A modified generalized Fisher method for combining probabilities from dependent tests

### *Hongying Dai<sup>1,2,3</sup>\*, J. Steven Leeder<sup>2,4</sup> and Yuehua Cui<sup>5</sup>*

*<sup>1</sup> Department of Pediatrics, Research Development and Clinical Investigation, Children's Mercy Hospital, Kansas City, MO, USA*

*<sup>2</sup> Department of Pediatrics, University of Missouri-Kansas City, Kansas City, MO, USA*

*<sup>3</sup> Department of Informatic Medicine and Personalized Health, University of Missouri-Kansas City, Kansas City, MO, USA*

*<sup>4</sup> Department of Pediatrics, Clinical Pharmacology and Therapeutic Innovation, Children's Mercy Hospital, Kansas City, MO, USA*

*<sup>5</sup> Department of Statistics and Probability, Michigan State University, East Lansing, MI, USA*

#### *Edited by:*

*José M. Álvarez-Castro, Universidade de Santiago de Compostela, Spain*

#### *Reviewed by:*

*Wei Hou, Stony Brook University, USA*

*J. Concepcion Loredo-Osti, Memorial University, Canada*

#### *\*Correspondence:*

*Hongying Dai, Department of Pediatrics, Research Development and Clinical Investigation, Children's Mercy Hospital, 2401 Gillham Road, Kansas City, MO 64108, USA e-mail: hdai@cmh.edu*

Rapid developments in molecular technology have yielded a large amount of high throughput genetic data to understand the mechanism for complex traits. The increase of genetic variants requires hundreds and thousands of statistical tests to be performed simultaneously in analysis, which poses a challenge to control the overall Type I error rate. Combining *p*-values from multiple hypothesis testing has shown promise for aggregating effects in high-dimensional genetic data analysis. Several *p*-value combining methods have been developed and applied to genetic data; see Dai et al. (2012b) for a comprehensive review. However, there is a lack of investigations conducted for dependent genetic data, especially for weighted *p*-value combining methods. Single nucleotide polymorphisms (SNPs) are often correlated due to linkage disequilibrium (LD). Other genetic data, including variants from next generation sequencing, gene expression levels measured by microarray, protein and DNA methylation data, etc. also contain complex correlation structures. Ignoring correlation structures among genetic variants may lead to severe inflation of Type I error rates for omnibus testing of *p*-values. In this work, we propose modifications to the Lancaster procedure by taking the correlation structure among *p*-values into account. The weight function in the Lancaster procedure allows meaningful biological information to be incorporated into the statistical analysis, which can increase the power of the statistical testing and/or remove the bias in the process. Extensive empirical assessments demonstrate that the modified Lancaster procedure largely reduces the Type I error rates due to correlation among *p*-values, and retains considerable power to detect signals among *p*-values. We applied our method to reassess published renal transplant data, and identified a novel association between B cell pathways and allograft tolerance.

**Keywords: generalized Fisher method (Lancaster procedure), weight function, correlated** *p***-values, multiple hypothesis testing, high dimensional genetic data**

### **INTRODUCTION**

Rapid developments in molecular technology have created high throughput data in search of genetic variants associated with complex traits. As the cost of experiments goes down, the amount of data that can be generated and the resulting complexity of the statistical analysis required to interpret the data go up. The increase of genetic variants requires more statistical testing to be performed simultaneously, which poses a challenge to control the genome wide Type I error rate. False discovery rate (FDR) and its extended methods have been proposed to adjust *p*-values in multiple tests in order to control the genome wide Type I error (Benjamini and Hochberg, 1995; Cheng and Pounds, 2007). However, in large-scale hypothesis testing, these methods often require a very large sample size to maintain power of detecting risk factors.

The global test (also named omnibus test) of *p*-values can combine evidence and turn dimensionality from a curse into rich information. From a systems biology perspective, genes, cells, tissues, and organs function as a system through metabolic networks and cell signaling networks. In non-Mendelian inheritance patterns, such as complex disorders, a subset of genetic variants may jointly confer moderate effects in mediating molecular activities. As a result, signals may not be significant in single marker-single trait analysis, but many such values from related genes might provide valuable information on gene function and regulation. For instance, in pathway analysis (Khatri et al., 2012) and gene set enrichment analysis (Subramanian et al., 2005), multiple genes that work together to serve a particular biological function are often analyzed jointly as a gene set. Several pathway repositories, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al., 2004), PANTHER classification system for protein sequence data (Nikolsky and Bryant, 2009), and Reactome pathways in humans (Matthews et al., 2009) have been established, and are continually being updated. For non-Mendelian diseases and complex traits, identification of isolated genetic variants is insufficient to summarize the complex association with disease. The "most-significant SNPs/genes" approach often detects variants with small effect sizes and odds ratios ranging between 1.3 and 2 (Wacholder et al., 2004). Therefore, integrating information from pathways, gene sets, and networks will provide useful information in understanding the gene regulation mechanism. Furthermore, filtration techniques can be integrated with global testing of *p*-values to remove sets of genetic variants that are not related to traits, and thereby reduce the dimensionality of the data (Dai and Charnigo, 2008; Dai et al., 2012a).

The global test of *p*-values evaluates the pattern (distribution) of *p*-values instead of selecting *p*-values less than an arbitrary threshold. Therefore, this method has the potential to identify multiple genes with small effects. If we assume that all individual tests are independent and arise from genetic variants with no effects, then *p*-values are identically and independently distributed as Uniform (0, 1). Taking this as a null hypothesis for the pattern of *p*-values in the global test, one can assess whether *p*-values, especially small *p*-values, are generated by chance. The global test of *p*-values is robust and can be applied to *p*-values from varying statistical models including *t*-tests, analysis of variance (ANOVA), linear mixed models, and so forth. Multiple simulation studies and case studies have demonstrated that this approach usually has sufficient power to detect signals of genetic association from a group of genes. For instance, Peng et al. (2010) assessed Fisher's combination test, Sidak's combination test, Simes' combination test, and the FDR method using 13 published genome wide association studies (GWAS), and the results indicate that combined *p*-value approaches can identify biologically meaningful pathways associated with disease susceptibility. A review of methods for the global test of *p*-values, developmental trends, and their application to genetic data analysis has been presented by Dai et al. (2012b).

One category of global tests of *p*-values involves combining *p*-values in the form of $\sum_i H(p_i)$, where *p*-values might first be transformed by a function $H$. So far, several statistical methods have been developed to combine *p*-values. Let $p_i$ ($i = 1, 2, \ldots, n$) be independent *p*-values obtained from $n$ hypothesis tests. Under the null hypothesis ($H_0$) that *p*-values follow a Uniform (0, 1) distribution, Fisher (1932) shows that $-2\sum_{i=1}^{n} \ln(p_i)$ follows a chi-square distribution with $2n$ degrees of freedom. For a one-sided test with a nominal error rate of $\alpha$, one can reject the null hypothesis when the test statistic exceeds the $(1 - \alpha) \times 100\%$ percentile of $\chi^2_{2n}$. Stouffer (Stouffer et al., 1949) proposed a *z*-test by transforming *p*-values to standard normal variables, i.e., $\sum_{i=1}^{n} \Phi^{-1}(1 - p_i)/\sqrt{n}$, where $\Phi^{-1}$ is the inverse cumulative distribution function (CDF) of $N(0, 1)$. Under the null hypothesis, the *z*-test statistic follows $N(0, 1)$.
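As a minimal sketch of the two unweighted combining rules just described, the Python below computes Fisher's statistic and Stouffer's *z* with the standard library only. The function names are illustrative and do not come from any package. The chi-square tail uses the closed form that exists for even degrees of freedom, and all *p*-values are assumed to lie strictly inside (0, 1).

```python
import math
from statistics import NormalDist


def fisher_combined(pvalues):
    """Fisher's method: -2 * sum(ln p_i) ~ chi-square with 2n df under H0."""
    n = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    # For even df = 2n the upper tail has a closed form:
    # P(chi2_{2n} > x) = exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!
    half = stat / 2.0
    term, tail = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        tail += term
    return stat, math.exp(-half) * tail


def stouffer_combined(pvalues):
    """Stouffer's z-test: sum(Phi^{-1}(1 - p_i)) / sqrt(n) ~ N(0, 1) under H0."""
    std_normal = NormalDist()
    n = len(pvalues)
    z = sum(std_normal.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(n)
    # One-sided test: large z (small p-values) is evidence against H0.
    return z, 1.0 - std_normal.cdf(z)
```

Both functions return the statistic together with the one-sided combined *p*-value, so they can be compared directly on the same input.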

Although there is no consensus regarding the most powerful method of combining *p*-values, Littell and Folks (1971, 1973) demonstrated that Fisher's method of combining independent tests is asymptotically Bahadur efficient (Bahadur, 1967). Subsequently, weighting schemes have been incorporated into Fisher's method and the *z*-test. Lancaster (1961) generalized the Fisher method by converting independent *p*-values to chi-square variables with $w_i$ degrees of freedom, and he showed that $\sum_{i=1}^{n} \gamma^{-1}_{(w_i/2,2)}(1 - p_i) \sim \chi^2_d$, $d = \sum_i w_i$, under $H_0$, where $\gamma^{-1}_{(w_i/2,2)}$ is the inverse CDF of the Gamma distribution. Mosteller and Bush (1954) proposed a weighted *z*-test, $\sum_i w_i \Phi^{-1}(1 - p_i)/\sqrt{\sum_i w_i^2}$, which follows $N(0, 1)$ under $H_0$.

In a separate paper, we have proved that the Lancaster procedure achieves the optimal Bahadur efficiency. We further demonstrated that the Lancaster procedure yields higher Bahadur efficiency than the weighted *z*-test. The Bahadur efficiency ratio gives the limiting ratio of sample sizes required by two statistics to attain an equally small significance level. Thus, Bahadur efficiency is an important method to compare test statistics. From the perspective of Bahadur efficiency, the Lancaster procedure asymptotically requires a relatively smaller sample size than other weighted *p*-value combining methods. This prompted us to focus on modification of the Lancaster procedure for correlated genetic data in this work.

Although Fisher's method and the Lancaster procedure both achieve the optimal Bahadur efficiency, the Lancaster procedure is more general and can be viewed as a generalized Fisher's method with weighting functions. There are three advantages to carefully selecting appropriate weight functions in genetic data analysis. Firstly, weight functions allow incorporation of prior biological information. Genetic data are complex and can be measured from different sources. Thus, weight functions can be used as a tool to incorporate meaningful information from different sources in order to interpret and derive biological insight from gene expression profiles. Wu and Lin (2009) provide a review of statistical methods for analysis of microarray data by incorporating prior biological knowledge using gene sets and biological pathways, which consist of groups of biologically similar genes. They show that the use of prior knowledge has led to a better understanding of the biological mechanisms underlying phenotypic responses. Secondly, weight functions can be used to remove bias. For instance, larger genes may contain more probes and/or SNPs. Therefore, larger genes will exert a stronger influence on the *p*-value combining methods as compared to smaller genes (Wang et al., 2007). To avoid this bias, one can consider a weight function to adjust for gene size when combining *p*-values. We will illustrate this approach in sections Empirical Assessments and Case Study: Renal Transplant Tolerance Data. Thirdly, as suggested by Benjamini and Hochberg (1997) and Genovese et al. (2006), procedures that assign weights positively associated with the underlying alternative hypotheses will usually improve power. Therefore, one needs to carefully choose an appropriate weight function, based either on biological knowledge or on statistical hypotheses. An arbitrary weight is inappropriate for the Lancaster procedure.

In this work, we will provide modifications to the Lancaster procedure to accommodate correlation structures among *p*-values. The proposed method provides a generalization of Fisher's method with a weight function and can be used in pathway analysis and gene set enrichment analysis for a variety of genetic data including microarray gene expression data, GWAS data, and next generation sequencing data. In essence, investigators first dissect genetic variants by biological functions or prior knowledge, then combine the *p*-values from these gene sets to identify whether a proportion of genetic variants are associated with traits.

#### **CORRELATED LANCASTER PROCEDURES**

In this section, we allow *p*-values to be correlated. Consider a Lancaster test statistic $T = \sum_{i=1}^{n} \gamma^{-1}_{(w_i/2,2)}(1 - p_i)$, where $\gamma^{-1}_{(w_i/2,2)}$ is the inverse CDF of the Gamma distribution with shape parameter $w_i/2$ and scale parameter 2. This transformation converts $p_i \sim \text{Uniform}(0, 1)$ to a chi-square variable, i.e., $\gamma^{-1}_{(w_i/2,2)}(1 - p_i) \sim \chi^2_{w_i}$, where $\chi^2_{w_i}$ is a chi-square distribution with $w_i > 0$ degree(s) of freedom. The parameter $w_i$ serves as a weight function to adjust the individual *p*-values. When *p*-values are independent, $T$ has an exact chi-square distribution with $\sum_{i=1}^{n} w_i$ degrees of freedom.
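The Lancaster statistic for independent *p*-values can be sketched as follows. This is an illustrative, standard-library-only implementation rather than the authors' code. The helper names (`chi2_sf`, `chi2_isf`, `lancaster_independent`) are assumptions, the incomplete gamma function is evaluated by its power series, and the quantile is found by bisection, so the sketch is only intended for *p*-values that are not extremely small. In practice a statistics library would normally supply these two functions (for example `scipy.stats.chi2.sf` and `scipy.stats.chi2.isf`).

```python
import math


def _lower_reg_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    k = 1
    while abs(term) > 1e-16 * abs(total) and k < 100000:
        term *= x / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(x) - x - math.lgamma(a)) * total


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df > 0."""
    return 1.0 - _lower_reg_gamma(df / 2.0, x / 2.0)


def chi2_isf(p, df):
    """Solve chi2_sf(x, df) = p for x by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lancaster_independent(pvalues, weights):
    """Return T and its p-value; the p-value is valid only under independence."""
    # The (1 - p) quantile of chi-square with w df equals the inverse
    # Gamma(w/2, scale 2) CDF evaluated at 1 - p.
    t_stat = sum(chi2_isf(p, w) for p, w in zip(pvalues, weights))
    return t_stat, chi2_sf(t_stat, sum(weights))
```

With every weight equal to 2 the transform reduces to $-2\ln(p_i)$, so the sketch reproduces Fisher's method as a special case.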

For correlated *p*-values, $T = \sum_{i=1}^{n} \gamma^{-1}_{(w_i/2,2)}(1 - p_i)$ does not follow $\chi^2_{\sum_{i=1}^{n} w_i}$. The distribution of $T$ does not have an explicit analytical form. To address this issue, we consider a Satterthwaite approximation by approximating a scaled $T$ statistic with a new chi-square distribution (Li et al., 2011). Let $cT \approx \chi^2_\nu$, where $c > 0$ is a scalar and $\nu > 0$ is the degrees of freedom for the approximated chi-square distribution. Note that

$$\begin{split} E(T) &= E\left(\sum_{i=1}^{n} \gamma_{(w_i/2,2)}^{-1} \left(1 - p_i\right)\right) = \sum_{i=1}^{n} w_i \text{ and} \\ \text{Var}(T) &= \text{var}\left(\sum_{i=1}^{n} \gamma_{(w_i/2,2)}^{-1} \left(1 - p_i\right)\right) \\ &= \sum_{i=1}^{n} \text{var}\left(\gamma_{(w_i/2,2)}^{-1} \left(1 - p_i\right)\right) + 2 \sum_{i<j} \rho_{ij}, \end{split}$$

where $\rho_{ij} = \text{cov}\left(\gamma^{-1}_{(w_i/2,2)}(1 - p_i), \gamma^{-1}_{(w_j/2,2)}(1 - p_j)\right)$ takes the correlations among *p*-values into account.

We propose the following five approaches to approximate the distribution of $T$. In approximation (A), we use the Satterthwaite method to match the mean and variance of $cT$ and $\chi^2_\nu$, and then solve the equations to derive $c$ and $\nu$. Koziol (1996) has proposed multiple methods to approximate the Lancaster procedure, but these approximations require the assumption of independence. In approximations (B)–(E), we extend the work of Koziol (1996) to correlated data by first approximating $cT$ with $\chi^2_\nu$ and then approximating $\chi^2_\nu$ using varying methods.
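The moment matching behind approximation (A) can be written out explicitly. The short derivation below is a worked sketch that uses only $E(T)$ and $\text{Var}(T)$ as defined above, together with the standard facts that a chi-square variable with $\nu$ degrees of freedom has mean $\nu$ and variance $2\nu$:

$$\begin{split} E(cT) &= c\,E(T) = \nu, \qquad \text{Var}(cT) = c^2\,\text{Var}(T) = 2\nu \\ \Rightarrow \quad c &= \frac{\nu}{E(T)}, \qquad \frac{\nu^2}{[E(T)]^2}\,\text{Var}(T) = 2\nu \quad \Rightarrow \quad \nu = \frac{2[E(T)]^2}{\text{Var}(T)}. \end{split}$$

As a check, when the *p*-values are independent every $\rho_{ij} = 0$, so $\text{Var}(T) = 2\sum_i w_i = 2E(T)$. This gives $\nu = \sum_i w_i$ and $c = 1$, which recovers the exact chi-square distribution of the independent case. Positive covariances increase $\text{Var}(T)$ and therefore reduce both $\nu$ and $c$.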

• *TA* approximation.

Correlation among *p*-values is taken into consideration, and then Satterthwaite's approximation is used (Patnaik, 1949) to derive new degrees of freedom:

$$T_A = cT \approx \chi_\nu^2, \quad \text{where } c = \frac{\nu}{E(T)} \text{ and } \nu = 2 \frac{[E(T)]^2}{\text{var}(T)}.$$

• *TB* approximation.

$cT$ is first approximated by $\chi^2_\nu$, followed by Fisher's approximation (Fisher, 1922) to $\chi^2_\nu$:

$$T_B = \sqrt{2\frac{\nu T}{E(T)}} \approx N(\sqrt{2\nu - 1}, 1).$$

• *TC* approximation.

After approximating $cT$ by $\chi^2_\nu$, the Wilson–Hilferty approximation (Wilson and Hilferty, 1931) is applied to $\chi^2_\nu$.

$$\text{Let } T_C = \sqrt[3]{\frac{T}{E(T)}}, \text{ then } T_C \approx N \left(1 - 2/(9\nu), \sqrt{2/(9\nu)}\right).$$

• *TD* approximation.

Approximate $cT$ by $\chi^2_\nu$, followed by the Cornish–Fisher expansion (Fisher and Cornish, 1960) to $\chi^2_\nu$. Let $x_\alpha$ denote the $\alpha$-percentage point of the standard normal distribution, that is, $\Phi(x_\alpha) = \alpha$. It follows that the corresponding percentage point for $T_D = \nu T/E(T)$ is given by

$$\begin{split} &\nu + \sqrt{2\nu}\,x_{\alpha} + \frac{2}{3}(x_{\alpha}^{2} - 1) + \frac{x_{\alpha}^{3} - 7x_{\alpha}}{9\sqrt{2\nu}} - \frac{6x_{\alpha}^{4} + 14x_{\alpha}^{2} - 32}{405\nu} \\ &+ \frac{9x_{\alpha}^{5} + 256x_{\alpha}^{3} - 433x_{\alpha}}{4860\nu\sqrt{2\nu}}. \end{split}$$

• *TE* approximation.

Approximate $cT$ by $\chi^2_\nu$, then perform the saddle point approximation (Lugannani and Rice, 1980) to $\chi^2_\nu$. Let $T_E = T/E(T)$. Then $\Pr(T_E \le y) = \Phi(a_y) - \phi(a_y)\left(b_y^{-1} - a_y^{-1}\right)$ for $y \ne 1$ and $\Pr(T_E \le 1) = 0.5 - (3\sqrt{\pi\nu})^{-1}$, where $a_y = \sqrt{2\nu(y t_y - K(t_y))}\,\text{sign}(t_y)$, $b_y = t_y\sqrt{\nu K''(t_y)}$, $K(t) = -0.5\log(1 - 2t)$, and $t_y = (y - 1)/(2y)$.
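To make the first three approximations concrete, the sketch below computes the Satterthwaite constants and then the upper-tail *p*-values implied by the $T_B$ and $T_C$ normal approximations. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the covariances $\rho_{ij}$ are supplied by the caller as a dictionary keyed by index pairs with $i < j$, and only the standard library is used (which is why $T_A$, requiring a chi-square tail with non-integer degrees of freedom, is not evaluated here).

```python
import math
from statistics import NormalDist


def satterthwaite(weights, cov):
    """Match the mean and variance of c*T to a chi-square with v df.

    weights: the w_i, so E(T) = sum(w_i) and each term has variance 2*w_i.
    cov: dict mapping (i, j) with i < j to rho_ij; missing pairs count as 0.
    """
    mean_t = float(sum(weights))
    var_t = 2.0 * mean_t + 2.0 * sum(cov.values())
    v = 2.0 * mean_t ** 2 / var_t
    c = v / mean_t
    return c, v, mean_t


def pvalue_tb(t_stat, v, mean_t):
    """T_B: sqrt(2*v*T/E(T)) is roughly N(sqrt(2v - 1), 1); needs v > 0.5."""
    t_b = math.sqrt(2.0 * v * t_stat / mean_t)
    return 1.0 - NormalDist().cdf(t_b - math.sqrt(2.0 * v - 1.0))


def pvalue_tc(t_stat, v, mean_t):
    """T_C: cube root of T/E(T) is roughly N(1 - 2/(9v), sd sqrt(2/(9v)))."""
    t_c = (t_stat / mean_t) ** (1.0 / 3.0)
    approx = NormalDist(1.0 - 2.0 / (9.0 * v), math.sqrt(2.0 / (9.0 * v)))
    return 1.0 - approx.cdf(t_c)
```

For two independent tests with weights of 2, `satterthwaite([2, 2], {})` returns $c = 1$ and 4 degrees of freedom, so the approximations collapse to the usual approximations of a $\chi^2_4$ variable. Adding a positive covariance lowers the degrees of freedom and makes the same observed $T$ less significant.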

When the covariance $\rho_{ij}$ is unknown, one can use the permutation approach to estimate $\rho_{ij}$ by shuffling the phenotype variable among subjects. For the $k$th permutation ($k = 1, 2, \ldots, m$), we keep the genetic variants within the subject to preserve the correlation structure, then randomly assign the phenotype variable to subjects. Individual hypothesis testing can be done on all $n$ genetic variants separately to generate the *p*-value vector $p^k = (p_1^k, p_2^k, \ldots, p_n^k)^t$. The permutation is repeated $m = 1000$ times, and $\rho_{ij}$ is estimated from $(p^1, p^2, \ldots, p^m)$.
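The permutation scheme above can be sketched as follows. Several simplifications are assumptions made for illustration only: every weight is taken to be $w_i = 2$, so the transformed value is simply $-2\ln(p_i)$; the per-variant test (`correlation_pvalue`) is a hypothetical large-sample correlation test standing in for whatever individual test an analysis actually uses; and genotype columns are assumed to be non-constant.

```python
import math
import random
from statistics import NormalDist


def correlation_pvalue(x, y):
    """Illustrative two-sided test of zero correlation (normal approximation)."""
    k = len(x)
    mx, my = sum(x) / k, sum(y) / k
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    z = sxy / math.sqrt(sxx * syy) * math.sqrt(k - 1)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return min(max(p, 1e-300), 1.0)  # guard against log(0) downstream


def permutation_covariance(genotypes, phenotype, pvalue_fn, m=1000, seed=0):
    """Estimate cov(-2 ln p_i, -2 ln p_j) by permuting the phenotype.

    genotypes: list of n variants, each a list with one value per subject.
    phenotype: list with one value per subject.
    """
    rng = random.Random(seed)
    n = len(genotypes)
    shuffled = list(phenotype)
    rows = []
    for _ in range(m):
        # Only the phenotype is shuffled; genotypes stay attached to their
        # subjects, which preserves the correlation among variants.
        rng.shuffle(shuffled)
        rows.append([-2.0 * math.log(pvalue_fn(g, shuffled)) for g in genotypes])
    means = [sum(r[i] for r in rows) / m for i in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
            cov[i][j] = cov[j][i] = s / (m - 1)
    return cov
```

The off-diagonal entries of the returned matrix play the role of $\rho_{ij}$ in $\text{Var}(T)$. For general weights, the $-2\ln(p)$ transform would be replaced by the inverse Gamma CDF transform with the corresponding $w_i$.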

The accuracy of the five approximate distributions to the correlated Lancaster procedure is then assessed using *p*-values with varying correlation structures. We consider six different types of correlation structures, including fixed and random compound symmetric as well as random positive definite variance-covariance structures for $\Sigma$. Let $I$ be an identity matrix, $\mathbf{1}$ be a vector of 1s, $\otimes$ be the Kronecker product, and superscript $t$ denote transposition. In Cases I–V, let $\Sigma = \text{Block} \otimes I_{20}$ be compound symmetric variance matrices with 20 blocks of size 5, where $\text{Block} = \mathbf{1}_5 \mathbf{1}_5^t \rho + (1 - \rho)I_5$. We vary $\rho$ over two fixed values, with $\rho = 0.3$ for moderate dependence and $\rho = 0.6$ for strong dependence. In addition, we simulate random correlation coefficients from beta and uniform distributions, i.e., $\rho \sim \beta(0.3, 1.5)$ and $\rho \sim \text{uniform}(-0.2, 0.2)$, which ensures that the 20 variance blocks have distinct correlation coefficients $\rho$ within $\Sigma$. More generally, we consider random positive definite correlation matrices that vary across samples and simulation runs.
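A minimal sketch of the compound symmetric structures is given below. The helper name and the seed are arbitrary choices for illustration. The matrix is laid out here in block-diagonal order, which differs from the Kronecker ordering written above only by a relabeling of the indices.

```python
import random


def compound_symmetric_blocks(rhos, block_size=5):
    """Block-diagonal matrix whose b-th block is rho_b*1*1^t + (1 - rho_b)*I."""
    dim = len(rhos) * block_size
    sigma = [[0.0] * dim for _ in range(dim)]
    for b, rho in enumerate(rhos):
        start = b * block_size
        for i in range(start, start + block_size):
            for j in range(start, start + block_size):
                sigma[i][j] = 1.0 if i == j else rho
    return sigma


rng = random.Random(2014)  # arbitrary seed, for reproducibility only

# Fixed correlation within every block: moderate and strong dependence.
sigma_moderate = compound_symmetric_blocks([0.3] * 20)
sigma_strong = compound_symmetric_blocks([0.6] * 20)

# One random correlation coefficient per block.
sigma_beta = compound_symmetric_blocks(
    [rng.betavariate(0.3, 1.5) for _ in range(20)])
sigma_uniform = compound_symmetric_blocks(
    [rng.uniform(-0.2, 0.2) for _ in range(20)])
```

A compound symmetric block of size 5 is positive definite whenever $-1/4 < \rho < 1$, so the uniform range $(-0.2, 0.2)$ stays inside the admissible region.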

The quantile-quantile (Q-Q) plot assessing the accuracy of the proposed methods when the correlation coefficient $\rho = 0.3$ is shown in **Figure 1**. For clarity, the Lancaster statistic $T$ that combines $n$ *p*-values is renamed as $T_n^{\text{Lancaster}}$ in **Figure 1**. For the original Lancaster procedure under the independence assumption, the general trend of the Q-Q plot is flatter than the reference line $y = x$, indicating that the limiting distribution for the test statistic in the original Lancaster procedure is less dispersed than the distribution of $T_n^{\text{Lancaster}}$ under correlation structures. As a result, the original Lancaster procedure will have severely inflated Type I errors. In contrast, the five approximations ($T_A, \ldots, T_E$) match the underlying distribution of $T_n^{\text{Lancaster}}$. For data with stronger internal correlation, $T_A$, $T_D$, and $T_E$ better approximate $T_n^{\text{Lancaster}}$. The Q-Q plots under other correlation structures are similar to **Figure 1**. To save space, these similar results are not shown, but can be provided upon request.

### **EMPIRICAL ASSESSMENTS**

We assess the Type I error rates and power for the proposed correlated Lancaster procedures and compare them to the independent Lancaster procedure (Lancaster, 1961). SNPs from a pathway of haploid GWAS are simulated using linkage disequilibrium (LD) (Li et al., 2011). Let $q_1$ and $q_2$ be the minor allele frequencies (MAFs) at loci 1 and 2. Assuming Hardy–Weinberg equilibrium, the genotype at locus 1 can be randomly generated using a binomial distribution. Given the distribution of the SNP at locus 1, one can simulate the genotype at locus 2. To do so, let $D$ be a measure of LD. Then the conditional probability for the genotype at locus 2 given the genotype at locus 1 can be expressed as $P(A|B) = [q_A q_B + D]/q_B$, $P(a|B) = [(1 - q_A)q_B - D]/q_B$, $P(A|b) = [q_A(1 - q_B) - D]/(1 - q_B)$, and $P(a|b) = [(1 - q_A)(1 - q_B) + D]/(1 - q_B)$, where $A$ and $B$ represent the minor alleles at the two loci, with frequencies $q_A$ and $q_B$. For a diploid genome, a similar idea can be applied; simulation details can be found in Cui et al. (2008). We simulate a pathway with 5 genes with varying numbers of SNPs in each gene, listed in parentheses, i.e., G1(12), G2(8), G3(5), G4(3), G5(2). The MAF of each SNP was set to be 0.3. We simulate different levels of LD for SNPs from the same gene with $D$ = 0, 0.15, 0.2, and uniform(0, maximum of LD). The values $D$ = 0, 0.15, and 0.2 correspond to no LD, moderate LD, and very strong LD among SNPs, with corresponding correlations $R$ = 0, 0.71, and 0.95. Six scenarios for disease susceptibility (*p*) are simulated.
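The conditional probabilities above translate directly into a small haplotype simulator. The sketch below is illustrative only: the function names and the seed are assumptions, and the correlation is computed with the standard relation $R = D/\sqrt{q_A(1-q_A)q_B(1-q_B)}$, which reproduces the correlations 0.71 and 0.95 quoted in the text when both MAFs are 0.3 and $D$ is 0.15 and 0.2.

```python
import math
import random


def conditional_probs(q_a, q_b, d):
    """P(allele at one locus | allele at the other) for a given LD value d."""
    return {
        ("A", "B"): (q_a * q_b + d) / q_b,
        ("a", "B"): ((1 - q_a) * q_b - d) / q_b,
        ("A", "b"): (q_a * (1 - q_b) - d) / (1 - q_b),
        ("a", "b"): ((1 - q_a) * (1 - q_b) + d) / (1 - q_b),
    }


def ld_correlation(q_a, q_b, d):
    """Correlation between the two loci implied by the LD coefficient d."""
    return d / math.sqrt(q_a * (1 - q_a) * q_b * (1 - q_b))


def simulate_haplotypes(q_a, q_b, d, size, seed=0):
    """Draw (conditioning allele, dependent allele) pairs for `size` haplotypes."""
    rng = random.Random(seed)
    probs = conditional_probs(q_a, q_b, d)
    haplotypes = []
    for _ in range(size):
        first = "B" if rng.random() < q_b else "b"
        second = "A" if rng.random() < probs[("A", first)] else "a"
        haplotypes.append((first, second))
    return haplotypes
```

With both MAFs equal to 0.3, the largest admissible $D$ is $0.3 \times 0.7 = 0.21$, so $D = 0.2$ is close to complete LD.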


**Table 1 | Type I error and power for independent Lancaster Procedure and five approximations to correlated Lancaster Procedures when sample size = 200 and linkage disequilibrium** *D* **= 0***.***15.**

| Case | β | Independent Lancaster procedure | $T_A$ | $T_B$ | $T_C$ | $T_D$ | $T_E$ |
|---|---|---|---|---|---|---|---|
| **CASE I** | 0 | *0.101* | 0.038 | 0.042 | 0.039 | 0.039 | 0.038 |
| | 0.4 | 0.999 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 |
| | 0.6 | 1 | 1.000 | 1 | 1 | 1 | 1 |
| **CASE II** | 0 | *0.1* | 0.037 | 0.041 | 0.038 | 0.038 | 0.037 |
| | 0.4 | 0.947 | 0.863 | 0.875 | 0.864 | 0.865 | 0.863 |
| | 0.6 | 0.997 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 |
| **CASE III** | 0 | *0.078* | 0.038 | 0.038 | 0.038 | 0.038 | 0.038 |
| | 0.4 | 0.735 | 0.506 | 0.522 | 0.508 | 0.507 | 0.506 |
| | 0.6 | 0.961 | 0.864 | 0.876 | 0.866 | 0.866 | 0.863 |
| **CASE IV** | 0 | *0.107* | 0.046 | 0.051 | 0.046 | 0.047 | 0.046 |
| | 0.4 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 |
| | 0.6 | 1 | 1 | 1 | 1 | 1 | 1 |
| **CASE V** | 0 | *0.084* | 0.036 | 0.038 | 0.037 | 0.037 | 0.036 |
| | 0.4 | 0.884 | 0.71 | 0.724 | 0.71 | 0.711 | 0.71 |
| | 0.6 | 0.989 | 0.952 | 0.957 | 0.953 | 0.953 | 0.952 |
| **CASE VI** | 0 | *0.084* | 0.036 | 0.038 | 0.037 | 0.037 | 0.036 |
| | 0.4 | 0.741 | 0.57 | 0.585 | 0.572 | 0.572 | 0.568 |
| | 0.6 | 0.953 | 0.898 | 0.904 | 0.898 | 0.898 | 0.898 |

*A weight function is applied to adjust for the gene size\*.*

*\*The nominal error rate is set to be 0.05. Type I error rates are listed when β = 0. Power is listed when β > 0. Inflated Type I error rates are italicized.*

*\*A weight function* $w_i = 2/\sqrt{n_i}$ *is applied to each test to adjust for the size of gene.*

Weight functions can be used to remove potential bias when combining *p*-values. Wang et al. (2007) and others have noted that larger genes contain more probes and/or SNPs. Therefore, larger genes may exert a stronger influence on the *p*-value combining methods compared to smaller genes. To avoid this bias, we set the weight function $w_i = 2/\sqrt{n_i}$, where $n_i$ is the number of SNPs in the $i$th gene. When $n_i = 1$, $\gamma^{-1}_{(w_i/2,2)}(1 - p_i)$ transforms the *p*-value into a variable with a $\chi^2_2$ distribution.

We simulate data with sample sizes *n* = 200 (**Tables 1**, **4**) and *n* = 400 (**Tables 2**, **3**), respectively. For simplicity, we assume the same effect size for all of the regression coefficients. For each set of data, we perform the original and modified Lancaster procedures to assess the pathway data by combining *p*-values from individual tests. We set nominal error rate to be 0.05. The simulation is repeated 1000 times.

Due to LD, SNPs from the same gene are correlated. We first assess the Type I error rate of the test statistics by testing $H_0: \beta_1 = \ldots = \beta_6 = 0$. As shown in **Tables 1**, **2**, the Type I error rate for the original Lancaster procedure is inflated (>0.05) in all six cases. In contrast, the five modified Lancaster procedures ($T_A$–$T_E$) have well-controlled Type I error rates (at or below roughly 0.05).

**Table 2 | Type I error and power for independent Lancaster Procedure and five approximations to correlated Lancaster Procedures when sample size = 400 and linkage disequilibrium** *D* **= 0***.***20.**


*A weight function is applied to adjust for the gene size\*.*

*\*The nominal error rate is set to be 0.05. Type I error rates are listed when β = 0. Power is listed when β > 0. Inflated Type I error rates are italicized.*

*\*A weight function* $w_i = 2/\sqrt{n_i}$ *is applied to each test to adjust for the size of gene.*

The power of all test statistics was compared for regression coefficient values set at β = 0.4 and β = 0.6, respectively. The results in **Tables 1**, **2** suggest strong and comparable power among the modified Lancaster procedures. In most simulated cases, the proposed methods have more than 80% power to detect β = 0.4. When the effect size increases to β = 0.6, the power of the proposed methods increases to 90% or above. The power of these tests also improves as the sample size increases from *n* = 200 to *n* = 400.

We simulate different levels of LD for SNPs with $D$ = 0, 0.15, 0.2, and uniform(0, maximum of LD). To save space, we only show the results for $D$ = 0.15 (**Table 3**) and $D$ = 0.2 (**Tables 1**, **2**). Our findings show that the inflation of the Type I error rate for the original Lancaster procedure becomes more severe when LD is strong (**Tables 1**, **2**). The modified Lancaster procedures ($T_A$–$T_E$) have well-controlled Type I error rates and power for both moderate and strong LD (**Tables 1**–**3**).

**Table 3 | Type I error and power for independent Lancaster Procedure and five approximations to correlated Lancaster Procedures when sample size = 400 and linkage disequilibrium** *D* **= 0***.***15.**


*A weight function is applied to adjust for the gene size\*.*

*\*The nominal error rate is set to be 0.05. Type I error rates are listed when β = 0. Power is listed when β > 0. Inflated Type I error rates are italicized.*

*\*A weight function* $w_i = 2/\sqrt{n_i}$ *is applied to each test to adjust for the size of gene.*

In **Table 4**, we assess the performance of all tests without a weighting function. We then compare the results in **Table 4** (without a weight function) vs. **Table 1** (with a weight function). All other simulation parameters are held the same in **Tables 1**, **4**. We note that the original Lancaster procedure without a weighting function (**Table 4**) tends to have higher Type I error rates than the original Lancaster procedure with a weighting function (**Table 1**). For modified tests (*TA* − *TE*), the power is increased when a weighting function is used. This confirms that an appropriate weight function is beneficial to the Lancaster procedure.

#### **CASE STUDY: RENAL TRANSPLANT TOLERANCE DATA**

We revisited a kidney transplant dataset first collected and analyzed by Newell et al. (2010). Data were downloaded from the GEO website with ID = GDS4266 (http://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4266). A group of tolerant renal transplant recipients (Tolerant, *n* = 19), as defined by stable graft function in the absence of immunosuppression for more than 1 year, were compared to subjects with stable graft function who were receiving standard immunotherapy (SI, *n* = 27) as well as to a group of healthy controls (Control, *n* = 12). Gene expression profiles of whole-blood total RNA from all subjects were measured by microarray. The goal of the study was to identify genetic variants associated with long-term allograft survival without the requirement for continuous immunosuppression, a condition known as allograft tolerance. Newell et al. (2010) performed statistical analysis to identify differentially expressed genes between the SI group and the Tolerant group. The results revealed a critical role for B cells in regulating alloimmunity, and provided a candidate set of genes for wider-scale screening of renal transplant recipients. However, no comprehensive pathway analysis was conducted by this group (Newell et al., 2010).

**Table 4 | Type I error and power for independent Lancaster Procedure and five approximations to correlated Lancaster Procedures when sample size = 200 and linkage disequilibrium** *D* **= 0***.***20.**


*No weight function is applied to adjust for the gene size\*.*

*\*The nominal error rate is set to be 0.05. Type I error rates are listed when β = 0. Power is listed when β > 0. Inflated Type I error rates are italicized.*

*\*These are the unweighted tests with* $w_i = 2$ *for all genes. We do not adjust for the size of genes.*

To further understand the molecular mechanisms underlying renal allograft tolerance, we applied the modified Lancaster procedure to this dataset to identify candidate cellular pathways. Gene expression levels were normalized using the Robust Multichip Average (RMA) preprocessing methodology, which included background subtraction, quantile normalization, and summarization via median polish.

Gene expression levels were summarized for a total of 54,675 probes from 21,049 genes. Expression levels were compared among the three groups using the Bioconductor "limma" package. Three pairwise comparisons were conducted: SI vs. Control, SI vs. Tolerant, and Tolerant vs. Control. The three comparisons were then combined into one *F*-test. This is equivalent to a One-Way ANOVA for each gene except that the residual mean squares have been moderated across genes. *P*-values from multiple hypothesis testing were adjusted by FDR (Benjamini and Hochberg, 1995). Our list of differentially expressed genes is consistent with the previously published work. See Newell et al. (2010) for the gene-level findings.
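The FDR adjustment cited above (Benjamini and Hochberg, 1995) can be illustrated with a minimal Python sketch. It is not the code used in this analysis; the function name `bh_adjust` and the example *p*-values are hypothetical, chosen only to show the step-up calculation.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes (illustration only).
pvals = [0.01, 0.04, 0.03, 0.20]
adj = bh_adjust(pvals)
```

A gene is then declared significant at a given FDR level when its adjusted *p*-value falls below that level.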

Although Newell et al. (2010) identified a set of differentially expressed genes, our analysis demonstrates that these significant genes have small effect sizes with fold changes <1.5. Therefore, a limited number of individual genes in the absence of a biological context is inadequate to explain the total variation of allograft tolerance among renal transplant patients.

**Table 5 | Top 10 significant pathways detected by the modified Lancaster procedure (***TA***).**

To address this issue, we performed the modified Lancaster procedure (*TA*) as described in Section Correlated Lancaster Procedures to combine *p*-values within pathways. Combining *p*-values allows us to integrate small effects within a pathway and gain statistical power. A total of 1454 Gene Ontology human pathway gene sets were analyzed. The size of pathways ranged from 9 genes to 2131 genes, with a median of 27 genes per pathway. Also, the number of probes per gene was highly variable. In order to map genes to pathways, we removed genes without gene symbols from the analysis. Among 21,049 genes with gene symbols, approximately 48% (*n* = 10,161) of genes were interrogated with a single probe, 26% (*n* = 5,389) of genes were queried using 2 probes, and 14% (*n* = 2,842) of genes were assessed by 3 probes. Each of the remaining genes had 4 or more probes (range: 4–17). This finding indicates that larger genes would have more *p*-values and a stronger impact on pathway analysis. To prevent this bias, we set the weight function as $w_i = 2/\sqrt{n_i}$, where $n_i$ is the number of probes for the *i*th gene.
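As a small illustration of this weight function, the following Python sketch computes $w_i = 2/\sqrt{n_i}$ from per-gene probe counts. The gene names and probe counts are hypothetical placeholders, not values from the dataset.

```python
import math


def gene_weight(n_probes):
    """Weight w_i = 2 / sqrt(n_i) for a gene measured by n_i probes."""
    if n_probes < 1:
        raise ValueError("a gene needs at least one probe")
    return 2.0 / math.sqrt(n_probes)


# Hypothetical probe counts per gene (illustration only).
probe_counts = {"GENE_A": 1, "GENE_B": 4, "GENE_C": 16}
weights = {gene: gene_weight(n) for gene, n in probe_counts.items()}
```

A single-probe gene keeps the unweighted value of 2, while the weight shrinks as the number of probes grows, so genes with many probes do not dominate the combined statistic.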

We performed pathway analysis for the One-Way ANOVA test and the three pairwise comparisons. The top 10 significant pathways based on the One-Way ANOVA test are listed in **Table 5**. The top two pathways, B cell differentiation (GO:0030183) and B cell activation (GO:0042113), confirm the signature of B cell involvement described by Newell et al. (2010). Furthermore, we identified other pathways related to B cell activation and function. These include antigen binding (GO:0003823), MAP kinase kinase kinase activity (GO:0004709) and lymphocyte differentiation (GO:0030098). These pathways are biologically consistent with the proposed role of B-lymphocytes in renal transplant tolerance reported by Newell et al. (2010). In contrast, when we performed the traditional Fisher's method without considering correlation structures (LD) within pathways or applying a weight function to compensate for variability in the number of probes per gene, the result was a list of larger pathways, some containing >1000 genes, describing more general cellular processes and not specifically related to immune functions (See **Table 6**, #gene and #probe). Furthermore, when comparing the SI group and the Control group, the traditional method identified 1078 significant pathways while our proposed method narrowed the list down to 64 significant pathways (adjusted *p*-value <0.05). The increase in the number of significant pathways identified by the traditional approach is primarily due to false positive discoveries, and is consistent with the inflation of the Type I error rate presented in Section Empirical Assessments. Thus, by accounting for correlation structures (LD) within pathways and the number of probes per gene, our proposed method minimized identification of larger, nonspecific cellular process pathways, and instead revealed more focused and functionally relevant biological pathways implicating a role for a humoral immune response in immunotolerance to renal transplants (See **Table 5**, #gene and #probe).
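For reference, the traditional Fisher's method used as the comparison above can be sketched in a few lines of Python. This is a minimal sketch, not the code used in the case study: it assumes independent *p*-values that are strictly positive, and the helper names are hypothetical. Because the reference distribution has even degrees of freedom (2*k* for *k* *p*-values), its upper tail has a closed form and no statistics library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Closed form: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combine(pvalues):
    """Fisher's statistic -2 * sum(log p) and its p-value under independence."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return stat, chi2_sf_even_df(stat, 2 * len(pvalues))


# Two hypothetical p-values from one pathway (illustration only).
stat, combined_p = fisher_combine([0.1, 0.1])
```

The independence assumption is exactly what fails when *p*-values within a pathway are correlated, which is why the unmodified test can report too many significant pathways.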

#### **DISCUSSION AND CONCLUSIONS**

Modifications to the Lancaster procedure are proposed to take correlations among *p*-values into account. Extensive simulation studies show that the original Lancaster procedure has inflated Type I error rates due to correlation among *p*-values. By using a permutation approach to estimate the correlation among *p*-values, the proposed methods have well-controlled Type I error rates and maintain strong power to detect signals related to SNPs in pathways.

Among the five proposed approximation methods (*TA*,..., *TE*), the Satterthwaite approximation (*TA*) is the most computationally efficient. The other approximation methods (*TB*,..., *TE*) are based on the Satterthwaite approximation. Therefore, we recommend using the Satterthwaite approximation (*TA*) as the standard procedure to modify the Lancaster procedure. Among the other approximation methods, simulation results in Section Correlated Lancaster Procedures show that, for data with stronger internal correlation, *TD* and *TE* provide better approximations than *TB* and *TC*. Our simulation study and the case study further provide evidence that *TD* tends to have slightly higher power than the Satterthwaite approximation *TA*. The R code for the five approximations is posted at http://d.web.umkc.edu/daih/.
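The moment matching behind a Satterthwaite-type approximation can be sketched as follows. This is an illustration under stated assumptions rather than the posted R code: it assumes the combined statistic *T* is approximated by a scaled chi-square variable $c\chi^2_v$ whose mean $cv$ and variance $2c^2v$ are matched to the mean and variance of *T*, and it assumes the weights act as chi-square degrees of freedom. The function name and the covariance value of 3.0 are hypothetical.

```python
def satterthwaite(mean_t, var_t):
    """Match c * chi2(v) to a given mean and variance: c*v = mean, 2*c^2*v = var."""
    if mean_t <= 0 or var_t <= 0:
        raise ValueError("mean and variance must be positive")
    scale = var_t / (2.0 * mean_t)
    df = 2.0 * mean_t ** 2 / var_t
    return scale, df


# Hypothetical weights; under independence the mean is sum(w) and the variance 2*sum(w).
weights = [2.0, 1.0, 0.5]
mean_indep = sum(weights)
var_indep = 2.0 * sum(weights)
scale, df = satterthwaite(mean_indep, var_indep)

# Positive correlation adds covariance to the variance (3.0 is a made-up value).
scale_c, df_c = satterthwaite(mean_indep, var_indep + 3.0)
```

Under independence the sketch returns a scale of 1 and degrees of freedom equal to the sum of the weights, recovering the uncorrelated reference distribution. With positive covariance the scale grows and the effective degrees of freedom shrink, which is how such a correction counteracts the inflation seen in the uncorrected test.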

### **ACKNOWLEDGMENTS**

We thank two reviewers for their constructive comments, which helped us improve the manuscript. This work was supported in part by NSF grant DMS-1209112 (to Yuehua Cui).

#### **REFERENCES**


models in selection of optimal number of target genes. *BioData Min.* 5:3. doi: 10.1186/1756-0381-5-3


Wu, M. C., and Lin, X. (2009). Prior biological knowledge-based approaches for the analysis of genome-wide expression profiles using gene sets and pathways. *Stat. Methods Med. Res.* 18, 577–593. doi: 10.1177/0962280209351925

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 November 2013; accepted: 27 January 2014; published online: 20 February 2014.*

*Citation: Dai H, Leeder JS and Cui Y (2014) A modified generalized Fisher method for combining probabilities from dependent tests. Front. Genet. 5:32. doi: 10.3389/fgene.2014.00032*

*This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Dai, Leeder and Cui. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*