# RECENT ADVANCES IN SYMBIOSIS RESEARCH: INTEGRATIVE APPROACHES

EDITED BY: M. Pilar Francino and Mónica Medina PUBLISHED IN: Frontiers in Microbiology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-015-2 DOI 10.3389/978-2-88945-015-2


# **RECENT ADVANCES IN SYMBIOSIS RESEARCH: INTEGRATIVE APPROACHES**

Topic Editors:

**M. Pilar Francino,** Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunitat Valenciana-Salud Pública & Universitat de València & CIBER en Epidemiología y Salud Pública, Spain

**Mónica Medina,** Pennsylvania State University, USA

The upside-down jellyfish *Cassiopea xamachana* lives in symbiosis with dinoflagellate algae in the genus *Symbiodinium*. Photo credit: Erika Diaz-Almeyda, Department of Biology, Emory University, USA.

Traditionally, symbiosis research has been undertaken by researchers working independently of one another and often focused on a few cases of bipartite host-symbiont interactions. New model systems are emerging that will enable us to fill fundamental gaps in symbiosis research and theory, focusing on a broad range of symbiotic interactions and including a variety of multicellular hosts and their complex microbial communities. In this Research Topic, we invited researchers to contribute their work on diverse symbiotic networks, since a large variety of symbioses play major roles in the proper functioning of terrestrial and aquatic ecosystems, and we wished the Topic to provide a venue for communicating findings across diverse taxonomic groups. A synthesis of recent investigations in symbiosis can impact areas such as agriculture, where a basic understanding of plant-microbe symbiosis will provide foundational information on the increasingly important issue of nitrogen fixation; climate change, where anthropogenic factors are threatening the survival of marine symbiotic ecosystems such as coral reefs; animal and human health, where imbalances in host microbiomes are being increasingly associated with a wide range of diseases; and biotechnology, where process optimization can be achieved through optimization of symbiotic partnerships. Overall, our vision was to produce a volume of works that will help *define general principles of symbiosis within a new conceptual framework, on the road to finally establishing symbiology as an overdue central discipline of biological science*.

**Citation:** Francino, M. P., Medina, M., eds. (2017). Recent Advances in Symbiosis Research: Integrative Approaches. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-015-2

# Table of Contents

*Editorial: Recent Advances in Symbiosis Research: Integrative Approaches* M. Pilar Francino and Mónica Medina

## **Section 1. Plant-microbe symbioses:**


Devin Coleman-Derr and Susannah G. Tringe

*Convergence in mycorrhizal fungal communities due to drought, plant competition, parasitism, and susceptibility to herbivory: consequences for fungi and host plants*

Catherine A. Gehring, Rebecca C. Mueller, Kristin E. Haskins, Tine K. Rubow and Thomas G. Whitham


Luis C. Mejía, Edward A. Herre, Jed P. Sparks, Klaus Winter, Milton N. García, Sunshine A. Van Bael, Joseph Stitt, Zi Shi, Yufan Zhang, Mark J. Guiltinan and Siela N. Maximova

## **Section 2. Marine symbioses:**


Ross Cunning and Andrew C. Baker

*A genomic approach to coral-dinoflagellate symbiosis: studies of* **Acropora digitifera** *and* **Symbiodinium minutum**

Chuya Shinzato, Sutada Mungpakdee, Nori Satoh and Eiichi Shoguchi


Jia Y. Har, Tim Helbig, Ju H. Lim, Samodha C. Fernando, Adam M. Reitzel, Kevin Penn and Janelle R. Thompson

## **Section 3. Other animal-microbe symbioses:**


Ana E. Pérez-Cobas, Alejandro Artacho, Stephan J. Ott, Andrés Moya, María J. Gosalbes and Amparo Latorre

## **Section 4. Novel theoretical and methodological approaches:**


# Editorial: Recent Advances in Symbiosis Research: Integrative Approaches

#### M. Pilar Francino<sup>1, 2, 3</sup> \* and Mónica Medina<sup>4</sup> \*

<sup>1</sup> Area of Genomics and Health, Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunitat Valenciana-Salud Pública, València, Spain, <sup>2</sup> Unitat Mixta d'Investigació en Genòmica i Salut, FISABIO-Universitat de València, València, Spain, <sup>3</sup> CIBER en Epidemiología y Salud Pública, Madrid, Spain, <sup>4</sup> Department of Biology, Pennsylvania State University, University Park, PA, USA

Keywords: microbiome, multicellular host, plant-microbe symbiosis, marine symbiosis, holobiont

#### **The Editorial on the Research Topic**

#### **Recent Advances in Symbiosis Research: Integrative Approaches**

#### Edited by:

Suhelen Egan, University of New South Wales, Australia

#### Reviewed by:

Julie L. Meyer, University of Florida, USA

#### \*Correspondence:

M. Pilar Francino mpfrancino@gmail.com

Mónica Medina momedinamunoz@gmail.com

#### Specialty section:

This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology

Received: 18 July 2016 Accepted: 11 August 2016 Published: 31 August 2016

#### Citation:

Francino MP and Medina M (2016) Editorial: Recent Advances in Symbiosis Research: Integrative Approaches. Front. Microbiol. 7:1331. doi: 10.3389/fmicb.2016.01331

Symbiosis research is being transformed by new model systems and technologies that bring forth unexpected discoveries. Technological advances such as those stemming from Next Generation Sequencing enable detailed insights into the molecular bases of symbiotic relationships, and have revolutionized the study of complex microbial communities. As new data accumulate, the need grows for a conceptual framework that helps organize and make sense of the information. Here, we present some ground-breaking works pushing the boundaries of our understanding of symbiosis in a variety of systems, as well as some state-of-the-art attempts at putting forward organizing principles for the whole of symbiology.

Several works in the Topic address plant-microbe symbioses. All plants are home to a variety of microorganisms that inhabit nearly all their tissues, offering a variety of benefits. Rhizosphere microbial communities are particularly critical, as they provide access to limiting nutrients. Coats and Rumpho review how molecular biodiversity analyses have advanced our understanding of the rhizosphere microbiota of invasive plants, which enables the invasion of new ranges. Coleman-Derr and Tringe, on the other hand, focus on the microbiome of crop plants and its large potential for modulating plant responses to the stresses associated with climate change and use of suboptimal agricultural lands. The relationship between the rhizosphere microbiome and stress is further demonstrated by Gehring et al. Their research on ectomycorrhizal fungal communities associated with pine trees reveals that biotic and abiotic stressors can result in similar patterns of symbiotic community disassembly, and, remarkably, that the less diverse communities that result may actually be beneficial to host trees under stressful conditions. Kuo et al., Larrainzar et al., and Maróti and Kondorosi focus instead on the biology of specific rhizosphere microbes and the mechanisms enabling their interactions with host plants. The role of foliar symbionts is addressed by Mejía et al., who demonstrate the effects of an endophytic fungus on genetic and phenotypic expression in the tropical tree *Theobroma cacao*.

Marine systems also provide innumerable examples of fascinating symbioses. Soto and Nishiguchi offer us a new twist on the classic squid-*Vibrio* model, proposing experimental evolution approaches to gain further insights into this system. Abby et al. and Yarden explore lesser-known associations, such as those among bacteria and planktonic microalgae and among fungi and various marine invertebrates. A number of works focus on corals and their associated microbes, the study of which has been fundamental in cementing the concept of the holobiont, a multicellular host and its associated microbiome (Margulis, 1993). Corals have become the prime example of a living system that will not survive when the interactions among the species that compose it are disrupted, as is increasingly occurring in the context of climate change. Parkinson and Baums argue that the coral holobiont should be considered as a single unit of selection, because it is specific host/symbiont genotype combinations that determine its extended phenotype and capacity for survival. Cunning and Baker remind us that the success of coral symbioses may be determined not only by the genetic identity of the partners, but also by their relative abundance, which should depend on environmental conditions. From a more mechanistic perspective, Shinzato et al. demonstrate how genomic analyses can generate detailed knowledge about the molecular and cellular processes enabling the coral-dinoflagellate symbiosis. Pernice and Levy, and Roth, illustrate how other technologies complement genomic analyses to generate further insight into coral physiology and metabolism. Meanwhile, Har et al. employ a variety of "omic" approaches to characterize the microbiota of *Nematostella vectensis*, a new cnidarian model for the study of metazoan evolution and development.

Insect symbioses also exemplify the varied interactions that can be established among microbes and animal hosts. Asgharian et al. revisit the many-faceted relationship between insects and *Wolbachia*, using transcriptomic techniques to show that the endosymbiont can have both sex-dependent and sex-independent effects on its host. Hamidou Soumana et al. also exploit transcriptomics to explore the multi-level relationship among endosymbionts, trypanosomes and the tsetse fly. In turn, O'Connor et al. reveal that symbiotic microbes play a crucial role in shaping the complex interactions between the Hawaiian Drosophilidae and their host plants, hence contributing to their adaptive radiation.

In spite of their importance, host-microbe associations are vulnerable. A notorious example is the disruption of the complex microbial community inhabiting the intestinal tract when assaulted by antibiotics, resulting in an increased vulnerability to infection by opportunistic pathogens. Pérez-Cobas et al. explore how antibiotic-induced changes in the gut microbiota relate to *Clostridium difficile* infection, and define microbial taxa and functions that might protect against colonization by this pathogen. While this approach illustrates the common perspective of analyzing host-microbe symbioses in terms of potential benefits to the host, García and Gerardo take the less-traveled road of considering what microbes have to gain from the symbiotic interaction. They conclude that symbionts may sometimes be more akin to prisoners or farmed crops than to equal partners. Therefore, researchers should determine whether symbionts have adaptations to evade capture by hosts, in addition to evaluating both costs and benefits of presumed mutualisms. A theoretical framework to analyze cost/benefit ratios is provided by Hill in the context of fixed-carbon allocation in phototroph/heterotroph symbioses, where endosymbionts may control the energy trade-offs faced within host cells.

We close with two articles proposing new methodological and theoretical approaches to take forward the study of symbiotic systems. Zaneveld and Thurber demonstrate the application of ancestral state reconstruction for predicting the presence of unknown taxa and functions in microbial communities that are too complex to be wholly described experimentally. And, finally, Fitzpatrick raises the crucial issue of whether hosts and symbionts can truly coevolve when symbionts are not vertically transmitted, proposing linkage disequilibrium analysis as the correct framework to address this question. Reassuringly, this approach reveals that selection and population structure can generate covariance between host and symbiont traits, providing the basic requirement for the coevolution of the intricate symbiotic systems that pervade all realms of Life.

## AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct, and intellectual contribution to the work, and approved it for publication.

## FUNDING

The authors were supported by grants NSF OCE 1442206 (MM) and MINECO SAF2012-31187 (MF) during writing of this manuscript.

## ACKNOWLEDGMENTS

We thank the Frontiers editorial staff for their support and all of the authors and reviewers who participated in this Research Topic.

## REFERENCES

Margulis, L. (1993). Symbiosis in Cell Evolution. New York, NY: W. H. Freeman.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Francino and Medina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The rhizosphere microbiota of plant invaders: an overview of recent advances in the microbiomics of invasive plants

## *Vanessa C. Coats<sup>1</sup> and Mary E. Rumpho<sup>2</sup> \**

<sup>1</sup> Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME, USA

<sup>2</sup> Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Scott Clingenpeel, DOE Joint Genome Institute, USA

Detmer Sipkema, Wageningen University, Netherlands

#### *\*Correspondence:*

Mary E. Rumpho, Department of Molecular and Cell Biology, University of Connecticut, 91 North Eagleville Road, Unit 3125, Storrs, CT 06269, USA e-mail: rumpho@uconn.edu

Plants in terrestrial systems have evolved in direct association with microbes functioning as both agonists and antagonists of plant fitness and adaptability. As such, investigations that segregate plants and microbes provide only a limited scope of the biotic interactions that dictate plant community structure and composition in natural systems. Invasive plants provide an excellent working model to compare and contrast the effects of microbial communities associated with natural plant populations on plant fitness, adaptation, and fecundity. The last decade of DNA sequencing technology advancements opened the door to microbial community analysis, which has led to an increased awareness of the importance of an organism's microbiome and the disease states associated with microbiome shifts. Employing microbiome analysis to study the symbiotic networks associated with invasive plants will help us to understand what microorganisms contribute to plant fitness in natural systems, how different soil microbial communities impact plant fitness and adaptability, specificity of host–microbe interactions in natural plant populations, and the selective pressures that dictate the structure of above-ground and below-ground biotic communities. This review discusses recent advances in invasive plant biology that have resulted from microbiome analyses as well as the microbial factors that direct plant fitness and adaptability in natural systems.

**Keywords: rhizosphere, microbiome, plant–microbe interactions, invasive plant, soil**

## **INTRODUCTION**

Symbiotic relationships shaped the origin, organization, and evolution of all life on Earth. Originally defined as "the living together of unlike named organisms" (de Bary, 1878), the term symbiosis has traditionally been applied to associations like mutualism, commensalism, and even parasitism (Parniske, 2008). More recent symbiosis research is expanding this definition to encompass a role of microbial symbiotic relationships in far-reaching themes of biology such as speciation, evolution, and coadaptation (Margulis, 1993; Klepzig et al., 2009; Carrapiço, 2010; Lankau, 2012). The association and close relationships of organisms that cohabitate are vital for the growth and development of all eukaryotic organisms (Carrapiço, 2010; McFall-Ngai et al., 2013). These associations (=symbiotic networks of microorganisms) shape natural landscapes and directly influence the evolutionary trajectory of individual species and entire ecosystems (Gilbert, 2002; Klepzig et al., 2009).

Plant invasions are a global concern because they pose a direct threat to biodiversity and natural resource management, especially in protected areas (i.e., public lands, refuges, conservations, etc.; Foxcroft et al., 2013). For a plant to be considered invasive (and not just naturalized) it must be non-native to the ecosystem in question and it must cause environmental damage (i.e., detrimental effects on native flora and fauna) or harm humans (Invasive Species Advisory Committee [ISAC], 2006). Invasive plant science represents a crossroads of diverse opinions derived from many economic, ecological and societal interest groups, and this has led to disputes regarding the correct approach to invasive plant issues (Simberloff et al., 2013). To further complicate the issue, plant classification as "invasive" or "weedy" is often based more on human perceptions and opinions than on actual data regarding the economic, societal, or environmental impact of the plant taxon (Hayes and Barry, 2008). However, the environmental consensus supports severe ecological damage by plants deemed invasive in protected areas and significant reductions in the biodiversity of native species resulting from plant invasions. Comprehensive reviews of invasive plant impacts have covered the ecological effects of invaders (Pyšek et al., 2012), nutrient cycling modifications (Ehrenfeld, 2003; Liao et al., 2007), mechanisms of plant invasion (Levine et al., 2003), hybridization, and competition (Vila et al., 2004). Synthesizing accurate predictions of the invasive potential of specific plant taxa has proven difficult and there is no universal trait that can be collectively applied to predict invasiveness (Rejmanek and Richardson, 1996; Richardson and Pyšek, 2006; Hayes and Barry, 2008; Thompson and Davis, 2011; Morin et al., 2013). A standard approach is needed for accurate impact assessment and the development of a new global database suitable to make future predictions of problem taxa (Morin et al., 2013).

The rhizosphere microbiome comprises the greatest diversity of microorganisms directly interacting with a given plant; therefore, it has a tremendous capacity to impact plant fitness and adaptation. Bacterial and fungal communities in the rhizosphere affect plant immunity (van Wees et al., 2008; Ronald and Shirasu, 2012), pathogen abundance (Berendsen et al., 2012), nutrient acquisition (Jones et al., 2009; Richardson et al., 2009), and stress tolerance (Doubkova et al., 2012; Marasco et al., 2012). Traditional hypotheses for plant invasion, such as the enemy release hypothesis (ERH; Klironomos, 2002; Mitchell and Power, 2003; Blumenthal, 2006; Liu and Stiling, 2006; Reinhart and Callaway, 2006; Blumenthal et al., 2009; Eschtruth and Battles, 2009), accumulation of local pathogens (ALP; Eppinga et al., 2006), the enhanced mutualist hypothesis (EMH; Marler et al., 1999; Reinhart and Callaway, 2004; Parker et al., 2006), and plant–soil feedbacks (Ehrenfeld, 2003; Ehrenfeld et al., 2005; Bever et al., 2012), all point directly to the rhizosphere microbiome, in its entirety, as the primary mediator of plant establishment and success.

The study of soil microbial communities once relied on laboratory culture techniques, phospholipid fatty acid analysis (PLFA), denaturing gradient gel electrophoresis (DGGE), and terminal restriction fragment length polymorphism (TRFLP; Zhang and Xu, 2008; van Elsas and Boersma, 2011). Early on, culture-based approaches revealed "the great plate count anomaly" wherein only about 1% of visible microscopic cells can be cultured using conventional techniques (Staley and Konopka, 1985; Zhang and Xu, 2008; Stein and Nicol, 2011). The DNA technologies available today use genetic information to model the structure and composition of a microbial community (Venter et al., 2004; Tringe and Rubin, 2005; Hugenholtz and Tyson, 2008; Kunin et al., 2008; Vakhlu et al., 2008; Marguerat and Bähler, 2009; Metzker, 2010; Wooley et al., 2010; Simon and Daniel, 2011; Sun et al., 2011; van Elsas and Boersma, 2011; Thomas et al., 2012; Yousuf et al., 2012; Bibby, 2013; Mathieu et al., 2013). Because next-gen sequencing technologies can generate millions of base pairs in a matter of hours for only a few thousand dollars, their primary limitation is handling the expansive datasets and applying appropriate statistical analyses to address the biological questions at hand (Metzker, 2010).
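As a purely illustrative aside, the kind of summary that sequencing-based community analysis produces can be sketched in a few lines of Python. The sketch below assumes a table of read counts per taxon has already been obtained for one soil sample; the taxon labels, the counts, and the function names `relative_abundance` and `shannon_index` are hypothetical and are not taken from any study cited here. It converts counts to relative abundances and computes the Shannon diversity index, one common (but by no means the only) descriptor of community structure.

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with p_i > 0."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical read counts for a single rhizosphere sample (illustration only).
sample = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 20}

if __name__ == "__main__":
    for taxon, p in relative_abundance(sample).items():
        print(f"{taxon}: {p:.2f}")
    print(f"Shannon index: {shannon_index(sample):.3f}")
```

In practice, such indices would be computed on quality-filtered and normalized data and compared across samples with appropriate statistics, which is precisely the data-handling burden noted above.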

The link between the rhizosphere microbial community and invasive plant success has been studied for many years (Van der Putten et al., 2007; Pringle et al., 2009; Berendsen et al., 2012; Bakker et al., 2013). Invasive plants provide a unique perspective to study the effects of the rhizosphere microbiome on plant fitness, the role evolutionary interactions play in structuring the plant ecology observed at present, and the potential for directed control and management of invasive plants. The aim of this review was to focus on recent insights into plant–microbe interactions in the rhizosphere of invasive plants. We were interested in studies that used a sequencing-based approach to investigate the rhizosphere microbiome of invasive plants. Surprisingly, we found that few invasive plant scientists have moved beyond traditional methods of soil community analysis (i.e., DGGE) regardless of the increasing availability of next-gen sequencing platforms. We discuss the current microbiome data for invasive plants with regard to popular mechanisms of plant invasion (i.e., enemy release, novel symbiont, etc.). Particular attention has been given to rhizosphere microbiome analysis and what this methodology reveals about microbial symbiotic networks in the soil as contributing factors to the development and progression of plant invasions in terrestrial ecosystems.

## **RHIZOSPHERE MICROBIOTA ARE A KEY COMPONENT OF PLANT FITNESS**

Over 400 million years ago, during the Paleozoic era, the evolution of land plants was made possible by a symbiosis between mycorrhizal fungi and the common ancestor of land plants (Wang and Qiu, 2006; Humphreys et al., 2010). This association resulted in a fitness advantage and enhanced stress tolerance that was critical for the establishment of terrestrial plants (i.e., increased access to water and mineral nutrients). Evidence of microbial symbiosis is apparent in the oldest lineages of land plants, the liverworts. The arbuscular mycorrhizal (AM) symbioses of liverworts significantly promote photosynthetic C uptake, acquisition of P and N from the soil, growth, and asexual reproduction (Humphreys et al., 2010). Mycorrhizal symbioses undoubtedly demonstrate the importance of symbiotic relationships in terrestrial ecosystems and have been credited for stimulating the diversification of both plant hosts and fungal symbionts (Wang and Qiu, 2006).

The soil microbial community constitutes a major portion of a plant's symbiotic network. Soil is the greatest reservoir of microbes that affect plant growth, fitness, fecundity, and stress tolerance (reviewed by Buée et al., 2009; Faure et al., 2009; Lambers et al., 2009; Lugtenberg and Kamilova, 2009; Chaparro et al., 2012; Doornbos et al., 2012; Bakker et al., 2013). All plants maintain a direct interaction with soil microbes in the rhizosphere, which is the soil compartment immediately surrounding the root wherein plant root exudates directly influence the structure and function of the soil microbial community (**Figure 1**; Hiltner, 1904; Hartmann et al., 2008). The sugars, amino acids, flavonoids, proteins, and fatty acids secreted by plant roots help to structure the associated soil microbiome (Badri et al., 2009; Dennis et al., 2010; Doornbos et al., 2012) and these exudates vary among plant species and between genotypes (Rovira, 1969; Micallef et al., 2009). The quantity and composition of root exudate fluctuates with plant developmental stage and the proximity to neighboring species (Chaparro et al., 2012). Microbes growing in the nutrient rich rhizosphere produce molecular signals that promote plant fitness and growth (i.e., hormones) and can disrupt inter-plant communication in natural systems (Faure et al., 2009; Sanon et al., 2009).

Microbes in the rhizosphere can provide direct access to limiting nutrients (e.g., N<sub>2</sub>-fixing symbionts) or increase the total surface area of the root system (e.g., mycorrhizal fungi). Many reviews have already covered the positive effects of beneficial root symbionts in the rhizosphere (Buée et al., 2009; Bakker et al., 2013), factors affecting rhizosphere microbial communities (Philippot et al., 2013), and the microbial effects on plant health (Berendsen et al., 2012; Berlec, 2012; Bever et al., 2012) and stress tolerance (Rodriguez et al., 2008).

Antagonistic interactions derived from microbial pathogens play critical roles in determining the genetic structure and spatiotemporal abundance of a plant (Gilbert, 2002; Blumenthal et al., 2009). Pathogenic microbes impose selective pressures on a plant population that favor a specific genetic structure within the host plant community and this stimulates evolutionary change over time (Gilbert, 2002). In natural systems, pathogens mediate plant competition and affect spatiotemporal distribution of individuals within the plant community by creating inhabitable and uninhabitable areas within the ecosystem (Gilbert, 2002). The Janzen-Connell hypothesis postulated that pathogen and host densities are responsible for the observed distribution of a plant species by affecting the establishment success of seedlings (Packer and Clay, 2000). A high density of *Pythium* sp. in the soil beneath parental *Prunus serotina* trees was observed to prohibit the establishment of seedlings in the immediate vicinity (0–5 m), but not seedlings growing at greater distances (25–30 m; Packer and Clay, 2000). Thus, pathogen accumulation beneath parent plants functions to promote seedling distribution and reduce competition between the parent plant and its offspring.


#### **INVASIVE PLANTS DISRUPT NATIVE SYMBIOTIC NETWORKS**

The introduction of non-native plants can disrupt native symbiotic networks in the soil and change local grazing patterns for insects and fauna (Elias et al., 2006; Klepzig et al., 2009). Introduced plants alter patterns of nutrient cycling (Laungani and Knops, 2009) and cause chemical changes in the soil environment (i.e., allelopathy; Cipollini et al., 2012). Often these non-native invaders bring novel traits to the environment that put native plants at a disadvantage (Van der Putten et al., 2007; Laungani and Knops, 2009; Perkins et al., 2011). Plant–microbe interactions may assist invasive plants with outcompeting native flora using mechanisms that include allelopathy-mediated suppression of native rhizosphere microbes and beneficial symbionts (Stinson et al., 2006; Callaway et al., 2008), the accumulation of native plant pathogens in the invaded soils (Mangla et al., 2008), and changes in nutrient cycling dynamics that favor the exotic plant (Ehrenfeld et al., 2001; Ehrenfeld, 2003; Laungani and Knops, 2009). Increased availability or access to vital nutrients provides a competitive advantage to invasive plants and facilitates significant biomass accumulation (Blumenthal, 2006; Blumenthal et al., 2009).

Allelopathic plants are among the most aggressive invaders of non-native ecosystems because non-native plants with the ability to synthesize toxic chemicals are often at a competitive advantage (Lankau, 2012). *Alliaria petiolata* (garlic mustard) produces allelopathic chemicals that target beneficial microbes like AM symbionts of native plants (Stinson et al., 2006; Callaway and Vivanco, 2007; Callaway et al., 2008). *A. petiolata* also demonstrated an increased production of toxic chemicals when growing in non-native regions that contain a greater competitive interspecific density, implicating the allelopathic effects as the primary invasive characteristic (Lankau, 2012). The introduction of novel allelochemicals into an environment affects the structure of the soil microbial community and the microbial biodiversity, especially if these chemicals have antimicrobial activity or function as metal chelators (Inderjit et al., 2011). Soil microbes are the first line of defense against novel chemicals in a native ecosystem. They mediate much of the allelopathic effect in ways as simple as degrading or detoxifying compounds before they accumulate in the soil and inhibit native plant growth (Cipollini et al., 2012).

Invasive plants can outcompete native plants by accumulating large concentrations of native plant pathogens in the soil (Eppinga et al., 2006; Mangla et al., 2008). A release from microbial pathogens, insect pests, and herbivores of the native range is one mechanism behind the success of invasive plants (Klironomos, 2002; Mitchell and Power, 2003; Reinhart and Callaway, 2006; Blumenthal et al., 2009), but the distribution of pathogens in the invasive range is just as important for defining competition with native flora. Root exudates of *Chromolaena odorata*, a severely destructive tropical weed, concentrate *Fusarium* sp. spores to a level 25 times greater than that observed in the root zone of native plants (Mangla et al., 2008). Thus, these plants exacerbate and exploit the native biotic interactions and gain a competitive advantage.

Many, but not all, invasive plants alter patterns of nutrient cycling in the invasive range (Perkins et al., 2011). Changes in the N cycling dynamics in the soil are a frequent consequence of invasive plant introduction (Ehrenfeld, 2003; Mack and D'Antonio, 2003; Laungani and Knops, 2009; Perkins et al., 2011). Non-native species can change the quality and quantity of leaf litter (Ehrenfeld et al., 2001), modify local decomposition rates (Kourtev et al., 2002a; Elgersma et al., 2012), and disrupt local feedback mechanisms in the soil system (Ehrenfeld et al., 2005). For example, *Pinus strobus* is an invader of N-poor grasslands that demonstrates a higher N residence time in the plant tissues than native species (Laungani and Knops, 2009). This increased residence time facilitates the accumulation of twice as much N in plant tissues and up to four times as much N in the photosynthetic tissues, relative to native grasses (Laungani and Knops, 2009). The differences in N utilization between non-native and native plants create a positive feedback in the soil that significantly increases N availability and results in increased total C gains, both of which allow *P. strobus* to gain a competitive advantage (Laungani and Knops, 2009).

## **MICROBIAL IMPACTS ON PLANT ESTABLISHMENT AND PROLIFERATION**

Not all microbes are found ubiquitously throughout soils around the world, and thus, soil microbes are not exempt from the fundamental evolutionary processes of geographic isolation and natural selection (Rout and Callaway, 2012). Plant–microbe interactions in the rhizosphere (beneficial, pathogenic, etc.) can dictate whether the plant is capable of naturalization and whether it can develop an invasive growth habit. Pringle et al. (2009) proposed three criteria to model how mycorrhizal symbioses influence the outcome of a plant invasion: (1) the type of plant–fungus relationship (obligate or facultative) from the plant perspective; (2) whether the relationship is specific or flexible, meaning the plant associates with one mycorrhizal fungus versus many; and (3) whether the microbial symbionts are found in the introduced range. According to this model, obligate symbioses prevent the growth of non-native plants if the microbial symbiont is neither already present in the introduced region nor co-introduced with the host plant. Facultative symbioses are often less restrictive because the plants may form novel beneficial symbioses with suitable replacement microbes in the non-native range, or survive without the symbiont. Consequently, the symbiotic flexibility in facultative symbioses enhances the likelihood of favorable plant adaptations and the development of invasive populations in the introduced region (Pringle et al., 2009).
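The three criteria can be read as a small decision rule. The sketch below is only an illustration of the logic summarized above, not a model published by Pringle et al. (2009): the function name, the fungal labels, and the three outcome strings are hypothetical. Specificity (criterion 2) is represented by the size of the set of compatible fungi, with one member for a specific symbiosis and several for a flexible one.

```python
from typing import AbstractSet


def invasion_outlook(
    obligate: bool,                          # criterion 1: obligate vs. facultative
    compatible_fungi: AbstractSet[str],      # criterion 2: one partner (specific) or many (flexible)
    fungi_in_new_range: AbstractSet[str],    # criterion 3: symbionts present in the introduced range
    co_introduced: AbstractSet[str] = frozenset(),  # symbionts arriving with the host
) -> str:
    """Return a hypothetical outcome label for a plant introduction."""
    available = compatible_fungi & (fungi_in_new_range | co_introduced)
    if available:
        return "symbiosis available"
    if obligate:
        # No partner present and none co-introduced: the obligate host cannot grow.
        return "establishment blocked"
    # A facultative host may survive without the symbiont.
    return "can persist without symbiont"


# Hypothetical examples (fungal labels are placeholders, not real taxa).
print(invasion_outlook(True, {"fungus_a"}, {"fungus_b"}))
print(invasion_outlook(False, {"fungus_a", "fungus_c"}, {"fungus_b"}))
print(invasion_outlook(True, {"fungus_a"}, {"fungus_b"}, co_introduced={"fungus_a"}))
```

A flexible host simply has a larger `compatible_fungi` set, which raises the chance that the intersection is non-empty; this is one way to express why flexibility is expected to favor establishment.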

In the introduced region, the soil microbial community mediates plant abundance, and disturbance of the soil can influence the progression of a plant invasion. Removal of the above-ground plant community coupled with little or no physical disruption of the soil is classified as a Type I soil disturbance. A Type II soil disturbance includes physical disruption of the soil matrix in addition to removal of the above-ground plant biomass (Fukano et al., 2013). Type I disturbances leave the soil microbial community intact, whereas Type II disturbances completely disrupt the structure of the microbial community. Interestingly, the growth of non-native species is enhanced when they are rare in ecosystems subjected to Type I disturbance (Fukano et al., 2013). In contrast, Type II disturbances give native species an advantage and require non-native invaders to maintain a higher competitive ability. Thus, a physical disturbance that alters the composition of the soil microbial community favors native plants, yet the opposite result occurs (enhanced fitness of non-native plants) if the soil microbial community remains intact.
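The two disturbance classes and their reported consequences can be captured in a few lines. This is a minimal sketch of the definitions as stated in the text; the names `Disturbance`, `classify_disturbance`, and `FAVORED_GROUP` are hypothetical and do not come from Fukano et al. (2013).

```python
from enum import Enum
from typing import Optional


class Disturbance(Enum):
    TYPE_I = "Type I"    # plants removed, soil matrix largely intact
    TYPE_II = "Type II"  # plants removed and soil matrix physically disrupted


def classify_disturbance(plants_removed: bool, soil_disrupted: bool) -> Optional[Disturbance]:
    """Map the two observations used in the text onto a disturbance type."""
    if not plants_removed:
        return None  # falls outside the two-type scheme described here
    return Disturbance.TYPE_II if soil_disrupted else Disturbance.TYPE_I


# Which plant group the text reports as favored under each disturbance type.
FAVORED_GROUP = {
    Disturbance.TYPE_I: "non-native species (when rare)",
    Disturbance.TYPE_II: "native species",
}

kind = classify_disturbance(plants_removed=True, soil_disrupted=False)
print(kind.value, "->", FAVORED_GROUP[kind])
```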

#### **THE RHIZOSPHERE MICROBIOTA OF INVASIVE PLANTS**

The rhizosphere microbiota of non-cultivated plant systems provide a better platform to study the critical plant–microbe interactions that affect plant fitness and adaptability because they are under less anthropogenic control than agricultural systems (Philippot et al., 2013). **Figure 2** depicts seven biotic and abiotic factors that together determine the presence or absence of specific microbiota in the soil microbiome of natural systems. Factors such as soil disturbance, local flora and fauna, and allelopathic effects from the plant each impose a selective pressure on the soil microbial community. The cumulative effect of these selective pressures is what determines the frequency and abundance of microbes in the soil, and thus, what microbes the plant is able to recruit into the rhizosphere.

Microbiome analysis of rhizosphere microbiota associated with invasive *Berberis thunbergii* in Maine showed that environmental factors alone cannot explain the structure of the rhizosphere microbial community associated with this plant in the invasive range. Coats et al. (2014) used amplicon pyrosequencing to assess effects of environmental factors on the bacterial and fungal communities in the rhizosphere of *B. thunbergii* (Japanese barberry) from invasive stands in coastal Maine, USA. The effects of soil chemistry, location, and surrounding plant canopy cover were investigated and a high degree of spatial variation in the rhizosphere microbial communities of *B. thunbergii* was reported. Bulk soil chemistry had more of an effect on the bacterial community structure than on the fungal community. An effect of location was detected in the rhizosphere microbial community, but it was less significant than the effect of surrounding plant canopy cover. The significant effects of these environmental factors on the structure of the rhizosphere microbial community associated with *B. thunbergii* suggest that some soils and/or plant communities are more prone to plant invasions based on the soil microbial communities they foster.

The microbial diversity in the rhizosphere includes many species of bacteria, archaea, fungi, oomycetes, viruses, and various microfauna (nematodes, protozoa, etc.; reviewed by Buée et al., 2009; Bever et al., 2012; Philippot et al., 2013). The rhizosphere microbiome differs from the bulk soil and between plant species. Using a metatranscriptomic approach, Turner et al. (2013) identified kingdom-level differences in the rhizosphere bacterial communities of wheat, oat, and pea plants. The fungal diversity in the rhizosphere also varied significantly between these crop plants. Investigations that have focused on the interactive effects between major microbial groups in the rhizosphere have revealed a joint effect of fungal endophytes and AM fungi that promotes plant growth (Larimer et al., 2010). Bacterial endophytes have been observed to enhance competition by invasive plants by providing the plant with increased access to nutrients (Fe and P) and by producing plant growth-promoting hormones (IAA; Rout et al., 2013). When comparing native and non-native plants with DGGE, Xiao et al. (2014) found that the soil fungal communities were more affected by the invasive plant than the native plant and the modifications to the fungal community promoted invasive plant growth. Differences in the rhizosphere pathogen communities of related *Phragmites australis* haplotype populations (a native and a non-native) have also demonstrated that non-native species cultivate different soil pathogen communities than native plants regardless of the genetic similarity of the host plant (Nelson and Karp, 2013).

## **RHIZOSPHERE MICROBIOME IN NATIVE AND INVASIVE RANGE SOILS**

Recent investigations that have contrasted plant–microbe interactions in the native and invasive range have focused on the net effect of soil biota on plant growth, plant allelopathic responses, and the rhizosphere microbiome. The rhizosphere microbiota (saprophytes, pathogens, and beneficials) can each have positive effects on invasive plant growth (lower boxes of **Figure 2**). Stimulating saprophyte growth creates a positive feedback in the soil of invasive plants by increasing litter decay rates and nutrient availability (Van der Putten et al., 2007; Bever et al., 2012). The mutualistic associations and/or novel symbioses in the introduced range can enhance plant fitness by promoting plant growth, nutrient acquisition, and disease suppression (Van der Putten et al., 2007; Pringle et al., 2009; Berendsen et al., 2012; Bakker et al., 2013). The empirical evidence obtained from studies that compare plant–microbe interactions in each range supports current microbe-based theories of plant invasions and provides evidence for microbe-enhanced plant fitness in the invasive range.

*Triadica sebifera* (Chinese tallow) is native in China and invasive in the US. Yang et al. (2013) studied the net effect of native and invasive range soil microbiota on the growth of *T. sebifera* and four co-occurring plant genera (*Liquidambar*, *Ulmus*, *Celtis*, and *Platanus*). Native range soils had no effect, or a negative effect, on *T. sebifera* performance, yet there was always a positive effect of invasive range soil on plant survival and biomass production. A greater biomass was observed for the invasive plants grown in active soil mix than in sterilized or fungicide-treated soils. Higher mycorrhizal colonization of *T. sebifera* was found on plants growing in the invasive range soil. Interestingly, there was no effect of native or invasive range soil on the other four genera examined, and native plants maintained higher mycorrhizal colonization rates in native soil than in invasive range soil. These results not only support the Enhanced Mutualist and Pathogen Release Hypotheses, they also indicate a significant specificity in the plant–microbe interactions that contribute to invasive plant growth for some plant species.

The allelopathic response of invasive plants can differ between native and invasive ranges with greater allelopathic effects observed in the invasive range. Yuan et al. (2013) observed increased allelochemical content (total phenolics, total flavones, and total saponins) for *Solidago canadensis*, a native of the US that has developed invasive populations in China. The increased production of allelopathic chemicals by *S. canadensis* in the invasive range also coincided with a greater inhibition of native plant seedlings. Whether the increase in allelochemical production is solely a result of the plant–microbe interactions remains unclear, although it would seem to be a beneficial plant response to the development of novel interactions with foreign soil microbiota.

The most comprehensive investigation of a rhizosphere microbiome associated with an invasive plant was conducted on *B. thunbergii*, a native of central Japan that is invasive in the US. The microbial community (Bacteria, Archaea, and Eukaryota) structure was modeled using amplicon pyrosequencing to compare rhizosphere communities of native *B. thunbergii* from central Japan (*n* = 8) with those from an invasive stand in the US (*n* = 5; Coats, 2013). A total of 432 genera were identified from all three domains in Japan and US rhizosphere soils combined, although only Eukaryotes from the lineage Fungi were included in this analysis. *B. thunbergii* rhizosphere soils from Japan and the US shared 171 genera, most of which were Proteobacteria (Bacteria) and Ascomycota (Fungi). Rhizosphere soil from Japan contained 71 unique genera and the US soils harbored 190 unique genera. A high degree of phylogenetic redundancy was observed within the microbial community at the phylum level, although the community structure was significantly different between samples from each region (Coats, 2013).
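The genus counts above are internally consistent, which a short calculation makes explicit. The sketch below uses only the three counts reported in the text; the per-range totals and the Jaccard similarity (shared genera divided by all genera observed) are derived here for illustration and are not values reported by Coats (2013).

```python
# Genus counts reported in the text for B. thunbergii rhizosphere soils.
shared = 171      # genera found in both Japan and US soils
japan_only = 71   # genera unique to the native range (Japan)
us_only = 190     # genera unique to the invasive range (US)

# The three disjoint groups should add up to the reported overall total.
total = shared + japan_only + us_only
print(total)  # 432

# Derived values (illustrative, not reported in the source).
japan_total = shared + japan_only   # genera observed in Japan soils
us_total = shared + us_only         # genera observed in US soils
jaccard = shared / total            # |intersection| / |union|
print(japan_total, us_total, round(jaccard, 3))  # 242 361 0.396
```

Under these counts the invasive range soils hold roughly half again as many genera as the native range soils, which agrees with the lower diversity reported for the native range.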

The apparent difference in the rhizosphere microbiota of *B. thunbergii* in native and invasive (non-native) soil supports our hypothesis that soil microbial communities are the primary mediators of invasive plant growth in non-native habitats. The data showed a significant effect of geographic location, with lower species diversity and increased abundance of pathogenic species observed in rhizosphere soils from the native range compared to the invasive range (Coats, 2013). Therefore, the microbial community shifts observed between the rhizosphere soil in the native and non-native ranges support the Enemy Release and Enhanced Mutualist Hypotheses, as well as an increased access to nutrients via saprophyte stimulation and/or novel symbiont acquisition. Interestingly, bacterial communities differed more significantly between rhizosphere samples from the two ranges than the archaeal or eukaryotic communities did (Coats, 2013).

Pathogen release, wherein exotic plants in the non-native range are not subjected to the heavy pathogen loads characteristic of native range soils, has been implicated as a common mechanism for plant invasions, especially when coupled with increased access to nutrients (Blumenthal, 2006; Blumenthal et al., 2009). The impacts of enemy release on a plant invasion are determined by two opposing factors: (1) the plant's "escape" from heavy pathogen loads in the native range and (2) the rate of accumulating pathogens in the introduced range (release = escape − accumulation; Mitchell and Power, 2003). Many genera that were found strictly in *B. thunbergii* rhizosphere soils from Japan contain common plant pathogens, including *Clostridium*, *Enterobacter* (*Pantoea*), and *Serratia* (Schaad et al., 2001; Grimont and Grimont, 2006), and these putatively pathogenic microbes occurred in greater abundance in the native soils. For instance, two pathogenic *Serratia* species (*S. proteamaculans* and *S. marcescens*) constituted 1.8% of the total reads in some rhizosphere samples from Japan and as much as 52% of the total in other Japan rhizosphere samples (Grimont and Grimont, 2006; Coats, 2013). *Buttiauxella* was detected in every rhizosphere sample from Japan (compared to three US samples) and it comprised 8.5–70.1% of the total reads, although the average was approximately 30–35% per sample. *Stenotrophomonas*, another putative *Berberis* pathogen, comprised approximately 1–9% of the total reads in the native Japan soils but contributed very little (∼0.1% of the total reads) to the microbial community in the rhizosphere soil from the US (Coats, 2013).
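The relation release = escape − accumulation can be illustrated with a toy calculation. The numbers below are hypothetical pathogen-species counts chosen only to show the arithmetic; they are not data from Mitchell and Power (2003) or Coats (2013), and the function name `enemy_release` is an assumption made for this sketch.

```python
def enemy_release(escape: float, accumulation: float) -> float:
    """Net release: pathogens left behind minus pathogens newly acquired."""
    return escape - accumulation


# Hypothetical example: a plant leaves 8 native-range pathogen species behind
# and picks up 3 new pathogen species in the introduced range.
escaped = 8
accumulated = 3
net = enemy_release(escaped, accumulated)
print(net)  # 5

# If accumulation in the new range catches up with escape, net release is zero.
print(enemy_release(escaped, 8))  # 0
```

A positive value indicates a net reduction in pathogen pressure in the introduced range, whereas a value at or below zero would indicate that newly acquired pathogens have offset the escape.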

The rhizosphere microbial communities associated with *B. thunbergii* also implicate a role for enhanced mutualism as one factor in the development of invasive populations (Coats, 2013). Some genera that contain putative beneficial symbionts, such as *Glomus* (mycorrhizal fungi) and *Frankia* (N2-fixing actinomycete), were detected solely in rhizosphere communities of the invasive range. Other genera that also contain putative beneficials were detected in both regions, although their abundance was greater in the rhizosphere soil from the invasive range. Some of these genera are capable of symbiotic or free-living (diazotrophic) N fixation (e.g., *Bradyrhizobium*, *Rhizobium*, *Azospira*, etc.), whereas others are likely to function more like plant growth-promoting rhizobacteria (e.g., *Bacillus* and *Pseudomonas*) that promote plant fitness by producing growth-stimulating phytohormones (Faure et al., 2009; Effmert et al., 2012), enhancing stress tolerance (Dimkpa et al., 2009; Kang et al., 2010; Pineda et al., 2010), or antagonizing pathogenic microbes that inhabit the root zone (Berendsen et al., 2012).

Alterations to N cycling dynamics are a commonly reported feature of *B. thunbergii* invasions in North American soils, which suggests saprophyte stimulation (via increased litter decay rates) and/or novel symbiont acquisition are responsible for the observed changes in the invasive range (Coats, 2013). Relative to native *Vaccinium* shrubs, *B. thunbergii* plants produce large quantities of N-rich biomass, N-rich leaf litter, and N-rich secondary metabolites (Ehrenfeld et al., 2001; Elgersma et al., 2012) and they harbor higher levels of extractable nitrate in the soil (Ehrenfeld, 1999). *B. thunbergii* preferentially uses nitrate (Ehrenfeld et al., 2001), a trait that facilitates out-competing ammonium utilizing plants (Gilliam, 2006), and these exotic plants have increased rates of nitrification in the soil rather than high N availability from mineralization (Kourtev et al., 2002b, 2003; Elgersma et al., 2011). The rhizosphere soil from *B. thunbergii* showed an increased abundance of nitrifying bacteria such as Nitrospirales (0.0–2.4%) and Nitrosomonadales (0.4–1.6%) in the invasive range soils relative to rhizosphere soils from the native range (0.0–0.3% and 0.0–0.2% for Nitrospirales and Nitrosomonadales, respectively; Coats, 2013). The data acquired by microbiome analysis show that differences in the microbial community structure between the two ranges corroborate previous investigations of soil N cycling beneath *B. thunbergii* in the invasive range. This metagenomic approach also identifies specific organisms that are likely to be the culprits behind changes in the N cycling patterns in the invasive range soil and that can be targeted during future investigation of the microbial function in the rhizosphere.

#### **FUTURE RESEARCH**

Given the recent advances in high-throughput DNA sequencing and the availability of cost-effective microbiome analysis, it is time invasive plant biologists begin to focus on a full characterization of soil microbial communities in an effort to understand how changes or shifts in the rhizosphere microbiome are affecting the above-ground ecology. Metagenomics and metatranscriptomics provide a rapid means to investigate the genomics and gene expression that mediate plant–microbe interactions in the rhizosphere as well as provide much needed information regarding the metabolic capacity and ecological function of rhizosphere microbes. These plant–microbe interactions not only contribute to invasive plant growth and fitness, they also define the range of suitable habitats and areas of competitive advantage. Obtaining high quality predictions for the most susceptible habitats is the best way to prevent invasive plant introduction and subsequent damage. Microbiome profiling of soil, by programs such as the Earth Microbiome Project (http://www.earthmicrobiome.org/; Gilbert et al., 2010), will undoubtedly enhance prediction algorithms and help identify microbial components in regions of high or low susceptibility. However, the information gained from rhizosphere microbiome analysis is not limited to predictions and promoting a better understanding of plant–microbe interactions in natural ecosystems. Microbiome-based investigations will greatly assist in the development of microbial probiotics and/or targeted approaches to reclaiming habitats that have become heavily invaded (Berlec, 2012). Such an approach would continue to build on current methods of reducing cost and environmental damage caused by terrestrial invaders and focus efforts on prohibiting the initial establishment.

#### **CONCLUSION**

The introduction and prevalence of invasive plants, and the threat of increasing invasion rates, substantiate the need to understand the mechanisms underlying the success of plants that become invasive. Symbiotic networks of microorganisms in the soil undoubtedly affect the naturalization of non-native plants in the introduced region and the ability of these plants to outcompete native species. Plant–microbe interactions in the rhizosphere directly contribute to plant fitness, nutrient acquisition, and stress tolerance. Therefore, the rhizosphere microbiome of a plant harbors a tremendous capacity to promote or inhibit invasive growth characteristics. Invasion mechanisms employed by some plants involve rhizosphere microbiome shifts between the native and invasive ranges. These microbial community shifts provide evidence in support of the Enemy Release and Enhanced Mutualist Hypotheses and corroborate plant–microbe feedbacks that lead to enhanced resource acquisition beyond the limits of native flora.

## **AUTHOR CONTRIBUTIONS**

The manuscript was drafted by Vanessa C. Coats with editorial remarks from Mary E. Rumpho.

#### **ACKNOWLEDGMENTS**

This research was supported by U.S. Department of Agriculture CSREES grants no. 2009-38914-19786 and 2010-38914-20996 to Mary E. Rumpho, and a 2010 LL Bean Scientific Research Fellowship in Acadia National Park from the Friends of Acadia National Park (http://www.friendsofacadia.org/) granted to Vanessa C. Coats. This is Maine Agricultural and Forest Experiment Station Publication Number 3368, Hatch Project no. ME08362-07H.

## **REFERENCES**


Vakhlu, J., Sudan, A. K., and Johri, B. (2008). Metagenomics: future of microbial gene mining. *Indian J. Microbiol.* 48, 202–215. doi: 10.1007/s12088-008-0033-2


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2014; paper pending published: 27 May 2014; accepted: 01 July 2014; published online: 23 July 2014.*

*Citation: Coats VC and Rumpho ME (2014) The rhizosphere microbiota of plant invaders: an overview of recent advances in the microbiomics of invasive plants. Front. Microbiol. 5:368. doi: 10.3389/fmicb.2014.00368*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Coats and Rumpho. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

#### *Devin Coleman-Derr and Susannah G. Tringe\**

Joint Genome Institute, Walnut Creek, CA, USA

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Gabriele Berg, Graz University of Technology, Austria

Oded Yarden, The Hebrew University of Jerusalem, Israel

#### *\*Correspondence:*

Susannah G. Tringe, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA e-mail: sgtringe@lbl.gov

The exponential growth in world population is feeding a steadily increasing global need for arable farmland, a resource that is already in high demand. This trend has led to increased farming on subprime arid and semi-arid lands, where limited availability of water and a host of environmental stresses often severely reduce crop productivity. The conventional approach to mitigating the abiotic stresses associated with arid climes is to breed for stress-tolerant cultivars, a time and labor intensive venture that often neglects the complex ecological context of the soil environment in which the crop is grown. In recent years, studies have attempted to identify microbial symbionts capable of conferring the same stress-tolerance to their plant hosts, and new developments in genomic technologies have greatly facilitated such research. Here, we highlight many of the advantages of these symbiont-based approaches and argue in favor of the broader recognition of crop species as ecological niches for a diverse community of microorganisms that function in concert with their plant hosts and each other to thrive under fluctuating environmental conditions.

**Keywords: symbiosis, abiotic stress, agriculture, plant growth promotion, plant–microbe interactions, drought**

## **INTRODUCTION**

Climate change and an increasing world population are predicted to drastically increase the global need for arable farmland, a resource that is already in high demand (Barrow et al., 2008). With the world population expected to reach 9 billion by 2050, it is estimated that the global food supply will need to increase by 70% to meet rapidly rising demand (Editorial, 2010). Changes in the global climate may well compound this challenge, as predicted increases in drought and temperature-related stresses are expected to reduce crop productivity (Ciais et al., 2005; Grover et al., 2010; Larson, 2013).

This large expansion in agricultural output will require both improvements in crop yield and the cultivation of additional farmland. One direct effect of this trend will be the steadily increasing prevalence of farming on marginal, arid, and semi-arid lands, especially in the developing world (Lantican et al., 2003; Köberl et al., 2011). Even without considering the effects of climate change, semi-arid and arid lands often present a host of abiotic challenges to plant growth, including extreme temperatures, excess radiation, and poor nutrient and water availability (Yang et al., 2009).

The historical approach to mitigate the negative effects of abiotic stresses on crop yield has been the creation of stress-tolerant cultivars (Barrow et al., 2008; Eisenstein, 2013). Conventional breeding techniques have enabled the development of crop varietals with increased yields and greater tolerance to a variety of abiotic stresses (Atkinson and Urwin, 2012), but are both time and labor intensive; genetic engineering of crops with improved stress tolerance is faster, but comes with its own set of drawbacks. Furthermore, both methods often neglect the complex ecological context of the soil environment in which the crop is grown (Morrissey et al., 2004).

In recent years, plant-associated microbial communities have received considerable attention for their ability to confer many of the same benefits to crop productivity and stress resistance as have been achieved through plant breeding programs (Mayak et al., 2004; Barrow et al., 2008; Marulanda et al., 2009; Tank and Saraf, 2010; Marasco et al., 2012). It is now well recognized that all plants, and nearly all tissues within the plant, are inhabited by a variety of microorganisms (Partida-Martínez and Heil, 2011; Berg et al., 2013), many of which offer benefits to the host, improving nutrient uptake, preventing pathogen attack, and increasing plant growth under adverse environmental conditions (Yang et al., 2009; Turner et al., 2013). In return, these microorganisms receive shelter from the surrounding environment and access to a carbon-rich food supply. The most well-studied of these symbionts include the mycorrhizal fungi, which enhance nutrient uptake (Bonfante and Anca, 2009), and root-nodulating bacteria, which fix nitrogen from the surrounding soil (Lugtenberg and Kamilova, 2009), but many other novel plant growth-promoting microorganisms (PGPM) continue to be identified each year. These organisms confer stress resistance via diverse mechanisms recently reviewed elsewhere (Lugtenberg and Kamilova, 2009; Yang et al., 2009; Grover et al., 2010; de Zelicourt et al., 2013; Nadeem et al., 2014). Importantly, efforts are being made to harness these naturally occurring, soil-derived beneficial microbes for large-scale improvement of crop performance in agriculture (Nadeem et al., 2014).

In this article, we will highlight some of the advantages associated with symbiont-based approaches to increasing crop resistance to abiotic stress, with a focus on engineering increased tolerance to drought, which is the most critical and prevalent factor for crop production in many parts of the world (Castiglioni et al., 2008; Grayson, 2013). We present suggestions for future directions of abiotic stress tolerance improvement in crop plants, including the use of cutting edge genomic technologies for the identification and selection of candidate symbionts and the functional modules they employ for enhancing host growth, as well as an assessment of current agronomic practices in the light of modern understanding of microbial community influence over plant phenotype. We conclude with an argument in favor of increased collaboration between conventional breeding programs and microbial-based research for crop improvement and, more generally, for a broader conceptual understanding of crop productivity as a complex product of plant genetics and microbial community function.

## **LIMITATIONS ASSOCIATED WITH DIRECT ENGINEERING OF INCREASED STRESS TOLERANCE INTO CROP PLANTS**

The success of plant biotechnology programs has helped the world's food supply keep pace with the increasing rate of population growth (Morrissey et al., 2004). Novel crop varietals, with superior yields as well as increased tolerance to biotic and abiotic stresses, have been continuously produced for decades through conventional plant breeding programs, and more recently through genetic engineering (Atkinson and Urwin, 2012). Despite the undeniable success of these past efforts and their continued applicability to drought-tolerance in crop species, each of these methods has its drawbacks, which should be fully considered. Plant breeding is highly time consuming, as well as labor and cost intensive (Ashraf, 2010; Eisenstein, 2013). Additionally, in the quest for the improvement of a particular trait, such as drought tolerance, certain (often unknown) desirable traits can be unintentionally lost from the host's gene pool during conventional breeding (Philippot et al., 2013). Perhaps the largest drawback, however, is that plant breeding only confers benefit to a single host species, and this benefit is often not easily transferable to other crop systems, as the genetic components responsible for the improvements frequently remain unidentified.

To avoid the time and labor costs associated with conventional breeding, some researchers have turned to generation of transgenic lines for producing varietals with improved plant growth regulators, antioxidants, organic osmolytes or other factors capable of increasing drought tolerance (Eisenstein, 2013). Unfortunately, the vast majority of these are developed and tested in the greenhouse, rather than in the field and claims made regarding their performance are often inflated compared to actual results in agricultural settings, due to the large array of abiotic and biotic factors left out of the initial experiments (Ashraf, 2010). Additionally, these transgenic crops often must pass rigorous food and environmental safety regulations and trials before becoming marketable, which adds additional time to the product development process (Eisenstein, 2013). Furthermore, release of a transgenic product into the marketplace does not guarantee its success, as public response to use of genetically modified crops varies considerably from country to country (Fedoroff et al., 2010).

Both the conventional breeding and genetic engineering based approaches may rely too heavily on the assumption that plants function as autonomous organisms regulated solely by their genetic code and cellular physiology (Barrow et al., 2008), although plant–microbe interactions can heavily influence crop response to environmental conditions. Many field trials of new stress-tolerant cultivars simply have not addressed microbial influence on improved performance (Budak et al., 2013; Swamy and Kumar, 2013; Cooper et al., 2014). Greenhouse trials are often conducted with standard sterilized potting soils and sterilized soil amendments (Porch, 2006; Waterer et al., 2010; Witt et al., 2012) in an attempt to create a microbe-free growth environment, an artificial context rarely if ever found in nature (Friesen et al., 2011; Partida-Martínez and Heil, 2011). By doing so, they not only neglect one of the top determinants of phenotypic output, they may also miss vertically transmitted symbionts present within the plant seed (Barrow et al., 2008), which could lead to overestimations of the effect of host genotype on plant phenotype.

## **ADVANTAGES OF SYMBIONT-BASED APPROACHES TO IMPROVING STRESS TOLERANCE**

Compared with methods for directly engineering stress tolerance into the host described above, symbiont-based approaches to improving stress tolerance offer some clear advantages (**Figure 1**). First, microbial symbionts are frequently capable of conferring stress tolerance to a wide variety of diverse plant hosts, and many PGPM can confer benefits to both monocot and dicot crop species (Timmusk and Wagner, 1999; Redman et al., 2002; Zhang et al., 2008). The bacterium *Achromobacter piechaudii*, isolated from dry riverbeds of southern Israel, was capable of increasing salt and drought resistance in both pepper and tomato (Mayak et al., 2004). Using olive trees, tomato, grapevine, and pepper plants, Marasco et al. (2013) have demonstrated that microbes isolated from the roots of one host species cultivated under desert farming conditions are capable of improving the growth of a different host species when grown under a water-stress regime. The ability to transfer stress-resistance solutions from one crop species to another through a microbial inoculum has the potential to save years of plant breeding effort.

Secondly, PGPM frequently confer more than one type of abiotic and/or biotic stress tolerance (Mayak et al., 2004; Rodriguez et al., 2008), and crops grown on arid and semi-arid lands typically suffer from multiple stress factors. It has been shown that *Arabidopsis* plants in symbiosis with *Paenibacillus polymyxa* have increased drought tolerance as well as improved resistance to pathogen attack (Timmusk and Wagner, 1999). Waller et al. (2005) demonstrated that barley plants inoculated with the fungus *Piriformospora indica* have both increased resistance to *Fusarium* and *Blumeria* infections and increased salt tolerance. These examples of microbes conferring multiple benefits are likely due to the fact that many symbionts exert their influence over the plant host through manipulating plant hormone pathways (Glick et al., 2007; Friesen et al., 2011) and that considerable cross-talk exists between plant stress response pathways (Atkinson and Urwin, 2012).

Thirdly, plant-associated microbial species represent a vast reservoir of genetic information that has coevolved with their hosts under natural environmental conditions. These microbes can add genetic flexibility to the adaptation of comparatively sessile and longer-lived plants (Barrow et al., 2008). The concept of "habitat-specific symbioses," put forth by Rodriguez et al. (2008), is one of the most intriguing discoveries pertaining to microbial contributions to stress tolerance made in recent years. Their research found that salt, drought, and disease resistance were each individually conferred by specific fungal symbionts that had been harvested from coastal, arid, and agricultural environments, respectively. Furthermore, they found that these beneficial effects could be conferred on different plant host species, including both monocots and dicots. These insights suggest that the foundation for the growth-promoting effects of microbial symbionts is based on the co-evolution of the association between plant and microbe under adverse environmental conditions (Rodriguez et al., 2008). For the purposes of developing novel biotechnological agents for use in agriculture, this study supports the idea that the optimal place to look for PGPM that confer resistance to a specific environmental stress is in soils where that stress is a regular phenomenon.

**stress tolerance in crops.** Plant-growth promoting microbes are capable of conferring benefits to multiple species of plant hosts, and of offering improved tolerance to multiple stresses simultaneously. Inoculations with combinations of PGPM can be tailored to specific environmental

the potential to reveal both the microbial and host genetic components responsible for improved stress tolerance; these may serve as targets for plant-breeding/genetic-engineering based approaches to improving stress tolerance in the host.

## **FUTURE DIRECTIONS OF ABIOTIC STRESS TOLERANCE IMPROVEMENT IN CROP PLANTS**

Microbial species with plant-growth promoting capabilities are both numerous and easier to characterize now than ever before. A considerable fraction of endophytes isolated from crops appear to have measurable effects on host fitness (Friesen et al., 2011). Two recent studies found that more than 25% of bacteria isolated from cultivated crops had plant growth promoting activities (Hassan et al., 2010; Marasco et al., 2012). While the identification of microbial endophytes has been challenging in the past due to the frequent lack of plant–host symptoms, localized colonization, intimate integration with plant cellular structures, and lack of cultivability, recent advances in genomic technologies have helped make this process faster and cheaper (Berg et al., 2013). A recent technique for selective depletion of chloroplast and mitochondrial-derived 16S amplicons allows for vastly increased resolution of bacterial endophyte populations derived from within plant tissues (Lundberg et al., 2013). While in the past whole-genome sequencing of candidate symbionts was only possible for cultivable species, it is now possible to obtain draft genomes of microbial endophytes in a high-throughput fashion using single-cell sorting coupled with next-generation sequencing technologies (Woyke et al., 2006). Understanding the genomic content of these PGPMs will enable us to better understand the mechanisms behind the conferred stress-tolerances, as well as cultivate them for experimental investigation (Pope et al., 2011).

As more and more genomes from PGPM become available, our ability to identify the shared genetic components or metabolites that are responsible for conferring specific abiotic stress advantages increases. Through a transcriptomic analysis of the symbiosis between oilseed rape and *Stenotrophomonas rhizophila*, a recent study identified spermidine as a novel PGPM regulator of plant abiotic stress (Alavi et al., 2013). Identification of the genetic components within PGPMs that are responsible for alleviating abiotic stress may in some cases yield potential targets for transgenic modification of the host organism (Nadeem et al., 2014). Recently, bacterial cold-shock proteins transformed into various plant species led to increased tolerance to a variety of abiotic stresses, including cold, heat, and drought (Castiglioni et al., 2008).

Investigation of the mechanisms by which PGPM confer stress tolerance to their plant hosts is another avenue for identifying targets for direct transgenic manipulation of stress response in crops. Recent technological advances in cell-type specific transcriptomics (Taylor-Teeples et al., 2011), combined with an experimental system designed to examine host transcription during symbiosis with PGPM, could allow for a precise dissection of the genetic signaling mechanisms responsible for increased stress tolerance. An improved understanding of these host mechanisms could provide potential candidate loci for transgenic or plant-breeding strategies aimed at plant–host improvement (Grover et al., 2010). For example, salt tolerance induced by *Bacillus subtilis* was shown to be the result of tissue specific modulation of the expression of the *Arabidopsis* Na+/K+ transporter, *HKT1* (Zhang et al., 2008). Similarly, drought resistance in *Arabidopsis* as a result of inoculation with *P. polymyxa* was related to strong upregulation of the host gene *ERD15* (Timmusk and Wagner, 1999).

Finally, there is a need for rethinking modern agronomic practices in light of our current understanding of the importance of host-associated microbial communities for plant productivity and health. Current large-scale agricultural systems rely heavily on monoculture cropping systems, in many cases without between-season crop rotation, which has been shown to lead to the buildup of specialized plant pathogens, increased disease incidence, and decreased yield (Berendsen et al., 2012; Gentry et al., 2013). Research is being conducted to determine whether specific cover crops can be used to promote and maintain a beneficial microbiome between growing seasons for important crop species (East, 2013). Current methods of tilling may also negatively impact the plant microbial community; alternatives, including "conservation-" or "zero-tillage," may have the potential to promote a healthy belowground microbiome by reducing moisture loss and maintaining naturally occurring strata within the soil, which helps support microbial biodiversity (East, 2013).

#### **CONCLUSION**

As with the plant-breeding and transgenic approaches to engineering stress resistance in tomorrow's crops, there are of course challenges associated with symbiont-based strategies that will need to be overcome. One potential challenge will be disentangling synergistic and antagonistic effects of different microorganisms within the plant microbiome (Trabelsi and Mhamdi, 2013). Research has demonstrated synergistic effects of multiple PGPM (Figueiredo et al., 2008), and another study has identified a virus present within a plant growth promoting fungus as the causative agent of heat resistance conferred to a tropical grass (Márquez et al., 2007). A second challenge stems from the fact that while many PGPM have been shown to confer their benefits across multiple host species, it is clear that this is not always the case. In some studies, the host species (and even host cultivar) has been shown to play a significant role in driving microbial community composition and activity (Ofek et al., 2013; Philippot et al., 2013), selecting for and against particular microbial partners. Additionally, interactions between the PGPM and the members of the existing microbial community could alter or negate the potential beneficial effects of the microbe (Schippers et al., 1987). Due to the complexity of interactions among the microbes, host, and environment, there is the potential that a PGPM that confers benefit in one context may have a null, or even negative, effect in a different context; therefore, considerable work will need to be done to determine the range of applicability for each PGPM as a beneficial agricultural agent. A third challenge, which is equally important for both symbiont and host-based methods of improving stress tolerance, will be unraveling the complex relationships between the various biotic and abiotic stress responses.
Research programs aimed at developing tolerance to a particular stress do not necessarily test susceptibility to other stresses; due to the intrinsically related nature of the pathways governing stress response, later field trials have in some instances revealed increased susceptibility to other stresses (Atkinson and Urwin, 2012). Lastly, methods of microbial delivery within field settings and stable integration of PGPMs into the agricultural soil ecosystem will need improvement. While many applications of PGPMs to crops in field settings have demonstrated significant improvements to stress tolerance (Celebi et al., 2010; Mengual et al., 2014; Rolli et al., 2014), others have shown inconsistent or even negative effects (Nadeem et al., 2014). One promising method of stabilizing beneficial effects of PGPM in the field involves the inoculation of a microbial consortium of PGPM, as opposed to a single PGPM species. Combining PGPM known to grow and perform well together will likely increase the resilience of the inoculum and its beneficial effects, and additionally allow for tailoring the community to respond to specific combinations of abiotic and biotic stresses (Trabelsi and Mhamdi, 2013).

Agriculture currently accounts for 70% of human fresh water use, and in many parts of the world this rate of water consumption exceeds local regeneration rates, leading to unsustainable reliance on underground aquifers that are rapidly depleting (Castiglioni et al., 2008; Jiao, 2010). Given this, it is not surprising that drought and other water-related stresses are considered by many to be the most significant threats to global agricultural security in the near future. Encouragingly, in the research conducted by Rodriguez et al., the "habitat-specific symbionts" selected from a coastal site, a geothermal site, and an agricultural site shared one trait: the ability to confer drought resistance. Rodriguez et al. (2008) hypothesize that the ability of fungal endophytes to confer drought tolerance may be a common evolutionary relic from when plants left the ocean, as fungal symbiosis is thought to be in part responsible for the movement of plants to land. If this turns out to be the case, proponents of symbiont-based approaches to increasing stress resistance in crop plants may do well to focus their efforts on drought and other water-related stresses.

In the future, there is a need for more collaboration between the host-focused and symbiont-focused approaches to mitigating abiotic stress in crop plants. Medical science has in recent years undergone a profound restructuring of its understanding of the microbiome housed within the body and its impact on human health (East, 2013). There is a clear parallel here for plant science, with implications that have the potential to change the face of agriculture and help us to meet the challenges confronting humanity in light of our expanding population and changing planet. The fundamental change required is a broader recognition that plants do not exist as autonomous organisms governed entirely by their genetic blueprints, but rather serve as ecological niches for diverse communities of easily overlooked microbes, which work in concert with the plant to survive in a wide range of stressful environmental conditions.

#### **ACKNOWLEDGMENTS**

Work by the U.S. Department of Energy Joint Genome Institute is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Devin Coleman-Derr and Susannah G. Tringe are supported in part by a subcontract to US National Science Foundation Microbial Systems Biology grant IOS-0958245 to Jeffery Dangl, University of North Carolina.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 April 2014; paper pending published: 09 May 2014; accepted: 22 May 2014; published online: 06 June 2014.*

*Citation: Coleman-Derr D and Tringe SG (2014) Building the crops of tomorrow: advantages of symbiont-based approaches to improving abiotic stress tolerance. Front. Microbiol. 5:283. doi: 10.3389/fmicb.2014.00283*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Coleman-Derr and Tringe. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Convergence in mycorrhizal fungal communities due to drought, plant competition, parasitism, and susceptibility to herbivory: consequences for fungi and host plants

## *Catherine A. Gehring<sup>1</sup>\*, Rebecca C. Mueller<sup>1†</sup>, Kristin E. Haskins<sup>1†</sup>, Tine K. Rubow<sup>2</sup> and Thomas G. Whitham<sup>1</sup>*

<sup>1</sup> Department of Biological Sciences and Merriam Powell Center for Environmental Research, Northern Arizona University, Flagstaff, AZ, USA <sup>2</sup> Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA

#### *Edited by:*

M. Pilar Francino, Center for Public Health Research, Spain

#### *Reviewed by:*

John Everett Parkinson, The Pennsylvania State University, USA Jukka Jokela, Eidgenössische Technische Hochschule Zurich, Switzerland

#### *\*Correspondence:*

Catherine A. Gehring, Department of Biological Sciences and Merriam Powell Center for Environmental Research, Northern Arizona University, 617 South Beaver Street, Flagstaff, AZ, USA e-mail: catherine.gehring@nau.edu

#### *†Present address:*

Rebecca C. Mueller, Los Alamos National Laboratory, Los Alamos, NM, USA;

Kristin E. Haskins, The Arboretum at Flagstaff, Flagstaff, AZ, USA

Plants and mycorrhizal fungi influence each other's abundance, diversity, and distribution. How other biotic interactions affect the mycorrhizal symbiosis is less well understood. Likewise, we know little about the effects of climate change on the fungal component of the symbiosis or its function. We synthesized our long-term studies on the influence of plant parasites, insect herbivores, competing trees, and drought on the ectomycorrhizal fungal communities associated with a foundation tree species of the southwestern United States, pinyon pine (*Pinus edulis*), and described how these changes feed back to affect host plant performance. We found that drought and all three of the biotic interactions studied resulted in similar shifts in ectomycorrhizal fungal community composition, demonstrating a convergence of the community towards dominance by a few closely related fungal taxa. Ectomycorrhizal fungi responded similarly to each of these stressors resulting in a predictable trajectory of community disassembly, consistent with ecological theory. Although we predicted that the fungal communities associated with trees stressed by drought, herbivory, competition, and parasitism would be poor mutualists, we found the opposite pattern in field studies. Our results suggest that climate change and the increased importance of herbivores, competitors, and parasites that can be associated with it, may ultimately lead to reductions in ectomycorrhizal fungal diversity, but that the remaining fungal community may be beneficial to host trees under the current climate and the warmer, drier climate predicted for the future.

**Keywords: climate change, community convergence, community disassembly, competition, drought, ectomycorrhizal fungi, herbivory, mistletoe parasitism**

## **INTRODUCTION**

The aims of the field of community ecology include understanding how communities respond to changing environmental conditions, as well as the consequences of those changes for both communities and ecosystems. Understanding community trajectories is currently of greater importance due to agents of rapid environmental change such as climate warming and the introduction of non-native species (Tylianakis et al., 2008). Both of these global changes can alter interspecific interactions with consequences for species persistence and biodiversity (Voigt et al., 2003; Zarnetske et al., 2012). Changes in herbivore and predator communities due to global change have been argued to have disproportionate effects on the broader community (Zarnetske et al., 2012). In grassland, warmer temperatures and higher nitrogen increased insect herbivore biomass, with no concomitant increase in parasitoids (de Sassi and Tylianakis, 2012). Global change impacts on mutualist communities are predicted to be among the most extreme (Dunn et al., 2009). For example, disruption of vertebrate seed dispersal mutualisms may create "widow" species that lack mutualist services (Aslan et al., 2013). Likewise, disruption of mutualistic associations between plants and mycorrhizal fungi by non-native plant species may tilt the competitive balance towards non-native plants (Vogelsang and Bever, 2009; Meinhardt and Gehring, 2012).

While there are abundant examples of the impacts of environmental change on communities, studies are often necessarily focused on one aspect of environmental change, leaving us with little information on the similarities or differences in community trajectories in response to different types of environmental change. For example, a significant body of research has demonstrated the importance of the symbiosis between plants and mycorrhizal fungi at the individual, population, community, and ecosystem level (see examples in Johnson and Gehring, 2007; Smith and Read, 2008), and several individual studies have documented shifts in fungal communities due to environmental changes such as nitrogen deposition and climate change (e.g., Lilleskov et al., 2002; Heinemeyer et al., 2004). However, many of these studies have focused on the relationships between plants and fungi in isolation from other biotic interactions such as competition, facilitation, and herbivory, and even fewer studies have determined if fungal communities respond similarly to varied perturbations. Do mycorrhizal fungi respond similarly to the parasites and herbivores that feed on their host plants, for example? Do communities change in similar ways if the stressor is abiotic versus biotic? The consequences of fungal community changes for host plant growth and survival are often also poorly known, but fungal symbionts, including mycorrhizal fungi, may alter host plant response to global change (Kivlin et al., 2013). Understanding the feedbacks among global changes, mycorrhizal fungal communities and host plant survival will provide insights into the long-term effects of global change on ecosystems.

In this paper, we examined the interactions between communities of ectomycorrhizal fungi (EMF) associated with a single plant species as it interacted with an insect herbivore (the scale insect, *Matsucoccus acalyptus* Herman), a plant parasite (dwarf mistletoe, *Arceuthobium divaricatum* Engelm), and an interspecific belowground tree competitor (*Juniperus monosperma* Engelm). We also examined if abiotic and biotic stressors resulted in similar community shifts by comparing the EMF of insect herbivore-affected and unaffected trees at two time points, one prior to long-term drought in the study area and one in the midst of a severe, ongoing drought that began in 1995 (Mueller et al., 2005). We focused on EMF because of the intimate trading partnership they develop with their plant hosts in which soil resources (nutrients and water) are exchanged for photosynthate (Smith and Read, 2008). Stressors such as drought, herbivory, parasitism, and competition all may increase host plant need for soil resources while potentially reducing the ability of the plant to provide photosynthate to EMF whose carbon requirements can be substantial (Nehls, 2008). EMF also represent good models for community studies because they are diverse, with an estimated 200+ genera from eleven orders involved in the association (Tedersoo et al., 2010), and their communities can be highly responsive to environmental change (Swaty et al., 2004). We tested the following hypotheses: (1) Communities of EMF will respond similarly to biotic stresses of parasitism, herbivory, and competition leading to a convergence in community structure associated with biotic stress. We predicted that these biotic stresses would result in similar changes in EMF community composition because they likely alter the ability of host plants to provide photosynthate to EMF, resulting in an EMF community composed of species with lower carbon demands. 
(2) Communities of EMF will respond similarly to the abiotic stress of drought as they do to the biotic stress of herbivory. Again, we reasoned that chronic herbivory and drought stress would affect EMF communities similarly because both stressors were likely to lead to photosynthate limitation. (3) Plants colonized by the EMF community associated with high herbivory, parasitism, and competition will exhibit poor growth. Previous studies have shown that EMF with low carbon requirements tend to invest less in structures such as external hyphae (Saikkonen et al., 1999), suggesting that they may be inferior mutualists.

We tested these hypotheses using DNA sequence data on the root-colonizing EMF communities associated with pinyon pine (*Pinus edulis* Engelm.), a foundation tree species distributed across large areas of the southwestern US. This species has experienced substantial, drought-related mortality in recent years across much of its distribution (Mueller et al., 2005; Garrity et al., 2013). We have previously shown that herbivory by a needle-feeding scale insect (Gehring and Whitham, 2002), parasitism by a dwarf mistletoe (Mueller and Gehring, 2006), and competition with co-dominant juniper (Haskins and Gehring, 2004) altered EMF community composition. Here we synthesized these data sets and conducted new analyses to determine if these varying biotic stressors had similar impacts on EMF community composition (Hypothesis 1). Repeated sampling of the same herbivore resistant and herbivore susceptible trees before and during drought allowed us to assess the similarity of drought and herbivore affected communities (Hypothesis 2). Long-term herbivore removal experiments provided us with the opportunity to examine the influence of changes in EMF community composition on plant growth when the direct impact of herbivores on plant performance was dramatically reduced (Hypothesis 3). This study is important because it compares the responses of EMF communities to different types of stressors, both biotic and abiotic, and examines the potential consequences of community changes to the host plant. Studies of such complex interactions are of growing importance given that global change has been shown to influence nearly every type of species interaction (Tylianakis et al., 2008). Also, while community disassembly, the nonrandom process of progressive species decline or loss, has been demonstrated in response to a variety of global changes (Zavaleta et al., 2009), the dynamics of this process in the mycorrhizal symbiosis remain poorly understood.

#### **MATERIALS AND METHODS**

#### **HYPOTHESIS 1: COMPETITION, PARASITISM, AND HERBIVORY WILL HAVE SIMILAR EFFECTS ON EMF COMMUNITY COMPOSITION**

To test Hypothesis 1, we used previously published data on the EMF communities of *P. edulis* that experienced low versus high levels of three types of negative biological interactions: (1) parasitism by the dwarf mistletoe (*A. divaricatum*) which derives water, mineral nutrients, and a portion of its carbon requirements from its host plant (Mueller and Gehring, 2006), (2) belowground competition with juniper (*J. monosperma*), a co-dominant, drought tolerant tree in the pinyon-juniper woodland ecosystem (Haskins and Gehring, 2004), and (3) herbivory by a scale insect (*M. acalyptus*) that feeds on the leaf mesophyll tissue of juvenile *P. edulis* leading to premature needle abscission, reduced growth, and a characteristic poodle tail architecture of susceptible trees (Cobb and Whitham, 1993), whereas resistant trees have normal tree architecture and a full complement of needle cohorts. Results of scale insect transfer experiments suggested that resistance versus susceptibility to the scale is genetically based (Cobb and Whitham, 1993; Gehring et al., 1997). Although species richness and diversity did not respond consistently across studies, in all cases the EMF community composition of *P. edulis* experiencing low levels of the biotic interaction was significantly different from that of *P. edulis* experiencing high levels of the biotic interaction (Gehring and Whitham, 2002; Haskins and Gehring, 2004; Mueller and Gehring, 2006). In the case of *M. acalyptus*, degree of foliage loss due to scale herbivory on scale resistant and susceptible trees was significantly, linearly associated with degree of change in EMF community composition (*r*<sup>2</sup> = 0.591, *P* < 0.001; Gehring and Bennett, 2009). We took advantage of natural variation in herbivory and mistletoe parasitism but experimentally reduced belowground competition with juniper by trenching. In this study, we compared the community composition of EMF across the studies to determine if these three biotic stressors resulted in convergent or divergent communities.

Although detailed methods can be found in the individual publications, a brief description follows. All studies were conducted in pinyon-juniper woodlands in northern Arizona, but soil type, year of sampling, and tree size and age varied among studies. Within a study, all high and low biotic interaction trees were intermixed at the same site, but sites differed among studies. Trees experiencing competition and insect herbivory occurred within 2 km of one another on nutrient poor volcanic soils, but mature trees were sampled in the competition study and juvenile (pre-reproductive) trees in the herbivory study. Plant parasite effects on *P. edulis* EMF communities were studied on mature trees at sites with better developed soils of volcanic origin more than 35 km distant from the other sites. Because of this variation in sites, tree age, and year of sampling, and the high diversity of EMF, we expected that trees experiencing low levels of these different negative biotic interactions would have different communities.

We used similar methods to characterize EMF communities. Briefly, we collected fine roots (<2 mm diameter) from each tree at a depth of 0–30 cm. Roots for the herbivory study were collected in 1994, and for the competition and parasitism studies in 2002. We classified between 75 and 100 living EM root tips per tree based on morphology and stored the EM root tips in 1.5 ml microcentrifuge tubes at –20°C until molecular analyses were conducted. This level of sampling has been shown to adequately characterize the EMF community as extensive assessment of *P. edulis* showed that individual trees had seven or fewer species, with two species dominating (82%) the community (Gehring et al., 1998). We extracted the DNA from a minimum of two to three root tips of each morphotype from each tree using DNeasy Kits (Qiagen, Valencia, CA, USA). We used the mini-prep method of Gardes and Bruns (1993) to extract DNA from the herbivory samples collected in 1994. DNA extraction and amplification success was similar for samples collected during all years, averaging >90%. We amplified the internal transcribed spacer (ITS) region of the fungal genome, located between the 18S and 28S rRNA, using PCR (polymerase chain reaction) with the ITS1F and ITS4 primer pair (Gardes and Bruns, 1993). Morphotypes were characterized by a single species of EMF, except for the smooth, red-brown morphotype that characterizes the genus *Geopora*. Multiple closely related species of *Geopora* are found on *P. edulis* (Gordon and Gehring, 2011); additional sequencing was done to estimate the relative abundance of the *Geopora* species if multiple species were found in the initial screening. We assembled forward and reverse DNA sequences in BioEdit version 7.0.5.3 (Hall, 1999) to create a consensus sequence that was used in a BLASTn search on the NCBI and UNITE websites (Altschul et al., 1990; Abarenkov et al., 2010).
We used percentage query coverage, percentage maximum identity, and bit score data to identify the closest match of our fungi to those in these databases. The names of some species reported in previous papers were modified based on cross-referenced nomenclature and phylogenetic placements with Index Fungorum (http://www.indexfungorum.org) accessed during January of 2014.

We visualized data on the community composition of EMF associated with the six groups of trees using relative abundance data (the percentage of a given EMF species relative to all EMF root tips in a sample) and non-metric multidimensional scaling (NMS) ordinations with a Bray-Curtis distance measure in PC-ORD 5.10 (McCune and Mefford, 2006). We used an analysis of similarity (ANOSIM) in PRIMER version 6.1 (Clarke and Gorley, 2006) to determine if the EMF communities of low biotic interaction trees (low herbivory, low competition, low parasitism) differed from one another. We used the same type of analysis to determine how the EMF communities of high biotic interaction trees compared to one another. Hypothesis 1 would be supported if we observed significant differences among communities in low interaction trees, but no difference in community composition in high interaction trees.
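The relative abundance measure and the Bray-Curtis distance named above can be illustrated with a minimal Python sketch. It is not the PC-ORD or PRIMER implementation; the function names and the root-tip counts are hypothetical and chosen only for illustration.

```python
def relative_abundance(counts):
    """Convert root-tip counts per EMF species to percentages of all tips in a sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no root tips")
    return {species: 100.0 * n / total for species, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Returns 0 for identical profiles and 1 when no species are shared.
    """
    species = set(a) | set(b)
    diff = sum(abs(a.get(s, 0.0) - b.get(s, 0.0)) for s in species)
    total = sum(a.get(s, 0.0) + b.get(s, 0.0) for s in species)
    return diff / total if total else 0.0


# Hypothetical root-tip counts for two trees (not data from the study).
tree_a = relative_abundance({"Geopora sp. 1": 60, "Rhizopogon sp.": 20})
tree_b = relative_abundance({"Geopora sp. 1": 20, "Tricholoma terreum": 60})
distance = bray_curtis(tree_a, tree_b)
```

A matrix of such pairwise distances among trees is the input that an ordination or a permutation test would then operate on.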

#### **HYPOTHESIS 2: COMMUNITIES OF EMF WILL RESPOND SIMILARLY TO THE ABIOTIC STRESS OF DROUGHT AS THEY DO TO THE BIOTIC STRESS OF HERBIVORY**

We addressed this hypothesis by re-sampling the juvenile trees that experienced high versus low levels of herbivory in 2004, ten years after the first sampling (*n* = 14 trees per group). The trees were still non-reproductive in 2004. The first year sampled, 1994, occurred at the end of a period of wet years, while the second, 2004, occurred during a period of ongoing drought. Average early year (January–May) precipitation totaled 188.4 mm for the 5 years before the 1994 collection and 86.6 mm for the 5 years before the 2004 collection (Sthultz et al., 2009a,b). The persistently dry conditions beginning in 1995–1996 resulted in extensive mortality of *P. edulis* in northern Arizona (Mueller et al., 2005). The methods used to characterize EMF communities were similar for the 2004 and 1994 sampling periods, with the exception of DNA extraction using the mini-prep method in 1994 as noted above. Likewise, community data were visualized using ordinations in PC-ORD. We tested the influence of insect herbivory (insect susceptible high versus insect resistant low) and year (1994 versus 2004) on EMF community composition with a permutation-based nonparametric multivariate analysis of variance (PerMANOVA; Anderson, 2001) using relative abundance data in PRIMER version 6.1 (Clarke and Gorley, 2006). We sampled the same trees each year and accounted for this repeated sampling by including tree identity as a factor nested within the insect resistance category. We analyzed the main effects of herbivory and year as a two-way factorial (*P* ≤ 0.05). Hypothesis 2 would be supported if the EMF communities of *P. edulis* experiencing low levels of herbivory shifted with drought to resemble those of *P. edulis* experiencing high levels of herbivory. We did not expect the community composition of trees experiencing high herbivory to change with drought.
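The logic of a permutation-based pseudo-*F* test can be sketched in Python as follows. This is a deliberately simplified one-way version and not the PRIMER analysis used here: it omits the second factor, the interaction term, and the nested tree-identity factor. The function names, the distance matrix, and the group labels are hypothetical.

```python
import random


def pseudo_f(dist, groups):
    """One-way pseudo-F from a square distance matrix and one group label per sample."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    if a < 2 or n <= a:
        raise ValueError("need at least two groups and more samples than groups")
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(members) for j in members[k + 1:])
        ss_within += pairs / len(members)
    ss_among = ss_total - ss_within
    if ss_within == 0:
        return float("inf")
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p(dist, groups, n_perm=999, seed=0):
    """Observed pseudo-F and its permutation P-value from shuffled group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical distances among four samples (not data from the study).
dist = [
    [0.0, 0.2, 0.8, 0.8],
    [0.2, 0.0, 0.8, 0.8],
    [0.8, 0.8, 0.0, 0.2],
    [0.8, 0.8, 0.2, 0.0],
]
groups = ["low", "low", "high", "high"]
f_obs, p_value = permutation_p(dist, groups)
```

With only four samples the permutation distribution is very coarse, so the example shows the mechanics rather than a meaningful significance test.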

#### **HYPOTHESIS 3: PLANTS COLONIZED BY THE EMF COMMUNITY ASSOCIATED WITH HIGH HERBIVORY, PARASITISM, AND COMPETITION WILL EXHIBIT POOR GROWTH**

We tested this hypothesis by sampling EMF communities and shoot growth in an independent set of juvenile *P. edulis*. These trees had experienced chronic scale insect herbivory in the past, but these insects had been mechanically removed for 19 years, allowing both foliage and EMF abundance to completely recover (Cobb and Whitham, 1993; Gehring et al., 1997). Preliminary measurements indicated that these trees had EMF communities that encompassed those of both high and low herbivory, allowing us to examine the effects of community variation without the complication of variation in parasitism, competition, or foliage loss due to herbivory. We sampled fifteen susceptible trees that had had their insects experimentally removed for EMF communities in August 2004, using the methods described previously. At the same time, we measured the length of ten shoots per tree (2004 growth only) as an estimate of tree growth. We compared the relative abundance of three members of the genus *Geopora* with shoot growth using regression analysis in IBM SPSS version 20. We chose the relative abundance of these *Geopora* as our measure of EMF community variation because our tests of Hypothesis 1 indicated that these taxa increased substantially in association with parasitism, competition, and herbivory (see below). Hypothesis 3 would be supported if we observed a significant negative relationship between shoot growth and the abundance of *Geopora* in the EMF community.
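The regression of shoot growth on *Geopora* relative abundance can be illustrated with a minimal least-squares sketch in Python. It is not the SPSS procedure used here; the function name and the five data points are hypothetical and serve only to show how the slope, *r*<sup>2</sup>, and the *F* statistic relate to one another.

```python
def linear_regression(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared, f_statistic), where the F statistic
    has 1 and n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    f_statistic = float("inf") if r_squared == 1 else r_squared / (1 - r_squared) * (n - 2)
    return slope, intercept, r_squared, f_statistic


# Hypothetical values (not data from the study):
# x = Geopora relative abundance (%), y = mean shoot length (cm).
geopora_pct = [10, 20, 30, 40, 50]
shoot_cm = [2, 3, 5, 4, 6]
slope, intercept, r_squared, f_statistic = linear_regression(geopora_pct, shoot_cm)
```

Under Hypothesis 3 the slope would be negative; a positive slope with a significant *F* would contradict it.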

#### **RESULTS**

#### **HYPOTHESIS 1: COMPETITION, PARASITISM, AND HERBIVORY WILL HAVE SIMILAR EFFECTS ON EMF COMMUNITY COMPOSITION**

In support of hypothesis 1, the EMF communities associated with *P. edulis* experiencing low levels of parasitism, herbivory, and competition were significantly different from one another (*A* = 0.172, *P* < 0.001), while the communities of *P. edulis* experiencing high levels of these same interactions were similar (*A* = 0.015, *P* = 0.651; **Figure 1**). We observed 18 species of EMF across the six groups of trees; members of the genera *Geopora* (five species) and *Rhizopogon* (three species) were the most common but we also observed species in the genera *Tricholoma, Lactarius, Inocybe, Russula, Cortinarius*, and *Tomentella.*

All three of the high biotic interaction communities were dominated by the same three members of the genus *Geopora* that made up 95, 89, and 77% of the relative abundance in the high parasitism, high competition, and high herbivory trees, respectively (**Figure 2**). Members of this genus were much less common on low biotic interaction trees, averaging 39% relative abundance. Among the low interaction trees, the relative abundance of *Geopora* was highest on low herbivory trees. However, most of the *Geopora* observed on these trees were of different species than the *Geopora* observed on high biotic interaction trees, and included *G. cooperi*, which appears to be phylogenetically distinct from the other species (**Figure 2**; Guevara-Guerrero et al., 2011; Stielow et al., 2012; Flores-Rentería et al., 2014). Members of the genus *Rhizopogon* dominated *P. edulis* experiencing low competition, while *Tricholoma terreum* dominated *P. edulis* experiencing low parasitism (**Figure 2**).

#### **HYPOTHESIS 2: COMMUNITIES OF EMF WILL RESPOND SIMILARLY TO THE ABIOTIC STRESS OF DROUGHT AS TO THE BIOTIC STRESS OF HERBIVORY**

Our hypothesis that drought would result in similar shifts in EMF community composition as insect herbivory was supported.

**another (bottom portion of graph) while trees with high levels of parasitism, herbivory or competition had very similar EMF communities (top portion of graph).** Data represent the community centroids and the SE surrounding those centroids as follows: orange circles – mistletoe parasitism; green squares – scale insect herbivory; purple triangles – competition with juniper.

The EMF communities of susceptible, high herbivory trees were similar in pre-drought and drought years, while the EMF communities of resistant, low herbivory trees changed substantially during the drought year, becoming more like the communities of high herbivory trees (**Figure 3**).

This change in low herbivory trees was supported by a significant herbivory by year interaction across 10 years from normal to severe drought conditions (Pseudo *F*<sub>1,53</sub> = 2.52, *P* = 0.041). The main effect of herbivory was also statistically significant (Pseudo *F*<sub>1,53</sub> = 2.86, *P* = 0.014), while the main effect of year was not statistically significant (Pseudo *F*<sub>1,53</sub> = 1.902, *P* = 0.109). The three members of the genus *Geopora* observed to increase dramatically with herbivory, competition, and parasitism also increased substantially during the drought year in low herbivory trees, shifting from 16% of the community to 58% of the community. We sampled the same trees for EMF communities in both 1994 and 2004, but tree identity did not explain a significant portion of the variation in community composition (Pseudo *F*<sub>25,53</sub> = 0.747, *P* = 0.936).

#### **HYPOTHESIS 3: PLANTS COLONIZED BY THE EMF COMMUNITY ASSOCIATED WITH HIGH HERBIVORY, PARASITISM, AND COMPETITION WILL EXHIBIT POOR GROWTH**

In contrast to our hypothesis, shoot growth was significantly positively correlated with the abundance of the three species of *Geopora* that dominated on trees that experienced high levels of competition, parasitism, and herbivory (*r*<sup>2</sup> = 0.574, *F*<sub>1,13</sub> = 17.454, *P* = 0.001; **Figure 4**).

#### **DISCUSSION**

#### **COMMUNITY CONVERGENCE TOWARDS GENERALIST ECTOMYCORRHIZAS**

The convergence of EMF communities in response to biotic and abiotic stressors is consistent with several of the predictions of Chase (2003) who argued that community assembly would lead to a single equilibrium state in environments with small regional species pools, high dispersal potential, low levels of productivity and frequent disturbance. Relative to better studied plant communities, fungal communities assemble and disassemble rapidly, and are likely more linked to finer-scale environmental changes, which helps explain why the communities of trees experiencing low competition, herbivory, and parasitism were different, while trees under abiotic or biotic environmental stress (e.g., disturbance) were not. Chase (2003) also found that as site productivity increased, communities at the same site became more dissimilar. Consistent with these results, *P. edulis* experiencing low levels of drought and/or negative biotic interactions likely represented high productivity environments for EMF, promoting community dissimilarity. Ikeda et al. (2014) used similar arguments to predict that changes in host productivity with climate change would influence the community structure of dependent communities such as mycorrhizal mutualists and herbivores.

The EMF communities of *P. edulis* experiencing high biotic and abiotic stress converged toward a community highly dominated by three species within the same genus, *Geopora*. A review of the effects of past and current climate change on species interactions indicated that climate change frequently resulted in communities dominated by generalist species and interactions (Blois et al., 2013). The distribution and symbiotic traits of members of the genus *Geopora* are poorly understood, yet they appear to be generalists. They have been observed on both gymnosperm and angiosperm hosts (Fujimura et al., 2005; Hrynkiewicz et al., 2009; McDonald et al., 2010), and in association with ecosystems ranging from arid shrubland to boreal forest (Tedersoo et al., 2006; McDonald et al., 2010). Members of the genus *Geopora* were the principal EMF colonists of willow clones planted for restoration in fly ash that had been inoculated with another genus of EMF (Hrynkiewicz et al., 2009), suggesting they may disperse readily and survive well in harsh environmental conditions. Previous studies with *P. edulis* also documented increases in the relative abundance of members of this genus within and among sites as drought intensified in the southwestern United States (Sthultz et al., 2009a,b; Gordon and Gehring, 2011; Gehring et al., 2014).

**FIGURE 3 | An NMS ordination showing the EMF communities of high insect herbivory (susceptible trees) and low insect herbivory (resistant) trees during a pre-drought period in 1994 and a drought period that began in 1996 and continues to the present.** The tree types are represented by different symbols (open symbols indicate susceptible, high herbivory, which is also indicated by the icon showing the poodle tail architecture resulting from high foliage loss; closed symbols indicate resistant, low herbivory shown by an icon with a full complement of needles). Pre-drought samples are indicated with squares and drought samples with circles. Each point represents the centroid of the EMF community of 14 replicates per treatment with vertical and horizontal bars depicting ±1 SE. Arrows show trajectories of communities from the pre-drought to the drought period.

Interestingly, convergence toward *Geopora* dominance happened more rapidly with drought in the scale resistant juvenile *P. edulis* described here than in moth resistant mature *P. edulis* at the same site (Gehring et al., 2014). Given that all of the *P. edulis* studies described patterns of EMF communities before and during drought, thereby confounding drought and sampling time, alternative explanations for the community shifts are possible. Experimental work is necessary to substantiate these patterns and to explore mechanisms.

The community convergence observed across the three biotic interactions is striking given that shifts in abundance, measured as percent root colonization, were positive in some studies, but not in others. Abundance of EMF was lower on pinyons with high levels of root competition and insect herbivory, but was higher on trees with high plant parasitism (Haskins and Gehring, 2004; Mueller and Gehring, 2006). This finding suggests that even when pinyon hosts invested more in the EMF symbiosis following parasitism, they tended to associate with a limited group of EMF. The extreme convergence we observed is also surprising given that the site where plant parasitism was studied was more than 30 km distant from the others, with distinct soil characteristics, particularly soil nutrients. We would have expected a different pool of EMF to be present in this site, including a different subset of species tolerant of high biotic stress. As mentioned above, we know little about the biology of members of the genus *Geopora* that would help explain these patterns. However, relatives of the genus *Geopora* in the order Pezizales were reported to have significant saprotrophic abilities (Tedersoo et al., 2010), which could allow them to persist in situations when they are poor competitors with other EMF for root colonization sites.

#### **ECTOMYCORRHIZAL FUNGAL COMMUNITY DISASSEMBLY**

The species losses and community convergence of EMF we observed in response to multiple environmental stressors is indicative of community disassembly. Community disassembly has been observed in response to global changes such as habitat destruction and climate change (reviewed in Zavaleta et al., 2009), and can occur over very short time scales, particularly with environmental perturbations that alter species interactions, such as invasion by an exotic species (Sanders et al., 2003). In many studies that observed community disassembly, species losses were associated with specific traits, such as rarity or degree of specialization (Zavaleta et al., 2009). For organisms involved in symbioses, traits that directly or indirectly impact the fitness of their partner may also impact their own survival, particularly under stressful conditions. Here we documented how multiple biotic and abiotic stressors acted in concert to favor a community of generalist ectomycorrhizal fungal mutualists. These seemingly disparate drivers of community disassembly may have had similar effects on EMF communities because they altered the trading relationships within the symbiosis, favoring fungi with low carbon demands as the photosynthetic capabilities of the host were compromised. The carbon demands of *Geopora* relative to other EMF have not been studied, but they have the morphological characteristics described for low cost fungi in other systems (Saikkonen et al., 1999).

Although the ectomycorrhizal symbiosis is generally considered mutualistic, it can be constructive to think of mutualisms in the context of reciprocal cheating, which persists only when both partners are able to prevent cheating by the other (Hoeksema and Kummel, 2003). Shifting abiotic conditions can alter the impact of biotic interactions (Agrawal et al., 2007). The cost to benefit ratio of the ectomycorrhizal symbiosis has been shown to change under different environmental conditions (Kennedy and Peay, 2007), and host plants have been shown to regulate their EMF partners under changing environmental conditions. For example, Peay et al. (2010) found that seedlings were able to maintain high growth rates under experimental nutrient enrichment by reducing colonization by EMF of the genus *Rhizopogon*. Across a natural environmental gradient, Moeller et al. (2013) found that the traits of EMF reflected the nutritional needs of their host plants, with communities composed of efficient foragers with high carbon requirements dominating in nutrient deficient soils. Because *Geopora* is a common member of the EMF communities found on pinyons, trees on which less efficient mutualists were eliminated were able to maintain higher growth rates as in Peay et al. (2010). The strong positive relationship observed between dominance by *Geopora* and pinyon growth suggests that although community disassembly was often considered detrimental (Zavaleta et al., 2009), negative effects may not always be observed, at least in the short term. The abundance of *Geopora* also was positively associated with host plant growth in another study of trees that experienced drought for a longer period than our study trees (Gehring et al., 2014). In addition, *P. edulis* that survived extreme drought were dominated by members of this EMF community (Swaty et al., 2004; Sthultz et al., 2009a,b). 
Taken together, these studies suggest that community disassembly may be a critical response to stress that favors the host tree and a subset of the EMF community.

#### **LONG-TERM EFFECTS OF EMF COMMUNITY CONVERGENCE**

In a drought year, pinyons colonized by EMF communities dominated by *Geopora* had higher growth rates, but the long-term effects of hosting such constrained communities are unclear. Plants have been shown to benefit from hosting a highly diverse EMF community (Baxter and Dighton, 2001; Jonsson et al., 2001), likely because this results in higher functional diversity of EMF traits, such as the ability to access different forms of phosphorus and nitrogen (Baxter and Dighton, 2005). However, whether communities composed of closely related species have lower functional diversity is unclear. Studies linking community relatedness and functional diversity have found both negative (Burns and Strauss, 2011) and positive relationships (Prinzing et al., 2008). For EMF, this could be further complicated by studies that showed that the relative effects of EMF species on host growth can change depending upon environmental conditions (Kipfer et al., 2012). As a result, it is possible that communities composed primarily of *Geopora* could be less beneficial under more benign environmental conditions.

Community convergence could potentially alter the relative cost to benefit ratio of EMF communities dominated by *Geopora* under non-drought conditions, but another possible outcome is the loss of biodiversity within the larger EMF community. Arid conditions are predicted for the duration of this century in the southwestern USA (Seager et al., 2007), and concurrent increases in herbivory and competition that can result from warmer, drier conditions (Anderegg et al., 2013) could facilitate the persistence of *Geopora*-dominated communities to the detriment of other species of EMF. In the high stress situations we observed, the relative abundance of three common species of *Geopora* averaged 87%. An additional site in which *Geopora* was uncommon in association with *P. edulis* shifted to *Geopora* dominance with drought (Gordon and Gehring, 2011). Ongoing drought could lead to the extirpation of once common species of EMF from large areas of northern Arizona. The persistence of these formerly common species may rely on their survival as propagules in the soil, a poorly understood aspect of the biology of EMF. Data from the most comprehensive study of spore longevity in EMF to date showed that several initially abundant species persisted for a minimum of six years as spores, while other, initially less common species were no longer observed after the same time period (Nguyen et al., 2012). Locally extirpated species of EMF could also colonize from areas more favorable for their growth and reproduction. However, long distance dispersal may be required as recent studies suggest that EMF propagules decreased rapidly with increasing distance from spore sources (Peay et al., 2012).

## **CONCLUSION**

Several conclusions have emerged from our long-term studies spanning wet to record dry conditions. First, diverse stressors including plant parasites, insect herbivores, competing trees, and drought similarly altered the EMF communities associated with an iconic foundation tree species that characterizes much of the arid American Southwest. Second, this community disassembly resulted in convergence towards a few closely related, generalist species of EMF. Third, while this community shift had negative consequences for the distribution of previously dominant fungi, the change may be beneficial for host plants because the remaining EMF community members were better mutualists under current, drought conditions. Fourth, the long-term trajectory of community disassembly appeared to follow some of the "rules" of community disassembly observed in other systems, demonstrating the importance of both the drivers of change and the abiotic context in which they were found.

### **AUTHOR CONTRIBUTIONS**

Catherine A. Gehring, Rebecca C. Mueller, Kristin E. Haskins, Tine K. Rubow, and Thomas G. Whitham designed and conducted the initial studies upon which the synthesis in this manuscript was based. Tine K. Rubow and Catherine A. Gehring conducted the subsequent sampling of a subset of the trees during drought. Catherine A. Gehring analyzed the data and wrote the first draft of the manuscript. Rebecca C. Mueller, Kristin E. Haskins, and Thomas G. Whitham provided valuable comments on the manuscript. Tine K. Rubow passed away before the first draft of the manuscript was written.

#### **ACKNOWLEDGMENTS**

We thank N. S. Cobb for identifying scale resistant and susceptible trees and initiating the scale removal experiment, the U.S. Forest Service and Sunset Crater National Monument for their cooperation, NSF DEB0816675 and LTREB DEB0236204 for funding, and the Gehring lab group and two reviewers for helpful comments on the manuscript.

#### **REFERENCES**

Abarenkov, K., Henrik Nilsson, R., Larsson, K.-H., Alexander, I. J., Eberhardt, U., Erland, S., et al. (2010). The UNITE database for molecular identification of fungi–recent updates and future perspectives. *New Phytol.* 186, 281–285. doi: 10.1111/j.1469-8137.2009.03160.x


drought: evidence for long-term vegetation shifts. *J. Ecol.* 93, 1085–1093. doi: 10.1111/j.1365-2745.2005.01042.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 April 2014; paper pending published: 16 May 2014; accepted: 03 June 2014; published online: 25 June 2014.*

*Citation: Gehring CA, Mueller RC, Haskins KE, Rubow TK and Whitham TG (2014) Convergence in mycorrhizal fungal communities due to drought, plant competition, parasitism, and susceptibility to herbivory: consequences for fungi and host plants. Front. Microbiol. 5:306. doi: 10.3389/fmicb.2014.00306*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Gehring, Mueller, Haskins, Rubow and Whitham. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**MINI REVIEW ARTICLE** published: 04 November 2014 doi: 10.3389/fmicb.2014.00582

## Expanding genomics of mycorrhizal symbiosis

## *Alan Kuo<sup>1</sup>, Annegret Kohler<sup>2</sup>, Francis M. Martin<sup>2</sup>\* and Igor V. Grigoriev<sup>1</sup>\**

<sup>1</sup> United States Department of Energy Joint Genome Institute, Walnut Creek, CA, USA

<sup>2</sup> UMR, Lab of Excellence for Advanced Research on the Biology of TRee and Forest Ecosystems, Tree-Microbe Interactions, Institut National de la Recherche Agronomique, Université de Lorraine, Nancy, France

#### *Edited by:*

M. Pilar Francino, Center for Public Health Research, Spain

#### *Reviewed by:*

Daniel J. Thornhill, Defenders of Wildlife, USA

Eunsoo Kim, American Museum of Natural History, USA

#### *\*Correspondence:*

Francis M. Martin, UMR, Lab of Excellence for Advanced Research on the Biology of TRee and Forest Ecosystems, Tree-Microbe Interactions, Institut National de la Recherche Agronomique, Université de Lorraine, Nancy, 54280 Champenoux, France e-mail: fmartin@nancy.inra.fr; Igor V. Grigoriev, United States Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA e-mail: ivgrigoriev@lbl.gov

The mycorrhizal symbiosis between soil fungi and plant roots is a ubiquitous mutualism that plays key roles in plant nutrition, soil health, and carbon cycling. The symbiosis evolved repeatedly and independently as multiple morphotypes [e.g., arbuscular mycorrhizae (AM), ectomycorrhizal (ECM)] in multiple fungal clades (e.g., phyla Glomeromycota, Ascomycota, Basidiomycota). The accessibility and cultivability of many mycorrhizal partners make them ideal models for symbiosis studies. Alongside molecular, physiological, and ecological investigations, sequencing led to the first three mycorrhizal fungal genomes, representing two morphotypes and three phyla. The genome of the ECM basidiomycete *Laccaria bicolor* showed that the mycorrhizal lifestyle can evolve through loss of plant cell wall-degrading enzymes (PCWDEs) and expansion of lineage-specific gene families such as short secreted protein (SSP) effectors. The genome of the ECM ascomycete *Tuber melanosporum* showed that the ECM type can evolve without expansion of families as in *Laccaria*, and thus a different set of symbiosis genes. The genome of the AM glomeromycete *Rhizophagus irregularis* showed that despite enormous phylogenetic distance and morphological difference from the other two fungi, symbiosis can involve similar solutions such as symbiosis-induced SSPs and loss of PCWDEs. The three genomes provide a solid base for addressing fundamental questions about the nature and role of a vital mutualism.

**Keywords:** mycorrhizae, *Laccaria*, *Tuber*, *Rhizophagus*, *Glomus*

#### **INTRODUCTION**

The roots of most plants form intimate mutualistic associations with soil fungi known as "mycorrhizae." The mycorrhizal symbiosis is both ancient (among early land plants 410 ma) and pervasive (>80% of plants participate; Tedersoo et al., 2010), and thus underpins most terrestrial ecosystems, the soil portion of the global carbon budget, and much agricultural production (Read and Perez-Moreno, 2003). Mycorrhizae can provide stress tolerance and metal detoxification to the host plant (Hall, 2002), but the fundamental transactional logic of the symbiosis is the exchange of sugar photosynthesized by the plant for phosphorus and other nutrients acquired by the fungus (Martin and Nehls, 2009). A major goal of mycorrhizal studies is to define the symbiosis in molecular terms, i.e., to identify the "symbiosis genes" that encode the molecules that mediate and regulate symbiosis development and interspecific metabolic pathways.

This seemingly straightforward metabolic exchange has evolved many times among many different species pairings and been implemented in a diverse array of structural forms, some extracellular to the root cell and others intracellular but extracytoplasmic. The latter includes arbuscular mycorrhizae (AM), where the fungal hypha penetrates the root cell wall and invaginates (but does not penetrate) the root cell membrane, producing a tree-shaped arbuscule and a large surface area for nutrient exchange. AM is at once more intimate, more dependent (obligate for the fungus), more widespread (most plants can partner), and more ancient (∼410 ma) than other mycorrhizal types. In morphological contrast, ectomycorrhizal (ECM) fungi remain outside of the root cell wall, forming an intercellular hyphal network and a sheath of aggregated hyphae that encases the whole root tip and thus mediates the root's external interactions with the soil. ECM is the second most common mycorrhizal type, mostly with woody plants. The ECM fungus orders Agaricales, Boletales, and Russulales are at least the same age as the Pinaceae (∼160 ma), suggesting that ECM plausibly evolved at this point. There are other less common or more obscure mycorrhizal types, including orchid mycorrhizae (OM) and ericoid mycorrhizae (ERM), restricted to the Orchidaceae and the Ericaceae (acid-tolerant heathers such as cranberry), respectively. Both OM and ERM have both extra- and intracellular (but not AM) morphological components, but the fungal partner is sometimes capable of switching between morphotypes in a host-dependent manner (Dearnaley et al., 2012).

The morphological and ecological diversity of mycorrhizal fungi is matched by their phylogenetic diversity, encompassing many mushrooms and other fruiting bodies famous for their gastronomy (e.g., porcini, matsutake, chanterelle, morel, truffle) or toxicity (e.g., fly agaric). Three of the top-level fungal phyla (Glomeromycota, Ascomycota, and Basidiomycota) have mycorrhizal representatives, but within Basidiomycota and Ascomycota the symbiosis has evolved independently many times in many subclades (66x; Tedersoo et al., 2010). A taxonomic level as low as genus may harbor both symbiotic and non-symbiotic species. Most of the Ascomycota and Basidiomycota symbioses are ECM, the exceptions being ERM Ascomycota and OM Basidiomycota. In contrast, all known Glomeromycota are AM and all known AM are Glomeromycota, suggesting monophyly of both the clade and the symbiosis. The divergence between Glomeromycota and Dikarya (Ascomycota+Basidiomycota) is deep (>800 ma). Glomeromycota have no known sexual cycle. The hyphae and spores are aseptate and multinucleate, with contradictory evidence indicating that some species may or may not be heterokaryotic. Where the Dikarya life cycle is known, Basidiomycota colonize ECM as dikaryons while Ascomycota colonize ECM as monokaryons.

The diversity of mycorrhizae provides motive and opportunity for application of an equally diverse array of investigative methods. Numerous models for physiological, ecological, and molecular biological study have been developed. For example, *in vitro* hyphal-branching assays have been used to isolate plant-secreted small molecules that stimulate AM and ECM fungal morphogenesis (Lagrange et al., 2001; Akiyama et al., 2005) and conversely root-branching and other assays have been used to identify fungus-secreted molecules that promote ECM and AM formation (Nehls et al., 1998; Felten et al., 2009; Splivallo et al., 2009; Maillet et al., 2011). As a second example, stable heterologous gene expression has been accomplished in ECM Basidiomycota (Marmeisse et al., 1992; Hanif et al., 2002; Kemppainen et al., 2005; Pardo et al., 2005), both heterologous expression and gene knockout in ERM Ascomycota (Martino et al., 2007; Abbà et al., 2009), and transient heterologous expression in ECM Ascomycota and AM Glomeromycota (Grimaldi et al., 2005; Helber and Requena, 2008). As a third example, various high-throughput RNA-interrogation methods have been used to recover symbiosis-specific transcripts from *in vitro* models of ECM and AM (Kim et al., 1998; Voiblet et al., 2001; Tamasloukht et al., 2003; Johansson et al., 2004; Duplessis et al., 2005), including a comprehensive multi-sample multi-method analysis of an AM fungal transcriptome that revealed many symbiosis-specific genes, and even meiosis genes in this putatively asexual organism (Lanfranco and Young, 2012; Tisserant et al., 2012).

The biochemical, genetic, and transcriptomic experiments are being aided by a massive effort to sequence the genomes of multiple mycorrhizal fungi (**Figure 1**). The first three of those genomes to be published are those of the ECM basidiomycete *Laccaria bicolor*, the ECM ascomycete *Tuber melanosporum*, and the AM glomeromycete *Rhizophagus irregularis* (formerly *Glomus intraradices*; Martin et al., 2008, 2010; Tisserant et al., 2013). These first three were chosen for both their diversity and their individual scientific and economic significance. They represent the two most important mycorrhizal morphotypes and three major fungal phyla. The ECM basidiomycete *L. bicolor* and the AM glomeromycete *R. irregularis* were also selected as part of a larger effort to sequence the microbiome of the bioenergy-domesticated poplar tree *Populus trichocarpa*. The ECM ascomycete *T. melanosporum* is of commercial importance in its own right as the gustatory delicacy, black truffle.

Each of the three genomes posed significant technical challenges because of its repetitive nature and its large size, which was unprecedented at the time of each sequencing project. All three genomes harbor large numbers of transposable elements (TEs); in addition, *L. bicolor* and *R. irregularis* have very large numbers of gene families, many of which have very large numbers of genes (**Table 1**).

#### **THE ECTOMYCORRHIZAL BASIDIOMYCETE** *Laccaria bicolor*

The genome of *L. bicolor* was the first published of a mycorrhizal fungus (Martin et al., 2008) and led directly to identification of many categories of molecules potentially involved in symbiosis: secreted proteases, lipases, carbohydrate-active enzymes (CAZymes), enzymes for all core carbohydrate metabolic pathways and for fatty acid metabolism (Deveau et al., 2008; Reich et al., 2009), transporters of hexoses and of nitrogenous compounds (López et al., 2008; Lucic et al., 2008), aquaporins (Dietz et al., 2011), multicopper oxidases (Courty et al., 2009), antioxidant enzymes (Morel et al., 2008), signal-transduction protein kinases and small GTPases (Rajashekar et al., 2009), hydrophobins (Plett et al., 2012), and mating-type loci (Niculita-Hirzel et al., 2008). In all these studies the genes were subjected to comparative and phylogenetic analysis with homologs in other fungi (see below) and to transcriptomic analysis (see below). In the carbohydrate and lipase pathway studies, direct assay of storage carbohydrates and fatty acids allowed further pathway reconstruction. In the hexose transporter and aquaporin studies, function was confirmed by genetic complementation of *Saccharomyces* mutants.

All numbers are calculated from the genomes' computationally reconstructed assemblies and annotations.

Molecular manipulation of the organism complements genomics. Thus it is of great interest that RNA silencing methodology has been developed to knock down genes in *L. bicolor* (Kemppainen et al., 2009). Such experiments have demonstrated that nitrate reductase and a nitrate transporter (Kemppainen and Pardo, 2013), as well as a mycorrhiza-induced small secreted protein (MiSSP; Plett et al., 2011), are involved in symbiosis.

Complementing these "bottom–up" approaches, a sequenced genome allows application of "top–down" surveys of potential symbiosis genes. Comparison with other Agaricales genomes, at the time all saprotrophic, revealed a large genome size attributable to both TEs and large numbers of large gene families (Martin et al., 2008). Many of these appeared to be lineage-specific, with neither homologs in the saprobic Agaricales nor Pfam or other domains that would allow straightforward inference of function. Most of the core and potentially symbiosis-related gene families described above are not expanded in *L. bicolor*, with the notable exception of certain families of signal transduction enzymes (Rajashekar et al., 2009). The genome lacks invertase as well as many plant cell wall-degrading enzyme (PCWDE) families, both consistent with notions that *L. bicolor* is dependent on its plant host for carbohydrate and does not activate its host's defenses (Deveau et al., 2008; Martin and Selosse, 2008; Martin et al., 2008). *L. bicolor* was also the sole mycorrhizal representative in a phylogenomic study of an unprecedentedly large collection of whole genomes to elucidate the evolutionary history of wood decomposition (Floudas et al., 2012). The results suggested that the Agaricomycetes ancestor of *L. bicolor* was a ligninolytic fungus, consistent with the loss of most CAZyme categories by *L. bicolor* as well as non-genome based studies suggesting the repeated, unrelated, and presumptively irreversible adoption of the mycorrhizal lifestyle within many clades of Agaricomycetes (Plett and Martin, 2011; Ryberg and Matheny, 2012; Wolfe et al., 2012).

Transcriptomics enhances genome analysis by showing expression and regulation. Computational prediction of orphan genes and their cleaved signal peptides defined a large set of small secreted proteins (SSPs) in *L. bicolor*. Transcriptomics showed that some of the SSPs are differentially expressed between free-living mycelia and mycorrhizae (Martin et al., 2008). These mycorrhiza-induced SSPs (MiSSPs) appear specific to *L. bicolor*, in that homologs have not been found in other fungi. The 7-kD MiSSP7 is the most highly induced of these genes (>10,000-fold), and has been intensely scrutinized using traditional bottom–up techniques, including immunolocalization, conditional expression studies (Plett and Martin, 2012), and gene knockdown (Plett et al., 2011), all further implicating MiSSP7 as an effector in symbiosis. MiSSP7 interacts with the poplar protein PtJAZ6, a negative regulator of jasmonic acid (JA)-induced gene regulation in poplar (Plett et al., 2014). MiSSP7 protects PtJAZ6 from JA-induced degradation. Furthermore, MiSSP7 blocks or mitigates the impact of JA on *L. bicolor* colonization of host roots. In addition to helping define novel genes, transcriptome data were used to confirm or discover regulation of many of the core and symbiosis-related genes described above, and to validate and correct the genome annotation as a whole (Larsen et al., 2010). In combination with the poplar genome, whole-mycorrhizal transcriptomics was used to attempt to reconstruct a comprehensive metabolic pathway description for the symbiosis (Larsen et al., 2011).

Similarly, proteomics supported the systematic evaluation of the secretome of the saprotrophic phase of *L. bicolor*. Cell-free medium from mycelial culture was separated using combinations of isoelectric focusing, gel electrophoresis, and liquid chromatography, and the separated peptides were measured by mass spectrometry and mapped back against the genomic annotation (Vincent et al., 2012). Cleaved signal peptides were predicted in 103 of the secreted proteins, including a limited set of CAZymes, proteases, SSPs, and one MiSSP with a glycosylphosphatidylinositol anchor. This study was confined to the free-living phase, as the mycorrhiza is so preponderantly plant tissue that current techniques have difficulty detecting the relatively rare fungal proteins.

Other than genes, the genome led directly to a collection of repetitive sequences, including microsatellites (Labbé et al., 2011) and TEs (Labbé et al., 2012). In principle these could be used as markers for population and ecosystem surveys of *L. bicolor in situ*. Already it appears that some microsatellite loci are unstable enough to vary between generations (Labbé et al., 2011). This was also demonstrated with hydrophobin genes located near TEs, suggesting a mechanism for evolutionary change (Plett et al., 2012).

#### **THE ECTOMYCORRHIZAL ASCOMYCETE** *Tuber melanosporum*

The second mycorrhizal fungal genome published was that of *T. melanosporum* (Martin et al., 2010), an ascomycete not related to *L. bicolor* and yet also an ECM fungus. As with *L. bicolor*, the sequenced genome facilitated the identification and analysis of many gene families of interest to symbiosis studies, including CAZymes (induced during mycorrhiza formation, apparently to force a path between the root cells for the mycelium), lipases, multicopper oxidases, an invertase (unlike *L. bicolor*), other carbohydrate metabolism enzymes (Ceccaroli et al., 2011), metal detoxification genes (Bolchi et al., 2011), and cell wall metabolism enzymes (Balestrini et al., 2012; Sillo et al., 2013). As this genome was also the first sequenced for the Pezizomycetes, it also prompted investigation in another fungal clade of genes of general mycological interest, such as cytoskeleton components that determine hyphal morphology (Amicucci et al., 2011). Conversely, the genome also facilitated investigation of traits specific to the life history of *T. melanosporum*, such as sulfur metabolism genes that produce fruiting body (the truffle *per se*) volatiles (Martin et al., 2010), cold-shock proteins that may mediate the seasonality of fruiting body development (Zampieri et al., 2011), and tyrosinases and laccases that catalyze the synthesis of melanins implicated in development (Zarivi et al., 2011, 2013). In almost all of the above studies, the genes were subject to both phylogenetic and transcriptomic analyses, some were complemented with microscopic observations (cell wall, cytoskeleton, melanins), and some were confirmed with metabolite or enzyme assays (carbohydrates, metals, melanins).

As with *L. bicolor*, the *T. melanosporum* genome and transcriptome combined are a powerful top–down analytical resource. The transcriptome was used to improve the genomic annotation and to identify alternative and antisense transcripts, and alternative splice variants, some of which are developmentally or symbiotically specific (Tisserant et al., 2011). Conversely, predicted transcription factors (TFs) combined with transcriptomics and a yeast "transcriptional activator trap" system (a variant yeast-2-hybrid screen) identified 29 developmentally regulated TFs (Montanini et al., 2011). Physical separation of mycorrhizal tissues from soil hyphae by microdissection allowed still greater resolution of symbiosis-specific transcripts (Hacquard et al., 2013). As with *L. bicolor*, the *T. melanosporum* genome enabled proteomics by electrophoresis, chromatography, and mass spectrometry (Islam et al., 2013).

The *T. melanosporum* life cycle has not been reconstituted in the laboratory, and only monokaryons have been documented in the wild. The genome revealed putative meiosis genes, as well as a mating locus (Martin et al., 2010), which was used to discover the alternate mating-type idiomorph in the wild (Rubini et al., 2011a). Strains with different mating loci were spatially separated in a single orchard (Rubini et al., 2011b), but this could not be explained by heterokaryon incompatibility, despite the presence of multiple potential HET genes in the genome (Iotti et al., 2012). The genes were not polymorphic among 18 strains. In contrast, the TE- and microsatellite-rich genome has proven to be a rich source of markers for demonstrating the diversity and distribution of populations in the field (Murat et al., 2011, 2013).

#### **THE ARBUSCULAR MYCORRHIZAL GLOMEROMYCETE** *Rhizophagus irregularis*

The third published mycorrhizal genome was that of *R. irregularis* (Tisserant et al., 2013), an AM fungus differing profoundly from the two ECM fungi in morphology and development. As the first sequenced member of the Glomeromycota, *R. irregularis* lies outside the Dikarya and is more phylogenetically remote from the other two fungi than they are from each other; it also has the largest genome, encoding the largest gene set. The genome allowed cloning and characterization of a monosaccharide transporter both specific to and required for the symbiosis (Helber et al., 2011). Some signal transduction pathway genes, especially tyrosine kinase-like genes, were expanded (Tisserant et al., 2013). The CAZyme repertoire was more reduced than even that of *L. bicolor*, and there was neither invertase nor sucrose transporter, suggesting even greater dependence of the fungus on its host for carbohydrate. Secondary metabolite gene clusters (polyketide synthases, non-ribosomal peptide synthetases, terpene cyclases, and dimethylallyl tryptophan synthetases) were absent, and the set of predicted secreted effectors also appeared small (Lin et al., 2014). The latter includes five proteins with a putative Crinkler domain characteristic of the Heterokont Oomycota and *Batrachochytrium dendrobatidis* but no other fungi, as well as 13 proteins similar to SP7, a *R. irregularis* effector protein previously cloned using a secretion trap screen (Kloppholz et al., 2011). Transcriptomics revealed a modest set of symbiosis-upregulated genes, including lineage-specific MiSSPs (Tisserant et al., 2013).

Phylogenomics is helping to resolve a long-standing question as to the placement of the Glomeromycota in the fungal tree of life, given their deeply divergent life history and long paleontological record. With the first Glomeromycota genome plus the genomes of many other basal fungi, large numbers of orthologs are available for concatenated multiple sequence alignment and tree building. The most recent efforts point to a closer relationship to Mucoromycotina (the classical case of former "zygomycetes") than to Dikarya, and an even closer relationship to Mortierellomycota (Tisserant et al., 2013; Lin et al., 2014).
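As a minimal illustration of the concatenation step mentioned above, the following Python sketch joins per-ortholog alignments into a single supermatrix and pads taxa that lack a given ortholog with gap characters. The taxon labels, the toy sequences, and the function name `concatenate_alignments` are hypothetical and are not taken from the cited studies; real analyses would also involve ortholog detection, alignment, and trimming, which are not shown.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Concatenate per-ortholog alignments into one supermatrix.

    Each alignment maps taxon name -> aligned sequence; all sequences in
    one alignment must have equal length. Taxa missing from an alignment
    are padded with gap characters so every row keeps the same length.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must share one length")
        width = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(parts) for taxon, parts in supermatrix.items()}


# Hypothetical toy alignments for two orthologous genes.
gene_a = {"Rhizophagus": "MKT-A", "Mortierella": "MKTLA", "Laccaria": "MRTLA"}
gene_b = {"Rhizophagus": "GGH", "Laccaria": "GAH"}  # Mortierella lacks this ortholog

matrix = concatenate_alignments([gene_a, gene_b])
for taxon, row in matrix.items():
    print(f"{taxon:12s} {row}")
```

The resulting rows all have the same length and could, in principle, be passed to a tree-building program; the choice of gap padding for missing orthologs is one common convention, assumed here for simplicity.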

Cloning of potential meiosis genes and mating type loci in partial genomes (genome survey sequences and transcript sequences of multiple strains) suggests the possibility of a cryptic sexual cycle (Tisserant et al., 2012; Halary et al., 2013; Riley et al., 2014). However, the genome confirms that the genomic contexts of the would-be mating loci are not similar to those of known sexual fungi (Tisserant et al., 2013). In contrast, the genome demonstrates low levels of polymorphism between genomic reads within cells and even between nuclei, thus resolving the long-standing ploidy controversy in favor of homokaryosis.

### **CONCLUSION**

The first three sequenced genomes of mycorrhizal fungi were groundbreaking. They revealed potential molecular mechanisms underpinning these symbioses, and offered a first glimpse of the evolution of the different types of mycorrhizae. The genome of *L. bicolor* provided the first genetic blueprint of a mycorrhizal fungus, with its expansive genome and proteome. It also enabled other –omics approaches and the identification of the MiSSPs, a novel family of symbiosis effectors. The genome of *T. melanosporum* provided the first comparison between two phylogenetically unrelated ECM fungi, contrasting the expanded proteome of *Laccaria* with a compact proteome embedded in the massively enlarged and repetitive genome of *Tuber*. Comparative genomics did not reveal any universal "symbiosis genes," but did demonstrate convergence of genomic features such as lack of PCWDE and secondary metabolite genes. The genome of *R. irregularis* provided the first opportunity to explore an AM fungus. Despite its great evolutionary distance and morphological distinctiveness from the other two species, the *Rhizophagus* genome showed similar reductions of PCWDEs and expansions of MiSSPs. Thus while the mycorrhizal symbiosis now appears unlikely to be defined by a set of universal "symbiosis genes," it may be explained by convergent traits that independently and repeatedly evolved. Validation of this conclusion requires sequencing a broader sample of mycorrhizal genomes, followed by transcriptomics studies in established host-mycorrhizal laboratory systems as well as metatranscriptomics of natural environments (**Figure 1**).

#### **ACKNOWLEDGMENTS**

The work conducted by the United States Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, was supported under Contract No. DE-AC02-05CH11231. This work was supported by the Laboratory of Excellence ARBRE (ANR-11-LABX-0002-01), the Région Lorraine and the Genomic Science Program (project "Plant-Microbe Interactions"), United States Department of Energy, Office of Science, Biological and Environmental Research under the contract DE-AC05-00OR22725.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 May 2014; accepted: 15 October 2014; published online: 04 November 2014.*

*Citation: Kuo A, Kohler A, Martin FM and Grigoriev IV (2014) Expanding genomics of mycorrhizal symbiosis. Front. Microbiol. 5:582. doi: 10.3389/fmicb.2014.00582*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Kuo, Kohler, Martin and Grigoriev. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Nodule carbohydrate catabolism is enhanced in the Medicago truncatula A17-Sinorhizobium medicae WSM419 symbiosis

## *Estíbaliz Larrainzar†, Erena Gil-Quintana†, Amaia Seminario, Cesar Arrese-Igor and Esther M. González\**

Departamento de Ciencias del Medio Natural/Environmental Sciences, Universidad Pública de Navarra, Pamplona, Spain

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Andrea Doris Nussbaumer, University of Vienna, Austria Jason Terpolilli, Murdoch University, Australia

#### *\*Correspondence:*

Esther M. González, Departamento de Ciencias del Medio Natural/Environmental Sciences, Universidad Pública de Navarra, Campus Arrosadia s/n, Pamplona, Navarra 31006, Spain e-mail: esther.gonzalez@unavarra.es

†Estíbaliz Larrainzar and Erena Gil-Quintana have contributed equally to this work.

The symbiotic association between Medicago truncatula and Sinorhizobium meliloti is a well-established model system in the legume–Rhizobium community. Despite its wide use, the symbiotic efficiency of this model has been recently questioned and an alternative microsymbiont, S. medicae, has been proposed. However, little is known about the physiological mechanisms behind the higher symbiotic efficiency of S. medicae WSM419. In the present study, we inoculated M. truncatula Jemalong A17 with either S. medicae WSM419 or S. meliloti 2011 and compared plant growth, photosynthesis, N2-fixation rates, and plant nodule carbon and nitrogen metabolic activities in the two systems. M. truncatula plants in symbiosis with S. medicae showed increased biomass and photosynthesis rates per plant. Plants grown in symbiosis with S. medicae WSM419 also showed higher N2-fixation rates, which were correlated with a larger nodule biomass, while nodule number was similar in both systems. In terms of plant nodule metabolism, M. truncatula–S. medicae WSM419 nodules showed increased sucrose-catabolic activity, mostly associated with sucrose synthase, accompanied by a reduced starch content, whereas nitrogen-assimilation activities were comparable to those measured in nodules infected with S. meliloti 2011. Taken together, these results suggest that S. medicae WSM419 is able to enhance plant carbon catabolism in M. truncatula nodules, which allows the maintenance of high symbiotic N2-fixation rates, better growth and improved general plant performance.

**Keywords:** *Medicago truncatula***,** *Sinorhizobium medicae***,** *Sinorhizobium meliloti,* **symbiosis, efficiency, nitrogen fixation, carbon metabolism**

#### **INTRODUCTION**

One of the most studied plant–microbe symbioses is the one established between members of the *Leguminosae* family and soil bacteria from diverse genera collectively termed rhizobia. When compatible symbiotic partners interact, the microsymbiont is able to invade the host root hair cells, typically (but not exclusively) through infection threads, and reach the root cortex, where the bacteria are released and differentiate into nitrogen-fixing forms, the bacteroids. In this differentiated form, bacteria express an enzyme complex, nitrogenase, which catalyzes the reduction of atmospheric dinitrogen (N2) to ammonium during the highly energy-demanding process known as symbiotic N2-fixation. During this complex symbiotic interaction the plant provides a carbon source, mainly in the form of malate (Udvardi et al., 1988), to be used as a respiratory substrate to fuel the N2-fixation process (Lodwig and Poole, 2003). Symbiotic N2-fixation is estimated to contribute nearly half of global biological N2-fixation, representing a key process for sustainable natural and agricultural systems (Gruber and Galloway, 2008).

In recent years *Medicago truncatula* (barrel medic) has been one of the model legume species most widely studied by the symbiotic community (Barker et al., 1990; Cook, 1999). The development of mutant collections (Tadege et al., 2008; Calderini et al., 2011), optimization of transformation techniques (Boisson-Dernier et al., 2001) and availability of its genome sequence (Young et al., 2011) have greatly contributed to progress in the field.

So far at least two *Sinorhizobium* [renamed *Ensifer* (Young, 2003)] species have been described to nodulate *Medicago* spp.: *Sinorhizobium meliloti* and *S. medicae* (Rome et al., 1996a). Although *M. truncatula* is able to establish N2-fixing symbiosis with both symbionts, most plant molecular biology studies have been carried out using the sequenced *S. meliloti* 1021 strain (Galibert et al., 2001). In recent years, however, the suitability of the *M. truncatula*–*S. meliloti* model has been questioned based on evidence suggesting that N2-fixation in this model is only partially effective (Moreau et al., 2008; Terpolilli et al., 2008). Instead, *S. medicae* WSM419, for which a genome sequence is also available (Reeve et al., 2010), has been suggested as a more efficient symbiont for *M. truncatula* (Terpolilli et al., 2008).

Phylogenetic analysis has shown that *S. meliloti* and *S. medicae* form a tight cluster within the *Sinorhizobium* group (Gaunt et al., 2001). Furthermore, analysis of this relationship with several molecular markers suggests that *S. medicae* originated from an ancestral *S. meliloti* population (Biondi et al., 2003). Nowadays, these rhizobial species can be differentiated both at the phenotypic and genotypic level: *S. meliloti* is more specific for the tetraploid *M. sativa* and is preferentially found in alkaline or neutral soils, while *S. medicae* prefers diploid *Medicago* species such as *M. truncatula* and is predominantly found in moderately acid environments (Biondi et al., 2003; Garau et al., 2005). These host and environment preferences may have been a consequence of the various interspecific horizontal gene transfers that occurred during species diversification (Bailly et al., 2007; Epstein et al., 2014).

Nevertheless, to date, the physiological mechanisms underlying the higher symbiotic efficiency in the *M. truncatula* A17*–S. medicae* WSM419 association remain largely unknown. Comparative genomic studies of multiple *S. meliloti* and *S. medicae* strains have shed some light, suggesting that differences in gene content between the two species, particularly in genes involved in sulfur assimilation, conjugation and secretion, can be related to the differential symbiotic interaction and N2-fixation efficiency (Sugawara et al., 2013). Understanding the factors that underpin N2-fixation efficiency in legumes has potentially profound implications for sustainable agricultural systems and the environment.

In the current work, we analyzed the differences at the physiological and metabolic levels between the currently established model *M. truncatula–S. meliloti* and the more efficient *M. truncatula–S. medicae* symbiosis. We hypothesized that plant nodule metabolism may be enhanced in the *S. medicae* symbiosis compared to less efficient strains. To test this hypothesis, two sets of *M. truncatula* Jemalong A17 plants were grown under symbiotic conditions either with *S. meliloti* 2011 or *S. medicae* WSM419. Plant growth parameters, photosynthesis, N2-fixation, and plant nodule carbon and nitrogen metabolic activities were determined. Results presented here show that *S. medicae* WSM419-derived nodules generate a stronger sink in the plant, through the activation of sucrose-hydrolyzing enzymes. This allows the maintenance of high N2-fixation rates, increased nodule growth, and, therefore, a generally improved plant performance.

## **MATERIALS AND METHODS**

#### **GROWTH CONDITIONS**

*Medicago truncatula* Gaertn cv. Jemalong A17 plants were grown in 1-L pots with a mixture of perlite:vermiculite (2:5, v/v) as substrate under controlled environmental conditions (14 h day/10 h night; 450 μmol m<sup>−2</sup> s<sup>−1</sup> light intensity; 22°C/16°C day/night temperature; 60–70% relative humidity). After germination, plantlets were separated into two sets: one was inoculated with *S. meliloti* strain 2011 (Meade and Signer, 1977) and the other with *S. medicae* strain WSM419 (Rome et al., 1996b). Bacterial cultures were grown on a rotary shaker (175 rpm) at 28°C for 48 h in yeast extract mannitol broth containing (g L<sup>−1</sup>) K2HPO4 (0.5), MgSO4·7H2O (0.2), NaCl (0.1), mannitol (10), and yeast extract (0.4), pH adjusted to 6.8, to an OD600 of 0.7–0.8, which corresponds to ∼3 × 10<sup>8</sup> cells (Vincent, 1970). Each seedling was inoculated with 1 mL of culture at sowing.

Plants were watered with a nutrient solution containing (values in mg L<sup>−1</sup>): MgSO4·7H2O (493), K2SO4 (279), K2HPO4 (145), CaCl2 (56), KH2PO4 (23), EDTA-Fe (17), H3BO3 (1.43), CaSO4·2H2O (1.03), MnSO4·7H2O (0.77), ZnSO4·7H2O (0.22), CoCl2·6H2O (0.12), CuSO4·5H2O (0.08), Na2MoO4·2H2O (0.05). For the first 3 weeks, 0.25 mM ammonium nitrate was added to the nutrient solution. Eight weeks after planting, symbiotic N2-fixation was measured, and nodules were collected, divided into aliquots, frozen in liquid N2 and stored at −80°C for analytical determinations. Two nodule aliquots per plant were used for nodule number estimation based on total nodule weight. Shoots and roots were weighed for fresh weight (FW) determinations and, subsequently, oven-dried at 80°C for 48 h before dry weight (DW) was measured.
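The nodule-number estimate described above can be sketched as a simple proportional scaling: nodules are counted in weighed aliquots, and the count per unit weight is scaled to the total nodule fresh weight. The function name and all numbers below are hypothetical, since the source does not report aliquot counts or weights, and the sketch assumes the aliquots are representative of the whole nodule population.

```python
def estimate_nodule_number(aliquot_counts, aliquot_weights_mg, total_weight_mg):
    """Scale nodule counts from weighed aliquots to the whole root system.

    Assumes (hypothetically) that the aliquots are representative, so that
    nodules per mg in the aliquots equals nodules per mg overall.
    """
    if len(aliquot_counts) != len(aliquot_weights_mg):
        raise ValueError("one weight is needed per counted aliquot")
    counted = sum(aliquot_counts)
    weighed = sum(aliquot_weights_mg)
    if weighed <= 0:
        raise ValueError("aliquot weight must be positive")
    return counted / weighed * total_weight_mg


# Hypothetical plant: two aliquots counted, 400 mg of nodules in total.
estimate = estimate_nodule_number([18, 22], [100.0, 100.0], 400.0)
print(f"Estimated nodules per plant: {estimate:.0f}")
```

Pooling the two aliquots before scaling, rather than averaging two separate estimates, weights each aliquot by its mass; either choice would be defensible and the source does not say which was used.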

#### **NITROGEN FIXATION AND CHLOROPHYLL CONTENT DETERMINATIONS**

Symbiotic N2-fixation was measured in intact plants as apparent nitrogenase activity (ANA). H2 evolution from sealed root systems was measured in an open flow-through system under N2:O2 (79%:21%, v/v) according to Witty and Minchin (1998) using an electrochemical H2-sensor (Qubit System, Canada).

Photosynthesis was determined in the apical leaves with an open system mode (model LC pro+; ADC BioScientific Ltd., Great Amwell, UK) using an ADC PLC-7504 leaf chamber. To estimate leaf chlorophyll content, a Minolta SPAD-502 system was employed (Konica Minolta Sensing Europe BV, UK).

#### **NODULE PROTEIN EXTRACTION AND ENZYME ASSAYS**

Nodules (100 mg FW) were homogenized in a mortar and pestle with 500–600 μL of extraction buffer (50 mM 3-(*N*-morpholino)propanesulfonic acid (MOPS), 5 mM MgCl2, 20 mM KCl, 1 mM EDTA, 20% polyvinylpolypyrrolidone, pH 7) to which 1.5 mg mL<sup>−1</sup> of DTT, 0.7 μL mL<sup>−1</sup> of β-mercaptoethanol and 20 μL mL<sup>−1</sup> plant protease inhibitor cocktail (Sigma-Aldrich) were freshly added. Homogenates were centrifuged at 12,000 *g* and 4°C for 15 min and supernatants were collected as nodule plant fractions. The nodule plant fraction was desalted using Bio Gel P6DG columns (Bio-Rad) equilibrated with 250 mM MOPS (pH 7), 100 mM KCl and 25 mM MgCl2. The desalted extract was used to measure the following enzyme activities according to Gonzalez et al. (1998): sucrose synthase (EC 2.4.1.13), alkaline invertase (EC 3.2.1.26), NADH-dependent glutamate synthase (GOGAT; EC 1.4.1.14), and aspartate aminotransferase (AAT; EC 2.6.1.1). The protein content in crude and desalted extracts was quantified using a Bradford-based dye-binding assay (Bio-Rad) employing bovine serum albumin as standard.

#### **CARBOHYDRATE AND STARCH DETERMINATION**

Nodule aliquots (100 mg FW) were extracted in 80% (v/v) ethanol and ultrasonicated in a water bath system. After sonication, samples were centrifuged at 7,500 *g* and 4°C for 5 min and supernatants were collected. These steps were repeated three times. Afterward the supernatants were dried in a Turbovap LV evaporator (Zymark Corp, Hopkinton, MA, USA) and soluble compounds were redissolved in 1 mL distilled water, homogenized and stored at −20°C. The ethanol-insoluble residue was extracted for starch determination as in Macrae (1971). Carbohydrates were analyzed by high-performance capillary electrophoresis (Warren and Adams, 2000) using 10 mM benzoate (pH 12) containing 0.5 mM myristyltrimethylammonium bromide as a buffer under the following conditions: −15 kV potential, 50 μm-internal diameter and 30/40.2 cm-long capillary tube, indirect UV detection at 225 nm.

#### **STATISTICAL ANALYSIS**

All data are reported as mean ± standard deviation of *n* = 5 independent measurements. Statistical analysis was conducted using Student's *t*-test and *p* ≤ 0.05 was considered as statistically significant. The homogeneity of variances was tested using Levene's test.
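A minimal sketch of the two-sample comparison described above follows. It computes the pooled-variance (Student's) *t* statistic for two groups of *n* = 5 and compares it with the two-tailed critical value for 8 degrees of freedom from standard *t* tables. The measurements and the function name `student_t` are hypothetical, and Levene's test for homogeneity of variances is not implemented here; a statistics package would normally be used for both steps.

```python
import math
import statistics

# Two-tailed critical value of Student's t for alpha = 0.05 and 8 degrees
# of freedom (two groups of n = 5); taken from standard t tables.
T_CRIT_DF8 = 2.306


def student_t(sample_a, sample_b):
    """Return the pooled-variance (Student's) t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t_stat, df


# Hypothetical measurements for two inoculation treatments (n = 5 each).
treatment_a = [10.0, 12.0, 11.0, 13.0, 14.0]
treatment_b = [7.0, 8.0, 9.0, 8.0, 8.0]

t_stat, df = student_t(treatment_a, treatment_b)
significant = abs(t_stat) > T_CRIT_DF8
print(f"t = {t_stat:.2f}, df = {df}, significant at p <= 0.05: {significant}")
```

Comparing against a tabulated critical value avoids needing the *t* distribution's cumulative function, which the Python standard library does not provide; it yields a significance decision but not an exact *p*-value.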

#### **RESULTS**

In general terms, *M. truncatula* plants inoculated with *S. medicae* WSM419 outperformed those inoculated with *S. meliloti* 2011. Total plant biomass in the *M. truncatula–S. medicae* system was more than two-fold higher than when using the *S. meliloti* strain and the difference was most notable for shoots (**Figure 1**; **Table 1**). Plants inoculated with the *S. medicae* strain maintained a 1:1 shoot-to-root ratio, while this declined to ∼3:4 in plants inoculated with *S. meliloti* 2011 (**Table 1**).

… inoculation with either *Sinorhizobium meliloti* 2011 (left) or *Sinorhizobium medicae* WSM419 (right). Scale bar = 2 cm.

*M. truncatula* plant biomass values when grown in symbiosis with *S. meliloti* 2011 or *S. medicae* WSM419. Values are mean ± standard deviation of five biological replicates. An asterisk (\*) denotes significant differences (Student's *t*-test at *p* ≤ 0.05). FW = fresh weight.

Regarding photosynthetic CO2 assimilation, *M. truncatula*–*S. medicae* plants showed a 55.8% increase in photosynthesis when expressed on a plant basis (**Figure 2A**). However, when expressed on a leaf area basis, *M. truncatula*–*S. meliloti* showed higher photosynthetic rates (86.37 ± 2.09 μmol CO2 s<sup>−1</sup> cm<sup>−2</sup>) compared to *S. medicae*-inoculated plants (67.19 ± 2.43 μmol CO2 s<sup>−1</sup> cm<sup>−2</sup>). These higher photosynthetic rates were, however, not correlated with increased leaf chlorophyll content values, with both plant systems presenting similar values (**Figure 2B**).
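The two ways of expressing photosynthesis can be reconciled with a short calculation. The sketch below assumes that the per-plant rate is simply the per-area rate multiplied by total leaf area (an assumption made here for illustration; the source does not report leaf area) and derives the leaf-area ratio that the reported figures would then imply.

```python
# Values reported in the text (per-leaf-area photosynthetic rates).
rate_area_meliloti = 86.37   # S. meliloti-inoculated plants
rate_area_medicae = 67.19    # S. medicae-inoculated plants
per_plant_increase = 0.558   # +55.8% per plant for S. medicae

# Assumption (not stated in the source): per-plant rate = per-area rate x leaf area.
# Then: (rate_medicae * area_medicae) / (rate_meliloti * area_meliloti) = 1.558
per_plant_ratio = 1.0 + per_plant_increase
implied_leaf_area_ratio = per_plant_ratio * rate_area_meliloti / rate_area_medicae

print(f"Implied leaf-area ratio (S. medicae / S. meliloti): {implied_leaf_area_ratio:.2f}")
```

Under this assumption, roughly twice the leaf area in *S. medicae*-inoculated plants would account for both observations, which would be in line with the more than two-fold higher total biomass reported for that system.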

To accurately estimate the rates of N2-fixation, ANA was measured as H2 evolution in intact plants (Witty and Minchin, 1998). The *M. truncatula*–*S. medicae* symbiosis showed increased N2-fixation values both when expressed on a plant (+57%) and nodule FW basis (**Figure 3A**). Plants inoculated with *S. medicae* showed higher nodule biomass (**Figure 3B**), although the number of root nodules was similar in both cases (**Figure 3C**). The increase in nodule biomass was, therefore, correlated with higher biomass per nodule. Plants inoculated with the *S. medicae* strain presented larger and more frequently bifurcated nodules compared to plants inoculated with the *S. meliloti* strain (**Figure 3D**). Furthermore, the plant fraction of *M. truncatula–S. medicae* nodules showed a significantly higher protein content than that of nodules infected with *S. meliloti* (25.18 ± 3.32 vs. 20.53 ± 3.58 mg protein g FW<sup>−1</sup>, mean ± standard deviation, respectively).

To better understand the metabolic differences in nodules following inoculation with the two microsymbionts, we measured the activity of the two main sucrose-degrading enzymes in nodules, sucrose synthase and alkaline invertase, as well as the activity of two key enzymes involved in ammonium assimilation, GOGAT and AAT. In both systems the specific activity of sucrose synthase was on average more than 25-fold higher than that of alkaline invertase (data not shown). Comparing the activity levels across systems, only sucrose synthase showed a significant increase in *S. medicae*-infected nodules (**Figure 4A**). In terms of nodule nitrogen metabolism, neither GOGAT nor AAT activities showed significantly different rates when comparing the two inoculants (data not shown).

Given that nodule sucrose catabolism was found to be more active in the *M. truncatula*–*S. medicae* symbiosis, the main carbon metabolites in nodules, sucrose and starch, were quantified (**Figures 4B,C**). As a general trend, *S. medicae*-infected nodules presented lower levels of carbohydrates compared to those infected by the *S. meliloti* strain, with significant differences found in terms of starch content (**Figure 4C**).

## **DISCUSSION**

The efficiency of a legume–*Rhizobium* symbiosis is usually evaluated by comparing plant growth parameters (e.g., biomass, N content) of inoculated versus N-fed plants. These types of study, mostly analyzed from the bacterial perspective, have demonstrated that symbiotic efficiency varies depending upon the specific bacterial strain used (Miller and Sirois, 1982; Mhadhbi et al., 2005; Parra-Colmenares and Kahn, 2005; Heath and Tiffin, 2007; Rangin et al., 2008; Terpolilli et al., 2008; Oono and Denison, 2010). However, the plant contribution to these variable efficiencies has received much less attention.

In this work, we analyzed the effectiveness of the symbiosis of *M. truncatula* A17 with two *Sinorhizobium* strains, *S. meliloti* 2011 and *S. medicae* WSM419, with special emphasis on understanding the main differences at the nodule metabolic level. Under our experimental growth conditions, *M. truncatula* plants grown almost exclusively on fixed N upon inoculation with *S. meliloti* 2011 did not show symptoms of N deficiency, presenting leaf chlorophyll contents comparable to those of plants inoculated with the *S. medicae* strain (**Figure 2B**). We did, however, observe a general outperformance of plants inoculated with the *S. medicae* strain in terms of plant biomass (**Figure 1**; **Table 1**), photosynthesis per plant (**Figure 2A**) and N2-fixation rates (**Figure 3A**). Interestingly, this improved fixation performance was correlated with a larger biomass per nodule, leading to a higher total nodule biomass per plant, but not to increased nodule number (**Figure 3**).

**activity (ANA, A), total nodule biomass (B), nodule number (C) in** *M. truncatula* **plants inoculated with either** *S. meliloti* **2011 or** *S. medicae* **WSM419. D**, representative image of nodules sampled from

(bottom). Scale bar = 500 μm. Values represent mean ± standard deviation (n = 5). An asterisk (\*) denotes significant differences (p ≤ 0.05).

mean ± standard deviation (n = 5). Sucrose **(B)** and starch content **(C)** in M.

Nodules are strong sink tissues due to the high energy demand that symbiotic N2-fixation represents for the plant (Silsbury, 1977; Schulze et al., 1999). These high energy requirements are met by allocating photoassimilates from the aerial part to nodules, mostly in the form of sucrose, where they are hydrolyzed by either sucrose synthase or alkaline invertase (Morell and Copeland, 1984; Flemetakis et al., 2006). Sucrose synthase is considered to be primarily responsible for sucrose metabolism in mature nodules and its role has been shown to be essential for symbiotic N2 fixation in legumes (Gordon et al., 1999; Baier et al., 2007; Horst et al., 2007), while alkaline invertase appears to have a secondary role (Welham et al., 2009). In this study, the predominant role of sucrose synthase as the main sucrose-degrading enzyme in nodules was corroborated, showing a significantly higher specific activity than that of alkaline invertase in both symbiotic systems (>20-fold higher on average). Nodules from plants inoculated with the more efficient *S. medicae* strain showed higher sucrose synthase activity compared to *S. meliloti* 2011 nodules (**Figure 4A**). Furthermore, *S. medicae* WSM419-inoculated plants maintained nodule starch at significantly lower levels compared to those inoculated with the *S. meliloti* strain (**Figure 4C**), despite the higher photosynthetic rates of the former (**Figure 2A**). This inverse correlation between symbiotic efficiency and starch accumulation has been similarly observed in alfalfa plants when inoculated with a fix<sup>−</sup> strain (Aleman et al., 2010). Indeed, in non-fixing alfalfa nodules, the products from sucrose breakdown are re-directed to starch biosynthesis due to the lower energy demand. Taken together, these results suggest that *S. medicae* WSM419 activates plant carbon catabolic reactions in nodules to keep up with the high nitrogenase demand for ATP and, as a consequence, the nodules become stronger metabolic sinks in the plant (Sung et al., 1989). This positive feedback keeps N2-fixation rates high, promoting plant growth and, therefore, increasing the plant photosynthetic capacity. A similar mechanism has been described when bacteroid respiration is enhanced in nodules by the overexpression of a cytochrome oxidase (Soberon et al., 1999; Silvente et al., 2002; Talbi et al., 2012).

Despite the differences in N2-fixation rates, plants inoculated with the *S. meliloti* strain did not show significant differences in terms of nodule number (**Figure 3C**). Differences were, however, found in the plant protein fraction of nodules, most likely related to the metabolic activation discussed above. It is interesting, though, that these differences are mostly observed at the level of carbon metabolism, while the specific activity of enzymes involved in N assimilation did not differ significantly when the two symbiotic systems were compared (data not shown).

In conclusion, results presented here suggest that at least one of the factors contributing to the higher effectiveness of the *M. truncatula–S. medicae* WSM419 symbiosis is the activation of plant carbon catabolism in nodules, which allows the maintenance of high N2-fixation rates and, ultimately, leads to an improved plant performance. In agreement with previous studies (Moreau et al., 2008; Terpolilli et al., 2008), the use of *S. medicae* WSM419 as the partner of choice for *M. truncatula* symbiotic studies is highly recommended.

#### **ACKNOWLEDGMENTS**

We thank Dr. Euan K. James (The James Hutton Institute, Dundee, UK) and Dr. Jason J. Terpolilli (Murdoch University, Western Australia, Australia) for sharing the *S. medicae* WSM419 strain, and Dr. Gustavo Garijo for technical assistance. This work has been partially funded by the Spanish National Research and Development Programmes (AGL2011-23738 and AGL2011-30386-C02-01). Estíbaliz Larrainzar and Erena Gil-Quintana are funded by the European FP7-PEOPLE program (253141). Amaia Seminario is funded by a predoctoral fellowship from the Public University of Navarre.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2014; paper pending published: 13 June 2014; accepted: 05 August 2014; published online: 27 August 2014.*

*Citation: Larrainzar E, Gil-Quintana E, Seminario A, Arrese-Igor C and González EM (2014) Nodule carbohydrate catabolism is enhanced in the Medicago truncatula A17-Sinorhizobium medicae WSM419 symbiosis. Front. Microbiol. 5:447. doi: 10.3389/fmicb.2014.00447*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Larrainzar, Gil-Quintana, Seminario, Arrese-Igor and González. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Nitrogen-fixing Rhizobium-legume symbiosis: are polyploidy and host peptide-governed symbiont differentiation general principles of endosymbiosis?

## *Gergely Maróti and Éva Kondorosi\**

Institute of Biochemistry, Biological Research Center, Hungarian Academy of Sciences, Szeged, Hungary

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Patrick Michael Erwin, University of North Carolina Wilmington, USA Jukka Jokela, ETH Zurich, Switzerland

#### *\*Correspondence:*

Éva Kondorosi, Institute of Biochemistry, Biological Research Center, Hungarian Academy of Sciences, Temesvári krt. 62, Szeged 6726, Hungary e-mail: kondorosi.eva@brc.mta.hu

The symbiosis between rhizobia soil bacteria and legumes is facultative and initiated by nitrogen starvation of the host plant. Exchange of signal molecules between the partners leads to the formation of root nodules where bacteria are converted to nitrogen-fixing bacteroids. In this mutualistic symbiosis, the bacteria provide nitrogen sources for plant growth in return for photosynthates from the host. Depending on the host plant the symbiotic fate of bacteria can either be reversible or irreversible. In *Medicago* plants the bacteria undergo a host-directed multistep differentiation process culminating in the formation of elongated and branched polyploid bacteria with definitive loss of cell division ability. The plant factors are nodule-specific symbiotic peptides. About 500 of them are cysteine-rich NCR peptides produced in the infected plant cells. NCRs are targeted to the endosymbionts and the concerted action of different sets of peptides governs different stages of endosymbiont maturation. This review focuses on symbiotic plant cell development and terminal bacteroid differentiation and demonstrates the crucial roles of symbiotic peptides by showing an example of multi-target mechanism exerted by one of these symbiotic peptides.

**Keywords:** *Rhizobium***-legume symbiosis, bacteroid differentiation, host effector molecules, plant peptides, polyploidy, endosymbiont**

## **HOST-SPECIFIC INTERACTION BETWEEN THE** *Rhizobium* **AND PLANT PARTNERS**

The bacteria that form nitrogen-fixing symbioses with legume plants belong to diverse groups of α- and β-proteobacteria and are collectively called rhizobia (Chen et al., 2003; MacLean et al., 2007). Many α-proteobacteria are engaged in long-term interactions with higher eukaryotes. These interactions range from surface colonization through facultative symbiotic relationships to obligate intracellular pathogen or endosymbiont lifestyles. The symbiotic genes required for nodule formation, host cell infection and nitrogen fixation have been acquired by lateral gene transfer, which is the primary source of genetic diversity of rhizobia. Therefore, rhizobia could be more closely related to pathogens (such as *Agrobacterium* or *Brucella*) than to each other. Rhizobia tend to have large genomes (up to 10.5 Mbp), which in fast-growing rhizobia are dispersed on multiple replicons (MacLean et al., 2007). For example, *Sinorhizobium meliloti*, the endosymbiont of *Medicago* species, has a tripartite genome: a 3.65 Mbp chromosome and two megaplasmids, pSymA and pSymB (1.35 and 1.68 Mbp, respectively), both of which are indispensable and carry the majority of symbiotic genes. However, many *S. meliloti* strains contain further auxiliary medium-sized plasmids and thus the *S. meliloti* genome may contain up to 9,000 genes (Barnett et al., 2001; Capela et al., 2001). In contrast to rhizobia, obligate endosymbionts of insects usually possess a strongly reduced (160–450 kbp) genome, which ensures their multiplication and codes for a few specific biosynthetic pathways, including those satisfying the host's needs (Moran et al., 2008; Price et al., 2011). These extremely reduced genomes are nevertheless amplified, compensating for the diminished genome with a polyploid DNA content.
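To put the replicon sizes quoted above on one scale, the short Python sketch below sums the three *S. meliloti* replicons and compares the total with the 160–450 kbp range given for obligate insect endosymbionts. All figures come from the preceding paragraph; the variable names are illustrative, and the auxiliary plasmids found in some strains are not included.

```python
# Replicon sizes of the S. meliloti genome, in kbp (values from the text).
replicons_kbp = {"chromosome": 3650, "pSymA": 1350, "pSymB": 1680}

# Genome size range quoted for obligate insect endosymbionts, in kbp.
insect_min_kbp, insect_max_kbp = 160, 450

total_kbp = sum(replicons_kbp.values())
fold_vs_largest = total_kbp / insect_max_kbp    # against a 450 kbp genome
fold_vs_smallest = total_kbp / insect_min_kbp   # against a 160 kbp genome

print(f"S. meliloti chromosome plus megaplasmids: {total_kbp / 1000:.2f} Mbp")
print(f"{fold_vs_largest:.1f}- to {fold_vs_smallest:.1f}-fold larger "
      "than the quoted insect endosymbiont genomes")
```

The three replicons add up to 6.68 Mbp, roughly 15 to 42 times the size of the reduced insect endosymbiont genomes.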

The plant partners of rhizobia belong to the *Leguminosae/Fabaceae* family. Nitrogen-fixing symbiosis has evolved in several lineages, but not all legumes form symbiosis. To date, 12,000 nodulated legume species are known and each has its own *Rhizobium* partner(s). The symbiosis is triggered by nitrogen starvation of the host plant, which has to select its *Rhizobium* partner from billions of bacteria in the rhizosphere. This is achieved by secretion of flavonoid signal molecules from the root, which act as chemo-attractants but most importantly as inducers of the *Rhizobium* nodulation genes (Oldroyd, 2013). These genes are required for the production of bacterial signal molecules, the Nod factors (NFs), which trigger the nodule developmental program in the host plant (Walker and Downie, 2000). The NFs are lipochitooligosaccharide molecules that carry host-specific substitutions on the terminal sugar residues and characteristic lipid chains, which are recognized by LysM-type host receptors and are required both for nodule development and bacterial infection. Interestingly, the ancient symbiosis of land plants with arbuscular mycorrhizal (AM) fungi operates with similar lipochitooligosaccharide signal molecules, the Myc factors, which are perceived by LysM-type receptors that are similar to, but distinct from, those perceiving the NFs (Abdel-Lateif et al., 2012; Oldroyd, 2013). The Myc factors and NFs activate a common signaling pathway, but after the involvement of the common symbiotic genes conserved in plants the pathways diverge, one leading to nodulation and the other to AM symbiosis.

Plant infection and nodule formation are intricate processes; Nod factors play distinct roles in nodule organogenesis and root hair infection. Moreover, besides Nod factors, various bacterial surface polysaccharides are crucial for efficient infection (Fraysse et al., 2003). In most legumes, the rhizobia enter the host via the root hairs, where invagination of the plasma membrane forms an infection thread (IT) that contains the multiplying bacteria and grows towards the root cortex. A less frequent and ancient mode of infection occurs via cracks on the root surface of certain legumes.

## **DETERMINATE AND INDETERMINATE NODULE DEVELOPMENT**

Nodule development requires mitotic reactivation of cortical cells, leading to the formation of a nodule primordium, which then differentiates into a nitrogen-fixing root nodule that provides microaerobic conditions in the central zone for the functioning of the oxygen-sensitive nitrogenase enzyme in the bacteroids. Depending on the transient or persistent nature of host cell proliferation, the nodules can be of either the determinate or the indeterminate type (Terpolilli et al., 2012; Kondorosi et al., 2013). Determinate nodules have no meristem and contain a homogeneous population of symbiotic cells. Determinate nodules develop, for example, on *Phaseolus vulgaris* and *Lotus japonicus* roots.

In contrast, active cell division is maintained in indeterminate nodules. A nodule meristem is present in the apical region (zone I), which by constant generation of new cells provokes continuous growth and an elongated nodule shape. The cells leaving the meristem do not divide anymore and enter a differentiation phase. The infection thread releases the bacteria into the submeristematic cells, which differentiate gradually along the 12–15 cell layers of the infection zone (zone II), leading to the development of nitrogen-fixing symbiotic cells in nodule zone III (**Figure 1**; Franssen et al., 1992). *Medicago sativa, M. truncatula, Vicia sativa*, and *Pisum sativum* are examples of plants forming indeterminate nodules.

## **GROWTH OF SYMBIOTIC CELLS INVOLVES AMPLIFICATION OF THE HOST GENOME BY ENDOREDUPLICATION CYCLES**

**FIGURE 1 | Structure of nitrogen-fixing root nodules formed in** *S. meliloti* **–** *M. truncatula* **symbiosis.** The different nodule zones are indicated on the longitudinal nodule section: (I) meristem, (II) infection zone, (III) nitrogen fixation zone, (IV) senescence zone. Symbiotic cells in zone II contain the differentiating endosymbionts while in zone III the host cytoplasm is fully packed with long nitrogen-fixing bacteroids. Endosymbionts stained with Syto9 have green fluorescence.

Extreme plant cell enlargement can be observed in both the determinate and indeterminate nodules. The cytoplasm of a nitrogen-fixing symbiotic cell hosts about 50,000 bacteroids. To accommodate such a high number of endosymbionts, the host cells grow. In *M. truncatula* nodules the volume of the nitrogen-fixing cells is 80-fold larger than that of the diploid meristematic cells. The growth of infected cells occurs stepwise in zone II and is the consequence of repeated endoreduplication (ER) of the genome without mitosis. In zone II the cell cycle machinery is still active but the lack of mitotic cyclins inhibits mitosis and transforms the mitotic cycles to endoreduplication cycles (Cebolla et al., 1999). This is achieved by the cell cycle switch CCS52A protein, which by the destruction of the mitotic cyclins induces repeated rounds of genome duplication, leading to the formation of gradually growing polyploid cells (Roudier et al., 2003; Kondorosi and Kondorosi, 2004). In *Medicago* species the ploidy levels can reach 64C, representing a 64-fold higher DNA content compared to haploid cells (C corresponds to the haploid DNA content; Vinardell et al., 2003). Down-regulation of CCS52A in *M. truncatula* had no effect on primordium formation but was detrimental for nodule differentiation, indicating that the ER cycles and the formation of large, highly polyploid cells are essential for nodule functioning (Vinardell et al., 2003). Interestingly, cortical cells containing AM fungi are also polyploid, as are the nematode-feeding giant root cells (Favery et al., 2002; Genre et al., 2008). Similarly, insect symbiotic cells, the bacteriocytes harboring intracellular endosymbionts, are also large and polyploid (Nakabachi et al., 2010). In angiosperm plants, polyploidy is frequent and the specific inherited pattern of polyploidy in different organs, tissues and cell types suggests that it could be a major source of the specialized physiology of host cells (Nagl, 1976; Edgar et al., 2014). Besides cell growth, the multiple gene copies and the lack of chromosome condensation can contribute to higher transcriptional and metabolic activities. However, the association of polyploidy with different cell functions suggests that polyploidy also has an impact on the architecture of nucleosomes and on the epigenome controlling activation or repression of specific genomic regions. Accordingly, the polyploid genome content of symbiotic cells appears to be a prerequisite for nodule differentiation and for the expression of most symbiotic host genes (Maunoury et al., 2010).
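The relationship between ploidy level and the number of endoreduplication cycles is simple doubling arithmetic. The following Python sketch assumes a diploid starting cell at 2C (an assumption about the starting state, not a value stated above) and that every endocycle exactly doubles the DNA content; `endocycles_to_reach` is an illustrative helper name.

```python
from math import log2

def endocycles_to_reach(target_c: int, start_c: int = 2) -> int:
    """Number of genome doublings needed to go from start_c to target_c.

    Assumes each endoreduplication cycle doubles the DNA content exactly,
    so target_c / start_c must be a power of two.
    """
    ratio = target_c / start_c
    cycles = log2(ratio)
    if ratio < 1 or not cycles.is_integer():
        raise ValueError("target_c must be start_c times a power of two")
    return int(cycles)

# 64C is the maximum ploidy level quoted for Medicago symbiotic cells.
for ploidy in (4, 8, 16, 32, 64):
    print(f"{ploidy}C: {endocycles_to_reach(ploidy)} endocycle(s) from 2C")
```

Under these assumptions, a 64C symbiotic cell has completed five endocycles after leaving the mitotic cycle at 2C.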

#### **DIFFERENT FATES OF NITROGEN FIXING BACTEROIDS**

The bacteria released from the IT are present in the host cytoplasm as organelle-like structures, called symbiosomes. The bacteria have no direct contact with the cytoplasm as they are surrounded by a peribacteroid membrane, also known as the symbiosome membrane (SM). The bacteroid, the SM and the space between them comprise the symbiosome (Catalano et al., 2004). During its formation the SM reflects its plasma membrane origin; later modifications of its composition open new, specialized roles at the host-endosymbiont interface (Limpens et al., 2009; Ivanov et al., 2012; Brear et al., 2013; Sinharoy et al., 2013). The bacteroids multiply in the growing host nodule cells to a certain cell density, adapt to the endosymbiotic lifestyle and microaerobic conditions and mature to nitrogen-fixing bacteroids. The form and physiology of bacteroids can be, however, strikingly different in the various legumes. In certain legume hosts, the nitrogen-fixing bacteroids have the same morphology as cultured cells; this type of bacteroid can revert to the free-living form. In other associations, the bacteroids are irreversibly transformed to polyploid, enlarged, non-cultivable endosymbionts. These terminally differentiated bacteroids can be elongated and even branched, 5- to 10-fold longer than the free-living cells, or can be spherical, with a genome amplified from 8- to at least 20-fold, depending on the host (Mergaert et al., 2006; Nakabachi et al., 2010). Terminal differentiation of bacteroids is host controlled and has evolved in multiple branches of the *Leguminosae* family, indicating a host advantage and likely higher symbiotic performance (Oono et al., 2010). Terminal bacteroid differentiation is best elucidated in the *S. meliloti* – *M. truncatula* symbiosis. In *M. truncatula* nodules, the most visible events of terminal bacteroid differentiation occur in zone II. Multiplication of bacteroids stops in the middle of zone II, where cell elongation and uniform amplification of the multiple replicons by endoreduplication cycles begin. Along 2–3 cell layers at the border of zones II and III (called the interzone), sudden growth of the bacteroids is visible and they reach practically their final size; however, nitrogen fixation takes place only in zone III.

## **HOST PEPTIDES GOVERN BACTEROID DIFFERENTIATION**

Comparison of nodule transcriptomes of legumes with reversible and irreversible bacteroid differentiation revealed the existence of several hundred small genes that were only present in the genome of those host plants where bacteroid differentiation was terminal. In *M. truncatula* the nodule cells produce at least 600 nodule-specific symbiotic peptides (symPEPs). The *symPEP* genes are only activated in the *S. meliloti*-infected polyploid symbiotic cells (Kevei et al., 2002; Mergaert et al., 2003), although certain sets are activated at earlier and others at later stages of nodule development. A large portion, more than 500 genes, encode nodule-specific cysteine-rich (NCR) peptides (Mergaert et al., 2003; Alunni et al., 2007; Nallu et al., 2014). The NCR peptides are targeted to the bacteroids, and when their delivery to the endosymbionts was blocked, bacteroid differentiation was abolished, demonstrating that the peptides are responsible for terminal differentiation of *S. meliloti* bacteroids (Van de Velde et al., 2010). The high sequence variety and the characteristic expression patterns of *NCR* genes suggest diversity in their functions, modes of action and bacterial targets at different stages of bacteroid maturation (**Figure 2**). However, why does the host cell produce an arsenal of NCRs? What can be the advantage of such a diverse peptide repertoire? Is it necessary for interaction of the host with various bacteria? The symbiotic partners of *M. truncatula* are *S. meliloti* and *S. medicae*; however, in the soil there are countless strain variants of both species. *M. truncatula* is also represented by many different ecotypes and accessions differing in the number, sequences, and expression profile of *NCR* genes and in their symbiotic interactions with different *S. meliloti* and *S. medicae* strains (Nallu et al., 2014; Roux et al., 2014). While a nodule contains a single bacterium type, the different nodules on the same root system may possess distinct bacterial populations. It is possible that the plant, recognizing the various endosymbionts, manipulates them with a strain-specific repertoire of peptides. These differences can add an additional control level for host-symbiont specificity and thereby for nodulation efficiency.

Although symPEPs represent unique peptide classes, their structures resemble those of antimicrobial peptides (AMPs). AMPs with a broad spectrum of microbial cell-killing activity are most frequently cationic, provoking cell death by pore formation, membrane disruption and consequent lysis of microbial cells.

**FIGURE 2 | Differential expression of** *symPEP* **genes in** *M. truncatula* **nodules.** Black signal: in situ hybridization, blue signal: GUS activity of symPEP promoter-GUS fusions in transgenic nodules.

The fact that the cell division ability is definitively lost during endosymbiont differentiation indicates that at least certain symPEPs have antimicrobial activities. Treatment of bacteria with synthetic cationic NCRs indeed provoked rapid and efficient dose-dependent elimination of various Gram-negative and Gram-positive bacteria, including important human and plant pathogens (Van de Velde et al., 2010; Tiricz et al., 2013). This *ex planta* killing effect correlated with permeabilization of microbial membranes; however, symPEPs in their natural environment, the nodule cells, do not permeabilize the bacterial membranes and do not kill the endosymbionts. Most likely the peptide concentrations in the nodules are significantly lower than those applied in the *in vitro* assays. Moreover, cationic peptides are produced together with anionic and neutral peptides in the same cell, and the possible combination of a few tens or hundreds of peptides with various charge and hydrophobicity might neutralize the direct bactericidal effect of the cationic peptides.

The involvement of AMPs or AMP-like peptides is not unique to the *Rhizobium*-legume symbiosis. In the weevil *Sitophilus*, the symbiotic cells produce the antimicrobial peptide coleoptericin-A (ColA), which provokes the development of giant filamentous endosymbionts by inhibiting cell division and protects the neighboring insect tissues from bacterial invasion (Login et al., 2011). In this system a single peptide is sufficient for differentiation of the obligate, vertically transmitted endosymbiont, unlike nodules, which operate with hundreds of symPEPs and can host innumerable strain variants as their endosymbionts. In the aphid-*Buchnera* symbiosis, the host cells also produce bacteriocyte-specific peptides, including cysteine-rich peptides (BCRs) that resemble the *Medicago* NCR peptides; however, the functions of these symbiotic peptides have not been reported yet (Shigenobu and Stern, 2013).

#### **NCR247: AN EXAMPLE FOR MULTI-TARGET HOST EFFECTOR**

Transcriptome analysis of *M. truncatula* nodules at different stages of their development, laser microdissection of nodule regions, in situ hybridization, immunolocalization of selected peptides, and symPEP promoter-reporter gene fusions in transgenic nodules allow mapping the action of individual peptides in the symbiotic cells from the early infection until the late nitrogen fixation state. NCR247 is expressed in the older cell layers of zone II and in the interzone where bacterial cell division stops and remarkable elongation of the endosymbionts occurs (Farkas et al., 2014).

This small cationic peptide effectively killed various microbes *in vitro* and the *in silico* analysis indicated its extreme protein binding capacities. FITC-labeled NCR247 entered the bacterial cytosol where its interactions with numerous bacterial proteins were possible. Binding partners were identified by treatment of *S. meliloti* bacteria or bacteroids with StrepII/FLAG-tagged peptides followed by affinity chromatography and identification of interacting partners with LC-MS/MS and Western analysis (Farkas et al., 2014).

One of the interactors was the FtsZ cell division protein, which plays a crucial primary role in cell division. A number of antibiotic peptides are known to exert a bactericidal or bacteriostatic effect through interaction with FtsZ, inhibiting its polymerization and thereby hindering proper Z-ring and septum formation (Handler et al., 2008). NCR247 was co-purified with FtsZ from the bacterial cytoplasm and was shown to disrupt septum formation. NCR035, which also exhibits a bactericidal effect *in vitro* and is produced in the same symbiotic cells as NCR247, accumulates at the division septum, which indicates simultaneous or consecutive action of these peptides and the evolution of multiple host strategies to inhibit endosymbiont proliferation. Another study showed that expression of important cell division genes, including genes required for Z-ring function, was strongly attenuated in cells treated with NCR247 (Penterman et al., 2014). Pretreatment of bacteria with sub-lethal NCR247 concentrations abolished localization of FITC-NCR035 to the septum and provoked cell elongation (Farkas et al., 2014).

Ribosomal proteins were the most abundant NCR247 interacting partners. NCR247 was observed to strongly inhibit bacterial protein synthesis in a dose-dependent manner both *in vivo* and *in vitro* (Farkas et al., 2014). These results suggested that one mode of NCR247 peptide action is binding to the ribosomes, both in bacterial cells and in bacteroids. Interestingly, an altered pattern and reduced complexity of the interacting proteins were observed in the bacteroids. Accordingly, the general expression level of ribosomal proteins was on average 20-fold lower in the bacteroids than in the free-living cells, with different relative abundance of transcripts of individual ribosomal proteins. Ribosome diversification in bacteroids may have a significant role by contributing to the advanced translation of specific proteins, thereby supporting the specialized, energy-demanding physiology of highly abundant nitrogen fixation function.

The GroEL chaperone was also a direct interacting partner of NCR247 (Farkas et al., 2014). Out of the 5 GroEL proteins, GroEL1 or GroEL2 is sufficient for survival, while GroEL1, expressed at a high level in the nodule, is essential for symbiosis (Bittner et al., 2007). It is needed for full activation of the nodulation genes and assembly of the nitrogenase complex. GroEL possesses extreme functional versatility by interacting with hundreds of proteins. The NCR247–GroEL1 interaction can have consequences directly on GroEL but indirectly also on the GroEL substrates and the associated biological processes. Absence of GroEL1 severely affected bacterial infection and the maintenance and differentiation of bacteroids, demonstrating a general need for GroEL1 in all stages of nitrogen-fixing nodule development.

The involvement of GroEL and host peptides in microbe-host interactions is not unique to the *Rhizobium*-legume symbiosis. In the weevil symbiotic cells, coleoptericin-A (ColA) also interacts with GroEL (Login et al., 2011). GroEL also plays an important role in the maintenance of endosymbionts (Moran, 1996; Kupper et al., 2014). As most symbiotic systems are as yet unexplored and high-throughput genomic and proteomic tools have only recently become available, we can only predict that host peptide-mediated endosymbiont differentiation, as well as genome amplification of host cells and of terminally differentiated endosymbionts, are general strategies of symbiosis.

#### **CONCLUSION**

Symbiotic and pathogenic bacteria use similar approaches to interact with their hosts and to survive within host cells, even if the results of these interactions are strikingly different. Plants and animals can generate innate immune responses to microorganisms upon the perception of MAMPs (microorganism-associated molecular patterns). This perception results in the activation of signaling cascades, and the production of antimicrobial effectors. AMP-like host peptides such as the *M. truncatula* NCR peptides or the weevil ColA antimicrobial peptide play pivotal and multifaceted roles in controlling the multiplication and differentiation of endosymbionts, thereby restricting the presence of bacteria to the symbiotic cells. Thus, host organisms utilize these effector peptides to tame and even hire selected microbial invaders for service.

#### **ACKNOWLEDGMENTS**

Work in our laboratories is supported by the "SYM-BIOTICS" Advanced Grant of the European Research Council to Éva Kondorosi (grant number 269067) and by TÁMOP-4.2.2.A-11/1/KONV-2012-0035 supported by the European Union and co-financed by the European Social Fund.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2014; paper pending published: 21 May 2014; accepted: 13 June 2014; published online: 30 June 2014.*

*Citation: Maróti G and Kondorosi É (2014) Nitrogen-fixing Rhizobium-legume symbiosis: are polyploidy and host peptide-governed symbiont differentiation general principles of endosymbiosis? Front. Microbiol. 5:326. doi: 10.3389/fmicb.2014.00326 This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Maróti and Kondorosi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Pervasive effects of a dominant foliar endophytic fungus on host genetic and phenotypic expression in a tropical tree

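*Luis C. Mejía<sup>1,2†</sup>, Edward A. Herre<sup>1</sup>\*, Jed P. Sparks<sup>3</sup>, Klaus Winter<sup>1</sup>, Milton N. García<sup>1</sup>, Sunshine A. Van Bael<sup>1,4</sup>, Joseph Stitt<sup>5</sup>, Zi Shi<sup>2</sup>, Yufan Zhang<sup>2</sup>, Mark J. Guiltinan<sup>2</sup> and Siela N. Maximova<sup>2</sup>\**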

*<sup>1</sup> Smithsonian Tropical Research Institute, Unit 9100, USA*

*<sup>2</sup> Department of Plant Science and The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA*

*<sup>3</sup> Department of Ecology and Evolution, Cornell University, Ithaca, NY, USA*

*<sup>4</sup> Department of Ecology and Evolutionary Biology, Tulane University, New Orleans, LA, USA*

*<sup>5</sup> Social, Life and Engineering Sciences Imaging Center, Materials Research Institute, University Park, PA, USA*

#### *Edited by:*

*M. Pilar Francino, Center for Public Health Research, Spain*

#### *Reviewed by:*

*Anna Carolin Frank, University of California Merced, USA Silvia Bulgheresi, University of Vienna, Austria Martin Heil, Centro de Investigación y de Estudios Avanzados del I.P.N. - Unidad Irapuato, Mexico*

#### *\*Correspondence:*

*Edward A. Herre, Smithsonian Tropical Research Institute, Unit 9100, Box 0948, DPO AA 34002-9998, USA e-mail: herrea@si.edu; Siela N. Maximova, Department of Plant Science, The Pennsylvania State University, 421 Life Sciences Building, University Park, PA 16802, USA e-mail: snm104@psu.edu*

#### *†Present address:*

*Luis C. Mejía, Institute for Scientific Research and High Technology Services (INDICASAT-AIP), Panama, Panama*

It is increasingly recognized that macro-organisms (corals, insects, plants, vertebrates) consist of both host tissues and multiple microbial symbionts that play essential roles in their host's ecological and evolutionary success. Consequently, identifying benefits and costs of symbioses, as well as mechanisms underlying them, are research priorities. All plants surveyed under natural conditions harbor foliar endophytic fungi (FEF) in their leaf tissues, often at high densities. Although FEF produce no visible effects on their hosts, experiments have shown that they reduce pathogen and herbivore damage. Here, combining results from three genomic and two physiological experiments, we demonstrate pervasive genetic and phenotypic effects of the apparently asymptomatic endophytes on their hosts. Specifically, inoculation of endophyte-free (E−) *Theobroma cacao* leaves with *Colletotrichum tropicale* (E+), the dominant FEF species in healthy *T. cacao*, induces consistent changes in the expression of hundreds of host genes, including many with known defensive functions. Further, E+ plants exhibited increased lignin and cellulose content, reduced maximum rates of photosynthesis (Amax), and enrichment of nitrogen-15 and carbon-13 isotopes. These phenotypic changes observed in E+ plants correspond to changes in expression of specific functional genes in related pathways. Moreover, a cacao gene (*Tc00g042540*) highly up-regulated by *C. tropicale* also confers resistance to pathogen damage in the absence of endophytes or their products in host tissues. Thus, the benefits of increased pathogen resistance in E+ plants are derived in part from up-regulation of intrinsic host defense responses, and appear to be offset by potential costs including reduced photosynthesis, altered host nitrogen metabolism, and endophyte heterotrophy of host tissues. Similar effects are likely in most plant-endophyte interactions, and should be recognized in the design and interpretation of genetic and phenotypic studies of plants.

**Keywords: symbiosis, fungal endophytes,** *Theobroma***,** *Colletotrichum***, gene expression, plant defense,** *Arabidopsis***,** *Populus*

#### **INTRODUCTION**

Although non-pathogenic microbial symbionts are increasingly recognized for their often profoundly beneficial influences on hosts (Van Rhijn and Vanderleyden, 1995; Kimbell et al., 2006; Herre et al., 2007; Li et al., 2008; Stat et al., 2008; Feldhaar, 2011; Engel and Moran, 2013), studies of genetic and physiological expression in plants typically do not control for, and therefore neglect, their effects. Moreover, how symbiont-induced changes in host genetic and physiological expression underlie associated benefits and costs is largely unexplored (Friesen et al., 2011).

The symbioses of woody plants with foliar endophytic fungi (FEF) offer a tractable experimental system in which the effects of endophytes on their host genetic and phenotypic expression can readily be assessed. All plants surveyed under natural conditions harbor FEF. Despite the fact that they are living in high density or abundance within plant tissues, they produce few, if any, visible effects on their hosts (Arnold et al., 2003; Herre et al., 2007; Rodriguez et al., 2009). In woody plants, most foliar endophyte species are acquired from the environment. Endophyte-free (generally <2% endophyte colonization) leaves can be produced by preventing spores from landing on leaf surfaces or by preventing leaf surfaces from becoming wet. Importantly, the identities of colonizing endophytes can be controlled (Arnold et al., 2003; Herre et al., 2007; Mejia et al., 2008). Experiments using these approaches have shown that foliar endophytes often enhance host defenses against pathogens and herbivores (Arnold et al., 2003; Mejia et al., 2008; Rodriguez et al., 2009). However, the actual mechanisms underlying the enhancement of host defense by FEF in woody plants are not established.

On one hand, many FEF produce antibiotic chemicals *in vitro*, and much attention has focused on the potential of these chemicals to provide enhanced host defense (Miller et al., 2002; Gunatilaka, 2006). On the other, it is well-known that plants respond genetically and physiologically to the presence of a wide variety of microbes (e.g., pathogens, non-foliar endophytes such as mycorrhizae and rhizobia, etc.) (Bailey et al., 2006; Friesen et al., 2011). However, the importance of the contribution of either endophyte-produced chemicals, or endophyte-induced enhancement of intrinsic host defenses to deter pathogens *in planta* is not known.

*Theobroma cacao*, the source of cacao beans, is a tropical tree species for which the ecology and systematics of fungal endophytes have been extensively studied, and for which there are complete genome sequences available (Crozier et al., 2006; Mejia et al., 2008; Hanada et al., 2010; Rojas et al., 2010; Argout et al., 2011; Motamayor et al., 2013). The composition of the FEF species assemblages in *T. cacao* is typical of other tropical tree species: it is highly diverse, with a few FEF species (e.g., *Colletotrichum tropicale*) consistently dominating a fungal assemblage that, in any host plant, also includes many relatively rare species (Arnold et al., 2003; Crozier et al., 2006; Rojas et al., 2010). Several FEF species isolated from *T. cacao*, as well as microbes isolated from stems and pods of *T. cacao* and other host species, have been shown to reduce leaf and pod damage caused by *T. cacao* pathogens and herbivores (Arnold et al., 2003; Evans et al., 2003; Rubini et al., 2005; Bailey et al., 2006; Mejia et al., 2008; Bae et al., 2009; Hanada et al., 2010; Krauss et al., 2010).

*Colletotrichum tropicale* occurs as an endophyte in a wide range of neotropical trees and is a dominant endophytic fungus in healthy leaves of *T. cacao* across natural and agricultural settings throughout Panama, and other parts of the New and Old World tropics (Rojas et al., 2010; Weir et al., 2012). In greenhouse experiments, E+ cacao plants inoculated and colonized with a mixture of endophyte species dominated by *C. tropicale* were more resistant to damage by the pathogen *Phytophthora palmivora* (Arnold et al., 2003). In agricultural field trials, *T. cacao* inoculated with *C. tropicale* showed lower incidence of black pod disease caused by *Phytophthora* spp. (Mejia et al., 2008). Thus far, no evidence suggests chemical inhibition of *Phytophthora* spp. by *C. tropicale* (Mejia et al., 2008). Further, inoculations with *C. tropicale* have been shown to reduce herbivore damage in several tropical host plant species (Rojas et al., 2010; Van Bael et al., 2011, 2012).

Using 3347 and 17,247 unigenes *T. cacao* microarrays, quantitative PCR (RT-qPCR) and the completed sequences of the *T. cacao* and *Arabidopsis thaliana* genomes, we assessed changes in host gene expression in response to endophyte inoculation by comparing leaves inoculated with the foliar endophyte *C. tropicale* (E+) to endophyte-free, un-inoculated leaves (E−), in three experiments in which seedlings were grown under growth chamber conditions. In the leaf tissue from the first gene expression experiment we also determined changes in lignin and cellulose content. In addition, we used the functional annotation of significantly up- and down-regulated genes in the first two experiments as guides for expected changes in host phenotypic and physiological expression. Specifically, in two subsequent experiments examining phenotypic responses to endophyte inoculation, we measured photosynthetic responses and stable isotope ratios for nitrogen and carbon in E+ and E− leaves in plants grown under greenhouse conditions. In all inoculation experiments, E+ leaves showed significantly more endophyte colonization than E− leaves (see **Table 1**). Further, selecting a host gene (*Tc00g042540*, with a functional annotation suggesting defensive function) that was among the most highly up-regulated genes in E+ treatments, we conducted transient gene expression experiments to assess the effect of increased expression of this gene on host pathogen resistance in the absence of endophyte treatment. Finally, we conducted a third (time course) microarray experiment to document the dynamics of host genetic expression in response to endophyte inoculation through time.

Collectively, these experiments indicate that FEF can exert profound influences on host gene expression that are related to changes in multiple aspects of the host's physiology, metabolism, anatomy, and resistance to pathogens and herbivores. Importantly, in this case, we selected *C. tropicale*, routinely the dominant species found in foliar endophytic fungal communities that establish in healthy *T. cacao* leaves (Rojas et al., 2010). In the case of this ecologically relevant endophyte, we found that enhanced resistance to pathogen damage results from increased expression of a host gene that is among the most highly upregulated by *C. tropicale* inoculations, without the endophyte or its products being present in the host tissues. The generality of these findings on endophyte effects on host genetic and physiological expression is supported by the facts that (1) FEF are ubiquitous associates of plants worldwide, (2) that *C. tropicale* in particular is a dominant member of the endophyte community in many tropical plants (Rojas et al., 2010), and (3) that several studies document endophytic effects on host plant defense and physiology. Thus, our results demonstrate that endophyte presence and potential effects should be recognized and accounted for in the design and interpretation of studies of genetic and phenotypic expression in plants.

#### **MATERIALS AND METHODS**

#### **PLANT MATERIAL AND GENERATION OF PLANTS SYMBIOTIC WITH ENDOPHYTE (INOCULATION EXPERIMENTS)**

Seeds from open pollinated *T. cacao* trees accession UF12, grown in a plantation in Charagre, Bocas del Toro province, Panama, were used for all microarray and phenotypic response experiments reported in this article. Endophyte free seedlings were generated at the Smithsonian Tropical Research Institute, Panama, as previously described (Arnold et al., 2003). In brief, cacao seeds were surface sterilized by immersing them in 0.5% sodium hypochlorite for 3 min and rinsed with sterile water before being placed for germination in plastic trays with soil (2:1 mixture of clay rich soil from Barro Colorado Island, Panama and rinsed river sand) and incubated in growth chambers. One-month-old seedlings were transplanted to individual pots (600 ml volume) containing the same soil mixture and kept in the growth chambers or transferred to a greenhouse depending on the experiment. Germination of seeds and seedling growth was done in Percival growth chambers (model I35LL, 115 volts, 1/4 Hp, series: 8503122.16, Percival Scientific, Inc., Perry, IA) with 12/12 h light/dark photoperiod and temperatures of 30 and 26°C, respectively. A summary of growth conditions and experimental designs is presented in **Table 1**.

#### **Table 1 | Summary of experiments.**

*\*Theobroma cacao leaves inoculated with endophyte.*

*\*\*Un-inoculated T. cacao leaves.*

*\*\*\*dpi, days post inoculation. In the cases of multiple inoculations, = days post last inoculation.*

*\*\*\*\*In the second phenotypic response experiment, we used a paired leaf design in which some leaves (oldest and youngest pairs) were inoculated (E+) and others un-inoculated (E−) in the same individual. The relatively high level of endophyte colonization in E− leaves in this experiment reflects both the environment from the greenhouse that makes it difficult to maintain leaves completely free of endophytes and the difficulty of inoculating one leaf while keeping the other endophyte free. This makes for a conservative test of the effects of C. tropicale on both nitrogen and carbon isotopes ratios.*

Three microarray and two phenotypic response experiments were conducted to compare gene expression and anatomical and physiological phenotypes of endophyte inoculated *T. cacao* leaves (E+) with those of control un-inoculated *T. cacao* leaves (E−). Tissue used for the first microarray experiment was also evaluated for lignin and cellulose content in the leaf epidermal cells of E+ and E− plants. All microarray experiments and the first phenotypic response experiment were conducted in Percival growth chambers under the conditions detailed above and the second phenotypic response experiment was conducted in a greenhouse (**Table 1**). In all of these experiments, the endophyte treatment consisted of inoculation of cacao leaves with conidial (spore) suspensions of the endophyte *C. tropicale* strain 5101 (=CBS 124949) following previously described methods (Mejia et al., 2008). The percentage of endophyte colonization (number of leaf pieces with mycelia over total leaf pieces plated in culture media) per single leaf on E+ and E− leaves was measured as previously described (Mejia et al., 2008) and statistically analyzed using the Wilcoxon signed rank test or the Mann-Whitney U test (**Table 1**). The first microarray experiment was conducted to determine endophyte effect on host leaf gene expression using a 3347-unigenes spotted-oligo array. The second microarray experiment was conducted to test for consistency of the groups of host genes for which significant shifts in expression (up- or downregulation) were produced by endophyte inoculations using a 17,247-unigenes Roche NimbleGen custom oligo array. The third microarray experiment was designed as a time course experiment to assess cacao leaf gene expression changes in E+ relative to E− leaves over a period of 2 weeks, with sample collection at 0, 3, 7, and 14 days post endophyte inoculation.

The first phenotypic response experiment focused on comparing photosynthesis (Amax) between endophyte inoculated and un-inoculated plants. The second phenotypic response experiment was conducted in a greenhouse (using seedlings that were germinated and inoculated in the growth chambers) to confirm photosynthesis results from the first phenotypic response experiment under more natural conditions and to permit an assessment of endophyte induced shifts in carbon and nitrogen isotope signatures without the potential effects of isotope depletion often observed in relatively small closed chambers. Further, this experiment followed a paired-leaf design (E+ and E− leaves on the same plant) that permits greater statistical power.

#### **MICROARRAY AND RT-qPCR METHODS**

For the first microarray experiment, we employed a custom 70-mer oligonucleotide array (*T. cacao* 3K microarray) spotted at The Pennsylvania State University (PSU) Core Genomics Facility. This array contained 3347 unigenes (spotted in duplicate) and a set of non-hybridizing oligos as negative and probe spike-in controls (Alien, Spot Validation System, Stratagene). These gene sequences were collated from several different EST sequencing projects and from other cacao genes sequenced at the time of microarray development, prior to the whole genome sequencing. A total of 2154 different *T. cacao* genes with their respective locus identifiers were represented in this array, and 2015 of the genes contained *A. thaliana* locus identifiers and functional annotation based on an *e*-value cutoff of 1e-5 using TBLASTX (Altschul et al., 1990) on unigene sequences from this array against The Arabidopsis Information Resource (TAIR) database (Berardini et al., 2004). Samples used for microarray analyses consisted of six E+ and six E− mature leaves (equivalent to leaf developmental stage E; Mejía et al., 2012), each leaf from a different individual plant. These leaf samples were collected 14 dpi (**Table 1**), preserved in RNAlater according to the manufacturer's instructions (Applied Biosystems/Ambion, Austin, TX), and transported to PSU where RNA extractions were performed as previously described (Verica et al., 2004). Total RNA sample concentration and purity were assessed using a NanoDrop spectrophotometer and RNA quality was determined using an Agilent Bioanalyzer.

Hybridizations were performed by the Genomics Core Facility at PSU according to published facility protocols (http://www.huck.psu.edu/facilities/genomics-up/protocols/nimblegenprotocols). One μg of total RNA was amplified using an mRNA amplification kit (Amino Allyl MessageAmp II™, Ambion, Austin, TX) prior to labeling and hybridization. aRNA was dye coupled with Cy3 or Cy5 (GE Health Care #RPN5661) and subsequently purified according to the manufacturer's instructions. The samples were paired and each Cy3-labeled sample (1.5 μg) was combined with a Cy5-labeled sample (1.5 μg) in a dye-swap design. The paired samples were mixed together, fragmented (using Ambion AM8740) and individually hybridized to a single array at 42°C for 18 h. A total of six arrays were hybridized, resulting in a total of 12 measurements per treatment (six biological and six technical replicates). Arrays were washed to remove non-specifically bound target and were scanned with an Axon 4000A scanner.

For the second and third microarray experiments, we used a Roche NimbleGen oligonucleotide glass custom *T. cacao* gene expression 4X72k (four arrays of 72,000 probes) array, containing four probes of 50–60 mers in length for each of 17,247 unigenes (*T. cacao* 17K microarray, design ID 7114 manufactured by Roche). The microarray was designed based on EST sequences containing 6,853 contigs and 12,959 singlets from mixed cacao tissues (Argout et al., 2008), and 2781 unigenes resulting from 6572 ESTs generated from several previous cacao EST sequencing projects (Jones et al., 2002; Pugh et al., 2004; Verica et al., 2004). For the second microarray, three E+ and four E− leaf samples (leaf developmental stage D; Mejía et al., 2012) were collected at 3 dpi, each leaf from a different plant. For the third microarray, three E− leaf samples were collected at time 0, just prior to inoculation, and four E+ leaf samples (leaf developmental stage C–D; Mejía et al., 2012) were collected at 3, 7, and 14 dpi, each leaf from a different plant. Amplification, dye coupling, purification of labeled samples and hybridizations (one Cy5 and one Cy3 labeled sample mixed together per array) were also performed at the Genomics Core Facility at PSU.

The 17,247 unigenes on this microarray were blasted against the TAIR9 CDS database, and 11,225 unigenes had *A. thaliana* identifiers and annotations at the *e*-value cutoff of 1e-5 using TBLASTX. The 6022 unigenes without *A. thaliana* annotations were blasted against the cacao genome database using BLASTN, and 4931 out of 6022 unigenes had cacao identifiers and predicted full-length gene sequences. Those 4931 cacao full-length gene sequences were further blasted against the TAIR9 CDS database, and 4518 out of the 4931 cacao genes obtained *A. thaliana* IDs at an *e*-value cutoff of 1e-5. In the end there were 11,445 genes with *T. cacao* locus identifiers and 8195 genes with *A. thaliana* locus identifiers (At) that were used for gene ontology (GO) categorization.
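The *e*-value filtering used for annotation transfer can be illustrated with a minimal sketch. It assumes BLAST hits in the standard 12-column tabular layout (e-value in the eleventh column) and keeps, for each query, the hit with the lowest e-value at or below the 1e-5 cutoff. The function name, the inclusive treatment of the cutoff, and the toy identifiers are assumptions made for illustration and are not taken from the original analysis.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)} with the lowest e-value hit
    per query, keeping only hits with e-value <= cutoff.

    Assumes the standard 12-column BLAST tabular layout, in which the
    e-value is the eleventh field (index 10).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Toy input with hypothetical unigene and locus names.
example = "\n".join([
    "unigene1\tAT1G00001\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t120",
    "unigene1\tAT2G00002\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-8\t60",
    "unigene2\tAT3G00003\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.01\t30",
])
hits = best_hits(example)
```

In this toy input, `unigene2` receives no annotation because its only hit exceeds the cutoff, which mirrors how unigenes without *A. thaliana* annotations were passed on to the next search step.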

To verify microarray results from the first experiment, the RNA samples used for the microarray hybridization were subjected to RT-qPCR analysis (TaqMan® probe based system). Specifically, the expression of eight unigenes (representing six genes) that were highly up-regulated in the first microarray experiment was evaluated in the six E+ and six E− samples (three technical replicates each) used for the first microarray experiment. Gene specific primers (Supplementary Table 1) were synthesized at the PSU Nucleic Acid Facility with a MerMade12 automated DNA synthesizer (Bioautomation, Plano, TX). Gene specific fluorescent probes were synthesized by Biosearch Technologies (Novato, CA). The fluorescent label used at the 5′ end of the cacao gene probes was 6-carboxyfluorescein (6-FAM) and the quencher at the 3′ end of the gene probe was BHQ1 (Biosearch). The total volume of the PCR reaction was 25 μl and the mix included: 5 μl of cDNA (∼12.5 ng), 12.5 μl 2× TaqMan® Universal Master Mix (#4304437, Applied Biosystems, Foster City, CA), 400 nmoles of each primer, and 200 nmoles of probe. The PCR reactions were run in 96-well thin-walled PCR plates in an Applied Biosystems 7300 Q-PCR system (Foster City, CA) with the following reaction conditions: 2 min at 50°C, 10 min at 95°C, followed by 40 cycles of 15 s at 95°C and 1 min at 60°C. Each sample was amplified in duplicate and the results were averaged. The expression of housekeeping genes *TcACT* (*Tc01g01090*) and *TcUBQ* (*Tc02g024050*) from cacao was also assessed and used to normalize the data. Amplification efficiency of all target and reference genes was calculated from the slopes of the dilution curves for each sample (E = 10<sup>(−1/slope)</sup>) (Bustin, 2004). Average efficiency for each gene was then calculated and used for efficiency data correction. The data normalization, efficiency correction, statistical randomization test and relative E+/E− expression ratios were computed using REST software (Pfaffl et al., 2002). Ratios (fold difference) with *p*-values less than 0.05 were considered significant.
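As an illustration of the efficiency calculation, the following sketch derives E from the slope of a dilution curve and then applies a commonly used efficiency-corrected relative expression model (target gene normalized to one reference gene). The function names and the numbers are hypothetical, and whether this matches the REST computation in every detail is an assumption; REST additionally performs the randomization test, which is not shown here.

```python
import math


def slope_of(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def efficiency(log10_input, ct):
    """Amplification efficiency E = 10 ** (-1 / slope) of the dilution curve
    (Ct plotted against log10 of template input)."""
    return 10 ** (-1.0 / slope_of(log10_input, ct))


def relative_ratio(e_target, ct_target_control, ct_target_treated,
                   e_ref, ct_ref_control, ct_ref_treated):
    """Efficiency-corrected expression ratio (treated / control),
    normalized to a single reference gene."""
    target = e_target ** (ct_target_control - ct_target_treated)
    reference = e_ref ** (ct_ref_control - ct_ref_treated)
    return target / reference


# Hypothetical ten-fold dilution series for a perfectly doubling reaction:
# each ten-fold dilution delays Ct by log2(10) cycles.
log10_input = [0.0, -1.0, -2.0, -3.0]
ct_values = [20.0 + math.log2(10) * k for k in range(4)]
e_demo = efficiency(log10_input, ct_values)

# Hypothetical mean Ct values: the target crosses threshold 3 cycles earlier
# in E+ (treated) than in E- (control); the reference gene is unchanged.
ratio_demo = relative_ratio(2.0, 25.0, 22.0, 2.0, 18.0, 18.0)
```

With an efficiency of 2 (perfect doubling) and a 3-cycle shift in the target gene only, the ratio is 2 raised to the third power, an eight-fold up-regulation.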

#### **MICROARRAY DATA ANALYSIS**

Microarray data analyses of the first and second microarray experiments were conducted using the Limma package of the Bioconductor software (Gentleman et al., 2004; Smyth, 2005). For the first microarray experiment, the background correction and normalization were performed using the Normexp and print-tip loess methods, respectively (Ritchie et al., 2007). The avedups function of Limma was employed for averaging log<sub>2</sub> scale intensity values for duplicate spots from the normalized data. A gene-wise linear model was fitted to normalized average log scale intensity values of cDNA spots and the empirical Bayes method implemented in Limma was used for statistical analysis and assessment of differential expression (Smyth, 2004). A moderated *t*-test was conducted to compare endophyte inoculated and control leaf samples at 14 dpi, and genes with *p* < 0.01 after Benjamini-Hochberg (BH) correction for multiple testing were considered significantly differentially expressed. The difference in expression between the controls (E−) and the treatments (E+) is presented as the log<sub>2</sub> fold up- or down-regulation. For the second microarray experiment, raw data were background corrected and normalized using the RMA algorithm available in the Limma package. As above, a gene-wise linear model was fitted to normalized data and empirical Bayes was used for statistical analysis. A moderated *t*-test was conducted to identify genes with a significant difference (*p* < 0.05 after BH correction for multiple testing) in gene expression between E+ and E− leaves at 3 dpi. The data from the third microarray experiment were analyzed with the software ArrayStar 4 (DNASTAR). Pair files were imported to ArrayStar and processed using the RMA method and quantile normalization. Samples were grouped by time point and an *F*-test ANOVA was employed for finding genes with a significant (*p* < 0.05 after FDR correction) difference in expression across the time points assessed. Data from the three microarray experiments are available at the NCBI Gene Expression Omnibus (GEO, accession number GSE54732).
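The Benjamini-Hochberg adjustment applied to the moderated *t*-test *p*-values can be sketched in a few lines. This is a plain re-implementation of the step-up procedure for illustration only; the analysis itself was run in Limma, and the function name and example *p*-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity so that
    # an adjusted value never exceeds that of a larger raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)

# Genes called differentially expressed at the 0.05 threshold.
significant = [i for i, p in enumerate(adj) if p < 0.05]
```

Each raw *p*-value is scaled by the number of tests divided by its rank, so the adjustment is less severe than a Bonferroni correction while still controlling the false discovery rate.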

#### **FUNCTIONAL CHARACTERIZATION OF MICROARRAY DATA**

The potential function of differentially expressed cacao genes was assigned based on the function of the best hit after using BLAST against the *A. thaliana* and *T. cacao* genomes. Genes with available *A. thaliana* loci accession numbers were classified according to GO terms using the tools for GO annotations at TAIR (Berardini et al., 2004) and at the web-based agriGO (Du et al., 2010).

#### **TERM ENRICHMENT ANALYSIS**

To identify cacao gene sets particularly affected by endophyte inoculation, the lists of differentially expressed genes with *A. thaliana* locus accession numbers, from the first and second microarray experiments, were imported into agriGO and singular enrichment analysis was performed. For each microarray experiment, the background list of genes for comparison was all the genes with *A. thaliana* locus identifiers present in the specific microarray. The statistical method applied was the hypergeometric test, with Holm adjustment for multiple test correction, a significance level of 0.05, and complete GO analysis.
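The enrichment statistics named above can be sketched as follows: an upper-tail hypergeometric probability for one GO term, followed by a Holm step-down adjustment across terms. This is an illustrative re-implementation under stated assumptions, not the agriGO code; the function names and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, background, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `background` genes, of which `annotated` carry the GO term."""
    total = comb(background, drawn)
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def holm(pvalues):
    """Return Holm-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # Step down from the smallest p-value, multiplying by the number of
    # hypotheses still in play and enforcing monotonicity.
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical toy case: 10 background genes, 2 annotated with the term,
# 2 differentially expressed genes, at least 1 of them annotated.
p_term = hypergeom_upper_tail(1, 10, 2, 2)

# Hypothetical raw p-values for three GO terms.
adjusted_terms = holm([0.01, 0.04, 0.03])
```

Using the array-specific gene list as `background`, as described in the text, keeps the test from rewarding terms that are simply over-represented on a given array.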

#### **METABOLIC PATHWAY ANALYSIS WITH MapMan**

In order to perform metabolic pathway analysis of *T. cacao* genes with MapMan software (Thimm et al., 2004), we generated a mapping file with functional predictions of proteins using peptide sequences generated based on the *T. cacao* Criollo genome sequence (Argout et al., 2011). The mapping file was generated with the Mercator pipeline for automated sequence annotation on the MapMan website using default parameters plus the conservative and InterProScan options (Lohse et al., 2014). Briefly, a fasta file with 28,802 cacao peptide sequences was uploaded to the Mercator tool for comparison with reference databases of protein sequences, and a text mapping file with gene IDs assigned to MapMan bins (gene functional categories) was automatically generated. The file was subsequently used for MapMan pathway analysis and visualization of differentially expressed genes in metabolic pathways. Files with lists of differentially expressed genes and their respective expression fold changes between conditions for each of the first and second microarray experiments were imported into MapMan and the genes were visualized in metabolic pathways. Pathways explored in more detail included biotic stress (Pest/Pathogen attack), for identifying genes associated with plant-microbe interactions, and photosynthesis. The list of genes differentially expressed in the third microarray experiment was imported into MapMan for functional categorization and later comparison with results from the first and second microarray experiments (**Table 2**).

#### **PHLOROGLUCINOL-HCL STAINING OF LIGNIN IN CACAO LEAVES**

Leaves from the six E− and six E+ cacao plants used for the first microarray analysis were preserved in RNAlater. Three to four sections of approximately 1 cm<sup>2</sup> were cut from each leaf. The sections from each treatment (E+ and E−) were stained with 2% phloroglucinol-HCl for 20 min. After staining, 14 E+ and 18 E− sections were immediately assayed for the development of the purple color under an Olympus BX61 Epi-Fluorescence Microscope (Olympus America Inc., Melville, NY) using a 10× objective. Images were acquired with a Hamamatsu Orca-ER camera and processed with Olympus SlideBook 4.1 software. The purple color intensity (total pixel intensity) in the vascular tissue of each sample was measured using ImageJ software (http://rsbweb.nih.gov/ij/). Average intensity of each section was calculated and randomization tests indicated significant differences between the E+ and E− treatments (*p* < 0.001).

#### **Table 2 | MapMan Bins (functional categories of genes) with more elements ("genes") affected by endophyte inoculations in the microarray experiments.**

*The table is sorted by the top 14 categories of terms more affected by endophyte in the 1st microarray experiment, and provides a comparison of gene expression across the experiments.*
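The randomization test applied to the phloroglucinol staining intensities is not described in detail, so the following is only a minimal sketch of one common form: a two-sided test on the difference in group means, with group labels reshuffled at random. The function names, the number of permutations, the seed, and the intensity values are all hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided randomization test on the difference in group means.

    Returns the estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding when a
        # shuffle reproduces the observed split in a different order.
        if diff >= observed - 1e-9:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical mean section intensities (arbitrary units).
e_plus = [152.0, 160.5, 149.0, 171.2, 158.3]
e_minus = [120.1, 131.4, 118.9, 125.0, 129.7]
p_value = randomization_test(e_plus, e_minus)
```

Because the test relies only on relabeling the observed values, it makes no assumption about the distribution of pixel intensities.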

#### **CONFOCAL RAMAN MICROSCOPY**

A WITec CRM200 confocal Raman microscope was used to acquire lignin spectral images from mature leaves (preserved in RNAlater) of the six E− and six E+ plants from the first microarray experiment. In brief, the spectral images were generated by acquiring one spectrum (1 s integration time) at each pixel, with a pixel spacing of 500 nm. The lignin spectral images were generated using WITec's Witecproject software by integrating over the spectral interval 1495–1560 relative reciprocal centimeters (cm<sup>−1</sup>) at each pixel. Brighter regions in the spectral images correspond to higher total intensity within the region of integration.

Two-dimensional chemical images of lignin distribution in adaxial leaf epidermal cells were generated by integrating over the spectral interval from 1495 to 1560 relative reciprocal centimeters (cm<sup>−1</sup>) at each pixel. We observed at 100× magnification that the cell corners and cell walls of both treatments had a greater intensity and thus greater lignin concentration than the inner part of the leaf cells. We performed a total of 12,000 individual scans at 40× magnification per E− and E+ treatment using random samples from all six biological replicates, including approximately 240 cells per treatment. We recorded the overall carbohydrate distribution by integrating from 500 to 3000 cm<sup>−1</sup>. For this analysis Raman bands were assigned to cellulose or lignin components according to published literature and average spectra were calculated for each individual treatment (Gierlinger and Schwanninger, 2006).
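The band integration that turns a stack of spectra into a chemical image can be sketched as follows. The example uses a simple trapezoidal rule over the sample points that fall inside the 1495–1560 cm<sup>−1</sup> band; how the WITec software handles baselines or band edges is not stated in the text, so this is an assumption, and the function names and spectra are hypothetical.

```python
def band_integral(wavenumbers, intensities, low=1495.0, high=1560.0):
    """Trapezoidal integral of intensity over the band [low, high] in cm^-1.

    Only sample points lying inside the band are used; no interpolation
    is performed at the band edges.
    """
    points = sorted(
        (w, y) for w, y in zip(wavenumbers, intensities) if low <= w <= high
    )
    return sum(
        (w2 - w1) * (y1 + y2) / 2.0
        for (w1, y1), (w2, y2) in zip(points, points[1:])
    )


def chemical_image(wavenumbers, spectra_grid, low=1495.0, high=1560.0):
    """Map a 2D grid of spectra (rows of pixels) to a 2D grid of band
    integrals, one value per pixel."""
    return [
        [band_integral(wavenumbers, spectrum, low, high) for spectrum in row]
        for row in spectra_grid
    ]


# Hypothetical wavenumber axis and a 1 x 2 pixel grid of spectra.
axis = [1490.0, 1500.0, 1520.0, 1540.0, 1560.0, 1570.0]
flat_pixel = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
peak_pixel = [5.0, 1.0, 3.0, 1.0, 0.0, 5.0]
image = chemical_image(axis, [[flat_pixel, peak_pixel]])
```

Passing `low=500.0` and `high=3000.0` to the same function would give the broader integration used for the overall carbohydrate distribution.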

#### **PHYLOGENETIC ANALYSIS OF TUBULIN GENES**

To provide a more detailed understanding of the potential involvement in cell wall modification of the different forms of the tubulin genes that were up-regulated in cacao (*TcTU*) in E+ plants, full-length protein coding sequences of these genes were compared to α (*TUA*) and β (*TUB*) tubulin sequences from *Populus tremuloides* and *A. thaliana*. To identify all tubulin genes from cacao, full-length CDS sequences of six *A. thaliana* (*AtTUA*) and eight *Populus* (*PoptrTUA*) genes were compared against *T. cacao* EST and genome databases using TBLASTX (Altschul et al., 1990). Full-length CDS sequences of all tubulin genes from *A. thaliana*, *P. tremuloides*, and *T. cacao* (Supplementary Table 2) were aligned using Muscle software (Edgar, 2004). Additionally, a multiple sequence alignment was performed with all cacao tubulin genes and the EST sequences of the tubulin genes up-regulated in the microarray.

A phylogenetic tree was inferred by neighbor-joining analysis of all α-tubulin gene sequences from *A. thaliana*, *P. tremuloides*, and *T. cacao* using Mega 4.0 software (Tamura et al., 2007). For this analysis, α-tubulin gene sequences from *Physcomitrella patens* (*PpTUA*) genes were used as outgroup. The pairwise deletion option was selected to address alignment gaps and missing data, and branch support was obtained through 2000 bootstrap replicates.
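Two ingredients of this analysis, pairwise deletion of gapped sites and bootstrap resampling of alignment columns, can be sketched as follows. This is a simplified illustration, assuming an uncorrected p-distance on nucleotide sequences; the substitution model actually used in Mega is not stated here, and the toy alignment and function names are hypothetical.

```python
import random


def p_distance_pairwise_deletion(seq_a, seq_b, missing="-?N"):
    """Proportion of differing sites, skipping any site that is a gap or
    missing symbol in either of the two nucleotide sequences compared."""
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # pairwise deletion: drop the site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(alignment[0])
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


# Hypothetical toy alignment, not sequences from the study.
alignment = ["ATGC-TGA", "ATGCCTGA", "ATG--TAA"]
distance = p_distance_pairwise_deletion(alignment[0], alignment[2])
replicate = bootstrap_alignment(alignment, random.Random(1))
print(distance, replicate)
```

In a full analysis, a distance matrix would be computed for each bootstrap replicate, a neighbor-joining tree built from it, and branch support reported as the fraction of replicates that recover each clade.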

## **PHOTOSYNTHETIC MEASUREMENTS**

Photosynthesis-related measurements were conducted in the first phenotypic response experiment at 27 dpi on leaves of developmental stage E (Mejía et al., 2012). Net CO2 uptake capacity was measured using a Licor 6400 portable photosynthesis meter (Li-Cor, NE). Light-saturated rates of photosynthesis (Amax) were obtained from light response curves of net CO2 uptake between 0 and 800 μmol photons m−2 s−1 under ambient CO2 conditions. Cuvette temperature was 30°C. Air flow through the cuvettes was 500 μmol s−1.

### **CARBON AND NITROGEN ISOTOPE MEASUREMENTS**

Leaf samples (young and mature leaves corresponding to developmental stages C and E, respectively; Mejía et al., 2012) from the second phenotypic response experiment were dried and then shipped from Panama to Cornell University packed in desiccant. Upon arrival, the tissue was ground to a fine powder with a mortar and pestle and sub-samples of 2.55–3.15 mg were weighed using a microbalance (Model 4504MP8; Sartorius Corp. Edgewood, NY, USA). Tissue N and C content and δ15N and δ13C were measured using a CHN elemental analyzer (Model Carlo Erba NC2500; Thermo Finnigan, San Jose, CA, USA) coupled to a continuous flow isotope ratio mass spectrometer (Model Delta Plus; Thermo Finnigan, San Jose, CA, USA). Variation in repeated sample runs of the same material was ±0.01%. All analyses were conducted at the Cornell Stable Isotope Laboratory (COIL).
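For readers unfamiliar with delta notation, the conversion from an isotope ratio to a δ value can be sketched as below. The formula is the standard definition; the reference ratio shown is a commonly cited value for atmospheric N2 and is an assumption here, since the laboratory's calibration standards are not described in the text. The sample ratio is hypothetical.

```python
def delta_per_mil(r_sample, r_standard):
    """Delta value in per mil: relative deviation of the sample's
    heavy-to-light isotope ratio from that of a reference standard."""
    return (r_sample / r_standard - 1.0) * 1000.0


# Assumed reference: 15N/14N of atmospheric N2 (commonly cited value).
R_AIR_N2 = 0.0036765

# Hypothetical sample ratio, 0.5 per mil above the standard.
r_leaf = R_AIR_N2 * 1.0005
delta_15n = delta_per_mil(r_leaf, R_AIR_N2)
print(f"delta15N = {delta_15n:.3f} per mil")
```

A positive δ value indicates that the sample is enriched in the heavy isotope relative to the standard, which is the sense in which enrichment is reported for E+ leaves.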

#### **FUNCTIONAL ANALYSES OF A GENE INVOLVED IN HOST DEFENSES AND PLANT-FUNGAL SYMBIOSES**

We conducted a series of experiments in which a *T. cacao* gene that is highly up-regulated in E+ treatments (*Tc00g042540*) was up-regulated in host tissues without the presence of *C. tropicale* or any of its products. The coding sequence of *TcChi1* in binary vector pGAM00.0511 (Maximova et al., 2006) was replaced with the PCR-amplified full-length CDS of gene *Tc00g042540* (from genotype Scavina 6) under the control of the high-level constitutive E12-Ω CaMV-35S promoter (Mitsuhara et al., 1996). The new binary vector and a control vector (VC), pGH00.0126 (Maximova et al., 2003), were used to transiently transform cacao leaf discs from greenhouse-grown mature Scavina 6 plants by *Agrobacterium* vacuum infiltration as previously described (Shi et al., 2013). Expression of the *Tc00g042540* transgene was determined by RT-qPCR (primer set: RTF: ACTTGCAATATAGGGCGCTAGCCT and RTR: ACTTCTGGCGGGAAATACCACCTT) using the Takara SYBR® premix EX TaqII kit (Clontech) according to the user's manual. Each reaction was performed in duplicate in a Roche Applied Biosystem StepOne Plus Realtime PCR System (15 min at 94°C, followed by 40 cycles of 15 s at 94°C, 20 s at 60°C, and 40 s at 72°C). The specificity of the primer pairs was examined by visualization on a 2% agarose gel and by dissociation curve. Analysis was performed using *TcACT* (*Tc01t010900*) as the reference gene (primer set: *TcACT*-RTF: AGCTGAGAGATTCCGTTGTCCAGA and *TcACT*-RTR: CCCACATCAACCAGACTTTGAGTTC).
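The text does not state how relative expression was computed from the RT-qPCR data. One widely used option is the 2^(−ΔΔCt) calculation, sketched below under the assumption of roughly 100% amplification efficiency for both the target and the reference gene. The Ct values and the function name are hypothetical.

```python
from statistics import mean


def fold_change_ddct(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates
    (for example, duplicate wells), which are averaged first.
    """
    d_treated = mean(target_treated) - mean(ref_treated)
    d_control = mean(target_control) - mean(ref_control)
    return 2.0 ** -(d_treated - d_control)


# Hypothetical Ct values: transgene vs. actin reference,
# in leaves carrying the expression vector vs. the vector control.
fold = fold_change_ddct(
    target_treated=[20.0, 20.0],
    ref_treated=[18.0, 18.0],
    target_control=[25.0, 25.0],
    ref_control=[18.0, 18.0],
)
print(f"fold change = {fold:.1f}")
```

If primer efficiencies differ appreciably from 100%, an efficiency-corrected calculation would be more appropriate than this simple form.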

The transient EGFP fluorescence was observed under a microscope as previously described (Maximova et al., 2003), and leaf areas with more than 80% EGFP fluorescence surface coverage were subjected to a pathogen infection assay (Mejía et al., 2012; Shi et al., 2013). The right side of each leaf segment, delineated by the midvein and positioned adaxial side up, was inoculated with 3 agar plugs of *Phytophthora capsici* mycelium as previously described (6 VC and 6 *Tc00g042540* biological replicates). Three sterile agar plugs were placed on the left side of each leaf segment and used as negative controls. Inoculated leaves were then incubated at 27°C under a 12:12 (Light: Dark) light cycle for 3 days before the evaluation of disease symptoms. Images were taken using a Nikon D90 digital camera, and the size of each lesion was measured using ImageJ software. Average lesion sizes were calculated from 18 replicates (6 leaf segments, 3 replicate inoculations per segment), and significance was determined by single factor ANOVA.
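The single factor ANOVA on lesion sizes reduces to an F statistic comparing between-group and within-group variance, sketched below. The lesion areas are hypothetical, and the sketch stops at the F statistic because the Python standard library has no F distribution; a statistics package would be needed for the p-value.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical lesion areas (cm^2): vector control vs. transgene.
vector_control = [1.9, 2.1, 2.4, 2.0, 2.2, 2.3]
transgene = [1.1, 1.3, 0.9, 1.2, 1.0, 1.4]
f_stat, df_b, df_w = one_way_anova_f(vector_control, transgene)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With two groups, this F statistic equals the square of the two-sample t statistic, so the test is equivalent to an equal-variance t-test.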

To assess the effects of increased expression of *Tc00g042540* on *P. capsici* DNA transcription (a proxy for pathogen metabolic activity and virulence), we used RT-qPCR to measure the ratio of *P. capsici* actin-coding DNA (*PcACT*) to cacao actin-coding DNA (*TcACT*), compared to controls (VC). Four lesions were collected as 1.4 cm × 1.4 cm squares surrounding the inoculation site, and genomic DNA was extracted using a TissueLyzer and the DNeasy plant mini kit (Qiagen, Cat# 51304). *P. capsici* actin A (F: GACAACGGCTCCGGTATGTGCAAGG and R: GTCAGCACACCACGCTTGGACTG) and *TcACT* (*Tc01g010900*) were used as pathogen and host targets. RT-qPCR was performed as previously described (Wang et al., 2011).

## **RESULTS**

## **ENDOPHYTE REGULATION OF HOST TRANSCRIPTION**

In all microarray experiments endophyte colonization was significantly higher in inoculated (E+) compared to un-inoculated (E−) leaves. See **Table 1** for description and summary of all inoculation experiments. Analyses of gene expression in the first microarray experiment (a 3347-unigenes microarray) identified a total of 193 differentially expressed *T. cacao* genes (107 up- and 86 down-regulated, roughly 9% of all genes on this microarray) in E+ leaves relative to E− leaves (Moderated *t*-test, *p* < 0.01 after Benjamini-Hochberg multiple testing correction, Supplementary Tables 3, 4).
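The Benjamini-Hochberg correction applied to the per-gene p-values can be sketched as follows. This is a generic implementation of the step-up procedure, not the code used in the study, and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from per-gene moderated t-tests.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

Genes whose adjusted value falls below the chosen threshold (0.01 in the first microarray experiment) are called differentially expressed, which controls the expected proportion of false discoveries among them.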

We classified 102 and 86 of the up- and down-regulated genes, respectively, using GO terms based on homology to *A. thaliana* (13 GO biological processes and 15 GO cellular components, methods, **Figures 1A,B**). A high proportion of the differentially expressed genes encode chloroplast proteins identified with photosynthetic function (**Figures 1A,B**) and there was a significant enrichment for GO terms "chlorophyll binding" (hypergeometric test, *p* < 0.05 after FDR). We performed RT-qPCR of eight unigenes (representing six genes) identified by this microarray analysis as being highly up-regulated by E+ treatments. Upregulation was confirmed in all cases (Supplementary Table 5) and the more precise RT-qPCR analysis determined increases ranging from 2.9 to 53.2 fold, usually higher than values estimated by microarray hybridization.
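The hypergeometric test for GO term enrichment asks how likely it is to see at least the observed number of annotated genes among the differentially expressed set by chance. A minimal sketch follows; the array size and number of differentially expressed genes are taken from the text above, while the counts of annotated genes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail probability P(X >= hits) when `sample` genes are drawn
    without replacement from `population` genes, of which `annotated`
    carry the GO term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# 3347 unigenes on the array and 193 differentially expressed genes (from
# the text); 60 annotated genes with 12 among the hits are hypothetical.
p_enrich = hypergeom_enrichment_p(
    population=3347, annotated=60, sample=193, hits=12
)
print(f"enrichment p = {p_enrich:.3g}")
```

Because many GO terms are tested at once, the resulting p-values would then be adjusted for multiple testing, as indicated by "after FDR" in the text.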

A second gene expression experiment using a subsequently developed 17,247-unigenes microarray provided a comparison with the first microarray experiment and identified additional *T. cacao* genes affected by *C. tropicale* inoculation. Here, we identified significant changes in the expression of 856 *T. cacao* genes (7.5% of the total), 433 up-regulated and 423 down-regulated, in E+ relative to E− leaves (moderated *t*-test, *p* < 0.05 after BH adjustment for multiple testing, Supplementary Tables 6, 7), and significant GO enrichment for several chloroplast terms (Supplementary Figure 1).

Functional analyses of genes based on GO and MapMan (Thimm et al., 2004) terms also indicated that many of the regulated genes in the first and second microarray experiments are involved in cellulose and lignin deposition and host cell wall hardening, nitrogen metabolism, photosynthesis, and biotic stress responses (Supplementary Tables 8, 9). The top 10 GO and MapMan gene categories with the most genes regulated by endophyte inoculation (i.e., protein, signaling, RNA, cell wall, photosynthesis, hormone metabolism, miscellaneous function, cell, stress, and genes not assigned to any function) were consistent between the first and second microarray experiments (**Figure 1**, **Table 2**).

We conducted the third microarray experiment (time course) to view the dynamics of gene expression changes in E+ relative to E− leaves over a period of two weeks. We found 142 genes with significant changes in expression through the entire time course and with fold changes in expression between time points ranging from 1 to 7.4 [*p* < 0.05 after false discovery rate (FDR) correction, Supplementary Table 10]. We observed that the initial host response was characterized largely by down-regulation, followed by up-regulation of host genes (**Figure 2**). We classified 112 of these genes according to GO terms based on homology to *A. thaliana* (14 GO biological processes and 16 GO cellular components) and found that the GO categories with the most genes affected by the endophyte, under both biological process and cellular component, were similar to those in the first and second microarray experiments (**Figure 2**).

#### **ENDOPHYTE REGULATION OF GENE EXPRESSION FOR DEFENSE**

Inoculation and successful host colonization by the endophyte, *C. tropicale*, cause significant up- and down-regulation of scores of host genes involved in defense against biotic stresses (i.e., pathogens and herbivores). Notably, genes affected were involved in the ethylene signaling and defense response pathways, as well as genes encoding signaling proteins (e.g., receptor kinases, Supplementary Table 8). Further, E+ inoculations also produced changes in the expression of genes associated with the synthesis, modification, and degradation of cell wall (e.g., proline rich proteins; Bradley et al., 1992); peroxidases and components of the jasmonic acid defense pathway; pathogenesis related proteins (e.g., PR4 protein); redox state proteins; genes coding beta glucanases (defense against pathogenic fungi), heat shock proteins, transcription factors, proteins involved in secondary metabolism and proteolysis; and other genes relevant to plant-microbe interactions such as NPR3, nodulin, and endochitinases (Supplementary Table 8).

## **ENDOPHYTE REGULATION OF TRANSCRIPTION OF HOST CELL WALL GENES**

Twenty of the genes that differed significantly between E+ and E− leaves in the first and second microarray experiments were associated with cell wall biogenesis (12 up-regulated and 8 down-regulated genes, Supplementary Table 8). In both of these microarray experiments, we observed significant up-regulation of a gene (*Tc01g035310*) coding for a putative proline rich protein (Supplementary Table 4) previously linked to hardening of cell walls (Bradley et al., 1992). We also observed two up-regulated putative tubulin genes (*TcTUA1* and *TcTUA5*). *TcTUA1* is similar to *PoptrTUA3* and *PoptrTUA5* found in *Populus tremuloides*, two Class I α tubulin genes shown to be abundant in xylem and preferentially expressed in wood-forming tissue (Supplementary Figure 2) (Oakley et al., 2007). Microtubules are well-known to be involved in cellulose biogenesis through linkages with the cellulose synthesis machinery.

Consistent with the microarray data, Phloroglucinol-HCL staining and Raman microscopy analyses indicated that the amount of lignin and cellulose was indeed significantly greater in E+ relative to E− leaves used in the first microarray experiment for genetic analyses (**Figures 3A,B**). The comparison of the mean intensity values for E+ and E− samples indicated a ∼23% increase of lignin content in the epidermal cells of E+ leaf samples (**Figure 3C**). For Raman microscopy, bands were assigned to cellulose or lignin components (Gierlinger and Schwanninger, 2006). Average spectra values revealed that epidermal leaf cells from inoculated plants contained ∼23% more lignin and ∼20% more cellulose (**Figures 3D–H**).

### **ENDOPHYTE DOWN-REGULATION OF CHLOROPLAST GENES AND REDUCED Amax**

Analysis of the first and second microarray experiments indicated that many chloroplast and photosynthesis-related genes were down-regulated in E+ plants (**Figure 1**, Supplementary Tables 4, 7). These included genes encoding PSAD-2 (photosystem I subunit D-2), PSBY (photosystem II PsbY protein), and RuBisCO small subunit 1A (RBCS-1A), among many others. Additionally, several chloroplast-related genes were also regulated by endophyte inoculation in the third microarray experiment (**Figure 2**). Consequently, we evaluated the effects of E+ treatments on maximum rates of host photosynthetic activity (Amax) and other photosynthetic parameters under growth chamber conditions in which entire plants were either E+ or E−.

Controlling for leaf age and time of day, E+ leaves maintained significantly lower light-saturated rates of photosynthesis (Amax) throughout the day (∼32%, **Figure 4**). There was no significant difference in *c*i/*c*a (the ratio of intercellular to ambient CO2 concentrations; **Figure 4**) determined by gas exchange (i.e., the term *c*i in traditional gas exchange experiments refers to the CO2 concentration in the substomatal cavities rather than to the concentration at the point of carboxylation in the chloroplast), indicating that the observed reductions in photosynthesis were not due to increased stomatal limitation of CO2 uptake. These results are consistent with the genetic data, suggesting that endophytes negatively affect photosynthetic activity by down-regulating many associated genes.

#### **ENDOPHYTE EFFECTS ON CARBON AND NITROGEN ISOTOPE RATIOS**

We also encountered endophyte-induced changes in expression of many genes involved in nitrogen metabolism. GO categorization of the significant genes in the first and second microarray experiments identified a total of 115 genes in the category "nitrogen compound metabolic" (61 up-regulated and 54 down-regulated, Supplementary Table 9), providing strong evidence for an effect of endophytes on host nitrogen metabolism. These genes were also classified in other nitrogen GO categories: cellular nitrogen compound biosynthetic process (23 genes), cellular nitrogen compound metabolic process (36 genes), and regulation of nitrogen compound metabolic process (52 genes). Key nitrogen metabolism genes affected by *C. tropicale* included glutamine synthetase 2 and glutamate synthase in both the first and second microarray experiments.

**FIGURE 3 | Endophyte inoculated** *T. cacao* **leaves (E+) have higher lignin content than un-inoculated (E−) leaves. (A)** Phloroglucinol-HCL stained E− leaf; **(B)** Phloroglucinol-HCL stained E+ leaf; **(C)** Comparing color intensity of the E− and E+ stained leaves (means ± SE). E+ leaves exhibited higher purple staining of the vascular tissue compared to E− leaves indicating ∼23% higher lignin accumulation in the E+ tissues. ∗Significance determined at *p* < 0.001. **(D–H)** Raman imaging and spectrometry reveal higher concentration of lignin and cellulose content in E+ epidermal leaf cells relative to E−. **(D–G)** Representative images of epidermal leaf cells: Bright yellow areas indicate high concentration of lignin and dark black regions indicate very low concentration of lignin; **(D)** Two-dimensional Raman image (false color) of lignin spatial distribution in the epidermal cells at 100×; **(E)** Bright field (white light) image of epidermal leaf cells at 40×; **(F)** Two-dimensional image of lignin spatial distribution in E− leaf cells at 40×; **(G)** Two-dimensional image of lignin spatial distribution in E+ leaf cells at 40×; **(H)** Spectral bands of lignin (1530 cm−1) and cellulose (1162 cm−1) (see Section Materials and Methods). These results were very similar to results from the Phloroglucinol-HCL staining assay.

(mean Amax ± SD) in endophyte *C. tropicale* inoculated (E+) cacao leaves was reduced compared to un-inoculated (E−) (left), Student's two-tailed *t*-test: *p* < 0.0001. The ratio of internal to ambient CO2 (mean ci/ca ± SD) was not affected (right).

Given that inoculation with *C. tropicale* influences photosynthetic properties (first phenotypic experiment) that correspond to the changes in expression of several photosynthesis-related genes (above), we conducted a second phenotypic experiment under greenhouse conditions to examine stable carbon isotope composition of E+ and E− plants. Based on the observed effects of endophyte inoculation on nitrogen metabolism, we also used this greenhouse experiment to examine stable nitrogen isotope composition in the same plants. In this experiment we controlled for among plant effects by comparing C and N isotope signatures for paired E+ and E− treated leaves of the same age on the same plants. We found that inoculation with *C. tropicale* significantly enriched both foliar δ13C and δ15N compared to un-inoculated controls (**Table 3**).
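The paired design described above, E+ and E− leaves of the same age on the same plant, is analyzed on within-plant differences. The test statistic for a paired t-test can be sketched as below; the δ values are hypothetical, and the p-value would come from a t distribution with n − 1 degrees of freedom, which requires a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(treated, control):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(treated) != len(control):
        raise ValueError("paired samples must have equal length")
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Hypothetical foliar delta15N values (per mil), one E+ and one E- leaf per plant.
e_plus_leaf = [2.9, 3.4, 3.1, 3.6, 3.0, 3.3]
e_minus_leaf = [2.5, 3.1, 2.6, 3.2, 2.7, 2.8]
t_stat, dof = paired_t(e_plus_leaf, e_minus_leaf)
print(f"t({dof}) = {t_stat:.2f}")
```

Pairing removes plant-to-plant variation from the comparison, so a consistent small shift within plants can be detected even when plants differ considerably from one another.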

#### **FUNCTIONAL ANALYSES OF A GENE INVOLVED IN HOST DEFENSES AND PLANT-FUNGAL SYMBIOSES**

The potential for defensive function of a specific gene up-regulated by endophyte treatments was evaluated, in the absence of direct endophyte inoculations, using transient transgene expression coupled with pathogen assays. A gene of previously unknown function (*Tc00g042540*) was among the most highly up-regulated cacao genes found in E+ treatments in the first microarray experiment. This gene had been annotated as a 21 kDa seed protein (Argout et al., 2011) and has a weak sequence similarity to a known trypsin inhibitor (functional prediction from Mercator and MapMan analyses; Lohse et al., 2014), thus suggesting a defensive function. This gene was cloned by PCR amplification from cacao genotype Scavina 6 and transiently over-expressed in cacao leaves from the same genotype. The transformed cacao leaves had less damage caused by the pathogen *P. capsici* than leaves transformed with a control vector (**Figure 5**). Thus, a *T. cacao* gene that is up-regulated by inoculation with *C. tropicale* (E+) can limit pathogen damage without endophytes or any endophyte-produced chemicals physically present in the host plant tissue.

## **DISCUSSION**

FEF are ubiquitous associates of healthy, apparently asymptomatic plants, in nature. Experiments have shown that host plants inoculated with endophytes are often more resistant to pathogen and herbivore damage (Arnold et al., 2003; Campanile et al., 2007; Mejia et al., 2008; Van Bael et al., 2009). In particular, we show that *T. cacao* seedlings inoculated (E+) and colonized with *C. tropicale,* the FEF species most commonly encountered in healthy *T. cacao* leaves, exhibit markedly different patterns of gene expression compared to un-inoculated leaves (E−). Across the microarray experiments, gene categories that show consistent effects of endophyte inoculations include: defense related (e.g., ethylene pathway, receptor kinases), cell wall development, chloroplast, and nitrogen metabolism (**Figure 1**). Further, these endophyte-induced effects on the expression of specific functional genes correspond to endophyte-induced changes in phenotypic expression in the host.

**Table 3 | Enrichment\* of Carbon 13 and Nitrogen 15 isotopes in** *T. cacao* **leaves inoculated with endophyte** *C. tropicale.*


*\*The observed enrichment of 15N and 13C isotopes in E+ plants occurs over relatively short periods (a few days to weeks), with enrichment increasing over time.*

*\*\*Paired t-test.*

*\*\*\*Wilcoxon signed rank test.*

Our genetic and experimental results indicate that inoculation of *T. cacao* with *C. tropicale* enhances the expression of large suites of host genes that are known to contribute to host defense against pathogen and herbivore attack. Specifically, the transient expression experiments show that over-expression of a gene (*Tc00g042540*) that is highly up-regulated in E+ treatments significantly decreased pathogen damage to its host. Importantly, at least in the case of this fungal species that dominates endophyte communities in healthy *T. cacao* (*C. tropicale*), increased host resistance is not due to any direct endophyte effect on pathogens (e.g., chemicals that endophytes might produce independently of the host; Mejia et al., 2008; Higginbotham et al., 2013), or even to the physical presence of the endophytes. Instead, this experiment shows that endophytes can affect pathogen damage indirectly, by inducing increased expression of host genes that are demonstrably effective in enhancing disease resistance. Additional studies are needed to assess the relative contribution of direct and indirect effects of different foliar endophyte species (both systemically and locally) on host resistance to different pathogens (Mejia et al., 2008; Adame-Álvarez et al., 2014). Based on previous studies (Herre et al., 2007; Mejia et al., 2008), we expect that direct, chemical effects will be more important in some endophyte species than others, and that the specific chemicals contributing to defense will vary among species. Ultimately, the composition of the entire endophyte community will in large part determine levels of host resistance to individual pathogen and herbivore species that vary in their sensitivities to host defense and different chemicals produced by different components of the endophyte community (Arnold et al., 2003; Herre et al., 2007; Mejia et al., 2008; Adame-Álvarez et al., 2014).

More generally, among the genes and pathways identified with disease resistance in *A. thaliana* and other host plants, *C. tropicale* inoculation causes the up-regulation of many key components of the ethylene defense pathway in *T. cacao*, as well as several signaling genes (e.g., receptor kinases). E+ plants exhibited relative up-regulation of the ethylene overproduction protein 1 (*ETO1*) and the ethylene forming enzyme (*EFE*, *ACC oxidase*), up- and down-regulation of two ethylene responsive element binding family protein genes (*EREBP*), and down-regulation of the EIN3-binding F-box protein 1 (*EBF1*). Among the primary targets of the ethylene defense are necrotrophic pathogens such as *P. palmivora* which causes less damage in E+ cacao leaves inoculated with *C. tropicale* and other endophytes (Arnold et al., 2003; Mejia et al., 2008; Dodds and Rathjen, 2010). Further, receptor kinases play an important role in detection of pathogenic and non-pathogenic microbes at the cellular surface or within the cell (Dodds and Rathjen, 2010). These results suggest that one of the effects of the endophytes is to prime the host plant's defensive responses to pathogens for increased early detection by receptor kinases at cellular surfaces and subsequent intracellular responses mediated by cytoplasmic kinases and the ethylene transduction pathway. Additional studies are needed to determine the degree to which different endophyte species up-regulate different receptor kinases and other signaling genes (potentially affecting which pathogens might be recognized, and which host defense pathways are preferentially induced).

Interestingly, EIN3 and other components of the ethylene transduction pathway are required for the root endophytic fungus *Piriformospora indica* to balance beneficial and non-beneficial traits in its symbiosis with *A. thaliana* (Camehl et al., 2010). The conspicuous induction of the ethylene pathway is also involved in other mutualistic plant microbial symbioses (Pieterse et al., 1998; Khatabi and Schäfer, 2012), among them mycorrhizal-root and rhizobia-nodule associations. Although it appears that many fungal and bacterial mutualists have co-opted the ethylene pathway as a key component of their interactions with their plant hosts, the specific ways in which pathogens and mutualistic endophytes differentially influence expression and function of the ethylene pathway remain to be defined.

Inoculation and colonization with *C. tropicale* increased the lignin (∼23%) and cellulose (∼20%) content of E+ leaves of *T. cacao*. Recent comparative studies showed that plant species with relatively high cellulose content and lamina density (mass per unit volume of leaf) also tend to exhibit high leaf fracture toughness, and that these two traits together correlate with reduced herbivory rates and increased leaf lifespan across tropical plant species (Coley, 1983; Kitajima et al., 2012). Previous experimental studies with other host plant species (*Cucumis sativus*, *Merremia umbellata*, and *T. cacao*) show that *C. tropicale* inoculations can reduce leaf damage by tropical generalist herbivores, specifically leaf-cutting ants (Van Bael et al., 2012; Estrada et al., 2013). While further experiments are needed, evidence suggests that endophyte-induced increases in lignin and cellulose in their hosts' cell walls (in combination with endophyte-produced chemicals; Estrada et al., 2013) could contribute to the defense against pathogens and herbivores.

The decrease (∼32%) in Amax with *C. tropicale* inoculation (**Figure 4**) is consistent with the observed down-regulation of many genes coding for proteins in the photosynthetic pathways. Further, the observed enrichment in foliar δ13C (∼2%, **Table 3**), the increased lignin and cellulose deposition in the cell wall, increased hyphal presence inside leaf tissues, and lack of *c*i/*c*a differences between E+ and E− leaves are also consistent with an increase in mesophyll resistance to CO2 diffusion. Increased mesophyll resistances (i.e., increased resistances to CO2 diffusion between the sub-stomatal cavity and the site of enzymatic fixation within the chloroplast) would decrease the rate of CO2 diffusion to the point of carboxylation within the chloroplast, limit the rate of photosynthesis, and reduce the enzymatic discrimination against the heavier carbon isotope. Additionally, E+ plants exhibited down-regulation of host genes that can promote photosynthetic function by eliminating reactive oxygen species that interfere with electron transport (e.g., ascorbate peroxidase 1, among others). Although more detailed studies of the mechanisms underlying the observed decreases in Amax and the increases in foliar δ13C in E+ plants are needed, endophytes clearly reduce the photosynthetic competence of their hosts.
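The link between mesophyll resistance and foliar δ13C can be made explicit with a commonly used simplified model of carbon isotope discrimination in C3 leaves. The model and the approximate coefficient values below are textbook assumptions brought in for illustration; they are not part of this study's methods.

$$
\Delta \approx a\,\frac{c_a - c_i}{c_a} + a_m\,\frac{c_i - c_c}{c_a} + b\,\frac{c_c}{c_a},
\qquad
\delta^{13}\mathrm{C}_{\text{leaf}} \approx \delta^{13}\mathrm{C}_{\text{air}} - \Delta
$$

Here $c_a$, $c_i$, and $c_c$ are the CO2 concentrations in ambient air, in the sub-stomatal cavity, and at the site of carboxylation in the chloroplast; $a \approx 4.4$‰ is the fractionation during diffusion through stomata, $a_m \approx 1.8$‰ is the combined fractionation during dissolution and liquid-phase diffusion, and $b \approx 29$‰ is the fractionation by carboxylation. If $c_i/c_a$ is unchanged while mesophyll resistance rises, $c_c/c_a$ falls, weight shifts from the large term $b$ to the small term $a_m$, $\Delta$ decreases, and leaf δ13C becomes less negative, which is the direction of the shift reported for E+ leaves.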

The consistent enrichment of the foliar isotope ratio of nitrogen in the presence of endophytes observed in this study was perhaps surprising. Foliar nitrogen isotopes are thought to be a reflection of the soil solution δ15N (a consistent value in the controlled greenhouse conditions of this study) modified by within-plant fractionations that are thought to be relatively constant for a given plant species (Evans, 2001; Craine et al., 2009). The 15N enrichment in E+ plants observed in this study (0.3–0.5 δ15N, **Table 3**) suggests either that endophytes alter the processes of nitrogen uptake into and/or reduction within host plants, or that endophytes increase the preferential loss of 14N from the leaf tissue, or both. Plant leaves are known to produce significant amounts of nitrogen trace gases (Sparks, 2009), and it is possible that increases in gaseous losses could lead to enrichment of the heavy N isotope in foliar tissue. Further, if endophytes utilize nitrogen from the host through heterotrophy of host products or tissues (as seems likely), then nitrogen could cycle between endophyte and host, experience fractionating losses, and enrich 15N in leaf tissue over time (see **Table 3**). The ultimate cause of enriched foliar δ15N in the presence of endophytes observed in this study is unclear. Nonetheless, endophytes appear to fundamentally alter nitrogen metabolism in their host. Detailed studies are needed of the specific nitrogen metabolism genes that are influenced by E+ treatments and the degree to which endophyte heterotrophy of host tissues contributes to the significant shifts in δ15N.

### **CONCLUSION**

Several experiments show that both foliar and other types of endophytes affect many characters that are usually thought of as "plant" characters (e.g., ability to resist either biotic or abiotic stresses, physiological responses, chemical and even stable isotope composition, and genetic composition and expression). Foliar endophytes can benefit their hosts through increased resistance to pathogen and herbivore damage. This study identifies several likely genetic and physiological mechanisms that can work alone or in combination to produce these effects. Importantly, in the case of this ecologically relevant, dominant endophyte species, we demonstrated that increased resistance to pathogen damage can occur in the absence of any direct endophyte treatment or any chemical that they produce. Increased expression of host genes that are themselves up-regulated as part of the host response to endophytes appears to be responsible. Yet these and other experiments also show clear costs of endophytes: reduced photosynthetic capacity and altered nitrogen metabolism. It appears that in a pathogen- and herbivore-free world, foliar endophytes would represent a net loss to the host, but that they in fact provide a net benefit that increases as pathogen and herbivore pressure increases. Further, we note that asymptomatic endophytic fungi are often close relatives of fungal pathogens (Freeman and Rodriguez, 1993; Mejia et al., 2008). Ultimately, a detailed genetic understanding of what makes an endophyte a neutral or even a beneficial symbiont is needed in the context of what makes some congeneric species pathogens (Freeman and Rodriguez, 1993).

Studies of genotypic and phenotypic expression in plants (e.g., defense, physiology, etc.) usually do not control for, or even consider, the potential effects of endophyte colonization. This and other studies have shown that, over the course of even a few days, the effects of foliar endophytes on host genetic and phenotypic expression can be rapid and large [e.g., marked reductions in pathogen and herbivore damage (Craine et al., 2009; Rodriguez et al., 2009; Friesen et al., 2011; Van Bael et al., 2012; Estrada et al., 2013); ∼23% increase in leaf lignin content, ∼20% increase in cellulose; ∼32% reduction in Amax; shifts in carbon (∼2%) and nitrogen (∼9%) isotopic signatures]. Studies of host plant genetic and phenotypic expression should be designed and interpreted with potential endophyte effects on their host clearly in mind (Arnold et al., 2003; Bailey et al., 2006; Herre et al., 2007; Mejia et al., 2008). We have shown the multiple effects that one dominant foliar endophyte species can have on its host's genetic and phenotypic expression. Plants normally harbor many endophyte species (e.g., foliar fungal or bacterial endophytes, arbuscular mycorrhizae, or nitrogen fixing bacteria, etc.), and studies of their separate and combined effects on host genetic and phenotypic expression are needed.

## **AUTHOR NOTE**

Recent published and in press work suggests similar patterns and mechanisms are responsible for symbiont-conferred protection against natural enemies across both plants and insects (see Douglas, 2014; Gerardo and Parker, 2014).

### **ACKNOWLEDGMENTS**

This study was supported by the Smithsonian Tropical Research Institute, the STRI Tupper Postdoctoral Fellowship, Kari Wenger, Bill Robertson and the Mellon Foundation, the American Cocoa Research Institute Endowed Program in the Molecular Biology of Cocoa at The Pennsylvania State University, and the Clapperton Award from Mars, UK. We thank Mars, UK for their generous sponsorship of the oligo synthesis for the *T. cacao* 3k microarray. We thank Siti Zulkafli, Sharon Pishak, Ann Young, Craig Praul, Deborah Grove, and the staff at the Genomics Core Facility, PSU. We thank Orlando Lozada for providing seeds, Damond Kyllo, Luis Ramirez, Terri Shirshac, Enith Rojas, and Carlos Aguilar for assistance growing the plants, culturing and inoculating the fungi, and conducting the physiological experiments. Merlin Sheldrake, Ron Herzig, Betsy Arnold, Nancy Knowlton, Daniel Wilson, Emma Sayer, Suzanne Ford, Truman Young, Richard Stone, Jan Sapp, Martin Heil and two anonymous reviewers provided useful discussion and/or comments on the manuscript.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2014.00479/abstract

#### **REFERENCES**


reduces leaf necrosis caused by *Phytophthora* spp," in *Small Wonders: Peptides for Disease Control* (Washington, DC: American Chemical Society), 379–395.


metabolic pathways and other biological processes. *Plant J.* 37, 914–939. doi: 10.1111/j.1365-313X.2004.02016.x


Weir, B. S., Johnston, P. R., and Damm, U. (2012). The Colletotrichum gloeosporioides species complex. *Stud. Mycol.* 73, 115–180. doi: 10.3114/sim0011

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 May 2014; paper pending published: 24 July 2014; accepted: 25 August 2014; published online: 12 September 2014.*

*Citation: Mejía LC, Herre EA, Sparks JP, Winter K, García MN, Van Bael SA, Stitt J, Shi Z, Zhang Y, Guiltinan MJ and Maximova SN (2014) Pervasive effects of a dominant foliar endophytic fungus on host genetic and phenotypic expression in a tropical tree. Front. Microbiol. 5:479. doi: 10.3389/fmicb.2014.00479*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Mejía, Herre, Sparks, Winter, García, Van Bael, Stitt, Shi, Zhang, Guiltinan and Maximova. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 09 December 2014 doi: 10.3389/fmicb.2014.00593

## Microbial experimental evolution as a novel research approach in the Vibrionaceae and squid-Vibrio symbiosis

## *William Soto<sup>1</sup> and Michele K. Nishiguchi<sup>2</sup>\**

<sup>1</sup> BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA <sup>2</sup> Department of Biology, New Mexico State University, Las Cruces, NM, USA

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Natacha Kremer, Université Claude Bernard Lyon 1, France Catherine Masson-Boivin, Institut National de la Recherche Agronomique, France

#### *\*Correspondence:*

Michele K. Nishiguchi, Department of Biology, New Mexico State University, Box 30001, MSC 3AF, Foster Hall- Horseshoe and Sweet Streets, Las Cruces, NM 88003-8001, USA e-mail: nish@nmsu.edu

The Vibrionaceae are a genetically and metabolically diverse family living in aquatic habitats with a great propensity toward developing interactions with eukaryotic microbial and multicellular hosts (as commensals, pathogens, or mutualists). The Vibrionaceae frequently possess a life history cycle in which bacteria are attached to a host in one phase and free from their host in another, as either part of the bacterioplankton or adhered to solid substrates such as marine sediment, riverbeds, lakebeds, or floating particulate debris. These two stages in their life history exert quite distinct and separate selection pressures. When bound to solid substrates or to host cells, the Vibrionaceae can also exist as complex biofilms. The association between bioluminescent *Vibrio* spp. and sepiolid squids (Cephalopoda: Sepiolidae) is an experimentally tractable model to study bacteria and animal host interactions, since the symbionts and squid hosts can be maintained in the laboratory independently of one another. The bacteria can be grown in pure culture and the squid hosts raised gnotobiotically with sterile light organs. The partnership between free-living *Vibrio* symbionts and axenic squid hatchlings emerging from eggs must be renewed every generation of the cephalopod host. Thus, symbiotic bacteria and animal host can each be studied alone and together in union. Despite the virtues provided by the Vibrionaceae and the sepiolid squid-*Vibrio* symbiosis, these assets to evolutionary biology have yet to be fully utilized for microbial experimental evolution. Experimental evolution studies already completed are reviewed, along with exploratory topics for future study.

**Keywords:** *Vibrio*, sepiolid squid, cospeciation, experimental evolution, environmental transmission

## **THE VIBRIONACEAE**

The family Vibrionaceae (Domain Bacteria, Phylum Proteobacteria, Class Gammaproteobacteria) encompasses gram-negative chemoorganotrophs that are mostly motile and possess at least one polar flagellum (Farmer III and Janda, 2005; Thompson and Swings, 2006), although the gut symbiont *Vibrio halioticoli* of the abalone *Haliotis discus hannai* has been described as non-motile (Sawabe et al., 1998). Vibrionaceae are facultative anaerobes, having both respiratory (aerobic and anaerobic) and fermentative metabolisms. Nitrogen fixation and phototrophy have both been reported (Criminger et al., 2007; Wang et al., 2012). Agarases and alginases have been noted from *Vibrio* (Fu and Kim, 2010; Dalia et al., 2014). Most cells are oxidase positive, with dimensions of 1 μm in width and 2–3 μm in length. Sodium cations are a requirement for growth and survival, but *Vibrio cholerae* and *V. mimicus* are unusually tolerant to low sodium waters. Most species are susceptible to the vibriostatic agent O/129 (Thompson and Swings, 2006). Vibrionaceae are ubiquitously distributed throughout aquatic habitats, including freshwater, brackish, and marine waters (Madigan and Martinko, 2006). Vibrionaceae have been isolated from rivers, estuaries, lakes, coastal and pelagic oceanic waters, the deep sea, and saltern ponds (Urakawa and Rivera, 2006). Vibrionaceae can also be microbial residents of aquatic animals as commensals, pathogens, or mutualists (Soto et al., 2010). Bacteria may exist as planktonic free-living cells or as biofilms attached to solid substrates present in sediments of aquatic habitats or alternatively adhered to floating particulate matter or debris. Vibrionaceae may also form biofilms on the surfaces of animal, algal/phytoplanktonic, protoctistal, or fungal hosts the cells colonize, as this prokaryotic family is quite able to initiate and establish vigorous biofilms on eukaryotic cells and chitin surfaces (e.g., invertebrate exoskeletons and fungal cell walls; Polz et al., 2006; Pruzzo et al., 2008; Soto et al., 2014). Vibrionaceae have also been found to be intracellular inhabitants of eukaryotic microorganisms (Abd et al., 2007). Although as many as eight genera have been assigned to the Vibrionaceae, the two most speciose are *Vibrio* and *Photobacterium* (Thompson and Swings, 2006). *Salinivibrio* possesses an unusual ability to grow in a wide range of salinity (0–20% NaCl) and temperature (5–50 °C; Ventosa, 2005; Bartlett, 2006). Numerous species in the Vibrionaceae are pathogenic and cause disease in aquatic animals and humans (Farmer III et al., 2005), *V. cholerae* being the most notorious example as the causative agent of cholera (Colwell, 2006). *V. vulnificus* and *V. parahaemolyticus* can also cause severe illnesses in humans as a result of consuming contaminated seafood (Hulsmann et al., 2003; Wong and Wang, 2004). Furthermore, every year *V. harveyi* (Owens and Busico-Salcedo, 2006), *V. anguillarum* (Miyamoto and Eguchi, 1997; Crosa et al., 2006), and *V. parahaemolyticus* (Austin, 2006) cause substantial economic losses to the aquaculture industry worldwide. The genera *Vibrio* and *Photobacterium* include opportunistic pathogens capable of infecting marine animals and humans, and are able to enter preexisting wounds or body openings of especially susceptible hosts that are already ill, stressed, fatigued, or immunocompromised (Urbanczyk et al., 2011). Given the heightened ability of Vibrionaceae to cement themselves to eukaryotic cells through peptide and polysaccharide modification of their exopolysaccharide, lipopolysaccharide, and capsules (Sozhamannan and Yildiz, 2011), the lack of additional human pathogens is curious. Perhaps the reason is that foreign extracellular protein and polysaccharide are the same materials the mammalian immune system specifically targets, neutralizes, and removes as non-self antigens with exquisite capacity (Owen et al., 2013). Vibrionaceae have also been recently investigated for the development of probiotics, antimicrobials, and pharmaceutical drugs with potential clinical and economic value for veterinary medicine, animal husbandry, aquaculture, and human health, including molecules antagonistic toward cancer cells, fungi, algae, protoctists (a term frequently preferred over protist or protozoan), bacteria, and viruses. Metabolites produced by the Vibrionaceae have also been found to have quorum sensing-disrupting properties against other bacteria, which may open an entire horizon for the advancement of "quorum sensing" antibiotics (i.e., quorum quenching) (Gatesoupe, 1999; Mansson et al., 2011).

## **MICROBIAL EXPERIMENTAL EVOLUTION**

Conventional evolutionary studies seeking to understand adaptation and speciation implement the comparative or historical approach (e.g., phylogenetics). This approach compares organisms from different environments and attempts to understand the evolutionary processes that may have produced the current distributions and adaptations of descendent populations from ancestral ones (Bennett, 2002). Since this methodology generates informed explanations based on extant organisms retrospectively and with hindsight, it naturally must make numerous assumptions on the evolutionary relationships of the organisms under study and their likely mode of evolution, even when fossil data are available. Experimental evolution, however, allows one to begin with an ancestral population and empirically observe the adaptation and radiation that result in the descendent lineages under different selective regimens. Experimental evolution studies can be implemented under controlled and reproducible conditions to study evolution, usually in the laboratory and on model organisms. Fewer assumptions about environmental conditions, the selection pressures involved, or the ancestral and evolving populations are necessary, since there is more control by the investigator (Bennett, 2002). Experimental evolution permits tractability for the study of evolutionary biology by allowing experiments to be manipulated and repeated with replication (Lenski, 1995; Bennett, 2002). Bacteria, including *Vibrio*, are ideal organisms for such studies. For instance, these organisms have short generation times which allow evolution and adaptation to be observable on a human time scale (Lenski, 1995). Microorganisms also usually possess the advantage of achieving large population sizes (>1 × 10<sup>9</sup> cells/mL in liquid culture) in the environments for which experimental evolution studies are executed, providing ample opportunity for rare beneficial mutations to arise and reach fixation by natural selection. Deleterious mutations are likely to become extinct via purifying selection, since evolution by genetic drift is negligible in huge gene pools. Moreover, a "frozen fossil record" can be generated with bacteria by storing evolving lineages at different evolutionary time points in a −80 °C freezer. Hence, one can later compare relative fitness of the ancestral clone with a derived one in novel or ancestral environments (Bennett, 2002; Lenski, 2002). As a result, evolutionary tradeoffs can be measured during the course of adaptation in the novel environment. The −80 °C fossil record also permits the determination of the evolutionary episode in which a novel adaptive trait first evolved. Likewise, evolution may be "replayed" from various time points to see if subsequent outcomes are contingent on prior genetic changes or previously modified traits (Kawecki et al., 2012; Barrick and Lenski, 2013). Finally, the ancestral and derived bacteria can subsequently be analyzed to observe the exact genetic changes that have occurred and which specific ones are responsible for novel adaptive traits (Lenski, 1995; Bennett, 2002). Experimental evolution is the only direct method for studying adaptation and the genetic changes responsible, which complements genetic, physiological, biochemical, and phylogenetic approaches.
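The comparison of ancestral and derived clones described above can be illustrated with a small calculation. The sketch below is not taken from the studies reviewed here; it assumes one widely used definition of relative fitness, namely the ratio of the realized net growth (ln of final over initial density) of the derived clone to that of the ancestor when the two are grown together in a competition assay. The function names and the cell densities in the usage note are hypothetical.

```python
import math


def malthusian_parameter(n_initial: float, n_final: float) -> float:
    """Realized net growth over one competition assay: ln(N_final / N_initial)."""
    if n_initial <= 0 or n_final <= 0:
        raise ValueError("cell densities must be positive")
    return math.log(n_final / n_initial)


def relative_fitness(
    derived_initial: float,
    derived_final: float,
    ancestor_initial: float,
    ancestor_final: float,
) -> float:
    """Ratio of derived to ancestral net growth (assumed definition).

    A value above 1 means the derived clone outgrew the ancestor in this
    environment; a value below 1 means it grew less.
    """
    m_derived = malthusian_parameter(derived_initial, derived_final)
    m_ancestor = malthusian_parameter(ancestor_initial, ancestor_final)
    if m_ancestor == 0:
        raise ValueError("ancestor showed no net growth; ratio is undefined")
    return m_derived / m_ancestor
```

As a usage example with made-up densities (cells/mL), `relative_fitness(1e5, 1e8, 1e5, 1e7)` gives a value of about 1.5, since the derived clone increased 1000-fold while the ancestor increased 100-fold. Running the same assay in the ancestral environment and comparing the two values is one way a tradeoff would show up.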

## **ATTENUATION AND VACCINE DEVELOPMENT WITH VIBRIOS: INSIGHTS FOR MICROBIAL EXPERIMENTAL EVOLUTION**

Microbial experimental evolution is a thrilling sub-discipline of evolutionary biology which has risen in the last twenty to thirty years to address diverse issues (Soto et al., 2010; Conrad et al., 2011). Although initial work largely began with *Escherichia coli* (Lenski et al., 1991), the inclusion of other microbial species has continued to grow. However, despite a few exceptions (Schuster et al., 2010; Soto et al., 2012, 2014), surprisingly little work has been completed to date with members of the Vibrionaceae. Considering the Vibrionaceae possess colossal metabolic, biochemical, ecological, and genetic diversity, the general absence of this bacterial family as an established model in microbial experimental evolution is an oversight. Nonetheless, classical efforts to attenuate pathogenic bacteria for human vaccine development were endeavors analogous to experimental evolution (Kawecki et al., 2012). Virulent bacterial isolates would be repeatedly subcultured under laboratory conditions on growth medium, in tissue/cell culture, or in animal models to introduce random deleterious mutations in the microorganism under study. Alternatively, the microbe would be continuously subjected to chemical or physical mutagens (e.g., ultraviolet light). The exact mutations that occurred and the loci undergoing genetic changes were frequently unknown initially, and attempts at their identification came only later with additional research (Frey, 2007). For *V. cholerae*, nitrosoguanidine frequently served as a chemical mutagen to induce several attenuating mutations, including auxotrophy (Bhaskaran and Sinha, 1967; Baselski et al., 1978). Although attenuation by random mutagenesis yielded some products that demonstrated promising results in animal models and humans, this approach is less common today (Frey, 2007).
The construction of attenuated vibrios containing large targeted deletions of loci known to contribute to virulence is currently more desirable, since microbial reversion to pathogenicity is deemed less probable through this practice. Side effects are also a concern (Honda and Finkelstein, 1979; Cameron and Mekalanos, 2011). *V. cholerae* attenuation by the continual introduction of random mutations, resulting in numerous deleterious genetic lesions across many loci, frequently fails to sufficiently incapacitate virulence (Cameron and Mekalanos, 2011), as the microbe finds alternative ways to thrive and persist in the human host. An evolutionary conclusion coming from vaccine work with *V. cholerae* is that numerous ways of making a successful living exist in a potential host for the genus *Vibrio*; many potential niches exist, as evidenced by the continued ability of *V. cholerae* to initiate successful and alternative symptomatic infections (e.g., reactogenicity) despite the introduction of several deleterious mutations into its genome. An implication of this observation is that vibrios are evolutionarily versatile for host colonization and proliferation. For instance, medical reports exist of *V. cholerae*'s ability to initiate bacteremia, malaise, fever, chills, and skin lesions in humans, even in the absence of a gastrointestinal infection (Ninin et al., 2000). Such symptoms are more typically characteristic of *V. vulnificus* infections and raise the interesting question of whether there may be common virulence factors in *V. cholerae* and other pathogenic vibrios which are overshadowed by the exuberant effect of choleragen. More broadly, determinants and mechanisms responsible for the colonization of host animals (or attachment to eukaryotic cells) by vibrios may possess overlap across diverse interactions (e.g., commensalism, pathogenesis, and mutualism; Hentschel et al., 2000). 
Hence, microbial selection experiments with vibrios have potential to provide novel insights into evolution of the varied interactions the genus *Vibrio* possesses with its hosts, and vibrio vaccine research is a great repository of information and useful starting point to ask scientific questions, construct hypotheses, and to find focus topics for real world applications and practical value.

## **SEPIOLID SQUID-***VIBRIO* **SYMBIOSIS: A CASE STUDY FOR MICROBIAL EXPERIMENTAL EVOLUTION WITH THE VIBRIONACEAE**

As mentioned previously, many members of the Vibrionaceae are able to form associations with eukaryotic hosts, including phytoplankton, protoctists, algae, aquatic fungi, invertebrates, fishes, and aquatic mammals, which may range from harmful to neutral to beneficial for the host (Soto et al., 2010; Urbanczyk et al., 2011). One particular mutualistic interaction is the partnership between marine bioluminescent *Vibrio* and sepiolid squid. The sepiolid squid-*Vibrio* symbiosis has been a model system for studying developmental biology, immunology, physiology, and molecular biology underpinning interactions between bacteria and animals for over two decades (McFall-Ngai and Ruby, 1991), since both partners can easily be maintained in the laboratory independently of each other. Sepiolid and loliginid squids (**Figure 1A**) are colonized by bioluminescent *Vibrio* (Fidopiastis et al., 1998; Guerrero-Ferreira and Nishiguchi, 2007). The bioluminescent bacteria inhabit a morphological structure called the light organ (**Figure 1B**) within the squid mantle cavity and benefit from their association with the cephalopod host by inhabiting a microenvironment rich in nutrients relative to the oceanic water column. The squid hosts prosper from the presence of bioluminescent bacteria by utilizing the light produced for a cryptic behavior called counterillumination (Jones and Nishiguchi, 2004; **Figure 1C**). Female squid fertilized by males lay their eggs on solid substrates such as rocks, where the embryos develop. Since female sepiolid squids are sequential egg layers, they can produce several clutches over 4–5 months after sexual maturity, with each clutch containing 50–500 eggs (Moltschaniwskyj, 2004). Axenic squid hatchlings emerge from their eggs (usually at twilight or at night) with sterile light organs, which are colonized within a few hours by specific free-living bioluminescent *Vibrio* present in the ocean (Soto et al., 2012). The colonizing bacteria quickly reproduce to fill and occupy the light organ crypt spaces (i.e., lumina, **Figure 1D**). Daily at dawn 90–95% of the light organ symbionts are vented exteriorly to the ocean by the squid host prior to burying in the sand. The remaining bacterial fraction in the squid host re-grows throughout the day to reinstate a complete light organ population by sunset (Soto et al., 2012). At dusk the squid emerge from the sand to engage in their nocturnal activity, including foraging and mating. [More detailed and comprehensive information can be found in recent reviews (Dunn, 2012; Stabb and Visick, 2013)]. Since bioluminescent symbionts can be grown in pure culture, cryopreserved with possible subsequent resuscitation, genetically manipulated and analyzed, and used to inoculate recently hatched gnotobiotic squid juveniles, the sepiolid squid-*Vibrio* mutualism is a promising prospect for experimental evolution studies aiming to understand symbioses.
The juvenile squids hatch without their *Vibrio* symbionts, making it possible to infect the juveniles with any strain of *Vibrio* bacteria to examine colonization rates, ability to colonize, and persistence. Additionally, these bacteria can be used in competition experiments, which allows one to test different wild type strains against one another, mutant strains against their original wild type strain or versus other mutants, and experimentally evolved strains against the original ancestor (Nishiguchi et al., 1998; Nishiguchi, 2000, 2002; Soto et al., 2012).
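The daily venting and regrowth cycle described above implies a minimum number of bacterial doublings inside the light organ each day. The following sketch works through that arithmetic under simplifying assumptions that are not stated in the text: regrowth is by binary fission only, no cells die between venting events, and the population returns to exactly its pre-venting size. The function name is hypothetical.

```python
import math


def doublings_to_refill(fraction_vented: float) -> float:
    """Doublings needed for the cells left after venting to restore the
    pre-venting population, assuming pure binary fission and no cell death."""
    if not 0 <= fraction_vented < 1:
        raise ValueError("fraction_vented must be in [0, 1)")
    fraction_remaining = 1.0 - fraction_vented
    return math.log2(1.0 / fraction_remaining)


# Range quoted in the text: 90-95% of symbionts vented at dawn.
for vented in (0.90, 0.95):
    print(f"{vented:.0%} vented -> {doublings_to_refill(vented):.2f} doublings")
```

Under these assumptions, venting 90% leaves one tenth of the population, which needs log2(10), or a little over three doublings, to recover; venting 95% needs log2(20), or a little over four. Real light organ populations need not follow this idealized scheme.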

Nevertheless, this mutualism has only recently been tapped as a resource for microbial experimental evolution studies (Schuster et al., 2010; Soto et al., 2012, 2014). Early work has shown *V. fischeri* are able to adapt to a novel squid host within 400 generations (**Table 1**), and such evolution may create tradeoffs in the ancestral animal host environment or in the free-living phase as a physiological correlated response to an important abiotic factor (Soto et al., 2012). Two sepiolid squid genera, *Euprymna* and *Sepiola*, are in the same taxonomic family. Several different *Euprymna* species are distributed allopatrically throughout the Indo-West Pacific Ocean, while numerous *Sepiola* species co-occur sympatrically in the Mediterranean Sea (Nishiguchi et al., 1998; Nishiguchi, 2000, 2002; Soto et al., 2009). *Vibrio* symbionts colonizing *Euprymna* are host specialists and outcompete allochthonous isolates, a phenomenon termed competitive dominance, while those colonizing *Sepiola* are host generalists. *Vibrio* symbionts display no competitive dominance within *Sepiola* (Nishiguchi et al., 1998; Nishiguchi, 2000, 2002; Wollenberg and Ruby, 2012). Despite the presence of competitive dominance, data from population genetics and phylogenetics suggested secondary colonization events have occurred (Nishiguchi and Nair, 2003; Jones et al., 2006; Urbanczyk et al., 2011), creating a puzzling conundrum for years. Population genetics surveys fueled this enigma by consistently observing high levels of genetic diversity within the squid light organ (Jones et al., 2006), indicating light organ populations are not dominated by single or few genotypes through space and evolutionary time, an observation not consistent with competitive dominance. Competitive dominance results from squid host specialization by the symbionts, which should presumably purge genetic diversity of *V. fischeri* populations inside light organs. Microbial experimental evolution, complemented by a temporal population genetics survey spanning a decade (about 20,000 *V. fischeri* generations of evolution within the squid host), helped resolve these paradoxes by revealing that the same evolutionary forces begetting competitive dominance were also responsible for driving *V. fischeri* genetic and phenotypic diversity within the squid light organ (Soto et al., 2012, 2014). *V. fischeri* indigenous to the Hawaiian bobtail squid (*E. scolopes*) was serially transferred for 500 generations through the Australian dumpling squid (*E. tasmanica*), a novel host (Soto et al., 2012; **Figure 2**).

**FIGURE 1 | The sepiolid squid-***Vibrio* **symbiosis. (A)** Representative species from the families Sepiolidae and Loliginidae (clockwise from upper left): *Euprymna tasmanica* (Sepiolidae), *E. scolopes* (Sepiolidae), *Photololigo noctiluca* (Loliginidae), and *Sepiola affinis* (Sepiolidae). **(B)** Ventral dissection of *E. scolopes*, showing the bilobed light organ surrounded by the ink sac. Bar = 0.5 cm. **(C)** Diagram of how the light organ operates under different phases of the moon. The progressive decrease in shading from left to right symbolizes increased illumination by the light organ. **(D)** A transmission electron micrograph of an area of the epithelium-lined crypts containing symbiotic bacteria: (n) = nucleus of squid cell, (b) = bacteria in crypts (bar = 10 μm). Photo credits: Mark Norman, Mattias Oremstedt (Kahikai), M. K. Nishiguchi, R. Young, S. Nyholm, R. Long, M. Montgomery. Light organ illustration by Robert Long-Nearsight graphics.

**Table 1 | Competitive colonization experiments between** *Vibrio fischeri* **strains ES114 (ancestor) and JRM200 (derived) at different evolutionary time points in the novel squid host** *Euprymna tasmanica***.**

<sup>1</sup>Significantly different, two-tailed *t*-test and sign test (*P* < 0.05, α = 0.05).

Results demonstrated that, as *V. fischeri* adapted to *E. tasmanica*, the ability of the derived lines to grow along a salinity gradient significantly changed relative to the ancestor. Moreover, no obvious pattern to the growth changes was evident across the salinity continuum, suggesting *V. fischeri* microbial physiology had been "randomized." Salinity is known to impact *Vibrio* population levels and distributions worldwide (Soto et al., 2009). *V. fischeri* subjected to novel host evolution created polymorphic reaction norms for salinity, an abiotic factor integral to shaping symbiont ecology during the free-living phase. Furthermore, experiments indicated a "superior numbers" or a "running start" advantage to foreign strains over native ones in animal host colonization that could outflank competitive dominance. Thus, *V. fischeri* strains most abundant (perhaps due to salinity) during the free-living phase where squid hosts resided were the ones most likely to colonize the cephalopod, not strains best adapted to the squid (Soto et al., 2012). A similar process may occur with *Photobacterium* in fish hosts due to temperature (Urbanczyk et al., 2011). Additionally, the *V. fischeri* lines serially passaged through *E. tasmanica* surged in biofilm formation and bioluminescence but lessened in motility (Soto et al., 2014). Increases and decreases in the utilization of select carbon sources also transpired. Interestingly, evolutionary differentiation occurred in the derived lines relative to the ancestor and to each other for biofilm formation, motility, bioluminescence, and carbon source metabolism, results consistent when compared to *V. fischeri* wild isolates obtained from light organs of *E. scolopes* and *E. tasmanica* specimens collected in the field (Soto et al., 2014). Squid host specialization by the symbionts promotes competitive dominance and diversifying selection. Perhaps clonal interference prevents selective sweeps in the squid light organ. The lineages serially transferred through *E. tasmanica* also exhibited decreased levels of bioluminescence in the ancestral host *E. scolopes* (Soto et al., 2012). In an independent study, *V. fischeri* strains previously incapable of establishing a persistent association (chronic infection) with sepiolid squids were shown to be capable of doing so after serial passage in *E. scolopes* (Schuster et al., 2010). Since *V. fischeri* possesses a life history where bacteria are cyclically associated with an animal host (sepiolid squids and monocentrid fishes) and then outside the host as free-living bacteria in the ocean, researchers can use microbial selection experiments with *V. fischeri* to simultaneously study symbiosis evolution and microbial evolution in the natural environment where microbes are not partnered to a host. [Some recent work suggests *V. fischeri* may also be a bioluminescent symbiont in the light organs of fishes belonging to the taxonomical families Moridae and Macrouridae (Urbanczyk et al., 2011).] Additionally, *V. fischeri* strains exist which are completely unable to colonize the light organs of sepiolid squids and monocentrid fishes, permitting evolutionary biologists to study a continuum of interactions between a microbe and animal host when studying the squid-*Vibrio* mutualism. Given that the Sepiolidae is a diverse family of squids that includes allopatric and sympatric species distributions, testing whether host speciation affects selection for host specialist versus host generalist evolutionary strategies within *Vibrio* symbionts is possible.

(CAM). This strain was passaged through a non-native host for 500

#### **TYPE STRAIN MENTALITY AND OTHER BIOLUMINESCENT SYMBIONTS FOR SEPIOLID SQUIDS**

Early work characterizing the molecular biology of *V. fischeri* colonizing *Euprymna* squid focused on the strain *V. fischeri* ES114 and the host *E. scolopes* (with occasional studies in *Sepiola*), since only the Hawaiian squid host was routinely available (McFall-Ngai and Ruby, 1991; Fidopiastis et al., 1998). Furthermore, reductionism was desired to understand the fundamentals of the symbiosis. Nonetheless, caution is warranted to avoid development of a "type strain" or "type host" mentality. Recent work has expanded to regularly include other strains of *V. fischeri* and *Euprymna* species (Ariyakumar and Nishiguchi, 2009; Chavez-Dozal et al., 2012; Soto et al., 2012). This will aid in distinguishing more general results from those that are specific to a particular symbiont strain or host species. In addition, initial characterization of the sepiolid squid-*Vibrio* symbiosis described *V. fischeri* as the only bioluminescent symbiont present in the squid light organ (McFall-Ngai and Ruby, 1991). Subsequently, *V. logei* was discovered as a symbiont in the genus *Sepiola* (Fidopiastis et al., 1998; Nishiguchi, 2000). More recently, *V. harveyi* and *Photobacterium leiognathi* have been included as symbionts of *E. hyllebergi* and *E. albatrossae* from Thailand and the Philippines, respectively (Guerrero-Ferreira and Nishiguchi, 2007; Guerrero-Ferreira et al., 2013). An important prospect to consider is that *V. fischeri* and *V. logei* may have evolved fundamentally distinct traits for colonizing sepiolid squids, even when considering the same host species. Clearly, new and thrilling perspectives are surfacing around the sepiolid squid-*Vibrio* mutualism. Several species in the Vibrionaceae are bioluminescent. An interesting remaining question is why only a few of these form light organ symbioses with sepiolid squid hosts. For example, why is bioluminescent *V. orientalis* never found in squid light organs (Dunlap, 2009)?
Are researchers simply not looking thoroughly enough?

#### **BIOGEOGRAPHY OF** *VIBRIO* **BACTERIA AND EXPERIMENTAL EVOLUTION IN THE FIELD**

Experimental evolution in the lab with *Vibrio* bacteria has only been completed in one species of *Vibrio* (*V. fischeri*), and strains used in those studies were either from the squid host *E. scolopes* (Hawaii) or the pinecone fish *Monocentris japonica* (Schuster et al., 2010; Soto et al., 2012). Given that a number of symbiotic *V. fischeri* from squid can colonize and survive in nearly all allopatric *Euprymna* hosts of the Indo-West Pacific, this provides a road map for testing whether other *V. fischeri* strains can adapt to additional potential host species closely related to *Euprymna* (e.g., *Rondeletiola minor*) or even ones from a different phylum (Nishiguchi et al., 2004). Naturally occurring strains may be subjected to movement between hosts that are along a specific environmental gradient (Soto et al., 2010). Obviously, similar cues must be used for these bacteria to recognize a comparable, yet novel host, and then colonize and establish a persistent association in that animal for the symbionts to secure their distribution in the new host population (Wollenberg and Ruby, 2009). Only 6–12 *V. fischeri* cells are required to initiate a squid light organ infection. Once these bacteria colonize a squid host, they can reproduce much faster than in seawater. New *V. fischeri* clones encountering a squid host species for the first time will then be expelled every 24 h, increasing the cell numbers of *V. fischeri* new arrivals that can infect even more juvenile squid of the exotic host species (Lee and Ruby, 1994b, 1995). Whether these symbiont founder flushes truly occur in nature is not known, but observations in the laboratory have shown that alien *V. fischeri* genotypes can invade and take root where a preexisting genetic variety was already entrenched (Lee and Ruby, 1994a; Soto et al., 2012). Whether this commonly leads to a dominant symbiont genotype in a host population in a given geographical area over the long term must be investigated more closely.

#### **TWO-CHROMOSOME GENOMIC ARCHITECTURE IN VIBRIONACEAE, EVOLVABILITY, AND VERSATILITY**

Research has shown an absence of parallel coevolution between *V. fischeri* symbionts and their light organ animal hosts, which implies that significant host switching has occurred (Nishiguchi and Nair, 2003). Host switching has been a common evolutionary phenomenon for *Vibrio* and *Photobacterium* species involved in symbioses, regardless of whether the interaction was commensalism, pathogenesis, or mutualism (Urbanczyk et al., 2011). Extensive host switching could suggest that this microbe, like *Vibrio* species in general, is an evolutionarily plastic and malleable organism. Vibrionaceae possess two circular chromosomes, one large (Chromosome I) and one small (Chromosome II; Tagomori et al., 2002). Given this complex genome arrangement, *V. fischeri*'s ability to exploit numerous lifestyles is easy to understand, as the *Vibrio* genome structure is dynamically unstable (Kolsto, 1999). The modular two-chromosome architecture of Vibrionaceae genomes has been hypothesized to underlie the versatility and ubiquity of this cosmopolitan bacterial family, with ecological specialization attributed to the smaller and more genetically diverse Chromosome II, which carries a superintegron island gene-capture system and genes encoding solute transport and chemotaxis functions (Heidelberg et al., 2000; Ruby et al., 2005; Grimes et al., 2009). Intrachromosomal and interchromosomal recombination is clearly present, along with inversions, indels, and rearrangements (Kolsto, 1999; Heidelberg et al., 2000; Tagomori et al., 2002). Such genomic architecture permits functional genetic specialization to evolve between the two chromosomes (Heidelberg et al., 2000; Waldor and RayChaudhuri, 2000), promoting ecological opportunity in adapting and radiating into numerous niches (Soto et al., 2014). For example, genomic studies of *V. cholerae* and *V. parahaemolyticus* have found that house-keeping genes (DNA replication, transcription, translation, cell division, and cell wall synthesis) and pathogenicity genes are mainly restricted to the large chromosome (Heidelberg et al., 2000).

Chromosome II appears to be a genetic module for DNA and a source of innovation, perhaps functioning evolutionarily in a manner analogous to plasmids, as it possesses significantly more foreign loci that appear to have been acquired horizontally from other microbial taxa (Heidelberg et al., 2000; Waldor and RayChaudhuri, 2000). The presence of a gene capture system (i.e., integron island) and of genes usually found on plasmids supports this claim (Heidelberg et al., 2000). Furthermore, loci involved in substrate transport, energy metabolism, two-component signal transduction, and DNA repair are prominently carried on Chromosome II (Heidelberg et al., 2000; Waldor and RayChaudhuri, 2000). The loci involved in substrate transport encode a large repertoire of proteins with diverse substrate specificity. Genes that subdivide cellular functions and that are intermediaries of metabolic pathways are also found on Chromosome II. These genetic auxiliaries potentially serve as the raw material for adaptation and specialization (Heidelberg et al., 2000; Waldor and RayChaudhuri, 2000). The structure and size of the large chromosome appear relatively constant throughout the Vibrionaceae, whereas Chromosome II is more amenable to genetic reorganization, rearrangement, recombination, and large indel events (Okada et al., 2005). Genes encoding functions for starvation survival and quorum sensing are located on both chromosomes. Thus, interchromosomal functional regulation is present in Vibrionaceae. As a result, specific and novel mechanisms involved in the regulation, replication, and segregation of both chromosomes are thought to have evolved in this bacterial family (Waldor and RayChaudhuri, 2000; Egan et al., 2005).

Interestingly, *V. cholerae* colonization factors (e.g., genes responsible for pili formation) primarily reside on Chromosome I. Consequently, different *V. fischeri* ecotypes could be the result of evolution at loci involved in metabolism as opposed to those involved in tissue colonization (Browne-Silva and Nishiguchi, 2007; Soto et al., 2014). Experimental evolution studies with *E. coli* have demonstrated that resource partitioning and alternative substrate specialization are sufficient for ecological polymorphisms to arise in prokaryotes (Rosenzweig et al., 1994). In summary, the two-chromosome architecture provides *V. fischeri* with enormous evolutionary fluidity. In particular, Chromosome II may possess ecological or symbiosis islands that could account for this microorganism's broad ecological range (Tagomori et al., 2002). For example, differences in the pathogenicity islands present on Chromosome II appear to determine whether or not *V. parahaemolyticus* strains are pathogenic. Similarly, the pliant nature of the *V. fischeri* genome could explain why there is extensive host switching. Chromosome II may well be a gene repository outfitted to respond to environmental change, habitat heterogeneity through space and time, and stress (Dryselius et al., 2007; Soto et al., 2012). Future studies hold considerable promise, given the tools now offered by modern bioinformatics and genomics. Recent advances in high throughput sequencing technologies and genome editing techniques (e.g., MuGENT) will greatly increase the potential of experimental evolution to explain adaptation (MacLean et al., 2009; Dalia et al., 2014).

#### **TOPICS FOR FUTURE STUDY**

#### **BIOFILM FORMATION AND MOTILITY**

Motility and biofilms are modes by which *V. fischeri* strains can specialize to niches in their *Euprymna* squid hosts (Yildiz and Visick, 2009; Soto et al., 2014). Biofilms are aggregates of microorganisms attached to a surface that are frequently enmeshed within a matrix of exopolysaccharide and can comprise a pure culture population or a community (Davey and O'toole, 2000; Stoodley et al., 2002). Such a community is much more resistant to antimicrobials, ultraviolet light, pH shifts, osmotic shock, desiccation, and other environmental stresses (Gilbert et al., 1997; Davey and O'toole, 2000). The role of biofilms in disease and host colonization is well documented: bacterial pathogens establishing biofilms in animals may be more recalcitrant to phagocytosis by host macrophages, resistant to respiratory bursts by immune cells, and insensitive to antimicrobials produced by host defenses (Davey and O'toole, 2000). In addition, biofilm development has a major role in *V. fischeri* colonization of sepiolid squid hosts (Chavez-Dozal and Nishiguchi, 2011; Chavez-Dozal et al., 2012; **Figure 3**). When movement on surfaces is necessary, swarming with flagella is the motility mechanism for Vibrionaceae (McCarter, 2001). Swarming is specialized locomotion on a surface, as opposed to the swimming and tumbling done by individual cells. As *V. fischeri* swarms with concurrent cell division (i.e., growth), cells differentiate from a vegetative state to a swarmer one. Swarmer cells are hyperflagellated and longer than vegetative counterparts (Harshey, 2003), and provide a steady state supply of nutrients until motility ceases. Motility plays an integral role in the colonization of sepiolid squid by *V. fischeri* and allows host-associated bacteria to reach the destination and surface desired for further colonization or attachment (Millikan and Ruby, 2004). Since swarming is an energetically expensive process, chemotaxis has a role in mediating how a bacterial cell should physiologically respond. Through years of studying diverse bacteria as motility model systems, research has shown that many regulatory pathways controlling motility also affect biofilm formation (Harshey, 2003; Verstraeten et al., 2008). Bacterial populations must resolve whether to deploy motile machinery for expedient colonization of surfaces or to engage biofilm systems once an appropriate location for initial contact and attachment has been found, a critical choice affecting survival among competitors. Experiments are underway in which *V. fischeri* lines are being selected for increased biofilm formation and motility. Accompanying these experiments are ones in which *V. fischeri* lines are being alternately or cyclically selected for biofilm and motility lifestyles (oscillatory selection). The relative abilities of these lines to colonize squid hosts will be assessed.

**FIGURE 3 | Early and late gene expression (mRNA) of various biofilm related loci in** *V. fischeri* **ETJB1H. (A)** Early (4 h) gene expression of flagellum biosynthesis (flgF), type IV pili formation and adhesion (pilU, pilT), and the sodium-type flagellar motor pump for motility (motY) loci. **(B)** Late gene expression of genes important for mature biofilm production (24 h) include expression of heat shock protein (ibpA), magnesium-dependent induction for c-di-GMP synthesis (mifB), and arginine decarboxylase (speA). Modified from Chavez-Dozal et al. (2012). dCCT is the change in relative expression of each gene compared to the standard control (in this case 16S rRNA). The gel represents the amount of 16S rRNA (top) and the amount of mRNA expressed in each gene examined. Error bars represent SD of three replicates.
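The Figure 3 caption reports expression of each gene relative to a 16S rRNA control. As a minimal sketch, and assuming the quantity labelled dCCT corresponds to the widely used comparative threshold-cycle (ΔΔCT) calculation, relative expression could be computed as follows. The function name, the sample and calibrator labels, and the CT values are hypothetical illustrations, not data or methods taken from Chavez-Dozal et al. (2012).

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change by the comparative CT (2^-ddCT) calculation.

    Assumption: both primer pairs amplify with roughly 100% efficiency,
    so one cycle corresponds to a two-fold difference in template.
    """
    # Normalize the target gene to the reference gene (e.g., 16S rRNA)
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the normalized sample against the normalized calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical CT values: the target crosses threshold two cycles earlier
# in the sample than in the calibrator, with an unchanged reference gene.
fold_change = relative_expression(22.0, 10.0, 24.0, 10.0)
print(fold_change)
```

A fold change above 1 indicates higher expression in the sample than in the calibrator; a value below 1 indicates lower expression.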

#### **PARASITISM, PREDATION, AND GRAZING ON** *VIBRIO* **BACTERIA**

Substantial work exists on how protoctistan predators are effective grazers on *Vibrio* or other bacteria, particularly when the bacteria form biofilms (Matz and Kjelleberg, 2005; Matz et al., 2008). Previous research has demonstrated that certain species of *Vibrio* (e.g., *V. cholerae* and *V. fischeri*) are better able to ward off microbial eukaryotic predators when in their biofilm state than when planktonic (Erken et al., 2011). Earlier work provides strong evidence that when *Vibrio* biofilms are grazed by protoctistans, the bacteria release toxic compounds capable of killing the predators; the dead grazers themselves then become a meal and carbon source for the *Vibrio* (Chavez-Dozal et al., 2013). Depending on the species, and even strain type, *Vibrio* biofilms make an excellent model to determine if grazing can affect biofilm growth, structure, and production of chemicals to inhibit grazers (Barker and Brown, 1994). Microbial selection studies are ongoing to examine adaptive responses of *V. fischeri* to various grazers and how these evolutionary outcomes impact sepiolid squid colonization. Bacteriophage, predatory bacteria (e.g., *Bdellovibrio*, *Bacteriovorax*, *Micavibrio*, and "wolfpack" feeders such as myxobacteria), and aquatic fungi also prey on the Vibrionaceae (Atlas and Bartha, 1998; Richards et al., 2012). How these natural enemies affect *V. fischeri* evolution and the sepiolid squid-*Vibrio* symbiosis is worthy of future investigation. For instance, *Vibrio* chitinases attacking fungal cell walls may be a means to avoid grazing by marine yeast. Chitinases are known to be utilized by *V. fischeri* symbionts when interacting with the squid host (Wier et al., 2010).

#### **EVOLUTION DURING THE FREE-LIVING PHASE, ABIOTIC FACTORS, AND BACTERIAL STRESS RESPONSES**

Prior work has shown that *V. fischeri* host adaptation to sepiolid squids and monocentrid fishes affects this species' ability to grow within a gradient of an abiotic factor (e.g., tolerance limits to environmental stress) while in the free-living or planktonic phase (Soto et al., 2009, 2012), implying that natural selection could be acting on the bacterial stress responses to better accommodate the symbiont against the unprecedented stressful environments presented by a new animal host (e.g., novel immune defenses; Soto et al., 2010). The coupling of different bacterial stress responses to one another and their correlation to successful symbiosis initiation, host immunity evasion, pathogenesis, and virulence mechanisms is becoming necessary for understanding bacterial evolution (Nishiguchi et al., 2008). Future experimental evolution work will focus on the adaptability of *V. fischeri* to abiotic factor stresses, such as high and low tolerance limits of salinity, temperature, and pH while in the free-living or planktonic phase. In turn, correlated responses of *V. fischeri* adapting to these environmental stresses will be investigated in sepiolid squid hosts (Abucayon et al., 2014). Understanding how *V. fischeri* stress evolution affects its relationship with sepiolid squids will lead to new insights into the dynamic evolutionary forces that shape associations between hosts and symbionts. Because free-living and host environments impose dramatically different selection pressures on microorganisms (e.g., evasion of immune host defenses), these perspectives have implications for infectious disease and virulence mechanisms, as genetic and physiological components responsible for mutualisms and pathogenesis are frequently identical or homologous (Ruby et al., 2005; Nishiguchi et al., 2008; Buckling et al., 2009). Stress evolution and stress-induced mutagenesis are known to be capable of creating cryptic genetic variation through varying gene-by-gene and gene-by-environment interactions, which can be invisible to natural selection under the original circumstances in which they materialize but either beneficial or detrimental to bacterial fitness when conditions change (Tenaillon et al., 2004; McGuigan and Sgro, 2009; Paaby and Rockman, 2014). The evolutionary significance of cryptic genetic variation in patterning interactions between animal hosts and bacteria is unclear. *V. fischeri* adapting to a novel squid host was found to increase this symbiont's ability to form biofilms in artificial seawater containing no organic carbon while in the free-living phase. This result suggests that symbiosis evolution can affect *V. fischeri*'s ability to tolerate starvation or oligotrophic conditions when subsequently outside the host (Soto et al., 2014). *V. fischeri* adapting to selective pressures imposed by abiotic factors or environmental stressors during the free-living phase may either reinforce or decouple coevolution between *Vibrio* symbionts and their animal hosts (Soto et al., 2009, 2012). A static microcosm or standing culture of *Pseudomonas fluorescens*, in which wrinkly spreader, fuzzy spreader, and smooth morph colonies arise over several days, has become a model system for studying microbial adaptive radiation, a process known to be affected by oxygen depletion and nutrient availability (Rainey and Travisano, 1998; Travisano and Rainey, 2000). Alterations in *Vibrio* colony morphology are known to affect animal host colonization (Mandel et al., 2009). *V. fischeri* adaptive radiation during the free-living phase and its subsequent consequences for symbiosis are poorly understood. The use of microbial experimental evolution with heterogeneous environments will provide insight into how *V. fischeri* biodiversity (e.g., Shannon-Wiener Index) in the free-living phase affects symbiont population variation within the squid light organ across gradients of various abiotic factors.
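The Shannon-Wiener Index mentioned above can be computed directly from counts of distinguishable types in a population. The following is a minimal sketch using the standard formula H' = -Σ pᵢ ln pᵢ; the colony morph categories and counts are hypothetical and serve only as an illustration.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) over non-empty categories."""
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    h = 0.0
    for c in counts:
        if c > 0:  # empty categories contribute nothing to the sum
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical colony morph counts from one plated population
morph_counts = {"smooth": 50, "wrinkly": 30, "fuzzy": 20}
print(round(shannon_index(morph_counts.values()), 3))
```

The index is zero when a single type is present and reaches its maximum, ln(k), when k types are equally abundant.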

#### **METABOLISM**

Biolog plates were developed for global phenotype analysis of microorganisms, allowing a comprehensive survey of microbial physiological traits (Bochner, 1989; Bochner et al., 2001). The aim is to identify characteristics unique to individual microbes and metabolism common to particular taxa or ecological populations. These plates also provide functional data to complement genetic analyses and gene expression studies of microbes. For instance, mutants can be screened efficiently to compare phenotypic consequences relative to wild type. This is especially important for examining metabolic polymorphisms and physiological heterogeneity, and for distinguishing between different ecotypes within the same bacterial species, since different substrates can be shunted into alternate biochemical pathways (Rosenzweig et al., 1994; White, 2007). Additionally, how metabolism of the same substrate (e.g., D-glucose) is distributed among numerous biochemical pathways (glycolysis versus pentose phosphate pathway) may also vary among individual cells of the same bacterial species (Rosenzweig et al., 1994), as hypothesized by the nano-niche model of bacterial evolution (Wiedenbeck and Cohan, 2011). For example, most members of an *E. coli* population may route the carbon flow from the breakdown of D-glucose via glycolysis, but a small proportion of the population may shuttle more intermediates of D-glucose degradation through the Entner-Doudoroff pathway as an alternate way of making a living (e.g., physiological tradeoffs, resource partitioning, and ecological nutrient specialization by differentiation in usage of metabolic pathways; Rosenzweig et al., 1994; Cooper and Lenski, 2000; Travisano and Rainey, 2000; MacLean et al., 2004; White, 2007). Within the lifetime of just one adult squid host, a single *V. fischeri* clone has ample time to evolve cross-feeding with either other *V. fischeri* cells or host cells, since this has been documented in *E. coli* in less than 800 generations in a homogeneous and unstructured environment (Rosenzweig et al., 1994). *V. fischeri* adapting to novel animal hosts undergo ecological diversification in carbon source utilization within 500 generations (Soto et al., 2014). With the use of Biolog plates, microbial experimental evolution can provide keen insight into the role of metabolism in *V. fischeri* ecological diversification and sepiolid squid colonization (MacLean and Bell, 2002).

#### **CHEMOTAXIS**

Evidence suggests that biofilms, motility, carbon metabolism, and bioluminescence are intertwined with one another. Possible crossroads for their roles in *V. fischeri* include chemotaxis, intracellular second messengers (c-di-GMP), and bacterial stress responses. Methyl-accepting chemotaxis proteins (MCPs) are central to chemotaxis, as these proteins are chemoreceptors that monitor the chemical composition of the environment and transmit this information to the interior of the cell (Bren and Eisenbach, 2000; Brennan et al., 2013). MCPs are versatile receptors of chemical stimuli, adept at mediating taxis to diverse signals (Hsing and Canale-Parola, 1996). A single MCP is highly sensitive. It is able to discern differences in stereochemistry between isomers, sense relative asymmetries in chemical concentrations of a substance along a gradient, and integrate diverse information from multiple chemical stimuli present in the environment simultaneously (Hsing and Canale-Parola, 1996; Bren and Eisenbach, 2000). An MCP is capable of a graded, measured, and progressive selective response to chemical stimuli. MCP function is further elaborated by the presence on bacterial cell membranes of a mass complex of several interacting MCPs bundled together into a chemo-antenna cluster network, amplifying the synergistic interactions possible in chemotaxis and signal transduction (Bren and Eisenbach, 2000). Additionally, single amino acid substitutions can have large effects on the sensitivity, affinity, specificity, and function of an MCP (Derr et al., 2006). Hence, MCPs and redistributable metabolism may allow *V. fischeri* populations to better colonize novel hosts by quickly resculpting their N-dimensional niche hypervolume space (Hutchinson, 1957). In a study using comparative genomics and a network biology-based approach to understand how genes select for multigenic phenotypes such as virulence in *V. cholerae*, loci encoding MCPs and others associated with chemotaxis were among those identified as most responsible (Gu et al., 2009). MCPs couple chemotaxis to diverse metabolites and their gradients, supplying one potential route by which a symbiont can adapt to unaccustomed host physiology. Experimental evolution with microorganisms to analyze chemotaxis can be completed by placing small volumes of bacteria onto the centers of motility agar plates with different chemoattractants at the periphery. After incubation at an appropriate temperature, cells from the leading edge closest to the chemoattractant are serially transferred onto the centers of new motility plates (DeLoney-Marino et al., 2003). Derivations of this method can be used to select for bacteria with increased aversion to chemorepellents. Another avenue is to use a rendition of the glass capillary tube chemotaxis assay that involves continuous subculturing (Adler, 1973).

#### **QUORUM SENSING, BIOLUMINESCENCE, SOCIAL EVOLUTION, AND ECOLOGICAL INTERACTIONS**

Quorum sensing was first described in *V. fischeri* in 1970 in connection with bioluminescence (Nealson et al., 1970). Quorum sensing is now known to govern many traits other than bioluminescence, including but not limited to exoenzyme secretion, siderophore production, antibiotic synthesis, cell division, DNA replication, cell surface anabolism (cell wall, cell envelope, and capsule), biofilm development, and motility (Miller and Bassler, 2001). Bioluminescence is frequently used as a proxy quorum sensing measurement. Regulation of the *lux* operon involves input from the quorum sensing apparatus, which couples to other microbial physiological pathways and cascades (Miyashiro and Ruby, 2012). Clever designs can permit microbial selection experiments that investigate quorum sensing and bioluminescence. In a plate selection scheme, ImageJ (image processing freeware produced by the National Institutes of Health) may be used to single out brighter and dimmer colonies for serial transfers from agar plates that have been digitally imaged in lit and dark rooms ("digital replica plating" or "replica imaging"). The Vibrionaceae possess a hierarchical and sophisticated quorum sensing machinery comprising "low cell density" (LCD) and "high cell density" (HCD) gene expression states (Camara et al., 2002). Microbial selection experiments with *V. fischeri* mutants locked or defaulted into LCD and HCD gene expression will permit studies into group selection, kin selection, social evolution, and greenbeard genes (Travisano and Velicer, 2004). LCD and HCD gene expression states can each result in secretion of a distinct subset of public goods not produced by the other (e.g., extracellular nuclease and metalloprotease for LCD and HCD, respectively; Blokesch and Schoolnik, 2008; Natrah et al., 2011; Bruger and Waters, 2014). Experimentally evolved lines possessing constitutive HCD and LCD gene expression would be compared to the quorum sensing wild type strain (ancestral or derived) for a particular selection regimen. LCD lines could serve as "cheaters" or "defectors" for a public good produced by HCD or wild type lines at high cell density (e.g., extracellular metalloproteases). An investigator could ask if an LCD cheater line initially at low frequency could invade an HCD line (at low or high cell density) or a quorum sensing wild type line at high cell density. HCD lines could analogously serve as cheaters for extracellular nuclease. Controlling microbial growth and dilution rates in chemostats with select media might be another way to conduct such selection experiments. The use of quorum sensing enhancers and quorum quenching molecules or drugs offers additional avenues for future experiments (Rasmussen and Givskov, 2006; Defoirdt et al., 2008). Serial transfers of liquid cultures performed at particular cell densities (specific transmission bottlenecks) or with spent (conditioned) media may permit inquiries into quorum sensing.
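Whether a rare cheater line can invade a cooperator line is often summarized with a selection rate constant estimated from a mixed-culture competition. The sketch below uses one common convention (the difference in realized Malthusian growth parameters between competitors); it is an illustration under stated assumptions, not the authors' protocol, and the function names and CFU values are hypothetical.

```python
import math


def selection_rate(cheater_start, coop_start, cheater_end, coop_end, days=1.0):
    """Selection rate constant (per day) of the cheater relative to the cooperator.

    Each argument is a density (e.g., CFU/mL) at the start or end of the
    competition. A positive value means the cheater increased in frequency.
    """
    m_cheater = math.log(cheater_end / cheater_start)  # realized growth, cheater
    m_coop = math.log(coop_end / coop_start)           # realized growth, cooperator
    return (m_cheater - m_coop) / days


def can_invade(cheater_start, coop_start, cheater_end, coop_end):
    """True if the initially rare cheater rose in frequency during the competition."""
    return selection_rate(cheater_start, coop_start, cheater_end, coop_end) > 0


# Hypothetical densities: an LCD-locked line starts at 0.1% of an HCD-locked line
r = selection_rate(1e3, 1e6, 1e6, 1e8)
print(round(r, 3), can_invade(1e3, 1e6, 1e6, 1e8))
```

Repeating the competition across a range of starting frequencies would indicate whether any advantage is frequency dependent, which is the pattern expected for cheaters exploiting a public good.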

Other possible ecological interactions between microbes include competition (interference and exploitation) and microbial allelopathy (e.g., chemical warfare; Atlas and Bartha, 1998). Additionally, one must recognize that one microbe may be more fit than another because of increased efficiency in resource utilization or a better ability to convert assimilatory carbon and reducing power into more offspring (e.g., a shorter generation time growing on D-glucose). Another interesting interaction is facilitation through cross-feeding. Cross-feeding can occur between cells of different strains or species, where one cell type secretes a waste product that is utilized by another as a nutrient or useful resource. Understanding the diversity of social dynamics is valuable. Within the social evolution context, when a participant (the actor) benefits from harming another (the recipient), the interaction is termed selfishness (West et al., 2006). When the actor suffers a negative effect by harming the recipient, the interaction is called spite. Altruism occurs when the recipient benefits and the actor is harmed, whereas mutualism takes place when both partners benefit. Commensalism occurs if the actor benefits and the recipient experiences no effect. In amensalism, the actor is unaffected but the recipient is harmed (Atlas and Bartha, 1998). (Predation was addressed previously.) As alluded to earlier, an initial effort to characterize the assortment of social interactions between bacteria can be made by placing washed cells in the filter-sterilized spent media of a competitor. NMR and mass spectrometry can possibly be used to identify any interesting molecular components that can be isolated or purified. Excellent questions linger. What are the roles of cooperation, cheating, competition for limiting resources, microbial allelopathy, and other ecological interactions in shaping the squid-*Vibrio* symbiosis? At what stages does each of these processes predominate (e.g., free-living versus host associated)? Is cheating among symbionts suppressed by the squid host when *V. fischeri* are in the light organ? Are bacteriocins (i.e., vibriocins) produced by *V. fischeri* strains against other conspecific subtypes in the squid light organ?
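The social interaction categories defined above depend only on the sign of the fitness effect on the actor and on the recipient. The lookup below is a small sketch of that classification, restricted to the six categories named in the text; the function and variable names are hypothetical, and sign combinations the text does not define are returned as "unclassified".

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


# Keys are (effect on actor, effect on recipient) as signs
INTERACTIONS = {
    (1, -1): "selfishness",
    (-1, -1): "spite",
    (-1, 1): "altruism",
    (1, 1): "mutualism",
    (1, 0): "commensalism",
    (0, -1): "amensalism",
}


def classify_interaction(actor_effect, recipient_effect):
    """Name the interaction from measured fitness effects on actor and recipient."""
    key = (sign(actor_effect), sign(recipient_effect))
    return INTERACTIONS.get(key, "unclassified")


# Hypothetical fitness effects, e.g., from spent-media growth assays
print(classify_interaction(0.2, -0.5))   # actor gains, recipient loses
print(classify_interaction(-0.1, 0.4))   # actor loses, recipient gains
```

In practice a measured effect would be treated as zero only if it is statistically indistinguishable from the control, so a tolerance or significance test would replace the exact sign check.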

#### **VIABLE BUT NON-CULTURABLE STATE**

The viable but non-culturable (VBNC) state is a phenomenon frequently observed in the Vibrionaceae and other prokaryotes, including *V. fischeri* (Lee and Ruby, 1995). Bacteria that are normally culturable no longer grow in liquid culture or on agar media, because the cells enter a dormancy in which they are still metabolically active and are presumed to have elevated tolerance or resistance to environmental stressors (extreme conditions of an abiotic factor such as temperature or salinity), harmful compounds or noxious chemicals, starvation, and heavy metal toxicity (Ordax et al., 2006; Nowakowska and Oliver, 2013). Escape from digestion after phagocytosis or endocytosis by amoebae and macrophages has also been hypothesized to be another function of the VBNC condition, permitting these eukaryotic cells to serve as reservoirs for survival and dispersal (Rahman et al., 2008). Published research has reported molecules and conditions (e.g., temperature upshift) that appear to restore culturability to VBNC cells upon their return to liquid media or agar plates. This putative revival from VBNC dormancy has been termed "resuscitation." However, many researchers doubt the existence of a VBNC state and its resuscitation, claiming the supporting evidence is lacking or marginal at best (Bogosian and Bourneuf, 2001). Skepticism arises because resuscitation is thought to be re-growth of injured cells that have regained their health. Skeptics point out that genes responsible for a pathway or developmental program leading to a physiologically differentiated VBNC state have been slow to emerge from null mutation and knockout studies (Soto et al., 2010). Nothing analogous to endospore formation has surfaced. Definitive evidence of VBNC cells will require loss-of-function experiments with subsequent complementation or overexpression gain-of-function studies to describe a "VBNC" regulon or modulon (Bogosian and Bourneuf, 2001). Microbial experimental evolution is a promising approach to addressing the validity of VBNC cells. After 24–48 h of growth in nutrient rich media (28°C, 200–225 rpm), most of a *V. fischeri* liquid culture is non-culturable, if not entirely dead, as the plating efficiency rapidly decreases. (Static liquid cultures do not experience this phenomenon and can remain culturable for weeks.) The exact result is strain dependent, as some strains are more susceptible than others in their failure to re-grow upon subculturing to fresh media or transfer to agar plates. By serially transferring what few *V. fischeri* cells continue to grow from shaking and aging liquid cultures undergoing a decay in culturability, a population can be increasingly selected for resistance to non-culturability.
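The decay in plating efficiency described above can be tracked by comparing colony counts with an independent estimate of total cells. The sketch below assumes total cell density comes from a direct count (e.g., microscopy), which is an assumption of this example rather than a method stated in the text; the function name and all numbers are hypothetical.

```python
def plating_efficiency(colonies, dilution_factor, volume_plated_ml, total_cells_per_ml):
    """Fraction of cells in a culture that form colonies on agar.

    colonies: colonies counted on the plate
    dilution_factor: total fold dilution before plating (e.g., 1e4)
    volume_plated_ml: volume spread on the plate, in mL
    total_cells_per_ml: independent estimate of total cell density
    """
    if volume_plated_ml <= 0 or total_cells_per_ml <= 0:
        raise ValueError("volume and total cell density must be positive")
    cfu_per_ml = colonies * dilution_factor / volume_plated_ml
    return cfu_per_ml / total_cells_per_ml


# Hypothetical aging shaken culture: 50 colonies from 0.1 mL of a 1e4-fold
# dilution, against a direct count of 1e9 cells per mL
efficiency = plating_efficiency(50, 1e4, 0.1, 1e9)
print(f"{efficiency:.4f}")
```

Following this fraction across serial transfers would show whether a selected line retains culturability longer than its ancestor under the same shaking conditions.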

#### **CONCLUSION**

Bioinformatics, including genomics, transcriptomics, proteomics, and metabolomics, will provide additional insight into experimental evolution with the Vibrionaceae. For microorganisms such as *V. fischeri*, which cycle between host-associated and free-living phases, the selection pressures unique to each environment, their relative magnitudes, and their respective contributions in driving microbial evolution merit consideration (Nyholm and Nishiguchi, 2008). Since prokaryotes possess tremendous genetic and metabolic diversity, understanding the factors that shape bacterial biogeography and ecology will provide insights into bacterial adaptation and natural history.

#### **ACKNOWLEDGMENTS**

The authors would like to thank members of the Nishiguchi lab and the *Euprymna–Vibrio* community of researchers for helpful discussions. This work was supported by NIH NIAID 1SC1AI081659, NIH NIAIDS 3SC1AI081659-02S1, and NSF IOS 074498 to Michele K. Nishiguchi. We would also like to thank Michigan State University – BEACON Center for the Study of Evolution in Action for support.

#### **REFERENCES**


isolated from the gut of the abalone Haliotis discus hannai. *Int. J. Syst. Bacteriol.* 48, 573–580. doi: 10.1099/00207713-48-2-573


Yildiz, F. H., and Visick, K. L. (2009). Vibrio biofilms: so much the same yet so different. *Trends Microbiol.* 17, 109–118. doi: 10.1016/j.tim.2008.12.004

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 May 2014; accepted: 20 October 2014; published online: 09 December 2014.*

*Citation: Soto W and Nishiguchi MK (2014) Microbial experimental evolution as a novel research approach in the Vibrionaceae and squid-Vibrio symbiosis. Front. Microbiol. 5:593. doi: 10.3389/fmicb.2014.00593*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Soto and Nishiguchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Bacteria in Ostreococcus tauri cultures – friends, foes or hitchhikers?

## *Sophie S. Abby1,2 , Marie Touchon1,2 , Aurelien De Jode3,4 , Nigel Grimsley3,4 and Gwenael Piganeau3,4 \**

<sup>1</sup> Institut Pasteur, Microbial Evolutionary Genomics, Paris, France

<sup>2</sup> CNRS, UMR 3525, Paris, France

<sup>3</sup> CNRS, UMR 7232, Biologie Intégrative des Organismes Marins, Observatoire Océanologique, Banyuls-sur-Mer, France

<sup>4</sup> Sorbonne Universités, UPMC Université Paris 06, UMR 7232, BIOM, Observatoire Océanologique, Banyuls-sur-Mer, France

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Fabrice Not, Centre National de la Recherche Scientifique, France Scott Clingenpeel, United States Department of Energy Joint Genome Institute, USA

#### *\*Correspondence:*

Gwenael Piganeau, CNRS, UMR 7232, Biologie Intégrative des Organismes Marins, Observatoire Océanologique, Sorbonne Universités – University Pierre and Marie Curie, 66650 Banyuls-sur-mer, France e-mail: gwenael.piganeau@obs-banyuls.fr

Marine phytoplankton produce half of the oxygen we breathe and their astounding diversity is just starting to be unraveled. Many microbial phytoplankton are thought to be phototrophic, depending solely on inorganic sources of carbon and minerals for growth rather than preying on other planktonic cells. However, there is increasing evidence that symbiotic associations, to a large extent with bacteria, are required for vitamin or nutrient uptake for many eukaryotic microalgae. Here, we use in silico approaches to look for putative symbiotic interactions by analysing the gene content of microbial communities associated with 13 different Ostreococcus tauri (Chlorophyta, Mamiellophyceae) cultures sampled from the Mediterranean Sea. While we find evidence for bacteria in all cultures, there is no ubiquitous bacterial group, and the most prevalent group, Flavobacteria, is present in 10 out of 13 cultures. Among seven of the microbiomes, we detected genes predicted to encode type 3 secretion systems (T3SS, in 6/7 microbiomes) and/or putative type 6 secretion systems (T6SS, in 4/7 microbiomes). Phylogenetic analyses show that the corresponding genes are closely related to genes of systems identified in bacterial-plant interactions, suggesting that these T3SS might be involved in cell-to-cell interactions with O. tauri.

**Keywords: phytoplankton, bacterial symbiosis, secretion system, illumina sequencing, bacterial diversity, microbiome, phycosphere,** *Ostreococcus tauri*

## **INTRODUCTION**

Eukaryotes acquired photosynthesis on multiple occasions from endosymbiosis (Archibald, 2012; De Clerck et al., 2012), resulting in an astounding phylogenetic diversity of phytoplanktonic eukaryotes (Not et al., 2012 for a review). The coexistence of so many species competing for the same resources does not fit the theoretical prediction that, in a stable environment, the best competitor wins, a puzzle coined as the "paradox of the plankton" (Hutchinson, 1961). However, spatial and temporal environmental heterogeneities affect the environmental stability hypothesis (e.g., Levins and Culver, 1971), and interactions among competitors (e.g., Gross, 2008), as well as non-competitive interactions with other species (e.g., mutualism, commensalism), may increase their probability of coexistence. The "phycosphere" – the region immediately surrounding and influenced by phytoplankton cells – is an important bacterial habitat (Bell and Mitchell, 1972; Blackburn et al., 1998). Heterotrophic bacterial communities are sustained by phytoplankton exudates and play an important role in remineralization of nitrogen (N) and phosphate (P). Many protist algae are mixotrophic, gaining nutrients either by photosynthesis or by heterotrophy depending on environmental conditions (Flynn et al., 2013 for a review), adding further complexity to plankton community assemblages. Facilitative interactions have often been suggested by phycologists whose algal laboratory cultures were often most successful when they did not eliminate all bacteria from the cultures (Cole, 1982 for a review), and the generality of bacterial-seaweed associations is now well known (Goecke et al., 2010; Egan et al., 2013; Hollants et al., 2013 for reviews). For example, some protists are known to fix nitrogen or carbon via their cyanobacterial endosymbionts (reviewed in Nowack and Melkonian, 2010; Thompson and Zehr, 2013) and some chemical pathways mediating algal-bacterial interactions have been identified (e.g., Seyedsayamdost et al., 2011; Patzelt et al., 2013; Syrpas et al., 2014). Whether some phytoplanktonic eukaryotes evolved specific interactions with bacteria beyond ecological facilitation is still a matter of debate, and little information is available on close algal-bacterial interactions for unicellular eukaryotes. The study of interactions between planktonic microbes has long been hampered by our lack of knowledge of these unicellular organisms, especially for the smallest planktonic eukaryotes, the picoeukaryotes (cell diameter <2 μm). These microorganisms often lack morphologically informative characters and are difficult to isolate and maintain in culture. For example, species of the genus *Ostreococcus* are hard to discriminate, and it is thus difficult to study species-specific interactions between partners one cannot identify (Subirana et al., 2013).

Early observations of physical attachment between some diatom species and bacteria date back to the first cytological observations (e.g., the diatom *Skeletonema costatum* in Droop and Elson, 1966). A pioneering study of bacteria-phytoplankton interactions screened the bacterial content of microalgal cultures by standard microbiological techniques (Berland and Maestrini, 1969). With the development of molecular biology tools, ribosomal RNA gene sequencing and barcoding approaches made it possible to establish links between phytoplankton and bacterial community dynamics in natural communities (Fukami et al., 1992; Rooney-Varga et al., 2005) and in culture collections of diatoms and dinoflagellates (Schäfer et al., 2002; Jasti et al., 2005; Sapp et al., 2007). Recently, integrative approaches associating sequence-based identification of cells, cytometry, cell sorting and tracer experiments with <sup>15</sup>N and <sup>13</sup>C demonstrated a mutualistic interaction between N2-fixing cyanobacteria and a phytoplanktonic prymnesiophyte (Thompson et al., 2012). Cell sorting and single cell sequencing of natural isolates now make it possible to identify candidate bacteria–protist partnerships without cultivation, and these techniques promise to uncover overlooked interactions (Martinez-Garcia et al., 2012).

The development of next generation sequencing now makes it possible to sequence microalgae together with their associated microbiomes and to investigate the molecular toolkit of putative interactions from gene content analyses. For example, symbiosis involving nitrogen fixation requires the presence of the nitrogenase operon in the bacterial partner, and complementation for vitamin biosynthesis requires the presence of the genes responsible for the vitamin B pathway in the bacteria along with suitable algal uptake mechanisms (**Table 1**). In some plant-bacterium interactions, physical cell-to-cell interactions occur via specialized protein secretion systems such as the type 3 secretion systems (T3SS; Kosarewicz et al., 2012), or the T4SS (Smillie et al., 2010). These systems are sophisticated molecular needles that enable the translocation of bacterial effectors from the bacterial cytoplasm to the eukaryotic cell (**Figure 1**). These systems are involved in both antagonistic and beneficial interactions between bacteria and eukaryotes.

While picoalgae of the genus *Ostreococcus* are distributed worldwide (Piganeau and Moreau, 2007; Demir-Hilton et al., 2011), *Ostreococcus tauri*, first isolated from a French Mediterranean coastal lagoon and described as the smallest eukaryotic species known (Courties et al., 1994; Chrétiennot-Dinet et al., 1995), has been found mainly in coastal regions and lagoons (Subirana et al., 2013). Within a 3-year period (2006–2008) of sustained effort, we could isolate 17 new wild-type clonal lines of *O. tauri* and characterize them with a few genetic markers (Grimsley et al., 2010). Interestingly, none of the *O. tauri* cultures we retained were completely axenic despite initial treatment with antibiotics. Here, we postulate that the bacterial microbiome consistently present in successful *O. tauri* cultures could be required for the health of these cultures. We present an analysis of the bacterial microbiome associated with these cultures and apply recently developed tools to mine the *O. tauri* microbiome for protein secretion systems involved in bacteria-eukaryote interactions.

### **MATERIAL AND METHODS**

#### **BIOLOGICAL DATA**

We analyzed data from 13 *O. tauri* strains that were sampled from surface water at five locations in the north-west Mediterranean Sea, as previously described (Blanc-Mathieu et al., 2013). Cultures were isolated by serial filtrations, addition of Keller's salts as a supplement (Keller et al., 1987), and growth in a culture chamber in the laboratory. These strains were established from clonal culture by plating out in gelled medium and re-inoculating cells picked from a single colony in liquid medium to obtain cell densities above 10<sup>7</sup> ml<sup>−1</sup>. One strain was the control *O. tauri* laboratory strain RCC4221, cloned from the RCC745 culture (Courties et al., 1994), and the other 12 were isolated more recently as described in Grimsley et al. (2010). Despite the treatment with antibiotics described in Grimsley et al. (2010) (kanamycin 20 μg/ml, penicillin 25 μg/ml, and neomycin 20 μg/ml final concentrations), none of the strains were found to be completely axenic. This is the case not only for the 13 strains analyzed here, but also for hundreds of other isolations made by plating out for single algal cells, including *O. mediterraneus*, *Micromonas* sp. and *Bathycoccus prasinos* (Nigel Grimsley, unpublished observations). Three micrograms of total DNA was extracted from each culture as previously described (Derelle et al., 2006). Genomic DNA of the strains was randomly sheared into ∼250-bp fragments. The libraries created from these fragments were sequenced on an Illumina GAIIx and HiSeq system at the Joint Genome Institute<sup>1</sup> Community Sequencing Program (CSP-129). Sequence data for strains RCC1108, RCC1114, RCC1115, RCC1116, RCC1558, RCC1559, and RCC4221 were 76 bp paired-end reads (depth of coverage for the algal genome 160-fold to 340-fold) and sequence data for strains RCC1110, RCC1112, RCC1117, RCC1118, RCC1123, and RCC1561 were 101 bp paired-end reads (depth of coverage for the algal genome 780-fold to 1130-fold).

**Table 1 | The molecular toolkit of bacteria – eukaryote interactions: examples of bacterial target genes for screening genomes and metagenomes.**

membrane-associated part containing the core secretion apparatus. The components with bright colors correspond to the most conserved components, and are thus the ones searched for by sequence similarity to assess the system's presence in the assembled sequences.

#### **EXTRACTION OF BACTERIAL SEQUENCES AND TAXONOMIC AFFILIATION**

Paired-end data of each strain were mapped to the reference nuclear genome of *O. tauri* (GenBank accession numbers: CAID01000001-CAID01000020; Blanc-Mathieu et al., under review) with the Burrows-Wheeler Aligner (BWA) with parameters *n* = 6, *l* = 35, *k* = 3, *e* = 3 (Li and Durbin, 2009). Read pairs in which neither read mapped to the genome sequence were identified based on SAM flags and extracted into fastq files with the seqtk package<sup>2</sup>. Paired-end reads were assembled with Velvet (Zerbino and Birney, 2008) using default parameters and various *k-mer* sizes. The assembly with the highest N50 (the length-weighted median contig length) was retained.
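The two selection steps in this paragraph can be illustrated with a minimal sketch. It is not the authors' pipeline (they used seqtk and Velvet); the function names are hypothetical, and the only external facts assumed are the SAM flag bits 0x4 (read unmapped) and 0x8 (mate unmapped) and the usual definition of N50.

```python
# Illustrative sketch only; function names are hypothetical, not from the study.

# Flag bits as defined in the SAM format specification.
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def pair_is_unmapped(flag: int) -> bool:
    """True when neither the read nor its mate aligned to the reference."""
    return bool(flag & READ_UNMAPPED) and bool(flag & MATE_UNMAPPED)


def unmapped_pair_names(sam_lines):
    """Collect the names of reads whose whole pair is unmapped."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if pair_is_unmapped(int(fields[1])):  # column 2 is the FLAG field
            names.add(fields[0])              # column 1 is the read name
    return names


def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all assembled bases."""
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")
```

The resulting read names could then be passed to a tool such as seqtk to pull the matching records out of the fastq files, and `n50` could be applied to the contig lengths of each Velvet assembly to pick the *k-mer* size to keep.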

To remove remaining *Ostreococcus* sequences that might have been too divergent to be mapped onto the reference sequence, contigs with more than 80% amino-acid identity over 90 bp with available Mamiellales coding sequences were discarded using the PRASINOID interface<sup>3</sup> (Vaulot et al., 2012). The remaining contigs were analyzed via MG-RAST (Meyer et al., 2008) against the RefSeq protein database. Taxonomic affiliations for each microbiome were downloaded from the MG-RAST server and analyzed using in-house scripts to retrieve all contigs with an alignment against RefSeq longer than 30 amino acids and with more than 80% sequence identity.

The assembled bacterial contigs for each culture can be downloaded from http://www.obs-banyuls.fr/piganeau/publications/data/.

#### **ESTIMATION OF UBIQUITY AND ABUNDANCE OF BACTERIA**

The abundance of each bacterial group was measured as the sum of reads affiliated to one group, divided by the total number of affiliated reads for each microbiome. Ubiquity was defined as the number of occurrences of a genus in the 13 metagenomes.
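The two definitions above can be sketched in a few lines. This is a hedged illustration rather than the authors' R code: the function names and the nested-dictionary input layout (culture name, then bacterial group, then read count) are assumptions made for the example.

```python
from collections import defaultdict


def relative_abundance(read_counts):
    """Share of affiliated reads per bacterial group within one microbiome."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no affiliated reads in this microbiome")
    return {group: n / total for group, n in read_counts.items()}


def ubiquity(microbiomes):
    """Number of microbiomes in which each group occurs (at least one read)."""
    occurrences = defaultdict(int)
    for read_counts in microbiomes.values():
        for group, n in read_counts.items():
            if n > 0:
                occurrences[group] += 1
    return dict(occurrences)


# Hypothetical read counts for two cultures, not data from the study.
example = {
    "culture_1": {"genus_A": 30, "genus_B": 70},
    "culture_2": {"genus_A": 5},
}
abundances = {name: relative_abundance(counts) for name, counts in example.items()}
occurrence_counts = ubiquity(example)
```

Plotting ubiquity against mean abundance for each group then gives the kind of overview used to ask whether any group is both widespread and dominant.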

To estimate the ratio of bacterial to *Ostreococcus* cells, we assumed that the number of reads affiliated to *Ostreococcus*, *R*<sub>O</sub>, is equal to the product of the number of *Ostreococcus* cells in the sample, *C*<sub>O</sub>, the genome size, *G*<sub>O</sub>, and a constant α<sub>O</sub> (representing the product of the DNA extraction efficiency and the sequencing efficiency):

$$R_O = \alpha_O G_O C_O \tag{1}$$

The number of reads affiliated to bacteria, *R*<sub>B</sub>, is equal to the sum, over bacterial strains *i*, of the product of the number of bacterial cells *C*<sub>Bi</sub>, their average genome size *G*<sub>Bi</sub>, and a constant α<sub>Bi</sub>. Assuming an equal α and genome size between bacteria (marine bacteria have a genome in the 2–7 Mb size range), the total number of bacterial reads can be expressed as the product of *C*<sub>B</sub>, *G*<sub>B</sub>, and α<sub>B</sub>:

$$R_B = \sum_i \alpha_{Bi} G_{Bi} C_{Bi} \approx \alpha_B G_B C_B$$

From these two equations we can estimate the ratio of bacterial to *Ostreococcus* cells as a function of three parameters: the relative genome size, the relative number of reads and the relative α parameter between bacteria and *Ostreococcus* DNA:

$$\frac{C_B}{C_O} = \frac{\alpha_O G_O R_B}{\alpha_B G_B R_O} \tag{2}$$
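As a minimal sketch of how Eqn (2) can be evaluated, the function below takes read counts and genome sizes and returns the cell ratio. The function name and the `alpha_ratio` argument (standing for α<sub>O</sub>/α<sub>B</sub>) are hypothetical; the defaults of 13 Mb and 4 Mb are the genome sizes stated for Table 2, and setting `alpha_ratio` to 1 corresponds to the assumption of unbiased extraction and sequencing.

```python
def bacteria_per_alga(reads_bacteria, reads_alga,
                      genome_alga_mb=13.0, genome_bacteria_mb=4.0,
                      alpha_ratio=1.0):
    """Eqn (2): C_B / C_O = (alpha_O * G_O * R_B) / (alpha_B * G_B * R_O).

    alpha_ratio is alpha_O / alpha_B; 1.0 assumes no extraction or
    sequencing bias between algal and bacterial DNA.
    """
    if reads_alga <= 0 or genome_bacteria_mb <= 0:
        raise ValueError("read count and genome size must be positive")
    return alpha_ratio * genome_alga_mb * reads_bacteria / (genome_bacteria_mb * reads_alga)


# Hypothetical read counts, not taken from Table 2.
ratio = bacteria_per_alga(reads_bacteria=1000, reads_alga=13000)
per_ten_algal_cells = 10 * ratio
```

With these illustrative counts, 13 times fewer bacterial reads than algal reads translate into one bacterium for every four algal cells, because each bacterial genome contributes about 3.25 times less DNA than an *O. tauri* genome.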

Statistical analyses and ubiquity abundance plots were done with R<sup>4</sup>.

#### **SEARCH FOR NITROGENASES**

To screen for the presence of nitrogenases, we used the amino-acid sequences of the different types of nitrogenases described in Raymond et al. (2004). We processed the output of the blastx of this dataset against the assemblies to retain all hits with amino-acid identity greater than or equal to 60% and total alignment length higher than 100 amino acids. Ten cultures contained hits meeting these criteria. Further analysis of these hits did not confirm that the corresponding genes encoded nitrogenases; instead, they belonged to related gene families, such as hydrogenases, so we did not proceed to further analysis.
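The hit-filtering step can be sketched as follows. The source does not state which BLAST output format was processed, so this example assumes the standard 12-column tabular output (query, subject, percent identity, alignment length, and so on) with alignment length expressed in amino acids; the function name is hypothetical.

```python
def filter_hits(lines, min_identity=60.0, min_length=100):
    """Keep tabular BLAST hits with identity >= min_identity and length > min_length."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):  # skip blanks and comments
            continue
        fields = line.rstrip("\n").split("\t")
        percent_identity = float(fields[2])   # column 3: percent identity
        alignment_length = int(fields[3])     # column 4: alignment length
        if percent_identity >= min_identity and alignment_length > min_length:
            kept.append((fields[0], fields[1], percent_identity, alignment_length))
    return kept
```

As the paragraph above notes, passing such thresholds is only a first screen: hits that pass still need to be checked against related gene families before being called nitrogenases.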

<sup>1</sup>http://www.jgi.doe.gov/

<sup>2</sup>https://github.com/lh3/seqtk

<sup>3</sup>http://www.obs-banyuls.fr/prasinoidtest

<sup>4</sup>http://www.R-project.org

#### **IDENTIFICATION AND PHYLOGENETIC ANALYSIS OF PROTEIN SECRETION SYSTEMS**

Tools were recently developed to identify type 3, type 4, and type 6 secretion systems from similarity search of essential components and gene content/gene organization criteria (Abby and Rocha, 2012; Gama et al., 2012; Guglielmini et al., 2014). They were used with the MacSyFinder framework (Abby et al., 2014) to detect these systems in the assembled contigs of the microbiomes. Phylogenetically relevant components of T3SS detected in the microbiome were analyzed (i.e., SctJ, SctN, SctQ, SctR, SctS, SctT, SctU, and SctV; see **Figure 1**). Their sequences were introduced into the appropriate pre-existing gene families (dataset of Abby and Rocha, 2012). We aligned their sequences with Muscle (default parameters) and selected informative sites with BMGE (BLOSUM30 similarity matrix, gap rate cut-off = 0.20, sliding window size = 3, entropy score cutoff = 0.5; Edgar, 2004; Criscuolo and Gribaldo, 2010). Then, these alignments were concatenated, and a phylogenetic tree including microbiome sequences was built with RAxML (Le and Gascuel matrix + 4-categories-discretized Gamma distribution for rate variation among sites + empirical frequencies of amino-acids) with 100 rapid bootstraps (Stamatakis, 2006; Le and Gascuel, 2008).
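The concatenation step that precedes tree building can be illustrated with a small sketch. It assumes each trimmed gene-family alignment is held as a dictionary mapping a system identifier to its aligned sequence, and that a system lacking a component is padded with gaps; the source does not say how missing components were handled, so that padding rule and the function name are assumptions.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-gene alignments into one supermatrix, padding absent taxa with gaps."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = widths.pop()
        for taxon in taxa:
            # Taxa missing from this gene family get an all-gap block.
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}
```

The resulting supermatrix has one row per system and the same number of columns in every row, which is the input shape expected by maximum-likelihood programs such as RAxML.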

### **RESULTS**

#### **WHICH BACTERIA LIVE IN** *O. tauri* **CULTURES?**

MG-RAST taxonomic affiliation of contigs based on sequence identity to the RefSeq protein database listed 149 distinct bacterial taxonomic affiliations at the genus level. A few contigs were assigned to higher taxonomic ranks like alphaproteobacteria or gammaproteobacteria. We estimated the average abundance of each group as the proportion of reads affiliated to a bacterial group relative to the total number of reads assigned to bacteria. Based on these estimates, we found no bacterial group to be both ubiquitous and abundant throughout the 13 microbiomes (**Figure 2**). The most ubiquitous bacterial group is *Flavobacterium*, which is found in 10 out of 13 cultures, with an average abundance of 18% of reads in the 10 microbiomes. The most abundant group of bacteria is *Limnobacter*, as this genus is represented by 60% of the reads in the three microbiomes where it was detected. Other abundant groups are *Pseudomonas*, *Roseovarius,* and *Oceanocaulis* (**Figure 2**), each representing more than 10% of the reads. The most abundant bacterial groups for each microbiome are reported in **Table 2**. They belong to seven different genera for the 13 microbiomes; *Flavobacterium*, *Pseudomonas,* and *Limnobacter* dominate in the six microbiomes with more than one bacterium for 10 *O. tauri* cells, while *Sphingomonas*, *Robiginitalea*, *Oceanocaulis,* and *Roseovarius* are the most abundant bacterial groups in the seven microbiomes with less than one bacterium for 10 *O. tauri* cells (see below). These genera belong to the Bacteroidetes and Proteobacteria phyla, with three classes present for the latter: alphaproteobacteria, betaproteobacteria, and gammaproteobacteria.

#### **HOW MANY BACTERIA PER** *O. tauri* **CELL?**

We used the number of reads affiliated to *O. tauri* and to the most abundant bacterial groups to estimate the ratio of bacterial to microalgal cells using Eqn (2). This estimation relies on the assumption that the DNA extraction protocol and the sequencing are not biased towards *O. tauri* or bacteria. In a dataset of 45 marine bacterial genomes, the average genome size was estimated to be ∼4 Mb (Moran and Armbrust, 2007). We used this value as a proxy for the genome size of bacteria associated with *O. tauri* strains. Since marine bacterial genome size varies between 2 and 7 Mb, we do not expect more than a twofold difference in our estimate. The number of bacteria for 10 *O. tauri* cells varies by two orders of magnitude between cultures: from 0.02 to 4 (**Table 2**).
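The twofold bound can be checked with a short worked calculation, assuming equal α for algal and bacterial DNA and the 13 Mb *O. tauri* genome size used for Table 2. With the 4 Mb proxy, Eqn (2) reduces to

$$\frac{C_B}{C_O} = \frac{G_O}{G_B}\,\frac{R_B}{R_O} = \frac{13}{4}\,\frac{R_B}{R_O} = 3.25\,\frac{R_B}{R_O}$$

Replacing the proxy by the extremes of the 2–7 Mb range rescales this estimate by a factor of $4/G_B$:

$$\frac{4}{2} = 2 \quad (G_B = 2\ \text{Mb}), \qquad \frac{4}{7} \approx 0.57 \quad (G_B = 7\ \text{Mb})$$

so the estimated ratio is at most doubled for the smallest genomes and reduced by a little less than half for the largest ones, which is small compared with the two orders of magnitude of variation observed between cultures.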

#### **LOOKING FOR BACTERIAL FACTORS INVOLVED IN INTERACTIONS WITH EUKARYOTES**

Type 3 secretion systems have evolved and diversified into recognizable sub-types to interact with different kinds of eukaryotic cells (animal vs. plant), and participate in different types of interaction with eukaryotes (antagonistic vs. beneficial; Troisfontaines and Cornelis, 2005; Abby and Rocha, 2012). The human pathogen *Salmonella* uses two types of T3SS (SPI-1 and SPI-2) at different stages of host infection (Valdez et al., 2009), while nitrogen-fixing Rhizobiales use a particular T3SS (Rhizobiales type) to establish symbiotic interactions with host plants (Dai et al., 2008; Kambara et al., 2009). Therefore, the presence of T3SS components in a bacterial dataset is a hallmark of bacterium-eukaryote interactions, and the phylogenetic typing of these components can help identify the kind of interaction and the type of targeted eukaryotic cell. Among T4SS – classically dedicated to conjugation – some are involved in plant pathogenesis (e.g., tumor formation, Pitzschke and Hirt, 2010) but also in plant symbiosis (Hubber et al., 2004). T6SS allow the translocation of effectors from bacteria to eukaryotic cells in antagonistic relationships, but have also been shown to target bacteria for bacterial competition (Pukatzki et al., 2007; Hood et al., 2010; Kapitein and Mogk, 2013). T6SS are virulence factors for several phytopathogens, and have been observed in the genomes of plant symbionts (Amadou et al., 2008; Wu et al., 2008). We looked for signs of these putative factors of bacterium-eukaryote interaction in the microbiota associated with *O. tauri*. Using both gene content and close linkage distance between genes as criteria for the detection of T4SS and T6SS (Gama et al., 2012; Guglielmini et al., 2014), we found no evidence of T4SS seemingly involved in protein secretion, but we could detect six occurrences of putative T6SS in four different microbiomes (Table S1).

**Table 2 | Number of reads assigned to** *Ostreococcus tauri* **and bacteria for each cultured strain and estimation of the number of bacterial cells for 10** *O. tauri* **cells from Eqn (2) for the most abundant bacteria.**

Genome size of *O. tauri* is 13 Mb and bacterial genome size was assumed to be 4 Mb.

\*T3SS detected from microbial assemblage.

+α, β, γ P, alpha, beta, gamma Proteobacteria; F, Flavobacteria.

Using the same kind of detection methods for type 3 secretion systems (Abby and Rocha, 2012), we could detect seven occurrences of putative T3SS in contigs from six different microbiomes (**Figure 3**). We took advantage of a previous study to sub-type T3SS: we included the T3SS components detected in the *O. tauri* microbiome in a reference phylogeny annotated with T3SS sub-types and corresponding host types (Figure 4 of Abby and Rocha, 2012). Interestingly, the detected T3SS were consistently placed within, or as sister-groups of, plant-associated T3SS sub-types with high rapid bootstrap supports (**Figure 4**). A first sub-type was placed within the "Hrp1" T3SS family, which includes many systems of plant pathogens and some involved in plant symbiosis (e.g., *Pseudomonas syringae*, *Dickeya dadantii*). This system, whose contig was assigned to the *Pseudomonas* genus, was found in a single microbiome and showed sequence and genetic architecture highly similar to the T3SS observed in the genome of *P. brassicacearum*, a root-associated plant symbiont (Ortet et al., 2011). Three T3SS occurrences seemed to correspond to a single system that was placed within the clade of plant-symbiont Rhizobiales T3SS. This system showed high similarity with a *Mesorhizobium* system (Rhizobiales species, 50–70% identity in a Blast analysis). The three corresponding contigs were attributed to three different alphaproteobacterial species: two Rhizobiales and one Rhodobacterales. Intriguingly, a third type of T3SS found in two microbiomes fell outside of the defined T3SS families, but constituted a sister-group of the Hrp1 family (**Figure 4**). The closest system found in the phylogeny was that of *Herbaspirillum seropedicae*, a plant symbiont, and both taxonomic attribution of contigs and similarity searches of the system using Blast and MG-RAST pointed to *Limnobacter* sp., a member of the Burkholderiales.

Overall, one of the detected T6SS gene clusters was found on a contig assigned to the *Pseudomonas* genus, in the same microbiome where a *Pseudomonas* T3SS was inferred (RCC1108, see Table S1; **Figure 4**). Five of the six detected T6SS were found in three microbiomes that also showed evidence of T3SS gene clusters.

### **DISCUSSION**

Despite the efforts of many laboratories over the last century to define the media and growth conditions required for different marine algae, most have so far remained recalcitrant to growth in a completely defined medium, and require seawater in addition to essential extra nutrients (e.g., phosphate and nitrate) to grow. Seawater is a complex solution of chemicals and organisms that can vary considerably in its composition between geographically distant regions, complicating the development of appropriate culture media. Collections are thus often located in marine biology laboratories close to coastlines. The majority of algal cultures cannot be maintained axenically, rendering physiological analyses of their nutritional requirements more difficult. Indeed, many unicellular algae are mixotrophic, and can satisfy part of their nutritional requirements by ingesting bacteria.

Despite antibiotic treatments and the isolation of single cells from colonies in soft agarose, none of the *O. tauri* strains were axenic. This was the case not only for the 13 strains analyzed here, but also for hundreds of other isolations made by plating out for single algal cells (unpublished observations). In addition, bacterial cultures issued from media of such cultures, as well as observations of algal strains by flow cytometry, almost always confirmed the presence of common seawater bacteria (unpublished data). These observations strongly suggest either that *O. tauri* adheres to bacterial cells during the cultivation process, or that *O. tauri* requires some unidentified substance from bacterial cells for growth. Recent experimental evidence and genome analysis suggest that *O. tauri* is vitamin B12-dependent (Helliwell et al., 2011). We provided three vitamins (thiamine [B1], biotin [H], and cyanocobalamin [B12]) as described for standard Keller's medium (Keller et al., 1987), making the selection of bacteria for vitamin B12 production unlikely. In order to identify putative supplements required by isolated *O. tauri* cells for growth, future work could focus on the use of sterile artificial seawater and the step-by-step introduction of candidate substances. But considering the great deal of effort already put into attempts to define suitable culture media for this kind of algae (Keller et al., 1987), it might be more efficient to first isolate associated bacteria and investigate their influence on the physiology of the microalgae (Le Chevanton et al., 2013).

The bioinformatic analyses performed here confirmed the presence of a diverse collection of common marine bacteria in *O. tauri* cultures. *Flavobacteria* were widespread in our samples. This important class of *Bacteroidetes* often constitutes a significant portion of marine microbial communities and has been reported in microalgal cultures (Berland and Maestrini, 1969; Mann et al., 2013). Similarly, *Bacteroidetes* have also been reported in the surface waters of the NW Mediterranean Sea (Lami et al., 2009). *Flavobacteria* are found both free-living and attached to organic aggregates and are considered major mineralizers of organic matter (Kolton et al., 2013). Interestingly, the type 3 and type 6 secretion systems were detected in the microbiomes with higher bacterial prevalence. Microbiomes for strains that have a higher ratio of bacteria to *O. tauri* cells (**Table 2**) are more likely to contain a detectable T3SS (Wilcoxon signed rank test *p* < 0.01). The taxonomic affiliation of the contigs containing a predicted T3SS corresponds to the most abundant bacterial group in the microbiome except for RCC1110 and RCC1561, whose T3SS contigs were assigned to alphaproteobacteria while the most abundant bacterial group is a *Flavobacterium*. The contigs containing the secretion systems were attributed to genera with evidence of species interacting with eukaryotes via protein secretion systems. Finally, no single strain of bacterium was found in all of the *O. tauri* cultures. Since all these bacteria were isolated from the same host species, it could seem unlikely at first that very specific physical interactions or nutritional requirements exist between *O. tauri* and its microbiome. However, several biases could partially explain the heterogeneity observed between the microbiomes in terms of both taxonomic diversity and, therefore, gene content.

First, as the effective detection of secretion systems relies on sequence similarity searches and on the genetic organization of their components, it depends heavily on sequencing and assembly quality. In the context of NGS approaches for metagenomics, whose short reads are difficult to assemble, it is likely that we missed occurrences of systems due to contig assembly errors and biases. Second, the filtration steps in *O. tauri* isolation are likely to remove aggregates of bacterial cells and, as a consequence, some bacteria that would be of interest for understanding the growth of *O. tauri*. Finally, antibiotic treatment also had an impact on the bacterial populations we analysed in this study, and this might also explain some of the heterogeneity in terms of bacterial diversity and gene content. To specifically find preferred associated bacteria, we should repeat this work without using antibiotics, even if, under these conditions, it may be difficult to isolate *Ostreococcus* strains.

In conclusion, we provide evidence of pervasive bacterial presence in *O. tauri* cultures, despite initial antibiotic treatment. We provide evidence for putative plant-associated T3SS in six microbiomes, and several cases of T6SS in four microbiomes (three displayed both systems). For now there are no studies showing a clear association between T6SS sub-types and their function, so it is hard to determine from genomic analyses whether the systems we detected target bacteria or eukaryotes. But in both cases, they might be part of interactions between bacteria and eukaryotes, even indirectly: *via* bacterial competition, T6SS were found to serve as colonization factors in the plant pathogen *Agrobacterium tumefaciens* (Ma et al., 2014) and to provide plants with protection against pathogens in *P. fluorescens* (Decoin et al., 2014). On the other hand, the analysis of the detected T3SS gave a clearer picture, as it shows that they belong to families involved in interactions with plant cells. The three subtypes they belong to, or the groups they are closest to, all contain systems typical of plant symbiosis, and of pathogenicity in the case of the Hrp1 group. Further experimental work is required to determine the impact of these secretion systems on *O. tauri* growth, while keeping in mind that interactions are dynamic, and that the same bacteria may change between "friend," "foe," or "hitch-hiker" over time or with environmental conditions (Andrade-Domínguez et al., 2014).

#### **ACKNOWLEDGMENTS**

We would like to thank the GenoToul bioinformatic platform (http://bioinfo.genotoul.fr/) for access to computing facilities and Romain Blanc-Mathieu, Sophie Sanchez-Ferandin, Marc Garcia-Garcerà and the Genomics of Phytoplankton group in Banyuls sur mer for stimulating discussions. We would like to acknowledge the Joint Genome Institute (http://www.jgi.doe.gov/) for sequencing. This work was funded by ANR-13-JSV6-0005 to GP, the Institut Pasteur, the French Centre National de la Recherche Scientifique and the European Research Council (grant EVOMOBILOME, number 281605).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2014.00505/abstract

#### **REFERENCES**

Abby, S. S., Néron, B., Ménager, H., Touchon, M., and Rocha, E. P. C. (2014). MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems. *PLoS ONE* 9:e110726. doi: 10.1371/journal.pone.0110726


**Conflict of Interest Statement:** The Review Editor, Fabrice Not, declares that, despite having collaborated with author, Nigel Grimsley, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 July 2014; accepted: 08 September 2014; published online: 07 November 2014.*

*Citation: Abby SS, Touchon M, De Jode A, Grimsley N and Piganeau G (2014) Bacteria in Ostreococcus tauri cultures – friends, foes or hitchhikers? Front. Microbiol. 5:505. doi: 10.3389/fmicb.2014.00505*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Abby, Touchon, De Jode, Grimsley and Piganeau. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Fungal association with sessile marine invertebrates

## *Oded Yarden\**

Department of Plant Pathology and Microbiology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Susanna López-Legentil, University of North Carolina Wilmington, USA Harald Ronald Gruber-Vodicka, Max Planck Institute for Marine Microbiology, Germany

#### *\*Correspondence:*

Oded Yarden, Department of Plant Pathology and Microbiology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel e-mail: oded.yarden@huji.ac.il

The presence and association of fungi with sessile marine animals such as corals and sponges have been well established, yet information on the extent of diversity of the associated fungi is still in its infancy. Culture-based, as well as metagenomic- and transcriptomic-based, analyses have shown that fungal presence in association with these animals can be dynamic and can include "core" residents as well as shifts in fungal communities. Evidence for detrimental and beneficial interactions between fungi and their marine hosts is accumulating and current challenges include the elucidation of the chemical and cellular crosstalk between fungi and their associates within the holobionts. The ecological function of fungi in association with sessile marine animals is complex and is founded on a combination of factors such as fungal origin, host health, environmental conditions and the presence of other resident or invasive microorganisms in the host. Based on evidence from the much more studied terrestrial systems, the evaluation of marine animal–fungal symbioses under varying environmental conditions may well prove to be critical in predicting ecosystem response to global change, including effects on the health of sessile marine animals.

**Keywords: marine fungi, ascidian, marine sponge, coral health**

## **INTRODUCTION**

There is no consensus on the definition of marine fungi, even though it is clear that the grouping of marine fungi is primarily based on an ecological rather than a taxonomical basis (Kohlmeyer and Kohlmeyer, 1979; Hyde et al., 2000). Commonly used descriptions include "marine-derived" or "marine-associated" fungi, yet "facultative-marine" and "obligate-marine" would, most likely, best distinguish between fungi isolated from marine niches versus those requiring the marine environment. Much of the research on these organisms has focused on fungi obtained from, or found associated with, marine substrates such as mangroves, wood, and sediments (Jones, 2011b and references within), as well as on fungal infections of marine mammals (Higgins, 2000).

The occurrence of fungal associations with other organisms within the marine environment has been reported and discussed for over a century, including the question of whether fungi can grow in sea water at all (Murray, 1893). One of the earliest reports on actual parasitism by a marine fungus (albeit on an alga) was documented that same year by Church (1893). A significant landmark in the study of fungi from the marine environment was the report by Barghoorn and Linder (1944) who in addition to their descriptions of marine-derived fungi stated that "The fact that a score or more of species have been described as occurring in the sea is of importance since it shows that fungi not only tolerate salt water, but indeed that marine conditions furnish a normal habitat for the relatively small number of fungi that have become adapted to it." Since then, the number of fungal species (over 800) described from marine environments and the rate at which they are currently being described indicate that the marine fungal community is larger than originally considered (Jones et al., 2009; Jones, 2011a,b). Nonetheless, whereas the *Symbiodinium* and marine prokaryotic and viral communities and their possible contributions to the niches they reside in received increased attention during the last decades (e.g., Breitbart, 2012; Garren and Azam, 2012; Weber and Medina, 2012; Moitinho-Silva et al., 2014), our current understanding of the fungal communities in marine environments is still extremely limited (Amend et al., 2012). The fungal kingdom is estimated to comprise approximately 1.5 million species, with less than 10% of them described to date (Hawksworth, 2001). As the identification of fungi associated with sessile marine animals and the realization of the ecological significance they may have is fairly recent, it is highly likely that many new species (some with unique attributes/ecological roles) within these niches have yet to be isolated/identified.

In this short review, evidence accumulated for the presence and association of fungi with corals, sponges, and ascidians will be provided as well as discussion of the potential impact fungi may prove to have in these niches.

## **CORALS**

The phylum Cnidaria, comprising approximately 10,000 species (Zhang, 2011), has been the most widely studied with regard to fungal prevalence. While the presence of fungi in coral hosts is acknowledged in the literature (Kendrick et al., 1982; Le Campion-Alsumard et al., 1995; Bentis et al., 2000; Golubic et al., 2005), insight as to the mechanistic nature of the interactions between the two is scarce.

In an in-depth SSU-based amplicon sequencing/transcriptomic analysis of the fungal community associated with the coral *Acropora hyacinthus*, the authors determined the presence of a diverse, metabolically active, fungal community (Amend et al., 2012). Interestingly, they also uncovered a phylogenetically diverse core assemblage of fungi consistently associated with *A. hyacinthus*. The community was correlated with the host rather than differences in environment (e.g., water temperature) or the presence of *Symbiodinium* partners in the sample (Amend et al., 2012). Metagenomic analysis of bleaching *A. millepora* revealed a 3-fold increase in fungi-like sequences, yet the role of fungi during this stress is unclear (Littman et al., 2011). In analysis of the *Porites astreoides* metagenome, fungi were found to be highly prevalent and have been implied to play a possible role in at least two processes of the nitrogen cycle within the coral, suggesting a positive role of fungi, along with other associated microorganisms within the holobiont (Wegley et al., 2007).

Recent incorporation of transcriptome-based analyses has also provided evidence for the presence of fungal species in cnidarians, such as in the case of the sea anemone *Aiptasia* in which changes in the microbial transcriptome have been monitored under conditions in which dinoflagellates were present versus aposymbiotic (bleached) states (Lehnert et al., 2014). In what way these changes influence the coral is not yet clear, yet given the fact that some corals (like members of the temperate genus *Astrangia*) go through seasonal changes in algal presence/activity (Dimond and Carrington, 2007), fluctuations in the presence of fungi or their activity may well be expected to accompany or be involved in seasonal or environmental changes. In addition to the variations in fungal community composition or activity, altered spatial distribution of core/resident fungi may also occur, as has been described in *A. formosa*, where fungi resident in healthy tissue of the coral proliferate into the skeletal cavity in stressed animals (Yarden et al., 2007). An additional form of interaction may involve a succession in biotic/abiotic factors involved in conferring a disease or syndrome. Such is the case described for black band and density-banding of *Porites* spp., where following an increase in the abundance of endolithic algae, these are attacked by fungi, which proliferate and are associated with production of the dark pigment (Priess et al., 2000). Hence, the combinations of temporal and spatial presence of fungi, along with their activities within the host, appear to be dynamic and can clearly impact coral health. Determining the kinetics of fungal association with the host and other holobiont constituents is key to the eventual understanding of their ecological significance.

Endolithic fungi have been shown to elicit a defensive response in corals, and their association has thus been suggested to be parasitic, rather than saprophytic (or mutualistic), in nature (Le Campion-Alsumard et al., 1995; Bentis et al., 2000; Golubic et al., 2005; Raghukumar and Ravindran, 2012). Perhaps the most striking example for potential parasitism has been the study of *Aspergillus sydowii*, a common terrestrial fungus suggested to cause an epidemic in the sea fan *Gorgonia ventalina* that can also confer symptoms to *G. flabellum* (Geiser et al., 1998; Kim and Harvell, 2004). The ecology and host response of sea fan coral Aspergillosis has been recently reviewed by Burge et al. (2013). The disease-causing agent was suggested to be introduced as dust-borne propagules originating from the Sahara (Garrison et al., 2003). However, evidence of panmixia and lack of isolation by distance, along with high genetic diversity of the *A. sydowii* isolates analyzed and lack of evidence supporting the presence of the fungus in sampled African dust, suggest that a single origin of the pathogen or a limited number of introductions is probably not the basis of Caribbean sea fan Aspergillosis (Rypien, 2008; Rypien et al., 2008). Furthermore, the presence of additional opportunistic fungi associated with sea fans (including sea fans in the Pacific; Barrero-Canosa et al., 2013) suggests that Aspergillosis, and related diseases, may be caused by more than one fungal species. A changing environment may also affect the nature of associations, via the status of the holobiont, as some coral species have been shown to exhibit a temperature-dependent immune response, including changes in melanin production, circulating amoebocytes and antioxidant levels (Mydlarz et al., 2010). These data imply that the potential detrimental (or beneficial) effects in which fungi are involved may be more complex than originally perceived and are based on a combination of factors such as pathogen origin, host health, environmental conditions and the presence of other resident or invasive microorganisms in the host. Furthermore, it is highly conceivable that some fungi may alter their function (e.g., from commensals to opportunists/pathogens) as a result of changes in environmental conditions, host vigor or holobiont composition/activity.

Taken together, most studies performed on coral-associated fungi have focused on either parasitic or opportunistic interactions. The majority of these describe either the fungal species associated with coral or potential detrimental outcomes of fungal presence within the animals. The importance of another key form of interaction, mutualism, has yet to be probed in detail, even though support for this notion has been clearly discussed (Wegley et al., 2007). Evidence from plants and terrestrial animals suggests that it will not be surprising to find such interactions in coral (and other sessile marine animals), yet the actual nature of mutualistic relationships has yet to be substantiated.

## **SPONGES**

It is not surprising to find fungi associated with sponges, along with a plethora of other microorganisms (Hentschel et al., 2012). The fact that 40–60% of the sponge biomass is comprised of associated microorganisms, along with the nature of sponge feeding, based on the filtering of vast volumes of water (known to harbor fungal propagules), provides a high potential for the presence of fungi. Over a decade ago, Morrison-Gardiner (2002) isolated over a hundred fungal strains from marine sponges. Höller et al. (2000) isolated fungi from 16 species of sponges from tropical, subtropical, and temperate waters. In both cases, the fungi belonged to both marine and ubiquitous genera and produced diverse metabolites, many of which were previously unknown. Eighty-five fungal taxa have been isolated from the Mediterranean Sea sponge *Ircinia variabilis* (formerly *Psammocinia* sp.), mainly by using a "sample compressing" method, in combination with fungicide-amended media (Paz et al., 2010). Here, too, abundant "terrestrial" taxa such as *Acremonium* and *Penicillium* were found, along with previously undescribed *Phoma* and *Trichoderma* species. Interestingly, when some of the *Trichoderma* spp. were examined, a significant number of the analyzed strains were shown to be halo-sensitive, suggesting that not the entire community was preferentially adapted to the marine environment and perhaps some of them were recently introduced or acquired from terrestrial sources (run-off or air-borne; Gal-Hemed et al., 2011).

Overall, the extent and ecological nature of fungal–sponge associations remain to be determined. Evidence for specificity of fungal communities within different sponges has been provided in the case of *Suberites zeteki* and *Mycale armata* (Gao et al., 2008; Li and Wang, 2009), yet to what extent fungal community signatures are hallmarks of different sponge microbiomes will need to be further established once additional data are collected. Based on metagenomic and gene enrichment analysis of the deep sea sponge *Lamellomorpha* sp., Li et al. (2014) proposed that eukaryotic symbionts, including fungi, along with their prokaryotic counterparts, can exhibit different metabolic potentials, especially in nitrogen and carbon metabolism. Furthermore, genetic interactions among them may also occur. Interestingly, even though fungi have been isolated from sponges, contrary to corals, no evidence for the presence of hyphal structures within sponges has surfaced over the years despite in-depth ultrastructural analyses carried out on a wide range of sponges (Maldonado et al., 2005). This raises the possibility that fungal propagules (presumably sexual or asexual spores, along with hyphal fragments) do not germinate to produce the standard hyphal cells but rather adapt to direct spore production in the liquid environment (Martinelli, 1976). This, of course, does not refer to unicellular yeasts, which under most circumstances do not produce hyphae or hyphae-like structures. In fact, based on TEM analysis, Maldonado et al. (2005) described the vertical transmission of a unicellular fungus via the oocytes in *Chondrilla nucula*, further substantiating the close potential interaction between fungus and sponge.

In contrast to coral, assessment of disease-causing microbial agents in sponges appears to be more difficult, perhaps due to the complexity of the microbial communities associated with sponges (Webster and Taylor, 2012). However, the potential of a sponge to be a symptomless carrier of a coral disease-causing agent has been suggested, following the isolation of a sea fan-infectious strain of *A. sydowii* from *Spongia obscura* (Ein-Gil et al., 2009). The possibility that some fungi may prove to be omnivorous opportunists or pathogens raises the question of whether fungi are potential threats to more than one sessile marine species.

## **ASCIDIANS**

To date, only a few reports describing fungal association with ascidians have been published and most analyses have focused on prokaryotic symbionts of these animals (Blasiak et al., 2014; Erwin et al., 2014). Perhaps the most detailed analysis of fungal diversity in ascidians has been reported by Menezes et al. (2010), who isolated representatives of over 15 fungal genera (the most abundantly described were *Trichoderma*, *Phoma*, and *Cladosporium* spp.) cultured from *Didemnum* spp. In earlier studies, Morrison-Gardiner (2002) described over two dozen fungal strains isolated from various chordates, the majority of which were from the genera *Aspergillus* and *Cladosporium*. At least one case of an ascidian-associated fungus (*Camarosporium* sp.) was included in that report. Other reports, mainly related to bioprospecting for novel natural compounds, include representatives of known fungal genera such as *Penicillium* in the case of isocoumarin derivatives (Xin et al., 2007) and *Aspergillus* strains that produce cytotoxic compounds, isolated from *Eudistoma vannamei* (Montenegro et al., 2012). In addition, a likely new species of the family Diatrypaceae was isolated from an unnamed ascidian in the Bahamas (Oh et al., 2005, 2010). Coculturing this fungus with a fungal-associated marine bacterium induced a fungal diterpene biosynthesis pathway, suggesting a potential fungal-bacterial interaction which could result in the production of secondary metabolites. Whether or not this cometabolic process occurs in the natural environment is not known.

## **ABUNDANT BIOACTIVE COMPOUNDS IN THE HOLOBIONT MAY HAVE ROLES IN HOST-FUNGAL AND FUNGAL-MICROBE INTERACTIONS**

Most of the emphasis in the study of sessile marine animal–fungal interactions has been placed on the outcome of such interactions with respect to the hosts, and far less on the tolerance/susceptibility and physiological adaptation of the fungi involved. Chemical defense mechanisms have been considered as part and parcel of the success and survival strategies evolved in many sessile marine invertebrates. It is thus highly probable that some of the chemicals produced are directly involved in determining the composition and activity of resident and introduced microbiota (as well as other potential predators or invaders). The antifungal nature of some of the chemicals produced by such animals and their accompanying microflora can provide a potentially hostile environment for fungi (Donia and Hamann, 2003; Wang et al., 2013 and references within). It is not always clear which holobiont constituent (or combination of constituents) produces many of the compounds described. Nonetheless, cultured fungi isolated from sessile marine animals have been demonstrated to be capable of producing novel chemicals which have the potential to affect the marine hosts and their microbiome. These findings have been driven both by ecological/physiological-based research as well as bioprospecting efforts aimed at obtaining novel bioactive compounds (e.g., Höller et al., 2000; Wang et al., 2008; Cohen et al., 2011; Panizel et al., 2013 and reviewed in Raghukumar, 2008; Imhoff et al., 2011; Rateb and Ebel, 2011). Though research on the cross-talk between the host and its associated microorganisms has mainly focused on the bacterial components, some studies on host–fungal interactions have been carried out. Wang et al. (2013) reviewed some of the chemical defense arsenal produced in sponges.
This includes a broad range of compounds (from amino acid derivatives to nucleosides, macrolides, porphyrins, terpenoids, aliphatic cyclic peroxides, and sterols, just to mention a few groups) that have been shown to exhibit biological activity on a variety of cell types (sponges and others). Anti-microbial chemicals are also produced within sponges, and have been shown to specifically inhibit some Gram-positive or Gram-negative bacteria as well as fungi. An example of a potential coral chemical response to fungi was reported by Ward et al. (2007) who showed that extracts from *A. sydowii*-infected *G. ventalina* inhibited the fungus significantly more than those from uninfected coral. Furthermore, these extracts exhibited increased potency when obtained from sea fans grown at elevated temperatures (in which higher severity of disease was also observed). Perović-Ottstadt et al. (2004) described a cell surface receptor that recognizes (1 → 3)-β-D-glucans (a fungal cell wall component) in the demosponge *S. domuncula*, and showed that recognition is followed by a transduction of signals which includes tyrosine protein phosphorylation. Such a recognition event can lead to exclusion/killing of the fungus or, conversely, the establishment of a symbiotic interaction. Regardless, the richness of the microbial community associated with sponges is clear evidence for the discriminatory function of some of the chemicals and corresponding cellular reactions produced.

Understanding the effects of antifungal compounds (and other responses) produced by sessile marine holobionts and the role of fungal-produced compounds warrants additional research. This includes, in addition to chemical structure elucidation, the mode of anti-fungal action and the cellular signals involved both in the host and the fungus. Perhaps most challenging of all is determining the significance/ecological roles of these compounds in the holobiont under varying environmental conditions.

## **DO FUNGI HAVE THE POTENTIAL OF ALTERING THE RESPONSE OF SESSILE MARINE ANIMALS TO GLOBAL CHANGE?**

Documentation concerning global climate changes and their possible effects on marine animals is accumulating (Harvell et al., 2002; Hoegh-Guldberg and Bruno, 2010). However, little is known about the possible changes/effects they may have on the fungi associated with these animals and on their possible involvement or role in the consequences of a changing environment. The effect of global changes on plant performance in the case of fungal pathogens is far better documented than in the marine system. Kivlin et al. (2013) performed a meta-analysis of publications on the indirect responses to global changes (including enriched CO2 levels and warming) of plants associated with four classes (leaf, arbuscular mycorrhizal fungi, ectomycorrhizal fungi, and dark septate fungi) of endophytes (fungi residing within the host for at least part of their life cycle without causing apparent detrimental effects). For the fungal groups analyzed, presence of the symbionts did not significantly influence the effect of CO2 on host plant performance. Warming had a differential effect, based on the symbiont class. An increase in temperature can promote plant growth and in the case of the presence of beneficial endophytes such an increase can further enhance plant biomass accumulation. Conversely, the presence of a potentially detrimental endophyte (such as those belonging to the *Phialocephala fortinii* s.l.–*Acephala applanata* species complex) can result in a reduction in plant biomass accumulation under elevated temperature conditions. The changes analyzed involve an array of factors ranging from the genetic background of the host to the colonization capacity of the symbiont as well as interactions with other fungal strains/taxa (Reininger et al., 2012).
To what extent the interactions in the marine niches in terms of the combined genetic backgrounds of sessile marine animal hosts and the fungi involved, along with environmental changes, are analogous to those described in some of the terrestrial systems, has yet to be analyzed. The study of a few aspects of such interactions has been initiated as stated in some of the examples above (Ward et al., 2007; Amend et al., 2012) and some marine-derived fungi have been shown to alter their growth rates as a result of combined changes in salinity and temperature (Lorenz and Molitoris, 1992).

Oceanic pH levels are predicted to decrease by approximately 0.4 units within the current century, due to increases in atmospheric CO2 concentrations (Caldeira and Wickett, 2005; Hoegh-Guldberg et al., 2007). Meron et al. (2012) reported no significant changes in coral-associated bacterial communities along a natural pH gradient in the vicinity of volcanic vents, yet fungi were not monitored in that study. Nonetheless, under controlled conditions, a marked increase in the fungal community was observed under stress conditions in the coral *P. compressa* (Thurber et al., 2009). Krause et al. (2013) have used a microcosm experimental system to follow the changes in fungal colony forming units (CFUs) under altered pH conditions and found significant increases (9-fold) in fungal CFUs when pH levels were reduced to 7.82, when compared to the ambient 8.0 levels present in some North Sea locations. These data suggest that dissemination and possible proliferation of some fungi may increase as a result of acidification of the marine environment. Should they prove to affect holobiont health, the evaluation of marine animal–fungal symbioses under varying environmental conditions may well prove to be critical in predicting ecosystem resilience to global change.
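Because pH is a logarithmic scale, the pH shifts quoted above correspond to larger relative changes in hydrogen ion concentration than the raw numbers may suggest. The following minimal Python sketch illustrates the arithmetic using only the definition pH = -log10([H+]). The function name and the starting value of 8.0 used for the century-scale projection are illustrative assumptions and are not taken from the cited studies; the fold change depends only on the size of the pH difference.

```python
def hydrogen_ion_fold_change(ph_start: float, ph_end: float) -> float:
    """Return the fold change in [H+] when pH moves from ph_start to ph_end.

    Since pH = -log10([H+]), [H+] = 10 ** (-pH), and the ratio
    [H+]_end / [H+]_start simplifies to 10 ** (ph_start - ph_end).
    """
    return 10.0 ** (ph_start - ph_end)


# Projected decrease of about 0.4 pH units (starting pH of 8.0 is assumed
# here purely for illustration).
century_fold = hydrogen_ion_fold_change(8.0, 7.6)

# Microcosm comparison quoted in the text: ambient 8.0 versus reduced 7.82.
microcosm_fold = hydrogen_ion_fold_change(8.0, 7.82)

print(f"0.4-unit drop: {century_fold:.2f}-fold increase in [H+]")
print(f"8.0 -> 7.82:   {microcosm_fold:.2f}-fold increase in [H+]")
```

Under these assumptions, a 0.4-unit drop corresponds to roughly a 2.5-fold increase in hydrogen ion concentration, and the 0.18-unit difference in the microcosm comparison to roughly a 1.5-fold increase.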

## **SESSILE MARINE ANIMAL-ASSOCIATED FUNGI – A THREAT?**

Fisher et al. (2012) listed several cases where fungi were not only shown to be threats to plant/animal health but have also caused severe die-offs and extinctions in wild species and can impact biodiversity in a manner affecting food security and ecosystem health. In terrestrial plant systems, negative density dependence has been suggested as a mechanism that promotes community diversity on the basis of preventing competitive exclusion of some species by the more dominant ones (the Janzen–Connell hypothesis; Janzen, 1970; Connell, 1971). Recently, Bagchi et al. (2014) have experimentally tested the impact of fungi on plant diversity and have shown that fungicide treatment reduced the effective number of plant species by approximately 16%, implicating pathogenic fungi in promoting diversity within the seedling community analyzed. The fundamentals of the Janzen–Connell model describing the promotion of diversity have been shown to be valid in an aquatic ecosystem. Marhaver et al. (2013) demonstrated that survivorship of *Orbicella faveolata* (formerly *Montastraea faveolata*) planulae was not only dependent on their proximity to adults of other taxa, but also significantly reduced in water collected near conspecific adults when compared to sterile water. The latter data support the prediction that mortality is due to the activity of host-specific detrimental microorganisms. Whether or not fungi are involved in the described mortality and if such occurrences are relevant to coral that reproduce mainly by fragmenting (or other sessile animals) has yet to be determined, yet the foundations for examining the model with regard to fungal activity have been set.
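For readers unfamiliar with the "effective number of species" metric mentioned above, the following Python sketch shows one common way such a value can be computed and compared between treatments. It assumes the Hill number of order 1 (the exponential of the Shannon index), which may differ from the exact metric used in the cited study. The function names and the seedling counts are hypothetical and serve only to illustrate the calculation.

```python
import math
from typing import Sequence


def effective_species_number(abundances: Sequence[float]) -> float:
    """Hill number of order 1: exp(Shannon entropy) of the abundance vector.

    A community of S equally abundant species returns exactly S; uneven
    communities return a smaller value.
    """
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    shannon = -sum(p * math.log(p) for p in proportions)
    return math.exp(shannon)


def percent_change(control: float, treated: float) -> float:
    """Percent change of the treated value relative to the control value."""
    return (treated - control) / control * 100.0


# Hypothetical seedling counts per species (illustration only, not data
# from any cited study).
control_plot = [30, 28, 25, 22, 20]
fungicide_plot = [60, 30, 15, 10, 5]

d_control = effective_species_number(control_plot)
d_fungicide = effective_species_number(fungicide_plot)
print(f"control: {d_control:.2f}, fungicide: {d_fungicide:.2f}")
print(f"change: {percent_change(d_control, d_fungicide):.1f}%")
```

The design choice here is that an effective number, unlike a raw diversity index, is expressed in units of "equally common species", so a percentage reduction between treatments has a direct interpretation.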

In conclusion, fungi have gained recognition as both resident and functional components of sessile marine animal microbiomes. The extent of their diversity and function has only begun to be revealed and determining their contributions and effects on the health, maintenance, survival and proliferation of sessile marine animals is a timely task, especially under a changing environment. It is tempting to try and extrapolate from terrestrial to marine environments. One can speculate that fungi may well contribute to sessile marine animal communities in related ways and that changes in the prevalence of fungi associated with marine species may have significant effects on maintaining the biodiversity of the latter. Based on the vast evidence on the complexity and diversity of plant-microbe interactions, we can expect an equivalently rich variety of relationships in sessile organisms of the marine environment.

#### **ACKNOWLEDGMENTS**

I thank Yitzhak Hadar and Monica Medina for their comments and suggestions on this review.

#### **REFERENCES**


agents of biological control for arid-zone agriculture. *Appl. Environ. Microbiol.* 77, 5100–5109. doi: 10.1128/AEM.00541-11


Le Campion-Alsumard, T., Golubic, S., and Priess, K. (1995). Fungi in corals – symbiosis or disease – interaction between polyps and fungi causes pearl-like skeleton biomineralization. *Mar. Ecol. Prog. Ser.* 117, 137–147. doi: 10.3354/meps117137


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 April 2014; paper pending published: 26 April 2014; accepted: 29 April 2014; published online: 15 May 2014.*

*Citation: Yarden O (2014) Fungal association with sessile marine invertebrates. Front. Microbiol. 5:228. doi: 10.3389/fmicb.2014.00228*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Yarden. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The extended phenotypes of marine symbioses: ecological and evolutionary consequences of intraspecific genetic diversity in coral–algal associations

## *John E. Parkinson and Iliana B. Baums\**

Department of Biology, The Pennsylvania State University, University Park, PA, USA

#### *Edited by:*

Monica Medina, The Pennsylvania State University, USA

#### *Reviewed by:*

Melissa Susan Roth, University of California, Berkeley, USA Yvonne Valles, Centro Superior de Investigación en Salud Pública, Spain Benjamin Minault Fitzpatrick, The University of Tennessee, USA

#### *\*Correspondence:*

Iliana B. Baums, Department of Biology, The Pennsylvania State University, 208 Mueller Laboratory, University Park, PA 16802, USA e-mail: baums@psu.edu

Reef-building corals owe much of their success to a symbiosis with dinoflagellate microalgae in the genus Symbiodinium. In this association, the performance of each organism is tied to that of its partner, and together the partners form a holobiont that can be subject to selection. Climate change affects coral reefs, which are declining globally as a result. Yet the extent to which coral holobionts will be able to acclimate or evolve to handle climate change and other stressors remains unclear. Selection acts on individuals and evidence from terrestrial systems demonstrates that intraspecific genetic diversity plays a significant role in symbiosis ecology and evolution. However, we have a limited understanding of the effects of such diversity in corals. As molecular methods have advanced, so too has our recognition of the taxonomic and functional diversity of holobiont partners. Resolving the major components of the holobiont to the level of the individual will help us assess the importance of intraspecific diversity and partner interactions in coral–algal symbioses. Here, we hypothesize that unique combinations of coral and algal individuals yield functional diversity that affects not only the ecology and evolution of the coral holobiont, but associated communities as well. Our synthesis is derived from reviewing existing evidence and presenting novel data. By incorporating the effects of holobiont extended phenotypes into predictive models, we may refine our understanding of the evolutionary trajectory of corals and reef communities responding to climate change.

**Keywords: coral, genotype interactions, intraspecific diversity, mutualism,** *Symbiodinium*

#### **INTRODUCTION**

Fundamentally, evolution by way of natural selection acts on functional variation among individuals within a species (Darwin, 1859; Fisher, 1930). When the success of two (or more) organisms is linked, such as among mutualistic symbiotic partners, variation within one species interacts with the variation in the other, as well as with the environment (Thompson, 2005; Warren and Bradford, 2014), potentially driving direct and indirect evolutionary interactions (Wootton, 1994; Rowntree et al., 2014). Thus, the adaptive capacity of symbiotic organisms will be underestimated when intraspecific variation is not accounted for (Fisher, 1930). The increasing scale of reef degradation has called into question the ability of coral–algal symbioses to acclimate or evolve to deal with a changing world (Lasker and Coffroth, 1999; Glynn et al., 2001; Hoegh-Guldberg et al., 2002; Reshef et al., 2006; Brown and Cossins, 2011; Barshis et al., 2013). Acclimation occurs over the course of an organism's lifetime, while evolution takes place over generations; the time frame for both processes can overlap when evolution is particularly rapid (Hairston et al., 2005). Despite the fact that host and symbiont genomes are often decoupled each generation, coevolution clearly occurs (Thornhill et al., 2014). Current forecasts of reef perseverance do not explicitly incorporate the effects of intraspecific diversity driving coevolution among coral–algal partners because such effects have rarely been assessed.

Classically, biodiversity has been measured at the species level, and such diversity has generally had positive effects on higher-order community diversity, function, and resilience (Balvanera et al., 2006). Modern molecular techniques are revolutionizing species delineation in coral holobionts. Using genetic and complementary phenetic evidence, many traditional host species designations and higher-order relationships are being reevaluated (Fukami et al., 2004, 2008; Huang et al., 2011; Pinzon and LaJeunesse, 2011; Budd et al., 2012; Keshavmurthy et al., 2013). Microalgae (including *Symbiodinium*) are likewise receiving renewed taxonomic attention emphasizing molecular data (LaJeunesse et al., 2012, 2014; Jeong et al., 2014; Leliaert et al., 2014).

More recently, intraspecific diversity has been revealed to be just as important as (in some cases, more important than) interspecific diversity in explaining variation in associated community traits (Hughes et al., 2008). For example, the diversity, richness, and abundance of arthropods on trees are better explained by the number of tree genotypes than by tree species diversity (Shuster et al., 2006; Whitham et al., 2006). However, similar investigation is lacking for corals and their microalgae. Few studies have addressed whether genotype diversity of a coral species affects the diversity of its symbiont community or other associated invertebrates and vertebrates. This is partly because the resolution of species (let alone individuals) in the coral holobiont has been contentious (Stat et al., 2012). Within a given coral species, morphologically distinct colonies can be genetically identical owing to phenotypic plasticity among asexual fragments (Highsmith, 1982; Todd, 2008), while genetically disparate colonies may share striking resemblance (e.g., Pinzon and LaJeunesse, 2011). All *Symbiodinium* species and cell lines look superficially similar even under high magnification (LaJeunesse, 2001). Without high-resolution genetic markers, intraspecific effects on the ecology and evolution of coral–algal symbioses have been difficult to quantify accurately.

Population genetic microsatellite markers are increasingly used to study both scleractinian hard corals (Lopez et al., 1999; Maier et al., 2001; Magalon et al., 2004; Severance et al., 2004; Baums et al., 2005a, 2009; Underwood et al., 2006; Mangubhai et al., 2007; van Oppen et al., 2007; Isomura and Hidaka, 2008; Starger et al., 2008; Andras and Rypien, 2009; Wang et al., 2009; Concepcion et al., 2010; Polato et al., 2010; Banguera-Hinestroza et al., 2013; Chen et al., 2013; Davies et al., 2013) and *Symbiodinium* (Santos and Coffroth, 2003; Magalon et al., 2004; Pettay and LaJeunesse, 2007, 2009; Bay et al., 2009; Howells et al., 2009; Kirk et al., 2009; Andras et al., 2011; Pinzon et al., 2011; Wham et al., 2011, 2014). Armed with such markers, it is now possible to sample a single coral colony and determine not only its host and symbiont species compositions, but also to resolve unique multilocus genotypes (i.e., individuals) within each species. However, only rarely have both host and symbiont genotype composition been analyzed in concert (Andras et al., 2011, 2013; Pettay et al., 2011; Pettay and LaJeunesse, 2013; Thornhill et al., 2013; Baums et al., 2014; Prada et al., 2014b). So far this has only been done in a general population survey context, with most evidence suggesting that the genetic structuring of the host and the symbiont are not the same (e.g., Baums et al., 2014). No studies have manipulated host-symbiont pairings to examine genotype level interspecific interactions while unambiguously resolving both partners. Such work is routine in the study of terrestrial mutualisms, but represents a new frontier in the marine realm.

Researchers now stand poised to answer previously intractable questions about the nature of coral–algal symbioses. In this review, we argue that intraspecific diversity is an important component shaping interspecific interactions within a holobiont, and that such interactions may influence the evolutionary trajectory of reef ecosystems faced with a changing climate. We have four major goals: (i) to briefly review the role of intraspecific diversity in other systems, (ii) to describe what we currently know about intraspecific diversity in coral hosts and algal symbionts, (iii) to present preliminary data illustrating the potential extent of functional intraspecific diversity in coral–algal systems, and (iv) to identify research questions and methodologies that will shed further light on this understudied component of marine microbial symbiosis ecology. We posit two central, testable hypotheses: (i) genotypic interactions between coral hosts and algal symbionts influence functional diversity and therefore evolutionary capacity in coral holobionts, and (ii) intraspecific diversity among corals affects reef community function. Dawkins (1982) introduced the concept of "extended phenotypes" to incorporate the indirect effects of genes on the environment independent of the individual bodies in which they reside. In this framework, unique combinations of coral and *Symbiodinium* individuals might be thought of as holobionts with unique extended phenotypes that may shape reef community dynamics.

## **SIGNIFICANCE OF INTRASPECIFIC FUNCTIONAL DIVERSITY IN OTHER SYSTEMS**

The importance of genotypic diversity (i.e., the number of distinct multilocus genotypes) among symbiotic partners is apparent in terrestrial systems, where genotype level resolution has been used in manipulative experiments for years. An illustrative example is the association between plants and arbuscular mycorrhizal fungi (AMFs). These fungi penetrate vascular plant roots, transmitting nutrients from the surrounding soil to the host. AMFs are obligate symbionts—they cannot survive without a host plant. Numerous studies have recorded symbiont genotype effects on host performance (and vice versa; reviewed by Johnson et al., 2012). For instance, Koch et al. (2006) inoculated clonal carrot roots with genetically distinct AMFs belonging to the single species *Glomus intraradices*; host root growth varied with symbiont genotype. Munkvold et al. (2004) monitored host and symbiont growth among holobionts composed of distinct genotype pairings; growth varied depending on intraspecific partner combinations. Scheublin et al. (2007) found that intraspecific symbiont identity affected the outcome of competitive interactions between the host and other plant species. Similar effects are found in other systems. Among genetically identical host clones of pea aphids, pathogen resistance was conferred to different degrees by distinct strains of a facultative bacterial symbiont species (Lukasik et al., 2013b). Conversely, host pathogen resistance and fecundity varied among host genotypes associating with a clonal symbiont (Lukasik et al., 2013a). These examples highlight that intraspecific diversity among holobiont partners can be high and drive complex interactive effects that mediate holobiont fitness in multiple ways. The same is likely true in coral–algal systems.

The effects of host-symbiont pairings are reflected not only in growth, competitive interactions, pathogen resistance, and fitness, but also in gene expression patterns. Heath et al. (2012) explored the molecular underpinnings of partner interactions by partitioning genetic variation in legume and rhizobium transcriptomes into additive and interactive effects. The authors found that interactions between plant and rhizobium genotypes drove symbiont gene expression changes and transitioned host transcription from a nuclear dominated profile (i.e., basic housekeeping) to a plasmid dominated profile (i.e., nitrogen fixation). These polymorphisms altered access to nitrogen fixation, the chief benefit of symbiosis to the plant and a determinant of host reproductive fitness. When the fitness of one species is influenced by the genotype of its symbiotic partner, coevolution is possible (Thompson, 2005; Wade, 2007). Fitness and expression differences among distinct holobionts exemplify natural variation available to coevolutionary selection (Heath et al., 2012). Evolutionary innovation can arise from transcriptional variation in response to short- and long-term stress (Lopez-Maury et al., 2008), and such variation has been described in marine organisms responding to selective pressures associated with climate change, including temperature (e.g., DeSalvo et al., 2010; Barshis et al., 2013; Polato et al., 2013) and acidification (Pespeni et al., 2013). In the coral–algal system, genetically determined expression differences among holobionts responding to stress might be subject to natural selection and lead to adaptation.
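The idea of partitioning trait variation into additive and interactive effects can be illustrated with a minimal sketch. It assumes a balanced factorial design in which every host genotype is paired with every symbiont genotype and a single mean trait value is recorded per pairing. The function name `partition_gxg`, the table layout, and the numbers are hypothetical and are not drawn from any of the cited studies.

```python
from statistics import mean


def partition_gxg(table):
    """Split trait variation into host, symbiont, and interaction sums of squares.

    table[i][j] is the mean trait value (e.g., growth) of the holobiont formed
    by host genotype i and symbiont genotype j. A balanced, fully crossed
    design with one value per cell is assumed (hypothetical layout).
    """
    n_host = len(table)
    n_sym = len(table[0])
    grand = mean(v for row in table for v in row)

    # Additive (main) effects: deviation of each row or column mean from the grand mean.
    host_eff = [mean(row) - grand for row in table]
    sym_eff = [mean(row[j] for row in table) - grand for j in range(n_sym)]

    # Interaction: what remains in each cell after removing both additive effects.
    inter = [
        [table[i][j] - grand - host_eff[i] - sym_eff[j] for j in range(n_sym)]
        for i in range(n_host)
    ]

    return {
        "ss_host": n_sym * sum(e ** 2 for e in host_eff),
        "ss_symbiont": n_host * sum(e ** 2 for e in sym_eff),
        "ss_interaction": sum(v ** 2 for row in inter for v in row),
    }


if __name__ == "__main__":
    # Hypothetical growth values: rows are host genotypes, columns are symbiont genotypes.
    growth = [[1.0, 2.0], [2.0, 5.0]]
    print(partition_gxg(growth))
```

A non-zero interaction term in such a table would indicate that the effect of a symbiont genotype depends on which host genotype it is paired with. A real analysis would also need replication within cells and a significance test, which this sketch omits.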

Increasingly, diversity below the species level is recognized to be an important force shaping community dynamics, particularly among ecosystem engineers (Whitham et al., 2006; Bolnick et al., 2011). In pea aphid studies, symbiont genotype affected the extent of pathogen sporulation in dead hosts, which likely altered community dynamics by limiting or expanding the exposure of other aphids to the fungus (Lukasik et al., 2013a,b). In the Pacific Northwest, locally derived leaf litter from red alder trees (*Alnus rubra*) decomposed more rapidly than litter derived from trees at other riparian zones, indicating intraspecific variants might drive community-level changes to ecosystem flux (Jackrel and Wootton, 2013). In poplar trees (*Populus* sp.), plant genotype was shown to explain three times as much variation in associated arthropod communities as species level differences (Shuster et al., 2006). Similarly, soil microbial community composition was driven largely by intraspecific genotype (Schweitzer et al., 2008). For the marine eelgrass (*Zostera marina*), genotypically diverse beds were more resistant to disturbance by grazing geese, as were their associated invertebrate fauna (Hughes and Stachowicz, 2004). Intraspecific diversity improved not only seagrass biomass and density but also epifaunal abundance over the course of a warm water temperature anomaly (Reusch et al., 2005). Thus, genotypic diversity in seagrasses has both first-order effects on species resistance and/or resilience as well as second-order effects on ecosystem function. Corals are also marine ecosystem engineers; similar second-order effects may have a profound influence on reef function.

In summary, results from terrestrial studies suggest by extension that intraspecific variation among coral holobionts has the potential to scale up to influence the diversity, resilience, and function of entire reef ecosystems, including associated microbes, algae, invertebrates, and vertebrates. The critical first step in all future studies of intraspecific diversity will be establishing the individual identities of each coral colony and *Symbiodinium* strain under investigation.

## **DEFINING CORAL–ALGAL DIVERSITY**

The coral holobiont is composed of more than just the host and *Symbiodinium*. Within host tissues, additional symbionts may include apicomplexa (Toller et al., 2002; Kirk et al., 2013a,b), nitrogen-fixing cyanobacteria (Lesser et al., 2004), other bacteria (Rohwer et al., 2002), viruses (Wilson et al., 2005), archaea (Kellogg, 2004; Wegley et al., 2004), and cell-associated microbial aggregates (Work and Aeby, 2014), not to forget organisms found in the host skeletal structure such as endolithic algae (Odum and Odum, 1955; Shashar and Stambler, 1992) and fungi (Le Campion-Alsumard et al., 1995; Bentis et al., 2000). The partner for which the most data are available and for which the role in the symbiosis is most clearly understood is *Symbiodinium*; we therefore use the term "symbiont" to refer only to *Symbiodinium* in this review.

When it was first described, taxonomic diversity among *Symbiodinium* was assumed to be low (Freudenthal, 1962; Taylor, 1984). Over time, it was recognized that the genus included many different species based on various morphological, physiological, and early genetic data (Schoenberg and Trench, 1980a,b,c). Molecular diversity in the group achieved more recognition when *Symbiodinium* were divided into low-resolution clades based on rDNA (Rowan and Powers, 1992), and some corals were found to associate with members of different symbiont clades simultaneously (Rowan et al., 1997). At the time, it was acknowledged that the genetic distances between clades were similar to those observed among different genera and even families of dinoflagellates, an observation borne out by more recent molecular analyses (Stern et al., 2010; Ladner et al., 2012). Higher resolution was achieved by dividing *Symbiodinium* into subcladal "types" using hypervariable regions of nuclear and chloroplast rDNA markers (LaJeunesse, 2001, 2002; Santos et al., 2003a). Now, a suite of hierarchical molecular markers and population genetic data are being used to define precise species boundaries and refine *Symbiodinium* taxonomy (LaJeunesse et al., 2012, 2014; Jeong et al., 2014). Though it has yet to be physically observed, overwhelming molecular evidence indicates that *Symbiodinium* engage in sex at some frequency in the wild, either within the coral habitat or in the external environment (Baillie et al., 2000; LaJeunesse, 2001; Santos et al., 2004; Sampayo et al., 2009; Pettay et al., 2011; Baums et al., 2014; Chi et al., 2014; Thornhill et al., 2014). Sympatric symbionts found in distinct colonies of the same host species in the same environments exhibit diagnostic microsatellite allele frequencies, revealing genetic recombination within but not between groups (LaJeunesse et al., 2014). This satisfies the biological species concept, demonstrating that molecular data can be used to consistently delimit species boundaries in *Symbiodinium*, a necessity for investigating intraspecific diversity.

Similar molecular data have been used to resolve coral host species, which feature the added complication of introgressive hybridization among closely related taxa (Ladner and Palumbi, 2012). Often, current taxonomic designations based on morphological characteristics are at odds with genetic evidence. For example, the entity designated *Stylophora pistillata* was recently determined to be composed of at least four species based on cytochrome oxidase I sequencing (Keshavmurthy et al., 2013), while multiple markers suggest that three of the Caribbean poritid morphospecies (*Porites divaricata, P. furcata,* and *P. porites*) should be collapsed into one entity (Prada et al., 2014a). Even within a single genus, molecular data indicate some lineages should be lumped while others should be split (Pinzon et al., 2013). In contrast to *Symbiodinium*, coral species will be easier to resolve by combining data from experimental crosses, morphological assessments, and genetic sequencing (Budd et al., 2010, 2012). Proper species identification is critical when designing experiments to understand coral evolution. Failure to recognize that colonies belong to distinct species when collecting population genetic data can produce misleading signatures of structure and hybridization (Combosch et al., 2008; Combosch and Vollmer, 2011). Failure to recognize cryptic species can also mask important differences in ecological interactions and population dynamics (Boulay et al., 2014). Once coral species boundaries are established, it then becomes possible to assess functional diversity among individuals within species.

Biologically, the notion of an individual is difficult to define in corals. On one level, there is the smallest physical unit representing the organism's genome (the polyp). On another, there are units of contiguous tissue that connect multiple clonal polyps (the colony). In macro-scale contexts, these colonies are the ecologically significant units on a reef. Sometimes, physically separated colonies are clones (i.e., share the same genome), whereas others are genetically distinct. Throughout this review, when attributed to a given organism, we use the term "genotype" to refer to the concept of genome identity within a species (that is, genetically distinct individuals). All coral colonies that share an identical genome together comprise a "genet," with each member colony referred to as a "ramet." Coral genotypic diversity thus refers to the number of distinct genets on a reef. *Symbiodinium* are also capable of both clonal and sexual propagation, but their unicellular nature requires that we use different terminology than for corals. A single *Symbiodinium* cell contains one genome and functions independently of all other cells. When residing within host cells, *Symbiodinium* typically reproduce asexually and generate homogenous populations of cells derived from a single ancestor. We use the term "strain" to refer to this physical collection of clonal symbiont cells hosted within a coral colony. In contrast, sexual reproduction leads to new strains. Multiple *Symbiodinium* strains may be present within the habitat provided by a single coral colony, and multiple strains from either a single or many species may be present.
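The genet and ramet terminology above can be made concrete with a short sketch that groups sampled colonies into genets by exact matching of their multilocus microsatellite genotypes and then counts the distinct genets. The colony identifiers, allele sizes, and function names are hypothetical. Exact matching is also a simplifying assumption, since real datasets must allow for scoring error and somatic mutation. The normalized richness index (G - 1)/(N - 1) is a common summary in clonal population studies and is included here only as an illustration; it is not taken from this review.

```python
from collections import defaultdict


def assign_genets(colonies):
    """Group colony IDs (ramets) into genets by identical multilocus genotype.

    colonies maps a colony ID to a tuple of per-locus allele pairs, e.g.
    ((150, 154), (200, 200)). Allele order within a locus is ignored.
    """
    genets = defaultdict(list)
    for colony_id, genotype in colonies.items():
        # Sort alleles within each locus so (150, 154) and (154, 150) match.
        key = tuple(tuple(sorted(locus)) for locus in genotype)
        genets[key].append(colony_id)
    return list(genets.values())


def genotypic_richness(colonies):
    """Return (G - 1) / (N - 1) for G genets among N sampled colonies."""
    n = len(colonies)
    if n < 2:
        raise ValueError("at least two sampled colonies are required")
    g = len(assign_genets(colonies))
    return (g - 1) / (n - 1)


if __name__ == "__main__":
    # Hypothetical two-locus genotypes for three sampled colonies.
    sampled = {
        "colony_1": ((150, 154), (200, 200)),
        "colony_2": ((154, 150), (200, 200)),  # same genet as colony_1
        "colony_3": ((150, 150), (200, 204)),
    }
    print(assign_genets(sampled))
    print(genotypic_richness(sampled))
```

In this toy sample, two of the three colonies are ramets of one genet, so the reef holds two genets.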

It has become clear that in many coral–algal symbioses, individual host colonies are dominated by a single symbiont species (that is, >99% of the symbiont cells in host tissue belong to a single species). In the Caribbean and Eastern Pacific, where most high-resolution assessments have been performed, individual colonies are dominated not only by one species, but by one strain within that species. An example would be the *Acropora palmata–Symbiodinium "fitti"* association, where pairings of single host and symbiont genotypes produce holobionts that may each exhibit unique extended phenotypes (**Figure 1**; Baums et al., 2014; Parkinson et al., submitted). In fact, in studies where microsatellite markers have been used to characterize both partners, the host:symbiont genotype ratio is one:one in >70% of colonies (Goulet and Coffroth, 2003a,b; Santos et al., 2003b; Kirk et al., 2005; Pettay and LaJeunesse, 2007, 2009, 2013; Thornhill et al., 2009, 2013; Andras et al., 2011; Pettay et al., 2011; Pinzon et al., 2011; Baums et al., 2014; Prada et al., 2014b). This outcome falls in line with the predictions of basic population theory, as closely related organisms generally compete for similar resources, leading to competitive exclusion among similar species (Gause, 1934; Hardin, 1960). However, there are certainly other associations where strains from multiple *Symbiodinium* species codominate in one host colony (e.g., Rowan et al., 1997; van Oppen et al., 2001), such that the holobiont can be viewed as a more complex community. The presence of low-abundance or "background" symbionts representing <0.1% of the symbiont population may also shape some holobiont dynamics (see **Box 1**). This range of partnership complexity provides exciting potential for deconstructing the processes shaping the evolution of mutualisms across reef habitats.

## **INTRASPECIFIC FUNCTIONAL DIVERSITY IN CORALS: CLASSIC STUDIES**

Traditionally, common garden and reciprocal transplant experiments have been used to test for functional differences of genotypes in plants (e.g., Hufford and Mazer, 2003) and corals (Potts, 1984; Edmunds, 1994; Bruno and Edmunds, 1998; D'Croz and Mate, 2004; Smith et al., 2007). Typically, colonies from environmentally distinct sites (e.g., shallow vs. deep or inshore vs. offshore) are reciprocally transplanted to test how they perform relative to native corals. In parallel, colonies from both sites may be transplanted to a third location to test how they perform relative to each other in a new common environment. As one might expect, studies on reef-building corals have found species that are characterized by generalist genotypes (Smith et al., 2007), species that show local adaptation (D'Croz and Mate, 2004; Kenkel et al., 2013), and species that harbor both generalist and specialist genotypes (Potts, 1984). Such studies address the performance of the specific combination of coral and *Symbiodinium* genotypes in the experimental units. However, the relative contribution of each partner to holobiont performance has been difficult to measure.
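The logic of a reciprocal transplant can be sketched as a simple "local versus foreign" contrast: at each site, the performance of native colonies is compared with the mean performance of colonies transplanted from elsewhere. The site names, performance values, and function name below are hypothetical, and the sketch assumes one mean performance value per origin and site combination, with every origin also serving as a transplant site.

```python
from statistics import mean


def local_vs_foreign(perf):
    """Return, per site, native performance minus mean transplant performance.

    perf[origin][site] is the mean performance (e.g., growth rate) of colonies
    sourced from `origin` and grown at `site`. Every origin is assumed to also
    be a transplant site (fully reciprocal, hypothetical layout).
    """
    contrasts = {}
    for site in perf:
        # Colonies grown at this site but sourced from any other site.
        foreign = [perf[origin][site] for origin in perf if origin != site]
        contrasts[site] = perf[site][site] - mean(foreign)
    return contrasts


if __name__ == "__main__":
    # Hypothetical growth rates for a two-site (shallow vs. deep) design.
    growth = {
        "shallow": {"shallow": 10.0, "deep": 4.0},
        "deep": {"shallow": 6.0, "deep": 7.0},
    }
    print(local_vs_foreign(growth))
```

Positive contrasts at every site would be consistent with local adaptation, whereas contrasts near zero would be more consistent with generalist genotypes. Because each experimental unit is a specific host and symbiont combination, such a contrast describes holobiont performance and cannot by itself separate the contribution of each partner.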

Prior to the mid-1990s, confirmation of the distinctness or clonality of coral colonies was difficult because of the lack of genetic data and the fact that coral clones are generally impossible to distinguish visually (even histo-incompatibility proved unreliable; Heyward and Stoddart, 1985). For example, in a classic common garden reciprocal transplant experiment, Potts (1984) mounted clonal fragments of *Acropora* sp. sourced from each of five environments from a single reef onto common wire grids. Five replicate grids were distributed among the five locations. Source location (a proxy for host genet) drove non-random differences in growth rate and survivorship among individual colonies in shared environments. After eight years of observation, colonies with different origins did not converge on a common morphology to match the native colonies at their new locations, indicating low phenotypic plasticity in this coral (at least morphologically) and further supporting a genetic component of coral performance. However, the corals sampled for this study may have included two cryptic species that in some environments can only be distinguished with molecular techniques (Potts, 1984; Ayre et al., 1991).

In another example, host genotype effects on thermotolerance were examined (Edmunds, 1994). To minimize the chance of incorrectly assigning genets, patches of *Orbicella* (= *Montastraea*) *annularis* complex that were physically clustered in groups attached by contiguous skeleton but unconnected by coral tissue were considered as clones of the same genotype because such a formation suggests a common origin. The author showed that bleaching colonies were aggregated rather than randomly distributed on the reef, and that these aggregations corresponded to genotype identities. While the spatial distribution of bleaching colonies might alternatively be explained by the distribution of colonies with distinct *Symbiodinium* associations and therefore thermotolerances, it is unlikely that the experimental colonies harbored different symbiont species. This is because the corals were located at a common depth over a small spatial scale, reducing the number of light microhabitats that lead to unique symbiont associations within the host species complex (Rowan et al., 1997). In a second experiment, subfragments from large colonies of *Porites porites* located more than 15 m apart (thus suggesting they belonged to different genets) were experimentally exposed to elevated temperatures for three days and their symbiont densities were measured. Despite having similar densities at the start of the experiment, the putatively distinct genotypes showed different rates of symbiont loss (or, in one case, gain) after thermal stress exposure (Edmunds, 1994).

The coral literature is rife with similar examples where genotype level effects seemed apparent, but actual genotypes were not resolved explicitly. Given the spatial range over which host ramets of the same genet can be distributed (e.g., from <1 to >70 m in *Acropora palmata*; Baums et al., 2006), it may not be appropriate to assume that swimming a certain distance greatly reduces the chance of collecting a clonal colony. For fine-scale ecological questions, it will be necessary to incorporate molecular confirmation of intraspecific diversity. As genomics-empowered tools become less expensive and more accessible, a greater number of studies are taking advantage of fine-scale resolution.

#### **Box 1 | Low abundance** *Symbiodinium.*

Given that DNA evidence is the primary means by which *Symbiodinium* are both detected and identified, our ability to quantify symbiont diversity is restricted by the molecular techniques used. Not all techniques and markers have equal resolving power (Sampayo et al., 2009). One of the most common markers, the internal transcribed spacer 2 (ITS2) of the ribosomal array, is multicopy and undergoes concerted evolution, maintaining functional and non-functional rare variants in the species population (Dover, 1982). Much debate has focused on the information lost when using denaturing gradient gel electrophoresis (DGGE) to screen out rare intragenomic variants (Apprill and Gates, 2007; Thornhill et al., 2007). This methodology conservatively underestimates total symbiont diversity within a coral colony while revealing the dominant or codominant taxa (i.e., the most numerically abundant and presumably ecologically relevant species). In the process, minor strains that comprise <5% of the total symbiont population within host tissues go unrecognized. With the development of several sensitive qPCR assays (Ulstrup and Van Oppen, 2003; Ulstrup et al., 2007; Correa et al., 2009; Mieog et al., 2009) and the advent of next generation sequencing (Kenkel et al., 2013; Green et al., 2014), it has been possible to survey the diversity of "background" populations of *Symbiodinium* below the detection limits of DGGE and traditional PCR.

In a recent survey of 26 coral taxa previously thought to be "specific" (restricted to associations with one *Symbiodinium* clade), background symbionts from multiple clades could be detected with qPCR assays in nearly all host species (Silverstein et al., 2012). When a non-symbiotic coral species was screened as a control, the assays returned false positives from putatively contaminant symbionts trapped in the mucus or gut cavity 9% of the time. This rate of natural contamination is quite high, but nevertheless, background strains are more common than previously thought. It is understood that most corals that acquire their symbionts from the environment each generation are promiscuous during early ontogeny, associating with multiple symbiont taxa that are not dominant in adults (Coffroth et al., 2001, 2006; Santos et al., 2003a; Little et al., 2004; Abrego et al., 2009a; Byler et al., 2013; Cumbo et al., 2013; Poland et al., 2013; Yamashita et al., 2013). Since the capacity for non-specific associations is present in juveniles, it is not necessarily surprising that multiple clades were detected in low abundance in adult corals (Santos et al., 2004; Baird et al., 2007; Baker and Romanski, 2007). It is currently unclear whether the presence of a background symbiont implies that it is functionally relevant to the holobiont. Though corals may have always been open to infiltration by background symbionts, host-symbiont specificities have evolved multiple times regardless. Detection of low-abundance *Symbiodinium* cells in corals suggests that hosts may be open environments where small numbers of heterologous symbionts are entering and exiting the system on a regular basis. If commensal, these symbionts may move passively through the system without engaging in symbiosis. If parasitic, they may trigger a host rejection response or may be competitively displaced by the dominant symbiont, such that only a small number are present in a coral at a given time. Finally, if mutualistic, they may be fully engaged in the fitness of the holobiont despite their rarity. For example, rare symbionts may be important if they contribute a different but essential metabolic resource than the dominant symbiont strain (analogous to rare members of the bacterial biosphere; reviewed by Pedros-Alio, 2012), or if they can increase sufficiently in number to replace a compromised dominant symbiont should environmental conditions change (Buddemeier and Fautin, 1993; Baker et al., 2004; Berkelmans and van Oppen, 2006).

Studies are needed to distinguish between these competing scenarios. So far, the few experiments that have successfully tracked background symbionts during natural environmental extremes suggest that they are not viable sources of persistent acclimation to stress, at least in terms of replacing the dominant symbiont. After a cold-water bleaching event in the Gulf of Mexico, most *Pocillopora damicornis* colonies with mixed symbiont communities did not "shuffle" (cf. Baker, 2003) to the more thermally tolerant species (McGinley et al., 2012), instead remaining stable despite environmental variability. In corals sampled before, during, and after a 2005 bleaching event in Barbados, background populations of the thermally tolerant *Symbiodinium trenchii* increased in prevalence prior to bleaching, but declined to pre-stress levels over the next 2 years of non-stressful conditions (LaJeunesse et al., 2009). However, functional relevance may not be tied directly to cell numbers (a rare strain may always be rare and yet essential). Such a hypothesis has yet to be tested in corals, though bacterial analogs are known. For example, a single rare bacterium representing 0.006% of the total cell count in peat accounted for a much larger proportion of the biome's sulfate reduction relative to its abundance (Pester et al., 2010). This is an active research area, and despite our current data deficiency, future studies may provide more convincing evidence of the functional relevance of background *Symbiodinium*.

## **INTRASPECIFIC FUNCTIONAL DIVERSITY IN CORALS: GENOMICS-EMPOWERED STUDIES**

A series of recent studies on the Mediterranean Red Coral (*Corallium rubrum*) demonstrates the utility of a genomics approach to studies of marine evolutionary ecology. This particular coral lacks *Symbiodinium*, reducing the complexity of the system. First, neutral microsatellite markers were used to differentiate populations of *C. rubrum* (Ledoux et al., 2010a,b; Costantini et al., 2011). Populations were structured along a depth gradient that reflected distinct, stable thermal environments. This genetic structure corresponded with variability in *C. rubrum* thermal stress limits (Torrents et al., 2008). Since the multilocus genotypes of each colony were established, individuals from each population could be targeted to assess physiology. Colonies were subfragmented and exposed to various heat stress regimes in common garden aquaria, while the expression of key heat shock proteins was monitored via qPCR (Haguenauer et al., 2013). After assessing variability in gene expression among individuals within different populations, the authors found evidence consistent with local adaptation driven by environmental variability, and argued for a trade-off between reduced responsiveness of metabolic genes and frontloading of thermotolerance genes. Critically, environmental heterogeneity at shallow sites seemed to select for phenotypically plastic individuals, as reflected by high genetic variability in the shallow population versus low genetic variability in the populations at depth. This work emphasizes the potential importance of cryptic diversity in coral communities and the significance of marginal populations in providing evolutionary novelty (Bell and Gonzalez, 2011; Boulay et al., 2014). It also exemplifies a useful strategy for investigating genotype level effects driving thermal adaptation in symbiotic corals.

The reductive approach of assessing the performance of either the host or symbiont in isolation is more difficult for symbiotic scleractinian corals. One methodology is to experiment with coral larvae, which often lack *Symbiodinium* prior to settlement. Crosses of gametes collected from distinct adult genets produce large batches of offspring with known heritage. Controlled crosses between adjacent *Acropora palmata* individuals showed that full sibling larval batches were unequally affected by thermal stress, which influenced swimming speeds and developmental rates (Baums et al., 2013). The same larval batches exhibited diverse transcriptional responses to thermal stress depending on their heritage (Polato et al., 2013), revealing a higher-than-expected degree of molecular variation in this endangered coral species. Among *Acropora palmata* adults, some individuals were sexually incompatible (Baums et al., 2013). This was not due to general infertility as most individuals were capable of producing viable larvae when crossed with a compatible genotype. Clearly, intraspecific diversity has fitness consequences in corals. In another experiment, Polato et al. (2010) identified colonies of *Orbicella faveolata* at two distant locations that belonged to one panmictic population according to neutral markers. At each location, locally derived aposymbiotic larval batches were exposed to a common thermal stress. The larvae exhibited both shared and location-specific transcriptional responses, strongly suggesting the existence of local adaptation despite ongoing gene flow among locations.

Because some *Symbiodinium* can be maintained in culture, their performance can be measured independent of a host. *Symbiodinium goreaui* is a host-generalist symbiont featuring a global distribution (LaJeunesse, 2005). In one study, *Symbiodinium goreaui* was identified in two *Acropora tenuis* reefs located several hundred kilometers apart with average temperature differences of ∼2°C (Howells et al., 2009). After establishing via microsatellite genotyping that these reefs are likely inhabited by distinct populations of *Symbiodinium goreaui*, symbionts from each population were isolated and cultured (Howells et al., 2012). Cultures were then exposed to elevated temperatures, and photochemical performance was monitored. *Symbiodinium goreaui* cultured from the warmer reef population showed a smaller decline in photochemical performance at elevated temperature relative to the population from the cooler reef, even after >30 asexual generations in culture. Similar *in vitro* experiments have shown within-species differences in physiology (see *Symbiodinium* Growth Rates in Culture). Thus, when separated, both corals and *Symbiodinium* show intraspecific variation in thermotolerance that appears to have a heritable genetic component: the raw material of natural selection.

Howells et al. (2012) further tested whether intraspecific variation influences holobiont performance when the host and symbiont are combined. They used the distinct *Symbiodinium goreaui* populations to inoculate aposymbiotic larvae of the coral *Acropora millepora*. After growing to a sufficient size, symbiotic coral juveniles were then exposed to ambient or elevated temperatures, and both symbiont and host physiology were assessed. The symbiont population from the warmer reef showed optimal photochemical performance at elevated temperature, and coral juveniles associating with these symbionts grew rapidly with no signs of bleaching and minimal mortality at high temperature. In contrast, the symbiont population from the cooler reef experienced chronic photodamage at high temperature, and the juveniles inoculated with this population grew slowly and suffered high bleaching and mortality at high temperature. Symbiont and host thermotolerance correlated, showing a strong influence of symbiont physiology on holobiont performance even below the species level. In a similar vein, Kenkel et al. (2013) used microsatellites and identified performance differences between two populations of the coral *Porites astreoides*. In this case, both hosted the same *Symbiodinium* species as determined by characterization of the symbiont community using high-throughput sequencing of the ITS2 marker. Host structure appeared to be maintained by differences in variable inshore vs. stable offshore thermal regimes. In a common garden, offshore holobionts were less tolerant of experimental heat stress, showing elevated bleaching and reduced growth compared to inshore holobionts. Despite the homogeneity of the symbiont population, *Symbiodinium* in offshore hosts experienced lower photochemical efficiency during heat stress than those associating with inshore hosts. These results support the contention that the host plays an important role in holobiont thermotolerance (Baird et al., 2009a). Moreover, it is not just the host species, but intraspecific populations that may determine performance.

To assess host and symbiont adaptive potential, Csaszar et al. (2010) identified two coral populations of a single species (*Acropora millepora*). Each population associated with a different symbiont species. Heritability estimates for key thermal response traits within each host population showed the symbionts to be relatively more capable of adapting to climate change than the host. However, as the authors recognized, while hosts were genotyped to the level of individuals, symbionts were only resolved to the sub-cladal type (approximately species) level. Though the relative comparisons between host and symbiont heritability must be interpreted with caution, this study sets an excellent precedent, as it is one of the few to both measure intraspecific trait variation in coral hosts and confirm the unique identity of the host genets involved.

### **PRELIMINARY EVIDENCE IN A GENOMICS AGE**

While the previously mentioned studies mostly examined intraspecific variation at the population level, genotype-level effects have only rarely been explored (Baums et al., 2013; Polato et al., 2013). Now that both major components of the coral holobiont can be genotyped to individuals, the doors have opened for high-resolution investigations of partner interactions. Here we highlight preliminary evidence that variation at the genotype level may be extensive in both corals and *Symbiodinium*, and that unique partner pairings drive unique responses to stress. This work tests the first of our major hypotheses: that interactions between partners contribute to functional diversity that may subsequently be acted upon by selection. We argue that to truly understand how corals may respond to the myriad selective pressures of a changing climate it will be necessary to assess the contribution of intraspecific diversity to holobiont performance.

#### **CORAL GROWTH IN RESTORATION NURSERIES**

With global reef degradation reaching alarming levels, marine managers have developed methods to rear coral fragments *in situ* for restoration purposes. A typical "coral gardening" approach involves several steps: donor colonies are identified and fragmented; the pieces are attached to artificial substrate; the fragments are grown together in a common nursery plot; ultimately, these aquacultured colonies are outplanted to depauperate reefs (Rinkevich, 1995, 2005). The goal is to increase coral biomass, diversity, and reproductive capacity, as well as to restore the reef ecosystem and associated fauna (Precht, 2006). During the growth phase, the underwater nurseries serve as common gardens where environmental conditions are roughly equivalent for all colonies, and observed differences can be attributed mostly to genetic effects (Baums, 2008). Maternal effects or acclimation to the donor colony's source environment can carry over to affect performance in the nursery, but these factors have been difficult to assess. Restoration nurseries have greatly expanded in the Caribbean, where the endangered *Acropora cervicornis* and *Acropora palmata* have been targeted for extensive management (Lirman et al., 2010; Johnson et al., 2011; Young et al., 2012). As part of the process, hundreds of colonies in the Florida Reef Tract have been genotyped at multilocus microsatellite markers (e.g., Baums et al., 2010), and many have been monitored for growth and mortality for several years (Griffin et al., 2012; Lirman et al., submitted).

These nurseries provide a unique and under-utilized resource for investigations of genetic influence on coral performance. The few studies that have been conducted with nursery-reared colonies all point to intraspecific genotype effects on growth. For example, Bowden-Kerby (2008) reared genets of acroporid corals from both forereef and backreef environments in a common garden backreef nursery. In contrast to the study of Potts (1984), here source population (a proxy for host/symbiont genotype) was more important than environment in determining growth rate; source was determined to be a significant factor in 75% of tests compared to 44% for environment. Forrester et al. (2013) transplanted *Acropora palmata* fragments from two source locations to a common garden at a third. In the first year, there were no observed differences between groups, but when the experiment was repeated, growth rate varied by source. In a concurrent experiment, colonies were subdivided into fragments and reciprocally transplanted to "home" and "away" environments. Clonal fragments moved "away" grew more slowly, revealing a slight home-field advantage and a combined influence of both environment and genotype.

Griffin et al. (2012) reared fragments of several *Acropora cervicornis* genotypes at a line nursery in Puerto Rico and confirmed the hypothesis that linear tissue extension rate varied among individuals. A re-analysis of this data set is presented here (**Figure 2**). In addition to discriminating growth rates by host genotype, we also separated colonies into depth classes by their relatively shallow (9–10.5 m) or deep (10.5–13 m) positions in the line nursery, as depth was a significant factor in model analysis (Griffin et al., 2012). We removed measurements from individuals attached to the lines by cable ties, as this method was shown to negatively affect growth (Griffin et al., 2012). To use the terminology of that study, host genotypes are referred to by color names or capital letters. Repeat genotyping of host samples derived from the nursery (rather than the donor colony, as in the original study) revealed that genotypes "A" and "B" were actually identical, as were "Blue" and "Brown," so their measurements were pooled. Additional genotyping of the dominant symbiont associated with each colony revealed that three of the four hosts shared a clonal *Symbiodinium "fitti"* (ITS2 type A3) strain; host "A/B" associated with a unique *Symbiodinium "fitti"* strain. The "Green" host genotype grew faster than all others, regardless of depth. Identical individuals generally grew faster at greater depth. Interestingly, the "Blue/Brown" genotype deviated significantly from the "A/B" and "Yellow" genotypes when reared in deep but not shallow depths. This indicates an interaction between host genotype and environment. Symbiont genotype did not appear to affect growth, since the most deviant host genotypes shared a clonal symbiont, while two of the hosts that did not differ in growth rate at either depth associated with distinct symbionts. To test this particular hypothesis rigorously, it will be necessary to track the growth rates of ramets of the same host genet each associating with distinct symbiont genotypes; such cases are difficult (though not impossible) to find in nature (Baums et al., 2014).
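The pooling of genotypes "A" with "B" and "Blue" with "Brown" described above rests on recognizing samples that share an identical multilocus genotype. The following is a minimal sketch of that idea, not the procedure used in the re-analysis: the function name, the exact-match rule, and all allele sizes are hypothetical, and a real clone assignment would also need to account for scoring error and the probability of identity.

```python
from collections import defaultdict


def group_identical_genotypes(samples):
    """Group sample labels that share an identical multilocus genotype.

    `samples` maps a label to a list of loci; each locus is a pair of
    allele sizes. Exact matching is an assumption made for illustration.
    """
    groups = defaultdict(list)
    for label, loci in samples.items():
        # Sort alleles within each locus so (150, 154) equals (154, 150).
        key = tuple(tuple(sorted(locus)) for locus in loci)
        groups[key].append(label)
    return sorted(sorted(labels) for labels in groups.values())


# Hypothetical allele sizes at two microsatellite loci (illustration only).
samples = {
    "A": [(150, 154), (200, 200)],
    "B": [(154, 150), (200, 200)],
    "Green": [(150, 150), (198, 200)],
}
clone_groups = group_identical_genotypes(samples)
```

Samples that fall into the same group would be treated as ramets of one genet, and their growth measurements pooled.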

#### *Symbiodinium* **GROWTH RATES IN CULTURE**

It has long been possible to culture *Symbiodinium* independent of the host in artificial media (McLaughlin and Zahl, 1959). By now a great many studies have been performed *in vitro*, revealing key physiological differences among *Symbiodinium* in terms of cold tolerance (Thornhill et al., 2008a; McBride et al., 2009), heat tolerance (Robison and Warner, 2006; Suggett et al., 2008), light tolerance (Iglesias-Prieto and Trench, 1994, 1997a; Hennige et al., 2009), and acidification tolerance (Brading et al., 2011). Typical phenotypic traits that have been monitored under different environmental conditions include culture growth rates and photochemical efficiencies (e.g., Robison and Warner, 2006; Thornhill et al., 2008a). Given the state of *Symbiodinium* taxonomy prior to the 1990s, most early work assumed the physiology of a few cultures was representative of the entire genus. Over the years, more studies have incorporated clades, types, and species designations, broadening our understanding of the extensive physiological diversity within *Symbiodinium*, but none have resolved individuals within species.

Using a hierarchical molecular approach, two species of Clade B *Symbiodinium* were recently delineated with a combination of nuclear, mitochondrial, and chloroplast markers (LaJeunesse et al., 2012). *Symbiodinium minutum* associates with the globally distributed anemone *Aiptasia* sp. in tropical waters, while *Symbiodinium psygmophilum*, despite being present in the tropics, is cold-tolerant and typically engages in symbiosis with the scleractinian corals *Astrangia poculata*, *Cladocora caespitosa*, and *Oculina patagonica* in high latitudes of the Atlantic Ocean. In a preliminary experiment designed to test the hypothesis that phenotypic differences could be detected among genotypes within and between *Symbiodinium* species, we reared several monoclonal cultures of *Symbiodinium minutum* and *Symbiodinium psygmophilum* genotypes under identical temperature and light regimes and monitored growth rates (in terms of asexual propagation of cells). We used the micro-culture methods of Rogers and Davis (2006) as a guide, and reared all cultures in ASP-8A media (Ahles, 1967). First, genotype uniqueness was confirmed with microsatellite repeat length variation (i.e., different alleles) at nuclear marker *Sym15* (Pettay and LaJeunesse, 2007) and sequence variation at chloroplast *psbAncr* (Moore et al., 2003; LaJeunesse and Thornhill, 2011) for each culture of each species. Next, individual cells from synchronized cultures (*n* = 3 genotypes per species) were transferred to 96-well plates via cell sorter such that each culture was represented in sixteen replicate wells with ∼5 cells each at the start of the experiment. Plates were incubated at 25°C and a 12:12 light/dark photoperiod at 60 microeinsteins. As cells divided asexually, plates were observed under a microscope at 400× magnification and total cell counts were recorded at noon every 2 days for 2 weeks. The growth rates were exponential, so data were log transformed and fit to a linear regression. The slope of the line was recorded as the growth metric per replicate well. The entire experiment was repeated twice.
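The growth metric described above (the slope of log-transformed cell counts regressed on time) can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the analysis script used for the experiment: the function name `growth_rate` and the example counts are hypothetical, and an ordinary least-squares fit on natural-log counts is assumed.

```python
import math


def growth_rate(days, counts):
    """Least-squares slope of ln(cell count) against time, in ln(cells)/day."""
    if len(days) != len(counts) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    logs = [math.log(c) for c in counts]
    mean_t = sum(days) / len(days)
    mean_y = sum(logs) / len(logs)
    # Slope = covariance(time, ln count) / variance(time).
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    return sxy / sxx


# Hypothetical well in which the count doubles every 2 days.
days = [0, 2, 4, 6, 8]
counts = [5, 10, 20, 40, 80]
rate = growth_rate(days, counts)  # equals ln(2) / 2 for perfect doubling
```

Fitting on the log scale turns exponential growth into a straight line, so a single slope summarizes each replicate well.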

The *Symbiodinium psygmophilum* culture PurpPFlex failed to grow (as occasionally happens with recent transfers of older cultures, such as in this case), so ultimately we collected data from three *Symbiodinium minutum* genotypes (Mf1.05b, rt-002, and rt-351) and two *Symbiodinium psygmophilum* genotypes (Mf10.14b.02 and rt-141). Initial growth was highly variable until at least ten cells were present in each well, and cell counts became difficult after concentrations reached >200 cells/well, so our analysis included only the portion of each well's time series that fell within this count range. After failing to detect differences between experiments (*t*-test, *t*(101) = 1.25, *p* = 0.216), data from each run were combined and analyzed together.
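The count window described above (at least ten and at most 200 cells per well) amounts to a simple filter on each well's time series. The sketch below assumes inclusive bounds and uses a hypothetical function name and hypothetical counts; it is not the code used in the study.

```python
def counts_in_window(days, counts, low=10, high=200):
    """Keep (day, count) pairs whose count lies within [low, high]."""
    return [(d, c) for d, c in zip(days, counts) if low <= c <= high]


# Hypothetical well counted every 2 days for 2 weeks.
days = [0, 2, 4, 6, 8, 10, 12, 14]
counts = [5, 9, 18, 37, 70, 150, 290, 540]
usable = counts_in_window(days, counts)
```

Only the retained observations would then contribute to the growth estimate for that well.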

We noted a difference in average growth rate between species, reported here as ln(cells/day) ± 95% Confidence Interval. For *Symbiodinium minutum,* the growth rate was 0.34 ± 0.01, while for *Symbiodinium psygmophilum* it was 0.31 ± 0.02 (ANOVA, *F*(1,120) = 4.97, *p* = 0.028). When separated by genotype, it became clear this effect was driven by the *Symbiodinium psygmophilum* culture rt-141, which had much lower growth rates than all other cultures regardless of species (ANOVA, *F*(4,117) = 7.39, *p* < 0.001; **Figure 3**). The diversity in growth rates among *Symbiodinium psygmophilum* may reflect the genetic diversity within this species, which exceeds that of *Symbiodinium minutum* (LaJeunesse et al., 2012). The key result is that phenotypic variation among genotypes within *Symbiodinium* species can potentially exceed that found between members of different species. This situation is not uncommon in nature (Bangert et al., 2006), but to date, the concept of intraspecific variation within *Symbiodinium* species has largely been ignored. A vast preponderance of reef ecology studies only measure symbiont phenotypes at the low-resolution "clade" or intermediate-resolution "type" level. Using crude averages from these higher-order taxonomic rankings may miss important dynamics taking place among or within species. Further experimentation with more *Symbiodinium* genotypes (both *in vitro* and *in hospite*) will be necessary to confirm these findings. The fact that such patterns can be found even among a small number of strains implies that, much like in corals, intraspecific variation in symbiont physiology may be extensive.
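The species-level and genotype-level comparisons reported above are one-way ANOVAs on per-well growth rates. A minimal sketch of how the *F* statistic and its degrees of freedom arise is given below; the function name and the growth rates are hypothetical and do not reproduce the data behind Figure 3, and the *p*-value (which requires the *F* distribution) is left out.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and spare observations")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical growth rates, ln(cells/day), for three cultures.
rates_by_genotype = {
    "genotype_1": [0.33, 0.35, 0.34],
    "genotype_2": [0.34, 0.36, 0.35],
    "genotype_3": [0.27, 0.29, 0.28],
}
f_stat, df_between, df_within = one_way_anova(list(rates_by_genotype.values()))
```

With five genotypes and 122 wells, the same bookkeeping gives the 4 and 117 degrees of freedom quoted in the text.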

#### **HOST GENOTYPE EFFECTS ON CLONAL SYMBIONT PERFORMANCE**

In their analysis of host and symbiont population interactions, Howells et al. (2012) showed that intraspecific variation among *Symbiodinium* influenced the growth of host juveniles in a laboratory setting. But does intraspecific variation among hosts influence symbiont performance? To address this question, we recently took advantage of the *Acropora palmata–Symbiodinium "fitti"* association, wherein individual host colonies usually associate with only one clonal symbiont strain (Baums et al., 2014). Distinct coral genets that shared a clonal *Symbiodinium "fitti"* strain were identified growing close to each other within a natural common garden. Highly sensitive qPCR assays established that no other *Symbiodinium* could be detected within the colonies. Fragments were removed, exposed to cold shock *ex situ* (10°C for 3 days), and monitored for photochemical efficiency changes and acute host transcriptional responses. We found that the photochemical response of the symbiont strain varied depending on which host genotype it associated with (Parkinson et al., submitted). Because all measured *Symbiodinium* were clonal and environmental variation was reduced by the proximity of the colonies, the most parsimonious explanation was that physiological variation among host genotypes drove photochemical differences among the clonal symbiont strains. Experiments designed to test for intraspecific variation should make sure that individual histories are not a confounding factor; the natural common garden proved advantageous for that purpose here.

In a subset of the holobionts exposed to cold, symbiont photochemical efficiency was phenotypically buffered (Waddington, 1942; Bradshaw, 1965; Reusch, 2014), meaning the reaction norm changed relatively little with environmental perturbation. In other host backgrounds, the symbiont strain's response was less buffered. Host expression of iron sequestering and oxygen stress signaling genes correlated with these differences in symbiont performance, suggesting that variation in iron microhabitat and/or redox sensitivity among hosts may mediate clonal symbiont performance during stress. Anecdotally (because sample size was small), the colonies that participated in the annual spawning event had the most buffered symbiont responses. Those colonies with less buffered symbiont responses did not spawn. This result suggests a possible fitness consequence of genotype interactions among holobionts, highlighting the potential evolutionary importance of intraspecific diversity among coral mutualists.

#### **METABOLOMIC ANALYSIS OF SYMBIOTIC AND NON-SYMBIOTIC POLYPS**

The *Astrangia poculata–Symbiodinium psygmophilum* association has been proposed as a model system for investigating coral–algal symbiosis. This scleractinian hard coral is more amenable to aquaculture than exclusively tropical species and exists across a broad latitudinal and temperature range. Uniquely, *Astrangia poculata* colonies often feature both symbiotic and non-symbiotic polyps within the same colony under non-stressful conditions. This attribute allows for experimental investigation into the molecular features that mediate successful symbiotic interactions among hosts and symbionts while controlling for partner genotypes. We generated metabolomic profiles for symbiotic and non-symbiotic polyps dissected from each of three *Astrangia poculata* colonies to provide another example of the insights that can be gained when intraspecific diversity is accounted for in the experimental designs. We also analyzed a *Symbiodinium psygmophilum* monoclonal culture (isolated from a tentacle of *Astrangia poculata*). Methods generally followed Gordon et al. (2013) with minor modifications. Target tissues were snap frozen in liquid nitrogen within 1 min of sampling, then metabolites were extracted in isopropanol:acetonitrile:water (3:3:2) solution. The samples were separated on a Shimadzu 20R UFLC high-performance liquid chromatography system using a C18 column. Mass spectra and tandem mass spectra were obtained in both positive and negative ion mode on an AB SCIEX 5600 Triple TOF. The resulting LC-MS profiles were Pareto transformed to reduce bias from metabolites with large fold changes while preserving the rank and dimensionality of the data (van den Berg et al., 2006).
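The Pareto transformation mentioned above is commonly defined as mean-centering each metabolite and dividing by the square root of its standard deviation (van den Berg et al., 2006). The sketch below assumes that standard definition and the sample standard deviation; the function name and intensity values are hypothetical, and the software actually used for the LC-MS profiles is not described in the text.

```python
import math
import statistics


def pareto_scale(matrix):
    """Pareto-scale columns: (x - column mean) / sqrt(column sample SD).

    Rows are samples and columns are metabolite features.
    """
    scaled_columns = []
    for col in zip(*matrix):
        mean = statistics.fmean(col)
        sd = statistics.stdev(col)
        if sd == 0:
            # A constant feature carries no variation; centre it at zero.
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(x - mean) / math.sqrt(sd) for x in col])
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical intensities: three samples, two features on different scales.
intensities = [[0, 100], [4, 500], [8, 900]]
scaled = pareto_scale(intensities)
```

Dividing by the square root of the standard deviation, rather than the standard deviation itself, shrinks the dominance of high-intensity features while keeping more of the original data structure than unit-variance scaling would.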

Principal component analysis (PCA) clustered polyps by symbiont state more strongly than host genotype (**Figure 4A**). PCA loadings revealed ∼4000 compounds (including isotopic and monoisotopic peaks) that were mainly present in only one of the symbiotic states, driving group clustering. For example, a platelet activating factor (PAF) was observed at much higher levels in non-symbiotic polyps (**Figure 4B**). This metabolite has multiple functions in humans, and may play a role in intracellular signaling (Venable et al., 1993). The single *Symbiodinium* sample fell far from either of the holobiont clusters in the PCA. Certain compounds were observed only in the *Symbiodinium* sample, such as 13E-Docosenamide, the function of which is unclear in *Symbiodinium* (it has been found in the cerebrospinal fluid of mammals; Cravatt et al., 1995). Unfortunately, a majority of metabolites could not be easily annotated, and further work will be required to characterize them. Controlled contrasts should reveal key players in the metabolic interactions that allow the symbiosis to persist. Being able to compare fragments of the same host genotype in two symbiotic states reduces the problem of working with non-model coral species that contain a large amount of genetic variation. That variation would otherwise obscure patterns. This is but one example of how new technologies, when applied to combined and isolated components of the holobiont, will facilitate new insights into marine endosymbiotic mutualisms.

**FIGURE 4 | Preliminary analysis of the** *Astrangia poculata–Symbiodinium psygmophilum* **metabolome. (A)** Principal component analysis of metabolite profiles. Shown are principal components 1 and 2 (x- and y-axis, respectively) of Pareto-transformed metabolite data. Shapes indicate host genotype (n = 3). Black fills correspond to symbiont-rich polyps. White fills correspond to nearly symbiont-free polyps. "S" indicates a sample of a *Symbiodinium psygmophilum* monoculture. "N" indicates a negative control (purified water). *Astrangia poculata* samples cluster by the symbiotic state of the polyps rather than by host genotype. **(B)** Representative profiles for specific metabolites. C16-Lyso-PAF was abundant in non-symbiotic polyps but low in symbiotic polyps and absent in *Symbiodinium* culture. 13E-Docosenamide was mainly present in *Symbiodinium* culture but not in coral tissue. The two unidentified compounds are characteristic of metabolites with greater detection in symbiotic (Unidentified-A) or non-symbiotic (Unidentified-B) polyps. Polato et al. (unpublished data).

## **COEVOLUTIONARY CONTEXT AND CLIMATE CHANGE**

Mutualisms in general (Kiers et al., 2010) and coral–algal associations in particular (Hoegh-Guldberg et al., 2007) are threatened by a changing climate and anthropogenic disturbance. Aside from the extreme case of mutual extinction (Dunn et al., 2009), other negative evolutionary outcomes of changing environmental conditions may include shifts from mutualism to antagonism, switches to inferior partners, and mutualism abandonment (Kiers et al., 2010). Unequal responses to climate shifts between partners can contribute to mutualism breakdown (Warren and Bradford, 2014). Such breakdown is apparent in coral systems, where the "coral bleaching" phenomenon (when hosts and symbionts dissociate due to stress) takes place at temperatures below the upper thermal limits of most free-living microalgae (Berry and Bjorkman, 1980). There is a unique aspect to engaging in symbiosis that makes the intact association more sensitive to temperature changes; this is likely due to the consequences of an oxygen-sensitive animal taking on a photosynthetic symbiont that generates reactive oxygen species under elevated light and temperature conditions (Lesser, 2006; Baird et al., 2009a). While many efforts have been made to assess the adaptive potential of coral holobionts facing rising sea surface temperatures, almost none have considered intraspecific trait variation (but see Csaszar et al., 2010). Such investigation will be needed to more accurately predict the role of coevolution in the coral holobiont response to climate change.

Many corals transmit their symbionts vertically by provisioning eggs with *Symbiodinium* cells (Hirose et al., 2008), but most spawn symbiont-free gametes or larvae (Baird et al., 2009b), and therefore must acquire their algal complement from the environment. In a closed vertical system it is easier to accept that tight coevolution takes place; it is less clear how coevolution plays out when partner genomes are uncoupled every host generation. And yet, there is remarkable stability among holobionts with horizontal transmission. The Caribbean broadcasters in the *Orbicella* genus appear flexible at the clade level (associating with members of Clades A, B, C, and D), but are quite specific at finer-scale resolution, hosting only a few species within each clade (Thornhill et al., 2014). The two lineages of the Caribbean gorgonian *Eunicea flexuosa* each associate exclusively with a corresponding Clade B symbiont (Prada et al., 2014b), while the Caribbean scleractinian *Acropora palmata* typically associates with *Symbiodinium "fitti"* (Baums et al., 2014). These examples, along with a number of other studies and data sets, clearly demonstrate that coevolution takes place in coral–algal systems, with unique host and symbiont combinations (holobionts) being the units of selection (Iglesias-Prieto and Trench, 1997b; LaJeunesse et al., 2004, 2010; LaJeunesse, 2005; Reshef et al., 2006; Finney et al., 2010; Correa and Baker, 2011; Lesser et al., 2013; Thornhill et al., 2013, 2014; Prada et al., 2014b).

We can view the holobiont as a unit of selection because survival may depend on a given host and symbiont genotype combination. It is less clear whether holobionts can be considered strict units of evolution (Maynard-Smith, 1991; Frank, 2011; Heath and Stinchcombe, 2014). Coevolution of the holobiont as a unit does not necessarily follow directly from selection on its components. The host and symbiont are organisms with their own evolutionary paths; the frequent uncoupling of host and symbiont genomes prevents direct co-heritability of genetic information (Maynard-Smith, 1991). However, this does not prevent the species from coevolving, since specialized associations clearly exist (LaJeunesse, 2002). Coevolution despite horizontal *Symbiodinium* transmission can be explained by the processes of ecological selection via host-specialization (Thornhill et al., 2014), with or without geographic isolation (Flaxman et al., 2014). Divergent selection should act on intraspecific variation to favor adaptations that increase *Symbiodinium* fitness in a given host intracellular habitat, removing suboptimal generalist genotypes. The *Eunicea* association provides a good example where both host and symbiont lineages are relatively recently diverged and the *Symbiodinium* are host-specialized (Prada et al., 2014b).

Aspects of population biology that may shed light on coevolutionary capacity are patterns of population genetic structure and gene flow. Based on the current evidence, population genetic structure does not match between coral host and algal symbiont (Andras et al., 2011, 2013; Baums et al., 2014). Adaptation to thermal and ocean acidification stress is likely ongoing, but those adaptations that require reciprocal changes in the mutualistic partners (e.g., pathways involved in exchange of nutrients) will be spread inefficiently if dispersal scale is not matched between partners. For example, in *Acropora palmata* the host is divided into two large populations encompassing the eastern and western Caribbean (Baums et al., 2005b). At the same time, the dominant symbiont (*Symbiodinium "fitti"*) consists of seven populations, each found over smaller geographic regions (Baums et al., 2014). Thus a beneficial adaptation arising in *Symbiodinium "fitti"* may only efficiently rise to high frequency in parts of the host range. However, even weak selection can be sufficient to spread advantageous alleles throughout structured populations, in part because fixation times for such alleles are greatly reduced relative to their neutral counterparts (Slatkin, 1976; Rieseberg et al., 2004). Patterns of gene flow can vary substantially among coral hosts from small to large geographic scales (reviewed by Baums, 2008). We expect the same to be true for *Symbiodinium* species. Hence, additional studies are needed that resolve the population genetic structure of both partners simultaneously.

Little theoretical work has been done to understand how population genetic structure should be matched between hosts and symbionts. Work on parasites suggests that population structure should be smaller scale in the parasite compared to the host population (as found by Dybdahl and Lively, 1996), though there are examples of the opposite case (Martinez et al., 1999) and balanced structure (Mulvey et al., 1991). However, the traditional Red Queen model of rapid antagonistic coevolution does not seem appropriate for mutualisms, where fitness consequences of interactions are measured in gains rather than losses. An alternative model for mutualisms based on game theory, the Red King hypothesis (Bergstrom and Lachmann, 2003), predicts that unbalanced evolutionary rates among partner species can be stable. Currently, this model is not spatially explicit—it cannot account for local adaptation to environmental gradients such as light, for example—but nevertheless makes interesting predictions. According to Red King, the host is assumed to be "enslaving" the faster-evolving symbiont (Hilbe et al., 2013) by repeatedly "demanding" over evolutionary time scales that more opportunistic symbiont genotypes evolve back toward being more generous. The Red King hypothesis may need to be modified to account for the one-to-many interactions between a coral colony and individual *Symbiodinium* cells (Gokhale and Traulsen, 2012). Finally, such models will require empirical data accounting for both inter- and intraspecific diversity and population structure in both partners. Results might provide important insight when predicting the effects of climate change on marine mutualisms.

#### **FUTURE DIRECTIONS**

Consideration of intraspecific diversity in experimental designs will likely improve the predictive value of models of climate adaptation in corals. For example, when climate projections do not incorporate adaptive processes such as genetic adaptation, they predict 20–80% more mass bleaching events in a given period than when such processes are included (Logan et al., 2014). Adaptation-free models over-predict the current frequency of bleaching, which indicates that adaptive processes are likely ongoing. Indeed, rapid adaptation and acclimation to thermal stress have been demonstrated among corals exposed to highly variable temperatures (Palumbi et al., 2014). Intraspecific diversity may represent a component of adaptive capacity to increased temperature in corals (Baums, 2008; Baums et al., 2013), although rare beneficial alleles can spread rapidly even when diversity is low. We would predict a link between intraspecific diversity and bleaching resistance, much like the classic link between diversity and infectious disease resistance (O'Brien and Evermann, 1988). If an empirical link can be made, this information can be incorporated into models projecting the survival of corals.

There are several areas where the development of new techniques will provide further insight into the nature of marine mutualisms. The difficulty of aquaculturing corals has always presented a challenge to molecular studies in this system. Rearing of an F2 generation for traditional genetic experiments has previously been intractable. Only recently has successful culturing of corals from gametes to sexual maturity taken place (Iwao et al., 2010; Baria et al., 2012). These colonies spawned after three or four years of growth, indicating that the rearing of F2 generations to sexual competence for backcrosses will require at least six years for these species. Further complications stem from the symbiotic promiscuity of larvae, which may take more than three years to reflect the algal complement of stable adult colonies (Abrego et al., 2009b). Despite these issues, new technologies are providing different avenues for molecular characterization of corals. For example, Lundgren et al. (2013) recently used next generation sequencing to characterize a suite of single nucleotide polymorphisms (SNPs) that correlate with environmental variables in populations of scleractinian corals on the Great Barrier Reef. Five SNPs for *Acropora millepora* and three SNPs for *Pocillopora damicornis* exhibited likely signatures of selection. These markers may serve as quantitative trait loci for stress tolerance, a critical tool for managers attempting to identify particularly resilient genotypes for restoration purposes.

In parallel with the development of microsatellite markers to distinguish coral and algal individuals, efforts have been made to elucidate the taxonomic diversity of coral-associated microbes, cryptic invertebrates, and more transient associates such as reef fish. An integrative approach that simultaneously assesses diversity across all these community levels would provide a comprehensive understanding of how coral genotypic diversity affects and is affected by reef community diversity. This can be accomplished by combining surveys of natural coral stands, manipulation of *in situ* common gardens, and *ex situ* experiments. Even at small spatial scales, natural variation in genotypic evenness and richness is common within and across species, ranging from minimal clonal replication to reefs dominated by just one genet (Hunter, 1993; Ayre and Hughes, 2000; Miller and Ayre, 2004; Baums et al., 2006; Boulay et al., 2014). By tracking the functional and taxonomic diversity of associated micro- and macro-scale assemblages over time in plots of varying host and symbiont genotypic diversity or composition, it will be possible to quantify the link between diversity and community dynamics. We would predict that host and *Symbiodinium* genotypic diversity positively correlate with microbial and epifaunal community diversity. The incorporation of environmental stressors in such designs will help to assess the direct effects of those stressors as well as the indirect effects of diversity and composition on both ecosystem function and resilience, potentially informing conservation and restoration strategies (Srivastava and Vellend, 2005). Again, we would predict a positive association between holobiont genotypic diversity and resilience. These types of studies would address our second major hypothesis: that reef community dynamics are influenced by intraspecific diversity among corals.

An interesting application of fine-scale techniques will be to examine the coral colony landscape in terms of the distributions of different symbiont genotypes throughout host tissues. Do *Symbiodinium* stratify not only based on light regime (e.g., top, bottom, and sides of colonies), but also within specific host tissues (e.g., tentacles)? Can multiple symbiont species or genotypes within a species occupy a single symbiosome within a single host cell? Laser-capture microdissection (Espina et al., 2006) has already been used to isolate targeted bacterial endosymbionts of *Siboglinum fiordicum*, a tube worm (Thornhill et al., 2008b). The same technology could be applied to isolate *Symbiodinium* among non-calcifying hosts *in hospite*, and be coupled with transcriptomic or metabolomic profiling. Because somatic mutations in the undifferentiated host germ line can propagate as corals age (reviewed by Van Oppen et al., 2011), and early larval fusion can generate chimeras (Frank et al., 1997; Barki et al., 2002; Puill-Stephan et al., 2009), it will also be interesting to map host genotypic mosaicism within a colony and to see if this influences symbiont associations in any way.

Further research into the physiology and ecology of background *Symbiodinium* is required to determine the role of this diversity in coral holobionts. Manipulating background strains will be difficult. A first step would be rearing healthy, completely symbiont-free corals, much like sterile mice reared without gut bacteria. With current aquaculture techniques, this is impossible for scleractinian hard corals. Progress has been made in the model anemone *Aiptasia* sp. (Weis et al., 2008). Though they lack the biomineralization processes of hard corals, *Aiptasia* represent a promising first step for several reasons. It is easy to produce clonal replicates, novel associations with heterologous symbionts are possible, and the same individuals can be inoculated, bleached, and re-inoculated experimentally in an aquarium setting. Moreover, genomic resources are available for the host and the homologous symbiont, *Symbiodinium minutum* (Sunagawa et al., 2009; Bayer et al., 2012; Lehnert et al., 2012; Shoguchi et al., 2013). This system may be well-suited for establishing whether background *Symbiodinium* are functionally relevant during normal and stressful conditions. Additional transcriptomic, metabolomic, and proteomic characterizations of different *Symbiodinium* are ongoing. By contrasting molecular phenotypes at both coarse resolution (e.g., between clades; Ladner et al., 2012; Barshis et al., 2014) and fine-scale resolution (e.g., between species within clades and between individuals within species), we will begin to decipher the mechanisms by which evolution gave rise to the current diversity of *Symbiodinium.*

#### **CONCLUSION**

Intraspecific variation is a major component of terrestrial mutualisms, affecting ecological interactions between proximate symbiotic species as well as higher-order community dynamics. Our understanding of such forces in marine endosymbiotic associations is lacking. We have reviewed some of the current literature and presented additional preliminary evidence suggesting intraspecific variation is extensive in coral hosts and algal symbionts, and that such variation interacts to affect the function of the combined holobiont. The holobiont is both a key ecological feature (being the physical structure that shapes reef ecosystems) and a unit of natural selection; it may ultimately be a unit of evolution in some cases. Future research should incorporate fine-scale molecular genotyping of both partners to address key questions about marine symbiosis ecology and evolution, and to characterize the role of holobiont extended phenotypes in an era of changing climate.

#### **AUTHOR CONTRIBUTIONS**

John E. Parkinson led writing of the manuscript and conducted experiments. Iliana B. Baums formulated the major hypotheses and edited the paper.

#### **ACKNOWLEDGMENTS**

We would like to thank the many investigators who contributed data for this review: B. A. Griffin, S. Griffin, T. Moore, and H. Spathias for *Acropora* growth rates; S. Denecke for *Symbiodinium* growth rates; C. S. Campbell, A. M. Lewis, and N. R. Polato for *Astrangia* and *Symbiodinium* metabolomic profiles; T. C. LaJeunesse for overseeing *Symbiodinium* experiments and providing insightful discussion and comments on the manuscript. Thanks to J. H. Marden for fruitful conversations on the topic. Special thanks to P. W. Glynn for advice and encouragement. Support was provided by the National Science Foundation (NSF DGE-0750756 to John E. Parkinson; NSF OCE-0928764 to Iliana B. Baums).

## **REFERENCES**


environment: from population to functional genetics. *J. Exp. Mar. Biol. Ecol.* 449, 349–357. doi: 10.1016/j.jembe.2013.10.010


**Conflict of Interest Statement:** The Guest Associate Editor, Monica Medina, declares that, despite being affiliated with the same institution as authors John E. Parkinson and Iliana B. Baums, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 April 2014; accepted: 04 August 2014; published online: 25 August 2014. Citation: Parkinson JE and Baums IB (2014) The extended phenotypes of marine symbioses: ecological and evolutionary consequences of intraspecific genetic diversity in coral–algal associations. Front. Microbiol. 5:445. doi: 10.3389/fmicb.2014.00445*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Parkinson and Baums. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Not just who, but how many: the importance of partner abundance in reef coral symbioses

## *Ross Cunning\* and Andrew C. Baker*

Department of Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, FL, USA

#### *Edited by:*

M. Pilar Francino, Center for Public Health Research, Spain

#### *Reviewed by:*

Christina A. Kellogg, United States Geological Survey, USA Marilyn Brandt, University of the Virgin Islands, U.S. Virgin Islands

#### *\*Correspondence:*

Ross Cunning, Hawaii Institute of Marine Biology, University of Hawaii, P. O. Box 1346 (United States Postal Service), Kaneohe, HI 96744, USA e-mail: ross.cunning@gmail.com

The performance and function of reef corals depend on the genetic identity of their symbiotic algal partners, with some symbionts providing greater benefits (e.g., photosynthate, thermotolerance) than others. However, these interaction outcomes may also depend on partner abundance, with differences in the total number of symbionts changing the net benefit to the coral host, depending on the particular environmental conditions. We suggest that symbiont abundance is a fundamental aspect of the dynamic interface between reef corals and the abiotic environment that ultimately determines the benefits, costs, and functional responses of these symbioses. This density-dependent framework suggests that corals may regulate the size of their symbiont pool to match microhabitat-specific optima, which may contribute to the high spatiotemporal variability in symbiont abundance observed within and among colonies and reefs. Differences in symbiont standing stock may subsequently explain variation in energetics, growth, reproduction, and stress susceptibility, and may mediate the impacts of environmental change on these outcomes. However, the importance of symbiont abundance has received relatively little recognition, possibly because commonly-used metrics based on surface area (e.g., symbiont cells cm<sup>−2</sup>) may be only weakly linked to biological phenomena and are difficult to compare across studies. We suggest that normalizing symbionts to biological host parameters, such as units of protein or numbers of host cells, will more clearly elucidate the functional role of symbiont abundance in reef coral symbioses. In this article, we generate testable hypotheses regarding the importance of symbiont abundance by first discussing different metrics and their potential links to symbiosis performance and breakdown, and then describing how natural variability and dynamics of symbiont communities may help explain ecological patterns on coral reefs and predict responses to environmental change.

**Keywords: coral,** *Symbiodinium***, symbiont density, cell ratio, normalization, symbiosis regulation, benefits and costs, density dependence**

#### **INTRODUCTION**

Reef corals engage in symbiosis with single-celled dinoflagellate algae in the genus *Symbiodinium*, from which they acquire photosynthetic products that support most or all of their energetic needs (Muscatine and Porter, 1977) and help them build calcium carbonate skeletons that form the foundation of coral reefs (Allemand et al., 2011). The future growth and persistence of these ecosystems therefore depend on the integrity of coral-algal symbiosis under anthropogenic climate change. Coral bleaching, the breakdown of symbiosis that can lead to coral mortality, is predicted to occur with greater frequency and intensity due to rising sea surface temperatures (Hoegh-Guldberg et al., 2007; Baker et al., 2008), although individual responses may vary greatly in space and time. Investigating the basic functional biology of coral-algal symbiosis helps us understand this variability and improves our ability to forecast the potential fates of coral reefs under climate change.

The functional response of the coral "holobiont" (the animal host and its symbionts) is known to depend on the genetic composition of its symbiotic algal community. Different taxa within the genus *Symbiodinium* (Pochon and Gates, 2010) vary in their physiological properties, and certain taxa, particularly members of clade D, are heat-tolerant (Rowan, 2004), conferring increased resistance to thermal stress on their coral hosts (Rowan et al., 1997; Glynn et al., 2001; Berkelmans and van Oppen, 2006; LaJeunesse et al., 2010; McGinley et al., 2012; Cunning and Baker, 2013). Other types, including members of clade C, may provide corals with more fixed carbon (Cantin et al., 2009), enabling faster growth (Little et al., 2004; Jones and Berkelmans, 2010). Symbiont taxa also differ in their ability to acquire inorganic nutrients (Baker et al., 2013) and combat oxidative stress (McGinty et al., 2012). Together, these differences likely help explain significant variation in growth, performance, and stress susceptibility among corals hosting different symbiont types.

However, the expressed phenotype of coral holobionts likely also depends on the abundance of algal symbionts within coral tissues, and not just their genetic identity. Indeed, symbiont population density may directly influence the costs, benefits, and outcomes of all symbiotic interactions (Holland et al., 2002, 2004). In corals, symbiont abundance is variable in space and time (Fagoonee et al., 1999; Fitt et al., 2000), and may strongly influence most, if not all, aspects of reef coral physiology, including nutrient cycling (Wooldridge, 2009), light absorption (Enríquez et al., 2005), and stress response (Nesa and Hidaka, 2009; Nesa et al., 2012; Cunning and Baker, 2013). However, despite the potential importance of symbiont abundance, its specific role in determining coral functional responses is poorly understood and often overlooked. This may be due in part to the preoccupation of recent work with genetically identifying (rather than quantifying) symbionts. Moreover, the different metrics used to normalize symbiont abundance (e.g., per unit area, mass, volume, protein, or cell) may not all have equal relevance to symbiosis physiology, potentially obscuring important functional relationships (Edmunds and Gates, 2002), and precluding useful comparisons across species and studies.

While most recent studies measure symbiont abundance only to diagnose coral bleaching, earlier studies also focused on understanding how symbiont populations are regulated and controlled (Muscatine and Pool, 1979; Falkowski et al., 1993; Jones and Yellowlees, 1997). Although these studies were primarily concerned with the mechanisms by which a particular abundance is maintained, its subsequent influence on coral physiology and function received less attention. Some studies have evaluated impacts of symbiont abundance on photosynthesis and respiration (Hoegh-Guldberg and Smith, 1989; Hoogenboom et al., 2010), while others have explored its potential physiological impacts using either conceptual (Wooldridge, 2013) or modeling approaches (Anthony et al., 2009; Terán et al., 2010; Cunning, 2013), concluding that symbiont abundance can have fundamental impacts on symbiosis. Here, we advance the view that symbiont abundance is much more than just an indicator of bleaching during stress; it is an integral determinant of holobiont physiology and mediator of symbiosis function that underlies critical variation in symbiosis biology and ecology.

#### **MEASURING SYMBIONT ABUNDANCE**

Many techniques and metrics have been employed to measure the abundance of algal symbionts in cnidarian hosts. In corals, the most commonly used metric is the number of symbiont cells per unit surface area of coral skeleton (cells cm−2). Measuring this typically involves extracting intact *Symbiodinium* cells from living corals [e.g., using a Water Pik (Johannes and Wiebe, 1970) or airbrush], counting them with a hemocytometer, and normalizing cell numbers to skeletal surface area. This method is inexpensive but labor-intensive and requires sacrificing several square centimeters or more of coral tissue. The accuracy and precision of this metric depends on complete extraction of symbionts from the skeleton, the breakup of coral mucus to ensure an even distribution of symbionts in the hemocytometer counting field, and accurate measurement of skeletal surface area, all of which can be difficult to achieve without large and compounding errors (Johannes and Wiebe, 1970; Edmunds, 1994; Veal et al., 2010).
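The arithmetic behind the areal metric can be made explicit. The following is a minimal sketch, not a published protocol: it assumes counts are made in the large squares of a standard improved Neubauer hemocytometer (each holding 0.1 μL, i.e., 10<sup>−4</sup> mL), and the function name, parameter names, and example values are hypothetical.

```python
def cells_per_cm2(counts, homogenate_ml, area_cm2,
                  dilution=1.0, square_volume_ml=1e-4):
    """Areal symbiont density (cells per cm^2) from hemocytometer counts.

    counts           -- cells counted in each replicate large square
    homogenate_ml    -- total volume of the tissue homogenate (mL)
    area_cm2         -- skeletal surface area the tissue was removed from
    dilution         -- dilution factor applied before counting (assumed)
    square_volume_ml -- volume over one large square; 1e-4 mL is the
                        standard improved Neubauer value (assumed here)
    """
    if not counts:
        raise ValueError("at least one count is required")
    if area_cm2 <= 0 or homogenate_ml <= 0:
        raise ValueError("area and volume must be positive")
    mean_count = sum(counts) / len(counts)
    cells_per_ml = mean_count / square_volume_ml * dilution
    total_cells = cells_per_ml * homogenate_ml
    return total_cells / area_cm2


# Illustrative values only: four replicate counts, 10 mL of homogenate,
# tissue removed from 5 cm^2 of skeleton.
density = cells_per_cm2([52, 48, 50, 50], homogenate_ml=10.0, area_cm2=5.0)
print(f"{density:.3g} cells per cm^2")
```

Each term in the calculation (count, homogenate volume, surface area) carries its own measurement error, which is one way to see how the errors noted above can compound.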

Areal symbiont abundance metrics also provide no information about the coral animal inhabiting the same area, which is problematic since coral tissue biomass varies considerably among coral species, colonies, and over time (Fitt et al., 1993; Brown et al., 1999; Fitt et al., 2000; Edmunds and Gates, 2002; Thornhill et al., 2011). Therefore, although different corals may host similar numbers of symbionts per square centimeter of skeleton, these symbionts may be contained within different amounts of host tissue and consequently may function differently. Therefore, normalizing symbiont abundance by area may obscure important functional variation among symbioses related to differences in host tissues (Edmunds and Gates, 2002), emphasizing the need for metrics that better reflect the abundance (or size) of both interacting partners, i.e., a "symbiont to host ratio" (Douglas, 1985). Other metrics address this issue by normalizing symbiont abundance to host-associated biological units instead of areal units.

The number of polyps has been occasionally used to normalize symbiont abundance (Muscatine et al., 1991; Jones and Yellowlees, 1997), although differences in polyp size, structure, and density among coral taxa may prevent useful comparisons of symbiont abundance per polyp (Edmunds and Gates, 2002). Other metrics that are more comparable across taxa include symbiont cells per unit mass (Fitt, 1982), or, more commonly, per unit protein. For protein normalization, researchers either measure total (animal and algal) protein (Saunders and Muller-Parker, 1997; Shick et al., 1999; Edmunds and Gates, 2002; Anthony and Hoegh-Guldberg, 2003; Hoogenboom et al., 2010), or physically separate animal and algal fractions to measure only animal protein (Muller-Parker, 1985; Muller-Parker et al., 1994; Hawkins et al., 2013). Protein is then quantified using the Bradford Assay (Bradford, 1976) and used to calculate symbiont abundance (from cell counts, as above) as cells per mg protein. While this metric provides information about both algal and coral partners, it also has limitations. First, a total protein denominator does not provide a true symbiont to host ratio as it includes algal-derived protein [∼10–13% in anemones (Saunders and Muller-Parker, 1997) and corals (Douglas, 1985)]. Using only animal protein as a denominator theoretically overcomes this issue, although common procedures for mechanically separating algal and animal tissues (i.e., centrifugation) are not fully effective (Douglas and Smith, 1983), leading to considerable error in these metrics. Moreover, these techniques are additionally hampered by issues of incomplete tissue removal from the skeleton, which may be even greater for corals with thicker tissue (Edmunds, 1994) or perforate skeletons.

Symbiont abundance has also been measured by volume (e.g., algal volume as a percent of host cell volume or per mg protein) in green *Hydra* symbioses (Douglas and Smith, 1983, 1984). However, this metric is not amenable to coral symbioses because symbionts occupy nearly 100% of the host cell volume (Muscatine et al., 1998). Moreover, volume estimation relies on assumptions of cell shape and size that are likely incorrect (Douglas, 1985).

To overcome problems associated with volume ratios and ineffective separation of algal and host tissues, the amount of chlorophyll *a* per unit protein (e.g., μg chl *a* per μg protein) of intact tissues has also been proposed as a useful symbiont to host ratio for diverse invertebrate-algal symbioses (Douglas, 1985). A similar metric of chlorophyll *a* normalized to tissue ash-free dry weight (AFDW) has been used for corals (Grottoli et al., 2004, 2006). However, because symbionts comprise 5–12% of coral AFDW (Porter et al., 1989) and chlorophyll *a* content varies widely per symbiont cell (Chang et al., 1983), this metric may not reflect symbiont abundance so much as the photosynthetic capacity of the symbiosis. As such, it may still provide useful information, and has the advantages of being rapidly and reliably calculated, requiring only small amounts of tissue, and being comparable across diverse symbiotic associations (Douglas, 1985).

Symbiont abundance has also been normalized to host cell numbers. In *Hydra*, the mean number of symbiont cells within a single host digestive cell is a commonly used metric of density (Douglas and Smith, 1984). A similar cell-specific density (CSD) in corals indicates the average number of symbionts within a symbiont-containing gastrodermal cell, which typically has a value between 1 and 2 (Muscatine et al., 1998). However, because corals also contain many non-symbiotic cell types that are not counted in the CSD, this metric is clearly decoupled from tissue- and colony-level phenotypes. Indeed, an increase in CSD can occur simultaneously with major declines in overall symbiont abundance, measured as cells per mg protein (Shick et al., 1999).

More recently, the abundance of symbionts relative to the total number of host cells at the tissue or colony level has been measured using quantitative PCR (qPCR; Mieog et al., 2009; Cunning and Baker, 2013). This technique involves amplification of specific target gene loci in both the symbiont and the host to calculate a ratio of the total number of symbiont cells to host cells (S/H cell ratio). Bulk genomic DNA can be extracted from an intact coral fragment, which overcomes the problems of incomplete tissue removal and fractionation that introduce inaccuracy in other metrics. Moreover, very small tissue samples (0.25 cm<sup>2</sup> or less) can be used for this analysis, enabling repeated sampling of living coral fragments over time. Most importantly, because this technique enumerates symbionts genetically instead of visually, it can distinguish among different symbiont types in mixed communities at any level of taxonomic resolution. This is of fundamental importance, because the overall function of a symbiont community depends quantitatively on its composition (Loram et al., 2007; Cunning, 2013), and many corals may harbor multiple symbiont types (Silverstein et al., 2012).

Because cells are the fundamental unit of biological organization, standardizing the abundance of symbiont cells to host cells using qPCR may represent the best current approximation of a "symbiont to host ratio" (*sensu* Douglas, 1985). However, as with other techniques, there are drawbacks. These include higher variability among technical replicates than is associated with areal measurements (Mieog et al., 2009) due to the logarithmic error inherent in qPCR. In addition, calculation of absolute S/H cell ratios from qPCR data requires normalizing fluorescence intensity (if different reporter dyes are used) and estimating DNA extraction efficiency and gene copy numbers for target loci (Mieog et al., 2009; Cunning and Baker, 2013; Angly et al., 2014). Primer and probe sequences must also be carefully designed to match target sequences and mismatch non-target sequences (Cunning and Baker, 2013), and some prior knowledge of the symbiont diversity present in a sample is required to select appropriate assays. However, once assays have been developed and validated, they enable higher-throughput data collection relative to methods based on cell counts and surface area, as well as quantitative characterization of the genetic composition of the symbiont community. To date, qPCR assays have been developed to quantify *Symbiodinium* in clades B, C, and D in several coral host species (Mieog et al., 2009; Cunning, 2013; Cunning and Baker, 2013; Silverstein et al., 2014), which can be easily adapted for use in any laboratory with a qPCR platform.
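To illustrate how the corrections listed above enter the calculation, the following sketch converts cycle-threshold (Ct) values into an S/H cell ratio. It is a simplified illustration under stated assumptions, not the exact procedure of the cited studies: it assumes a common amplification efficiency for both assays (perfect doubling by default), and the function name, the `ct_offset` fluorescence correction, the copy numbers, and the example values are all hypothetical.

```python
def sh_cell_ratio(ct_symbiont, ct_host,
                  copies_symbiont, copies_host,
                  ct_offset=0.0,
                  extraction_efficiency_ratio=1.0,
                  amplification_base=2.0):
    """Estimate symbiont cells per host cell from qPCR Ct values.

    ct_offset                   -- cycles subtracted from the symbiont Ct to
                                   make the two reporter dyes comparable
                                   (hypothetical correction)
    copies_symbiont/copies_host -- target-locus copies per cell
    extraction_efficiency_ratio -- symbiont DNA yield relative to host yield
    amplification_base          -- fold increase per cycle (2.0 = doubling)
    """
    if copies_symbiont <= 0 or copies_host <= 0:
        raise ValueError("copy numbers must be positive")
    if extraction_efficiency_ratio <= 0:
        raise ValueError("extraction efficiency ratio must be positive")
    adjusted_ct_symbiont = ct_symbiont - ct_offset
    # A lower Ct means more starting template, so the template ratio is
    # base ** (Ct_host - Ct_symbiont).
    template_ratio = amplification_base ** (ct_host - adjusted_ct_symbiont)
    # Convert locus copies to cells, then correct for unequal DNA recovery.
    cell_ratio = template_ratio / (copies_symbiont / copies_host)
    return cell_ratio / extraction_efficiency_ratio


# Illustrative values only: symbiont target detected 2 cycles earlier than
# the host target, with 40 symbiont locus copies per cell and 1 host copy.
ratio = sh_cell_ratio(ct_symbiont=22.0, ct_host=24.0,
                      copies_symbiont=40.0, copies_host=1.0)
print(f"S/H cell ratio = {ratio:.3f}")
```

Because the ratio depends exponentially on the Ct difference, a one-cycle error doubles or halves the estimate, which is consistent with the higher technical variability noted above.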

Other genetic techniques for quantifying mixed symbiont assemblages include "FISH-Flow," which utilizes fluorescence *in situ* hybridization and flow cytometry in tandem to count different symbiont types (McIlroy et al., 2014), and next-generation sequencing (NGS; Kenkel et al., 2013). While their application to coral symbiont communities has only just begun, NGS approaches have the power to recover a more complete picture of community diversity, including the rare biosphere (Quigley et al., 2014), and require no prior taxonomic knowledge. However, while these approaches can estimate relative proportions of different symbiont types, these data are subject to numerous quantitative biases (Amend et al., 2010) and must still be normalized to surface area or other host parameters to quantify symbiont abundance. Nevertheless, further development of quantitative NGS approaches using appropriate markers for both coral and *Symbiodinium* partners may enable calculation of symbiont to host ratio metrics that identify and quantify all members of the community in a biologically relevant way.

## **IMPLICATIONS OF DIFFERENT METRICS OF SYMBIONT ABUNDANCE**

Depending on which metric is used to quantify symbiont abundance, different aspects of symbiosis structure and function may be revealed (or obscured). For example, Muller-Parker et al. (1994) found that nutrient enrichment increased the number of symbiont cells per cm<sup>2</sup> while cells per mg protein remained constant. In contrast, in response to low light, Anthony and Hoegh-Guldberg (2003) found no change in symbiont cells per cm<sup>2</sup> but more than double the number of cells per mg protein. Similarly, Edmunds and Gates (2002) found that different coral colonies had the same number of symbionts per cm<sup>2</sup>, but significantly different abundances normalized to protein.

Differences among these metrics are likely the result of a dynamic vs. fixed quantity in the denominator. When symbionts are normalized to a dynamic unit (host protein, cells, etc.), their abundance is also influenced by changes in these units. Therefore, changes in coral tissue architecture may produce different patterns in different metrics of symbiont abundance (**Figure 1**). For example, as environmental conditions change from winter into summer, coral tissues become thinner (Barnes and Lough, 1992; Brown et al., 1999; Fitt et al., 2000; Thornhill et al., 2011), which may involve a loss of both symbiont and host cells on an areal basis. Decreased heterotrophy in summer (Ferrier-Pagès et al., 2011) may also reduce numbers of host prey-capture cells such as cnidocytes and mucocytes, but increased reproduction in summer may increase the number of host gametocytes and mesenterial cells. Higher summer temperatures may also increase respiration and host cell catabolism. Changes in cellular architecture as a result of these processes (e.g., **Figures 1A,B**) might lead to a greater net loss of host cells relative to symbionts, resulting in a reduction in symbionts per cm<sup>2</sup>, but an increase in the S/H cell ratio (**Figure 1**). Indeed, areal symbiont density tends to decrease in the summer (Stimson, 1997; Brown et al., 1999; Fagoonee et al., 1999; Fitt et al., 2000), while the S/H cell ratio may increase (Cunning and Baker, 2013).
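The opposing seasonal trends described above follow directly from the choice of denominator. The sketch below uses invented cell numbers (they are not measurements from any cited study) to show how a fixed areal denominator and a dynamic host-cell denominator can move in opposite directions when host cells are lost faster than symbionts.

```python
def abundance_metrics(symbiont_cells, host_cells, area_cm2):
    """Return the same symbiont population under two normalizations."""
    return {
        "per_cm2": symbiont_cells / area_cm2,     # fixed physical denominator
        "sh_ratio": symbiont_cells / host_cells,  # dynamic biological denominator
    }


# Hypothetical fragment of 1 cm^2: from winter to summer, symbionts decline
# by 25% while host cells decline by 50% as tissues thin.
winter = abundance_metrics(symbiont_cells=2.0e6, host_cells=1.0e7, area_cm2=1.0)
summer = abundance_metrics(symbiont_cells=1.5e6, host_cells=5.0e6, area_cm2=1.0)

for season, m in (("winter", winter), ("summer", summer)):
    print(f"{season}: {m['per_cm2']:.2e} cells per cm^2, S/H = {m['sh_ratio']:.2f}")
```

In this example the areal density falls while the S/H cell ratio rises, so a study reporting only one metric would reach a different conclusion about the seasonal direction of change than a study reporting the other.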

**FIGURE 1 |** [Figure not reproduced; beginning of caption lost in extraction] ... not to scale and are meant to illustrate conceptual differences between different metrics.

Since these metrics provide different information, it is important for researchers to select the most relevant metric. For research focused primarily on interactions with the physical environment (e.g., the interception of light by symbionts), it may be appropriate to normalize symbiont abundance to a physical unit of area. Because light is measured on an areal basis (e.g., μmol quanta m<sup>−2</sup> s<sup>−1</sup>, or W m<sup>−2</sup>), an areal metric of symbiont abundance may be most appropriate for understanding relationships between symbionts and light. Alternatively, because coral tissues and light fields are three-dimensional, the abundance of symbionts per unit volume may be even more informative (Terán et al., 2010).

In contrast, for research focused primarily on biological interactions between symbionts and hosts, it may be more useful to normalize symbiont abundance to a host-related biological unit (i.e., a "symbiont to host ratio"; Douglas, 1985). The currencies of host-symbiont interactions are metabolites and cellular signaling molecules, which are produced and received by cells as fundamental biological units. Therefore, measuring the abundance of symbionts relative to host cells (or other biological units, e.g., biomass, protein) may be more informative and relevant for research concerned with these interactions. For example, in one study of bleaching and recovery, symbiont abundance per unit area had recovered to pre-bleaching levels within months, but tissue biomass, proteins, and lipids per unit area remained lower than pre-bleaching levels (Fitt et al., 1993). In this case, recovered corals might be expected to function differently from their pre-bleaching state, although areal symbiont abundance metrics would not reveal any difference. Meanwhile, symbiont abundance normalized to a biological parameter might reveal important differences indicative of functional variation.

These issues demonstrate the importance of normalizing data in a way that is relevant to the research question and the response variable of interest. In phototrophic symbioses such as corals, the physical interactions between symbionts and light and the biological interactions between symbionts and hosts are fundamentally linked. Therefore, measuring the number of symbionts normalized to both physical and biological units would provide the most comprehensive information regarding symbiosis function. However, if only one type of metric is to be used, normalizing symbiont abundance to dynamic biological units, rather than static physical units, may be more generally relevant to the physiology and function of coral-algal symbioses (Edmunds and Gates, 2002).

## **EFFECT OF SYMBIONT ABUNDANCE ON SYMBIOSIS FUNCTION**

Symbiont abundance is an important factor shaping coral tissue microhabitat, resource availability, and symbiont physiology, which in turn determine the overall costs and benefits of symbiosis (Holland et al., 2002, 2004). In corals, both photosynthesis and photo-oxidative stress depend on the light fields that individual *Symbiodinium* experience (Powles, 1984), which are directly modified by the surrounding symbionts (Enríquez et al., 2005; Terán et al., 2010). When symbiont abundance is low, each cell receives more light; as their abundance increases, self-shading reduces light such that symbionts may only receive 10% of the incident light at the colony surface (Kaniewska et al., 2011; Wangpraseurt et al., 2012). Because light absorption takes place within a three-dimensional coral tissue matrix, the magnitude of self-shading is likely a function of symbiont abundance per unit volume, and has been implemented this way (as cells per mm<sup>3</sup>) in modeling these dynamics (Terán et al., 2010).

While incident light may be directly influenced by symbiont abundance, light absorption and quenching involve additional layers of photobiology, and downstream impacts on symbiosis function are further mediated by host-symbiont cellular interactions. Nevertheless, these complex outcomes may still be linked to symbiont abundance and illustrated within a conceptual framework (**Figure 2**). For example, if each symbiont provides some photosynthate, increasing symbiont abundance will increase the total photosynthate received (i.e., the gross benefit to the coral). However, at high abundances, self-shading and/or carbon-limitation may reduce photosynthesis in each cell, causing gross benefit to decline (**Figure 2**). This relationship is supported empirically by P:R ratios in corals that initially increase as a function of symbiont abundance (per mg protein) and subsequently decline (Hoogenboom et al., 2010). Importantly, the impact of photosynthate delivery on the coral depends on the amount of coral tissue receiving it, suggesting that symbiont abundance may better predict gross benefit when normalized to host biological parameters (e.g., protein, cell).

Another outcome linked to symbiont abundance is the energetic cost to the host of maintaining symbionts (Douglas and Smith, 1983). These costs include, but are not limited to, providing space within host cells for symbiont occupation (Douglas and Smith, 1983), creating and maintaining host-derived symbiosome membranes (Peng et al., 2010), actively concentrating carbon dioxide for symbiont photosynthesis (Weis et al., 1989; Meyer and Weis, 2012), detoxifying oxygen radicals, and repairing macromolecular damage caused by symbiont photo-oxidative stress (Lesser, 2006). The costs associated with each symbiont will cause the gross cost of symbiosis to increase with symbiont abundance (**Figure 2**). At high abundances, costs may increase exponentially, as carbon-limitation of symbiont photosynthesis may exacerbate photodamage and oxidative stress (Wooldridge, 2009; **Figure 2**). Importantly, the impact of these costs also depends on the amount of coral tissue incurring the cost, suggesting it may also be better predicted by adopting a symbiont to host ratio approach.

Thus, symbiont abundance may determine both the costs and benefits of symbiosis, which in turn determine the net benefit (or interaction outcome; **Figure 2**). The magnitude of this benefit may subsequently correlate with aspects of host performance, such that greater benefit facilitates faster growth or higher reproductive rates. This framework allows us to understand how variation in symbiont abundance may underlie variability in coral outcomes. For example, elevated nutrients have been shown to reduce coral growth (Marubini and Davies, 1996; Fabricius, 2005), which may reflect a nutrient-driven increase in symbiont abundance beyond the optimum that reduces the net benefit of symbiosis. We hypothesize that, if symbiosis costs and benefits are density-dependent, variation in symbiont abundance can help explain the natural variability observed in coral performance, both within and among coral species and colonies. This "density-dependent" model of coral-algal symbiosis provides a framework for generating and testing diverse hypotheses linking the environment to symbiont abundance, physiology, and function.
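The density-dependent framework can be made concrete with a small numerical sketch. The functional forms and parameter values below are purely illustrative assumptions (they are not taken from Figure 2 or from any cited study): gross benefit is modeled as abundance times a per-cell output that decays with self-shading, and gross cost as an exponentially increasing function of abundance. The names `gross_benefit`, `gross_cost`, `shading_scale`, `base_cost`, and `stress_rate` are hypothetical.

```python
import math


def gross_benefit(density, shading_scale=1.0):
    """Total photosynthate delivered to the host (arbitrary units).

    Assumed form: per-cell output decays exponentially with density to
    mimic self-shading and/or carbon-limitation, so the total first rises
    and then declines as density increases.
    """
    per_cell_output = math.exp(-density / shading_scale)
    return density * per_cell_output


def gross_cost(density, base_cost=0.05, stress_rate=1.0):
    """Total cost of maintaining symbionts (arbitrary units).

    Assumed form: cost is zero without symbionts and grows exponentially
    at high density, mimicking escalating oxidative stress.
    """
    return base_cost * (math.exp(stress_rate * density) - 1.0)


def net_benefit(density):
    """Interaction outcome: gross benefit minus gross cost."""
    return gross_benefit(density) - gross_cost(density)


# Grid search over a dimensionless density axis from 0.00 to 5.00.
densities = [i * 0.01 for i in range(501)]
optimal_density = max(densities, key=net_benefit)
benefit_peak = max(densities, key=gross_benefit)

print(f"density maximizing gross benefit: {benefit_peak:.2f}")
print(f"density maximizing net benefit:   {optimal_density:.2f}")
print(f"net benefit at optimum:           {net_benefit(optimal_density):.3f}")
```

Under these assumed curves, the abundance that maximizes net benefit lies below the abundance that maximizes gross benefit, and net benefit becomes negative at sufficiently high abundance. Changing the hypothetical parameters shifts the optimum, which is one way to picture a context-dependent optimal abundance.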

## **EFFECT OF SYMBIONT ABUNDANCE ON SYMBIOSIS BREAKDOWN**

Symbiont abundance can influence corals' sensitivity to environmental stress and the breakdown of symbiosis that can occur as a result. Because photodamage and production of reactive oxygen species (ROS) in symbionts is thought to be the primary trigger of bleaching (Weis, 2008), this response should logically depend on symbiont abundance. However, a link between symbiont abundance and bleaching has only recently been shown: in the Pacific coral *Pocillopora damicornis,* colonies with more symbionts (measured by S/H cell ratios) bleached more severely in response to a natural warming event (Cunning and Baker, 2013), while higher S/H cell ratios were also linked to greater bleaching severity in experiments with the Caribbean corals *Montastraea cavernosa* (Silverstein et al., 2014), *Orbicella faveolata*, and *Siderastrea siderea* (Cunning, 2013), suggesting this may be a general phenomenon in corals. Although counter to previous suggestions that more symbionts (per cm<sup>2</sup>) may buffer corals from stress (Stimson et al., 2002; Enríquez et al., 2005), these findings are consistent with the molecular mechanisms of bleaching in suggesting that a larger symbiont pool produces more cumulative ROS, triggering a proportionally more severe bleaching response.

Under this model, if the primary sources and targets of ROS signaling are symbiont and host cells, respectively, then the S/H cell ratio may be the best predictor of the functional relationship between symbiont abundance and bleaching. In fact, areal symbiont abundance is suggested to have the opposite influence, such that fewer symbionts per unit area leads to reduced self-shading and greater light-driven ROS production per cell (Enríquez et al., 2005; Terán et al., 2010). However, cumulative ROS production, as the relevant metric in this framework, equals the per-cell rate times the total number of cells, and thus concomitant changes in both these factors must be evaluated to determine the net effect.

The relationship between symbiont abundance and local irradiance (i.e., self-shading, which may drive per-cell rates of ROS production) has been identified using both empirical and modeling approaches as being nonlinear, such that pigments (Enríquez et al., 2005) or symbionts (Terán et al., 2010) may decline by ∼80% before the internal light environment is significantly amplified. Consequently, large changes in symbiont abundance may take place without impacting light-driven ROS production per cell. Meanwhile, 80% fewer symbionts would reduce total ROS production by at least 80%, suggesting that corals with fewer symbionts may indeed experience less cumulative oxidative stress. However, enhanced ROS production per cell may become relatively more important if the symbiont pool is reduced below a threshold (e.g., due to partial bleaching) where the internal light environment becomes exponentially amplified (Enríquez et al., 2005; Terán et al., 2010). This positive feedback may accelerate coral bleaching in already-bleached corals, even though initial susceptibility may be greater when symbiont abundance is higher.
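The arithmetic behind this argument (cumulative ROS equals the per-cell rate times the number of cells) can be sketched as follows. The piecewise per-cell function is a hypothetical stand-in for the nonlinear light amplification described above: it assumes the per-cell rate is flat until about 80% of symbionts are lost and rises exponentially below that threshold. The `threshold` and `steepness` values are illustrative assumptions, not fitted parameters.

```python
import math


def per_cell_ros(fraction_remaining, threshold=0.2, steepness=3.0):
    """Relative ROS production per symbiont cell (baseline = 1.0).

    Assumption: internal light, and hence per-cell ROS, is unchanged until
    the symbiont pool falls below `threshold` (here 20% remaining, i.e. an
    ~80% loss), after which it is amplified exponentially.
    """
    if fraction_remaining >= threshold:
        return 1.0
    return math.exp(steepness * (threshold - fraction_remaining) / threshold)


def cumulative_ros(fraction_remaining):
    """Relative total ROS: per-cell rate times relative number of cells."""
    return per_cell_ros(fraction_remaining) * fraction_remaining


for fraction in (1.0, 0.5, 0.2, 0.1, 0.05):
    print(
        f"symbionts remaining: {fraction:4.0%}  "
        f"per-cell ROS: {per_cell_ros(fraction):5.2f}  "
        f"cumulative ROS: {cumulative_ros(fraction):5.2f}"
    )
```

In this sketch, cumulative ROS falls in proportion to symbiont loss down to the threshold, then rises again as per-cell amplification takes over, which is the kind of positive feedback proposed for already-bleached corals.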

These hypotheses are supported by a study that used both area- and protein-normalized metrics to assess changes in symbiont abundance in two colonies of *Orbicella franksi* transplanted to a high light environment (Edmunds and Gates, 2002). Initial symbiont abundance per cm<sup>2</sup> did not differ between colonies, but symbionts per mg protein differed by ∼60%. Only the coral with more symbionts per mg protein bleached when transplanted to the high light environment, supporting the hypothesis that excess symbionts cause more severe bleaching. Even though these corals showed different functional responses, areal symbiont density measurement failed to identify any difference between them, showing how certain metrics can mask or obscure important functional variation. This provides another illustration of how metrics that incorporate both symbiont and host information may be more relevant to physiology and better predict symbiosis functional outcomes.
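A minimal sketch of how the choice of normalization can hide or reveal such a difference is given below. The colony values are invented for illustration and are not the measurements of Edmunds and Gates (2002); they are chosen only so that areal densities match while protein-normalized densities differ by about 60%. The `Colony` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    """Hypothetical colony measurements used only for illustration."""

    name: str
    symbionts_per_cm2: float   # areal symbiont density (cells per cm^2)
    protein_mg_per_cm2: float  # host tissue biomass proxy (mg protein per cm^2)

    def symbionts_per_mg_protein(self) -> float:
        """Symbiont abundance normalized to host protein."""
        return self.symbionts_per_cm2 / self.protein_mg_per_cm2


# Same areal density, but colony B has thinner tissue (less protein per cm^2).
colony_a = Colony("A", symbionts_per_cm2=1.0e6, protein_mg_per_cm2=2.0)
colony_b = Colony("B", symbionts_per_cm2=1.0e6, protein_mg_per_cm2=1.25)

relative_difference = (
    colony_b.symbionts_per_mg_protein() / colony_a.symbionts_per_mg_protein() - 1.0
)

for colony in (colony_a, colony_b):
    print(
        f"colony {colony.name}: {colony.symbionts_per_cm2:.2e} cells/cm^2, "
        f"{colony.symbionts_per_mg_protein():.2e} cells/mg protein"
    )
print(f"protein-normalized difference: {relative_difference:.0%}")
```

The areal metric reports the two colonies as identical, whereas the protein-normalized metric separates them, because the second metric also carries information about the amount of host tissue.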

#### **SYMBIONT ABUNDANCE VARIABILITY AND DYNAMICS**

Understanding natural spatiotemporal variability in symbiont abundance is important due to the many ways it may influence symbiosis costs and benefits, coral performance, and stress susceptibility. Early studies found that symbiont abundance was partly determined by environmental conditions in *Hydra* (Douglas and Smith, 1984), *Aiptasia* (Steele, 1976), and corals (Dustan, 1979). In particular, these studies showed that differences in feeding and light regimes led to changes in symbiont abundance in the host. The apparent regulation of symbionts by the host was well-studied in *Hydra*, involving both arrested growth and expulsion of symbionts (Douglas and Smith, 1984). Corals may also actively regulate their symbiont populations, evidenced by continuous symbiont expulsion (Hoegh-Guldberg et al., 1987; Baghdasarian and Muscatine, 2000; Yamashita et al., 2011), and higher growth rates observed in *Symbiodinium* living outside the host (Chang et al., 1983). Various mechanisms of host control over the symbiont population have been investigated, including nutrient limitation (Falkowski et al., 1993), expulsion (Baghdasarian and Muscatine, 2000), apoptosis (Dunn and Weis, 2009), symbiophagy (Downs et al., 2009), and other mechanisms (Gates et al., 1992).

However, the underlying factors that determine the specific abundance of symbionts maintained by these mechanisms are not well understood (Douglas and Smith, 1984; Smith, 1987). It has been hypothesized that spatial or volumetric capacities determine the abundance of symbionts in a coral (Jones and Yellowlees, 1997), although changes in abundance on seasonal and diel scales and in response to abiotic factors (e.g., nutrients) suggest that mechanisms other than space-limitation are important (Davy et al., 2012). If corals actively regulate the size of their symbiont pool, it follows that they should maintain symbionts at an optimal abundance that maximizes the net benefit of the symbiosis (**Figure 2**; Hoogenboom et al., 2010; Cunning, 2013). This optimal abundance will be context-dependent, as abiotic factors such as light and temperature, and biotic factors such as symbiont type, are expected to influence the costs and benefits defining optimal abundance (**Figures 2A,B**). In this model, an optimal abundance exists for a given symbiont type in a given environment.

High spatial variability in abiotic factors, even over reefal scales (Brakel, 1979), may drive corresponding variation in optimal symbiont abundance, and regulation to match these variable optima may explain differences observed among coral colonies (Moothien-Pillay et al., 2005; Pisapia et al., 2014). Short- to mid-term temporal changes (days to weeks) in abiotic factors may similarly shift abundance optima, driving observed seasonal dynamics of symbiont populations (Stimson, 1997; Fagoonee et al., 1999; Fitt et al., 2000; Cunning and Baker, 2013). In this way, regulation by coral hosts to match dynamic optima that maximize interaction benefit may underlie observed spatiotemporal variability in symbiont abundance.

Alternative explanations for variation in symbiont abundance include direct environmental control of symbiont growth dynamics and the resulting differential performance of symbiont types with varying physiological optima. The degree of symbiont regulation may also depend on the particular coral species and symbiont type involved, and may be inhibited by certain abiotic factors (e.g., nutrients). Furthermore, some degree of time lag between changes in the environment and compensatory changes in symbiont population size might be expected. Consequently, even if hosts actively regulate symbiont populations, they may not always be maintained at optimal levels.

While the primary abiotic factors influencing symbiont abundance are likely to be light, temperature, and nutrients, other factors such as salinity, dissolved oxygen (Brown et al., 1999; Fagoonee et al., 1999), and pCO2 may also play important roles. These factors can also be incorporated into a density-dependent theoretical framework, in which they alter symbiosis costs and benefits and drive the need for host regulatory control. Additional data describing the effects of each of these factors on symbiont population dynamics, and their potential interactions, will help test this model.

In addition to the environmental factors that control symbiont abundance, biological factors may also be important drivers of symbiont standing stock. These factors include intrinsic differences in tissue architecture among coral species (i.e., corals with thinner tissues may have generally higher symbiont abundance relative to host tissue), reproductive status, and heterotrophy (see Implications Of Different Metrics Of Symbiont Abundance and **Figure 1**). In addition, lesions (due to parrotfish bites, physical impact, or partial mortality) can lead to reduced symbiont abundance in surrounding tissues (Palmer et al., 2011), and coral diseases can also destabilize symbiont abundance (Cervino et al., 2001; Toller et al., 2001). Differences in these biotic and abiotic factors within colonies and across reefs therefore may establish a wide range of symbiont abundance in coral tissues, even for corals of the same species hosting the same symbiont type. The diversity of coral hosts and algal symbionts only further increases natural variability in partner abundance on reefs.

## **ECOLOGICAL IMPLICATIONS AND ENVIRONMENTAL CHANGE**

We hypothesize that the complex and dynamic interaction between biotic and abiotic landscapes can give rise to significant spatiotemporal variability in symbiont abundance within corals and across reefs. Indeed, symbiont abundance in nearby colonies can vary from twofold to threefold (cells per cm<sup>2</sup>; Jones and Yellowlees, 1997; Moothien-Pillay et al., 2005) to 21-fold (S/H cell ratio; Cunning, 2013), and changes of similar magnitudes may occur seasonally within colonies (Thornhill et al., 2011; Cunning, 2013).

If partner abundance determines symbiotic interaction outcomes (net benefit), then variability in symbiont population size can translate to critical differences in coral holobiont performance. For example, in *P. damicornis*, variation in initial symbiont abundance drove high variability in bleaching response (0–77% reduction in S/H cell ratios in colonies hosting thermotolerant *Symbiodinium* D1, 46–95% in colonies hosting thermally sensitive C1b-c; Cunning and Baker, 2013). Variability in symbiont abundance may therefore help explain why bleaching is often patchy over relatively small scales, and even within single colonies (Rowan et al., 1997; Jones, 2008). Over larger scales, variability in symbiont abundance may explain why bleaching is more severe at certain locations (e.g., where abiotic conditions promote higher symbiont abundances), in certain coral species (McClanahan et al., 2004), or at different times of the year (e.g., summer, when S/H cell ratios are higher, Cunning and Baker, 2013). Thus, differences in symbiont abundance may help explain ecological patterns over many scales.

These relationships may also provide insight into the impacts of climate change, as the effects of a changing environment on reef coral ecology may be mediated by effects on symbiont abundance. For example, several studies have shown declines in areal symbiont densities in response to elevated pCO2, which has been interpreted as acidosis-induced coral bleaching (Anthony et al., 2008; Kaniewska et al., 2012). Alternatively, this response could be interpreted as a host-controlled reduction of symbiont abundance to sustain maximum interaction benefit in a high-pCO2 environment. Regardless, if corals under high pCO2 have fewer symbionts, they may be less susceptible to subsequent thermal stress due to lower cumulative ROS accumulation. This suggests that corals in naturally acidic areas, or at high latitudes where acidification may occur before warming (van Hooidonk et al., 2014), may be more bleaching resistant than conspecifics in different environments. If true, this would have important implications for survival trajectories of corals facing the combined effects of high temperature and pCO2. Testing this hypothesis will require acclimating corals to high pCO2 and allowing symbiont abundance to equilibrate to these conditions before applying thermal stress, in order to separate the effects of prior CO2 exposure from the effects of thermal stress.

Eutrophication is another factor affecting reefs worldwide that may interact with other stressors by influencing symbiont abundance. Excess nutrients can increase symbiont abundance by alleviating their normal state of nutrient limitation, which may cause the host to lose regulatory control of its symbionts (Falkowski et al., 1993), resulting in detrimental impacts on host growth and performance (Marubini and Davies, 1996; Fabricius, 2005). Moreover, enlarged symbiont populations may render nutrient-exposed corals more susceptible to thermal stress (Cunning and Baker, 2013; Vega Thurber et al., 2014). This indicates that efforts to reduce nutrient pollution on coral reefs may help corals be more resistant to climate change-related stressors (Wooldridge and Done, 2009; Cunning and Baker, 2013; Wiedenmann et al., 2013; Vega Thurber et al., 2014).

Bleaching susceptibility is not the only factor that may be affected by symbiont abundance. Because the magnitude of net benefit received by corals is also dependent on symbiont abundance (**Figures 2A,B**), important ecological parameters such as growth and reproduction may also be impacted. While these links must be quantified empirically, the mechanistic framework outlined here helps conceptualize and evaluate the links between environmental variability, symbiont population dynamics, and reef coral ecology.

#### **CONCLUSION**

While much of the focus of recent research has been on the influence of symbiont identity, symbiont abundance must also be considered as a critical factor influencing the function of coral-algal symbioses. Efforts to evaluate coral responses to environmental stresses may therefore benefit from more rapid and accurate ways of measuring and monitoring symbiont abundance, not merely as a stress response, but as a critical metric of coral physiology that will help explain holobiont outcomes. Ideally, knowledge of both symbiont identity and abundance (with respect to both physical and biological units) would provide the most comprehensive information on the state of the symbiosis, but we suggest that taxon-specific symbiont to host cell ratios are currently the most biologically relevant and efficiently obtainable metrics. When applied to targeted symbiotic systems of interest they have shown consistent functional relationships with aspects of host performance such as bleaching severity, and may also be useful predictors of the overall costs and benefits of symbiosis. We suggest that the use of more relevant metrics and a greater appreciation for the importance of symbiont abundance will advance our understanding of the biology of coral-algal symbioses and their responses to environmental change.

#### **ACKNOWLEDGMENTS**

Ross Cunning was supported by a University of Miami Fellowship and a National Science Foundation Graduate Research Fellowship. Additional support was provided by a Provost's Research Award from the University of Miami, and a Pew Fellowship in Marine Conservation to Andrew C. Baker.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 May 2014; paper pending published: 14 June 2014; accepted: 16 July 2014; published online: 04 August 2014.*

*Citation: Cunning R and Baker AC (2014) Not just who, but how many: the importance of partner abundance in reef coral symbioses. Front. Microbiol. 5:400. doi: 10.3389/fmicb.2014.00400*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Cunning and Baker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Chuya Shinzato, Sutada Mungpakdee, Nori Satoh\* and Eiichi Shoguchi*

*Marine Genomics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan*

#### *Edited by:*

*Monica Medina, Pennsylvania State University, USA*

#### *Reviewed by:*

*Malcolm Hill, University of Richmond, USA; Daniel J. Thornhill, Defenders of Wildlife, USA*

#### *\*Correspondence:*

*Nori Satoh, Marine Genomics Unit, Okinawa Institute of Science and Technology Graduate University, 1919-1 Tancha, Onna, Okinawa 904-0495, Japan e-mail: norisky@oist.jp*

Far more intimate knowledge of scleractinian coral biology is essential in order to understand how diverse coral-symbiont endosymbioses have been established. In particular, molecular and cellular mechanisms enabling the establishment and maintenance of obligate endosymbiosis with photosynthetic dinoflagellates require further clarification. By extension, such understanding may also shed light upon environmental conditions that promote the collapse of this mutualism. Genomic data undergird studies of all symbiotic processes. Here we review recent genomic data derived from the scleractinian coral, *Acropora digitifera,* and the endosymbiotic dinoflagellate, *Symbiodinium minutum.* We discuss *Acropora* genes involved in calcification, embryonic development, innate immunity, apoptosis, autophagy, UV resistance, fluorescence, photoreceptors, circadian clocks, etc. We also detail gene loss in amino acid metabolism that may explain at least part of the *Acropora* stress-response. Characteristic features of the *Symbiodinium* genome are also reviewed, focusing on the expansion of certain gene families, the molecular basis for permanently condensed chromatin, unique spliceosomal splicing, and unusual gene arrangement. Salient features of the *Symbiodinium* plastid and mitochondrial genomes are also illuminated. Although many questions regarding these interdependent genomes remain, we summarize information necessary for future studies of coral-dinoflagellate endosymbiosis.

**Keywords: corals, symbiosis,** *Symbiodinium***, genome, transcriptome**

#### **INTRODUCTION**

Coral reefs and tropical forests foster the greatest diversity of organisms on Earth. Even though coral reefs occupy only ∼1% of the seas, they are estimated to harbor around one-third of all described marine species (Wilkinson, 2004), and their productivity supports around one quarter of marine fisheries. However, due to human activities and climate change, reefs are declining in abundance, and wholesale loss of reef habitats is one of the most pressing environmental issues of our time.

The major architects of coral reefs, the scleractinian corals, are anthozoan cnidarians that form obligate endosymbioses with photosynthetic dinoflagellates of the genus *Symbiodinium*. The symbionts confer upon the coral holobiont the ability to fix CO2 and to deposit the massive aragonite (a form of calcium carbonate) skeletons that distinguish reef-building corals from other anthozoans, such as sea anemones. The association is fragile, however, collapsing under stress and from disease. Molecular and cellular mechanisms underlying much of coral biology, including the establishment, maintenance, and breakdown of coral-*Symbiodinium* symbioses, remain to be elucidated.

In order to investigate mechanisms that support this mutualism, genomic information from both corals and *Symbiodinium* is essential. Proteomics approaches have also been applied to coral and *Symbiodinium* studies (Drake et al., 2013; Ramos-Silva et al., 2013). Following cloning and characterization of single genes (e.g., Berghammer et al., 1996; Miller et al., 2000), the first large molecular dataset available for a coral was a collection of ∼3000 expressed sequence tags (ESTs) from the Indo-Pacific complex coral, *Acropora millepora* (Kortschak et al., 2003). Since then, several EST data sets and transcriptomics studies in corals, as well as in *Symbiodinium* spp., have appeared (**Tables 1, 2**). In 2011, a draft genome of *Acropora digitifera* was decoded (**Table 1**) (Shinzato et al., 2011). Then, in 2013, a draft genome of *Symbiodinium minutum* was decoded (**Table 2**) (Shoguchi et al., 2013a). The present review describes characteristic features of these two genomes, with the hope that this information may support future studies of coral biology.

## **THE** *ACROPORA DIGITIFERA* **GENOME**

The genome of *A. digitifera,* decoded using next-generation sequencing technology, is ∼420 Mbp in size, 39% G+C, and contains 23,668 predicted protein-coding loci (Shinzato et al., 2011). The coral gene set is comparable in size and composition to those of *Nematostella vectensis* (Putnam et al., 2007) and *Hydra magnipapillata* (Chapman et al., 2010). The *A. digitifera* genome browser is accessible at http://marinegenomics.oist.jp/acropora\_digitifera (Koyanagi et al., 2013). Approximately 93% of *A. digitifera* genes have homologs in other metazoans (**Figure 1A**), and of these, 11% have significant homology only amongst EST data from corals (**Figure 1B**) (Hemmrich and Bosch, 2008), suggesting the presence of a considerable number of coral-specific genes. As discussed later, the *Acropora* nuclear DNA sequences do not contain any *Symbiodinium*-related genome sequences.

#### **EVOLUTIONARY ORIGINS OF REEF-BUILDING CORALS**

Corals are morphologically very similar to sea anemones, but their evolutionary origins are obscure. Reef-building scleractinians first appeared in the fossil record in the mid Triassic (∼240 MYR) (Stanley and Fautin, 2001), but were already highly diversified, suggesting much earlier origins. The availability of fully sequenced genomes for three cnidarians (*Acropora, Nematostella*, and *Hydra*) allows us to estimate the time of divergence between corals and other metazoans. Molecular phylogenetic analyses, based on an alignment of 94,200 amino acids, suggest a divergence time of 520–490 MYR for *Acropora* and *Nematostella* (late Cambrian or early Ordovician). This early origin of Scleractinia implies that corals have persisted through previous periods of dramatic environmental change, including the mass extinction event at the Permian/Triassic boundary, when global CO2 and temperature were much higher than at present. However, molecular phylogeny of symbiotic dinoflagellates suggests that *Symbiodinium* originated in the early Eocene, and that the majority of extant lineages have diversified since the Mid-Miocene, ∼18 MYR ago (Pochon et al., 2006). Therefore, it is far from certain that modern coral reefs can adapt to the rapid environmental changes now occurring.

*\*From a mixed host/symbiont cDNA library.*

#### **TRACES OF SYMBIOSIS IN THE CORAL GENOME**

Obligate endosymbiosis of corals dates from at least the mid Triassic (Stanley and Fautin, 2001), and the longevity of this association might be expected to have resulted in changes in the coral genome. However, a comprehensive search of *Acropora* nuclear DNA sequences failed to find any *Symbiodinium* DNA sequences (Shinzato et al., 2011); hence there is, as yet, no evidence for horizontal gene transfer from symbiont to host. Neither is *Symbiodinium* vertically transferred via host gametes. As a result, the symbiosis must be re-established with each generation. Nonetheless, comparative analyses imply that *Acropora* is probably metabolically dependent upon its endosymbiont.

When the metabolic repertoire of *A. digitifera* was compared, using the KEGG pathway database, to that of its non-symbiotic relative, *Nematostella*, it became apparent that *Acropora* has lost a gene for cysteine biosynthesis. Biosynthesis of cysteine from homocysteine and/or serine requires two enzymes, cystathionine beta-synthase (Cbs) and cystathionase (cystathionine gamma-lyase) (**Table 3**). Although both the *A. digitifera* and *Nematostella* genomes encode cystathionase, the gene for Cbs could not be identified in *Acropora* despite the existence of an ortholog in *Nematostella* (**Table 3**). An extensive search of transcriptomic data available for various *Acropora* spp. (Hemmrich and Bosch, 2008) failed to identify a *Cbs* transcript in any congener. Moreover, whereas a PCR strategy confirmed the presence of *Cbs* in some other corals (*Galaxea fascicularis, Favites chinensis, Favia lizardensis*, and *Ctenactis echinata*), no amplification products could be obtained for two different *Acropora* species (**Table 3**). Although further studies of biosynthetic pathways are required, this finding raises the intriguing possibility of a metabolic basis for the obligate nature of symbiosis in *Acropora*. Differences in dependency could potentially explain not only the phenomenon of symbiont selectivity, but also the high sensitivity of *Acropora* to environmental challenges.

**Table 3 | The presence or absence of a gene encoding cystathionine β-synthase (Cbs) for L-cysteine biosynthesis in corals.**

*ND, not determined.*

*<sup>a</sup>Supported by sequenced genome and EST analyses.*

*<sup>b</sup>Supported by sequenced genome, EST, and PCR amplification of genomic DNA.*

*<sup>c</sup>Supported by PCR amplification of genomic DNA.*

*<sup>d</sup>Supported by EST analyses.*

#### **GENES INVOLVED IN CALCIFICATION**

The repertoire of coral genes with predicted roles in skeleton deposition is of particular interest, given the likely impact on coral calcification of ocean acidification resulting from rising atmospheric CO2. Surveys of the *Acropora* genome reveal the presence of genes for specific groups of proteins associated with calcification, including the eukaryotic carbonic anhydrases (Jackson et al., 2007). In general, the soluble fraction of the organic matrix (OM) in invertebrates is very rich in acidic amino acids, and has a particularly high aspartic acid composition (Sarashina and Endo, 2006). A number of candidate OM protein genes are present in the *Acropora* genome. Galaxins, first purified from the coral, *G. fascicularis*, are unique to corals and are the only coral skeletal matrix proteins for which the complete primary structure has been determined (Fukuda et al., 2003). However, galaxin possesses neither acidic regions (the fraction of Asp+Asn in galaxin is only 9.7%) nor obvious Ca<sup>2+</sup> binding domains. Four genes encoding galaxin-related proteins have been identified in the *A. digitifera* genome, including two likely *A. digitifera* homologs of galaxin.

#### **TRANSCRIPTION FACTOR GENES AND SIGNALING MOLECULE GENES**

Cnidarians have genes for transcription factors and signaling molecules comparable to those found in bilaterians (Technau et al., 2005; Putnam et al., 2007) and this is also true of corals (Shinzato et al., 2011). Of those, genes for Hox cluster and basic helix-loop-helix (bHLH) families have been examined in detail in the *A. digitifera* genome.

#### *Hox genes*

*Hox* genes are homeobox transcription factors that play a critical role in developmental patterning (McGinnis et al., 1984). They have been identified in every extant phylum except the Porifera, Ctenophora, and Placozoa. Cnidarians are the only nonbilaterian phylum with *Hox* genes; therefore they are critical to our understanding of early *Hox* cluster evolution. However, the *H. magnipapillata* genome shows no *Hox* cluster (Chapman et al., 2010) and clustering in *N. vectensis* is limited to anterior *Hox* genes (Chourrout et al., 2006; Putnam et al., 2007; Ryan et al., 2007), raising the question of the degree of *Hox* gene clustering in cnidarians. The *A. digitifera* genome has the most extensive *Hox* cluster reported in any cnidarian (DuBuc et al., 2012). Phylogenetic analysis revealed a total of six *Hox*, one *ParaHox*, three *Mox*, one *Eve*, and one *HlxB9* gene in the *Acropora* genome. Of the six *Hox* genes, two anterior genes (PG1 and PG2) are linked to an *Eve* homeobox gene and an *Anthox1A* gene (**Figure 2**). Therefore, the *Hox* cluster of the cnidarian–bilaterian ancestor was more extensive than previously thought. These facts are congruent with the existence of an ancient set of constraints on the *Hox* cluster and reinforce the importance of incorporating a wide range of animal species to reconstruct critical ancestral nodes.

**FIGURE 2 | The anthozoan complement of Hox genes and the implications of the evolution of the Hox cluster.** Comparing the genomic linkage of Hox genes in the sea anemone *N. vectensis* and the staghorn coral *A. digitifera* confirms that cnidarians once had a Hox cluster that contained both anterior and posterior/central class Hox genes. **(A)** The Hox cluster of *N. vectensis* includes the anterior Hox genes Anthox6 (PG1), Anthox8b (PG2), Anthox8a (PG2), and Anthox7 (PG2) as well as the Eve homeobox gene. **(B)** The Hox cluster of *A. digitifera* includes the anterior Hox genes Anthox6 (PG1) and Anthox7/8 (PG2), and the posterior/central class Hox gene Anthox1a (PG4–14), as well as the Eve homeobox gene. Another gene HlxB9 (also named MNX) is found upstream of Anthox6 in the Hox cluster of both genomes (data not shown). **(C)** The metazoan tree of life with inferred ancestral Hox clusters. The ancestor to protostomes and deuterostomes is thought to have had two anterior class Hox genes (Hox1 and Hox2), one paralogous group 3 gene (Hox3), three central class genes (Hox4, Hox5, and Hox6–8), one posterior class Hox gene (Hox9–14), and one Eve homeobox gene. Because of the extended cluster in *A. digitifera*, we can now say that the cnidarian–bilaterian ancestor had, at least, two anterior class Hox genes (Anthox6 and Anthox7/8), a central/posterior class Hox gene (Anthox1/1a), and the Eve homeobox gene. It is unclear at what point the genomic rearrangement involving the Eve homeobox gene occurred. The origin of the PG3 Hox genes also is not clear. \*Anthox7/8 has been categorized as a PG2 Hox gene in previous publications, but it is possible, based on our current phylogenetic analysis, that Anthox7/8 descended from a Hox gene that was lost in bilaterians. Based on the genomic orientation of these genes, we also believe the ancestor likely had a fourth Hox gene potentially related to Anthox9. For more detailed information, please see DuBuc et al. (2012). Abbreviations: PG, paralogous group; Ax, Anthox.

#### *bHLH genes*

bHLH proteins constitute a large group of transcription factors that comprise a basic region for DNA binding and two α-helices, interrupted by a variable loop region, for dimerization. bHLH proteins homo- or heterodimerize to recognize and bind specific core hexa-nucleotides, and play pivotal roles in cell differentiation and proliferation (Massari and Murre, 2000; Jones, 2004). A putative full set of bHLH genes has been described in the genomes of a number of metazoans, and molecular phylogenetic analyses have identified 45 orthologous families of bHLH factors, which were categorized into six higher-order groups (Atchley and Fitch, 1997).

The *A. digitifera* genome contains a nearly full set of 70 bHLH transcription factors, comparable to the 68 bHLH genes in *N. vectensis* (Gyoja et al., 2012). The *Acropora* genes have been assigned to 29 previously reported orthologous families. In addition, three novel HLH orthologous families have been identified, designated pearl, amber, and peridot (Gyoja et al., 2012). Pearl and amber orthologs are present in genomes and ESTs of the Mollusca and Annelida, in addition to the Cnidaria. Peridot orthologs are present in genomes and ESTs of the Cephalochordata and the Hemichordata, in addition to the Cnidaria. These three genes have apparently been lost in the clades of *Drosophila*, *Caenorhabditis*, and *Homo sapiens*. Therefore, cnidarians provide information about alteration of transcription factor genes during animal evolution.

#### **INNATE IMMUNITY**

Innate immunity in corals is of special interest not only in the context of self-defense, but also in relation to the establishment and collapse of the obligate symbiosis with *Symbiodinium*. The coral innate immune repertoire is highly complex and more sophisticated than that of *Hydra* and *Nematostella* (**Figure 3**) (Shinzato et al., 2011; Hamada et al., 2013). For example, whereas a single canonical Toll/TLR protein is present in *N. vectensis* (Miller et al., 2007), the *Acropora* genome encodes at least four such molecules, as well as five IL-1R-related proteins, and a number of TIR-only proteins (**Figure 3A**). Likewise, the *Acropora* repertoire of NACHT/NB-ARC domains, which are characteristic of primary intracellular pattern receptors, is again highly complex—an order of magnitude more NACHT/NB-ARC domains are present in coral than in other animals, and some of these cnidarian proteins have novel domain structures.

from *Nematostella vectensis* and *Hydra magnipapillata*. The repertoire of Toll/TLR, IL-1R-like, and TIR-only proteins is significantly more complex in the case of *A. digitifera* than in *N. vectensis* or *H. magnipapillata*. TIR, TIR domain; DEATH, DEATH domain; IG and IGc2, Ig domain; LRR, LRY-TRY, LRR-CT and LRR-NT, leucine-rich repeats. **(B)** The complexity of the NBD repertoire of

IPAF, CIITA and BIR in NAIP. (b) A total of 379 coral NBD loci do not encode repeat domains. Numbers to the right of schematics represent the number of loci with each specific architecture. (c) In addition, 117 loci in the coral encode NBDs and repeat domains of the WD40, LRR, Ank, or TPR types. (d) The various domains identified in the Nod-like proteins of *Acropora*.

In the vertebrate innate immune system, ∼20 tripartite nucleotide oligomerization domain (NOD)-like receptor proteins are known to function as primary intracellular pattern recognition molecules (**Figure 3B**) (Hamada et al., 2013). These proteins are defined by the presence of a NAIP, CIITA, HET-E, and TP1 (NACHT) domain, a C-terminal leucine-rich repeat (LRR) domain, and one of three types of N-terminal effector domain. A survey of the coral genome reveals a larger number of loci (∼500) encoding NACHT domains, or the related nucleotide-binding adaptor shared by APAF-1, R proteins, and CED-4 (NB-ARC) domains, than in other metazoans, and also a surprising diversity of domain combinations among coral NACHT/NB-ARC-containing proteins (**Figure 3B**). N-terminal effector domains include apoptosis-related domains, caspase recruitment domains (CARD), death effector domains (DED), and Death domains; C-terminal repeat domains include LRRs, tetratricopeptide repeats, ankyrin repeats, and WD40 repeats. Many of the predicted coral proteins that contain a NACHT/NB-ARC domain also contain a glycosyl transferase group 1 domain, a novel domain combination first found in metazoans. Phylogenetic analyses suggest that the NACHT/NB-ARC domain inventories of various metazoan lineages, including corals, are largely products of lineage-specific expansions. Many of the NACHT/NB-ARC loci are organized in pairs or triplets in the *Acropora* genome, suggesting that the large coral NACHT/NB-ARC repertoire has been generated at least in part by tandem duplication (Hamada et al., 2013). In addition, shuffling of N-terminal effector domains may have occurred after diversification of specific NACHT/NB-ARC-repeat domain types. These attributes illustrate the extraordinary complexity of the innate immune repertoire of corals, which may reflect adaptation to a symbiotic lifestyle in a uniquely complex and challenging environment.
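The observation that many NACHT/NB-ARC loci occur in pairs or triplets can be illustrated with a simple adjacency scan. The sketch below is only an illustration and not the procedure used by Hamada et al. (2013): it assumes a hypothetical list of gene models already ordered along one scaffold, each flagged for whether it carries the domain, and it reports runs of two or more consecutive domain-bearing genes. The gene identifiers and the `min_run` parameter are invented for the example.

```python
from itertools import groupby

def tandem_runs(genes, min_run=2):
    """Return runs of consecutive domain-bearing genes on one scaffold.

    `genes` is a list of (gene_id, has_domain) tuples sorted by genomic
    position. Both the IDs and the boolean flag are hypothetical inputs.
    """
    runs = []
    # groupby collects neighbouring genes that share the same flag value.
    for has_domain, group in groupby(genes, key=lambda g: g[1]):
        members = [gene_id for gene_id, _ in group]
        if has_domain and len(members) >= min_run:
            runs.append(members)
    return runs

# Hypothetical scaffold: "n*" genes carry a NACHT/NB-ARC domain, "x*" do not.
scaffold = [("n1", True), ("n2", True), ("x1", False),
            ("n3", True), ("x2", False),
            ("n4", True), ("n5", True), ("n6", True)]
runs = tandem_runs(scaffold)  # one pair and one triplet; n3 is a singleton
```

A scan of this kind only shows physical adjacency; inferring tandem duplication additionally requires phylogenetic support, as in the analyses cited above.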

#### **APOPTOSIS**

The apoptotic network of *A. digitifera* is comparable in complexity to those of "higher" animal taxa, including vertebrates (**Figure 4A**) (Shinzato et al., 2011). Seven Bcl-2 family members containing multiple domains, four IAP family members, 25 caspases, a single APAF-1, four Death receptors, three Death ligands, and 32 members of the TRAF adaptor family are present in the *Acropora* genome (**Figure 4B**). These numbers are generally comparable to those in the *Nematostella* genome. The TRAF family in *Acropora* and *Nematostella* and the caspases in *Acropora* are overrepresented relative to humans. While no BH3-only members of the Bcl-2 family have been identified (**Figure 4B**), this may be a consequence of the small size of the BH3 domain and the extent of sequence divergence in these proteins. Failure to detect adaptors with Death domains may reflect the low level of domain conservation characteristic of this family.

#### **AUTOPHAGY**

The *A. digitifera* genome contains orthologs of ATG1, ATG2, ATG3, ATG4, ATG5, ATG6, ATG7, ATG8, ATG9, ATG10, ATG12, ATG13, ATG14, ATG16, ATG18, ATG24, TOR, Vps34, and Vps15, but no counterparts of the yeast-specific proteins ATG11, ATG15, ATG17, ATG19, ATG20, ATG21, ATG22, ATG23, ATG26, ATG27, and ATG29 (Shinzato et al., 2011) (**Figure 5**). The *Acropora* genome also encodes orthologs of human UVRAG, SH3GLB1, DRAM, AMBRA1, RB1CC1, and ATG101 (**Figure 5**), which are absent in yeast.

#### **GENES INVOLVED IN UV-DAMAGE PROTECTION**

Reef-building corals typically inhabit shallow and relatively clear tropical waters and are therefore constantly exposed to high levels of UV irradiation. Since high solar radiation sometimes causes coral bleaching (Gleason and Wellington, 1993), one intriguing question is how corals protect themselves against UV-damage. UV-absorbing substances potentially act as photoprotective compounds. These include mycosporine-like amino acids (MAAs), scytonemin, carotenoids, and others of unknown chemical structure (Shick et al., 1999; Reef et al., 2009). Although some photoprotective compounds have been isolated from corals (Rastogi et al., 2010), it is often unclear whether symbiotic dinoflagellates and/or bacteria produce the photoprotective compounds, or whether the corals themselves can independently synthesize them.

## *MAAs*

A recent study of the cyanobacterium *Anabaena variabilis* identified a four-gene cluster (encoding DHQS-like, O-MT, ATP-grasp, and NRPS-like enzymes) that converts pentose-phosphate metabolites into shinorine, one of the MAAs (**Figure 6**) (Balskus and Walsh, 2010). A search of cnidarian gene models for components of the shinorine gene cluster revealed that this four-gene pathway is present in both *Acropora* and *Nematostella*, but not in *Hydra* (Shinzato et al., 2011). This strongly suggests that both *Acropora* and *Nematostella* can synthesize shinorine by themselves, which may be a precursor for photoprotective compounds.

In addition, molecular phylogenetic analyses show that the homologous proteins in *Acropora* have greater sequence similarity to those of bacteria and dinoflagellates (Shinzato et al., 2011). These genes might have been acquired via horizontal gene transfer (Starcevic et al., 2008). For example, during the evolution of cnidarian stinging cells, a subunit of bacterial poly-γ-glutamate (PGA) synthase was transferred to an animal ancestor via horizontal gene transfer (Denker et al., 2008). It has been proposed that in marine environments, horizontal gene transfer is important in adapting to ecological vagaries (Keeling, 2009).

## *Scytonemin*

The UV-blocker, scytonemin, is found exclusively in cyanobacteria. In *Nostoc punctiforme,* its biosynthesis is controlled by a cluster of 18 genes (**Figure 7**) (Soule et al., 2007; Balskus and Walsh, 2008). The cluster comprises one subcluster of genes involved in aromatic amino acid biosynthesis, and a novel subcluster of genes of unknown function (Soule et al., 2009). The former includes *tyrA*, *dsbA*, *aroB*, *trpE*, *trpC*, *trpA*, *tyrP*, *trpB*, *trpD*, and *aroG* (**Figure 7B**). The latter includes *scyA*, *scyB*, *scyC*, *scyD*, *scyE*, and *scyF* (**Figure 7B**).

The *A. digitifera* genome contains only six of the 18 genes: namely, *scyA*, *scyB*, *scyF*, *dsbA*, *aroB*, and *tyrP* (**Figure 7**) (Shoguchi et al., 2013c). This result suggests that coral cannot synthesize scytonemin independently. Molecular phylogenetic analyses indicate that coral *scyA* and *scyB* are associated with bacterial genes for acetolactate synthase and glutamate dehydrogenase, respectively. This suggests that these enzymes are coupled with PGA/amino acid biosynthesis in corals. In addition, *scyA*, *scyB*, and *aroB* (*DHQS-like*) are likely to have originated by horizontal transfer from bacteria.

#### *Glyoxylate cycle enzymes: malate synthase and isocitrate lyase*

Glyoxylate cycle enzymes play a role in lipid metabolism in plant seeds (Kornberg and Beevers, 1957). Although this pathway has not been found in most animal lineages, nematode genomes contain genes encoding enzymes involved in the pathway (Liu et al., 1995). Interestingly, the *A. digitifera* genome contains one *isocitrate lyase* (*ICL*) gene and two *malate synthase* (*MS*) genes. Orthology between *Acropora* and *Nematostella* is supported by molecular phylogenetic analysis (Shoguchi et al., 2013c). The genes, *ICL* and *MS1*, are aligned head-to-head in tandem. In addition, comparisons between neighboring genes show that synteny in the region is also conserved. The anthozoan genes form a clade with bacterial *ICL*. Therefore, the origin of the anthozoan genes may differ from that of the nematode glyoxylate cycle enzymes.

**FIGURE 4 | (A)** Schematic presentation of cellular components involved in the pathways of apoptosis, based on human genes. The extrinsic pathway, intrinsic pathway, and ER stress pathway are three major pathways of apoptosis. Major families are shown by green background. Families found in the *Acropora digitifera* genome are boxed in red and those of *Nematostella vectensis* in blue. **(B)** The number of apoptosis-related family members in the genome of *A. digitifera* (Ad), *N. vectensis* (Nv), and *Homo sapiens* (Hs). The *Acropora* and *Nematostella* genomes contain apoptosis-related genes in numbers comparable to those of the human genome, except for a larger number of TRAF family adaptors in the cnidarians.

**human (gray background) and yeast genes (***Saccharomyces cerevisiae***; yellow background) involved in the pathway.** The pathway is composed of autophagy induction, membrane nucleation, vesicle expansion and completion, retrieval and autophagic degradation. Genes found in the

blue. It is obvious that all the human autophagy-related genes have counterparts in *Acropora* and *Nematostella*. In contrast, autophagy-related genes that are found only in the yeast cannot be found in the cnidarian genomes.

#### **FLUORESCENT PROTEINS**

Corals exhibit diverse colors, which depend largely on fluorescent proteins (Matz et al., 1999, 2006). Four basic colors of fluorescent proteins present in corals include cyan (CFP), green (GFP), and red (RFP), and a non-fluorescent blue/purple chromoprotein (Kelmanson and Matz, 2003; Field et al., 2006). Fluorescent proteins are usually composed of ∼230 amino acids. Corals are able to synthesize several different fluorescent or colored moieties from amino acids within fluorescent proteins, via two or three consecutive autocatalytic reactions. While CFP and GFP possess the same chromophore, individual chromophores can differ dramatically in spectroscopic characteristics (Henderson and Remington, 2005; Lukyanov et al., 2006).

The *A. digitifera* genome contains one, five, one, and three candidate genes for CFP, GFP, RFP, and chromoprotein, respectively (Shinzato et al., 2012). The CFP and GFP genes are clustered in an ∼80-kb genomic region, suggesting that they originated from an ancestral gene by tandem duplication. Since CFP and GFP possess the same chromophore, this gene clustering may provide the first genomic evidence for a common origin of the two proteins. Comparisons of the fluorescent protein genes of closely related coral species suggest an expansion of chromoprotein genes in the *A. digitifera* genome, and of RFP genes in the *A. millepora* genome. RNA-seq analysis shows that *A. digitifera* fluorescent protein genes are expressed during embryonic and larval stages and in adults, suggesting that these genes play a variety of roles in coral physiology.

A wide variety of roles have been attributed to coral fluorescent proteins, including modulating the efficiency of photosynthesis and photoprotection for the symbionts (e.g., Salih et al., 2000) as well as antioxidant functions (Bou-Abdallah et al., 2006; Palmer et al., 2009). Along with cataloging the coral fluorescent protein repertoire, functions of these proteins should be investigated by future studies, especially in the context of molecular mechanisms involved in environmental stress responses of corals, which are associated with collapse of coral-*Symbiodinium* symbiosis.

#### **PHOTORECEPTORS AND CIRCADIAN CLOCK GENES**

Corals exhibit circadian behaviors, which play a pivotal role in timing of spawning. However, little is known about the molecular mechanisms underlying the regulation of these behaviors. Microarray analysis of *Acropora*-*Symbiodinium* suggested complex diel cycles of gene expression (Levy et al., 2011). The *A. digitifera* genome contains seven opsin and three cryptochrome (photoreceptor) genes (**Figure 8**) (Shoguchi et al., 2013b). Two genes from each family likely underwent tandem duplication in the coral lineage. In addition, *A. digitifera* has orthologs to *Drosophila* and mammalian circadian clock genes: four *clock*, one *bmal/cycle*, three *pdp1-like*, one *creb/atf*, one *sgg/zw3*, two *ck2alpha*, one *dco* (*csnk1d/csnk1e*), one *slim/BTRC*, and one *grinl* (**Figure 8**). However, *Acropora* is unlikely to have *vrille*, *rev-erbα/nr1d1*, *bhlh2*, *vpac2*, *adcyap1*, or *adcyap1r1* orthologs (**Figure 8**). Intriguingly, an extensive survey failed to find homologs of *period* and *timeless*, although it found one *timeout* gene. When the coral genes were compared to orthologous genes in *N. vectensis*, a similar repertoire of circadian clock genes was apparent, although *A. digitifera* contains more clock genes and fewer photoreceptor genes than *N. vectensis* (**Figure 8**). This suggests that the circadian clock system was established in a common ancestor of corals and sea anemones, and diversified by tandem gene duplications and the loss of paralogous genes in each lineage. Future studies should examine how the coral circadian clock functions without *period*.

## **SYMBIODINIUM GENOME**

Coral symbionts are all *Symbiodinium* spp. belonging to the phylum Dinoflagellata. Dinoflagellates are unicellular eukaryotes, 10–100μm in diameter, and characterized by two flagella and a unique cell covering referred to as the theca.

Approximately half of them are photosynthetic (Graham and Wilcox, 2000). Dinoflagellates belong to the well-supported Superphylum Alveolata, which also includes ciliates and apicomplexans, such as the malarial parasite, *Plasmodium falciparum* (Burki et al., 2007). Each alveolate lineage has had a distinct evolutionary trajectory with regard to nuclear genome organization, resulting in three divergent outcomes (Gardner et al., 2002; Eisen et al., 2006). Ciliates contain two nuclei, a somatic macronucleus and a micronucleus for reproduction, and they lack plastids. Apicomplexans, due to their parasitic life style in most species, have substantially reduced genomes, with highly degenerate plastids known as apicoplasts (Wilson et al., 1996). Dinoflagellate nuclei have permanently condensed liquid-crystalline chromosomes that lack nucleosomes (**Figures 9A,B**) (Bouligand and Norris, 2001). In addition, recent studies of partial dinoflagellate genome data show repeated gene copies arranged in tandem arrays (Bachvaroff and Place, 2008), trans-splicing of messenger RNAs (Lidie and van Dolah, 2007; Zhang et al., 2007), and a reduced role for transcriptional regulation, compared to other eukaryotes (Erdner and Anderson, 2006; Moustafa et al., 2010). Given these remarkable characteristics, elucidating the structure and composition of dinoflagellate genomes is essential to understanding their packaging of chromosomal DNA and expression of encoded genes. However, dinoflagellates possess some of the largest eukaryotic nuclear genomes (1500–245,000 megabases [Mbp] in size), which have previously thwarted whole-genome sequencing (Lin, 2011; Wisecaver and Hackett, 2011). In 2013, the genome of a culturable dinoflagellate, *S. minutum,* was decoded (Shoguchi et al., 2013a).

## **THE NUCLEAR GENOME**

The genome of *S. minutum* is estimated at ∼1500 Mbp. Approximately 40-fold coverage of the genome yielded a ∼616 Mbp assembly (Shoguchi et al., 2013a). A large quantity of RNA-seq sequences were assembled into 63,104 unique transcripts, 26,691 of which encode complete open reading frames. Gene prediction yielded 41,925 protein models, 77.2% of which (32,366 gene models) are supported by RNA-seq data. In addition, the vast majority of the transcriptome is encoded in the 616-Mbp draft assembly, suggesting that these contigs represent the euchromatin-like region of the *Symbiodinium* genome (http://marinegenomics.oist.jp/genomes/gallery). DNA transposons, retrotransposons, and tandem repeats comprise 0.5, 1.1, and 4.6% of the assembled genome, respectively. The GC-content of the *Symbiodinium* nuclear genome is 44%. This is comparable to the GC-content of metazoans and green plants, but contrasts strongly with the AT-rich genomes of other alveolates, such as apicomplexans [*P. falciparum,* 19% GC (Gardner et al., 2002)] and ciliates [*Tetrahymena thermophila,* 22% GC (Eisen et al., 2006)].

pyrenoid (PY) in brown. Scale bar, 1μm. **(B)** DAPI staining of the nucleus showing permanently condensed chromosomes of *S. minutum.* Scale bar, 1μm. **(C)** RCC1 proteins are eukaryotic proteins that bind to groupings of eukaryotic RCC1 proteins and prokaryotic RCC1-like proteins are supported by 100% bootstrap duplication. Bar indicates an amino acid substitution per site.

## *Gene content of the dinoflagellate genome*

Of 41,925 gene models, 20,983 (50%) encode proteins with known domains. One of the largest dinoflagellate protein families is the EF-hand family, a large family of calcium-binding proteins characterized by a helix-loop-helix structural domain. The second largest dinoflagellate family contains ankyrin repeats, one of the most common protein-protein interaction motifs in nature. When the *Symbiodinium* gene families are compared with those of other eukaryotes, *Symbiodinium* shares a considerable number of homologous genes with *Homo* and *Arabidopsis*, although ∼46% of predicted proteins are novel or *Symbiodinium*-specific.

#### *Specific gene expansion in the Symbiodinium genome*

Dinoflagellates have been predicted to possess 38,000–87,000 protein-coding genes (Hou and Lin, 2009). The presence of a larger number of genes in the *S. minutum* genome (41,925) is likely caused by lineage-specific expansion of genes by duplication (Hou and Lin, 2009). Orthologous gene clustering analyses indicate that 1064 groups (10,912 genes) in the *Symbiodinium* genome have likely resulted from such events. One striking finding is that the regulator of chromosome condensation family protein (RCC1) is highly expanded (discussed below). Calcium channel and calmodulin families are also expanded. Because the largest domain family is the EF-hand group of calcium-binding proteins, Ca<sup>2+</sup> metabolism is likely of great importance in *Symbiodinium*.

## *Molecular basis of permanently condensed chromatin*

As mentioned above, dinoflagellate nuclei are characterized by permanently condensed, liquid-crystalline chromosomes (**Figures 9A,B**), and dinoflagellate chromosomal organization is a fundamental issue that is still not fully understood (Lin, 2011). In eukaryotes, histone proteins are involved in chromatin modulation, whereas in prokaryotes, histone-like proteins serve this function. The *S. minutum* genome contains both eukaryotic histone genes and prokaryotic histone-like genes, although orthologs of histone H1 are not found in the genome (Shoguchi et al., 2013a). All four core-histone genes (H2A, H2B, H3, and H4) are duplicated. In addition, there are 15 histone-like proteins similar to those found in bacteria.

In addition to enlargement of the genome, a dinoflagellate, *Hematodinium* sp., gained a novel family of nucleoproteins from an algal virus, termed dinoflagellate/viral nucleoprotein (DVNP) (Gornik et al., 2012). The *Symbiodinium* genome contains 19 genes that appear homologous to DVNPs, suggesting a role for this type of protein in *Symbiodinium* chromosome structure.

The RCC1 proteins (RCC1 superfamily in eukaryotes and RCC1-like repeat proteins in both prokaryotes and eukaryotes) bind to chromatin and play an important role in the regulation of gene expression (Dasso, 1993). As mentioned above, genes for RCC1 have the third highest degree of expansion in the *Symbiodinium* genome, and a total of 189 genes are present in the *Symbiodinium* genome (Shoguchi et al., 2013a). When 86 of these proteins are used for molecular phylogenetic analyses, two distinct clusters become evident. One, with 34 *Symbiodinium* proteins, consists of those orthologous to eukaryotes, including alveolates, plants, and animals (**Figure 9C**, left), whereas the other includes 52 proteins with similarities to prokaryotes, including cyanobacteria and proteobacteria (**Figure 9C**, right). This result potentially explains the characteristic architecture of dinoflagellate chromosomes, although the manner in which they interact with each other to establish and maintain the permanently condensed chromosomes remains to be studied.

## *Unique spliceosomal splicing*

Although previous reports have suggested that introns are relatively uncommon in dinoflagellate genes (Okamoto et al., 2001; Hoppenrath and Leander, 2010), genes of *S. minutum* are highly intron-rich. Of the 41,925 genes, 39,970 (95%) are composed of multiple exons. The average number of exons per gene reaches 19.6, and some genes contain more than 200 introns (Shoguchi et al., 2013a). In addition, spliceosomal introns of *Symbiodinium* are unique among eukaryotic genomes. In other eukaryotes, introns are excised under the GT-AG rule, wherein GT and AG are used as recognition nucleotides at the 5′ and 3′ splice sites, respectively (**Figure 10**). In contrast, *Symbiodinium* uses GC and GA at the 5′ donor splice site, in addition to GT (**Figure 10**). GC usage frequency is nearly equal to that of GT. The presence of these 5′ splice sites provides the first evidence in eukaryotes that the majority of mRNA splicing need not follow the GT-AG rule. Another feature of *Symbiodinium* splicing is that the 3′ acceptor splice site, AG, is frequently followed by the nucleotide G (**Figure 10**), although a similar phenomenon is known in human minor alternative splice sites (Thanaraj and Clark, 2001).
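The relative usage of GT, GC, and GA at the donor site can be tallied directly from intron sequences. The following is a minimal sketch under stated assumptions: the intron sequences are invented toy data written 5′ to 3′ on the coding strand, and the function name `donor_site_usage` is hypothetical rather than taken from the cited analysis.

```python
from collections import Counter

def donor_site_usage(introns):
    """Return the fraction of introns starting with each dinucleotide.

    The first two bases of an intron are its 5' donor splice site.
    Introns shorter than four bases are skipped, since they cannot hold
    both a donor and an acceptor dinucleotide.
    """
    counts = Counter(seq[:2].upper() for seq in introns if len(seq) >= 4)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n / total for site, n in counts.items()}

# Hypothetical introns; each ends in the canonical AG acceptor.
toy_introns = ["GTAAGTTTAG", "GCAAGTCCAG", "GAAAGTTCAG", "GTCCGTAAAG"]
usage = donor_site_usage(toy_introns)
```

Applied to a real annotation, the same tally would be computed over all introns extracted from the gene models, which is the kind of summary that the sequence logo in Figure 10 presents graphically.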

Key steps in RNA splicing are performed by spliceosomes, acting in concert with five small nuclear RNA molecules (snRNAs; *U1*, *U2*, *U4*, *U5*, and *U6*). The five major snRNAs recognize nucleotide sequences that specify where splicing is to occur, and they participate in spliceosome chemistry (Rogozin et al., 2012). In the *Plasmodium* and *Tetrahymena* genomes, snRNAs are scattered throughout the genome, whereas in metazoans and green plants, two different types of the five major snRNAs are sometimes tandemly aligned (Wang and Brendel, 2004; Marz et al., 2008). In contrast, in the *Symbiodinium* genome, all five snRNAs, *U1, U2, U4, U5*, and *U6* occur in a cluster, in addition to other snRNAs scattered across about 70 locations. This is the first discovery of an snRNA gene cluster in a eukaryote genome. It has been reported that *trans*-splicing of messenger RNAs is common in dinoflagellates (Lin, 2011; Wisecaver and Hackett, 2011). The *Symbiodinium* genome contains spliced-leader (SL) genes with a conserved SL sequence.

## *Unique arrangement of genes in the genome*

The *Symbiodinium* genome is also unique in the context of gene arrangement (Shoguchi et al., 2013a). In contrast to the random arrangement of protein-coding genes in the genomes of *Tetrahymena*, *Plasmodium*, *Arabidopsis,* and *Homo,* those of the *Symbiodinium* and *Trypanosoma* genomes show a clear tendency for tandem and unidirectional gene alignment. The frequency of changes in gene orientation was examined using a 10-gene sliding window (**Figure 11**). Graphs of these data for *Plasmodium*, *Tetrahymena*, *Arabidopsis*, and *Homo* show a peak between 4 and 5 changes in orientation, indicating the frequency of strand switch regions (SSRs) between genes in head-to-head or tail-to-tail orientations (**Figure 11**). In contrast, *Symbiodinium* and *Trypanosoma* show a cluster at few or no changes in orientation (**Figure 11**). This indicates a strong tendency for tandem alignment of genes or clustering of unidirectionally aligned genes in the *Symbiodinium* and *Trypanosoma* genomes.
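The sliding-window count described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code behind Figure 11: genes are represented only by their coding strand (`+` or `-`) in genomic order, and the function names are hypothetical. Under the simplifying assumption that each gene's strand is independent and equally likely to be either, a 10-gene window has nine adjacent pairs, each switching with probability 0.5, so the expected count is 9 × 0.5 = 4.5, in line with the peak between 4 and 5 changes.

```python
from collections import Counter

def orientation_changes(strands):
    """Count strand switches between successive genes."""
    return sum(a != b for a, b in zip(strands, strands[1:]))

def window_histogram(strands, window=10):
    """Histogram of orientation changes over all sliding windows."""
    counts = Counter()
    for i in range(len(strands) - window + 1):
        counts[orientation_changes(strands[i:i + window])] += 1
    return counts

# Hypothetical gene orders along a scaffold.
unidirectional = "+" * 12 + "-" * 8   # a single strand switch region
alternating = "+-" * 10               # a switch between every gene pair

hist_uni = window_histogram(unidirectional)
hist_alt = window_histogram(alternating)

# Expected switches per window if strands were assigned at random.
expected_random = (10 - 1) * 0.5
```

In the unidirectional example every window contains zero or one switch, whereas the alternating example gives the maximum of nine. Genomes with randomly oriented genes would fall between these extremes, near the expected value of 4.5.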

#### *Genes involved in the basic transcriptional machinery*

Although the *S. minutum* genome is unique in regard to permanently condensed chromosomes, spliceosomal splicing, and unidirectionally aligned genes, the genome contains highly conserved basic transcriptional machinery components, including RNA polymerase I, II, and III, basal transcription factors, such as TFIID and TATA-binding protein (TBP), and transcription elongation factors (Shoguchi et al., 2013a). In contrast, the genome contains only a few sequence-specific transcription factors, including 19 gene models with AP2 domain(s), 15 models with HMGbox domain(s), eight models with zf-C2H2 domain(s), and others. These results suggest constant, steady transcription of *Symbiodinium* genes with fewer genes under sequence-specific transcriptional control.

indicate typical patterns of 10-gene arrangements with the number of strand switch regions (SSRs), although the SSRs shown here are not always typical. Patterns are based on the analyses shown in **Figure 6**. Gene architecture shows average gene lengths (exons in red and introns in blue) with the average intron number per gene. The sequence motif of the splice site is illustrated using WebLogo. Only two genes with spliceosomal similarities between *Symbiodinium* and *Trypanosoma*. Additionally, analyses of intron-richness and the weakness of 5′ splice site signals (asterisk) indicate that *Symbiodinium* has the most unusual genome organization found in a eukaryote genome to date. The probability of position 2 at the 5′ splice site is shown in inset. A double asterisk shows G conserved at the 3′ splice site.

#### **CHLOROPLAST (PLASTID) GENOME**

Chloroplasts (plastids) are common photosynthetic organelles in eukaryotic algae and land plants. Plastids may have first arisen when non-photosynthetic eukaryotic hosts acquired cyanobacterial endosymbionts by a process termed "primary endosymbiosis" (Howe et al., 2008; Keeling, 2010). Other non-photosynthetic eukaryotes may have subsequently acquired endosymbionts from photosynthetic eukaryotes to create secondary plastids (Howe et al., 2008; Keeling, 2010). In some lineages including dinoflagellates, secondary plastids may have been lost and replaced with secondary endosymbiotic plastids or other primary endosymbiotic plastids, resulting in tertiary plastids (Allen et al., 2011).

tendency toward unidirectional alignment of genes in the *S. minutum* and *Trypanosoma* genomes. Each line represents a frequency histogram for changes in the gene orientation between successive genes in the genome. The X-axis represents the number of orientation changes as one moves through windows of 10 genes. As examples indicating random orientation, Poisson distributions with μ = 4.5 (average) and 0.2 are shown.

Evolutionary changes in plastid genomes in alveolates are dramatic. Ciliates lost plastids and became heterotrophic, while parasitic apicomplexans retain unpigmented plastid remnants termed apicoplasts. On the other hand, two species closely related to apicomplexans, *Chromera velia* and *Vitrella brassicaformis*, are photosynthetic. Their plastid genomes retain ancestral characteristics of both apicomplexan and dinoflagellate plastids and probably share a common red algal endosymbiont (Janouskovec et al., 2010). Interestingly, rapidly evolving dinoflagellate plastids show a great variety of reduced stages. Their gene content has been dramatically diminished by large-scale transfer of genes to the nucleus, leaving only 12–17 genes in the plastids (Howe et al., 2008). Conventional plastid genomes have all genes physically linked in one molecule, typically 120–200 kb in size (Keeling, 2010), while dinoflagellate plastid genes reside on small plasmids of 2.2–6 kb, termed "minicircles" (Zhang et al., 1999), containing a few genes and a core, non-coding region, which is conserved within species and plays a regulatory role (Zhang et al., 2002; Leung and Wong, 2009; Wisecaver and Hackett, 2011). Moreover, a number of unusual post-transcriptional RNA modifications, including the addition of 3′-terminal poly(U) tracts, occur in the ancestral chloroplasts of dinoflagellates. Extensive RNA editing occurs in some dinoflagellates (Zauner et al., 2004; Wang and Morse, 2006; Dang and Green, 2009), employing diverse editing types that have not been observed in mammals and plants. This leads to speculation about the functional connection between poly(U) tailing and RNA editing in dinoflagellate plastid transcripts (Dang and Green, 2009).

In *S. minutum*, 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication (Mungpakdee et al., 2014). Only 14 genes remain in plastids, as DNA minicircles. Each *Symbiodinium* minicircle (1.8–3.3 kb) contains one gene and a conserved non-coding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, are evident in minicircle transcripts, but not in transcripts of genes transferred to the nucleus. In contrast to RNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in minicircle transcripts are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. RNA editing is also likely to increase the protein plasticity necessary to initiate photosystem complex assembly.
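
As an illustration of how editing sites and editing types can be tallied, the following minimal sketch compares a genomic sequence with its transcript (as cDNA) position by position. It assumes the two sequences are already aligned, gap-free, and of equal length; the function names and the toy sequences are hypothetical and are not taken from the minicircle data discussed above.

```python
from collections import Counter


def editing_sites(genomic: str, transcript: str):
    """Return (position, genomic_base, transcript_base) for each mismatch.

    Assumes both sequences are pre-aligned, gap-free, of equal length,
    and written in the DNA alphabet (the transcript as cDNA).
    """
    if len(genomic) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, g, t)
        for i, (g, t) in enumerate(zip(genomic.upper(), transcript.upper()))
        if g != t
    ]


def editing_types(sites):
    """Tally substitution types, writing T as U because the edits occur in RNA."""
    to_rna = str.maketrans("T", "U")
    return Counter(
        f"{g.translate(to_rna)}>{t.translate(to_rna)}" for _, g, t in sites
    )


# Toy sequences invented for illustration; not real minicircle data.
genomic_dna = "ATGGCAGTTACG"
cdna = "ATGTCAGCTACA"

sites = editing_sites(genomic_dna, cdna)
print(sites)
print(editing_types(sites))
```

In practice, candidate sites would need to be distinguished from sequencing errors and from paralogous copies, which this sketch does not attempt.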

#### **MITOCHONDRIAL GENOME**

In most metazoans, mitochondrial genomes are compact, circular molecules of 13–20 kb, encoding 12–13 proteins, 24–25 tRNAs, and 2 rRNAs. As in the case of plastid genomes, mitochondrial genomes have also changed dramatically during evolution. Ciliates (*Tetrahymena* and *Ichthyophthirius*) have linearly mapped mitochondrial genomes of 43 kb with a normal gene number (Burger et al., 2000), while only 3 protein-coding genes and fragmented rRNAs, organized as part of linear repeats of about 6–7 kbp, are found in parasitic apicomplexans (*Plasmodium*, *Babesia*, and *Theileria*) (Hikosaka et al., 2012). The gene content of dinoflagellate mitochondrial genomes is comparable to that of apicomplexans (Slamovits et al., 2007), but the genome structure is highly fragmented and rearranged (Waller and Jackson, 2009).

A 49-kmer assembly of only high coverage (>100) Illumina paired-end reads of a dinoflagellate, *S. minutum*, revealed two candidate mitochondrial scaffolds, both linear DNAs (19,577 and 291,368 bp) (Mungpakdee et al., unpublished data). Blast and transcriptome mapping show that one contains only *cox1* and the other *cob*, *cox3*, and 6 fragments of large subunit (LSU) rRNA genes. Fragments of small subunit (SSU) rRNA and tRNA genes are not found in the *Symbiodinium* mitochondrial genome. The evolution of the mitochondrial genome in *Symbiodinium*, as well as in other dinoflagellates, requires further investigation to reach some consensus.
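
The idea of restricting an assembly to high-coverage reads can be sketched as a simple k-mer abundance filter. This is only an illustration under stated assumptions: the text does not say how coverage was computed, so the median k-mer count per read is used here as a stand-in, and the function names, the toy reads, and the small values of `k` and `min_count` are hypothetical (the study itself used k = 49 and a coverage threshold of 100).

```python
from collections import Counter
from statistics import median


def kmer_counts(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def high_coverage_reads(reads, k, min_count):
    """Keep reads whose median k-mer count exceeds min_count.

    Assumption: median k-mer abundance is used as a proxy for read coverage.
    Reads shorter than k contain no k-mers and are dropped.
    """
    counts = kmer_counts(reads, k)
    kept = []
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        if kmers and median(counts[km] for km in kmers) > min_count:
            kept.append(read)
    return kept


# Toy reads: one sequence sampled four times, another sampled once.
reads = ["ACGTACGT"] * 4 + ["TTGCAAGC"]
kept = high_coverage_reads(reads, k=4, min_count=2)
print(len(kept), "of", len(reads), "reads retained")
```

The retained reads would then be passed to an assembler; that step is outside the scope of this sketch.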

#### **CONCLUSION**

Genomic information is essential for future studies of molecular and cellular mechanisms underlying the establishment, maintenance, and breakdown of obligate endosymbiosis of corals with photosynthetic dinoflagellates of the genus *Symbiodinium*. In general, the coral genome is unique in that frequent horizontal gene transfer is evident in UV-protection genes. In addition, *Symbiodinium* illustrates how divergent dinoflagellates are with regard to their nuclear, plastid, and mitochondrial genomes. At present, many questions about endosymbiosis remain to be answered, but genomic information will greatly facilitate future studies of coral-dinoflagellate endosymbiosis.

#### **ACKNOWLEDGMENTS**

Our genome project on both the coral and *Symbiodinium* was supported by Grants-in-Aid from MEXT (No. 23128515 to Eiichi Shoguchi) and JSPS (No. 24241071 to Nori Satoh) of Japan, and by OIST internal funds. We thank all members of our Unit and the DNA Sequence Section of OIST for their enormous help in the project, and Dr. Steven Aird for his help in preparing the manuscript.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 April 2014; accepted: 17 June 2014; published online: 07 July 2014. Citation: Shinzato C, Mungpakdee S, Satoh N and Shoguchi E (2014) A genomic approach to coral-dinoflagellate symbiosis: studies of Acropora digitifera and Symbiodinium minutum. Front. Microbiol. 5:336. doi: 10.3389/fmicb.2014.00336 This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Shinzato, Mungpakdee, Satoh and Shoguchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Novel tools integrating metabolic and gene function to study the impact of the environment on coral symbiosis

## *Mathieu Pernice<sup>1</sup>\* and Oren Levy<sup>2</sup>*

<sup>1</sup> Plant Functional Biology and Climate Change Cluster, University of Technology, Sydney, Sydney, NSW, Australia <sup>2</sup> The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Mauricio Rodriguez-Lanetty, Florida International University, USA Thomas Andrew Oliver, University of Hawaii at Manoa, USA

#### *\*Correspondence:*

Mathieu Pernice, Plant Functional Biology and Climate Change Cluster, University of Technology, Sydney, P. O. Box 123, Broadway, Sydney, NSW 2007, Australia e-mail: mat.pernice@gmail.com

The symbiotic dinoflagellates (genus Symbiodinium) inhabiting coral endodermal tissues are well known for their role as keystone symbiotic partners, providing corals with enormous amounts of energy acquired via photosynthesis and the absorption of dissolved nutrients. In the past few decades, coral reefs worldwide have been increasingly affected by coral bleaching (i.e., the breakdown of the symbiosis between corals and their dinoflagellate symbionts), which carries important socio-economic implications. Consequently, the number of studies focusing on the molecular and cellular processes underlying this biological phenomenon has grown rapidly, and symbiosis is now widely recognized as a major topic in coral biology. However, obtaining a clear image of the interplay between the environment and this mutualistic symbiosis remains challenging. Here, we review the potential of recent technological advances in molecular biology and approaches using stable isotopes to fill critical knowledge gaps regarding coral symbiotic function. Finally, we emphasize that the largest opportunity to achieve the full potential in this field arises from the integration of these technological advances.

**Keywords: coral, symbiosis,** *Symbiodinium***, genomics, stable isotopes**

## **INTRODUCTION**

Reefs based on scleractinian corals are among the most productive and biologically diverse marine ecosystems on Earth (Moberg and Folke, 1999). At the heart of the success of corals is their symbiosis with dinoflagellate algae (zooxanthellae), which live within their tissues and provide each coral polyp with a wider metabolic repertoire (Anthony and Hoegh-Guldberg, 2003; Houlbreque and Ferrier-Pages, 2009). This fundamental symbiosis is known to enhance the ability of corals to synthesize a calcium carbonate skeleton (Gattuso et al., 1999), the structural basis of coral reef ecosystems, in an environment where nutrients are mostly limited.

Since 1979, populations of scleractinian corals have been reported as increasingly affected by mass coral bleaching, which involves the breakdown of the symbiosis between the cnidarian host and the dinoflagellate symbionts (Hoegh-Guldberg, 1999). Given that the economic value of coral reefs has been estimated at around US \$375 billion per year (Costanza et al., 1997) and that coral reefs support over 500 million people through the services and food that they provide, losing corals from reef systems would have substantial impacts on coastal populations worldwide. Consequently, many studies in the past decades have focused on the causes and mechanisms of the disruption of this symbiosis (more than 18,000 articles published since 1990 are available on Google Scholar). However, despite the undoubted causal link between environmental stresses and bleaching, capturing the complexity of the interaction between coral-dinoflagellate symbiosis and its surrounding environment remains challenging, especially given that (i) the environment is complex and defined by a multitude of factors and (ii) the study of coral symbiosis is complex and hampered by the intertwined nature of coral-dinoflagellate symbiosis. Considering the complexity of coral symbiosis, here we identify specific knowledge gaps within the functioning of coral symbiosis, including immune defenses and metabolism, and argue that recent technological advances provide better tools to understand how the environment affects these functional processes.

## **CORALS GROW IN A NARROW BAND OF ENVIRONMENTAL CONDITIONS, NEAR THEIR PHYSIOLOGICAL LIMITS**

Corals thrive in tropical waters, which happen to be warm, clear and generally oligotrophic (Muscatine and Porter, 1977). This narrow and consistent band of environmental conditions signifies that corals live and grow best near their physiological limits, especially with regard to three main factors: temperature, light and nutrients. Among these factors, the interaction between temperature and light has been intensively studied over the last decades because of a major interest in understanding thermal and light stress-related bleaching phenomena (Hoegh-Guldberg, 1999). One of the first sites of damage is the symbiont photosystem II apparatus (Lesser, 2006), a key component of photosynthetic pathways located within the chloroplast of *Symbiodinium*. This photosynthetic dysfunction results in the excessive production of reactive oxygen species (ROS) in the symbiont and promotes the degradation of host mitochondria, providing another potential site of harmful ROS production and oxidative stress (Dunn et al., 2012a). The excess ROS damages essential biological macromolecules and cellular structures, initiating a cascade of innate immune responses, which subsequently result in the release and/or degradation of the symbiotic dinoflagellates. Histological studies have revealed two possible mechanisms of symbiont degradation: (i) symbionts are degraded from the effects of ROS via programmed cell death (Strychar et al., 2004) or (ii) the coral host actively destroys the symbionts and ultimately expels them (Dunn et al., 2007b). The cellular mechanisms of coral bleaching have been the focus of many studies since the 1990s and have already been well reviewed (see Weis, 2008). However, despite the clear involvement of cellular mechanisms such as exocytosis, host cell detachment, apoptosis, and necrosis, the cascade of immune responses and the modulation of cell death pathways leading to bleaching remain unresolved.

#### **TECHNOLOGICAL ADVANCES IN MOLECULAR BIOLOGY PROGRESS TOWARD UNRAVELING GENE FUNCTION**

By allowing researchers to simultaneously investigate genes and their level of expression, genomics and transcriptomics approaches have greatly improved our understanding of coral bleaching. In the late 2000s, the first microarray studies had a tremendous impact on revealing the cellular foundation of thermal stress-induced coral bleaching (Desalvo et al., 2008; Rodriguez-Lanetty et al., 2009; Bellantuono et al., 2012). In the first medium-scale transcriptomics coral study, Desalvo et al. (2008) used a cDNA microarray containing more than 1300 genes of the coral *Montastraea faveolata* to measure gene expression changes associated with thermal stress. Their results suggested that oxidative stress in thermally stressed corals causes a disruption of Ca<sup>2+</sup> homeostasis and the initiation of cell death via apoptosis and necrosis. In a subsequent study, Rodriguez-Lanetty et al. (2009) examined the effect of thermal stress on the early transcriptional response of aposymbiotic larvae of the reef-building coral *Acropora millepora* and showed that elevated temperature compromises some critical components of the coral immune defenses, including a mannose-binding C-type lectin. More recently, microarray studies demonstrated that coral host transcriptomic states are correlated with different *Symbiodinium* genotypes (Desalvo et al., 2010) and that potential "early warning genes" and "severe heat-related genes" could be detected as a result of high heat stress (Maor-Landaw et al., 2014). The findings from Maor-Landaw et al. (2014) also suggest that during short-term heat stress, the coral *Stylophora pistillata* may divert cellular energy into mechanisms such as the ER-unfolded protein response (UPR) and ER-associated degradation (ERAD) at the expense of growth and biomineralization processes in an effort to survive and subsequently recover from the stress.
The emergence of next-generation sequencing technologies has further increased the speed and coverage of sequencing and decreased its cost through massively parallel methods. As a result, large-scale transcriptomics are well established as part of the coral biologist's "toolbox" (Moya et al., 2012) and a total of approx. 600,000 sequences with more than 86,000 unique blast matches are available from various coral species through six assembled transcriptomes and one fully assembled genome (Shinzato et al., 2011) in databases such as systems biology of symbiosis<sup>1</sup>, compagen<sup>2</sup>, or cnidarian<sup>3</sup>. Genomic information on *Symbiodinium* is scarcer, due primarily to its very large genome size (approx. 2–4 Gb) and repetitive DNA (Leggat et al., 2011). However, since 2007, several expressed sequence tag (EST) collections and subsequent gene expression studies have been generated (Leggat et al., 2007; Rosic et al., 2010, 2011a,b), and more recently, the gene structure of the dinoflagellate has been further revealed by the draft assembly of the *S. minutum* nuclear genome (Shoguchi et al., 2013). Despite this enormous progress, how the environment and genes ultimately interact to affect immune defenses and cell death in coral symbiosis remains poorly understood. Currently, the biggest challenges in this field are (i) to organize the growing body of molecular information into a clear mechanistic framework and testable hypotheses with a direct link to function and (ii) to direct hypothesis-driven research to aid the examination of gene function. While the first challenge is conceptual and beyond the scope of this paper, the latter is more technological and could be met using advanced functional approaches to silence or knock down gene expression. In this respect, previous studies in the freshwater cnidarian *Hydra magnipapillata* have explored the use of the reverse-genetics technique RNA interference (RNAi), which consists of introducing a synthetic double-stranded RNA into cells to selectively induce gene suppression. Unfortunately, this method has had limited success; RNA delivery via electroporation often damages tissues and cells (Lohmann et al., 1999; Smith et al., 2000; Cardenas and Salgado, 2003). In a pioneering gene silencing study for a symbiotic cnidarian, Dunn et al. (2007a) reported the use of RNAi and chemical transfection delivery to suppress the expression of a gene coding for caspase, a proteolytic enzyme triggering cell death and bleaching in the symbiotic sea anemone *Aiptasia pallida*. Although their method was effective, the decrease obtained in caspase activity was not absolute (only 30%), most likely because of the instability or poor delivery of siRNA *in vivo*. Genetic modification is still in its infancy in symbiotic cnidarians, and efforts at using siRNA for gene silencing have often been hampered by the difficulty of effectively introducing it into cells of interest. As a result, no other attempts to use siRNA in corals have been published thus far.

<sup>1</sup>http://www.auburn.edu/~santosr/symbiosys.htm

<sup>2</sup>http://www.compagen.org/links.html

<sup>3</sup>http://data.centrescientifique.mc/blast/?option=home.php

Nanotechnology is a relatively new discipline that in recent years has begun to be used to address questions related to biological systems. It allows the development of new materials with properties that existing materials lack. Nanomaterials have been shown to possess distinctive properties that contribute to promising applications in several fields. For example, fluorescent semiconductor nanoparticles (NPs) suffer less from photobleaching than conventional fluorophores. Thus, single-molecule-based tracking of fluorescently labeled membrane-bound proteins received a great boost from the move from conventional fluorophores to fluorescent NPs (Bouzigues et al., 2007). The small size of NPs gives them a high surface area-to-volume ratio and facilitates their interaction with several types of chemical species. NPs are also excellent candidates for drug delivery because they interact with biomolecules and can be loaded with specific cargo (Bouzigues et al., 2007; Conde et al., 2012). Although NPs can be useful devices for delivering specific cargo *in vitro* and *in vivo* and are easily taken up, specific control of cargo release is still a challenge. Some years ago, a new concept was presented in which hollow particles, typically on the micrometer scale (so-called polyelectrolyte capsules), were used as a carrier system. The cavity of the capsules can be loaded with a large variety and high quantity of cargo (De Koker et al., 2009; De Cock et al., 2010; Conde et al., 2012). If the walls of the capsules are modified with Au NPs, an optothermal opening can be made, allowing a controlled release of the cargo (Skirtach et al., 2006; De Koker et al., 2009). This release is similar to the concept of caged Ca<sup>2+</sup>, in which Ca<sup>2+</sup> ions are released from a chelator upon a flash of light. Capsules, however, allow for the release of a larger variety of cargo molecules, such as small drugs, proteins, or mRNA. Capsules can range in size from hundreds of nanometers to a few micrometers, depending on the size of the template that was used in synthesis (Skirtach et al., 2006; Wang et al., 2008). When less specific control of the release is required, biodegradable capsules are also a good alternative. It has been demonstrated that biodegradable capsules are degraded inside the lysosomes of cells, where their cargo is then released. Thus, nanotechnology can provide sophisticated carrier systems ranging from a few nanometers to a few micrometers, which allow for the controlled release of biologically active cargo. This next generation of carrier systems could be a game changer in the understanding of bleaching mechanisms, as they also include NPs with new biomaterials developed to fit the chemistry, biophysical structure, and biological function of siRNA. Many research groups are already reporting improved stability and delivery efficiency of siRNA (for review, see: Kozielski et al., 2013). However, in model organisms such as nematodes or zebrafish, a reliable means of RNAi-mediated gene knockdown remains elusive (Kelly and Hurlstone, 2011).
As such, a simpler alternative for gene knockdown, such as morpholino antisense oligos, is still the most effective method of gene suppression in zebrafish. Morpholinos act by binding and blocking access to target mRNA (Nasevicius and Ekker, 2000). Since their introduction in the early 2000s, they have been used in a range of model organisms, including sea urchin, ascidian, zebrafish, frog, chick, and mouse, providing a relatively simple and rapid method to study gene function (Huang et al., 2012; for review see: Heasman, 2002). Although these improved methods of gene suppression still require optimization and validation in coral symbiosis, their future development could be critical to address important hypotheses about gene function, such as the existence of symbiosis-specific genes.
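
Because an antisense oligo binds its target by base pairing, its sequence is the reverse complement of the targeted mRNA region. The sketch below derives such a sequence; it is an illustration only. The function name and the toy mRNA are hypothetical, the default length of 25 bases is an assumption based on commonly used morpholino lengths rather than a value from the text, and real oligo design involves further checks (target accessibility, self-complementarity, off-target matches) that are not shown.

```python
# Map each RNA base to the complementary base, written in the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 25) -> str:
    """Return the reverse complement (5'->3') of mrna[start:start+length].

    Assumption: the oligo is written with DNA bases (A, C, G, T); the input
    may use either U or T.
    """
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region extends past the end of the mRNA")
    return target.translate(_COMPLEMENT)[::-1]


# Toy example with a short target so the result is easy to check by eye.
toy_mrna = "AUGGCUAAGC"
oligo = antisense_oligo(toy_mrna, start=0, length=6)
print(oligo)
```

For the six-base target `AUGGCU`, the complement is `TACCGA`, which reads `AGCCAT` once reversed into the conventional 5′ to 3′ orientation.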

#### **IMPORTANCE OF NUTRIENTS AND METABOLIC FUNCTION IN CORAL SYMBIOSIS**

Given the nutritional role of the symbionts, another intriguing question, and a related knowledge gap in coral symbiosis, is whether bleaching may in part reflect a change in the nutritional status of the host–symbiont interaction. In this respect, recent studies have highlighted strong correlations between feeding, sustained photosynthetic activity and reduced bleaching (Ferrier-Pages et al., 2010; Beraud et al., 2013). Coral symbiosis requires a delicate balance of exchanged compounds between the symbiotic partners. The photosynthetically fixed carbon compounds translocated by the dinoflagellate symbiont to the host consist largely of non-nitrogenous compounds, such as glycerol, glucose, and succinate (Venn et al., 2008). These compounds are often referred to as "junk food," as they directly support coral respiration and mucus production (Wild et al., 2004) but can only be used for coral growth when nitrogen and phosphorus are available from another source (Ferrier-Pages et al., 2010). Consequently, the ability to assimilate nitrogen and phosphorus by feeding on plankton (Houlbreque and Ferrier-Pages, 2009) or by absorbing nutrients dissolved in seawater, with a preference for phosphate, ammonium and nitrate (Grover et al., 2002), is a crucial attribute of the coral symbiosis.

Ecological stoichiometry is an increasingly broad field of research that evaluates how the relative quantity of specific chemical elements (C:N:P ratio) constrains or facilitates the movement of nutrients through an ecosystem (Sterner and Elser, 2002). Given that elemental ratios are unitless, using a stoichiometric approach allows for the tracking of the same response at scales where quantities differ greatly (for example, from single cells to communities). Recent advances in this field of research have demonstrated an intimate link between biomass stoichiometry, environmental conditions and nutrient fluxes in a broad range of ecosystems (Taylor and Townsend, 2010). The same is certain to be the case for corals and coral reefs; changes in biomass stoichiometry, driven by changes in macromolecular composition, reflect coral physiology (e.g., growth rate or accumulation of storage compounds) and *in situ* environmental conditions (e.g., temperature, nutrients, or light availability). Shifting the balance of supply versus demand for nitrogen and carbon, for example due to an environmental perturbation, may well result in the disruption of the coral–symbiont relationship, which is referred to as bleaching. The development of research investigating environmental impacts on the metabolic function of coral reefs is particularly necessary given that (i) reefs are rapidly deteriorating worldwide and that (ii) most coral reef research does not consider biomass stoichiometry and metabolic function when studying bleaching. Redressing this omission may be critical in understanding the issue of coral reef degradation.
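
The point that elemental ratios are unitless, and therefore comparable across scales, can be shown with a short calculation. The sketch below converts element masses to moles and normalizes to phosphorus; the function name is hypothetical, and the sample masses are toy values chosen to reproduce the classic Redfield proportions (106:16:1), not measurements from corals.

```python
import math

# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "P": 30.974}


def molar_cnp_ratio(mass_by_element):
    """Convert element masses (any single mass unit) to a molar C:N:P ratio.

    The ratio is normalized so that P = 1. Because every element is scaled
    by the same unit, the result does not depend on the mass unit used.
    """
    moles = {el: mass_by_element[el] / ATOMIC_MASS[el] for el in ("C", "N", "P")}
    return {el: moles[el] / moles["P"] for el in ("C", "N", "P")}


# Toy sample constructed to have Redfield proportions.
single_cell = {"C": 106 * 12.011, "N": 16 * 14.007, "P": 1 * 30.974}
# The same composition at a scale a million times larger.
community = {el: mass * 1e6 for el, mass in single_cell.items()}

ratio_cell = molar_cnp_ratio(single_cell)
ratio_community = molar_cnp_ratio(community)
print(ratio_cell)
print(all(math.isclose(ratio_cell[el], ratio_community[el]) for el in ratio_cell))
```

The two ratios agree even though the absolute quantities differ by six orders of magnitude, which is what makes the stoichiometric approach usable from single cells to communities.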

## **TECHNIQUES BASED ON STABLE ISOTOPES TO QUANTIFY METABOLIC ACTIVITY** *IN SITU*

The intertwined nature of the coral-dinoflagellate endosymbiosis has long hampered research on coral metabolic function, as studies often suffer from potential cross-contamination between coral host and dinoflagellate fractions (Yellowlees et al., 2008). In this context, the recent development of approaches combining incubations with stable isotopes and analysis of elements, DNA, RNA and other biomarkers has been a revolutionary step, allowing the detection of metabolically active microbes in their natural habitat (for reviews of these techniques and their application in microbiology, see Musat et al., 2008; Orphan and House, 2009; Wagner, 2009; Murrell and Whiteley, 2011). Stable isotope probing (SIP) is a relatively recent method to track the metabolic fate of an isotopically labeled compound. In a seminal study, Radajewski et al. (2000) reported that <sup>13</sup>C-DNA, produced during growth of methylotrophic bacteria on a <sup>13</sup>C-enriched carbon source, could be resolved from <sup>12</sup>C-DNA by density-gradient centrifugation, allowing both taxonomic and functional characterization by gene probing and sequence analysis. This method was described as DNA-SIP, and was soon followed by RNA-SIP (Manefield et al., 2002), which provided another novel way to link the phylogeny of microorganisms to their function. Since then, SIP has been developed with the use of different stable isotopes including <sup>15</sup>N and <sup>18</sup>O (for review, see Murrell and Whiteley, 2011).

Stable isotope probing techniques are not the only way to track metabolic incorporation of stable isotopes. Metabolomics, which involves the quantitative analysis of all metabolites present in cells and tissues, can be used in combination with stable isotopes and has the potential to play a key role in understanding coral symbiosis (for review, see: Gordon and Leggat, 2010). In a recent study, Dunn et al. (2012b) used stable isotopic incorporation from dissolved inorganic carbon (NaH<sup>13</sup>CO<sub>3</sub>) combined with HPLC-MS to investigate lipogenesis in a symbiotic cnidarian. Interestingly, their results indicated that fatty acids derived from photosynthetically fixed carbon were not used directly in host lipogenesis, suggesting that additional sources of carbon, such as host respiration and heterotrophy, may be especially important for the lipogenesis of fatty acids in the cnidarian host. Another technique that uses stable isotopes to detect metabolically active microbes in their natural environment is nano-scale secondary ion mass spectrometry (NanoSIMS). With NanoSIMS, secondary ions are extracted from the surface of a sample under the impact of a primary ion beam and subsequently analyzed using mass spectrometry, providing imaging and quantification of up to seven isotopes of elements simultaneously (Wagner, 2009). When combined with stable isotope incubation, NanoSIMS can be used to measure the relative metabolic contribution of different symbiotic partners (i.e., metabolic rates of individual host and symbiont cells) with single-cell resolution (Pernice et al., 2012; Kopp et al., 2013). This technique can also be used in concert with *in situ* hybridizations to simultaneously identify individual cells and quantify their substrate uptake (Behrens et al., 2008; Musat et al., 2008).
Although this combination of NanoSIMS and *in situ* hybridization has never been applied to coral, studies integrating these powerful methodologies can significantly improve our understanding of the functional diversity that exists at the very heart of reef-building corals (Pernice et al., 2014).
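
To make the idea of quantifying label incorporation per cell concrete, the sketch below converts ion counts for a heavy and a light isotope into a ratio and expresses the enrichment of labeled cells relative to an unlabeled control in delta notation (per mil). This is a generic illustration: the function names and all counts are hypothetical toy values, and the choice of an unlabeled control as the reference is an assumption rather than a description of any cited protocol.

```python
import math


def isotope_ratio(heavy_counts: float, light_counts: float) -> float:
    """Ratio of heavy to light isotope counts for one region of interest."""
    if light_counts <= 0:
        raise ValueError("light isotope counts must be positive")
    return heavy_counts / light_counts


def delta_permil(r_sample: float, r_control: float) -> float:
    """Enrichment relative to a control ratio, in per mil:
    delta = (R_sample / R_control - 1) * 1000.
    """
    return (r_sample / r_control - 1.0) * 1000.0


# Toy counts (heavy, light) invented for illustration.
control_ratio = isotope_ratio(1_100, 100_000)  # unlabeled reference
regions = {
    "symbiont cell": (2_200, 100_000),
    "host cell": (1_320, 100_000),
}

enrichment = {
    name: delta_permil(isotope_ratio(heavy, light), control_ratio)
    for name, (heavy, light) in regions.items()
}
for name, delta in enrichment.items():
    print(f"{name}: {delta:.0f} per mil")
```

Comparing such values between host and symbiont regions of the same image is one simple way to express their relative uptake of a labeled substrate.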

By allowing direct empirical evaluation of a proposed hypothesis, these functional approaches combining incubations with stable isotopes and biomarkers or elemental analysis can help address fundamental questions in coral symbiosis. Among others, an interesting hypothesis that remains poorly addressed is whether photosynthetically fixed carbon contributes directly to calcification of the coral skeleton. Indeed, previous studies have demonstrated that photosynthesis and skeleton formation are tightly coupled in zooxanthellate scleractinian corals, calcification being, on average, three times higher in light than in darkness [for review, see Gattuso et al. (1999)]. However, the details of carbon supply to the calcification process are almost unknown. In this respect, elemental analyses combined with incubations using multiple stable isotopes focusing on skeletal formation [e.g., using <sup>86</sup>Sr (Houlbreque et al., 2009)] and fixation of carbon via photosynthesis [e.g., using NaH<sup>13</sup>CO<sub>3</sub> (Pernice et al., 2014)] have great potential to address this hypothesis and may be critical in understanding the link between photosynthesis and skeletal formation.

**FIGURE 1 | […] function to study the impact of the environment on coral symbiosis.** The "omics" approaches can be used to describe the system under different […] can subsequently be targeted by gene silencing and techniques based on stable isotopes, respectively, to direct hypothesis-driven research.

When integrated with recent advances in molecular biology, including "omics" approaches and next generation siRNA delivery systems, approaches based on incubation with stable isotopes could also provide important opportunities for systems biology (Raes and Bork, 2008; **Figure 1**). Discovering the relationships between gene functions and the sum of all metabolic processes (i.e., nutrients and energy cycling) occurring within symbiotic corals could be an especially important next step to better understand the biology and functioning of this symbiosis.

#### **ACKNOWLEDGMENTS**

The authors thank two anonymous reviewers for their valuable and constructive comments on the first version of the manuscript, which helped improve the quality of this paper. Mathieu Pernice is supported by the Plant Functional Biology and Climate Change Cluster, University of Technology Sydney.

#### **REFERENCES**


Shoguchi, E., et al. (2013). Draft assembly of the *Symbiodinium minutum* nuclear genome reveals dinoflagellate gene structure. *Curr. Biol.* 23, 1399–1408. doi: 10.1016/j.cub.2013.05.062


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 May 2014; accepted: 05 August 2014; published online: 21 August 2014. Citation: Pernice M and Levy O (2014) Novel tools integrating metabolic and gene function to study the impact of the environment on coral symbiosis. Front. Microbiol. 5:448. doi: 10.3389/fmicb.2014.00448*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Pernice and Levy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 22 August 2014 doi: 10.3389/fmicb.2014.00422

# The engine of the reef: photobiology of the coral–algal symbiosis

## *Melissa S. Roth\**

Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Michael P. Lesser, University of New Hampshire, USA Roberto Iglesias-Prieto, Universidad Nacional Autónoma de México, Mexico

#### *\*Correspondence:*

Melissa S. Roth, Department of Plant and Microbial Biology, University of California Berkeley, 441 Koshland Hall, Berkeley, CA 94720-3201, USA e-mail: melissa.s.roth@gmail.com

Coral reef ecosystems thrive in tropical oligotrophic oceans because of the relationship between corals and endosymbiotic dinoflagellate algae called Symbiodinium. Symbiodinium convert sunlight and carbon dioxide into organic carbon and oxygen to fuel coral growth and calcification, creating habitat for these diverse and productive ecosystems. Light is thus a key regulating factor shaping the productivity, physiology, and ecology of the coral holobiont. Similar to all oxygenic photoautotrophs, Symbiodinium must safely harvest sunlight for photosynthesis and dissipate excess energy to prevent oxidative stress. Oxidative stress is caused by environmental stressors such as those associated with global climate change, and ultimately leads to breakdown of the coral–algal symbiosis known as coral bleaching. Recently, large-scale coral bleaching events have become pervasive and frequent, threatening and endangering coral reefs. Because the coral–algal symbiosis is the biological engine producing the reef, the future of coral reef ecosystems depends on the ecophysiology of the symbiosis. This review examines the photobiology of the coral–algal symbiosis with particular focus on the photophysiological responses and timescales of corals and Symbiodinium. Additionally, this review summarizes the light environment and its dynamics, the vulnerability of the symbiosis to oxidative stress, the abiotic and biotic factors influencing photosynthesis, the diversity of the coral–algal symbiosis, and recent advances in the field. Studies integrating physiology with the developing "omics" fields will provide new insights into the coral–algal symbiosis. Greater physiological and ecological understanding of the coral–algal symbiosis is needed for protection and conservation of coral reefs.

**Keywords: scleractinian corals, dinoflagellate,** *Symbiodinium***, photophysiology, ecophysiology, acclimation, photoprotection**

## **INTRODUCTION**

Coral reefs flourish as one of the world's most diverse and productive ecosystems. Economic goods and ecosystem services of coral reefs are valued at over US \$20 trillion annually (Costanza et al., 1997; de Groot et al., 2012). Despite their immense biological, economic, and societal significance, coral reefs are declining worldwide due to a myriad of threats on multiple scales. Synergies of global stressors (e.g., ocean warming and acidification) and local stressors (e.g., over-fishing and coastal development) accelerate the degradation of coral reefs (Hughes et al., 2003; Hoegh-Guldberg et al., 2007). Because coral reefs are at risk of global decline and corals are the keystone species of the ecosystem, it is critical to understand the dynamics of coral biology that govern responses and tolerances to environmental variability and change.

Coral reefs are a paradoxical ecosystem, "an oasis in a desert ocean" (Odum, 1971), in which corals build complex structures teeming with life in shallow, oligotrophic oceans (**Figures 1A,B**). This calcium carbonate bioconstruction, so extensive it is visible from outer space, is powered by the coral–algal symbiosis. Dinoflagellate algae live within the cells of corals and provide their hosts with most if not all the energy needed to meet the coral's metabolic demands (**Figures 1C,D**; Muscatine, 1990).

Reef-building corals (phylum Cnidaria, class Anthozoa, order Scleractinia) host endosymbiotic dinoflagellates of the genus *Symbiodinium* (kingdom Chromalveolata, division Pyrrhophyta, class Dinophyceae), which are often referred to as zooxanthellae (zoo = animal and xanth = yellow) in the literature (Freudenthal, 1962). Similar to other photoautotrophs, *Symbiodinium* must delicately balance the sunlight absorbed and processed through photochemistry to sustain high rates of primary productivity without incurring damage. The fixed carbon produced by *Symbiodinium* is translocated to fuel coral growth and calcification (Goreau, 1959; Muscatine, 1990). Additionally, the oxygen produced as a by-product of photosynthesis may promote maximum coral calcification rates (Colombo-Pallotta et al., 2010). In return, corals provide their endosymbionts with essential nutrients in a safe, sunlit habitat in nutrient-poor oceans. This symbiosis is unique because it involves two eukaryotic organisms and the genome of the symbiont is three times larger than the genome of its host (Shinzato et al., 2011; Shoguchi et al., 2013). Prokaryotes and viruses are also associated with corals and *Symbiodinium*, but their roles are mostly uncharacterized (Ainsworth et al., 2010). The tight recycling and conservation of nutrients within the coral holobiont (the coral and its collective community) allows coral reefs to thrive in tropical nutrient-poor oceans. It should also be noted that there are corals without *Symbiodinium*; they neither require sunlight for nourishment nor build coral reefs and thus are not discussed in this review. The survival and success of coral reef ecosystems depend on the elegant symbiosis between reef-building corals and *Symbiodinium*.

**FIGURE 1 | "An oasis in a desert ocean": coral reef seascapes powered by the coral–algal symbiosis. (A)** Aerial view of coral reef architecture in shallow, oligotrophic tropical waters of Fiji. **(B)** Reef-building corals create habitats for vibrant communities boasting incredible biodiversity and productivity. This photograph was taken in the heart of the Coral Triangle in Raja Ampat, Indonesia. **(C)** Corals are colonial invertebrates, made up of genetically identical individual polyps connected by living tissue (coenosarc). The golden hue of the coral Seriatopora hystrix comes from symbiotic dinoflagellates located within its cells. Scale bar represents 1 cm. **(D)** The biological engine of the reef – endosymbiotic dinoflagellates of the genus Symbiodinium in coral cells: fluorescence microscopy image showing a Montipora capitata coral egg (green fluorescence from coral fluorescent proteins) and intracellular Symbiodinium (red fluorescence from chlorophyll). Symbiodinium provides photosynthetic products and oxygen to fuel coral growth and calcification. Scale bar represents 50 μm. (Images by M. S. Roth.)

Over the last few centuries coral reef ecosystems have endured a long trajectory of decline (Pandolfi et al., 2003), but coral reefs today face unprecedented levels of change and degradation at a global scale (Hoegh-Guldberg et al., 2007; Hoegh-Guldberg and Bruno, 2010). Changes in a suite of environmental conditions including temperature and light can lead to the breakdown and dissociation of the coral–algal symbiosis, which is called coral bleaching (Lesser, 2011). The timing and extent of coral bleaching primarily depend on the magnitude and duration of temperature anomalies as well as light levels, other environmental variables and the thermal history of the reef (Baker et al., 2008; Middlebrook et al., 2008; Strong et al., 2011). Bleached corals will die if not re-populated with *Symbiodinium*, but even recovered corals have reduced growth, regeneration, and fitness, and greater susceptibility to future bleaching (Jokiel and Coles, 1977; Goreau and Macfarlane, 1990; Meesters and Bak, 1993; Ward et al., 2000).

Because of the central role of *Symbiodinium* photosynthesis as the engine of the coral reef ecosystem, this review summarizes the critical components and timescales of the photobiology of the coral–algal symbiosis and the underlying factors influencing the responses. This review aims to reach an audience that extends beyond photobiologists to all scientists and managers who work on coral reefs to provide them with a basic understanding of the important concepts, fundamental mechanisms and principal players in the photobiology of the coral–algal symbiosis. The extraordinary challenges confronting coral reefs require greater physiological and ecological understanding of the coral–algal symbiosis for the protection and conservation of these majestic ecosystems.

## **LIGHT ENVIRONMENT OF THE CORAL–ALGAL SYMBIOSIS**

Light is a key regulating factor shaping the productivity, physiology, and ecology of the coral–algal symbiosis. Light quantity (photon flux) and quality (spectral composition) are determining characteristics of the symbiosis. Both macroscale (e.g., depth) and microscale (e.g., coral skeleton structure) features influence the light environment of the symbiosis.

#### **LIGHT QUANTITY**

To maintain high rates of productivity, coral reefs are predominantly located in shallow waters (<30 m). In shallow waters, corals can be exposed to high levels of downwelling irradiance of >2000 μmol photons m<sup>−2</sup> s<sup>−1</sup> at midday (Jimenez et al., 2012). Solar irradiance decreases exponentially with depth due to the scattering and absorbance of water itself as well as dissolved and particulate matter (**Figure 2A**; Dustan, 1982; Oliver et al., 1983; Shick et al., 1996; Lesser, 2000; Lesser et al., 2000; Kirk, 2010). Crevices, overhangs, and caves in addition to depth create low light habitats for corals. In low light environments, reef-building corals acclimate by reducing energetic requirements through decreasing tissue biomass, skeleton thickness, respiration rates, translocation, and growth (Anthony and Hoegh-Guldberg, 2003b). *Symbiodinium* in low light acclimated corals maximize the light absorption and utilization by increasing photosynthetic pigments and photosynthetic efficiency (Falkowski and Dubinsky, 1981; Anthony and Hoegh-Guldberg, 2003a,b). Reef-building corals are found throughout the photic zone with the deepest record of a reef-building coral living at 165 m (Maragos and Jokiel, 1986). Deep coral communities (>30 m), also called mesophotic coral reef ecosystems, inhabit low light environments with roughly <10% of surface irradiance (Lesser et al., 2009). Some corals such as *Montastraea cavernosa* can be found over a considerable depth range from 3 to 91 m and show a decline in gross photosynthesis and an increase in heterotrophy with depth (Lesser et al., 2010). In contrast, other corals such as *Leptoseris hawaiiensis* are restricted to the deeper zones (>60 m; Luck et al., 2013). Because of the inaccessibility of the mesophotic zone, coral physiology at these deeper depths is understudied but may provide unique insight into the coral–algal symbiosis.
As sunlight penetrates seawater, the amount of direct light rapidly decreases while the amount of light from the side (diffuse light) can remain fairly constant from 10 to 40 m (Frade et al., 2008). Therefore, deeper corals experience a more uniform light field than shallower corals as well as substantially lower irradiance. At depth, in addition to the reduced irradiance, there is a narrowing of the spectrum of light present (Dustan, 1982; Shick et al., 1996; Lesser, 2000; Lesser et al., 2000; Kirk, 2010). Thus, corals from different depths not only acclimate to different light quantities, but also to distinct light quality.
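The exponential decline of irradiance with depth described in this section can be illustrated with a minimal sketch. It assumes a single, depth-independent diffuse attenuation coefficient (here called `KD_ASSUMED`, a hypothetical value chosen only so that 10% of surface light remains at 30 m, in line with the "<10% of surface irradiance" figure quoted for mesophotic reefs). Real attenuation coefficients vary with wavelength and water clarity, so the numbers are illustrative rather than measured.

```python
import math


def irradiance_at_depth(surface_irradiance, kd, depth_m):
    """Downwelling irradiance at a given depth under simple exponential attenuation."""
    return surface_irradiance * math.exp(-kd * depth_m)


def depth_for_fraction(fraction, kd):
    """Depth (m) at which irradiance falls to the given fraction of the surface value."""
    return -math.log(fraction) / kd


# Hypothetical coefficient (m^-1): chosen so that 10% of surface light remains at 30 m.
KD_ASSUMED = math.log(10) / 30.0

# Midday surface irradiance quoted in the text (umol photons m^-2 s^-1).
SURFACE = 2000.0

# Depths mentioned in the text: shallow limit, mesophotic boundary, and deep records.
for depth in (3, 30, 91, 165):
    value = irradiance_at_depth(SURFACE, KD_ASSUMED, depth)
    print(f"{depth:>4} m: {value:10.4f} umol photons m^-2 s^-1")
```

Under this assumption, each additional 30 m of water removes another 90% of the remaining light, which is why small differences in depth or water clarity translate into large differences in the energy available to the symbiosis.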

#### **LIGHT QUALITY**

The spectral composition of light changes with depth because different wavelengths have distinct attenuation properties and organic matter preferentially absorbs particular wavelengths of light (**Figure 2B**; Dustan, 1982; Shick et al., 1996; Lesser, 2000; Lesser et al., 2000; Kirk, 2010). Blue light (400–500 nm) penetrates deepest in the oceans while ultraviolet radiation (UVR, 200–400 nm) and red light (620–740 nm) attenuate the fastest (Dustan, 1982; Shick et al., 1996; Lesser, 2000; Lesser et al., 2000; Kirk, 2010). While the oceans in the tropics are oligotrophic and thus relatively transparent, reefs near coastal areas can have high amounts of dissolved organic matter (DOM) from terrigenous inputs and upwelling (Shick et al., 1996; Lesser, 2000; Zepp et al., 2008; Banaszak and Lesser, 2009; Kirk, 2010; Kuwahara et al., 2010). The absorption and scattering of DOM and in particular colored DOM (CDOM) create the unique spectral composition found on coral reefs (Shick et al., 1996; Lesser, 2000; Zepp et al., 2008; Banaszak and Lesser, 2009; Kirk, 2010; Kuwahara et al., 2010). Thus, shallow corals experience high intensity UVR and full-spectrum light (400–700 nm), while mesophotic corals experience low levels of spectrally enriched blue light.

The spectral composition of light influences corals and their symbionts on molecular, cellular, biochemical, and behavioral levels. In clear tropical oceans, high energy UVR can penetrate to >20 m and is particularly damaging for cells (Shick and Dunlap, 2002); high doses of UVR irrespective of temperature induce coral bleaching (Gleason and Wellington, 1993). While UVR can be very damaging for the coral–algal symbiosis, blue light has the greatest influence on biology and physiology. Coral photoreceptors and circadian-clock genes respond to blue light (Gorbunov and Falkowski, 2002; Levy et al., 2007). Additionally, blue light affects coral bleaching during thermal stress (Fitt and Warner, 1995), antioxidant activity (Levy et al., 2006b), coral growth and chlorophyll *a* concentrations (Kinzie et al., 1984), fluorescent protein (FP) regulation (D'Angelo et al., 2008), polyp behavior (Gorbunov and Falkowski, 2002; Levy et al., 2003), and coral regeneration (Kaniewska et al., 2009). In cyanobacteria, blue light in addition to UVR damages the photosynthetic apparatus directly and inhibits its repair (Nishiyama et al., 2006); however, whether this remains true in *Symbiodinium* is unknown. Because different pigments absorb distinct wavelengths of light, the spectral composition of light influences photosynthesis. Corals collected from 3 m have double the rates of photosynthesis under full-spectrum light as compared to blue light, while corals of the same species collected from 40 m have double the rates of photosynthesis under blue light as compared to full-spectrum light (Mass et al., 2010b). A recent study comparing blue, red, and combined blue and red light suggests that red light alone or in combination with blue light has negative effects on symbiont health and survival (Wijgerde et al., 2014); because wavelengths of red light are attenuated quickly, only shallow corals will encounter red light. Corals and *Symbiodinium* have adapted to a variety of light environments, and light quality and quantity have significant impacts on the physiology, ecology and evolution of the photosynthetic system and the coral–algal symbiosis.

#### **LOCAL LIGHT ENVIRONMENT CREATED BY THE CORAL**

Whereas the characteristics of the underwater light field are universal for all marine organisms within a specific location, many properties of the coral itself create a distinctive local light environment for the coral–algal symbiosis. Every component of the coral–algal symbiosis from the mucus layer to the calcium carbonate skeleton can influence the light propagating through corals and reaching *Symbiodinium* (Wangpraseurt et al., 2014). Light can be scattered, absorbed, or re-emitted as fluorescence by various components of corals and *Symbiodinium* (Kühl et al., 1995; Salih et al., 2000; Enríquez et al., 2005; Kaniewska et al., 2011; Wangpraseurt et al., 2012, 2014; Marcelino et al., 2013). The extensive genetic and environmental variability influencing each of these characteristics adds complexity to understanding the photobiology of the coral–algal symbiosis. The coral produces a highly refractive extracellular skeleton that enhances light and increases absorption (Enríquez et al., 2005). The microstructure of the skeleton creates multiple scattering of light resulting in 3–20 times higher light levels within a coral cell than in the adjacent water column (**Figure 2C**; Kühl et al., 1995; Enríquez et al., 2005; Marcelino et al., 2013). Therefore, if photons are not absorbed by the coral or its symbiont as incident light, the skeleton scatters the light as diffuse reflectance and presents more opportunities for photons to be absorbed. A recent study provides evidence that light can travel laterally a distance of ∼2 cm within the tissues of corals (Wangpraseurt et al., 2014). The light propagation properties in intact corals reduce the effects of self-shading and allow *Symbiodinium* to maximize light absorption with low investment in pigments (Enríquez et al., 2005; Wangpraseurt et al., 2012, 2014; Marcelino et al., 2013). *Symbiodinium* in corals can have high gross photosynthetic rates and quantum efficiencies can reach near theoretical limits under moderate irradiances (Rodríguez-Román et al., 2006; Brodersen et al., 2014). Early studies vary widely in reported quantum efficiencies (Dubinsky et al., 1984; Wyman et al., 1987; Lesser et al., 2000), which may have been caused by an underestimation of the absorption cross-section of chlorophyll, differences in light levels during measurements, or differences among corals in light scattering, tissue thickness, and skeletal morphology (for more discussion see Section "Photosynthesis"). Corals with complex morphologies and thick tissue encompass a variety of light microniches.
Examples of light heterogeneity within a coral colony include the gradient of light through thick coral tissue and the precise location within a coral colony (e.g., the top will receive significantly more light than the bottom of a branch or the side of a colony; Kaniewska et al., 2011; Wangpraseurt et al., 2012; Brodersen et al., 2014). The light environment can determine the corals' capacity for growth and reproduction (Goreau and Goreau, 1959; Kojis and Quinn, 1984) because corals obtain significant amounts of energy and oxygen from *Symbiodinium* primary production (Muscatine, 1990; Colombo-Pallotta et al., 2010).

## **DYNAMICS OF LIGHT OF THE CORAL–ALGAL SYMBIOSIS**

Light is one of the most predictable yet stochastic environmental variables of the coral–algal symbiosis. Light in the ocean is incredibly dynamic over a variety of timescales from milliseconds to thousands of years (**Table 1**). The most pronounced but consistent light cycle is the diurnal light cycle, in which *Symbiodinium* switches from producing oxygen via photosynthesis to consuming oxygen via respiration. This switch causes the environment within coral cells to change from hyperoxic during the day to hypoxic during the night (Kühl et al., 1995), and was first observed within the tissues of symbiotic sea anemones (Dykens and Shick, 1982). The amount of oxygen generated by *Symbiodinium* within coral cells can be so extensive that some corals release bubbles with high amounts of oxygen and even change the level of oxygen in the surrounding environment (D'Aoust et al., 1976; Crossland and Barnes, 1977). Coral calcification is called light-enhanced calcification because it is tightly linked with photosynthesis and corresponds with the diurnal cycle (Gattuso et al., 1999). Recent evidence suggests that the oxygen produced from photosynthesis during the day is required for maximum rates of calcification (Colombo-Pallotta et al., 2010). For more information on coral growth and calcification see reviews dedicated to the subject (e.g., Gattuso et al., 1999; Allemand et al., 2011; Tambutté et al., 2011). The diurnal light cycle and seasonal periodicity are responsible for the rhythmic responses of the circadian clock in the coral–algal symbiosis (Levy et al., 2011; Sorek et al., 2014).

During the day, many factors influence the amount of solar energy the coral–algal symbiosis receives. Waves on the surface of the ocean act as lenses causing the sunlight to focus and defocus creating 100-fold changes in light intensity on millisecond timescales (Stramski and Dera, 1988; Falkowski and Chen, 2003). Sunlight flashes in shallow waters can exceed 9000 μmol photons m<sup>−2</sup> s<sup>−1</sup> and occur >350 times per minute (Veal et al., 2010). Additionally, marine organisms such as fish swim over corals and temporarily shade them. Shading from clouds and storms can reduce irradiance by 40-fold and last for minutes or weeks (Falkowski and Chen, 2003; Anthony et al., 2004). The irradiance of corals is also affected by the tidal cycle, which alters the depth of the water column and can even cause shallow corals to become subaerially exposed during extreme low tides (Brown et al., 1994; Anthony et al., 2004; Jimenez et al., 2012). Throughout the year, changes in day length and solar declination modify the amount of sunlight available (Kirk, 2010). It should be noted that light is not only an indirect source of energy for corals, but also provides informational signals for reproduction and spawning, which are tightly linked to the lunar cycle (Harrison and Wallace, 1990; Levy et al., 2007). The complex dynamics of interweaving random and cyclic processes that govern light availability have profound effects on photosynthesis and coral–algal physiology.

## **PHOTOSYNTHETIC SYMBIOSES IN CORALS INCREASE SUSCEPTIBILITY TO OXIDATIVE STRESS**

**Table 1 | Timescales of light dynamics and responses by the coral–algal symbiosis.**

References for light dynamics, coral responses, and Symbiodinium responses classified as "L," "C," and "S," respectively. \*General timescale for photosynthetic eukaryotes.

**Table 2 | Summary of light absorbing and emitting pigments, proteins, and compounds in the coral–algal symbiosis.**

Secondary peaks listed in parentheses.

<sup>a</sup>Measured in algal cells.

<sup>b</sup>Estimated in vivo absorption.

<sup>c</sup>Measured in purified thylakoid membranes.

<sup>d</sup>Measured in purified PSII.

<sup>e</sup>Measured in ethanol extracts.

Photosynthesis, the conversion from solar energy to chemical energy, is one of the most important processes on our planet. Using sunlight, oxygenic photosynthetic organisms, such as *Symbiodinium*, convert carbon dioxide and water into organic carbon. This process also generates oxygen, which supports aerobic life on Earth. In reef-building corals, photosynthesis by *Symbiodinium* provides most of the energy needed for corals to build the infrastructure of the reef (Goreau, 1959; Muscatine, 1990). The primary photosynthetic pigments of *Symbiodinium*, chlorophyll *a*, chlorophyll *c*<sub>2</sub>, and peridinin, determine which wavelengths of light are utilized in photosynthesis (**Table 2**). Light-harvesting complexes capture photons of light and transfer the energy to the photosynthetic electron transport chain. Light-induced linear electron flow from water to NADPH involves electron transfer from photosystem II (PSII) to photosystem I (PSI) via the cytochrome *b*<sub>6</sub>*f* complex to generate ATP (for diagram of arrangement see Eberhard et al., 2008). Cyclic electron flow must run in concert with linear electron transport for efficient photosynthesis (Munekage et al., 2004). Cyclic electron flow utilizes PSI and cytochrome *b*<sub>6</sub>*f* to build a high proton motive force and thus ATP. Photosynthetically derived NADPH and ATP are used to drive the fixation of carbon dioxide in the Calvin–Benson cycle as well as other metabolic processes in the chloroplast. The reaction centers, PSI and PSII, are embedded in the thylakoid membrane of the chloroplast.
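For orientation, the overall conversions described in this paragraph can be written as simplified net equations. These are standard textbook stoichiometries rather than expressions taken from the review: (CH<sub>2</sub>O) stands for one unit of fixed carbohydrate, and the ATP produced by linear and cyclic electron flow is omitted because its yield per electron is not fixed.

```latex
% Net oxygenic photosynthesis (simplified)
\[
\mathrm{CO_2 + H_2O \;\xrightarrow{\text{light}}\; (CH_2O) + O_2}
\]

% Linear electron flow from water to NADPH (ATP omitted)
\[
\mathrm{2\,H_2O + 2\,NADP^+ \;\xrightarrow{\text{light}}\; O_2 + 2\,NADPH + 2\,H^+}
\]
```

The second equation makes explicit why oxygen is an unavoidable by-product of linear electron flow: the electrons that reduce NADP<sup>+</sup> are taken from water at PSII.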

While endosymbiont photosynthesis serves as the engine to power the growth and calcification of coral reefs, sunlight capture, absorption, and utilization present a high potential for photooxidative damage. Oxidative stress results from the production and accumulation of reactive oxygen species (ROS) and can damage lipids, proteins and DNA and signal cell apoptosis or exocytosis (Gates et al., 1992; Lesser, 1997; Hoegh-Guldberg, 1999; Franklin et al., 2004; Lesser and Farrell, 2004; Lesser, 2006). Oxidative stress is considered the unifying mechanism for a number of environmental insults that elicit coral bleaching (Lesser, 2011), resulting in the loss of *Symbiodinium* from host cells via mechanisms such as apoptosis, exocytosis, and necrosis (reviewed in Gates et al., 1992).

Although light is required for photosynthesis, excess light can be extraordinarily harmful for photosynthetic organisms and their hosts. There are four main fates for sunlight absorbed by a photosynthetic organism, depicted in the "funnel scheme" in **Figure 3**. The principal role for absorbed sunlight is to drive the photochemical reactions of photosynthesis. However, due to the dynamic nature of sunlight, the photosynthetic apparatus often receives more light than can be processed through photochemistry and the excess light must be diverted away from carbon assimilation and utilized by other pathways to minimize photo-oxidative damage (Niyogi, 1999; Müller et al., 2001). The absorbed excitation energy can also be re-emitted as chlorophyll fluorescence (red light), dissipated as heat which is termed non-photochemical quenching (NPQ), or decayed via the chlorophyll triplet state in which ROS are produced (**Figure 3**; Asada, 1999; Müller et al., 2001). On a sunny day, *Symbiodinium* in shallow corals dissipate four times more light energy than is used in photosynthesis (Gorbunov et al., 2001). Experimentally, corals under typical irradiances of coral reefs (640 μmol photons m<sup>−2</sup> s<sup>−1</sup>) dissipate 96% of the energy and use only 4% of absorbed light energy for photosynthesis (Brodersen et al., 2014). Highly reactive intermediates and by-products such as ROS can cause photo-oxidative damage to the photosynthetic apparatus and are inevitably produced during photosynthesis (Niyogi, 1999). Therefore, the photosynthetic system is constantly repairing itself from the damage (Niyogi, 1999). If the rate of damage exceeds the rate of repair, there will be reductions in photosynthetic efficiency and/or maximum rates of photosynthesis, which is called photoinhibition (Niyogi, 1999). Oxidative damage can decrease the outflow from the funnel, which intensifies the problem through increased production of ROS (**Figure 3**).
Consequently, photosynthetic organisms have numerous photoprotective strategies. For example, adjusting the size of the light-harvesting complexes (volume of the funnel), photosynthetic capacity (rate of the primary outflow of the funnel), and NPQ capacity (rate of the secondary outflow of the funnel) can vary how much energy can be accommodated and how much excess energy or spillover there is. The rates of photochemical reactions and turnover rates of electron sinks (outflow from the funnel) are sensitive to changes in temperature and low temperatures can cause an energy imbalance and overexcitation of PSII (Huner et al., 1998; Nobel, 2005). Additionally, increases in temperature can change the repair rates of photosynthetic proteins and thus indirectly affect outflow from the funnel (Warner et al., 1999; Takahashi et al., 2004). Changes in temperature can also disturb thylakoid membrane fluidity and decrease the outflow from the funnel through the uncoupling of photosynthetic energy transduction and a reduction in carbon assimilation from the leaking of protons and consequently decrease ATP production (Tchernov et al., 2004). Other photoprotective processes include photorespiration, water–water cycle, antioxidant systems, and repair and new synthesis of proteins (Niyogi, 1999). Photosynthetic organisms balance the light entering and exiting the photosynthetic apparatus (the funnel) to maximize photosynthesis under the conditions the organism lives in while preventing oxidative damage. Excess light (flow into the funnel) and/or changes in temperature (direct and indirect effects on flow out of the funnel) are principal factors causing energy imbalance in photosynthetic organisms (Huner et al., 1998; Nobel, 2005). All of these processes ultimately influence the health of the coral–algal symbiosis and the propensity for bleaching.

**FIGURE 3 | Pathways of light energy utilization by** *Symbiodinium***.** The funnel scheme of the photosynthetic apparatus depicts the possible fates of absorbed light. When sunlight is absorbed by chlorophyll, the singlet-state excitation of chlorophyll (<sup>1</sup>Chl\*) is formed and the excitation energy can be (1) used to drive photochemistry, (2) re-emitted as fluorescence, (3) dissipated as heat (NPQ), or (4) decayed via the chlorophyll triplet state (<sup>3</sup>Chl\*), which produces reactive oxygen species as a by-product. Multiple types of reactive oxygen species are produced during photosynthetic electron flow. When the light exceeds what can be processed through these pathways, there is a high potential for the accumulation of reactive oxygen species and ultimately oxidative stress (inspired by Müller et al., 2001 and Demmig-Adams and Adams, 2002).
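The funnel scheme of Figure 3 can be made concrete with a small bookkeeping sketch. This is a deliberately simplified illustration and not a physiological model: it assumes that absorbed light is first used by photochemistry up to a fixed capacity, that the remainder is dissipated by NPQ up to a second fixed capacity, and that whatever is left is unquenched excess with the potential to generate ROS. Fluorescence is ignored, the pathways are treated as sequential rather than competing, and the function name, capacities, and units are all hypothetical.

```python
def partition_absorbed_light(absorbed, photochemistry_capacity, npq_capacity):
    """Split absorbed light among photochemistry, NPQ, and unquenched excess.

    All quantities share the same arbitrary units. The sequential filling of
    pathways is an illustrative assumption, not a measured mechanism.
    """
    if min(absorbed, photochemistry_capacity, npq_capacity) < 0:
        raise ValueError("inputs must be non-negative")

    # Primary outflow of the funnel: photochemistry, limited by its capacity.
    photochemistry = min(absorbed, photochemistry_capacity)
    remaining = absorbed - photochemistry

    # Secondary outflow: thermal dissipation (NPQ), limited by its capacity.
    npq = min(remaining, npq_capacity)

    # Spillover: energy neither used nor safely dissipated.
    excess = remaining - npq

    return {"photochemistry": photochemistry, "npq": npq, "excess": excess}


# Low light: everything absorbed is used by photochemistry.
print(partition_absorbed_light(50, 100, 300))

# High light: both outflows saturate and some energy spills over.
print(partition_absorbed_light(500, 100, 300))

# Same light, but a stress-reduced photochemical capacity increases the spillover.
print(partition_absorbed_light(500, 60, 300))
```

The third call mirrors the point made in the text: a stressor that narrows the primary outflow raises the excess even when the incoming light is unchanged.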

The photosynthetic apparatus is a flexible molecular machine that is highly conserved among eukaryotes (Falkowski and Raven, 2007; Eberhard et al., 2008). However, *Symbiodinium* photosynthesis occurs within an animal cell creating additional complexities. During the day, *Symbiodinium* generates high amounts of oxygen as a by-product of photosynthesis. Despite the fact that the coral absorbs oxygen during respiration, the coral cell becomes hyperoxic and may even produce bubbles of oxygen (D'Aoust et al., 1976; Crossland and Barnes, 1977; Kühl et al., 1995). The excess oxygen makes both the coral and its symbiont susceptible to oxidative stress (Lesser, 2006). Because the highly reflective coral skeleton enhances light within the coral cell, the loss of photosynthetic pigments and/or symbionts through coral bleaching increases the local irradiance and aggravates the negative effects of the stressful environmental conditions (**Figure 2C**; Enríquez et al., 2005; Marcelino et al., 2013): bleaching can result in a 150% increase in scalar irradiance within coral tissues as compared to a healthy coral (Wangpraseurt et al., 2012). *Symbiodinium* photosynthesis is sensitive to changes in temperature and light (Lesser et al., 1990; Iglesias-Prieto et al., 1992; Lesser and Farrell, 2004; Roth et al., 2012; Downs et al., 2013). A recent study on *Symbiodinium* in corals provides evidence that light stress without heat stress causes fusion of thylakoid lamellae concurrent with photo-oxidative damage, heat stress without light stress causes decomposition of thylakoid structures which consequently generates photo-oxidative stress, and combined heat and light stresses induce both pathomorphologies (Downs et al., 2013). In nature, heat stress that produces coral bleaching generally occurs over weeks (Strong et al., 2011), which would mean that heat stress will be concurrent with daylight.
Season, cloud cover, water clarity, and waves among other parameters determine the irradiance levels corals are exposed to. Because *Symbiodinium* is generally more susceptible to heat stress than its coral host (Strychar and Sammarco, 2009), *Symbiodinium* can become a substantial source of ROS during heat stress (Yakovleva et al., 2009). Excess ROS is transferred to and accumulates in the host (Levy et al., 2006b), and correspondingly the gene expression response to heat stress is larger in the coral than the symbiont (Leggat et al., 2011a). When corals bleach, the main source of ROS production is removed, although it is important to acknowledge that the host itself may also be producing ROS (Lesser, 2006; Weis, 2008). The delicate balance of *Symbiodinium* light absorption and utilization within the hyperoxic cells of corals in a dynamic environment makes the coral–algal symbiosis vulnerable to oxidative stress.

Because of the importance of oxidative stress in the coral–algal symbiosis and its role in coral bleaching, a brief discussion of ROS production, damage, and cellular defenses is warranted. There are a variety of types of ROS, with different degrees of reactivity and diffusivity across membranes, including: singlet oxygen (<sup>1</sup>O<sub>2</sub>\*), superoxide (O<sub>2</sub><sup>−</sup>), hydrogen peroxide (H<sub>2</sub>O<sub>2</sub>), hydroxyl radical (<sup>•</sup>OH), and the reactive nitrogen species nitric oxide (NO) and peroxynitrite anion (ONOO<sup>−</sup>; Lesser, 2006). The major sites of ROS production are the chloroplast (light-harvesting complexes, PSI and PSII), mitochondria (inner membrane), and the endoplasmic reticulum (Niyogi, 1999; Lesser, 2006). The main targets of oxidative damage in *Symbiodinium* are the D1 protein of PSII and its repair mechanism, the enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) of the Calvin–Benson cycle, and thylakoid membranes (Lesser, 1996; Warner et al., 1999; Takahashi et al., 2004; Tchernov et al., 2004). The cellular mechanisms of photoinhibition and coral bleaching are not described here as the topic has been recently reviewed elsewhere (see Lesser, 2006, 2011; Weis, 2008). The coral–algal symbiosis has an arsenal of defenses to combat ROS and neutralize damage including the antioxidant enzymes superoxide dismutase (SOD), catalase (CAT), and peroxidase, and the nonenzymatic antioxidants ascorbic acid, glutathione, tocopherol, carotenoids, uric acid, dimethylsulfide, dimethylsulphoniopropionate, and mycosporine-like amino acids (MAAs; Lesser, 2006). SOD catalyzes the conversion of O<sub>2</sub><sup>−</sup> into H<sub>2</sub>O<sub>2</sub> and O<sub>2</sub>, while CAT and peroxidase catalyze the conversion of H<sub>2</sub>O<sub>2</sub> into H<sub>2</sub>O and O<sub>2</sub>. There is some evidence that accumulation of H<sub>2</sub>O<sub>2</sub> is the primary ROS causing loss of *Symbiodinium* in corals (Sandeman, 2006). Recently, it has been suggested that the enzyme mitochondrial alternative oxidase (AOX) in *Symbiodinium* could compete for electrons and reduce oxidative stress in the mitochondria (Oakley et al., 2014). In addition to the suite of photoprotective defenses, the coral–algal symbiosis employs a variety of approaches to optimize photosynthesis to maintain high rates of productivity under distinct light environments (see Sections "Photobiology of Corals" and "Photobiology of *Symbiodinium*").
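The enzymatic detoxification steps named in this paragraph can be written out as balanced reactions. These are standard textbook stoichiometries added here for clarity, not equations given in the review. In the peroxidase reaction, AH<sub>2</sub> is a generic placeholder for the reducing substrate (for example ascorbate or glutathione), whose identity depends on the specific peroxidase.

```latex
% Superoxide dismutase (SOD): dismutation of superoxide
\[
\mathrm{2\,O_2^{\bullet -} + 2\,H^+ \;\xrightarrow{\text{SOD}}\; H_2O_2 + O_2}
\]

% Catalase (CAT): decomposition of hydrogen peroxide
\[
\mathrm{2\,H_2O_2 \;\xrightarrow{\text{CAT}}\; 2\,H_2O + O_2}
\]

% Peroxidase: reduction of hydrogen peroxide by a generic substrate AH2
\[
\mathrm{H_2O_2 + AH_2 \;\xrightarrow{\text{peroxidase}}\; 2\,H_2O + A}
\]
```

Written this way, it is clear that SOD alone does not remove ROS but converts superoxide into hydrogen peroxide, which is why its activity needs to be matched by CAT or peroxidase activity.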

#### **PHOTOBIOLOGY OF CORALS**

Extreme light intensity can be directly damaging for corals as well as indirectly harmful via the cascade of events that can occur from photo-oxidative damage. Corals may control the light *Symbiodinium* receives because the symbiont is located within the host oral endoderm cells inside a vacuole called the symbiosome (**Figure 2C**). Most corals live in shallow, oligotrophic habitats characterized by high light. However, corals also inhabit low light environments such as in caves or under overhangs in the shallows or in the mesophotic zone, where light becomes diffuse and monochromatic. Because adult corals are sessile, long-term acclimation to their particular light environment results in dramatic differences between high and low light corals (Falkowski and Dubinsky, 1981; Anthony and Hoegh-Guldberg, 2003b). Corals regulate antioxidants, pigments, gene expression, behavior, and architectural levels over multiple timescales in response to changes in ambient light to optimize fitness of the coral–algal symbiosis (**Table 1**).

#### **ENZYMATIC ANTIOXIDANTS**

Because photosynthesis invariably produces ROS, enzymes that can neutralize ROS have a fundamental role in coral photophysiology. Corals synthesize the enzymes SOD and CAT (Shick et al., 1995), which work together to convert O2<sup>−</sup> and H2O2 into H2O and O2. The activity of SOD, but not that of CAT, declines with depth in shallow corals, suggesting a relationship with the potential for oxidative stress (Shick et al., 1995). Both SOD and CAT activities follow a diurnal pattern, increasing with light and photosynthesis and decreasing at night (Levy et al., 2006a). In just two days, corals significantly increase the activities of SOD and CAT under blue light and decrease them in prolonged darkness (Levy et al., 2006a,b). Levy et al. (2006a) found that the antioxidant response of the host was larger than that of the symbiont. As expected, gene expression of at least some antioxidants is coupled with the diurnal light cycle (Levy et al., 2011).

#### **MYCOSPORINE-LIKE AMINO ACIDS**

Because mycosporine-like amino acids are small molecules that absorb UVR and have antioxidant activities (**Table 2**), they play an essential role in photoprotection of marine organisms (Shick and Dunlap, 2002). While MAAs accumulate in host tissues (Shick and Dunlap, 2002), it is unclear which partner of the symbiosis synthesizes them. Originally it was presumed that MAAs were synthesized by *Symbiodinium* because of their presence in *Symbiodinium* in culture (Shick and Dunlap, 2002); however, the recent sequencing of the genome of the coral *Acropora digitifera* shows that the host also has the genes required for the biosynthesis of MAAs (Shinzato et al., 2011). It is also hypothesized that corals can acquire MAAs through their diet (Shick and Dunlap, 2002). Changes in concentration of MAAs occur over days, with primary MAAs appearing first and secondary MAAs, which are synthesized from precursor MAAs, developing later (Shick, 2004). MAAs are also found in the coral mucus, where they absorb ∼10% of UVR (Teai et al., 1998). It is currently unknown which partner contributes which types of MAAs found in the coral–algal symbiosis. Because MAAs are thermally stable, they may play an important role scavenging free radicals and quenching singlet oxygen during heat stress (Banaszak and Lesser, 2009).

#### **FLUORESCENT PROTEINS**

Fluorescent proteins (FPs) absorb higher energy light and re-emit lower energy light. Corals produce a variety of FPs that absorb between ∼400–600 nm and fluoresce between ∼480–610 nm with Stokes shifts ranging from ∼10–90 nm (**Table 2**; Alieva et al., 2008). The most common FP is the green FP (GFP), but corals also produce cyan FPs (CFP), red FPs (RFP), and even chromoproteins (CP), which absorb light but do not fluoresce (**Table 2**; Salih et al., 2000; Matz et al., 2002; Alieva et al., 2008; Gruber et al., 2008). The FP superfamily exhibits diversity in color while remaining similar on a structural level (Tsien, 1998). The three-dimensional structure, an 11-stranded β-barrel fold and a central α-helix containing the three amino acid chromophore, makes *in vitro* FPs stable and resistant to changes in temperature and pH (Tsien, 1998). FPs contribute to the vivid coloration of corals (Dove et al., 2001; Oswald et al., 2007). Corals synthesize high concentrations of FPs, which are ubiquitous in shallow reef-building corals (Salih et al., 2000; Leutenegger et al., 2007) as well as in mesophotic reef-building corals (Roth et al., in review).
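The Stokes shift mentioned above is the difference between the emission and absorption peaks, and it corresponds to a small loss of energy per photon. The following Python sketch illustrates the arithmetic; the function names and the example peak wavelengths are hypothetical and are not values from the cited studies.

```python
# Physical constants (exact SI values).
PLANCK_J_S = 6.62607015e-34          # Planck constant, J s
LIGHT_SPEED_M_S = 299_792_458.0      # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19      # joules per electronvolt


def stokes_shift_nm(absorption_peak_nm: float, emission_peak_nm: float) -> float:
    """Return the emission peak minus the absorption peak, in nanometres."""
    return emission_peak_nm - absorption_peak_nm


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy of one photon at the given wavelength, in electronvolts."""
    joules = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    return joules / JOULES_PER_EV


# Hypothetical green FP; the peak positions are chosen only for illustration.
absorption_nm, emission_nm = 490.0, 510.0
shift = stokes_shift_nm(absorption_nm, emission_nm)
energy_lost = photon_energy_ev(absorption_nm) - photon_energy_ev(emission_nm)
print(f"Stokes shift: {shift:.0f} nm; energy lost per photon: {energy_lost:.3f} eV")
```

Because photon energy is inversely proportional to wavelength, the same shift in nanometres costs more energy at the blue end of the spectrum than at the red end.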

Light regulates FP expression in corals. Corals increase and decrease GFP concentrations within 15 days in response to increased and decreased light, respectively (Roth et al., 2010). Green light and to an even greater extent blue light increases gene expression of CFP, GFP, RFP, and CP (D'Angelo et al., 2008). However, a field study did not show a significant correlation between depth and GFP concentration in *M. cavernosa* and *M. faveolata* (Mazel et al., 2003). Coral larvae and adults of the same species, which are found in different light environments, can express distinct FPs (Roth et al., 2013). Additionally in the mesophotic zone (>60 m), the type of FP is correlated with depth to match the spectral quality of light both within species as well as among closely related species (Roth et al., in review).

The function of FPs remains ambiguous and controversial despite their prevalence on coral reefs and within corals, where they make up a significant portion of the total soluble protein (Salih et al., 2000; Leutenegger et al., 2007; Roth et al., in review). The high diversity of both corals and FPs may create challenges to understanding the functions because different FPs could have unique roles in different species. The predominant hypotheses on the functions of FPs include photoprotection (either directly by absorbing harmful light energy or indirectly as an antioxidant) and photosynthesis enhancement (Kawaguti, 1944, 1969; Salih et al., 2000; Bou-Abdallah et al., 2006; Palmer et al., 2009a). Some corals express multiple types of FPs and the emission spectra of some FPs overlap with the absorption spectra of other FPs, providing the possibility for higher energy to be reduced to lower energy via fluorescence resonance energy transfer between FPs within corals (**Table 2**; Salih et al., 2000). Despite the tight relationship between light and FPs (Vermeij et al., 2002; D'Angelo et al., 2008; Roth et al., 2010), evidence against a photoprotective hypothesis includes a lack of correlation between depth and GFP as well as the negligible impact of GFP absorption, emission, and reflection on sunlight reaching *Symbiodinium* (Mazel et al., 2003). However, recent evidence suggests that CPs can reduce chlorophyll excitation and thus may serve a direct photoprotective role (Smith et al., 2013). Moreover, FPs decrease susceptibility to coral bleaching during heat stress, providing more evidence for a photoprotective role (Salih et al., 2000). CP concentration is strongly correlated with photosynthetic capacity at the onset of bleaching (Dove et al., 2008), which may suggest that FPs play an important role in mitigating thermal stress for the symbiont.
FPs have also been shown to have antioxidant activity, which could provide an indirect photoprotective role (Bou-Abdallah et al., 2006; Palmer et al., 2009a). This activity may explain why under temperature stress GFP is rapidly degraded or used up (Roth and Deheyn, 2013). In contrast, there is much less supporting evidence for the photosynthesis enhancement hypothesis. The emission of FPs and the absorption of photosynthetic pigments are not aligned (**Table 2**); there is inefficient energy transfer between host and *Symbiodinium* pigments (Gilmore et al., 2003) and GFP emission has negligible impact on light reaching *Symbiodinium* (Mazel et al., 2003). Additionally, there are no differences in abundance, photophysiology, or genotype of *Symbiodinium* in mesophotic corals with and without coral fluorescence (Roth et al., in review). Nevertheless, the high abundance of fluorescence in energetically limited corals of the mesophotic zone suggests that FPs play an integral physiological role (Roth et al., in review).

The visual nature of FPs and the strong correlation with growth enables coral fluorescence to be utilized as an indicator of coral health (Roth et al., 2010; D'Angelo et al., 2012; Roth and Deheyn, 2013). During temperature stress, there is a rapid decline in GFP prior to coral bleaching providing an early signal of declining coral condition (Roth and Deheyn, 2013). While the function of FPs is uncertain, it is clear they are involved in the photophysiological response of the coral–algal symbiosis.

#### **TISSUE THICKNESS**

Tissue thickness directly affects the amount of light reaching *Symbiodinium.* Photosynthetically active radiation (PAR, 400–700 nm) decreases within the coral tissue while near-infrared radiation (NIR, 700–800 nm) is consistent throughout the coral tissue (Wangpraseurt et al., 2012). In Caribbean corals, tissue thickness is highest in the spring and lowest in the summer-fall when there are lower energetic reserves, which also correlates with changes in *Symbiodinium* density (Fitt et al., 2000). It is hypothesized that an increase in translocated photosynthetic products associated with proliferating *Symbiodinium* density must precede the enlargement in tissue biomass (Fitt et al., 2000). Small changes in the tissue thickness will affect the amount of light penetrating the coral as well as the amount of multiple scattering.

#### **POLYP BEHAVIOR**

Despite living as sessile organisms, corals have developed a unique set of behaviors to regulate light exposure. Coral polyp size varies greatly, from less than 1 cm (**Figure 1C**) to greater than 30 cm in length in solitary corals (e.g., *Fungia*). Polyp size affects the surface area to volume ratio and in most corals is inversely related to photosynthesis and respiration (Porter, 1976). Polyp behavior, that is, extension and contraction, can dramatically affect the light environment within coral cells. Corals can retract their polyps in minutes in response to high light (Levy et al., 2003) and as part of the diurnal cycle (Kawaguti, 1954). For heterotrophic feeding, corals extend their polyps to capture prey, but this primarily occurs at night. Intertidal corals can become exposed to high light and air during extreme low tides and have developed unique adaptations including the reversible retraction of coral tissue deep into the skeleton so that the tissue is no longer visible (Brown et al., 1994). During the extreme tissue retraction, the white bare coral skeleton increases the albedo and reduces the sunlight absorbed. Furthermore, the pigments in the tissue are condensed and the amount of light is decreased within coral cells. The tips of the tentacles are often distinctly pigmented (**Figure 1C**) and it has been suggested that FPs can act as a sunscreen plug when the polyp is retracted (Salih et al., 2000). *Symbiodinium* are located in coral cells both in the polyp and the coenosarc (tissue that connects polyps); however, only the polyp can be extended or retracted. An extended polyp increases the surface area to volume ratio, allowing for faster diffusion of carbon dioxide and oxygen in and out of coral cells. Additionally, a greater amount of lateral light is transmitted when the polyp is extended (Wangpraseurt et al., 2014). The surface irradiance over polyps is higher than over the coenosarc (Wangpraseurt et al., 2012).
The differences in light and/or photosynthetic substrates may be responsible for the spatial heterogeneity observed in photosynthetic responses (Ralph et al., 2002).

#### **SKELETON MORPHOLOGY**

Scleractinian corals have tremendous phenotypic plasticity in morphology. Light, in addition to water flow, is one of the primary influences on morphology (Todd, 2008). Gross morphology determines the exposure of the coral–algal symbiosis to different light regimes, while microscale morphology and skeleton composition can influence light scattering. Even within a species, corals become flatter under low light to enhance light capture and more branched under high light to augment self-shading (Muko et al., 2000; Padilla-Gamiño et al., 2012); changes in gross morphology can occur in less than a year (Muko et al., 2000). Depending on morphology, the top, sides, and bottom of a coral can have dramatically different light environments (Warner and Berry-Lowe, 2006; Kaniewska et al., 2011). In chronic low light environments such as caves, overhangs and at depth, corals have a plate-like flat morphology and thinner skeleton (Kühlmann, 1983; Anthony and Hoegh-Guldberg, 2003b). Because multiple scattering by the coral skeleton amplifies light within the coral cells (Enríquez et al., 2005), the microscale architecture dictates the light field the symbionts are exposed to. Light within coral cells can differ dramatically depending on the precise location of the tissue; for example, there is higher irradiance in cells on top of ridges than in cells between ridges (Kühl et al., 1995). Corals have diverse skeletal fractality on nano- and microscales that causes an eightfold variation in the light scattering properties (Marcelino et al., 2013). Lastly, corals can vary how much of their tissue penetrates the skeleton. Perforate corals, which have porous skeletal matrices with intercalating tissue, can have tissues five times thicker than those of imperforate corals, whose tissue does not penetrate the skeleton (Yost et al., 2013). Light and coral morphology are intricately interconnected and morphology creates conspicuous light microenvironments.

From small molecules and proteins to behavior and morphology, corals employ many strategies to modify the light environment within the coral cell. While the various strategies to alter light are known, many of the molecular, cellular, and biochemical processes to regulate these methods are understudied. In contrast, the cellular and biochemical photophysiology of *Symbiodinium* is much better understood.

#### **PHOTOBIOLOGY OF** *Symbiodinium*

Corals are highly refractive and provide an environment where *Symbiodinium* have high gross rates of photosynthesis and quantum efficiencies close to their theoretical limits (Rodríguez-Román et al., 2006; Brodersen et al., 2014). Because light is the driving force of photosynthesis, photophysiology of photosynthetic organisms has been a very active area of research. *Symbiodinium* optimizes the amount of light absorbed and utilized by photochemistry, while shunting light when the photosynthetic capacity has been reached. On sunny days, ∼80% of light is dissipated by *Symbiodinium* in shallow corals and not used in photochemistry (Gorbunov et al., 2001). Experimental measurements confirm that corals dissipate 96% of absorbed light energy under typical irradiances of coral reefs (640 μmol photons m<sup>−2</sup> s<sup>−1</sup>; Brodersen et al., 2014). Sunlight flashes dramatically increase light in milliseconds, but have little effect on overall photosynthesis of *Symbiodinium*, suggesting that they have effective mechanisms of dissipating excess light on rapid time scales (Veal et al., 2010). Additionally, *Symbiodinium* efficiently repairs the daily damage that occurs from photosynthesis (Gorbunov et al., 2001; Hoogenboom et al., 2006). Akin to other photosynthetic organisms, corals and their symbionts adapt to high and low light environments and have specific photosynthetic characteristics.
The coral–algal symbiosis exhibits classic photosynthetic low and high light adaptation patterns: the coral–algal symbiosis under low light maximizes the amount of light processed through increased light-absorbing pigments and photosynthetic efficiencies to obtain high rates of photosynthesis under lower irradiances; in contrast, the coral–algal symbiosis under high light minimizes the amount of light processed through reduced pigments and photosynthetic efficiencies but higher maximum rates of photosynthesis under high irradiances (Falkowski and Dubinsky, 1981; Anthony and Hoegh-Guldberg, 2003a,b). Light is very dynamic and *Symbiodinium*, like all photosynthetic organisms, exploits a variety of photophysiological processes over a range of timescales to efficiently absorb and utilize light and prevent photoinhibition (**Table 1**).

#### **PHYLOTYPE**

Because of the lack of morphological characteristics, it was originally believed that there was only one pandemic species of *Symbiodinium*, *S. microadriaticum* (Freudenthal, 1962). Upon greater consideration of physiology, biochemistry, ultra-structure, and other aspects, and more recently with molecular biology and phylogenetics, it has become apparent that *Symbiodinium* actually represents several divergent lineages known as clades A through I (Stat et al., 2012). In addition to the symbiosis with corals, *Symbiodinium* are commonly found in symbiosis with other cnidarians (e.g., sea anemones) as well as Platyhelminthes, Mollusca, Porifera, and Foraminifera, and can even be free-living (Stat et al., 2006).

Individual corals can host multiple phylotypes of *Symbiodinium* at the same time and through time. Recent techniques have shown that corals host 6–8 times greater diversity of *Symbiodinium* than previously assumed (Apprill and Gates, 2007) and can identify low abundance *Symbiodinium* (Mieog et al., 2007). The same species of coral found at different depths can harbor the same or different phylotypes of *Symbiodinium* (Iglesias-Prieto et al., 2004; Warner et al., 2006). Surprisingly, only one out of eight species of corals investigated showed a correlation between distinct coral microhabitat patterns and *Symbiodinium* phylotypes (van Oppen et al., 2001). Throughout the year, *Symbiodinium* phylotype varies both between clades and the proportion of different subclades (Suwa et al., 2008; Ulstrup et al., 2008). The diverse and variable assemblage of *Symbiodinium* within corals sets the stage for the inherent physiological capacity for photosynthesis and its responses to environmental changes.

#### **ABUNDANCE**

The abundance of *Symbiodinium* is important because it may directly affect the amount of oxygen produced within coral cells and therefore the potential for ROS production. The irradiance regulates the density of *Symbiodinium* in corals, but *Symbiodinium* abundance also alters the light field within corals. Scleractinian corals typically host between 1 and 2 *Symbiodinium* cells per endoderm cell (Muscatine et al., 1998). Symbiont densities generally range from 1 to 4 × 10<sup>6</sup> cells cm<sup>−2</sup>, but can be as high as 8 × 10<sup>6</sup> cells cm<sup>−2</sup> (Fagoonee et al., 1999; Fitt et al., 2000; Apprill et al., 2007). It is thought that the coral controls *Symbiodinium* density and its pigments through nitrogen limitation (Falkowski et al., 1993), although the mechanisms are not well understood (Davy et al., 2012). For a thorough discussion of *Symbiodinium* acquisition, regulation, expulsion, and degradation see the recent review by Davy et al. (2012). In laboratory experiments, *Symbiodinium* density can acclimate to new light intensities within 15 days (Roth et al., 2010). On coral reefs, *Symbiodinium* density changes inversely with seasonal light levels, decreasing in the summer and increasing in the winter and fall (Fagoonee et al., 1999; Fitt et al., 2000; Ulstrup et al., 2008), likely to optimize photosynthesis. During temperature stress, higher densities of *Symbiodinium* have been implicated in increasing the susceptibility of corals to bleaching because of the higher ROS production relative to corals' antioxidant capacity (Cunning and Baker, 2013); however, high densities of *Symbiodinium* also result in significant self-shading, lower rates of oxygen evolution, and ultimately reduced ROS production. Because *Symbiodinium* absorbs light, irradiance declines the fastest where the layer of *Symbiodinium* is located within the coral tissue (Wangpraseurt et al., 2012).
Thus, changes in *Symbiodinium* density, and in particular during bleaching, exacerbate the environmental stress on the remaining symbionts. Further research on populations of *Symbiodinium* including abundance, phylotype, and their physiological differences will elucidate the outcomes of the coral–algal symbiosis during environmental stress.

#### **ANTIOXIDANTS**

Antioxidants neutralize ROS and play an important photoprotective role. Like corals and other photosynthetic organisms, *Symbiodinium* synthesize a variety of enzymatic antioxidants such as SOD, CAT, and ascorbate peroxidase (ASPX; Lesser and Shick, 1989; Shick et al., 1995). *Symbiodinium* in corals collected from high irradiance habitats have higher SOD, CAT, and ASPX activities than those collected from low irradiance habitats at the same depth (Lesser and Shick, 1989). Additionally, *Symbiodinium* in corals collected over a depth gradient show a decline in the activities of SOD, CAT, and ASPX with increasing depth, which may be related to the decrease in potential for oxidative stress (Shick et al., 1995). Similar to their hosts, activities of SOD and CAT in *Symbiodinium* increase with blue light and show a positive correlation with the diurnal cycle (Levy et al., 2006a,b). In culture, different phylotypes show distinct constitutive activities of SOD produced despite being grown under the same conditions (Lesser, 2011). Phylotypes with higher capacity for photoacclimation and thermal tolerance also have higher concentrations of the nonenzymatic antioxidant glutathione and xanthophylls (see Section "Carotenoids"; Krämer et al., 2011). MAAs have antioxidant activity in addition to absorbing UVR (**Table 2**). *Symbiodinium* synthesizes at least four MAAs in culture, but most MAAs are primarily passed to the host to be used as a first line of defense absorbing UVR before it can reach *Symbiodinium* (**Figure 2C**; Shick and Dunlap, 2002). For more details on antioxidants see the Section "Photobiology of Corals."

#### **PHOTOSYNTHETIC PIGMENTS**

Photosynthetic dinoflagellates including *Symbiodinium* have plastids derived from red algae. The primary photosynthetic pigments in *Symbiodinium* are chlorophyll *a*, chlorophyll *c*2, and peridinin (**Table 2**). While the core photosynthetic machinery is highly conserved among photosynthetic eukaryotes, the light-capturing pigments are diverse to match the particular light environment of the organism. *Symbiodinium* has two types of light-harvesting complexes: (1) the thylakoid membrane-bound chlorophyll *a*–chlorophyll *c*2–peridinin-protein-complex (acpPC) and (2) the water-soluble peridinin–chlorophyll *a* protein (PCP; Iglesias-Prieto et al., 1991, 1993). The chlorophylls primarily absorb high-energy blue light (∼430–460 nm), but chlorophyll *a* also absorbs red light (∼680 nm; **Table 2**; Bricaud et al., 2004). Peridinin expands the range of photosynthetically usable light of *Symbiodinium* because it has maximum absorption of blue-green light (∼480–500 nm) and a broad absorption spectrum (∼450–550 nm; **Table 2**; Bricaud et al., 2004; Johnsen et al., 2011).

The majority of photosynthetic pigments are involved in absorbing and transferring light to the reaction centers of PSI and PSII. Photoacclimation processes can also involve changing the stoichiometry between antenna proteins and reaction centers and between photosystems. Within 15 days, *Symbiodinium* in corals photoacclimate by modifying the amount per cell of chlorophyll *a*, chlorophyll *c*2, and peridinin yet maintaining the same ratios of pigments (Roth et al., 2010). In culture, *Symbiodinium* also change the concentration of chlorophyll *a*, chlorophyll *c*2, and peridinin, but additionally change the ratios of photosynthetic pigments under different light conditions (Iglesias-Prieto and Trench, 1994; Robison and Warner, 2006; Hennige et al., 2009). This discrepancy between *Symbiodinium* in culture and symbiosis may suggest that the host modulates the light environment of *Symbiodinium* in symbiosis. Moreover, *Symbiodinium* can photoacclimate by changing the size of the photosynthetic unit by adjusting the abundances of PSI, PSII, acpPC, and PCP and the antenna size for each photosystem (Titlyanov et al., 1980; Falkowski and Dubinsky, 1981; Iglesias-Prieto and Trench, 1997; Hennige et al., 2009). A study of eight phylotypes of cultured *Symbiodinium* under two irradiance growth conditions suggests that photoacclimation generally occurs by modifying the reaction center content rather than the effective antenna absorption (Hennige et al., 2009). In shallow corals on reefs, chlorophyll *a* per cell decreases in the summer and increases in the winter (Fitt et al., 2000) and the ratio of chlorophyll *a* to chlorophyll *c*2 can vary on a seasonal basis (Warner et al., 2002). Coral bleaching is defined as a decrease in *Symbiodinium* density, a reduction in photosynthetic pigments, or both (Coles and Jokiel, 1978; Warner et al., 1996; Hoegh-Guldberg, 1999; Roth et al., 2012), which alters the light scattering and absorption characteristics.
Furthermore, there is a complex relationship between the increase in pigments and the decrease in optical absorption cross-section (the relationship between the rate of excitation delivered and the photochemical reaction) due to self-shading within the cell, called the "package effect" (Kirk, 2010). The amount of packaging can vary between different phylotypes as well as under low and high light conditions (Hennige et al., 2009). Detailed studies on the changes in chlorophyll content of *Symbiodinium* under various light regimes for a variety of phylotypes will elucidate the packaging dynamics. In the coral–algal symbiosis, *Symbiodinium* pigment packaging is compounded by packaging of *Symbiodinium* within coral cells. The packaging of pigments and cells adds complexity to the relationship between light absorption and pigments.

#### **CAROTENOIDS**

Carotenoids are accessory pigments (tetraterpenoids) synthesized by photosynthetic organisms. There are two types of carotenoids: carotenes (pure hydrocarbons) and xanthophylls (hydrocarbons with oxygen). *Symbiodinium* synthesizes β-carotene and the xanthophylls peridinin, diadinoxanthin, and diatoxanthin (**Table 2**). Carotenoids have a variety of roles including as accessory light-harvesting pigments, structural components of the light-harvesting complexes, antioxidants, and sinks for excess energy. Within minutes of high light, the xanthophyll cycle converts diadinoxanthin to diatoxanthin through de-epoxidation and the cycle is reversed in limiting light (Brown et al., 1999). Increases in xanthophyll de-epoxidation state, the ratio of diatoxanthin to the total xanthophyll cycle pool, are associated with photoprotection of the photosynthetic apparatus (Brown et al., 1999). *Symbiodinium* can increase the capacity for photoprotection by increasing the amount of β-carotene and xanthophylls relative to chlorophyll *a* (and vice versa); the increase occurs within 15 days during photoacclimation and within 5 days under temperature stress (Roth et al., 2010, 2012). Likewise, *Symbiodinium* in culture adjust the relative abundances of photoprotective pigments under different light environments (Hennige et al., 2009). Carotenoids provide important photoprotection for photosynthetic organisms over multiple timescales.
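The de-epoxidation state defined above is a simple ratio of diatoxanthin to the diadinoxanthin plus diatoxanthin pool. A minimal Python sketch follows; the function name and the pigment amounts are hypothetical and chosen only for illustration, not taken from the cited studies.

```python
def de_epoxidation_state(diadinoxanthin: float, diatoxanthin: float) -> float:
    """Return diatoxanthin / (diadinoxanthin + diatoxanthin).

    Both inputs must share the same units (e.g. pg per cell).
    The result is unitless and lies between 0 and 1.
    """
    if diadinoxanthin < 0 or diatoxanthin < 0:
        raise ValueError("pigment amounts must be non-negative")
    pool = diadinoxanthin + diatoxanthin
    if pool == 0:
        raise ValueError("xanthophyll cycle pool is empty")
    return diatoxanthin / pool


# Hypothetical pigment amounts (arbitrary units) before and after high light.
low_light = de_epoxidation_state(diadinoxanthin=0.9, diatoxanthin=0.1)
high_light = de_epoxidation_state(diadinoxanthin=0.6, diatoxanthin=0.4)
print(f"low light: {low_light:.2f}, high light: {high_light:.2f}")
```

In this example the pool size is unchanged and only the split between the two pigments shifts, which is the behavior expected from the cycle itself; a change in pool size would instead reflect pigment synthesis or degradation.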

#### **PHOTOSYNTHESIS**

Given the central role of photosynthesis in the coral–algal symbiosis, it is important to characterize a variety of photosynthetic related parameters. Quantifying photosynthesis under different light fields, generally referred to as photosynthesis to irradiance (P/E) curves, describes the dynamics of photosynthesis. From these data, the light compensation point (where photosynthesis and respiration are equal), photosynthetic efficiency (the slope under light-limiting conditions), saturating irradiance, and the photosynthetic maximum can be determined (see diagram in Osinga et al., 2012). Photoacclimation of eight phylotypes of cultured *Symbiodinium* under two growth irradiances provides evidence for highly variable bio-physical and bio-optical measurements (Hennige et al., 2009). *Symbiodinium* in culture photoacclimate by changing their maximum rate of net photosynthesis (*P*max), respiration rate and saturating irradiance (Iglesias-Prieto and Trench, 1994). In contrast, *Symbiodinium* in corals photoacclimate to new growth conditions primarily by changing saturating irradiances rather than changes in *P*max (Anthony and Hoegh-Guldberg, 2003a). There are considerable differences in high and low light adapted corals including in *P*max, photosynthetic efficiency, saturating irradiance, respiration, and thylakoid packing (Falkowski and Dubinsky, 1981; Dubinsky et al., 1984; Anthony and Hoegh-Guldberg, 2003b). Changes in photosynthetic function are one of the first indicators of temperature stress of the coral–algal symbiosis (Iglesias-Prieto et al., 1992; Warner et al., 1996, 1999; Lesser, 1997; Lesser and Farrell, 2004; Roth et al., 2012).
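The P/E parameters listed above can be made concrete with a simple model. The sketch below assumes a hyperbolic tangent P/E model, a common choice in the photosynthesis literature but not one named in this review; the function names and parameter values are hypothetical, and `p_max` here is the gross maximum, so the net maximum is `p_max - respiration`.

```python
import math


def net_photosynthesis(irradiance, p_max, alpha, respiration):
    """Net rate from a tanh P/E model: gross photosynthesis minus respiration.

    alpha is the initial slope (photosynthetic efficiency under limiting light).
    """
    return p_max * math.tanh(alpha * irradiance / p_max) - respiration


def saturating_irradiance(p_max, alpha):
    """Irradiance at which the initial slope would reach the gross maximum."""
    return p_max / alpha


def compensation_irradiance(p_max, alpha, respiration):
    """Irradiance at which gross photosynthesis equals respiration."""
    if not 0 <= respiration < p_max:
        raise ValueError("respiration must be in [0, p_max)")
    return (p_max / alpha) * math.atanh(respiration / p_max)


# Hypothetical parameters (arbitrary but mutually consistent units).
P_MAX, ALPHA, RESPIRATION = 4.0, 0.02, 1.0
e_k = saturating_irradiance(P_MAX, ALPHA)
e_c = compensation_irradiance(P_MAX, ALPHA, RESPIRATION)
print(f"saturating irradiance: {e_k:.0f}, compensation irradiance: {e_c:.1f}")
for irradiance in (0, 50, 200, 1000):
    rate = net_photosynthesis(irradiance, P_MAX, ALPHA, RESPIRATION)
    print(f"E = {irradiance:4d}  net P = {rate:.2f}")
```

Under this model, acclimation that changes the saturating irradiance while the maximum rate stays fixed, as described above for *Symbiodinium* in corals, corresponds to a change in `alpha` with `p_max` held constant.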

Two of the most informative measurements in photobiology of the coral–algal symbiosis are the maximum quantum yield of photosynthesis (Φ) and its inverse, the minimum quantum requirement (1/Φ). These measurements are calculated as the fraction of photosynthetically usable light absorbed by photosynthetic pigments used to drive photosynthetic activity (e.g., O2 evolved or CO2 assimilated). The theoretical limit of the minimum quantum requirement for photosynthetic organisms is eight photons absorbed per molecule of oxygen evolved (Wyman et al., 1987). Measuring the light absorbed by *Symbiodinium* in corals is challenging and at one point was regarded as impossible (Falkowski et al., 1990). Early measurements of Φ underestimated the absorption cross-section of chlorophyll because it was measured from freshly isolated *Symbiodinium* (Dubinsky et al., 1984; Wyman et al., 1987; Lesser et al., 2000) rather than in intact corals where the absorption is two to fivefold higher because of light scattering by the skeleton (Enríquez et al., 2005). Recent studies suggest that corals are efficient energy collectors and that the energy can be utilized close to the theoretical maximum (Rodríguez-Román et al., 2006; Brodersen et al., 2014). Φ varies within the coral (depth within the tissue), in corals collected from distinct light environments (high light vs. shade adapted) and in corals with different degrees of bleaching (Dubinsky et al., 1984; Rodríguez-Román et al., 2006; Brodersen et al., 2014). Additionally, Φ is affected by the irradiance during measurement (Brodersen et al., 2014). Coral species and environmental history influence skeletal morphology, tissue thickness, and ultimately light scattering, which add to the variability in coral–algal photobiology. While this direct assessment of the efficiency of light utilization is an important measurement, it remains logistically cumbersome.
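The relationship between quantum yield and quantum requirement is simple arithmetic, sketched below in Python. The function names and the example measurement are hypothetical; only the theoretical limit of eight photons per O2 comes from the text above.

```python
# Theoretical minimum quantum requirement stated in the text: photons per O2.
THEORETICAL_MIN_REQUIREMENT = 8.0


def quantum_yield(o2_evolved_mol: float, photons_absorbed_mol: float) -> float:
    """Return mol O2 evolved per mol photons absorbed."""
    if photons_absorbed_mol <= 0:
        raise ValueError("absorbed photons must be positive")
    return o2_evolved_mol / photons_absorbed_mol


def quantum_requirement(o2_evolved_mol: float, photons_absorbed_mol: float) -> float:
    """Return mol photons absorbed per mol O2 evolved (inverse of the yield)."""
    if o2_evolved_mol <= 0:
        raise ValueError("O2 evolved must be positive")
    return photons_absorbed_mol / o2_evolved_mol


def fraction_of_theoretical_max(yield_value: float) -> float:
    """Return the yield as a fraction of the theoretical maximum (1/8)."""
    return yield_value * THEORETICAL_MIN_REQUIREMENT


# Hypothetical measurement: 1 mol O2 evolved for every 10 mol photons absorbed.
phi = quantum_yield(1.0, 10.0)
print(f"yield: {phi:.3f} O2 per photon")
print(f"requirement: {quantum_requirement(1.0, 10.0):.1f} photons per O2")
print(f"fraction of theoretical maximum: {fraction_of_theoretical_max(phi):.2f}")
```

Because the absorbed photon flux is the denominator of the yield, any error in estimating absorption propagates directly into the calculated yield, which is why the absorption measurement in intact corals matters so much.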

Chlorophyll fluorescence can be used as a proxy for many photosynthetic measurements and consequently the results can be interpreted as an indicator of coral health. Chlorophyll *a* fluorescence provides an understanding of the photochemical activity of PSII, photodamage, and photoprotection over temporal and spatial scales in a noninvasive manner (Warner et al., 2010). This review will briefly discuss some of the most widely measured fluorescence parameters of photosynthesis in the coral–algal symbiosis, but there are many types of fluorescence measurements that involve a variety of fluorometers that operate on different basic principles (reviewed in Cosgrove and Borowitzka, 2010; Warner et al., 2010). The maximum photochemical efficiency (quantum yield) of PSII (*F*v/*F*m) is measured in dark-acclimated corals and represents the maximum capacity of PSII. The effective or steady state photochemical efficiency of PSII (Δ*F*/*F*m′, Δ*F*′/*F*m′ or ΦPSII) is measured in the light-adapted state. Corals show a daily midday reversible decrease in Δ*F*/*F*m′ and *F*v/*F*m associated with shunting energy away from photochemical reactions and into other pathways to prevent damage (**Figure 3**; Brown et al., 1999; Gorbunov et al., 2001). The functional absorption cross-section for PSII shows a diurnal pattern with a decline associated with peak irradiances during midday, which correlates with the decrease in Δ*F*/*F*m′, the increase in NPQ (see Section "Non-photochemical Quenching") and the highest rate of net photosynthesis (Levy et al., 2006a). To maintain high rates of productivity under normal conditions, a percentage of PSII reaction centers (D1 protein) will become damaged during the day when the rate of damage exceeds the rate of repair, but PSII will be able to repair itself when the rate of repair exceeds the rate of damage in low light (nighttime; Gorbunov et al., 2001).
*Symbiodinium* in corals photoacclimate by changing photosynthetic efficiency to new conditions within days in laboratory experiments (Roth et al., 2010), and over seasons on reefs (Warner et al., 2002; Ulstrup et al., 2008). Additionally, distinct microhabitats of the coral such as tops versus sides can show different photosynthetic efficiencies (Warner and Berry-Lowe, 2006). When *F*<sub>v</sub>/*F*<sub>m</sub> declines over time, it implies that the rate of damage of PSII exceeds the rate of repair and damage has accumulated, which can lead to coral bleaching (Roth et al., 2012). The excitation pressure over PSII can be calculated as *Q*<sub>m</sub> = 1 – [(Δ*F*/*F*<sub>m</sub>′ at peak sunlight)/(*F*<sub>v</sub>/*F*<sub>m</sub> at dawn)] (Iglesias-Prieto et al., 2004). A low *Q*<sub>m</sub> would signify that a high proportion of PSII reaction centers are open and that there is possible light limitation, whereas a high *Q*<sub>m</sub> would signify that most PSII reaction centers are closed and there could be photoinhibition. A recent study showed that during a heat stress experiment, corals began bleaching when *Q*<sub>m</sub> reached ∼0.4 and continued heat stress intensified the bleaching until *Q*<sub>m</sub> reached ∼0.8 (when measurements were no longer possible due to the low level of symbionts) while control corals maintained *Q*<sub>m</sub> < 0.2 (Roth et al., 2012). Measuring chlorophyll fluorescence under various light regimes can also provide estimates of the relative electron transport rate (rETR) similar to P/E curves, but there are many problems and pitfalls with this approach (Warner et al., 2010; Osinga et al., 2012). Despite its limitations, measuring chlorophyll fluorescence is an important noninvasive methodology to assess the physiological state of *Symbiodinium* and thus the coral. For more information on the methodologies and the instrumentation mentioned in this section see recent reviews (Warner et al., 2010; Osinga et al., 2012).
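
As an illustration of the excitation pressure calculation, the following Python sketch computes the maximum and effective PSII yields from raw fluorescence readings and then the excitation pressure as defined above (Iglesias-Prieto et al., 2004). The expressions for the two yields are the standard definitions in the fluorescence literature rather than formulas stated in this review, and the function names and fluorescence values are hypothetical.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Maximum PSII yield from dark-acclimated minimum (F0) and maximum (Fm) fluorescence.

    Assumes the standard definition Fv/Fm = (Fm - F0) / Fm.
    """
    return (fm - f0) / fm


def effective_yield(f: float, fm_prime: float) -> float:
    """Effective PSII yield in the light-adapted state.

    Assumes the standard definition dF/Fm' = (Fm' - F) / Fm'.
    """
    return (fm_prime - f) / fm_prime


def excitation_pressure(yield_peak_sunlight: float, fv_fm_dawn: float) -> float:
    """Excitation pressure over PSII: 1 - (effective yield at peak sunlight / Fv/Fm at dawn)."""
    return 1.0 - yield_peak_sunlight / fv_fm_dawn


# Hypothetical fluorometer readings (arbitrary units), for illustration only.
dawn = fv_fm(f0=300.0, fm=1000.0)
midday = effective_yield(f=400.0, fm_prime=700.0)
qm = excitation_pressure(midday, dawn)
print(f"Fv/Fm at dawn = {dawn:.2f}, effective yield at midday = {midday:.2f}, Qm = {qm:.2f}")
```

Because the excitation pressure is normalized to the dawn value of the same colony, it expresses the midday depression of yield relative to that colony's own maximum capacity rather than as an absolute fluorescence level.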

#### **NON-PHOTOCHEMICAL QUENCHING**

Excess energy harmlessly dissipated as heat, also called NPQ, is an important photoprotective mechanism. In **Figure 3**, the secondary outflow of the funnel is representative of NPQ pathways. NPQ includes all processes that decrease chlorophyll fluorescence yield apart from photochemistry and consists of energy-dependent quenching (qE), state transition quenching (qT), and photoinhibitory quenching (qI; Müller et al., 2001). NPQ processes are characterized according to their relaxation kinetics (Müller et al., 2001). In *Symbiodinium* in corals, >80% of excitation energy can be dissipated through NPQ (Gorbunov et al., 2001; Brodersen et al., 2014). Most of the energy is likely to be dissipated through qE rather than qT or qI (Niyogi, 1999).
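
NPQ is commonly quantified from the same fluorescence readings used for photochemical efficiency. The Python sketch below uses the widely used Stern–Volmer form, NPQ = (Fm − Fm′)/Fm′; this expression is not given in this review and is included here as an assumption about how the parameter would typically be computed. The function name and values are hypothetical.

```python
def npq_stern_volmer(fm_dark: float, fm_prime_light: float) -> float:
    """Non-photochemical quenching in Stern-Volmer form.

    fm_dark: maximum fluorescence after dark acclimation (Fm).
    fm_prime_light: maximum fluorescence in the light-adapted state (Fm').
    Assumes NPQ = (Fm - Fm') / Fm', a common convention not stated in the text.
    """
    if fm_prime_light <= 0:
        raise ValueError("Fm' must be positive")
    return (fm_dark - fm_prime_light) / fm_prime_light


# Hypothetical readings (arbitrary units): strong quenching at midday.
npq_midday = npq_stern_volmer(fm_dark=1000.0, fm_prime_light=400.0)
# No quenching when Fm' equals Fm.
npq_none = npq_stern_volmer(fm_dark=1000.0, fm_prime_light=1000.0)
```

This index does not by itself separate qE, qT, and qI; as noted above, those components are distinguished by their relaxation kinetics, which requires following the recovery of Fm′ over time in the dark.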

#### *Energy-dependent quenching*

Turning on and off within minutes, qE is essential for coping with rapid changes in incident sunlight. In most eukaryotic algae, qE depends on a buildup of a transient pH across the thylakoid membrane, a particular light-harvesting complex protein called LHCSR, and specific carotenoids of the xanthophyll cycle (Niyogi and Truong, 2013). However, LHCSR is not found in the Expressed Sequence Tag (EST) library of *Symbiodinium* (Boldt et al., 2012), which may suggest another mechanism for how qE is achieved in *Symbiodinium*.

#### *State transition quenching*

State transition quenching is the quenching that results from uncoupling the light-harvesting complexes from PSII to decrease the amount of light absorbed and transferred to the PSII reaction center in green algae and plants (Müller et al., 2001). In *Symbiodinium* under excess light, both light-harvesting complexes acpPC and PCP may dissociate from PSII to minimize PSII overexcitation (Hill et al., 2012). It is thought that the redistribution of acpPC from PSII to PSI could prevent photooxidative damage (and ultimately bleaching) in more tolerant phylotypes of *Symbiodinium* (Reynolds et al., 2008; Hill et al., 2012). State transitions are triggered by reversible phosphorylation of light-harvesting proteins and can occur in minutes and relax in tens of minutes (Müller et al., 2001; Eberhard et al., 2008). However, some studies on freshly isolated and cultured *Symbiodinium* have not observed the enhanced energy transfer to PSI (Warner et al., 2010). The relative role and specific mechanisms of qT in *Symbiodinium* as a photoprotection mechanism remain unknown.

#### *Photoinhibitory quenching*

Photoinhibitory quenching is the NPQ mechanism with the slowest relaxation kinetics and is poorly understood even in plants and green algae (Müller et al., 2001). During prolonged light stress, slowly reversible quenching occurs that is thought to result from both photoprotection and photodamage. qI relaxation generally occurs within hours in photosynthetic eukaryotes (Müller et al., 2001). More research is needed on the mechanisms of qI in *Symbiodinium* to fully understand the photoprotective pathways.

*Symbiodinium* utilizes a variety of processes on multiple timescales to protect its primary role of absorbing and processing light through photochemistry while avoiding oxidative stress. While much is understood about these mechanisms on a cellular and biochemical level, there is much to learn about how the various components and proteins are synthesized, regulated, assembled, and degraded. A recent study on gene expression in *Symbiodinium* (microarray containing 853 features) showed that 30% of genes show diurnal oscillations (Sorek et al., 2014). While some of these genes are associated with photosynthesis such as the peridinin–chlorophyll *a*-binding protein, many of the genes are uncharacterized (Sorek et al., 2014). Recent advances such as the *Symbiodinium* draft genome (Shoguchi et al., 2013) and transcriptome (Baumgarten et al., 2013) will permit new investigations into gene expression and posttranscriptional regulatory processes and should be paired with biochemical and physiological work to elucidate processes at the molecular, cellular, and biological levels.

## **BEYOND LIGHT, INFLUENTIAL FACTORS IN PHOTOSYNTHETIC SYMBIOSES IN CORALS**

Thus far, this review has focused on the effects of light on photosynthesis and the coral–algal symbiosis. Under specific conditions such as excess light, which is typical of sunny days in the shallow environment of reef-building corals (Gorbunov et al., 2001), there are additional abiotic and biotic factors that influence photosynthesis. Because all reef-building corals rely on energy from their symbionts (Osinga et al., 2011), the factors modifying photosynthesis are central to the health of the coral–algal symbiosis.

#### **ABIOTIC FACTORS**

Abiotic factors that influence photosynthesis in the coral–algal symbiosis include availability of inorganic nutrients (in particular carbon), oxygen concentration, pH, and temperature, which are all modulated by water flow. Because corals are sessile, water flow dictates the rate of gas exchange by diffusion between the coral and the surrounding water by changing the thickness of the diffusive boundary layer. A coral extending its polyp may also affect the boundary layer, but those effects are uncharacterized. Abundance of dissolved inorganic carbon can determine the rates of photosynthesis and calcification in the coral–algal symbiosis (Falkowski et al., 1993; Marubini et al., 2003). Increased water flow has been shown to decrease the amount of oxygen within coral cells, which in turn increased the ratio of carboxylation to oxygenation catalyzed by Rubisco, and resulted in an augmentation of photosynthetic rate (Mass et al., 2010a). High flow and high irradiance result in faster growth rates of corals (Schutter et al., 2011). The combination of feeding corals (providing carbon, nitrogen, and phosphorus) and higher irradiance has an additive effect on coral growth (Osinga et al., 2011). Doubling carbon dioxide concentration, for example in ocean acidification experiments, does not increase photosynthesis or calcification in corals (Anthony et al., 2008). Corals may be able to regulate their internal pH and buffer against moderate changes in external pH and carbonate chemistry (Venn et al., 2013). Additionally, *Symbiodinium* can increase coral intracellular cytosolic pH through photosynthesis (Laurent et al., 2013).

Temperature anomalies can have serious consequences on the coral–algal symbiosis and the effects have been extensively studied as well as covered in recent reviews (Weis, 2008; Lesser, 2011). Temperature affects the activity of various enzymes and reactions involved in photosynthesis and ultimately the repair of critical proteins (Somero, 1995; Huner et al., 1998; Warner et al., 1999; Takahashi et al., 2004; Nobel, 2005). During temperature stress, changes in the fluidity of the thylakoid membrane affect photosynthetic electron transport capacity and dismantle the photosynthetic system resulting in a decomposition of the thylakoid structure (Iglesias-Prieto et al., 1992; Tchernov et al., 2004; Downs et al., 2013); as a result, *Symbiodinium* produces a high abundance of ROS, which is passed to the host (Weis, 2008; Lesser, 2011). Once the threshold of ROS that the coral can neutralize is exceeded, a cascade of events is triggered that results in coral bleaching (Weis, 2008; Lesser, 2011). Catastrophic coral bleaching often occurs during small increases in temperature over prolonged periods of time and is frequently concurrent with calm, clear weather patterns (Baker et al., 2008; Weis, 2008; Lesser, 2011). Because intensity and duration of the temperature anomaly are important in coral bleaching, the National Oceanic and Atmospheric Administration (NOAA) Coral Reef Watch program monitors temperature via satellite to determine the cumulative stress on a particular area of coral reef using a thermal stress index called degree heating weeks (DHW; Strong et al., 2011). At a given location, the DHW represent the accumulation of how long an area has experienced higher than average temperatures, which are called HotSpots. For example, one week of a HotSpot of 1°C is equivalent to one DHW. Significant bleaching occurs around four DHW, and widespread bleaching and mortality occur around eight DHW (Strong et al., 2011).
Because of the importance of light, the NOAA Coral Reef Watch program plans to integrate measurements of light, wind, water transparency, and waves among other parameters into the monitoring program (Strong et al., 2011).
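
The accumulation logic behind DHW can be sketched in a few lines of Python. This is a simplified illustration and not NOAA's operational algorithm: it assumes weekly mean HotSpot values, a 12-week accumulation window, and a minimum HotSpot of 1°C for a week to count, none of which are specified in the text above. The bleaching thresholds of four and eight DHW are those cited from Strong et al. (2011), and the function names and data are hypothetical.

```python
from typing import Sequence


def degree_heating_weeks(
    weekly_hotspots_c: Sequence[float],
    window_weeks: int = 12,      # assumed accumulation window
    min_hotspot_c: float = 1.0,  # assumed minimum HotSpot that accumulates
) -> float:
    """Sum qualifying weekly HotSpots (degC) over the most recent window, in degC-weeks."""
    recent = weekly_hotspots_c[-window_weeks:]
    return sum(h for h in recent if h >= min_hotspot_c)


def bleaching_outlook(dhw: float) -> str:
    """Map a DHW value onto the approximate thresholds cited in the text."""
    if dhw >= 8.0:
        return "widespread bleaching and mortality expected"
    if dhw >= 4.0:
        return "significant bleaching expected"
    return "below the significant bleaching threshold"


# Hypothetical weekly HotSpot series (degC above the expected summer maximum).
hotspots = [0.0, 0.5, 1.0, 1.5, 2.0, 0.8]
dhw = degree_heating_weeks(hotspots)
outlook = bleaching_outlook(dhw)
```

In this example only the weeks with HotSpots of 1.0, 1.5, and 2.0°C accumulate, so a short period of strong warming and a longer period of mild warming can produce the same index value, which reflects the point above that both intensity and duration matter.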

#### **BIOTIC FACTORS**

In addition to the influences of abiotic effects on photosynthesis, biotic effects possibly under host control can have important consequences on symbiotic photosynthesis but have not been extensively investigated. The most conspicuous distinction between *Symbiodinium* in symbiosis and in culture is the difference in morphology. *Symbiodinium* in symbiosis primarily are non-flagellate spherical cells (coccoid stage), while in culture they show diurnal morphological changes between the flagellate gymnodinioid stage (motile stage) in daylight and the coccoid stage at night (Muscatine et al., 1998; Yamashita et al., 2009). Additionally, *Symbiodinium* in culture, but not in symbiosis, make crystalline deposits of uric acid that align during the motile stage and are hypothesized to function as an eyespot (Yamashita et al., 2009).

In addition to these obvious differences that suggest that *Symbiodinium* in culture and in corals are in quite distinct states, there are also physiological and biochemical discrepancies. *Symbiodinium* in symbiosis has reduced metabolism as compared to *Symbiodinium* in culture, which was determined by comparing *Symbiodinium* freshly isolated from corals and those from cultures (Goiran et al., 1996). Additionally, the host may control photosynthetic rates and release of photosynthetic products. In freshly isolated *Symbiodinium* from corals, the amount of carbon fixed and released differed if the symbionts were in the presence or absence of synthetic "host" factors (free amino acids; Stat et al., 2008). Moreover, corals limit the growth rate of *Symbiodinium* in symbiosis; the doubling time of *Symbiodinium* in high light and low light corals is ∼70 and ∼100 days, respectively, which contrasts with a week in culture replete with nutrients (Falkowski et al., 1993). It is suspected that the corals control *Symbiodinium* growth through nitrogen limitation (Falkowski et al., 1993). Bacteria and cyanobacteria associated with corals may be able to provide both the host and the symbiont with nitrogen and affect the stability of the symbiosis (Lesser et al., 2007; Ceh et al., 2013). However, the effects of bacteria and viruses on *Symbiodinium* remain largely unexplored. The coral host may also be able to influence its symbiont on a biochemical level, which has been observed with *Symbiodinium* in sea anemones. In *Symbiodinium* from anemones, there are differences in photosynthetic proteins (e.g., Rubisco and peridinin–chlorophyll *a*-*c*2-binding protein) between cells in symbiosis versus in culture (Stochaj and Grossman, 1997). These studies provide evidence that research on *Symbiodinium* in culture may not reflect their behavior in symbiosis. While it is apparent that corals have some influence on *Symbiodinium* in symbiosis, the extent to which they regulate the activities of *Symbiodinium* and the mechanisms are unknown.
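
To put the doubling times above on a common scale, they can be converted to specific growth rates. The Python sketch below assumes simple exponential growth, so that the specific growth rate equals ln(2) divided by the doubling time, and takes "a week" in culture as 7 days; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
import math


def specific_growth_rate(doubling_time_days: float) -> float:
    """Specific growth rate (per day) under exponential growth: mu = ln(2) / doubling time."""
    if doubling_time_days <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_days


# Doubling times from the text (Falkowski et al., 1993); 7 days stands in for "a week".
mu_high_light = specific_growth_rate(70.0)
mu_low_light = specific_growth_rate(100.0)
mu_culture = specific_growth_rate(7.0)

# Fold difference between nutrient-replete culture and symbiosis.
fold_vs_high_light = mu_culture / mu_high_light
fold_vs_low_light = mu_culture / mu_low_light
```

Under these assumptions, *Symbiodinium* in nutrient-replete culture grows roughly 10 times faster than in high light corals and roughly 14 times faster than in low light corals.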

#### **DIVERSITY OF THE CORAL–ALGAL SYMBIOSIS**

There is incredible genetic, biochemical, physiological, and ecological diversity within both scleractinian corals and *Symbiodinium* individually as well as within the symbiosis. Responses and tolerances to light and other environmental parameters by the host or its symbiont can vary based on phylotype as well as recent environmental and biological history (Ward et al., 2000; Robison and Warner, 2006; Warner et al., 2006; Middlebrook et al., 2008; Krämer et al., 2011). Due to the high diversity of reef-building corals, the exact number of species is unknown. However, hundreds of species of corals have been described based on morphology (Veron, 2000). There are considerable challenges in how to demarcate a species and it is likely that a combination of morphological and genetic (nuclear and mitochondrial markers) approaches will be necessary to understand the biodiversity of corals (Stat et al., 2012). The enormous morphological diversity of scleractinian corals also contributes to the varying degree of bleaching sensitivity (Veron, 2000; Marcelino et al., 2013). Vertical and lateral light gradients within corals can be different because of their unique tissue and skeletal characteristics (Wangpraseurt et al., 2012). Additionally, the internal light environment may be altered by different colors and abundances of FPs of different species (Salih et al., 2000; Alieva et al., 2008; Gruber et al., 2008; Roth et al., 2010). It is likely that there are many other cellular and biochemical distinctions between coral species, but they are largely underexplored. It is also difficult to tease apart physiological differences of corals alone because a healthy reef-building coral is one in symbiosis with *Symbiodinium*.

Like their hosts, *Symbiodinium* contains significant functional and genetic diversity. While it is known that there are nine clades of *Symbiodinium*, the number of species is unknown and there are a number of challenges in delineating species in this taxonomic group (Stat et al., 2012). In *Symbiodinium,* not only is there extensive intracladal diversity, but also substantial biochemical and physiological intercladal differences. *Symbiodinium* phylotype can determine photoacclimation and photosynthetic capacities as well as antioxidant activities (Savage et al., 2002; Robison and Warner, 2006; Hennige et al., 2009; Lesser, 2011). Additionally, phylotypes have different photoinhibition, photorepair mechanisms, and thylakoid lipid composition, which can determine thermal sensitivity (Tchernov et al., 2004; Ragni et al., 2010; Díaz-Almeyda et al., 2011; Krämer et al., 2011). A significant challenge is that the majority of *Symbiodinium* strains, particularly those most biologically relevant such as those that populate most of the corals from the Indo-Pacific, have not been able to be maintained in culture and thus not studied without their hosts.

There is an additional level of diversity in the coral–algal symbiosis because individual corals can host multiple types of *Symbiodinium* on various temporal and spatial scales. While it was originally believed that the symbiosis was mutualistic, it is now known that the coral–algal symbiosis spans the continuum from parasitism to mutualism (Lesser et al., 2013). Clades A and D are generally considered more parasitic while clade C is known as more mutualistic based on characteristics of carbon fixation and translocation (Stat et al., 2008; Cantin et al., 2009). Changes in environmental conditions and coral bleaching may create opportunities that favor specific or new symbioses. The significant diversity within each partner as well as in the symbiosis means that much of the diversity remains uncharacterized, but due to the biodiversity crisis this is an important area of research for understanding coral populations. Because the performance of the coral holobiont is dependent upon both partners of the coral–algal symbiosis, physiological and ecological studies would benefit from taxonomic identification of both partners.

#### **RECENT ADVANCES AND FUTURE DIRECTIONS**

Recent advances in genomics, transcriptomics, translatomics, proteomics, lipidomics, and metabolomics (collectively referred to as the "omics") will provide a fresh perspective into the coral–algal symbiosis and enhance the understanding of this complex relationship in a dynamic environment. The first coral genome was published in 2011 (Shinzato et al., 2011), which was followed by a draft of the larger genome of *Symbiodinium* (the anemone symbiont *S. minutum*) in 2013 (Shoguchi et al., 2013). Additionally, a number of scleractinian coral and *Symbiodinium* transcriptomes are available (e.g., Meyer et al., 2009; Bayer et al., 2012) and it is now possible to analyze both coral and symbiont transcriptomes simultaneously (Shinzato et al., 2014). For a full description of recent genomic and proteomic studies see the review by Meyer and Weis (2012). Quantitative gene expression studies under a variety of conditions will be important in establishing the key molecular players responsible for a range of processes and in particular for responses to light. For example, a recent global transcriptome investigation of corals in low pH conditions revealed that in addition to upregulation of calcification genes, genes for autotrophy and heterotrophy are upregulated (Vidal-Dupiol et al., 2013). Because "omics" studies encompass the collective characterization of an organism, they can provide new directions of focus that may have been overlooked or not considered.

"Omics" studies in *Symbiodinium* lag behind those on corals because of the size of the *Symbiodinium* genome (∼1500 Mbp; Shoguchi et al., 2013) and transcriptome (∼59,000 genes; Baumgarten et al., 2013). In addition to the large amount of cellular DNA they contain, there are a number of well-known genetic peculiarities to dinoflagellates such as having permanently condensed chromosomes, few or no nucleosomes and reduced plastid genomes (Hackett et al., 2004; Leggat et al., 2011b). A real-time PCR study of *Symbiodinium* showed no effect of diurnal changes in light levels or transfer from low to high light, on transcript abundance of reaction center proteins of both PSI and PSII, suggesting that posttranscriptional processes may be important for regulating proteins (McGinley et al., 2013). Most previous studies have focused on a small number of genes (e.g., Leggat et al., 2011a; McGinley et al., 2013; Sorek et al., 2013), but the tools are now available for quantitative transcriptome-wide studies. A recent study using RNA-seq on thermotolerant and sensitive phylotypes of *Symbiodinium* in the same coral host showed no detectable change in gene expression after a short heat stress despite evidence of symbiosis breakdown (Barshis et al., 2014). Another study on *S. microadriaticum* suggests that there is a low number of transcription factors, but that small RNAs (smRNAs) may be important for posttranscriptional regulation (Baumgarten et al., 2013). However, minimal changes were also observed in the endosymbiont enriched proteome from corals during temperature stress (Weston et al., 2012). This study found that 11% of peptides increased expression but that neither antioxidants nor heat stress proteins significantly increased expression under heat stress (Weston et al., 2012). Unexpectedly, temperature stress did cause an extraordinary 114-fold increase in a viral replication protein, which may suggest that viruses may play an important role in bleaching and/or disease when corals are stressed (Weston et al., 2012). System-level studies integrating "omics" with physiology will elucidate the genes, proteins, and regulatory factors relevant for photoacclimation and light stress of both partners of the symbiosis. Because these studies are unbiased, they can reveal new areas of focus such as the effects of viruses on *Symbiodinium* physiology.
These technologies are advancing quickly and are now available on single cells (Wang and Bodovitz, 2010), which could reveal the heterogeneity of the mixed *Symbiodinium* assemblage as well as the physiological diversity within different layers of coral tissues. A new method has recently been developed for conducting automated massively parallel RNA single-cell sequencing (MARS-seq) on multicellular tissues (Jaitin et al., 2014), which could be used to examine the distinct cells of the coral tissue (e.g., cells with *Symbiodinium* and without). These exciting new technologies will offer a new characterization of the physiology of the coral–algal symbiosis.

In addition to the development of the "omics," advances in traditional methodologies and interest by those with expertise in complex techniques can provide insights into the coral–algal symbiosis. The light-harvesting characteristics of *Symbiodinium* are an important area of concentration because of the central role of photosynthesis in the health of the coral–algal symbiosis. Due to the unique spectroscopic properties of *Symbiodinium* light-harvesting complexes, acpPC and PCP have gained the attention of scientists who study photosynthesis in model photosynthetic organisms and employ a variety of sophisticated techniques and methodologies, which can be applied to *Symbiodinium*. The X-ray crystallography structure of PCP was recently determined (Schulte et al., 2009) because it is the only system where bound carotenoids (peridinin) outnumber chlorophylls. However, the structure of the acpPC complex is still unknown. A recent study has shown that PCP is protected from potential photodamage because peridinin has an extremely fast triplet state, which can instantaneously deplete triplet chlorophyll to prevent forming singlet oxygen (Niedzwiedzki et al., 2013a). Femtosecond time-resolved transient absorption spectroscopy of acpPC shows that the accessory pigments (most carotenoids and chlorophyll *c*2) are very effective at absorbing light and passing it to chlorophyll *a*, but the photoprotection capacity of acpPC remains questionable (Niedzwiedzki et al., 2013b). Light-harvesting complexes play a role in preventing the overexcitation and dissipation of excess energy and thus further research in these systems may provide important insight into coral–algal photophysiology. The recent interest in *Symbiodinium* by photosynthesis scientists from model organisms will help elucidate the cellular and biochemical mechanisms in this unique photosynthetic symbiont.

Other techniques that have provided valuable insight in other fields, yet are sorely lacking in the coral–algal field, include genetic transformation and coral cell lines in culture. Although a methodology for genetic transformation was described in *Symbiodinium* 16 years ago (ten Lohuis and Miller, 1998), there has been no progress reported since the initial study. Additionally, gene knockout or knockdown in the coral–algal symbiosis could reveal roles of critical proteins involved in responses to light and environmental stress, among other processes. Development of methodologies for RNA interference (RNAi), which is used for gene knockdown, is currently underway in *Symbiodinium* (Weber and Medina, 2012). Furthermore, a simplified system of coral cells in culture, both with and without *Symbiodinium*, would be a great asset to obtain a better grasp of the symbiosis. Almost all studies to date of *Symbiodinium* in culture have included cultures that are not axenic and therefore may have included bacteria, fungi and/or protists. Recently, clonal, axenic lines of *Symbiodinium* have been obtained (Xiang et al., 2013), and will be instrumental in understanding the roles of bacteria and viruses on *Symbiodinium*. Combinations of physiological, biochemical, and genetic studies under normal conditions, acclimation, and stress will provide the most insight into the coral–algal symbiosis.

Tremendous progress has been made over the last 30 years in the knowledge of the coral–algal symbiosis, and the recent advances in tools and techniques integrated with traditional methodologies will provide new insights into the symbiosis. Given that ∼75% of the world's coral reefs are now considered threatened (Burke et al., 2011), now is the time to act swiftly in a coordinated, collaborative effort to make even greater strides in understanding the coral–algal symbioses for the protection and conservation of coral reef ecosystems.

## **ACKNOWLEDGMENTS**

This project was supported by the Agriculture and Food Research Initiative Competitive Grant No. 2013-67012-21272 from the USDA National Institute of Food and Agriculture. I would like to thank Judith Connor for sparking my passion for algae, George Somero for instilling in me the importance of environmental factors, Nancy Knowlton for introducing me to coral reef research, Roberto Iglesias-Prieto for cultivating my knowledge of photosynthesis in corals, Dimitri Deheyn for encouraging me through my research endeavors, and Krishna Niyogi for nurturing my development in the broader photosynthesis world. I thank Monica Medina and Pilar Francino for the invitation to contribute to this edition and Krishna Niyogi, Jacqueline Padilla-Gamiño, Kate Hanson, Alizée Malnoë, and Scott Bornheimer for thorough and insightful comments on the manuscript.

## **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 21 April 2014; accepted: 25 July 2014; published online: 22 August 2014. Citation: Roth MS (2014) The engine of the reef: photobiology of the coral–algal symbiosis. Front. Microbiol. 5:422. doi: 10.3389/fmicb.2014.00422*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Roth. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Microbial diversity and activity in the** *Nematostella vectensis* **holobiont: insights from 16S rRNA gene sequencing, isolate genomes, and a pilot-scale survey of gene expression**

*Jia Y. Har<sup>1†</sup>, Tim Helbig<sup>1†</sup>, Ju H. Lim<sup>1</sup>, Samodha C. Fernando<sup>1</sup>, Adam M. Reitzel<sup>2</sup>, Kevin Penn<sup>1</sup> and Janelle R. Thompson<sup>1</sup>\**

*Edited by: Monica Medina, Pennsylvania State University, USA*

*Reviewed by: Irene Newton, Indiana University, USA Angela Elizabeth Douglas, Cornell University, USA*

#### *\*Correspondence:*

*Janelle R. Thompson, Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Room 48-331, 15 Vassar Street, Cambridge, MA 02139, USA jthompson@mit.edu † Co-first authors.*

#### *Specialty section:*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology*

*Received: 05 May 2014 Accepted: 27 July 2015 Published: 02 September 2015*

#### *Citation:*

*Har JY, Helbig T, Lim JH, Fernando SC, Reitzel AM, Penn K and Thompson JR (2015) Microbial diversity and activity in the Nematostella vectensis holobiont: insights from 16S rRNA gene sequencing, isolate genomes, and a pilot-scale survey of gene expression. Front. Microbiol. 6:818. doi: 10.3389/fmicb.2015.00818*

*<sup>1</sup> Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA,*

*<sup>2</sup> Department of Biological Sciences, University of North Carolina at Charlotte, Charlotte, NC, USA*

We have characterized the molecular and genomic diversity of the microbiota of the starlet sea anemone *Nematostella vectensis*, a cnidarian model for comparative developmental and functional biology and a year-round inhabitant of temperate salt marshes. Molecular phylogenetic analysis of 16S rRNA gene clone libraries revealed four ribotypes associated with *N. vectensis* at multiple locations and times. These associates include two novel ribotypes within the ε-Proteobacterial order Campylobacterales and the Spirochetes, respectively, each sharing <85% identity with cultivated strains, and two γ-Proteobacterial ribotypes sharing >99% 16S rRNA identity with *Endozoicomonas elysicola* and *Pseudomonas oleovorans*, respectively. Species-specific PCR revealed that these populations persisted in *N. vectensis* asexually propagated under laboratory conditions. cDNA indicated expression of the Campylobacterales and *Endozoicomonas* 16S rRNA in anemones from Sippewissett Marsh, MA. A collection of bacteria from laboratory raised *N. vectensis* was dominated by isolates from *P. oleovorans* and *Rhizobium radiobacter*. Isolates from field-collected anemones revealed an association with *Limnobacter* and *Stappia* isolates. Genomic DNA sequencing was carried out on 10 cultured bacterial isolates representing field- and laboratory-associates, i.e., *Limnobacter* spp., *Stappia* spp., *P. oleovorans* and *R. radiobacter*. Genomes contained multiple genes identified as virulence (host-association) factors while *S. stellulata* and *L. thiooxidans* genomes revealed pathways for mixotrophic sulfur oxidation. A pilot metatranscriptome of laboratory-raised *N. vectensis* was compared to the isolate genomes and indicated expression of ORFs from *L. thiooxidans* with predicted functions of motility, nutrient scavenging (Fe and P), polyhydroxyalkanoate synthesis for carbon storage, and selective permeability (porins).
We hypothesize that such activities may mediate acclimation and persistence of bacteria in a *N. vectensis* holobiont defined by both internal and external gradients of chemicals and nutrients in a dynamic coastal habitat.

**Keywords:** *Nematostella vectensis***, holobiont, cnidaria, microbiota, mixotrophy, phasins**

## **Introduction**

Communities of microbes and their animal hosts are collectively known as holobionts (Rohwer et al., 2002). The microbial portion of the holobiont (i.e., the microbiota) contributes to the molecular and physiological functions of a wide diversity of hosts. For example, bacteria are known to break down complex plant polymers and polysaccharides in termites (Xu et al., 2003; Warnecke et al., 2007; Mahowald et al., 2009), synthesize essential amino acids and vitamins in sharpshooter insects (Wu et al., 2006), aid the development of particular organs and systems in humans (Dobber et al., 1992; Rawls et al., 2004; O'Hara and Shanahan, 2006; Rader and Nyholm, 2012), and deter predators and pathogens in corals (Reshef et al., 2006; Bosch, 2013). Evidence that the composition and succession of the microbiota are species specific for particular animal hosts comes from the identification of mechanisms for interaction which appear to have diverged with host speciation (Rawls et al., 2006; Ley et al., 2008; Ryu et al., 2008; Fraune et al., 2010; Ochman et al., 2010).

Cnidarians are a focal taxonomic group in marine habitats for understanding the interaction between animals and microbes. Much work on cnidarian-microbe associations has focused on identifying bacterial species that might cause or prevent disease, particularly the various "band" diseases that are increasingly common in reef-building corals (Bourne et al., 2009; Kimes et al., 2010; Mouchka et al., 2010). However, detailed functional connections between corals and bacteria remain unknown. Mechanistic studies using the hydrozoan *Hydra* have revealed species-specific bacterial communities and precise temporal regulation of the microbiome during its development (Fraune and Bosch, 2007; Franzenburg et al., 2013a,b). Together, these data from cnidarian species suggest that bacterial communities are integral and specific components of each cnidarian holobiont with a spectrum of functions.

Recently, the anthozoan *Nematostella vectensis* has been developed into a model organism for metazoan evolution and development due to its tractability in the lab, easily induced sexual and asexual reproduction and sequenced genome including a repertoire of predicted innate immunity genes (Putnam et al., 2007; Genikhovich and Technau, 2009; Renfer et al., 2010; Reitzel et al., 2012; Stefanik et al., 2013). A sedentary carnivore, this anemone resides exclusively in estuaries (Hand and Uhlinger, 1994) including those of extreme salinity (Sheader et al., 1997), temperature (Williams, 1983; Kneib, 1988; Reitzel et al., 2013) and sulfide fluxes (Howes et al., 1985). *N. vectensis* does not harbor zooxanthellae or any other known eukaryotic symbionts (**Figure 1**) and mainly preys on small free-living organisms in salt marshes, including copepods, midge larvae, worms (nematodes, polychaetes, and oligochaetes) and rotifers (Frank and Bleakney, 1978; personal observation) (**Figure 1A**).

As a first step to characterize the microbiota of *N. vectensis* both in the wild and under controlled laboratory conditions, we have combined cultivation-independent analyses of 16S rRNA gene diversity (cloned sequence analysis) with strain isolation, genome sequencing, and analysis of expressed RNAs to determine: (1) whether *N. vectensis* is associated with similar populations of microorganisms in geographically distinct salt marshes and when transferred to laboratory cultures with artificial seawater conditions, (2) whether these microbial populations are metabolically active within the host tissue, and (3) whether associated microbes have specific genes that may promote survival in the holobiont environment. Taken together, the following data provide evidence that *N. vectensis* maintains interactions with populations of microorganisms, of which several appear to be active based on detection of expressed RNAs. Future work to determine the nature of these interactions will advance our understanding of how microorganisms contribute to the physiology and ecology of the anemone holobiont.

**FIGURE 1 | (A)** Two *N. vectensis* anemones residing at the surface of a core of Sippewissett Marsh, MA (white arrows). Other invertebrates evident in image were also observed in the *N. vectensis* gut. **(B)** *N. vectensis* polyp maintained in the laboratory. **(C)** Scanning electron micrograph of *N. vectensis* collected from Sippewissett Marsh (October 2009). **(D)** Higher magnification showing the presence of diatoms and rod-shaped microorganisms on the anemone exterior.

## **Methods**

## **Anemone Collection and Maintenance**

*N. vectensis* adults were collected from Sippewissett Marsh, Massachusetts, USA (MA-I to MA-V, MA-II), Clinton, Connecticut, USA (CT), and Mahone Bay, Nova Scotia, Canada (MB) between July 2008 and March 2010, preserved in RNAlater (Ambion, Inc.), and stored at 4°C for DNA analysis. Before nucleic acid extraction, field anemones were removed from RNAlater and rinsed three times in deionized water. Sediment samples from the site of *N. vectensis* collection were retrieved from Sippewissett Marsh in November 2008 and June 2009 (**Table 1**). Four hundred and eighty milliliters of marsh water were collected in June 2009 and filtered on-site using Sterivex 0.22 micron cartridge filters (Millipore). The cartridges were kept on ice and then frozen at −20°C until DNA extraction.


#### **TABLE 1 | Overview of sample collection and analysis.**

*aMonthly air temperature ranges were obtained from weather series of Cape Cod Air Station, MA (12 miles away from the Great Sippewissett Marsh), Clinton, CT, and Western Head, NS (42 miles from Mahone Bay, NS).*

*bWater temperature was measured by an Onset® temperature logger. ND: Not Determined.*

Laboratory-acclimated *N. vectensis* were originally collected from Sippewissett Marsh, MA (summer of 2007, multiple trips) and were maintained at MIT for at least 6 months before DNA extraction. *N. vectensis* were kept in artificial seawater (ASW) adjusted to a salinity of 10 ppt (Instant Ocean, Spectrum Brands, Inc.) at room temperature (21–23°C) and fed *Artemia* nauplii three times a week over 6 months. To prepare laboratory-acclimated *N. vectensis* (LAB), individuals were transferred to autoclaved saline and fed bleached *Artemia* nauplii for 4 weeks to reduce the effects of laboratory microbial contaminants on the host microbiota. Prior to genomic DNA extraction, laboratory anemones were incubated in autoclaved 10 ppt ASW for 2 days without feeding to eliminate digested food particles and rinsed three times in deionized water to remove loosely attached microbes and debris.

## **Molecular Diversity of the** *N. vectensis* **Holobiont**

## **Preparation and Analysis of 16S rRNA Gene Clone Libraries**

Genomic DNA was extracted from whole anemones (*n* = 3 to 5 per extraction depending on anemone size) using the DNeasy® Blood and Tissue kit (Qiagen Sciences). DNA from sediment and water filters was extracted using the UltraClean™ Soil DNA Isolation Kit (Mo Bio Laboratories, Inc.) according to manufacturer instructions. Blunt-end 16S rRNA products were amplified with the universal bacterial PCR primers 27F (5′-AGA GTT TGA TCM TGG CTC AG-3′) and 805R (5′-GGA CTA CCA GGG TAT CTA ATC CC-3′) using Phusion polymerase (Finnzymes) under the following conditions: 1 cycle of 98°C for 30 s; 35 cycles of 98°C for 10 s, 52°C for 30 s, and 72°C for 30 s; and 1 cycle of 72°C for 10 min. Quadruplicate PCRs were carried out and pooled for each sample to reduce the potential influence of PCR-generated mutations. PCR products were gel-purified (Qiagen kit), cloned into the pCR-Blunt vector (Invitrogen) and transformed into chemically competent TOP10 cells (Invitrogen) according to the manufacturer instructions. Ninety-six to 120 clones were randomly selected from each library and insert sequences were amplified using flanking primers M13F (5′-TGT AAA ACG ACG GCC AGT-3′) and M13R (5′-AGG AAA CAG CTA TGA CCA T-3′). PCR products were sequenced from the 27F primer using the BigDye® Terminator v3.1 Cycle Sequencing Kit according to manufacturer recommendations on a 3130 Genetic Analyzer (ABI). Sequences were trimmed and sorted into operational taxonomic units (OTUs) defined by 99% nucleic acid identity using Sequencher 4.5 (Gene Codes Corporation). Sequences were checked for chimeras using Bellerophon (Huber et al., 2004) and Mallard (Ashelford et al., 2006). Non-chimeric clones were identified by querying the GenBank database using NCBI BLAST (Altschul et al., 1997). Representative clones for each 99% OTU were aligned with SINA against the Silva database (Pruesse et al., 2012; Quast et al., 2013) and their phylogenies were inferred using the neighbor-joining algorithm implemented in CLC Genomics Workbench (version 7).
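The idea of sorting sequences into OTUs at a 99% identity threshold can be illustrated with a minimal sketch. This is not the procedure implemented in Sequencher; it is a simple greedy, seed-based clustering that assumes the sequences are already aligned to equal length. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip alignment columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def cluster_otus(seqs: Dict[str, str], threshold: float = 99.0) -> List[List[str]]:
    """Greedy clustering: a sequence joins the first OTU whose seed it matches."""
    otus: List[Tuple[str, List[str]]] = []  # (seed sequence, member names)
    for name, seq in seqs.items():
        for seed, members in otus:
            if percent_identity(seed, seq) >= threshold:
                members.append(name)
                break
        else:
            # No existing seed is close enough, so this sequence founds a new OTU.
            otus.append((seq, [name]))
    return [members for _, members in otus]


# Hypothetical toy alignment (200 columns), not data from the study.
base = "ACGT" * 50
toy_clones = {
    "clone_a": base,
    "clone_b": "T" + base[1:],        # 1 mismatch to clone_a (99.5% identity)
    "clone_c": "T" * 10 + base[10:],  # 8 mismatches to clone_a (96.0% identity)
}
print(cluster_otus(toy_clones))
```

Greedy clustering depends on input order, so a different ordering of clones can shift OTU boundaries for sequences near the threshold.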

## **Species-specific PCR**

Species-specific PCR primers were designed for the ribotypes found in association with *N. vectensis* at different locations. Primers were designed using Primer-BLAST (Rozen and Skaletsky, 2000) and the annealing temperatures of the primers were optimized against negative controls of non-target organisms until no non-specific products were amplified. Primer sequences and optimized annealing temperatures (Ta) are as follows: Campylobacterales OTU, NVeps81F TAGCTTGCTAGAGTGTCAGC and NVeps677R TTTGTCTTGCAGTTCTATGGTTAA, Ta: 55°C; Spirochete-like OTU, Spiro165F GGGGTAATACCGAATGATCTAGG and Spiro655R TTCCAACGCAACAATACAGTTAAG, Ta: 57°C; *Endozoicomonas elysicola*, Endo80F AGCTTGCTCTTTGCCGACGAG and Endo624R CTTTCACATCCAACTTAGGTAGCC, Ta: 58°C; *Pseudomonas oleovorans*, Po23S-323F GTACACGAAACGCTCTTATCAATG and Po23S-1475R AAATCAGCCTACCACCTTAAACAC, Ta: 57°C. The expected PCR product sizes were approximately 600, 490, 540, and 1150 bp, respectively. Ten nanograms of DNA from the MA-III, SED-MA-III, WATER-MA-III, and LAB samples were used as templates for the species-specific PCR with Phusion (Finnzymes) and the same thermocycler profile as for the 16S universal primers (see above). Positive amplicons with correct size products (if any) were confirmed by cloning and sequencing or by restriction fragment length polymorphism (RFLP) analysis with the enzyme HaeIII.
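Expected product sizes for a primer pair can be checked in silico against a reference sequence. The sketch below is a minimal illustration that assumes exact, non-degenerate primer matches on the forward strand. The template is a hypothetical toy sequence built around the Campylobacterales primers listed above, not a real 16S rRNA gene, so the printed size is not the ~600 bp product reported here.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Amplicon size for exact primer matches, or None if no product is predicted."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


NVEPS81F = "TAGCTTGCTAGAGTGTCAGC"
NVEPS677R = "TTTGTCTTGCAGTTCTATGGTTAA"

# Hypothetical toy template: forward site, 30-nt spacer, reverse site.
toy_template = "GG" + NVEPS81F + "A" * 30 + reverse_complement(NVEPS677R) + "CC"
print(amplicon_length(toy_template, NVEPS81F, NVEPS677R))
```

A real specificity check would also need to tolerate mismatches and degenerate bases, which this sketch deliberately omits.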

## **Isolation and Characterization of Bacterial Strains from the** *N. vectensis* **Holobiont**

Microbial strains were isolated from *N. vectensis* collected from Sippewissett Marsh (multiple trips; March–May 2010) or from the laboratory-maintained stock (February 2008 and 2010). Bacterial isolation methods were varied in an effort to increase the diversity of microbial isolates. Anemones were washed with 1X PBS (with or without 50 μg mL<sup>−1</sup> gentamycin treatment to inactivate surface-associated microbes, depending on sample) for 1 h at room temperature, after which they were homogenized in 1X PBS using a flame-sterilized tissue grinder (Wheaton). Some anemones were treated with 5 mM Type 1 collagenase (Calbiochem) for 30 min at 37°C for further tissue maceration prior to tissue grinding. Partial 16S rRNA gene sequences of isolates were obtained after screening by RFLP analysis with the HaeIII enzyme and compared to sequences recovered by clone library analysis. Isolates with ribotypes recovered from multiple individual anemones, geographic locations, and/or sampling times were considered to have evidence of stable association with the anemone, suggesting symbiosis or "the living together of unlike organisms" (de Bary, 1879). Stable associates were selected for physiological characterization and genome sequencing.

Physiological tests were conducted in triplicate to characterize particular attributes of each bacterial isolate from *N. vectensis*. Heterotrophic growth media for physiological characterization consisted of 2216 marine broth or agar (Difco) with additional tests for general growth (LB and TSB media, Difco) and microaerophilic growth [GasPak EZ Container system (BD) with Brucella-blood agar (Anaerobe Systems), Campylobacter-Wollinella agar (Anaerobe Systems), and 2216 agar]. Minimal marine salts supplemented with 2 mM Na2S·9H2O (Sigma) or 5 mM Na2S2O3 (Sigma) were employed to test for chemoautotrophic growth. Catalase activity was assayed using 3% H2O2 and Gram staining was performed according to the manufacturer's protocol (BD Life Sciences). Cell morphology was observed on a Zeiss Axioskop 2 (Carl Zeiss MicroImaging Inc.) after staining cells from early stationary phase cultures (2 days) with 4′,6-diamidino-2-phenylindole (DAPI, Sigma). Motility was scored as directional swimming in live cultures observed via light microscopy. Colony morphology was described after growth on 2216 agar at 28°C for 2 days (unless otherwise specified). Tests for tolerance of pH were conducted in 2216 broth pH-adjusted and buffered by the addition of acetic acid (pH 4–5), NaH2PO4 (pH 6–8), or Tris base (pH 9–10) at 28°C and scored by culture turbidity after 2 days. Salinity tests were conducted in LB media omitting or adding NaCl and scored by culture turbidity. Temperature tolerance from 4 to 45°C was measured by evidence of colony growth on 2216 agar after 4 days. Tests for antibiotic sensitivity were conducted in 2216 marine broth supplemented with Nalidixic acid (4 mg L<sup>−1</sup>), Chloramphenicol (10 mg L<sup>−1</sup>), Ampicillin (100 mg L<sup>−1</sup>), Kanamycin (100 mg L<sup>−1</sup>), or Streptomycin (100 mg L<sup>−1</sup>).

## **Genome Sequencing, Assembly, Annotation, and Analysis**

Whole genome libraries were prepared (after Penn et al., 2014) for sequencing using the Illumina Genome Analyzer (Illumina, Inc.). Briefly, 5 μg of genomic DNA from each strain was sheared using Adaptive Focused Acoustic technology (Covaris, Inc.) to generate fragments 100–300 base pairs (bp) in length. Fragments were blunt-ended, A-tailed and ligated with T-overhang Illumina forked paired-end sequencing adapters (Illumina, Inc.) containing bar codes for multiplex sequencing. Resulting libraries were size selected on an agarose gel to obtain 250 bp libraries. Libraries were then PCR amplified for 15 cycles based on determination of the optimum number of cycles using qPCR. Libraries were multiplexed and sequenced to a targeted depth of 50X. Bacterial genomes were assembled into contigs using CLC Genomics Workbench 4 (Aarhus, Denmark). Contigs produced by the CLC assembly were uploaded to the Rapid Annotations using Subsystems Technology (RAST) server for identification and annotation of open reading frames (ORFs) (Aziz et al., 2008). Genomes corresponding to strain names are public in the RAST database. ORFs were also annotated by assignment to orthologous groups in the eggNOG Database (v3.0) (Powell et al., 2012) based on similarity searches with BLASTP (Altschul et al., 1997) with a threshold *e*-value < 1e−20 and where the aligned portion includes the predicted functional residues of the protein (as designated in the COG/NOG database).

To test if isolates from distinct phylogenetic lineages shared similar regions of DNA, including phage or prophage elements, which would suggest potential horizontal gene transfer within the holobiont, (1) isolate genome ORFs were compared to each other by BLASTN (minimum match identity = 95%), (2) the predicted proteome of each isolate was compared by BLASTP to the PHAST phage and prophage database (Zhou et al., 2011) with an *e*-value < 1e−10 followed by manual inspection of top matches, and (3) the nucleotide sequences were compared to all virus and phage genomes in the non-redundant nucleotide database (July 2014) by BLASTN. (4) To assess the potential for horizontal gene transfer between *N. vectensis* and the isolates, a BLASTN search was performed between the contigs of all 10 assembled *N. vectensis* associate genomes and the scaffolds of the current *N. vectensis* genome (v.1, Putnam et al., 2007). Sequence matches were determined using an expected value cutoff of 1e−30. *N. vectensis* scaffolds containing bacteria-like DNA were manually inspected and analyzed with custom Python scripts to determine their GC content, ambiguous base composition and size, which were used to assess the likelihood of horizontal gene transfer. Finally, to screen for factors of host association, protein sequences from annotated ORFs were compared to a database of virulence factors (Chen et al., 2012) by BLASTP with an *e*-value < 1e−10 followed by manual inspection of top matches.
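The custom scripts used for scaffold inspection are not reproduced here. As a rough sketch of what such a calculation might look like, the following function reports size, GC content, and ambiguous base fraction for one scaffold. The function name, the choice to compute GC over unambiguous bases only, and the toy sequence are all assumptions made for illustration.

```python
from typing import Dict


def scaffold_stats(seq: str) -> Dict[str, float]:
    """Size, GC fraction, and ambiguous-base fraction of one scaffold."""
    seq = seq.upper()
    length = len(seq)
    # Bases other than A, C, G, T (for example N) are counted as ambiguous.
    called = sum(seq.count(base) for base in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return {
        "length": length,
        # GC is computed over unambiguous bases only (an assumption of this sketch).
        "gc_fraction": gc / called if called else 0.0,
        "ambiguous_fraction": (length - called) / length if length else 0.0,
    }


# Hypothetical toy scaffold, not a sequence from the N. vectensis assembly.
toy_scaffold = "GGCCAATTNN"
print(scaffold_stats(toy_scaffold))
```

A short scaffold with a bacteria-like GC content and a high ambiguous fraction would be more consistent with assembly contamination than with horizontal gene transfer, which is presumably why the three quantities are examined together.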

## **Analysis of Expressed RNA in Field-collected and Laboratory-acclimated Anemones**

## **Characterization of Expressed 16S rRNA in** *N. vectensis* **from Sippewissett Marsh**

One gram of *N. vectensis* polyps collected from Sippewissett Marsh in July 2009 was preserved on-site in RNAlater and later homogenized in TriPure Isolation Reagent (Roche Applied Science) with a mortar and pestle according to the manufacturer's protocol. The homogenate was then mixed with chloroform at room temperature and centrifuged at 12,000 × g for 30 min at 4°C. The upper aqueous phase containing RNA was mixed with 2.5 ml isopropyl alcohol, incubated at −80°C for 15 min, and then centrifuged at 12,000 rpm for 15 min at 4°C. The resulting RNA pellet was washed with 75% ethanol, air-dried, resuspended in DEPC-treated nuclease-free water and stored at −80°C. RNA was further purified prior to analysis as previously described (Sambrook and Russell, 2001). Briefly, RNA and phenol:chloroform:isoamyl alcohol (25:24:1) were mixed at a 1:1 ratio, vortexed for 15 s and centrifuged at 14,000 × g for 5 min. The aqueous phase was mixed with 0.1 volumes of ammonium acetate (pH 5.2) and 2.5 volumes of ice-cold 100% ethanol. Samples were gently mixed, incubated at −80°C for 30 min, and then centrifuged at 14,000 × g for 30 min. The pellet was washed in 70% ethanol, resuspended in DEPC-treated nuclease-free water and stored at −80°C. One microgram of purified total RNA was used for cDNA synthesis. First strand cDNA synthesis was carried out using the Transcriptor First Strand cDNA Synthesis Kit (Roche) according to the manufacturer's protocol, with the exception that pentadecamer primers (5′-NNNNNNNNNNNNNNN-3′) synthesized by Integrated DNA Technologies (Coralville) were used for random amplification of total RNA instead of the random hexamers supplied with the kit (Stangegaard et al., 2006). The cDNA was used in place of genomic DNA as a template for 16S rRNA gene amplification and cloning as described above.

## **Characterization of Expressed Bacterial ORFs in Laboratory-raised** *N. vectensis* **Through a Pilot-scale Metatranscriptome Study**

Laboratory-raised *N. vectensis* adults were incubated and treated to reduce microbial contamination, as described earlier. Twenty *N. vectensis* polyps (approximately 2 cm each) were homogenized in TRIzol reagent (Life Technologies) and their RNA was extracted according to the manufacturer's instructions, including treatment with DNase followed by phenol-chloroform extraction. The RNA was divided among six samples that were each subjected to various combinations of rRNA depletion protocols as an initial screen of protocol effectiveness. These depletion methods were designed to enrich for microbial mRNAs and eliminate eukaryotic RNAs and bacterial rRNAs and are summarized in **Table 5**. Unprocessed total RNA was included as a reference sample. All RNA samples were transcribed to cDNA (SuperScript Kit Catalog # 11917-020) following treatments for depletion of rRNA and eukaryotic RNA. To prepare the Illumina libraries, cDNA for each sample was sheared to fragments of between 100 and 300 base pairs, purified, and ligated to proprietary Illumina adaptor sequences (Illumina, Inc., San Diego, CA) with unique 6 base-pair barcode sequences to designate samples for multiplexing within a single lane. Barcoded adaptor-ligated cDNA was then subjected to size selection to remove self-ligated adaptors. Cleaned and merged adaptor-ligated cDNA was sequenced as paired-end reads (100 bp × 2) on the Illumina GA-II platform (as described in Penn et al., 2014). The resulting FASTQ file was sorted by unique barcode, and the barcodes, Illumina adaptors and tandem repeat sequences were then removed using Perl and Python scripts. Sequence pairs were then compared against the Silva large and small subunit rRNA databases (Quast et al., 2013) using BLASTN. The pairs having one or both ends matching a ribosomal RNA database sequence with a bit score > 50.0 were removed. Remaining sequences were compared against a custom database of bacterial and *N. vectensis* 5S rRNA and ITS sequences using BLASTN, and again, those pairs having one or both ends matching a sequence within one of these databases with a bit score > 50.0 were removed. Following rRNA separation, remaining paired sequences that had overlapping sequence were merged using the software program SHERA with confidence metric ≥ 0.7 (Rodrigue et al., 2010).
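The pair-level rRNA filter described above (discard a pair if either end matches an rRNA database sequence with bit score > 50.0) can be sketched as follows. This is not the authors' Perl or Python code. It assumes 12-column tabular BLAST output with the query ID in the first column and the bit score in the last, and it assumes mates are named with `/1` and `/2` suffixes. The read names and hit lines are hypothetical.

```python
from typing import Iterable, List, Set


def rrna_pair_ids(blast_lines: Iterable[str], min_bitscore: float = 50.0) -> Set[str]:
    """Pair IDs for which at least one mate has a hit above the bit score cutoff."""
    flagged: Set[str] = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, bitscore = fields[0], float(fields[11])
        if bitscore > min_bitscore:
            # "pair1/1" and "pair1/2" both map to the pair ID "pair1".
            flagged.add(query.rsplit("/", 1)[0])
    return flagged


def keep_non_rrna(pair_ids: Iterable[str], flagged: Set[str]) -> List[str]:
    """Pairs that survive the filter, in their original order."""
    return [pair for pair in pair_ids if pair not in flagged]


# Hypothetical BLASTN hits against an rRNA database (tab-separated, 12 columns).
toy_hits = [
    "pair1/1\tsilva_x\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-40\t180.0",
    "pair2/2\tsilva_y\t85.0\t40\t6\t0\t1\t40\t9\t48\t0.5\t32.0",
    "pair3/2\tsilva_z\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-45\t190.0",
]
flagged_pairs = rrna_pair_ids(toy_hits)
print(keep_non_rrna(["pair1", "pair2", "pair3", "pair4"], flagged_pairs))
```

Filtering at the pair level rather than the read level is the conservative choice: a mate whose partner is ribosomal is most likely derived from the same rRNA molecule.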

## **Annotation of Assembled and Individual Metatranscriptome Sequences from Laboratory-raised** *N. vectensis*

Putative mRNA sequences were assembled in CLC Genomics Workbench with the following settings (mismatch cost = 2; insertion cost = 3; deletion cost = 3; length fraction = 0.5; similarity fraction = 0.8). The contigs were then used for a BLASTX search against the NR database using a low complexity filter, and the top 100 hits were kept. These results were loaded into MEGAN to identify the taxonomic matches of the contigs using the lowest common ancestor (LCA) method. LCA parameters for taxonomic assignments in MEGAN were set with a minimum support of 10, a minimum bit score of 50, and a maximum e-value of 0.01, only considering matches that lie within the top 10% of the best score for a particular sequence, and the minimum complexity filter was set at 0.44. Putative mRNAs were also compared against all sequences in the NCBI database using BLASTX (Altschul et al., 1997) with parameters (-m 8 -W 3 -e 20 -Q 11 -F "m S"). The BLASTX results were imported into MEGAN with a bit score cutoff of 40.0 and a lowest common ancestor cutoff of 2 matches (Huson et al., 2007). Unmerged sequence pairs with ends matching different domains (Bacteria, Archaea, and Eukarya) were discarded; those matching the same domain were annotated with more specific taxonomy.
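The lowest common ancestor idea can be shown with a small sketch. It is not MEGAN's implementation and omits the minimum support and complexity filters. It keeps hits at or above a minimum bit score, restricts them to those within the top 10% of the best score, and returns the deepest taxon shared by all remaining lineages. The function name and the toy hits are hypothetical, and treating the minimum bit score as inclusive is an assumption.

```python
from typing import List, Optional, Sequence, Tuple

Hit = Tuple[float, Sequence[str]]  # (bit score, root-to-tip lineage)


def lowest_common_ancestor(hits: List[Hit],
                           min_bitscore: float = 50.0,
                           top_percent: float = 10.0) -> Optional[str]:
    """Deepest taxon shared by all hits within top_percent of the best bit score."""
    passing = [(score, lineage) for score, lineage in hits if score >= min_bitscore]
    if not passing:
        return None  # nothing to assign
    best = max(score for score, _ in passing)
    cutoff = best * (1.0 - top_percent / 100.0)
    lineages = [lineage for score, lineage in passing if score >= cutoff]
    shared: List[str] = []
    # Walk the lineages rank by rank until they disagree.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1] if shared else "root"


# Hypothetical BLASTX hits for one contig.
toy_hits = [
    (200.0, ["Bacteria", "Proteobacteria", "Betaproteobacteria", "Limnobacter"]),
    (190.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonas"]),
    (60.0, ["Eukaryota", "Metazoa", "Cnidaria", "Nematostella"]),
]
print(lowest_common_ancestor(toy_hits))
```

In the toy case the weak eukaryotic hit falls outside the top 10% window and is ignored, so the contig is placed at the phylum shared by the two strong bacterial hits rather than at the root.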

## **Mapping the** *N. vectensis* **Metatranscriptome to** *N. vectensis* **Bacterial Isolate Genomes**

Non-ribosomal reads were merged into one FASTA file and imported into CLC Genomics Workbench (CLC Bio, Cambridge, MA) as unpaired sequences. The sequences were aligned with the annotated isolate reference genomes using the "map reads to reference sequence" function of CLC with parameters adjusted to provide stringent mapping of short cDNA sequences (Similarity = 0.9; Length Fraction = 0.5). Mapping results were manually inspected for coverage and sequence identity. ORFs from bacterial isolates with greater than 200 bp of consensus sequence coverage from mapped reads with >95% identity were considered for further analysis.

## **Results**

## **Molecular Diversity of Microbiota Associated with** *N. vectensis* **at Three Salt Marshes**

A total of 393 non-chimeric Bacterial and chloroplast 16S rRNA gene sequences (*E. coli* positions 27–805) were obtained from *N. vectensis* from Mahone Bay, Nova Scotia (MB), Clinton Harbor, Connecticut (CT), and Sippewissett Marsh, Massachusetts (MA-I, MA-II) (**Figure 2A**, **Table 1**). An additional 39 16S rRNA gene sequences were recovered from cDNA libraries prepared from *N. vectensis* total RNA (Sippewissett Marsh, MA-IV), 82 Bacterial 16S rRNA genes were recovered from laboratory-reared *N. vectensis*, and 66 16S rRNA sequences were obtained from Sippewissett Marsh surface sediments (**Table 1**). Archaeal 16S rRNA genes were not recovered by amplification of *N. vectensis* DNA (20 ng) with the Archaeal primer pair 21F to 958R. Three to ten bacterial phyla were recovered from samples of *N. vectensis*, with representatives from Cytophaga-Flexibacter-Bacteroides (CFB), Chloroflexi, Cyanobacteria, Deferribacteres, Firmicutes, OD1, Planctomycetes, Proteobacteria, Spirochetes, Tenericutes and Verrucomicrobia (**Figures 2A**, **3**). Operational taxonomic units (OTUs) were defined as clusters of 16S rRNA sequences sharing >99% identity (i.e., a ribotype).

Bacterial sequences associated with *N. vectensis* from the MB and CT salt marshes were dominated by a single ε-Proteobacterial OTU in the order Campylobacterales, which corresponded to 98 and 97% of the MB and CT clone libraries, respectively; 34 and 3% of sequences from *N. vectensis* collected from Sippewissett Marsh in July 2008 and November 2008, respectively (**Figure 2A**); and 26% of the bacterial cDNA clones, indicating expressed rRNAs, from Sippewissett Marsh *N. vectensis* in July 2009 (**Figure 3**, cluster 6). This Campylobacterales ribotype appears to be part of an uncultured lineage sharing 96% identity with clones from *Orbicella faveolata* (Sunagawa et al., 2009) (FJ202415 in **Figure 3**, cluster 6) and sharing ≤85% identity with the closest cultured relatives, *Helicobacter*, *Arcobacter*, and *Sulfurovum lithotrophicum*, which represent gastrointestinal pathogens (Engberg et al., 2000) as well as sulfur-oxidizing chemoautotrophs (Inagaki et al., 2004).

Two additional OTUs were distributed in multiple clone libraries from field-collected *N. vectensis*. An OTU with 99.6% 16S rRNA identity to the marine-invertebrate endobiont *Endozoicomonas elysicola* (Schuett et al., 2007) was associated with anemones from Clinton Harbor, CT and Sippewissett Marsh, MA, representing 0.9 and 4.9% of cloned sequences from 16S rRNA gene libraries, respectively, and 23% of bacterial cDNA clones representing expressed rRNA from Sippewissett Marsh *N. vectensis* (**Figure 3**, cluster 2). In addition, an OTU sharing 99.6% rRNA identity with isolates of *Pseudomonas pseudoalcaligenes* (now reclassified as *P. oleovorans*, Saha et al., 2010), a widely distributed environmental bacterium and an opportunistic pathogen (Gilardi, 1972; Yamamoto et al., 2000), was recovered from anemones from Mahone Bay (**Figures 2A**, **3**), representing 1.9% of cloned sequences.

*N. vectensis* from MB and CT revealed a surprisingly low diversity of associated bacterial types (Chao1 3.5 and 4, respectively) dominated by the Campylobacterales OTU, while anemones from Sippewissett Marsh harbored a higher diversity (Chao1 78 and 1059). There is no obvious explanation for the low diversity of microbial sequences from anemones from Clinton Harbor, CT and Mahone Bay, NS relative to anemones from Sippewissett Marsh, MA. Comparison of sequence types associated with Sippewissett Marsh sediments collected in November 2008 (Chao1 283) with sequences recovered from the anemones collected at the same location and time (November 2008) reveals a similar distribution of Cyanobacterial and chloroplast sequences (**Figure 2A**), suggesting some sequence richness in anemones from Sippewissett Marsh may be due to a contribution of sediment-associated bacteria. Moreover, a high proportion of sequences (23%) associated with Sippewissett Marsh *N. vectensis* (November 2008) were from diatom chloroplasts. We have observed diatoms attached to the external body wall of *N. vectensis* (**Figures 1C,D**). Comparisons between anemone and sediment clone libraries cannot be made for the CT and MB samples, as sediments at these locations were not collected. Notably, the previously described OTUs associated with *N. vectensis* at multiple salt marshes (i.e., sequences similar to Campylobacterales, *Endozoicomonas elysicola*, or *Pseudomonas oleovorans*) were not observed in Sippewissett Marsh sediments, although rarefaction analysis indicated that the sediment clone library diversity was not sampled to saturation (**Figure 2B**).
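The Chao1 richness estimates quoted above extrapolate from the number of rare OTUs in a library. The text does not state which Chao1 variant or software was used, so the sketch below is only an illustration: it uses the classic form, S_obs + F1²/(2·F2), and falls back to the bias-corrected form when no doubletons are present. The clone counts are hypothetical.

```python
from collections import Counter
from typing import Iterable


def chao1(otu_counts: Iterable[int]) -> float:
    """Chao1 richness estimate from per-OTU clone counts."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)            # observed number of OTUs
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]      # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical library dominated by one OTU, with one doubleton and one singleton.
print(chao1([98, 2, 1]))
```

Because the estimate is driven by singletons and doubletons, a library dominated by one ribotype yields a value close to the observed OTU count, whereas a library with many singletons can yield an estimate far above it.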

## **Molecular Diversity of Microbiota Associated with Laboratory-reared** *N. vectensis*

Laboratory-raised *N. vectensis* were associated with a similar magnitude of bacterial diversity as anemones collected from the field (Chao1 80; **Table 1**); however, the bacterial community composition was notably different. Laboratory-reared anemones were associated with a majority of γ-Proteobacterial sequences (75%), in contrast to field-collected anemones that were associated with ≤16% γ-Proteobacterial sequences (**Figure 2A**). Only two microbial OTUs observed in wild anemones were also recovered from the laboratory-reared *N. vectensis*: *Pseudomonas oleovorans* and a novel Spirochete OTU (92.8% 16S rRNA identity with an uncultured clone from a deep-sea coral) that was also recovered from the Sippewissett Marsh sediment clone library.

## **Species-specific PCR of** *N. vectensis* **Microbial Associates**

To determine whether the four OTUs associated with *N. vectensis* in multiple clone libraries during Summer/Fall 2008 (i.e., the Campylobacterales and Spirochete OTUs, and the *Endozoicomonas elysicola*- and *Pseudomonas oleovorans*-like OTUs) remained associated with both laboratory-reared and Sippewissett Marsh *N. vectensis* collected in June 2009 (representing a timespan of 9–12 months), as well as to screen for their presence in the surrounding marsh habitat (sediment and water), specific PCR assays were designed for each OTU. These analyses confirmed that the Campylobacterales and Spirochete OTUs and *Endozoicomonas elysicola* remained associated with both laboratory-reared and field-collected anemones in June 2009 (**Figure 4**); however, *Pseudomonas oleovorans* amplicons were not recovered from the field-collected anemone DNA. Marsh water DNA yielded amplicons of the expected size for all four sequence types, although PCR inhibition was apparent from the reproducibly faint band of the 16S rRNA amplicon from universal eubacterial primers, relative to other environments. In contrast, surface sediments did not appear to be associated with any of these four OTUs, and a positive signal for the universal 16S rRNA amplicon indicated PCR inhibition was not a confounding factor in the sediment analysis (**Figure 4**). Non-detection of Campylobacterales, *Endozoicomonas elysicola*, and *Pseudomonas oleovorans* amplicons in the sediment sample (June 2009) is consistent with their absence from the clone library prepared from surface sediments collected in November 2008 (data not shown). This suggests that the association of these ribotypes with *N. vectensis* may be more specific than ingestion of detritus or attachment of the surrounding sediment to the anemone surface. However, the absence of a Spirochete OTU-specific PCR amplicon from marsh sediment was surprising, and may be due to a low concentration of this population's DNA in the sediment in June 2009, in contrast to November 2008 when a single sequence was observed in the sediment clone library.

#### **FIGURE 3 | Continued**

colors correspond to sample origin: Salt marsh collected anemones (green), laboratory-acclimated anemones (blue) or from both field and lab-acclimated anemones (red). Chart to the right of tree specifies the specific sample and date of sequence origin. Phylum is indicated at the far right and corresponds to the legend on the figure. Highlighted sequence clusters correspond to taxa discussed in this study (1) *Pseudomonas oleovorans*, (2) *Endozoicomonas* spp., (3) *Limnobacter* spp., (4) *Stappia* spp., (5) *Rhizobium radiobacter*, (6) uncultured Campylobacter lineage (note: the origin of the "termite group" sequence FJ202415 is the coral *Orbicella (Montastrea) faveolata*), (7) uncultured OTUs with highest sequence identity to a coral-derived Spirochete.

## **Diversity of Isolates Cultured from** *N. vectensis*

Seven different media combinations were used to isolate a total of 132 bacterial strains from the field anemones and 511 bacterial strains from anemones maintained in the lab (**Table 2**). These strains were classified by 16S rRNA RFLP and sequencing and corresponded to a total of 19 different ribotypes, among which 17 ribotypes were recovered from the field and 5 ribotypes from the lab (**Table 2**). Types recovered from multiple samples from the field generally did not overlap, while similar types (dominated by *Pseudomonas oleovorans* and *Rhizobium radiobacter*) were recovered from all laboratory samples on all media formulations. There was little overlap between populations recovered from lab vs. field; notable exceptions included a *V. furnissii*-like ribotype. Most isolates from field-collected anemones shared >95% nucleotide similarity with isolates or cloned sequences obtained from other anthozoans (primarily stony corals), suggesting potentially conserved mechanisms for association with anthozoans. Three ribotypes observed in the 2010 culture collection matched ribotypes recovered from an earlier survey of culturable diversity associated with laboratory-raised anemones (in 2008 and 2009) where isolates were recovered on 2216 media. Quantitative data on strain distribution from this 2008 study are not available; however, isolates of *P. oleovorans*, *Limnobacter thiooxidans*, and *R. radiobacter* were archived during this study and serve as a reference for strains isolated in 2010. Bacterial isolates observed in multiple samples or with similarity to associates of other anthozoan hosts were selected for additional physiological and genomic characterization. These included *Pseudomonas oleovorans* isolated from laboratory-acclimated *N. vectensis* in 2008 (Po-B4) and 2010 (Po-Gab and Po-Is) and strain Po47 from anemones donated by John Finnerty's laboratory (Boston University, 2010) (**Figure 3**, cluster 1), and *R. radiobacter* isolated from lab-acclimated anemones in 2008 (Rr-D5 and Rr-D8) and 2010 (Rr-Is) (**Figure 3**, cluster 5). In addition, two *Limnobacter* isolates from salt marsh-collected *N. vectensis* in 2010 (Lt-F1 and Lt-FCMA) matched sequences from isolates of laboratory-acclimated anemones obtained in 2008 (**Figure 3**, cluster 3) and a single *Stappia* isolate was selected that matched a cloned sequence from a stony coral (Ss-F1) and sequences from isolates that were subsequently recovered from anemones collected at Belle Island Marsh, near Boston MA in 2012 (**Figure 3**, cluster 4). Isolates from the genus *Vibrio* (in particular *V. furnissii*) were excluded from this analysis because of their ubiquitous recovery from coastal environments and the high coverage of this particular genus and species in characterized culture collections and among genome-sequence repositories. Notably, despite varied cultivation methods (including targeting aerobic, anaerobic and microaerophilic growth), no isolates of the Campylobacterales, *Endozoicomonas*, or Spirochete OTUs associated with multiple anemone samples in the cultivation-independent characterization were recovered during the isolations.

**TABLE 2 | Summary of bacterial isolates recovered from** *Nematostella vectensis* **(Nv) maintained in the laboratory or collected from Sippewissett Marsh (March 2010).**

Ten bacterial strains isolated from *N. vectensis* were subjected to a suite of physiological tests to characterize their optimal growth conditions (**Table 3**). All of the isolates grew on heterotrophic media at 28 °C under aerobic conditions with colonies evident after 24–48 h (*Pseudomonas, Rhizobium, Stappia*) or 72 h (*Limnobacter*). All strains grew over the pH range of 6–10 with optimal growth at pH 7–8, were catalase positive and Gram-negative, did not exhibit hemolysis on blood agar plates, and exhibited a range of heterotrophic growth under microaerophilic conditions. All strains exhibited directional motility when observed by light microscopy. None of the strains were observed to grow chemoautotrophically with Na2S or Na2S2O3 as an electron donor after 1 week. The *Pseudomonas oleovorans* cells were 0.5 × 1–1.5 μm rods forming 0.5 to 1 mm diameter colonies on 2216 media with variable opacity and texture after 2 days (**Table 3**). Growth was observed from 16 to 45 °C with optimal growth at 28 and 37 °C. Salinity tolerance ranged from 0 to 5% (optimal 2–3%). *Pseudomonas* strains were resistant to multiple tested antibiotics (nalidixic acid, chloramphenicol, and ampicillin). *R. radiobacter* isolates had variable cell size, from short and stout rods (0.5–0.7 μm × 0.7–1 μm) to slender rods (0.7 × 2–2.5 μm), and formed punctate opaque colonies after 2 days on 2216 media. Growth was observed from 16 to 37 °C with optimal growth at 28 °C and variable growth at 45 °C. Salinity tolerance varied by strain, with all strains growing from 0 to 3% (optimal 2–3%) and strain Rr-D5 tolerating up to 7% salinity. All *Rhizobium* strains were resistant to streptomycin and exhibited variable resistance to other tested antibiotics, with strain Rr-D5 exhibiting resistance to all five antibiotics tested. *Limnobacter thiooxidans* strains were motile rods (0.5 × 1–1.5 μm) and formed 0.5–1 mm translucent colonies with variable texture after 72 h. Growth after 4 days occurred from 22 to 37 °C with variable growth at 16 °C. The optimal salinity for growth was 2% and strains varied in salinity tolerance (1–2% for strain Lt-F1 and 0–3% for strain Lt-FCMA). *Limnobacter thiooxidans* strains were sensitive to all five antibiotics tested. The *Stappia stellulata* isolate formed motile rods (0.5 × 1.5–2 μm) and grew from 16 to 45 °C (optimal 22–37 °C) and at salinities from 2 to 5% (optimal 3%). This strain was resistant to nalidixic acid and streptomycin.

## **Characterization of Genomes from Bacterial Isolates**

Estimated sequence coverage for isolate genomes ranged from 15.5x to 82.1x, although none of the genomes could be closed (**Table 4**, **Figure 5**). Annotation of genomes with the RAST pipeline revealed multiple pathways for utilization of carbohydrates and proteins, consistent with observed heterotrophic growth on complex media. To identify potential mechanisms for host association, genes with homology to virulence factors were identified through review of the RAST annotations and by homology to the virulence factor database VFDB (Chen et al., 2012). ORFs homologous to virulence factors commonly associated with both pathogenic and nonpathogenic Proteobacteria were observed in all isolates, including genes mediating expression of flagellar motility and chemotaxis, general secretion (type II), type IV pili, fimbriae, iron transport, hemolysis (hlyA, B, D), siderophore biosynthesis, and superoxide dismutase (sodAB).
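The coverage and GC statistics reported for each genome can be illustrated with a short sketch. The source does not state how coverage was estimated, so the definition below (total sequenced bases divided by assembly size) is an assumption, and the function names and toy inputs are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous (A/C/G/T) bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    return 100.0 * gc / called if called else 0.0


def estimated_coverage(total_read_bases: int, assembly_size_bp: int) -> float:
    """Mean fold coverage, assuming coverage = sequenced bases / assembly size."""
    if assembly_size_bp <= 0:
        raise ValueError("assembly size must be positive")
    return total_read_bases / assembly_size_bp


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not values from the study.
    print(f"GC: {gc_content('GGCCAATTNN'):.1f}%")
    print(f"coverage: {estimated_coverage(1_000_000, 100_000):.1f}x")
```

Ambiguous bases are excluded from the GC denominator so that gaps in a draft assembly do not depress the estimate.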

*Pseudomonas oleovorans* genomes were most similar to sequenced genomes of *P. mendocina* strains ymp and NK-01 (Guo et al., 2011) (**Figure 5**). The predicted genome size was 5.20–5.41 Mb with 15.5–25.3x coverage and 64.1–64.9% GC content. The average nucleotide identity of orthologs shared between *Pseudomonas* isolates obtained from *N. vectensis* ranged from 94.91 to 97.08%. In addition to the virulence factors described above, BLASTX searches against the virulence factor database revealed hits to urease and to one gene in the type VI secretion pathway (icmF).

*Rhizobium* genomes were most similar to sequenced strains of *Agrobacterium tumefaciens* (revised name *R. radiobacter*) (**Figure 5**). The predicted genome sizes were 5.38–5.49 Mb with 15.8–20.5x coverage and 59.1–59.3% GC. The average nucleotide identity of orthologs shared among the isolates from *N. vectensis* ranged from 95.5 to 97.46%. In addition to the common Proteobacterial virulence factors identified above, *Rhizobium* strains included proteins with significant similarity to ureases, type VI secretion proteins (vgrG and IcmF), antibiotic resistance proteins (including tetAB), and proteins involved in isochorismate and salicylate biosynthesis, which are linked to production of bioactive compounds.
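The average nucleotide identity (ANI) ranges quoted for each group of isolates summarize per-ortholog identities across genome pairs. The sketch below shows one way such a range could be derived; it assumes ANI is the unweighted mean of percent identities over shared orthologs, which the source does not specify, and the strain pairs and identity values are hypothetical.

```python
from statistics import mean


def average_nucleotide_identity(ortholog_identities):
    """Unweighted mean percent identity across shared orthologs of one genome pair."""
    if not ortholog_identities:
        raise ValueError("no shared orthologs for this genome pair")
    return mean(ortholog_identities)


def ani_range(pairwise):
    """Return (min, max) ANI over all genome pairs in a {pair: identities} mapping."""
    values = [average_nucleotide_identity(v) for v in pairwise.values()]
    return min(values), max(values)


if __name__ == "__main__":
    # Hypothetical per-ortholog identities (percent) for three strain pairs.
    toy = {
        ("strain_A", "strain_B"): [96.0, 94.5, 97.5],
        ("strain_A", "strain_C"): [98.0, 97.0],
        ("strain_B", "strain_C"): [95.0, 96.0],
    }
    low, high = ani_range(toy)
    print(f"ANI range: {low:.2f}-{high:.2f}%")
```

A length-weighted mean would be an equally reasonable choice when ortholog lengths vary widely.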

The *Limnobacter thiooxidans* genomes were most similar to the partially assembled genome of *Limnobacter* strain MED105 and the completed genomes of the Betaproteobacteria strains *Burkholderia cenocepacia* AU1054 and *Ralstonia solanacearum* GMI1000 available in the RefSeq database (**Figure 5**). The predicted genome size was 3.21 and 3.45 Mb for strains Lt-FCMA and Lt-F1 with 38.3x and 82.1x coverage and GC content of 52.3 and 51.7%, respectively. The average nucleotide identity among shared gene orthologs from the *Limnobacter* genomes was 92.5%. Analysis of genome annotations revealed genes for lithotrophic sulfur oxidation (Sox genes), while genes for carbon fixation to enable autotrophic growth were not evident. This observation was consistent with physiological characterization indicating that *Limnobacter* strains did not grow in the absence of exogenously supplied organic carbon, and these strains may thus be mixotrophic. The virulence factors in the *Limnobacter* genomes include the common Proteobacterial factors as well as genes with homology to a beta-lactamase involved in antibiotic resistance and a salicylate synthetase. In addition, several genes were identified as homologs of type III secretion proteins, although annotation of the corresponding open reading frames in RAST indicated that at least some of these genes may be mis-annotated flagellar genes, and further work is needed to confirm this result.

#### **FIGURE 5 | Continued**

genomes the plots depict pairwise comparison of strain Lt-F1 to each of the following four genomes [*Limnobacter* spp. MED105, Strain Lt-FCMA (this study), *Burkholderia cenocepacia* AU1054, and *Ralstonia solanacearum* GMI1000]. Colored bars stacked to comprise the concentric rings represent shared ORFs (determined by bi- and uni-directional BLAST analysis) and the color represents the average protein sequence similarity between orthologs with the color scale representing the range of this value.

The genome from the single *Stappia stellulata* strain was 4.42 Mb with 25.3x coverage and 65.3% GC. Like the *Limnobacter* strains, this *Stappia* isolate was obtained from anemones collected in the field. The *Stappia* genome annotations indicate a complete pathway for oxidation of reduced sulfur compounds (Sox genes), as well as genes for utilizing carbon monoxide and aromatic compounds as electron donors for growth. In addition to the common Proteobacterial virulence factors identified above, the *Stappia* strain included proteins with significant similarity to ureases and antibiotic resistance proteins.

## **No Observed Evidence for Horizontal Gene Transfer between Genome-sequenced Bacterial Lineages or with the** *N. vectensis* **Host**

Comparison of predicted phage-like or mobile genetic elements and high-identity DNA sequences by BLASTN and BLASTP revealed no evidence of shared genetic elements with high nucleotide identity, suggesting no recent horizontal gene transfer among bacterial lineages. Bacterial genome ORFs with BLASTP annotations from the NCBI NR Database were imported into MEGAN for taxonomic binning using the Lowest Common Ancestor algorithm (Huson et al., 2007). As expected, the majority of the assigned ORFs binned within the assigned Class of the bacterial isolate. In addition, ORFs within the *Pseudomonas* and *Rhizobium* strains were classified as viral in origin, consistent with identification of several likely bacteriophage and phage-related genes. Surprisingly, four ORFs from the *P. oleovorans* strains were classified as being of *N. vectensis* origin. While it may be possible that genes have been horizontally transferred between the anemone and bacteria, in either direction, the most parsimonious explanation is that the *N. vectensis* reference genome is contaminated with *Pseudomonas* DNA incorrectly annotated as cnidarian, as recently described by Artamonova and Mushegian (2013). To examine this further, we compared the *P. oleovorans* and *N. vectensis* genomes by nucleotide BLAST, which revealed *Pseudomonas* DNA on 101 *N. vectensis* genome scaffolds that were on average about 20 kb shorter than the average scaffold and contained almost 80% ambiguous nucleotides, with an average GC content of 60%, similar to that of the sequenced *P. oleovorans* genomes (64.1–64.9%). BLASTN of the *N. vectensis* genome with the *Rhizobium*, *Limnobacter*, or *Stappia* genomes sequenced in this study revealed an additional 6, 4, and 3 *N. vectensis* genome scaffolds, respectively, that likely derive from bacterial contaminants. All *N. vectensis* scaffolds identified with likely bacterial sequence by this approach are indicated in Supplementary Table 1. On average, proteins shared between the *P. oleovorans* strains reported in this study and the sequences identified as originating from *Pseudomonas* in the *N. vectensis* genome had 75% amino acid similarity (range 25–98%).
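The scaffold properties used above to support a contamination interpretation (a high fraction of ambiguous nucleotides and a bacterial-like GC content) are simple to compute per scaffold. The following is a minimal sketch, not the authors' pipeline; the function names, thresholds, and toy scaffolds are hypothetical and chosen only for illustration.

```python
def scaffold_stats(seq: str) -> dict:
    """Length, fraction of ambiguous (non-ACGT) bases, and GC% of called bases."""
    seq = seq.upper()
    length = len(seq)
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    gc = sum(1 for base in seq if base in "GC")
    called = length - ambiguous
    return {
        "length": length,
        "ambiguous_frac": ambiguous / length if length else 0.0,
        "gc_pct": 100.0 * gc / called if called else 0.0,
    }


def flag_suspect(scaffolds: dict, min_ambiguous: float = 0.5, min_gc: float = 55.0) -> list:
    """Names of scaffolds that are both gap-rich and GC-rich (illustrative thresholds)."""
    flagged = []
    for name, seq in scaffolds.items():
        stats = scaffold_stats(seq)
        if stats["ambiguous_frac"] >= min_ambiguous and stats["gc_pct"] >= min_gc:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Toy scaffolds, not taken from the N. vectensis assembly.
    toy = {
        "scaffold_gappy_gc_rich": "GCGGCC" + "N" * 24,
        "scaffold_typical": "ATGCATTAATGCATATTAGC",
    }
    print(flag_suspect(toy))
```

Such a screen would only shortlist candidates; sequence similarity to bacterial genomes, as in the BLASTN comparison described above, remains the actual evidence.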

## **Preparation of a** *N. vectensis* **Laboratory Holobiont Metatranscriptome**

Because analysis of 16S rRNA genes in cDNA from field-collected *N. vectensis* revealed expressed ribosomal sequences from several candidate symbionts, including Campylobacterales and *Endozoicomonas* ribotypes, we sought to optimize protocols for further metatranscriptomic analysis. To this end we conducted a pilot metatranscriptome study to sequence enriched mRNA from RNA extracted from laboratory-raised anemones. After processing sequence data to remove ribosomal contamination and QC filtering (removal of Illumina adaptors and low-complexity sequences; **Table 5**), sequences from the different treatments were combined, yielding a final total of 529,425 sequence pair units. Of the treatments examined, the RNA sample processed with the MICROBEnrich/MICROBExpress + mRNA-only kits for rRNA depletion performed best in terms of sequence yield, with 246,506 out of 653,926 sequences identified as putative mRNAs (37.7%) compared to the unprocessed control (8.24%). In the absence of replication, these observed differences between approaches are purely anecdotal. Data from this initial screen, and from published studies (He et al., 2010; Stewart et al., 2010), supported adoption of the mRNA-only + MICROBEnrich/MICROBExpress protocol for future work, which has since yielded similar proportions of sequences and successful enrichment of bacterial mRNAs among complex targets (Penn et al., 2014).
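The yield comparison above reduces to a ratio of putative mRNA sequences to total sequences. The sketch below reproduces that arithmetic from the counts quoted in the text; the function name is hypothetical, and the control value is taken as the reported percentage because its underlying counts are not given here.

```python
def putative_mrna_percent(putative_mrna: int, total_sequences: int) -> float:
    """Percent of sequences retained as putative mRNA after rRNA removal."""
    if total_sequences <= 0:
        raise ValueError("total_sequences must be positive")
    return 100.0 * putative_mrna / total_sequences


# Counts reported for the MICROBEnrich/MICROBExpress + mRNA-only treatment.
enriched_pct = putative_mrna_percent(246_506, 653_926)

# Reported percentage for the unprocessed control (counts not listed in the text).
control_pct = 8.24

# Fold difference between treatment and control; anecdotal without replication.
fold_enrichment = enriched_pct / control_pct

if __name__ == "__main__":
    print(f"enriched: {enriched_pct:.1f}%  control: {control_pct:.2f}%")
    print(f"fold difference: {fold_enrichment:.2f}")
```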

## **Assembly and Annotation of Holobiont Metatranscriptome Sequences**

Assembly of sequence pair units yielded 7296 contigs, of which 3422 and 2809 were classified as Eukaryotic and *N. vectensis*, respectively, through comparison to the NCBI non-redundant protein database. Ten contigs assigned to the Bacteria were analyzed more closely by BLASTN and BLASTX. One assembled contig of 528 nt derived from 12 sequence pairs had 97% nucleotide identity to the *Vibrio campbellii* outer membrane protein OmpU (average coverage of 2.46X). The remaining contigs were either *N. vectensis*-like or showed no higher than 40% amino acid identity to predicted proteins, precluding annotation.

## **Taxonomy of Individual Metatranscriptome Sequences**

Individual sequence pair units were compared against the NCBI non-redundant protein database using BLASTX. Slightly less than half of the sequences shared significant similarity with database proteins (i.e., 259,746 database matches) and were assigned to taxonomic groups using MEGAN. Consistent with assembled contigs, the majority of sequences with database matches were "Cnidarian", corresponding to the host anemone taxonomy (77.5% of assigned sequences) (**Figure 6**).


**TABLE 5 | Processing of** *N. vectensis* **metatranscriptomes to remove ribosomal RNAs and low-quality sequences.**

*ITS, Internal Transcribed Spacer.*

*aThe processes implemented for depletion of rRNA and non-bacterial mRNA included treatment of total RNA with: (1) RNAseH after hybridization with DNA oligos targeting specific conserved regions of rRNA; RNAseH is an endonuclease that specifically degrades RNA in RNA:DNA hybrids, (2) the MICROBEnrich™ Kit (Ambion Part No. AM1901) and MICROBExpress™ Kit (Ambion Part No. AM1905), a pair of kits that rely on a novel capture oligo hybridization protocol to selectively remove eukaryotic rRNA and bacterial rRNA, respectively, (3) mRNAOnly reagent (Epicenter), an endonuclease-based method that selectively degrades RNAs with 5′-monophosphates, (4) duplex-specific nuclease (DSN) treatment after hybridization with DNA oligos targeting specific conserved regions of rRNA; DSN specifically degrades dsDNA and DNA in DNA:RNA hybrids, and (5) Poly(A)purist kit that relies on use of oligo(dT) cellulose to preferentially bind Poly(A) tails of eukaryotic mRNA. Treatments were used in combinations specified above and kits were implemented according to the manufacturer's protocols.*

*bA sequence pair unit can be one of three things: (1) A sequence pair whose ends have both made it through filtering. (2) A pair of sequences merged into one sequence because of shared overlapping sequence. (3) A pair of sequences clipped to one sequence because of adaptor contamination.*

The other top assignments were Metazoan taxa (12.6% of assigned sequences), Opisthokonta (2.2%), and Eukaryota (5.6%) suggesting that *>*90% of the expressed non-ribosomal sequences from the holobiont derived from the host anemone. Sequences similar to microbial eukaryotes each corresponded to *<*0.15% of sequences. Of the 1746 sequences annotated as bacterial (0.67% of sequences with database matches), 1308 corresponded to Proteobacteria (75%), followed by unclassified bacteria (13%), Actinobacteria (6.4%) and Firmicutes (3.9%) (**Figure 6**).
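The proportions in this section follow directly from the reported counts, and recomputing them is a quick consistency check. The sketch below uses only numbers quoted in the text; the variable names are hypothetical.

```python
# Counts and percentages quoted in the text.
total_units = 529_425          # sequence pair units after QC
with_db_match = 259_746        # units with a significant BLASTX match
bacterial = 1_746              # matched units annotated as bacterial
proteobacteria = 1_308         # bacterial units assigned to Proteobacteria

pct_matched = 100.0 * with_db_match / total_units
pct_bacterial = 100.0 * bacterial / with_db_match
pct_proteobacteria = 100.0 * proteobacteria / bacterial

# Share of assigned sequences placed in Cnidaria or Metazoa (reported percentages).
host_like_pct = 77.5 + 12.6

if __name__ == "__main__":
    print(f"matched: {pct_matched:.1f}% of all units")
    print(f"bacterial: {pct_bacterial:.2f}% of matched units")
    print(f"Proteobacteria: {pct_proteobacteria:.1f}% of bacterial units")
    print(f"Cnidaria + Metazoa: {host_like_pct:.1f}% of assigned units")
```

The recomputed values agree with "slightly less than half", 0.67%, and 75% as stated.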

## **Recruitment of Metatranscriptome Sequences to Bacterial Isolates**

Metatranscriptome sequences were mapped as unpaired reads to the sequenced and annotated genomes of the 10 cultured *N. vectensis*-associated bacteria. Twenty-four gene families (COG/NOG) from the *Limnobacter* genomes matched sequences in the metatranscriptome with 95–100% sequence identity over at least 200 bp of consensus sequence (**Table 6**). Expressed ORFs from the *Pseudomonas, Rhizobium*, or *Stappia* genomes were not detected by this approach. The most highly represented *Limnobacter* gene among metatranscriptome sequences (3.6X coverage of a 489 bp ORF with 100% identity between the consensus sequence and the Lt-FCMA genome) was predicted to belong to the phasin protein family (NOG45042), a group of proteins responsible for the synthesis and structure of Poly 3-hydroxyalkanoate (PHA) granules (**Table 6**). Expression of a predicted PHA synthase (COG3243) that participates in PHA granule formation was also detected in the metatranscriptome (99% sequence identity over a 202 bp region of a 1794 bp ORF). Predicted functions for other *Limnobacter* ORFs with metatranscriptome matches include a phosphatase (COG3211), a transporter for phosphate (COG0226), a transporter for iron (COG1629), a TonB-dependent siderophore receptor (COG4774), and a flagellar motility protein (COG2063) (**Table 6**).
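The recruitment criteria described above (at least 95% identity over at least 200 bp of consensus sequence) amount to a simple filter on per-ORF mapping summaries. The sketch below is an illustration under stated assumptions rather than the mapping software used in the study: the `Recruitment` record, its field names, and the `mapped_bases` values are hypothetical, and coverage is assumed to be mapped bases divided by ORF length.

```python
from dataclasses import dataclass


@dataclass
class Recruitment:
    orf_id: str
    orf_length: int        # bp
    consensus_length: int  # bp of consensus sequence built from mapped reads
    identity_pct: float    # consensus vs. reference ORF
    mapped_bases: int      # total read bases mapped to the ORF (hypothetical field)


def passes_threshold(hit: Recruitment, min_identity: float = 95.0,
                     min_consensus: int = 200) -> bool:
    """Apply the identity and consensus-length cutoffs described in the text."""
    return hit.identity_pct >= min_identity and hit.consensus_length >= min_consensus


def mean_coverage(hit: Recruitment) -> float:
    """Mean fold coverage of the ORF, assuming mapped bases / ORF length."""
    return hit.mapped_bases / hit.orf_length


if __name__ == "__main__":
    hits = [
        # ORF lengths, identities, and the 202 bp region are from the text;
        # the remaining values are placeholders for illustration.
        Recruitment("phasin_NOG45042", 489, 489, 100.0, 1760),
        Recruitment("PHA_synthase_COG3243", 1794, 202, 99.0, 202),
        Recruitment("hypothetical_short_match", 900, 150, 99.0, 150),
        Recruitment("hypothetical_divergent_match", 900, 400, 90.0, 400),
    ]
    for hit in filter(passes_threshold, hits):
        print(hit.orf_id, f"{mean_coverage(hit):.1f}x")
```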

## **Discussion**

Symbiotic bacteria associated with cnidarians have recently become focal points for research to understand the roles of these microorganisms in the health and disease of their hosts. As a model cnidarian, *N. vectensis* and its bacterial associates represent a tractable system for examining potential mechanisms for microbial persistence in the holobiont. While previous research had provided evidence of microbial contamination of the *N. vectensis* genome (Starcevic et al., 2008; Har, 2009; Artamonova and Mushegian, 2013), the research presented here is the first to document the diversity of microbial associates in wild and laboratory-raised *N. vectensis* and to describe the physiological and genomic variation of culturable microbes associated with this anemone. Through analysis of 16S ribosomal RNA clone libraries we have observed that bacterial OTUs of a novel Campylobacterales spp. as well as *Endozoicomonas elysicola* are associated with *N. vectensis* in geographically distinct salt marshes, while a novel Spirochete OTU and *Pseudomonas oleovorans* have been observed in *N. vectensis* collected from both the field and the laboratory. Species-specific PCR indicates that these populations may persist in the holobiont of laboratory-acclimated *N. vectensis* for at least 9 months (**Figure 4**) after transfer from their natural salt marsh habitat. Similarly, isolation of *Limnobacter* strains from both field-collected and laboratory-raised anemones suggests these strains are also able to persist in the holobiont from the field to the laboratory, although their absence from clone libraries suggests that these are not dominant taxa in either environment. Isolation of *Stappia* strains from *N. vectensis* collected from different Massachusetts marsh sites, and of *R. radiobacter* strains from laboratory-acclimated *N. vectensis* over a 2-year timeframe, suggests these two populations may be stable associates of *N. vectensis* in the salt marsh and laboratory environments, respectively. Thus, we hypothesize that these populations are *N. vectensis* symbionts due to their apparently stable association with the anemone (de Bary, 1879; Chaston and Goodrich-Blair, 2010).

#### **TABLE 6 | Summary of open reading frames from the** *Limnobacter thiooxidans* **genomes that recruited** *>***200 bp of sequence data from the pilot metatranscriptome.**

*\*Paired sequences from the metatranscriptome were mapped as single reads to the open reading frames of the two sequenced Limnobacter associates. Mapping results are reported where the consensus sequence* ≥*200 bp. Open reading frames were annotated using the COG and NOG subsets of the eggNOG database (version 3.0).*

## *N. vectensis***-associated Bacteria are Closely Related to Coral and Sponge Associates**

Association of strains closely related to the *N. vectensis* symbionts described in this study with other marine Cnidarians or Porifera suggests that these bacteria may be adapted to life in association with early diverging Metazoan hosts (species in the phyla Cnidaria and Porifera). The Campylobacterales population that is the most abundant bacterial associate of *N. vectensis* is most closely related to an uncultured sequence from the Caribbean coral *Montastraea (Orbicella) faveolata* (97% 16S rRNA identity) (Sunagawa et al., 2009). Similarly, the novel Spirochete OTU is most closely related (92.8% 16S rRNA identity) to a deep-sea coral clone (Kellogg et al., 2009). *Pseudomonas oleovorans (pseudoalcaligenes)* (99.6% 16S rRNA identity) has been isolated from the marine sponge *Ianthella basta* (Cervino et al., 2006), is widely distributed in the terrestrial and marine environment (Nishino and Spain, 1993; Quinteira et al., 2005), and is regarded as an opportunistic pathogen of humans (Gilardi, 1972) and other animals (Yamamoto et al., 2000). Finally, recent studies have shown that *Endozoicomonas elysicola*-like bacteria are associated with marine invertebrates including a wide diversity of Cnidarians. Sequences with high ribotype identity (≥97%) with *Endozoicomonas elysicola* have been found in three sea anemones: *N. vectensis* (this study), *Metridium senile* (Schuett et al., 2007), and *Anthopleura midori* (Du et al., 2010). In addition, populations of *Endozoicomonas* spp. are found at high proportion across multiple types of corals (Raina et al., 2009; Sunagawa et al., 2010; Yang et al., 2010; Morrow et al., 2012; Pike et al., 2013; Bayer et al., 2013a,b; Morrow et al., 2014; Neave et al., 2014) and other marine invertebrates (e.g., the sea slug *Elysia ornata*, Kurahashi and Yokota, 2007).

The culturable symbionts analyzed by genome sequencing are also closely related to sequences and isolates recovered from coral and sponge holobionts. *Stappia stellulata* strains have been recovered from a wide diversity of marine invertebrates (Boettcher et al., 2000; Weber and King, 2007) and the isolate derived from *N. vectensis* in this study matched a ribotype found in a Black Band Diseased coral *Siderastrea siderea* (DQ446087) (Sekar et al., 2006). The sequenced strains of *R. radiobacter* are closely related to the agent of crown-gall disease in plants, which was formerly identified as and is still commonly called *Agrobacterium tumefaciens* (Young et al., 2001). A sequence closely related to the *R. radiobacter* isolates from our study was recovered from a survey of coral reef bacterioplankton (*>*98% to HQ443405) (Nelson et al., 2011) and appears to be widespread in marine environments (Engelhardt et al., 2013). Other members of the genus *Rhizobium* are well known for symbiotic nitrogen fixation in plants and are of emerging interest due to their potential role in nitrogen fixation within the coral holobiont (Lema et al., 2012). Bacterial isolates from the genus *Limnobacter* have been found in diverse environments including freshwater lake sediments, the surface waters of the Baltic and Mediterranean Seas, a volcanic deposition in Japan, and soils at a coal-mining site (Spring et al., 2001; Lu et al., 2011; Vedler et al., 2013; Poncelet et al., 2014). While no published studies indicate that animal association is common in this taxonomic group, we note that a strain of *Limnobacter thiooxidans* sharing *>*98% rRNA identity with the strains described in this study was isolated from the sponge *Haliclona simulans* in the South China Sea (FJ999570, unpublished study), and symbioses have been documented within other genera of the family Burkholderiaceae (e.g., Kim et al., 2013).

## **Potential Microbial Activities in the** *N. vectensis* **Holobiont**

Guided by the diversity of described marine microbial symbioses, we can pose several tentative hypotheses based on our data regarding the activities of microorganisms that associate with the *N. vectensis* holobiont. First, we suggest that the associations between the bacterial isolates characterized in this study and the anemone are facultative, based on their ease of cultivation and on the diversity and size of their genomic repertoires, which suggest that these particular strains have not experienced the overall genome size reduction that is characteristic of more obligate symbioses. In contrast, symbionts observed via 16S rRNA clone libraries that have resisted cultivation in this study may represent more fastidious or obligate associations and remain attractive targets for further work to uncover the mechanisms of association and persistence. Based on our current phylogenetic, genomic, and pilot-scale metatranscriptomic data we suggest several activities that may mediate survival and persistence of bacterial populations in the *N. vectensis* holobiont, namely (1) the use of alternative forms of energy generation (mixotrophy), (2) scavenging of nutrients (P and Fe), (3) storage of carbon, and (4) expression of mechanisms to resist chemical stressors. All of these factors have been identified as relevant to other host-microbe associations and are discussed in more detail below.

## **Sulfur Oxidation as a Potential Form of Mixotrophy in** *N. vectensis* **Microbiota**

Several members of the *N. vectensis* holobiont described in this study either contain genes for sulfur oxidation or are in a phylogenetic lineage that contains species that are known sulfur compound oxidizers. The closest culture-characterized relatives of the Campylobacterales OTU, numerically dominant in anemones collected from the salt marsh habitat, include a sulfur-oxidizing chemolithoautotroph (*Sulfurovum lithotrophicum*). Chemoautotrophic Epsilonproteobacteria that use reduced sulfur compounds as electron donors are found in symbiotic associations with animals in environments exposed to high fluxes of reduced sulfur compounds such as hydrothermal vents and salt marshes (Madrid et al., 2001). Genomes from *Limnobacter thiooxidans* and *Stappia stellulata*, isolated from field-collected anemones, reveal ORFs annotated as genes for sulfur oxidation (sox) but not autotrophic carbon fixation, suggesting these species may be able to utilize reduced sulfur compounds to supplement heterotrophic growth in the anemone holobiont. Despite observation of sox genes in the genome of *S. stellulata*, to our knowledge mixotrophic growth has not been reported for other strains of the species (Buchan et al., 2001; Weber and King, 2007). *Limnobacter thiooxidans* was originally described as a mixotroph (Spring et al., 2001) and this trait is observed in other members of the genus (Lu et al., 2011). Fluxes of sulfide are characteristic of the anemone's salt marsh habitat (Howes et al., 1985) and utilization of this alternative source of electrons for energy generation may promote persistence in the host during times of nutrient scarcity. In addition, oxidation of reduced sulfur compounds in the *N. vectensis* holobiont could increase holobiont fitness through detoxification of internal sulfide, or by fueling autotrophic production of microbial biomass as an internal food supply, as has been demonstrated in other marine microbe symbioses (Childress et al., 1991; Krueger et al., 1996; Freytag et al., 2001; Dubilier et al., 2008). Further work is warranted to investigate whether mixotrophic sulfide oxidation may play a similar role in the *N. vectensis* holobiont in its native salt marsh range.

## **Scavenging Nutrients**

The importance of the nutrients iron and phosphorus within the microbiota is suggested by analysis of genomes and metatranscriptomes. All *N. vectensis*-associated bacterial genomes revealed genes for the biosynthesis of high-affinity iron-binding compounds (siderophores); such compounds are well established as host association factors due to competition between the host and microbiota for bioavailable iron. Metatranscriptome sequences mapped with high stringency to *Limnobacter* ORFs predicted to encode proteins for nutrient scavenging, including two types of siderophore receptors (COGs 1629 and 4774) as well as an alkaline phosphatase and a phosphate transporter (COGs 3211 and 0226) that enable cleavage of phosphate groups from organic compounds followed by uptake. Iron and phosphorus are both essential nutrients, and enrichment/expression of nutrient transporters has been shown to correlate with environmental stress for the respective nutrient (Coleman and Chisholm, 2010; Harke and Gobler, 2013). As siderophores promote the survival of pathogens during infection they are widely identified as virulence factors; yet these compounds have been shown to play much broader ecological roles by controlling the dynamics of plankton populations in low-iron ocean regions and mediating ecological interactions among coastal bacterioplankton (Cordero et al., 2012) and coral reefs (Kelly et al., 2012).

#### **Resource Storage**

The *Limnobacter* ORF with the highest coverage in the *N. vectensis* holobiont metatranscriptome corresponded to a phasin protein in the gene family NOG45042, which regulates biosynthesis of Poly 3-hydroxyalkanoate (PHA) granules for intracellular storage of carbon (**Table 6**). A second ORF detected in the metatranscriptome corresponded to a PHA synthase (COG3243). PHA granules have recently been determined to play a critical role in the symbiosis of a Betaproteobacterial species (genus *Burkholderia*) with the bean bug *Riptortus pedestris* (Kim et al., 2013). The phasin protein was more highly expressed in bean bug-associated bacteria than in cultures of the *Burkholderia* strain (Kim et al., 2013). Evidence for the role of PHA in symbiosis was provided when genes for PHA synthesis were inactivated by mutagenesis, resulting in a reduced density of the *Burkholderia* population within the bean bugs; the mutant bacteria, in turn, became more vulnerable to osmotic, oxidative, nutrient, and temperature perturbations (Kim et al., 2013). This work suggests that PHAs mediate the persistence of bacterial cells under various environmental stresses, and it is possible that PHA granules play a similarly important role in *Limnobacter's* acclimation and persistence within the anemone holobiont.

## **Resistance to Chemical Stressors**

A *Vibrio* OmpU-like protein was the sole bacterial transcript assembled from the *N. vectensis* holobiont metatranscriptome. OmpU, an outer membrane porin, has been shown to modulate host and symbiont interaction in several vibrios, mediating colonization by the mutualist *V. fischeri* (Aekersberg et al., 2001) and virulence of the pathogens *V. splendidus* (Duperthuy et al., 2010) and *V. cholerae* (Provenzano and Klose, 2000). Loss of OmpU function in *V. splendidus* was associated with higher sensitivity to host-derived antimicrobial peptides (Duperthuy et al., 2010). Porins were also among the predicted cell-wall and membrane *Limnobacter* ORFs detected among metatranscriptome sequences (**Table 6**). Antibiotic resistance in bacteria is mediated by selective permeability at the cell wall and membrane, which is governed by porins as well as efflux pumps that control penetration of toxic compounds (e.g., antibiotics, antimicrobial peptides) to the interior of the bacterial cell (Yeaman and Yount, 2003; Piddock, 2006). Cnidarians are known to make a diverse array of antimicrobial compounds, and it has recently been shown that the model cnidarian *Hydra* regulates the composition of its microbiota through antimicrobial activity (Franzenburg et al., 2013a,b). Selective permeability of the bacterial cell wall may point to the importance of bacterial acclimation to the chemical environment of the anemone for persistence.

## **Conclusion**

We have used an integrated approach of cultivation-independent microbiota surveys, strain isolation, genome sequencing, physiological characterization, and holobiont metatranscriptomics to explore the diversity and activity of the microbiota associated with *N. vectensis* in both the field and laboratory setting. This work has enabled preliminary insights into both the biodiversity of the *N. vectensis* holobiont over space and time and the mechanisms by which bacteria may persist in association with the *N. vectensis* host. Predicted activities of *Limnobacter* ORFs detected in the *N. vectensis* metatranscriptome parallel activities noted as important in other established symbioses, including nutrient scavenging, selective permeability of the cell wall/membrane, and PHA granule formation, which may play a role in bacterial resistance to holobiont-associated stresses. In addition, mixotrophic use of reduced sulfur compounds as electron donors is a potential activity of bacteria that appeared to be stably associated with *N. vectensis* across multiple field sites (Campylobacterales OTU), and genes for this activity were detected in *Limnobacter* and *Stappia* isolates recovered from natural populations of *N. vectensis* in sulfide-rich salt marsh habitats. To better understand bacterial acclimation and persistence within cnidarian holobionts, further work should focus on organisms recovered from their natural habitats, with the additional goal of elucidating activities of microbial populations that resist culturing and may reflect more obligate associations within the holobiont.

#### **Nucleotide Accession Numbers**

The sequences obtained in this study have been deposited in GenBank under accession numbers HQ189546 to HQ189745. Genome sequences are deposited under BioProject number PRJNA281237; annotations are referenced by strain name and are publicly available via the RAST server.

## **Acknowledgments**

Funding for this work was provided to JT by the MIT Civil and Environmental Engineering Department. Additional funding was provided by the Seagrant Doherty Chair for Ocean Utilization (JT), a National Science Foundation graduate research fellowship (TH), and the National Research Foundation of Korea fellowship (JL). AMR was funded by NSF DEB-1545539 and NSF OCE-1536530. We would also like to thank the MIT Center for Environmental Health Sciences (US National Institute of Environmental Health Sciences NIEHS grant P30-ES002109) for core facility use and assistance with Illumina sequencing. We'd like to thank John Finnerty (Boston University) for sharing *N. vectensis* samples. Finally, we would like to thank Michael Meyers, Tzipora Wagner and members of the Thompson lab (MIT) for assistance with fieldwork and Ann Tarrant (WHOI) for helpful discussions regarding anemone husbandry.

## **Supplementary Material**

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb.2015.00818


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Har, Helbig, Lim, Fernando, Reitzel, Penn and Thompson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Wolbachia* is not all about sex: male-feminizing *Wolbachia* alters the leafhopper *Zyginidia pullula* transcriptome in a mainly sex-independent manner

#### *Hosseinali Asgharian<sup>1</sup>\*, Peter L. Chang<sup>1</sup>, Peter J. Mazzoglio<sup>2</sup> and Ilaria Negri<sup>2</sup>*

*<sup>1</sup> Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA <sup>2</sup> DISAFA - Department of Agricultural, Forest and Food Sciences, University of Torino, Grugliasco (TO), Italy*

#### *Edited by:*

*M Pilar Francino, Center for Public Health Research, Spain*

#### *Reviewed by:*

*Anna Carolin Frank, University of California Merced, USA Natacha Kremer, Université Claude Bernard Lyon 1, France*

#### *\*Correspondence:*

*Hosseinali Asgharian, Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, RRI 413M, 1050 Childs Way, Los Angeles, CA 90007, USA e-mail: asgharia@usc.edu*

*Wolbachia* causes the feminization of chromosomally male embryos in several species of crustaceans and insects, including the leafhopper *Zyginidia pullula*. In contrast to the relatively well-established ecological aspects of male feminization (e.g., sex ratio distortion and its consequences), the underlying molecular mechanisms remain understudied and unclear. We embarked on an exploratory study to investigate the extent and nature of *Wolbachia*'s effect on gene expression patterns in *Z. pullula*. We sequenced whole transcriptomes from *Wolbachia*-infected and uninfected adults. A total of 18,147 loci were assembled *de novo*, including homologs of several *Drosophila* sex determination genes. A number of transcripts were flagged as candidate *Wolbachia* sequences. Despite the resemblance of *Wolbachia*-infected chromosomal males to uninfected and infected chromosomal females in terms of sexual morphology and behavior, principal component analysis revealed that gene expression patterns did not follow these sexual phenotype categories. The principal components generated by differentially expressed genes specified a strong sex-independent *Wolbachia* effect, followed by a weaker *Wolbachia*-sexual karyotype interaction effect. Approaches to further examine the molecular mechanism of *Wolbachia*-host interactions are suggested based on the presented findings.

**Keywords:** *Wolbachia* **infection, male feminization, principal component analysis (PCA),** *Zyginidia pullula* **transcriptome, transcriptome de novo assembly, host-symbiont interactions**

## **INTRODUCTION**

*Wolbachia* is an intracellular symbiont alpha-proteobacterium that infects a wide range of arthropods and nematodes (Schulenburg et al., 2000; Werren et al., 2008). It is often transmitted vertically from females through the eggs to their future progeny, although horizontal transfer between hosts has also been documented (Werren et al., 1995; Cordaux et al., 2001). Studying the mechanism of *Wolbachia*-host interactions is fascinating for many reasons. *Wolbachia* is capable of inducing several intriguing sex-related phenotypes in its hosts, including male killing (MK), in which infected males die during embryonic or larval stages; male feminization (MF), the development of genetic males into females; and thelytokous parthenogenesis (TP), in which infected virgin females produce daughters. All of these phenotypes distort the progeny sex ratio in favor of females, thus ensuring a higher transmission rate of *Wolbachia* to the next generation of hosts (Werren et al., 2008; White et al., 2013). Another fascinating effect of the infection is cytoplasmic incompatibility between gametes (CI), which results in aberrant or considerably reduced offspring production if uninfected females mate with infected males, or if the parents are infected with different *Wolbachia* strains (Werren et al., 2008; White et al., 2013). In this case, infected females possess a reproductive advantage compared to uninfected ones, and this again ensures the spread of *Wolbachia* in the host population. Rapid transitions among the four phenotypes in the course of the coevolution of *Wolbachia* and its hosts hint that similar molecular mechanisms might underlie the apparently different effects (Ma et al., 2014). Due to its enormous host range, *Wolbachia* may have played a crucial role in the evolution of sex determination systems and reproductive strategies in arthropods (Cordaux et al., 2011; Awrahman et al., 2014; Ma et al., 2014).

Various approaches have been employed to investigate *Wolbachia*-host interactions in naturally infected and uninfected strains (Hoffmann et al., 1990; Negri et al., 2006; Riparbelli et al., 2012), experimentally inoculated cell lines (Noda et al., 2002; Xi et al., 2008), and antibiotic-treated specimens (Hoffmann et al., 1990; Casiraghi et al., 2002). Although *Wolbachia* is naturally an obligate intracellular symbiont, protocols have been developed to keep it viable in cell-free media for days; however, no replication occurs in the extracellular phase (Rasgon et al., 2006; Gamston and Rasgon, 2007). The experimental and analytical techniques have covered a wide range, including classical crossing and fecundity measurements (e.g., Hoffmann et al., 1990; Dunn et al., 2006), microscopic approaches (*in situ* hybridizations, electron microscopy and immunohistochemical techniques for bacterium detection inside hosts and cells, tissues, etc.) (e.g., Negri et al., 2008; Fischer et al., 2011), gene expression analysis (e.g., Xi et al., 2008; Kremer et al., 2009, 2012; Hughes et al., 2011; Chevalier et al., 2012; Darby et al., 2012; Liu et al., 2014), bioinformatic genome sequence annotation and functional prediction (e.g., Wu et al., 2004; Foster et al., 2005; Klasson et al., 2008), and mathematical modeling of the ecological consequences of CI or sex ratio distortion (e.g., Taylor, 1990; Turelli, 1994). Despite all these efforts, a coherent mechanistic story of *Wolbachia*'s effect is still lacking. The picture is incomplete even for CI, which occurs in *Drosophila* and is the most extensively studied *Wolbachia*-induced phenotype, although cytoskeleton reorganization and asynchrony in nuclear envelope breakdown and chromosomal condensation of male and female pronuclei after fertilization have been implicated in the process (Serbus et al., 2008; Werren et al., 2008). The other three phenomena are less well understood. TP seems to result from induction of diploidy in species with a haplodiploid sex determination system by production and development of diploid eggs; that is achieved by altering meiosis to produce diploid gametes (Weeks and Breeuwer, 2001), the abortion of the first mitotic division after chromosomal duplication (Pannebakker et al., 2004), or the fusion of the two haploid nuclei after first mitosis of induced eggs (Gottlieb et al., 2002). The molecular bases of MK and MF are least understood, but they are suspected to share certain components, as MK is often the result of a lethal and incomplete attempt at feminization of genetic male embryos (Werren et al., 2008). The most direct mechanistic evidence comes from the study of male-killing *Wolbachia* in the moth *Ostrinia scapulalis*, showing that it overrides the karyotypic signal in genetic males to produce the female *dsx* isoform (Sugimoto and Ishikawa, 2012). This suggests that *Wolbachia* impacts the sex determination pathway at or above *dsx*. Apart from this direct effect on the pivotal sex determining gene *dsx*, MK or MF *Wolbachia* infection is reported to be accompanied by defective chromatin remodeling (Riparbelli et al., 2012), induction of host immune response (Chevalier et al., 2012), and epigenetic reprogramming of the host (Negri et al., 2009a).

*Zyginidia pullula* is a leafhopper with an XX/XO male heterogametic sex determination system in which *Wolbachia* causes feminization of chromosomal males (Negri et al., 2006). Infected female leafhoppers are morphologically indistinguishable from uninfected females, but feminized chromosomal males have an intersex phenotype, i.e., they have the upper pygofer appendages, a typical male secondary sexual feature. These appendages show varying degrees of development, from being fully developed in some specimens to being a barely recognizable stump in others (Negri et al., 2006). Feminized males with upper pygofer appendages reduced to a stump have ovaries morphologically similar to uninfected females, whereas those with prominent appendages possess malformed and probably less functional ovaries (Negri et al., 2008). The "degree of feminization" has been shown to be correlated with *Wolbachia* density in the host tissues in several systems (Jaenike, 2009). We have previously reported that *Wolbachia* instigates epigenetic reprogramming of *Z. pullula* (Negri et al., 2009a,b) and probably interacts with the insect hormone biosynthesis pathway to stimulate the production of feminizing hormones (Negri et al., 2010; Negri, 2012). In this study, whole transcriptomes of male and female *Zyginidia* samples (*Wolbachia*-infected and uninfected) were analyzed with Illumina deep sequencing in order to understand the scope and nature of the *Wolbachia*-induced change in the host gene expression profile. Our initial idea was that if male feminization is the main consequence of *Wolbachia* infection, transcriptomes from the three female types (uninfected females, infected females and feminized males) should resemble each other and differ from the only phenotypically male group (uninfected males). We therefore decided to test the hypothesis that sex reversal is *Wolbachia*'s main effect at the transcriptome level. Were this confirmed, we would proceed to identify differentially expressed genes between the two sexual phenotype groups.

## **METHODS**

#### *ZYGINIDIA* **SPECIMENS**

Thirty-four overwintering females of *Z. pullula* were collected from a single grass field in northern Italy and were reared individually in the laboratory as described in Negri et al. (2006). Overwintering females have often mated with several males (rarely with only one). By carefully examining the progeny, *Wolbachia*-infected (i.e., all-female brood) and uninfected (i.e., male and female brood) lines were identified. *Wolbachia* infection was then confirmed by PCR on the mothers and randomly chosen samples from the brood as described in Negri et al. (2006). Morphological examination for the presence or absence of upper pygofer appendages led us to separate feminized males from genetic females in the all-female (i.e., *Wolbachia*-infected) lines, and males from females in the uninfected lines. Males from uninfected lines were mated to the physiologically female progeny of the infected lines (consisting of genetic females and males) at each generation to produce the next generation of infected females (and feminized males). This backcrossing to uninfected males was done for at least three generations in the lab. Fifty adults from each of the four categories, uninfected females (F), uninfected males (M), infected females (FW) and feminized (infected) males (MW), were pooled together for RNA sequencing.

#### **cDNA LIBRARY PREPARATION AND SHORT-READ SEQUENCING**

cDNA libraries were made from male and female specimens of infected and uninfected leafhopper lines. Infected males are phenotypically intersex and exhibit different degrees of feminization depending on the concentration of *Wolbachia*, ranging from individuals with functional ovaries to individuals with female secondary sexual characters, but possessing testes. We used thoroughly feminized infected males for RNA extraction. RNA purification, cDNA synthesis and Illumina library construction were performed using the protocols of Mortazavi et al. (2008), with the following modifications: total RNA, mRNA and DNA were quantified using a Qubit fluorometer (Invitrogen); mRNA fragmentation was performed using Fragmentation Reagent (Ambion) with a 3 min 50 s incubation at 70°C, followed by cleanup with an RNA cleanup kit (Zymo Research); additional DNA and gel purification steps were conducted using Clean and Concentrator kits (Zymo Research). Each sample library was sequenced as paired-end 76-base reads on an Illumina Genome Analyzer II.

#### *DE NOVO* **TRANSCRIPTOME ASSEMBLY AND EXPRESSION LEVEL CALCULATION**

Due to the sensitive nature of *de novo* assembly, it is critical that the reads used to generate contigs have the highest sequencing quality. Reads were removed from consideration in the *de novo* assembly if they had a terminal *phred* (Ewing and Green, 1998) quality value less than 15, or contained more than 2 unknown nucleotides (i.e., *N*). Reads were also removed if they were similar to known PCR primer or Illumina adapter sequences. Using the reads pooled from all four samples that passed these filters, the *de novo* assembly program Velvet (version 1.0.15) (Zerbino and Birney, 2008) was used in conjunction with a custom post-processing algorithm capable of retaining information from alternative splices (Sze et al., 2012) to assemble short reads into contigs, using sequence overlap information until the contigs could no longer be extended. Velvet was run with a kmer length of 35 and the following settings: -cov\_cutoff auto -max\_branch\_length 0 -max\_divergence 0 -max\_gap\_count 0 -read\_trkg yes. Sequenced reads for which both mates passed filtering were kept as pairs and treated as "-shortPaired" with an insert length of 175 bases and a standard deviation of 75 bases. Single-end reads that passed filtering were treated as "-short."
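The two read-level quality criteria described above can be illustrated with a short sketch. This is not the authors' pipeline: the function name `keep_read`, the representation of qualities as a list of integer Phred scores, and the reading of "terminal" as the final (3′) base of the read are all assumptions made for illustration, and the primer/adapter screen is not shown.

```python
def keep_read(sequence, qualities, min_terminal_quality=15, max_unknown=2):
    """Return True if a read passes the two quality criteria.

    sequence  -- nucleotide string (A, C, G, T, N)
    qualities -- list of integer Phred scores, one per base (assumed already decoded)
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    if not sequence:
        return False
    # Assumption: the "terminal" quality value is the score of the last base.
    if qualities[-1] < min_terminal_quality:
        return False
    # Discard reads containing more than `max_unknown` unknown nucleotides (N).
    if sequence.upper().count("N") > max_unknown:
        return False
    return True


reads = [
    ("ACGTACGT", [30, 30, 30, 30, 30, 30, 30, 20]),  # passes both criteria
    ("ACGTACGT", [30, 30, 30, 30, 30, 30, 30, 10]),  # low terminal quality
    ("ANNNACGT", [30] * 8),                          # three unknown bases
]
kept = [seq for seq, quals in reads if keep_read(seq, quals)]
print(len(kept), "of", len(reads), "reads kept")
```

For paired-end data, a read whose mate fails would still be usable as a single-end read, which matches the distinction the text draws between "-shortPaired" and "-short" input.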

With the set of *de novo* assembled sequences serving as a reference, reads from each of the individual samples were mapped using the Burrows-Wheeler Aligner (BWA) (Li and Durbin, 2009). The number of reads that mapped to the contigs of each gene was tabulated and normalized to calculate FPKM (Fragments Per Kilobase Of Exon Per Million Fragments Mapped). Additional normalization among all samples was performed using the TMM protocol (Trimmed Mean of M-values) outlined in Robinson and Oshlack (2010), which takes into account differences in overall RNA populations across samples and is one of several methods used to evaluate RNA sequencing data. Normalization was implemented using the edgeR package in R (Robinson et al., 2010). All statistical analyses and graphs evaluating consistency between samples were produced using R v2.13.0 (R Development Core Team, 2011).
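As a worked illustration of the FPKM definition given above, the sketch below computes the value for one locus. The function name and the example numbers are hypothetical; the subsequent TMM scaling, which the authors performed with edgeR, is not reproduced here.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    kilobases = length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


# Hypothetical locus: 500 fragments on a 1,000 bp consensus sequence,
# in a sample with 10 million mapped fragments.
value = fpkm(500, 1_000, 10_000_000)
print(f"FPKM = {value:.1f}")
```

Dividing by length makes long and short loci comparable within a sample, and dividing by the number of mapped fragments makes samples of different sequencing depth comparable.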

#### **GENE FUNCTIONAL ANNOTATION AND CLASSIFICATION**

Blast2GO v.2 (Götz et al., 2008) and WEGO (Ye et al., 2006) were used to obtain Gene Ontology (GO) annotations. Genes were also annotated using a BLASTX search (Altschul et al., 1990) (Expected value *<*1.00e-05) to the nr protein database available from GenBank as well as to the set of protein sequences available from the *Drosophila melanogaster* 5.34 and the pea aphid *Acyrthosiphon pisum* 2.1 releases. We chose the annotation with the highest BLAST score as long as the span of the alignment was greater than 80% of the length of the contig under query. For genes that did not report any hits, we lowered the minimum span to 40% of the length, choosing the annotation with the highest BLAST score having Expected value *<*1.00e-05.
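The two-tier rule for choosing an annotation can be sketched as follows. This is a minimal illustration under stated assumptions: the `Hit` record and the function `best_annotation` are hypothetical names, the alignment span is assumed to be measured in query (contig) nucleotides, and the hits are assumed to have been parsed already from the BLASTX output.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    subject: str      # identifier of the matched protein
    bit_score: float  # BLAST score used to rank hits
    evalue: float     # BLAST Expected value
    span: int         # length of the query covered by the alignment, in nucleotides


def best_annotation(hits: Iterable[Hit], query_length: int,
                    min_spans=(0.8, 0.4), max_evalue=1e-5) -> Optional[Hit]:
    """Pick the highest-scoring hit, trying the strict span threshold first."""
    hits = list(hits)
    for min_span in min_spans:
        eligible = [
            h for h in hits
            if h.evalue < max_evalue and h.span > min_span * query_length
        ]
        if eligible:
            return max(eligible, key=lambda h: h.bit_score)
    return None  # no hit satisfied either threshold


example_hits = [
    Hit("protein_a", 200.0, 1e-20, 450),
    Hit("protein_b", 300.0, 1e-30, 900),
    Hit("protein_c", 500.0, 1e-3, 950),   # high score, but fails the E-value cut
]
chosen = best_annotation(example_hits, query_length=1000)
print(chosen.subject if chosen else "no annotation")
```

The relaxed 40% threshold is consulted only when nothing passes at 80%, so a locus never receives a short-span annotation when a long-span one is available.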

#### **PRINCIPAL COMPONENT ANALYSIS OF GENE EXPRESSION VALUES**

Expression values were cleaned of extreme outliers, quantile-normalized and log-transformed before they were used for PCA. To make sure the results were not artifacts of the data preparation method, PCA was repeated on the raw (not normalized, not log-transformed) expression values as well as after several different outlier-filtering and normalization strategies. These statistical procedures were done in SAS 9.3.
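The normalization and transformation steps that precede PCA can be sketched in a few lines. The original analysis was done in SAS; the Python below is only an illustration, and several details are assumptions not stated in the text: ties are broken by position rather than averaged, the logarithm is base 2, and a pseudocount of 1 is added so that zero expression values can be transformed.

```python
import math


def quantile_normalize(columns):
    """Force every sample (column) onto the same distribution.

    Each value is replaced by the mean, across samples, of the values
    holding the same rank within their own sample.
    """
    n_genes = len(columns[0])
    n_samples = len(columns)
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [
        sum(col[rank] for col in sorted_columns) / n_samples
        for rank in range(n_genes)
    ]
    normalized = []
    for col in columns:
        # Gene indices ordered from lowest to highest expression in this sample.
        order = sorted(range(n_genes), key=lambda gene: col[gene])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]  # ties broken by position (assumption)
        normalized.append(out)
    return normalized


def log_transform(columns, pseudocount=1.0):
    """Log2-transform expression values after adding a pseudocount (assumption)."""
    return [[math.log2(value + pseudocount) for value in col] for col in columns]


# Two hypothetical samples measured on three genes.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
prepared = log_transform(quantile_normalize(raw))
print(prepared)
```

After quantile normalization every sample contains the same set of values, so differences between samples reflect only which genes occupy which ranks.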

## **RESULTS**

#### **SHORT-READ SEQUENCING AND** *DE NOVO* **ASSEMBLY**

The mRNA population was analyzed with Illumina deep sequencing of male and female *Zyginidia* samples with and without *Wolbachia* infection. The pooled data from all samples had a total of 50 M paired-end reads that were 76 bases long. All Illumina sequences are available for download at the NCBI Short Read Archive under the BioProject PRJNA171390. After sequences were filtered based on quality and matches to adapter and primer sequences, the 38 M remaining reads from all four samples were pooled together and run through Velvet and the post-processing algorithm. Eventually, 18,147 loci and a total of 27,236 transcripts were assembled; multiple transcripts of a locus often corresponded to different splicing isoforms and occasionally to highly differentiated alleles. The transcripts ranged in length from 291 bp to 15,389 bp, with mean and median lengths of 1006 bp and 702 bp, respectively. This assembly included a fairly large number of long transcripts: 25% were longer than 1250 bp and 10% were longer than 2000 bp. Of the 18,147 loci, 14,068 (77.5%) had a single isoform and the remaining 22.5% had multiple ones. Transcripts within a locus were subsequently collapsed into a single "representative locus sequence" by using ClustalW to run a multiple sequence alignment and identifying the locus consensus sequence. Mean and median lengths of consensus sequences were 900 bp and 618 bp, respectively. The total length of all loci consensus sequences was 16.3 Mb.

#### **GENE FUNCTIONAL ANNOTATION AND CLASSIFICATION**

A total of 6,946 loci, corresponding to 38% of the entire dataset, were Gene Ontology annotated with Blast2GO. The consensus sequences were also aligned using a BLASTX search to the nr protein database available from GenBank as well as to the set of protein sequences available from Flybase and the aphid genome. **Table 1** shows the proportion of cases that resulted in a hit where the length of the alignment was greater than 80% or 40% of the length of the query (leafhopper sequence). One might very crudely attribute the 80% alignment span hits to true genic homology and the 40% alignment span hits to conserved domains.



*40% homology length indicates that the length of the homologous segment covers at least 40% of the query (leafhopper) sequence. A correspondingly similar definition applies to the 80% homology length category. The percent values in the table cells show what percent out of all loci (18147) fit each criterion when Blasted against the designated dataset.*

A number of genes potentially involved in the leafhopper sex determination were identified through homology search with the *Drosophila* sex determination genes. Although pea aphid is *Zyginidia*'s closest relative with a reference genome sequence (The International Aphid Genomics Consortium, 2010), the functional annotation for this genome is not as complete as that of *Drosophila*. Sex determining genes of pea aphid have been found based on homology with *Drosophila* sequences and lack direct experimental verification (The International Aphid Genomics Consortium, 2010). Therefore, we decided to use *Drosophila* sequences as the reference set. **Figure 1** depicts the canonical sex determination pathway in *Drosophila*. Homologs of several *Drosophila* sex determination genes were identified among the transcripts including *dsx (doublesex)*, *tra-2 (transformer-2)*, *vir (virilizer)*, *fl(2)d (female lethal d)*, *snf (sans fille)* and *ix (intersex)*. No leafhopper homologs could be identified for *tra (transformer)*, *sxl (sex lethal)*, *fru (fruitless)* or *her (hermaphrodite).* **Table 2** shows the expression levels for the identified leafhopper sex determination genes.

Seventeen genes in our dataset were flagged as likely *Wolbachia* sequences according to the Blast results against the NCBI dataset. Bacterial origin seems very probable for a number of these transcripts based on the expression levels in infected and uninfected lines, plus high similarity to known *Wolbachia* sequences (**Table 3**). These sequences were Blasted against the aphid genome to check if there was an indication of horizontal transfer; they were also Blasted against the *Drosophila* genome as a distant outgroup (**Table 3**).

*FW, infected female; MW, infected feminized male (intersex female); F, uninfected female; M, uninfected male.*

**FIGURE 1 | Sex determination pathway in** *Drosophila***, modified from Sánchez (2008).** Sxl<sup>F</sup> and Sxl<sup>M</sup> refer to functional female and nonfunctional male isoforms of the Sxl protein, respectively. Tra<sup>F</sup> is the functional female form of the Transformer protein, which in conjunction with the constitutive gene product Tra-2 controls female-specific splicing of *dsx* and *fru*. *snf*, *vir* and *FL(2)D* are required for late female-specific splicing of Sxl but play no part in determining early Sxl splicing pattern. The genes for which *Z. pullula* homologs have been identified in this study are boxed in gray. For more details on the regulation and function of these genes, refer to (Sánchez, 2008; Gempe and Beye, 2011) or other similar resources. Reproduced with permission from The International Journal of Developmental Biology (Int. J. Dev. Biol.) (2008) Vol:52, pp. 837–856.

*\*The second best hit is the reported Wolbachia sequence. The first hit is >ref | XP\_005192252.1 | PREDICTED: uncharacterized protein LOC101899042, partial [Musca domestica]. \*\*The second best hit is the reported Wolbachia sequence. The first hit is >ref | WP\_012472427.1 | hypothetical protein [Candidatus Amoebophilus asiaticus]. \*\*\*Although the best hit is the reported hypothetical protein, the second best hit and many after that are annotated as (putative) transposases.*

#### **PRINCIPAL COMPONENT ANALYSIS OF TRANSCRIPTOMES**

Principal component analysis on transcriptomes of the four leafhopper samples surprisingly revealed that *Wolbachia* infection changes the host transcriptome extensively and the effect is by no means limited to sex-reversal. As evident in **Figure 2**, the first PC (explaining 66.46% of variance) is highly correlated with all of the samples indicating that the expression of most genes is not significantly altered by *Wolbachia* and is similar across all samples. By the second PC (explaining 20.36% of variance), *Wolbachia* infected male and female samples cluster together and uninfected male and female cluster together. This PC is generated by genes whose expression is changed by *Wolbachia* consistently regardless of sexual karyotype or phenotype. The third PC (explaining 7.97% of variance) indicates an interaction term: F and M are similar and stand in the middle of the scale, with MW and FW occupying the opposite sides of them. This PC is generated by genes that are expressed similarly in uninfected males and females, and *Wolbachia* infection changes their expression in opposite ways in chromosomal males and females. Overall, sex inversion does not seem to be the only or even the biggest effect of *Wolbachia* on gene expression patterns in *Zyginidia*, even if it is the most conspicuous phenotypic consequence; otherwise, we would expect the three phenotypically female groups (F, FW and MW) to cluster together and the only male group (M) to stand separate from them. None of the PCs show such a pattern. PCA was repeated on expression values without the initial outlier filtering, and applying several different normalization and transformation strategies; they all yielded the same picture as described above: the main effect was invariably the presence or absence of *Wolbachia* regardless of sex (details not shown).
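The way a shared component and an infection component emerge from such data can be illustrated with a toy example. Everything in the sketch below is hypothetical: the expression values are synthetic, the four samples are treated as the variables and the genes as the observations (consistent with a first PC that correlates with all samples), and a pure-Python power iteration stands in for the SAS procedure the authors used.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix; each column is one sample's expression vector."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def leading_components(matrix, n_components=2, iterations=200):
    """Top eigenpairs of a symmetric matrix by power iteration with deflation."""
    k = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(n_components):
        vec = [1.0 / (i + 1) for i in range(k)]  # arbitrary starting vector
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(k)) for i in range(k)]
            norm = math.sqrt(sum(v * v for v in nxt))
            vec = [v / norm for v in nxt]
        eigenvalue = sum(
            vec[i] * work[i][j] * vec[j] for i in range(k) for j in range(k)
        )
        pairs.append((eigenvalue, vec))
        # Remove the component just found so the next pass finds the following one.
        for i in range(k):
            for j in range(k):
                work[i][j] -= eigenvalue * vec[i] * vec[j]
    return pairs


# Synthetic data: 100 genes with a shared baseline, plus an infection effect
# that shifts expression in the same direction in FW and MW. F and M are made
# identical here purely to keep the example simple.
baseline = [float(g % 10) for g in range(100)]
infection = [1.0 if g % 2 == 0 else -1.0 for g in range(100)]
samples = {
    "F": baseline,
    "M": baseline,
    "FW": [b + w for b, w in zip(baseline, infection)],
    "MW": [b + w for b, w in zip(baseline, infection)],
}
names = list(samples)
components = leading_components(correlation_matrix([samples[s] for s in names]))
for rank, (eigenvalue, loadings) in enumerate(components, start=1):
    share = eigenvalue / len(names)  # trace of a correlation matrix = number of samples
    print(f"PC{rank}: {share:.1%} of variance",
          {s: round(x, 2) for s, x in zip(names, loadings)})
```

In this toy setting the first component loads on all four samples with the same sign, while the second gives F and M one sign and FW and MW the other, which is the qualitative pattern reported for PC1 and PC2. An interaction component like the reported PC3 would require an additional term acting in opposite directions on FW and MW, which the example omits.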

## **DISCUSSION**

We assembled the *Z. pullula* transcriptome *de novo* and produced 18,147 loci and 27,236 transcripts with a total consensus sequence length of 16.3 Mb. These numbers were well within the expected range based on the aphid genome information. The aphid genome was reported to contain 11,089 highly supported RefSeq gene models with a total exonic length of 21.6 Mb; adding the gene models from six other gene prediction programs, a total of 34,604 non-redundant gene models with the total exonic length of 35.7 Mb were described (The International Aphid Genomics Consortium, 2010). The true number of genes is purportedly a number between those two estimates. Hence, our *de novo* assembly of the transcriptome seems to have captured a reasonable proportion of the expressed genes.

The results of the sequence homology search (**Table 1**) confirm the closer relatedness of *Z. pullula* to *A. pisum* (the aphid) than to *Drosophila*. A caveat to this analysis is the extensive set of duplications in the aphid genome (The International Aphid Genomics Consortium, 2010). Without a leafhopper reference genome, we do not know whether the same wave of duplications has affected *Z. pullula*; however, there was an indication in our data that it might have. By visual inspection of the sequences that were computationally annotated as isoforms of a single locus, we realized that some of them did not show signatures of known alternative splicing patterns but looked like highly differentiated alleles (details not shown). These may indeed be paralogous sequences in the process of divergence. Further investigation, including the sequencing of single individuals rather than pools of them, will be required to separate paralogy from allelic variation.

A number of leafhopper sex determination genes were identified based on homology with fly sequences (**Table 2**). Insect sex determination machinery has evolved around the *transformer-doublesex* axis (Sánchez, 2008); *tra* is the fast evolving component responsible for receiving the signal (sometimes through mediators) from the upstream sex determining factors (chromosomal constitution, incubation temperature, etc.), and *dsx* is the conserved switch that relays this signal down to the developmental processes (Sánchez, 2008; Verhulst et al., 2010). It is, therefore, not surprising that we found a homolog for *dsx* and not for *tra* in our dataset. The short length of the aligned segments prevented reliable assignment of male and female isoforms, but these initial results can be used to design primers to extract the whole genes from the leafhopper genome. Future experiments can then follow the flow of the signal in the sex determination pathway to identify where the cascade is diverted to female development in *Wolbachia*-infected genetic males. In the moth *O. scapulalis*, the impact point is somewhere above the level of *dsx* (Sugimoto and Ishikawa, 2012). Having the sequences of *dsx* male and female isoforms, one could check whether this is also true in leafhoppers. Unfortunately, the lack of replicates in our preliminary data makes it impossible to assess the significance of differential expression of genes across our four groups (FW, MW, F, and M). This is another task that remains to be done in future projects. In addition, development of X-linked sequence markers will enable early sexing of the embryos (based on the female XX / male XO karyotypes) through quantitative PCR, and will facilitate the study of early developmental processes in infected and uninfected specimens.

We found a number of *Wolbachia*-related transcripts in the sequenced cDNA libraries (**Table 3**). The loci expressed mainly in infected lines with great similarity to known *Wolbachia* sequences are likely to have *Wolbachia* origin (e.g., loci 1053, 1097, 1331, and 13961). Curiously, a couple of loci are expressed primarily in the uninfected lines (e.g., locus 22635). At this point, we do not have a hypothesis as to the reason behind this observation. Repeating the experiments with replicates and higher sequencing depth would be the first step to confirm the reproducibility of these patterns. Our protocol of mRNA purification for creation of cDNA libraries involved a hybridization step with oligo-T ligands, which targets the eukaryotic mRNA poly-A tails; therefore, it will be necessary to employ a different purification strategy in order to capture most of the poly-A lacking bacterial mRNAs. **Table 3** shows that several of the *Wolbachia*-related sequences code for Ankyrin-repeat proteins. *Wolbachia* genomes are well known for containing an extraordinarily high number of these genes (Wu et al., 2004; Iturbe-Ormaetxe et al., 2005). Gene transfer between *Wolbachia* and mosquito hosts has been previously reported (Woolfit et al., 2009). PCR experiments and phylogenetic analyses have confirmed horizontal gene transfer from bacterial endosymbionts to the aphid genome (The International Aphid Genomics Consortium, 2010). Similar approaches will be required to confirm bacterial or insect origin for the transcripts listed in **Table 3**. We tried to check for possible aphid lineage-specific horizontal transfers by asking whether a likely *Wolbachia* transcript shows high sequence similarity to an aphid sequence, but not a fly sequence; none of the loci in **Table 3** showed such a pattern. One of the *Wolbachia*-related transcripts showed a degree of homology with the aphid *vasa* gene (locus 4382). Almost identical homologs of this sequence exist in the three published *Wolbachia* genomes (Blast results not shown); its homologs in fly, leafhopper and the published *Wolbachia* genomes are characterized or predicted ATP-dependent RNA helicases. *vasa* has been implicated in transmission of maternal effects and sex determination in clams (Milani et al., 2011). It will be very interesting to check if products of host-homologous genes are actually exported by *Wolbachia* into the host cell.

**FIGURE 2 |** [...] but PC2 separates the infected and uninfected samples conclusively. PC3 reveals an interaction term between *Wolbachia* status and sexual karyotype. The "\_ql" suffix after line names means that the expression values were quantile-normalized and log transformed. **(A)** PC1 vs. PC2; **(B)** PC1 vs. PC3.

We used natural isolates of infected and uninfected leafhoppers for our comparisons with no antibiotic treatment. This relieved our comparisons from the confounding effects of antibiotic treatments on the host physiology. The rationale behind the traditional use of antibiotics to cure the infected lines from *Wolbachia* is to obtain infected and uninfected lines with the same genetic background. However, antibiotics can change the host physiology substantially, and quite remarkably, their effect can perpetuate through several generations of unexposed progeny (Ballard and Melvin, 2007; Zeh et al., 2012; Fridmann-Sirkis et al., 2014). We avoided the use of antibiotics completely and achieved homogenous genetic backgrounds among samples by taking advantage of repeated backcrossing of infected females to uninfected males. We collected all of our founder specimens from the same leafhopper population in a grass field. In the sampled population, the sex-ratio was only moderately female biased, with a moderate prevalence of the infection (∼1:1.8 male:female, *Wolbachia* infection rate ∼30% of the collected females; Negri I., unpublished data). As uninfected males are the only physiological males in existence, all the "egg-laying females" (in the field and in the lab, including the females used in this study) always mate with (and only with) uninfected males. Thus, all of our infected and uninfected lines come from the same genetic background. We carried out three further generations of backcrossing of infected females to uninfected males in the lab to effectively remove any residual genetic variation between the two groups. Details of rearing conditions are described in Negri et al. (2006). The natural pattern of sexual reproduction and the additional backcrossing done in the lab ensure the similarity of nuclear genetic backgrounds. 
We also tested mitochondrial gene sequences in *Zyginidia* samples from different Italian localities, both infected and uninfected, and they were all nearly identical (Negri I., unpublished data).

Through principal component analysis, we have shown that *Wolbachia*-induced changes in the host transcriptome are mainly sex-independent, and cannot be explained only by the sex reversal of genetic males. Previous transcriptomic studies on *Wolbachia* have reported changes in the expression of genes unrelated to the reproductive phenotype. For instance, *Wolbachia* infection in *Armadillidium vulgare* triggered the overexpression of immune-related genes (Chevalier et al., 2012). In the parasitoid wasp *Asobara tabida*, endosymbiont infection or lack thereof was associated with changes in expression of genes related to female reproductive development, iron and oxidative stress regulation, and immune recognition (Kremer et al., 2009, 2012). Artificial infection of *Anopheles* cell cultures by *Wolbachia* surprisingly caused down-regulation of immune, stress response and detoxification genes (Hughes et al., 2011). *Wolbachia*-inoculated *Drosophila* cell lines exhibited differential expression of several GO categories not directly related to reproduction, including antimicrobial humoral response, ion homeostasis, response to unfolded protein and response to chemical stimulus (Xi et al., 2008). In *Aedes aegypti*, *Wolbachia* was shown to manipulate the expression of a metalloprotease gene through induction of a specific host miRNA (Hussain et al., 2011). Apart from such direct evidence, the observation of various forms of fitness cost in the feminized males is consistent with the idea that sex reversal is not the sole effect of feminizing *Wolbachia* (Moreau et al., 2001; Rigaud and Moreau, 2004). Nevertheless, our study is the first to demonstrate quantitatively, through PCA of all of the available gene expression levels, that infection itself has a larger effect than sex reversal.

Lack of replicates meant that we could not quantitatively identify differentially expressed genes between the lines because we could not calculate variances. Instead, we focused on the global patterns of gene expression by applying PCA to gene expression values. Thousands of loci (each acting as one observation point) were used to generate the PCs. Antibiotic treatment and different genetic backgrounds could have been two potential sources of systematic bias in this type of analysis; they could have generated similar clustering patterns and confounded the interpretation of results. However, through the single-population sampling and the repeated backcrossing scheme, we avoided both sources of confusion.
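The idea of treating each locus as one observation and each line as one variable can be illustrated with a minimal sketch. It is not the analysis pipeline used here: the expression values, effect sizes, column order and function names below are invented for illustration, and the components are found by simple power iteration on the covariance matrix rather than by a statistics package.

```python
import math

def covariance(rows):
    """Covariance between samples (columns); each locus (row) is one observation."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(k)]
            for a in range(k)]

def matvec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def leading_component(m, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    v = [float(j + 1) for j in range(len(m))]  # arbitrary, asymmetric start vector
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v

def principal_components(rows, n_components=2):
    """Top principal components by power iteration with deflation."""
    m = covariance(rows)
    components = []
    for _ in range(n_components):
        eigenvalue, v = leading_component(m)
        components.append((eigenvalue, v))
        # Deflate: remove the variance already explained by this component.
        m = [[m[a][b] - eigenvalue * v[a] * v[b] for b in range(len(v))]
             for a in range(len(v))]
    return components

# Toy log-expression values (hypothetical, not data from this study).
# Columns: infected female, infected feminized male, uninfected female, uninfected male.
level = [11, 11, 11, 11, 5, 5, 5, 5]        # shared expression level of each locus
infection = [1, 1, -1, -1, 1, 1, -1, -1]    # shift in infected vs. uninfected lines
karyotype = [0.25, -0.25, 0.25, -0.25, 0.25, -0.25, 0.25, -0.25]  # smaller sex effect
toy = [[b + e + s, b + e - s, b - e + s, b - e - s]
       for b, e, s in zip(level, infection, karyotype)]

(pc1_var, pc1), (pc2_var, pc2) = principal_components(toy)
print("PC1 loadings:", [round(x, 3) for x in pc1], "variance:", round(pc1_var, 3))
print("PC2 loadings:", [round(x, 3) for x in pc2], "variance:", round(pc2_var, 3))
```

In this toy data set, PC1 loads equally on all four lines (the shared expression level of each locus), while the PC2 loadings have one sign for the two infected lines and the opposite sign for the two uninfected lines. This holds only because the invented infection effect is larger than the invented sex effect.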

Based on the PCA results, we encourage the use of biochemical bottom-up approaches focusing on the whole *Wolbachia* effect rather than the specific sex inversion event. *Wolbachia*'s effect is presumably mediated by molecules secreted into the host cell or expressed on the outer membrane surface of the bacterium-containing vesicles. *Wolbachia* cannot be maintained in cell-free cultures indefinitely, but there are protocols to keep them alive in synthetic media for several hours (Rasgon et al., 2006; Gamston and Rasgon, 2007). In such a setting, the molecules released into the medium can be detected and purified using chromatographic and/or mass spectrometric approaches. Appropriate methods can also be used for isolation and characterization of surface molecules from the bacterium-containing vesicles. Pull-down experiments on the host proteins using these *Wolbachia*-released or surface molecules might reveal the initial cellular targets of the endosymbiont-host interaction.

## **REFERENCES**


with predicted alternative splices, single nucleotide polymorphisms and transcript expression estimates. *Insect Mol. Biol.* 21, 205–221. doi: 10.1111/j.1365-2583.2011.01127.x


streamlined genome overrun by mobile genetic elements. *PLoS Biol.* 2:E69. doi: 10.1371/journal.pbio.0020069


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2014; accepted: 30 July 2014; published online: 01 September 2014. Citation: Asgharian H, Chang PL, Mazzoglio PJ and Negri I (2014) Wolbachia is not all about sex: male-feminizing Wolbachia alters the leafhopper Zyginidia pullula transcriptome in a mainly sex-independent manner. Front. Microbiol. 5:430. doi: 10.3389/fmicb.2014.00430*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Asgharian, Chang, Mazzoglio and Negri. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Identification of overexpressed genes in *Sodalis glossinidius* inhabiting trypanosome-infected self-cured tsetse flies

#### *Illiassou Hamidou Soumana<sup>1</sup>, Bernadette Tchicaya<sup>1</sup>, Béatrice Loriod<sup>2,3</sup>, Pascal Rihet<sup>2,3</sup> and Anne Geiger<sup>1</sup>\**

*<sup>1</sup> IRD-CIRAD, UMR 177, Montpellier, France*

*<sup>2</sup> INSERM, UMR1090 TAGC, Marseille, France*

*<sup>3</sup> Biology Department, Aix-Marseille University, Marseille, France*

#### *Edited by:*

*Monica Medina, Pennsylvania State University, USA*

#### *Reviewed by:*

*Silvia Bulgheresi, University of Vienna, Austria Devin Coleman-Derr, Joint Genome Institute, USA*

#### *\*Correspondence:*

*Anne Geiger, UMR 177, IRD-CIRAD, CIRAD TA A-17/G, Campus International de Baillarguet, 34398 Montpellier Cedex 5, France e-mail: anne.geiger@ird.fr*

*Sodalis glossinidius,* one of the three maternally inherited symbionts of the tsetse fly, was previously shown to favor fly infection by trypanosomes, the parasites causing human sleeping sickness. Among a population of flies taking a trypanosome-infected blood meal, only a few individuals will acquire the parasite; the others will escape infection and be considered as refractory to trypanosome infection. The aim of the work was to investigate whether fly refractoriness could be associated with specific *Sodalis* gene expression. The transcriptomes of *S. glossinidius* harbored by flies that were fed either with a non-infected blood meal (control) or with a trypanosome-infected meal but that did not develop infection were analyzed using microarray technology and compared. The microarray analysis yielded 17 genes with significant differential expression between the two groups. Interestingly, all these genes were overexpressed in self-cured (refractory) flies. Further analysis of the functional annotation of these genes indicated that most associated biological process terms were related to metabolic and biosynthetic processes as well as to oxido-reduction mechanisms. These results provide evidence of molecular crosstalk between the different partners, induced by the passage of the trypanosomes through the fly's gut, even though the parasites were unable to establish in the gut and to develop a permanent infection.

**Keywords: sleeping sickness, tsetse-symbiont-trypanosomes, tripartite interactions, control flies, self-cured tsetse flies**

## **INTRODUCTION**

Tsetse flies (*Glossina* spp.), the vectors of African trypanosomes causing sleeping sickness in humans (HAT, human African trypanosomiasis) and nagana (AAT, animal African trypanosomiasis) in animals, harbor symbiotic bacteria that regulate important aspects of their host's physiology. Two of these microbes, obligate *Wigglesworthia glossinidia* and commensal *Sodalis glossinidius*, are vertically transmitted (Cheng and Aksoy, 1999; Dale and Maudlin, 1999) to developing intrauterine progeny via maternal milk gland secretions (Attardo et al., 2008). Tsetse's third symbiont, *Wolbachia*, is transmitted via the germ-line cells (Cheng et al., 2000; Balmand et al., 2013). While the prevalence of *Wolbachia* infections is high in laboratory-reared fly colonies (Cheng et al., 2000), field population prevalence is much lower, and several tsetse fly species were never shown to harbor the symbiont (Doudoumis et al., 2012); in fact, we did not detect *Wolbachia* in the population of *Glossina palpalis gambiensis* from which the individuals used in our experiments were selected (Geiger, personal communication). Nevertheless, the association between some tsetse fly species and the symbiont may have a long co-evolutionary history, since *Wolbachia* loci were found horizontally transferred into the host genome (Doudoumis et al., 2012). *S. glossinidius* is a secondary symbiont located both intra- and extracellularly in the fly's midgut, but it can be detected in other tissues (Cheng and Aksoy, 1999; Balmand et al., 2013). The association between *Sodalis* and tsetse fly was suggested to be recent (Chen et al., 1999). 
In the wild, the prevalence of fly infection by trypanosomes seldom exceeds 10% of the population (Frézil and Cuisance, 1994; Maudlin and Welburn, 1994); similarly, when flies are fed with a trypanosome-infected blood meal in the insectary (Ravel et al., 2003), less than 50% of the flies will acquire the parasite and, as in field conditions, most will escape infection. This means that the normal status of the flies is to be refractory to trypanosome infection. *Sodalis* was believed to contribute to fly vector competence by enhancing the trypanosome susceptibility of its host, the tsetse fly (Welburn and Maudlin, 1999). In the wild, the presence of *Sodalis* has been demonstrated to favor fly infection by trypanosomes, supporting the suggested role of the symbiont in vector competence (Farikou et al., 2010). The suggested mechanism involves the inhibition of the trypanocidal lectin, secreted by the fly during feeding, by N-acetyl glucosamine resulting from pupae chitin hydrolysis by chitinases secreted by the fly-hosted *S. glossinidius* (Maudlin and Ellis, 1985; Welburn et al., 1993; Welburn and Maudlin, 1999; Dale and Welburn, 2001). Finally, it was also shown that the effect of *S. glossinidius* could depend on its genotype (Geiger et al., 2007; Farikou et al., 2010).

However, an overview of the biological mechanisms by which, *in vivo*, the bacterium favors fly infection, and, conversely, the mechanisms by which the fly becomes refractory to trypanosome infection, is still lacking. In this context, the aim of the present work was to investigate whether fly refractoriness could be associated with specific *Sodalis* gene expression. Consequently, the transcriptomes of *S. glossinidius* harbored by flies that were fed either with a non-infected blood meal (control) or with a trypanosome-infected meal, but that did not develop infection, were analyzed using genome-wide *S. glossinidius* oligonucleotide microarrays and compared.

### **MATERIALS AND METHODS**

#### **ETHICAL STATEMENT**

The experimental protocols involving animals were approved by the Ethics Committee and the Veterinary Department of the Centre International de Recherche Agronomique pour le Développement (CIRAD), Montpellier, France. The experiments were conducted according to internationally recognized guidelines.

#### *TRYPANOSOMA BRUCEI GAMBIENSE* **STRAIN**

The S7/2/2 *T. b. gambiense* strain used in this study was isolated in 2002 from HAT-affected patients living in the sleeping sickness focus of Bonon, Côte d'Ivoire (Ravel et al., 2006). The strain belongs to the homogenous *T. b. gambiense* group 1.

#### **INFECTION OF** *GLOSSINA PALPALIS GAMBIENSIS*

The *G. p. gambiensis* flies used in this study originate from flies that were collected in the field in Burkina Faso. Pupae were collected from these flies. Following fly emergence, the population was maintained in a level-2 containment insectary at 23°C and 80% relative humidity (Geiger et al., 2005) without any selection. Individuals used in the present work were randomly chosen for infection experiments.

Experimental infections were conducted following the protocol reported by Ravel et al. (2006). *T. b. gambiense* stabilate was thawed at room temperature and 0.2 ml was injected intraperitoneally into BALB/cJ mice. The infection was monitored by examining tail blood using a phase-contrast microscope at ×400 magnification. Teneral flies were then fed for the first time on infected mice displaying parasitemia levels between 15 and 25 × 10<sup>7</sup> parasites/ml (determined using the matching method, Herbert and Lumsden, 1976). Ten days after infected blood meal uptake, an anal drop was collected from each fly, and the fly infection status was determined by PCR using TBR-specific primers (Moser et al., 1989) to assess the presence or absence of trypanosomes. Positive PCR results indicate trypanosome establishment in the fly midgut; negative PCR results indicate trypanosome self-cured flies. Fewer than 5% of the flies that were exposed to trypanosomes were shown to be infected at day 10 after infected blood meal uptake. Only flies whose PCR result was negative were included in this study; they were designated as self-cured or as refractory flies. Negative control samples consisted of teneral flies fed for the first time on non-infected mice. Finally, all the flies (fed on infected or non-infected mice) were later maintained by feeding on an uninfected rabbit, 3 days a week. Ten days after the first blood feeding (on either trypanosome-infected or non-infected mice), the flies were dissected according to the method described by Penchenier and Itard (1981), and the samples, each consisting of seven pooled midguts from either control or refractory flies, were collected in 400 µl of RNAlater (Ambion, France).

#### **RNA ISOLATION**

Total RNA was extracted from each sample using Trizol reagent (Invitrogen, France) according to the manufacturer's specifications. After extraction, RNA integrity was checked using agarose gel electrophoresis. The quality of RNA and the absence of any DNA contamination were checked on an Agilent RNA 6000 Bioanalyzer and quantified using the Agilent RNA 6000 Nano kit (Agilent Technologies, France).

#### **cDNA HYBRIDIZATION ON MICROARRAY**

RNA reverse transcription and fluorescent dye incorporation were carried out using the Promega ChipShot Direct Labeling and Clean-Up System (Promega, USA). For each sample, 5 µg of total RNA was reverse-transcribed and labeled with a single dye (Cy3) labeling procedure and used for microarray hybridization according to the manufacturer's indications (Promega). Each sample was run on custom-made 60-mer oligonucleotide microarrays specific for the *S. glossinidius* whole genome and for the four plasmids (GenBank accession number AP008232 and NCBI RefSeq NC\_007183.1, NC\_007184.1, NC\_007186.1, and NC\_007187.1, respectively), with at least four oligonucleotide probes per gene (design is available at Gene Expression Omnibus under the accession number GPL17347). The Agilent design utilizes the uniqueness of probe sequences as one of the criteria for probe selection to avoid cross-hybridization with non-target genes. For each experimental condition, four independent biological replicates were analyzed to ensure the reproducibility and statistical significance of the expression data. The details of the expression data are available at Gene Expression Omnibus under accession number GSE48360.

#### **MICROARRAY DATA ANALYSIS**

The primary expression data were normalized through two successive steps using (a) both R software packages and lowess normalization to normalize the M-values for each array separately (within-array normalization) without prior background correction, and (b) quantile normalization to the *A*-values, making the density distributions similar across arrays to compare expression intensities between them (Bolstad et al., 2003). Normalized expression values were averaged through Cy3 signal intensities according to dye-swap replications to assign only one expression value to each biological replicate. Microarray data were scanned using an Agilent microarray scanner (Agilent Technologies), and the pictures were extracted with Agilent Feature Extraction software (version 10.5.1.1). Data were filtered for detectable expression level; only those showing a level of expression greater than the background noise in at least three of the four replicates were selected.
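The quantile step and the detection filter described above can be sketched in a few lines. This is an illustrative sketch only, not the R code used for the analysis: the function names, the single `background` threshold and the toy intensities are assumptions, ties are broken by probe order, and the within-array lowess step is omitted.

```python
def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted values at each rank.

    `arrays` is a list of equal-length lists, one list per microarray.
    Ties are broken by probe order, which simplifies the published procedure.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays)
                  for r in range(n_probes)]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = rank_means[rank]
        normalized.append(out)
    return normalized

def detected(values, background, min_replicates=3):
    """True if the signal exceeds background in at least `min_replicates` replicates."""
    return sum(v > background for v in values) >= min_replicates

# Hypothetical intensities for four probes on three arrays.
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
for row in quantile_normalize(arrays):
    print([round(x, 2) for x in row])

# Hypothetical replicate signals for one probe, with an assumed background of 100.
print(detected([120, 95, 130, 110], background=100))
```

After normalization every array contains the same set of values, so differences between arrays reflect only the rank of each probe within its array.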

Unsupervised hierarchical clustering was used to investigate relationships between samples and between genes. It was applied to median-centered data, using the Cluster and TreeView programs (average linkage clustering using Pearson correlation as the distance metric). Statistical analysis was performed using the TMeV5 Multi Experiment Viewer, v4.5 software (http://www.tm4.org/mev.html) and the two-class unpaired SAM (Significance Analysis of Microarrays) method. One-way analysis of variance was applied to identify genes differentially expressed between infection self-cured and control flies. A 5% predicted false discovery rate was used as the threshold for differential expression (Reiner et al., 2003).
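For orientation, the modified t-statistic at the core of a two-class unpaired SAM analysis can be sketched as follows. This is a simplified illustration rather than the MeV implementation: the function name and example values are hypothetical, the fudge factor `s0` is passed in by hand instead of being estimated from the data, and the permutation step that yields the false discovery rate is omitted.

```python
import math

def sam_d(group_a, group_b, s0=0.0):
    """Relative difference d = (mean_a - mean_b) / (s + s0) for one gene.

    s is the pooled standard error of the difference between the two group means;
    s0 is a small positive constant that damps genes with very low variance.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    sum_sq = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    s = math.sqrt((1 / n_a + 1 / n_b) * sum_sq / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)

# Hypothetical log2 intensities for one gene, four replicates per group.
self_cured = [9.0, 10.0, 11.0, 10.0]
control = [8.0, 9.0, 8.0, 9.0]
print(sam_d(self_cured, control))           # ordinary pooled-variance t-statistic
print(sam_d(self_cured, control, s0=0.5))   # shrunk towards zero by the fudge factor
```

A positive value indicates higher expression in the first group. In a full analysis the observed values are compared with those obtained after permuting the sample labels.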

Biological interpretation of the differentially expressed genes was performed using DAVID software (Dennis et al., 2003). This program allows genes to be interpreted biologically on the basis of gene ontology (GO) terms. In addition, the Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathways were used to assess the specific biological pathways that were overrepresented.

#### **QUANTITATIVE REAL-TIME PCR**

The microarray results were verified by quantitative PCR (qPCR) on a subset of four genes (SG0845, SG0858, SG0895, and SG1978) that were shown to be differentially expressed in microarray experiments between the two groups of flies; these genes were among the most highly overexpressed in refractory flies. Primers specific to these genes (**Table 1**) were designed using Primer-Blast software (http://www.ncbi.nlm.nih.gov/tools/primer-blast/). cDNA was synthesized from 5 µg of original total RNA samples using random hexamers and Superscript II reverse-transcriptase (Invitrogen, France) according to the manufacturer's instructions. All qPCR reactions were performed in an Mx3005P QPCR System (Agilent Technologies) using the Brilliant II SYBR Green qPCR Kit (Agilent Technologies) with 2 µl of cDNA of a known concentration in a 25-µl total volume. PCR efficiencies for each primer pair were calculated using tenfold dilutions of fly gut-extracted cDNA as previously described (Hamidou Soumana et al., 2013). PCR conditions were as follows:

#### **Table 1 | Primers designed for microarray data confirmation by quantitative PCR (qPCR).**


*Primers were designed with Primer-Blast software. The third and fourth columns represent forward and reverse primer concentration used in qPCR and PCR product size obtained, respectively.*

94°C for 5 min (1×), followed by 94°C for 45 s, 60°C for 45 s, 72°C for 1 min (39×), and then 72°C for 10 min (1×). Melting curve analysis was performed to check the specificity of the PCR reaction and to verify the amplification efficiency. The housekeeping gene, *Glossina* tubulin (GenBank accession number HE861503), was used as the reference gene for the normalization



*The fold change represents the ratio of S. glossinidius gene expression level by comparing self-cured flies to control flies fed on a non-infected blood meal.*

calculation of relative expression quantification. Cycle thresholds (Ct) for each reaction were obtained using the MxPRO QPCR Software (Agilent Technologies). Relative quantification was calculated with the 2<sup>−ΔΔCt</sup> method as described by Livak and Schmittgen (2001). Relative quantification for given genes with respect to the calibrator was determined and compared with the normalized expression values resulting from microarray experiments.
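The two calculations involved, primer efficiency from a tenfold dilution series and relative quantification by the 2<sup>−ΔΔCt</sup> method, can be sketched as below. The function names and all Ct values are hypothetical and serve only to show the arithmetic; the fold-change formula assumes that target and reference genes both amplify with close to 100% efficiency.

```python
def pcr_efficiency(log10_dilutions, cts):
    """Efficiency from a standard curve: E = 10^(-1/slope) - 1 (1.0 = perfect doubling).

    The slope is the least-squares slope of Ct against log10(template amount).
    """
    n = len(cts)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(cts) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dilutions, cts))
    s_xx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    slope = s_xy / s_xx
    return 10 ** (-1 / slope) - 1

def fold_change(ct_target_sample, ct_ref_sample, ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method (reference gene: e.g. tubulin)."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2 ** -(delta_sample - delta_calibrator)

# Hypothetical tenfold dilution series: Ct rises by about 3.32 cycles per dilution.
print(round(pcr_efficiency([0, -1, -2, -3], [20.0, 23.32, 26.64, 29.96]), 3))

# Hypothetical Ct values: target gene and tubulin in self-cured flies (sample)
# and in control flies (calibrator).
print(fold_change(24.0, 18.0, 25.0, 18.0))
```

With these invented values the target gene crosses the threshold one cycle earlier in the sample than in the calibrator after normalization to the reference gene, which corresponds to a twofold higher expression.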

## **RESULTS**

The aim of the study was to use microarray analysis to compare the transcriptome of *S. glossinidius* harbored by tsetse flies (*G. p. gambiensis*) that received a non-infected blood meal (control flies) with that of the symbiont harbored by flies that did not become infected despite being fed a *T. b. gambiense*-infected blood meal (refractory flies). The comparison was expected to identify differentially expressed genes, if any. Gene expression was analyzed 10 days after the flies had taken either an infective or a non-infective blood meal. Two-class SAM procedures were used to identify differentially expressed genes with a 5% false discovery rate.

#### **IDENTIFICATION OF DIFFERENTIALLY EXPRESSED GENES**

**Table 2** presents the 17 genes that exhibited significant differential expression between the two groups using the modified t-statistic SAM. Interestingly, all these genes were overexpressed in infection self-cured flies.

We used an unsupervised hierarchical clustering method that grouped genes on the vertical axis and samples on the horizontal axis, on the basis of similarity in their expression profiles. The similarities are summarized in a dendrogram in which the pattern and length of the branches reflect the degree of relatedness of the samples (**Figure 1**).
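How a dendrogram of this kind is built can be shown with a small sketch of average-linkage clustering on a Pearson correlation distance. It is not the Cluster/TreeView implementation: the function names and the four toy profiles are hypothetical, median-centering is omitted, and the brute-force search is only suitable for a handful of profiles.

```python
import math
from itertools import combinations

def pearson_distance(x, y):
    """Distance between two expression profiles: 1 - Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1 - cov / (sd_x * sd_y)

def average_linkage(profiles):
    """Agglomerative clustering; returns the merges as (members_a, members_b, distance)."""
    def linkage(a, b):
        # Average of all pairwise distances between members of the two clusters.
        pairs = [(i, j) for i in a for j in b]
        return sum(pearson_distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)

    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Four hypothetical profiles: 0 and 1 rise across samples, 2 and 3 fall.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.5, 2.0],
]
for members_a, members_b, distance in average_linkage(profiles):
    print(members_a, members_b, round(distance, 3))
```

Each merge corresponds to one node of the dendrogram, and the merge distance gives the height of the branch: profiles with similar shapes join early at short distances, whereas the two opposing groups join last.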

According to microarray data, all significantly differentially expressed genes were overexpressed 1.2- to 1.8-fold in


**FIGURE 1 | Expression profile of** *Sodalis glossinidius* **genes whose transcript levels changed significantly between trypanosome self-cured flies and control flies fed on a non-infected blood meal.** This set of genes was extracted from the full data set (*n* = 2823) using a SAM procedure with a 5% false discovery rate. Each row represents a gene and each column represents a sample. Red and green indicate expression levels above and below the median, respectively. The dendrogram of genes, to the left of the matrix, represents overall similarities in gene expression profiles.

refractory flies with reference to the level of expression in control flies. The *S. glossinidius* gene SG0858\_nagB, which encodes glucosamine-6-phosphate deaminase, is one of the most highly overexpressed in refractory flies (1.5- to 1.7-fold overexpression); this enzyme plays a crucial role in amino sugar and nucleotide sugar metabolism. We also detected increased expression levels of genes involved in purine metabolism, such as ADP-ribose pyrophosphatase (SG0267; 1.4-fold increase), and in D-galactose metabolism, represented by the genes encoding galactokinase (SG0895) and UTP-glucose-1-phosphate uridylyltransferase (SG1367), which were 1.5- and 1.7-fold overexpressed in refractory flies, respectively. The gene encoding the oxidative respiration complex enzyme NADH dehydrogenase (SG1597) appears to be 1.4-fold overexpressed in refractory flies.

In resistant flies, we also found overexpressed genes involved in the exonucleolytic cleavage of DNA, synthesis of amino acids and lipoproteins, as well as in disulfide bond formation and assistance in the conformational maturation of secreted proteins containing disulfide bonds.

Finally, among the most highly overexpressed genes, we identified genes coding for phage tail sheath protein (SG0845; 1.8-fold overexpression) and for phage capsid protein (SG2357; 1.5-fold overexpression).

#### **qPCR CONTROL OF MICROARRAY DATA**

The microarray expression data were validated by quantitative PCR analyses. Four genes showing different expression levels were selected from the microarray data. The results provided by the quantitative PCR analyses were similar to those provided by microarray data, showing consistent expression levels for the four genes (**Figure 2**).

#### **GENE FUNCTIONAL ANNOTATION**

Functional annotation of the differentially expressed *S. glossinidius* genes, with reference to the biological process GO terms and KEGG pathways, was investigated using DAVID software. The analysis showed an overrepresentation of GO terms related to metabolism and biosynthesis processes (**Table 3**). The modified Fisher exact test revealed a 17.6-fold enrichment (*P*-value = 0.055) for the GO term related to hexose metabolism (**Table 3**). Similarly, the galactose metabolism KEGG pathway was found to be 28.5-fold enriched (*P*-value = 0.088) (**Table 4**).
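The logic behind a fold enrichment and its associated *P*-value can be sketched as follows. This is an illustration under stated assumptions and not DAVID's own computation: the counts are hypothetical, the function names are invented, and the "modified" test is approximated here by lowering the observed count by one before taking the hypergeometric tail, which makes the test more conservative in the spirit of DAVID's EASE score.

```python
from math import comb

def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Proportion of annotated genes in the list relative to the background proportion."""
    return (list_hits / list_total) / (pop_hits / pop_total)

def hypergeom_upper_tail(k, list_total, pop_hits, pop_total):
    """P(X >= k) annotated genes when drawing list_total genes without replacement."""
    upper = min(list_total, pop_hits)
    favorable = sum(comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
                    for i in range(max(k, 0), upper + 1))
    return favorable / comb(pop_total, list_total)

# Hypothetical counts: 4 of 20 listed genes carry the term, versus 40 of 2000 in the background.
list_hits, list_total, pop_hits, pop_total = 4, 20, 40, 2000

print(fold_enrichment(list_hits, list_total, pop_hits, pop_total))
p_exact = hypergeom_upper_tail(list_hits, list_total, pop_hits, pop_total)
p_conservative = hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)
print(p_exact, p_conservative)
```

With short gene lists, a large fold enrichment can rest on only a few annotated genes, which is why the conservative variant yields a noticeably larger *P*-value than the exact tail.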

**FIGURE 2 | Comparison of selected gene expression assessed by quantitative PCR and by microarray technologies. (A)** Gene expression was assessed by microarray technology. The n-fold change value was calculated on the basis of normalized data when comparing the level of gene expression from *S. glossinidius* derived from self-cured flies with those of control flies fed on a non-infected blood meal. Error bar represents the standard deviation (SD) between biological replicates. **(B)** Gene expression was assessed by quantitative PCR. Data were analyzed with the 2<sup>−ΔΔCt</sup> method with *Glossina* tubulin gene as a control gene. The n-fold change value represents the mean of the *Sodalis* gene expression level in self-cured flies compared with control. Error bar represents the SD between biological replicates.

## **DISCUSSION**

After being ingested by a tsetse fly taking an infective blood meal, the trypanosome undergoes a complex cycle of differentiation and multiplication in the host midgut. Successful establishment of trypanosomes in the tsetse fly midgut depends on their ability to adapt, transform, grow, and survive rapidly in this new fly midgut environment (Simo et al., 2010). Several factors could influence parasite establishment, among which the tsetse midgut lectin (Welburn and Maudlin, 1999), reactive oxygen

**Table 3 | Biological process gene ontology (GO) terms associated with the set of** *S. glossinidius* **significantly differentially expressed genes obtained with DAVID software.**


*\*Enriched GO term with t-statistic modified test (P-value = 0.055)*

species (MacLeod et al., 2007), and antimicrobial peptides produced by the fly in response to trypanosome infection (Hao et al., 2001). Furthermore, *S. glossinidius* was previously shown to favor tsetse fly infection by trypanosomes (Welburn and Maudlin, 1999). Despite the presence of the symbiont in all insectary tsetse flies, most of the flies are refractory to trypanosome infection (Geiger et al., 2005). In this context, we investigated the transcriptomic events that may occur in bacteria when they are harbored by refractory flies. Using microarray analysis, we investigated *S. glossinidius* genes whose expression could discriminate between *G. p. gambiensis* flies refractory to *T. b. gambiense* infection and control flies.

Most of the genes whose expression was modified (all overexpressed in *Sodalis* from refractory flies relative to *Sodalis* from control flies) are involved in lipoprotein metabolic and biosynthetic processes, as well as in amino sugar and nucleotide metabolism. Bacterial lipoproteins have been shown to play various roles, including nutrient uptake, transport (such as the ABC transport system), and extracytoplasmic folding of proteins (Lampen and Nielsen, 1984; Mathiopoulos et al., 1991; Alloing et al., 1994). The *Sodalis* prolipoprotein diacylglyceryl transferase gene (SG1978), which was found to be overexpressed in self-cured flies, encodes the only enzyme that transfers the diacylglyceryl moiety to the thiol group of cysteine. The importance of this enzyme is underscored by the fact that this post-translational modification is ubiquitous in the bacterial kingdom. The overexpression of enzymes involved in bacterial growth could be a necessary mechanism employed by the bacteria to fight the parasite.

The gene encoding the NADH dehydrogenase complex (SG1597) was found to be overexpressed in *Sodalis* from refractory flies. This enzyme is involved in the oxidative respiration process and allows bacteria to survive in a variety of hostile environments and to adapt quickly in a rapidly changing environment (Richardson, 2000). Furthermore, this enzyme is implicated in the synthesis of ATP, and thus energy metabolism in the prokaryotic cell (Lengeler et al., 2009). In mosquito cells, oxido-reduction mechanisms were used to protect against DENV viral infection (Patramool et al., 2011). Several other overexpressed genes in refractory flies were found to be involved in KEGG pathways and

**Table 4 | Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathways associated with the set of** *S. glossinidius* **significantly differentially expressed genes obtained with DAVID software.**


*\*Enriched KEGG pathway with t-statistic modified test (P-value = 0.088)*

GO terms related to metabolism, such as galactose metabolism, purine metabolism, amino sugar and nucleotide metabolism, as well as hexose metabolism (**Tables 3**, **4**). Thus, *S. glossinidius* might indeed benefit its tsetse host by nutrient supplementation via these compounds. However, why these pathways were overexpressed in refractory flies is unknown. In other studies, increased sugar metabolism enzyme activities due to viral infection have been reported (Klemperer, 1961; El-Bacha et al., 2004; Tchankouo-Nguetcheu et al., 2010). It has been suggested that the increased activity of glycolysis was due to the breakdown of the mitochondrial membrane, which decreased ATP production (Ritter et al., 2010). As a result, the glycolysis pathway was activated to compensate for the lack of energy via the oxidative pathway. However, recent studies have demonstrated alternative functions of sugar metabolism enzymes such as transcriptional regulation or as a regulator or indicator of apoptosis (Kim and Dang, 2005).

In addition, among the most highly overexpressed genes in refractory flies was the gene encoding glucosamine-6-phosphate deaminase (SG0858). This enzyme participates in amino sugar metabolism, particularly in the conversion of glucosamine-6-phosphate into fructose-6-phosphate and ammonium. *S. glossinidius* was suspected to favor trypanosome establishment in the insect midgut through a complex biochemical mechanism involving the production of N-acetyl glucosamine (Welburn and Maudlin, 1999). This sugar, resulting from hydrolysis of pupal chitin by a *S. glossinidius*-produced endochitinase, was reported to inhibit a tsetse midgut lectin lethal for the procyclic forms of the trypanosome (Dale and Welburn, 2001). So, while the presence of this sugar would favor the establishment of trypanosomes in the fly's midgut, its deamination by glucosamine-6-phosphate deaminase may, in contrast, favor fly refractoriness. As a next step, the decrease of N-acetyl glucosamine *in situ* will have to be studied.

Finally, increased transcription of genes coding for phage proteins was recorded in refractory flies when compared to flies fed with a non-infected bloodmeal. These results are in line with those obtained previously when comparing refractory versus infected flies (Hamidou Soumana et al., 2014). Bacteriophage elements have also been identified in other symbiotic associations. This is the case for bacteriophages APSE-1 and APSE-2 in the secondary endosymbiont of aphids, *Candidatus* Hamiltonella defensa, where they are associated with the protective activity of this endosymbiont, which kills parasitoid wasp larvae (Oliver et al., 2003; Moran et al., 2005; Degnan and Moran, 2008). Similarly, a bacteriophage, WO, was characterized in parasitic *Wolbachia*; the phage was suggested to be beneficial for the invertebrate host, as it may be involved in the regulation of the parasitic bacterial load (Bordenstein and Wernegreen, 2004; Bordenstein et al., 2006). Finally, a detailed characterization of mobile genetic elements and pseudogenes revealed the presence of different types of prophage elements that have proliferated across the genome of *S. glossinidius* (Belda et al., 2010). In addition, the presence of viral particles has been observed previously in *Sodalis glossinidius* cultures (Maudlin, personal communication).

Our results highlight the probable role of a bacteriophage as a major actor in tsetse fly refractoriness. The activation of a prophage hosted by *S. glossinidius* could lead to the release of bacterial agonists that trigger the tsetse fly immune system, thereby preventing trypanosome development.

The overall results demonstrate the existence of a molecular dialog between the three partners—the fly, the symbiont, *Sodalis glossinidius*, and the trypanosome—even though the parasite could not establish in the fly's midgut. Some of the overexpressed genes belong to classical metabolic pathways; others, however, may be involved in fly refractoriness. The molecular signal that induces the overexpression of all these genes is unknown. Further investigations are needed to progress in the understanding of the complex tripartite interactions that control the fly vector competence and hence the spread of sleeping sickness.

## **ACKNOWLEDGMENTS**

The authors thank the "Région Languedoc-Roussillon - Appel d'Offre Chercheur d'Avenir 2011," the "Service de Coopération et d'Action Culturelle de l'Ambassade de France au Niger" and the "Institut de Recherche pour le Développement" for their financial support. Illiassou Hamidou Soumana is a PhD student funded by the French Embassy in the Republic of Niger, Service de Coopération et d'Action Culturelle (SCAC). We extend our thanks to Dr Aurélie Bergon (TAGC platform) for her assistance in data registration in the Gene Expression Omnibus data bank.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 March 2014; paper pending published: 22 April 2014; accepted: 09 May 2014; published online: 27 May 2014.*

*Citation: Hamidou Soumana I, Tchicaya B, Loriod B, Rihet P and Geiger A (2014) Identification of overexpressed genes in Sodalis glossinidius inhabiting trypanosome-infected self-cured tsetse flies. Front. Microbiol. 5:255. doi: 10.3389/fmicb.2014.00255 This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Hamidou Soumana, Tchicaya, Loriod, Rihet and Geiger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Microbial interactions and the ecology and evolution of Hawaiian Drosophilidae

## *Timothy K. O'Connor<sup>1†</sup>, Parris T. Humphrey<sup>1†</sup>, Richard T. Lapoint<sup>1</sup>, Noah K. Whiteman<sup>1</sup> and Patrick M. O'Grady<sup>2</sup>\**

<sup>1</sup> Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA

<sup>2</sup> Environmental Science, Policy and Management, University of California Berkeley, Berkeley, CA, USA

#### *Edited by:*

M. Pilar Francino, Center for Public Health Research, Spain

#### *Reviewed by:*

Rob DeSalle, American Museum of Natural History, USA

Jens Walter, University of Nebraska, USA

#### *\*Correspondence:*

Patrick M. O'Grady, Environmental Science, Policy and Management, University of California Berkeley, 137 Hilgard Hall, Berkeley, CA 94720, USA e-mail: ogrady@berkeley.edu

†Timothy K. O'Connor and Parris T. Humphrey have contributed equally to this work.

Adaptive radiations are characterized by an increased rate of speciation and an expanded range of habitats and ecological niches exploited by those species. The Hawaiian Drosophilidae is a classic adaptive radiation; a single ancestral species colonized Hawaii approximately 25 million years ago and gave rise to two monophyletic lineages, the Hawaiian Drosophila and the genus Scaptomyza. The Hawaiian Drosophila are largely saprophagous and rely on approximately 40 endemic plant families and their associated microbes to complete development. Scaptomyza are even more diverse in host breadth. While many species of Scaptomyza utilize decomposing plant substrates, some species have evolved to become herbivores or parasites of spider egg masses, or to exploit microbes on living plant tissue. Understanding the origin of the ecological diversity encompassed by these nearly 700 described species has been a challenge. The central role of microbes in drosophilid ecology suggests bacterial and fungal associates may have played a role in the diversification of the Hawaiian Drosophilidae. Here we synthesize recent ecological and microbial community data from the Hawaiian Drosophilidae to examine the forces that may have led to this adaptive radiation. We propose that the evolutionary success of the Hawaiian Drosophilidae is due to a combination of factors, including adaptation to novel ecological niches facilitated by microbes.

**Keywords: Hawaiian** *Drosophila***,** *Scaptomyza***, symbiosis, fungi,** *Pseudomonas***, herbivory, adaptive radiation**

#### **INTRODUCTION**

Symbioses are broadly defined as persistent interactions between two or more species. While one view of symbioses is restricted to mutualistic relationships, most biologists now consider any type of long-standing interaction between species (e.g., commensalism, mutualism, parasitism) as a symbiotic relationship (Bradford and Schwab, 2013). Many insects have developed intimate evolutionary interactions with microbes that enhance nutrient acquisition or reproduction (Moran and Baumann, 2000; Currie et al., 2003, 2006; Mikheyev et al., 2007; Werren et al., 2008; Hansen and Moran, 2014), or defense against natural enemies (Oliver et al., 2009; Jaenike et al., 2010). A different type of relationship is seen in saprophagous insects, such as many fly species in the family Drosophilidae, which require microbes to break down plant material and make nutrients available for uptake.

Yeasts and bacteria associated with drosophilid flies can influence mating behavior, oviposition behavior, larval feeding choice, and food processing, and these ecological roles can have important evolutionary consequences for insects. Here we propose the Hawaiian Drosophilidae as a model system for studying the role of microbial associations in insect diversification. We focus on two systems: (1) the fungal associates of the largely saprophagous Hawaiian *Drosophila* lineage and (2) the bacterial species encountered by drosophilids, especially herbivorous members of the genus *Scaptomyza*.

#### **HAWAIIAN DROSOPHILIDAE**

The Hawaiian Drosophilidae is one of the best-characterized examples of an adaptive radiation (Carson and Kaneshiro, 1976). This group includes 687 described species (Magnacca and Price, 2012) and 200–300 more taxa that await description (O'Grady et al., 2011). Hawaiian Drosophilidae have adapted to a diverse array of niches and plant substrates (Kambysellis and Craddock, 1997), and their interactions with microbes are a central part of *Drosophila* ecology. Microbes have been implicated in providing direct and indirect nutrition sources (Northrop, 1918; Starmer and Aberdeen, 1990), generating chemosensory signals (Dobzhansky et al., 1956; Melcher and Pankratz, 2005), and extensively colonizing larvae and adults (Gilbert, 1980; Ganter, 1988; Coluccio et al., 2008). Although microbes can influence insect ecology (Feldhaar, 2011), promote speciation (Brucker and Bordenstein, 2012, 2013; Joy, 2013), and promote niche differentiation (Janson et al., 2008; Joy, 2013), the potential role of microbes in the diversification of Hawaiian Drosophilidae has not been explored in depth.

Drosophilidae is the oldest known lineage of endemic Hawaiian plants or insects (Price and Clague, 2002). A single colonizing species is estimated to have arrived in the Hawaiian Islands ∼25 million years ago (Thomas and Hunt, 1993; Russo et al., 1995), although recent estimates suggest a slightly older age (Tamura et al., 2004; Obbard et al., 2012). The Hawaiian Drosophilidae has since radiated into two species-rich lineages, the endemic Hawaiian *Drosophila* and the cosmopolitan genus *Scaptomyza*. Although the inclusion of *Scaptomyza* within a larger *Drosophila* group is confusing taxonomically, this is due to the large-scale polyphyly of the genus *Drosophila* (O'Grady and DeSalle, 2008; O'Grady et al., 2008a,b; O'Grady and Markow, 2009; O'Grady, 2010). The Hawaiian Drosophilidae (Hawaiian *Drosophila* + *Scaptomyza*) is strongly supported as monophyletic in every rigorous phylogenetic study (Throckmorton, 1975; Thomas and Hunt, 1991, 1993; Baker and DeSalle, 1997; Remsen and DeSalle, 1998; Bonacum, 2001; Remsen and O'Grady, 2002; O'Grady and DeSalle, 2008; O'Grady et al., 2011). Most members of the genus *Drosophila*, including those endemic to Hawaii, are saprophagous and have adapted to a diverse array of substrates for oviposition, larval development and adult nutrition (Markow and O'Grady, 2005, 2008). While many *Scaptomyza* species are saprophagous on a variety of larval substrates, including plant leaves and flowers, some now specialize on spider egg sacs or land snails (Magnacca et al., 2008), and herbivory has evolved at least once within this lineage (Lapoint et al., 2013).

#### **HAWAIIAN** *Drosophila* **AND ASSOCIATED YEASTS**

The Hawaiian Drosophilidae (**Figure 1A**) utilize nearly 40% of the native Hawaiian plant families and an array of substrate types (leaves, bark, fruits, sap flux, fungus; Magnacca et al., 2008). Hawaiian *Drosophila* adults use volatile compounds as cues to identify host plants and stimulate mating and oviposition, although the identity and origin of these cues are unknown (Ohta, 1978). Among the closely related species in the cactophilic *Drosophila repleta* group, such cues can include byproducts of microbial metabolism (Fogleman and Foster, 1989), raising the possibility that host finding in the Hawaiian Drosophilidae may also be microbially mediated.

Ort et al. (2012) surveyed four endemic Hawaiian plants (**Figure 1B**) to determine whether microbial communities played a role in host plant specificity in Hawaiian *Drosophila*. Over 160 OTUs, representing 113 genera and 13 fungal classes, were discovered (**Figure 1C**). Ort et al. (2012) found little sharing of fungal taxa between different substrates, and fungal communities differed significantly between substrate types (e.g., leaves *vs*. stems) and among plant genera. Different substrates thus support correspondingly distinct fungal communities, which may provide unique oviposition cues or nutrition to flies that use those substrates.

Compared with those of their host plants, the fungal communities of the two *Drosophila* species examined, *Drosophila imparisetae* and *Drosophila neutralis*, were simple: only seven or eight fungal lineages were present (**Figure 1D**). This suggests that *Drosophila* vector a limited number of fungal species from plant to plant. Interestingly, the most abundant fungal class associated with *Drosophila* adults, Saccharomycetes, was only modestly represented in the *Cheirodendron* leaf samples (**Figure 1C**), suggesting that Hawaiian flies select and vector their own yeasts from rotting plant to rotting plant, as in the cactophilic *Drosophila* (Barker and Starmer, 1982).

*Drosophila*-associated microbes may contribute to reproductive isolation of closely related species, which is critical to sustaining adaptive radiations. In the cactophilic species *Drosophila buzzatii*, heritable variation in oviposition behavior is mediated by attraction to different yeasts (Barker et al., 1994), which might contribute to assortative mating among genotypes. Isolation among races and species of cactophilic *Drosophila* may have evolved due to chemical variation in larval substrates that are a combination of necrotic host plant tissues and microbial communities (Etges and de Oliveira, 2014). Combined with evidence that bacterial communities play a direct role in mating preference in *Drosophila melanogaster* (Sharon et al., 2010), this suggests that microbes can directly or indirectly influence speciation of drosophilid flies through mechanisms that are dependent on larval feeding substrates. Ohta (1980) provides strong evidence that post-mating barriers involving large chromosomal inversions in Hawaiian *Drosophila* can be explained by variation in host plant use. Thus, the combined effects of host plant phenotype and microbial community phenotypes have likely played an important role in driving the diversification of Hawaiian Drosophilidae.

#### *Scaptomyza* **AND ASSOCIATED BACTERIA**

Most drosophilid flies likely feed on microbes associated with rotting vegetation or on the fruiting bodies of fungi. However, in a few lineages (including *Scaptomyza*), feeding on living plant tissues as a primary source of nutrition (herbivory) has evolved. This transition to herbivory is a remarkable feature of the Hawaiian Drosophilidae radiation, the challenges of which are underscored by the paucity of insect orders with herbivorous members (Mitter et al., 1988). Given the change in the nature of the relationship between *Scaptomyza* and its food source, the nature of its relationship to its microbial communities is expected to also change. It is likely that herbivorous drosophilid lineages encounter distinct groups of plant-associated microbes that may influence fly adaptation to these larval substrates.

Leaf-mining *Scaptomyza* larvae are extensively colonized by plant-associated bacteria during feeding. Gut bacterial composition of larval *Scaptomyza flava* collected from leaves of wild *Barbarea vulgaris* (Brassicaceae) resembled that of their host plant more than that of any other drosophilid: 99.7% of gut bacterial sequences matched OTUs found in *Barbarea vulgaris* (**Figure 2A**). Pseudomonadaceae predominated in *Scaptomyza flava* guts and *Barbarea vulgaris* leaves (**Figure 2B**). Also found in both samples was *Enterobacter cloacae*, a species that includes some strains that degrade isothiocyanates (Tang et al., 1972), a group of potent foliar toxins in the Brassicaceae that are released upon plant wounding. Although untested, *Enterobacter cloacae* or other bacteria may supplement the endogenous isothiocyanate detoxification abilities of *Scaptomyza*, which involve modification of ancient, evolutionarily conserved detoxification pathways (Gloss et al., 2014).
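The overlap statistic quoted above (the share of gut bacterial sequences that match OTUs also found in the host plant) can be illustrated with a minimal Python sketch. The function name `shared_read_fraction`, the OTU labels, and the counts are all hypothetical and chosen only for illustration; they are not the authors' pipeline or data.

```python
def shared_read_fraction(gut_counts, plant_counts):
    """Fraction of gut reads assigned to OTUs that also occur in the plant sample.

    Both arguments map an OTU identifier to a read count (hypothetical format).
    """
    # OTUs detected at least once in the plant sample.
    plant_otus = {otu for otu, n in plant_counts.items() if n > 0}
    total_gut_reads = sum(gut_counts.values())
    if total_gut_reads == 0:
        raise ValueError("gut sample contains no reads")
    # Gut reads that belong to OTUs shared with the plant.
    shared_reads = sum(n for otu, n in gut_counts.items() if otu in plant_otus)
    return shared_reads / total_gut_reads


# Invented counts, for illustration only.
gut = {"OTU_1": 80, "OTU_2": 15, "OTU_3": 5}
plant = {"OTU_1": 300, "OTU_2": 40, "OTU_4": 60}
print(f"{shared_read_fraction(gut, plant):.1%} of gut reads match plant OTUs")
```

Note that the statistic is weighted by reads rather than by OTUs, so a few abundant shared taxa (such as Pseudomonadaceae here) can account for most of the overlap.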

imparisetae and Drosophila neutralis, both oviposit in rotting leaves of Araliaceae (Cheirodendron). **(C)** The phylogeny of the fungal taxa present in decomposing leaves from Cheirodendron (and other Hawaiian plant taxa)

the adult Drosophila species sampled. Aspects of this figure have been modified with permission from O'Grady et al. (2011), Ort et al. (2012), and Lapoint et al. (2013).

**FIGURE 2 | Observational and experimental evidence linking** *Scaptomyza* **ecology with microbes. (A,B)** The gut bacterial community of Scaptomyza flava larvae more closely resembles that of a host plant (Barbarea vulgaris) than that of other drosophilids. Field-collected Scaptomyza flava and Barbarea vulgaris microbial communities were characterized with Illumina sequencing (according to Caporaso et al., 2012) and compared to other Drosophila communities sequenced by Chandler et al. (2011). **(A)** Principal coordinate analysis (PCoA) of unweighted UniFrac distances summarizing differences in gut bacterial community composition among species. Gut bacteria of Scaptomyza flava are significantly different from those of other drosophilids (PERMANOVA, P < 0.001). **(B)** The relative abundance of bacteria classified to Pseudomonadaceae is greater in Scaptomyza flava than in other drosophilids and comparable to levels found in Barbarea vulgaris. **(C)** Pre-treating Scaptomyza flava with antibiotics reduces feeding and fecundity on Arabidopsis thaliana. Lab-reared flies were fed for 4 days on 5% sucrose with (Rif+) or without (Rif–) 50 μg/mL rifampicin (Sigma). Treatments were delivered into feeding chambers with 5 μL microcapillary tubes that were refreshed daily. After 4 days, all surviving flies were randomized to cages with plants of either line (wild-type Col-0 or glucosinolate knock-out GKO) and were allowed to feed and oviposit for 24 h, after which feeding punctures and eggs on each plant were counted. The numbers of feeding punctures and eggs per plant were normalized by the number of flies released into each cage (each Rif– condition had 14 females; each Rif+ condition had 23 females). Data are presented as boxplots with 50% quantiles around the medians (dots) and symmetrical marginal frequency distributions. \*\*P < 0.01, \*P < 0.05 (Mann–Whitney U-test, two-tailed). **(D)** Scaptomyza flava females enhance transmission of Pseudomonas syringae between A. thaliana leaves in the laboratory; Pseudomonas syringae grew in un-treated leaves only when flies were present. Three lower leaves of 5-week-old A. thaliana Col-0 were infected with 10<sup>5</sup>/ml Pseudomonas syringae pv. maculicola str. 4326. Four days later, leaves were removed and petioles were inserted into 60 mm petri dishes containing 1% Phytagel to maintain leaf hydration. Equal numbers of infected or un-infected leaves were randomized into one of two mesh cages. Into one of these cages, we released 20 adult female Scaptomyza flava for 2 days, after which feeding punctures were counted on all leaves. Leaf discs were taken and homogenized in 10 mM MgSO4 and dilution-plated onto King's B medium, and fluorescent colonies were counted 4 days later. Error bars indicate standard errors; \*\*P < 0.01, 'ns' non-significant, unpaired t-tests on log10-transformed CFU counts.

One way herbivorous *Scaptomyza* may have adapted to feeding on living plant tissue is by acquiring novel microbial symbionts that aid in substrate utilization, for instance by catabolizing polysaccharides or plant secondary compounds. Bacteria that can metabolize plant-derived molecules in ways that contribute to insect fitness are most likely to be found already associated with the novel substrate (reviewed in Janson et al., 2008; Mason et al., 2014). Indeed, recent work suggests *Scaptomyza flava* depends upon its gut bacteria for fitness within plants. Laboratory-reared *Scaptomyza flava* treated with the antibiotic rifampicin showed reduced feeding rates and lower fecundity on *Arabidopsis thaliana* (*Arabidopsis*) compared to control flies (**Figure 2C**). This effect does not appear to be due to direct antibiotic toxicity, because treatment and control flies survived at similar rates, nor to interactions with plant defensive compounds, because results were consistent when flies were reared on wild-type *Arabidopsis* as well as on a mutant line deficient in the production of two defensive compounds. Although particular bacteria have not been implicated, this experiment indicates that gut bacterial communities may be important for degrading or processing plant tissues.
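The comparison reported for the antibiotic experiment rests on two steps given in the Figure 2C legend: per-plant counts are divided by the number of females released into the cage, and treatments are compared with a two-tailed Mann–Whitney U-test. The Python sketch below shows one way these steps could be computed. It is an illustration under stated assumptions, not the authors' analysis: the helper names are hypothetical, the P-value uses the large-sample normal approximation with tie and continuity corrections (the source does not say whether an exact or asymptotic test was used), and the example counts are invented.

```python
import math
from statistics import NormalDist


def per_fly(counts_per_plant, n_flies):
    """Normalize per-plant counts by the number of flies released into the cage."""
    return [c / n_flies for c in counts_per_plant]


def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U-test (normal approximation; an assumption here).

    Returns (U statistic of sample x, approximate two-sided P-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j.
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical
    z = max(abs(u_x - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    return u_x, min(1.0, 2 * (1 - NormalDist().cdf(z)))


# Invented feeding-puncture counts per plant, for illustration only.
rif_minus = per_fly([70, 56, 84, 98], n_flies=14)
rif_plus = per_fly([46, 23, 69, 35], n_flies=23)
u, p = mann_whitney_u(rif_minus, rif_plus)
print(f"U = {u}, approximate two-sided P = {p:.3f}")
```

Normalizing by the number of flies matters because the two treatments had different cage densities (14 versus 23 females), so raw counts per plant would not be comparable. For samples as small as those in this sketch, an exact test would usually be preferred over the normal approximation.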

Diffuse interactions between insects, bacteria, and plants may also be involved in the evolution and maintenance of herbivory. For instance, insects can vector plant-associated bacteria between plants, including common pathogens, to make those plants more suitable hosts. *Drosophila melanogaster* can serve as both host and vector to the plant pathogen *Erwinia carotovora* (Basset et al., 2000, 2003), and several insects vector *Pseudomonas syringae* between plants (Snyder et al., 1998; Stavrinides et al., 2009).

*Scaptomyza flava* adult female flies can enhance the transmission of a model pathogenic strain of *Pseudomonas syringae* (**Figure 2D**); *Pseudomonas syringae* moved from pre-inoculated *Arabidopsis* leaves to un-treated control leaves in cages with adult female flies added, while control leaves only showed background levels of non-*Pseudomonas syringae* bacteria in cages without flies (**Figure 2D**). Not only are *Pseudomonas* spp. widespread within tissues of *Scaptomyza flava* and its host plant, but *Pseudomonas syringae* is overrepresented within plant leaves damaged by a close relative, *Scaptomyza nigrita*, in the wild (Humphrey et al., 2014). Furthermore, experimental plant infection with *Pseudomonas* species can enhance feeding by adults of the specialist *Scaptomyza nigrita* in the laboratory, indicating that plant exposure to certain *Pseudomonas* spp. can induce susceptibility to this herbivore (Humphrey et al., 2014). Thus, the potential exists for drosophilid herbivores to frequently encounter and transmit *Pseudomonas* species between host plants in ways that enhance insect fitness. Frequent exposure to, and transmission of, defense-altering microbes can ultimately lead to novel and potentially mutualistic interactions between microbes and insects (Luan et al., 2012).
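The transmission assay in Figure 2D was scored by dilution plating and compared with unpaired t-tests on log10-transformed CFU counts. As a hedged illustration of that arithmetic, the Python sketch below back-calculates CFU per mL of homogenate from a plate count, log-transforms the values, and computes an unpaired t statistic. The function names, the pseudocount used to handle plates with zero colonies, the choice of the pooled-variance (Student) form of the test, and the plating volumes in the usage lines are assumptions made for this example and are not stated in the source.

```python
import math
from statistics import mean, variance


def cfu_per_ml(colonies, dilution, plated_ml):
    """Back-calculate CFU per mL of homogenate from one plate.

    dilution is the fraction of the original sample on the plate (e.g. 1e-3).
    """
    return colonies / (dilution * plated_ml)


def log10_cfu(values, pseudocount=1.0):
    """log10-transform CFU values; the pseudocount (an assumption) avoids log(0)."""
    return [math.log10(v + pseudocount) for v in values]


def student_t(a, b):
    """Unpaired t statistic with pooled variance (assumed form) and its df."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Invented plate counts (colonies at a 1e-3 dilution, 0.1 mL plated), illustration only.
with_flies = log10_cfu([cfu_per_ml(c, 1e-3, 0.1) for c in (42, 65, 51)])
without_flies = log10_cfu([cfu_per_ml(c, 1e-3, 0.1) for c in (0, 1, 0)])
t, df = student_t(with_flies, without_flies)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The log transform is used because bacterial counts typically span orders of magnitude, which makes raw CFU values strongly right-skewed. Converting the t statistic to a P-value requires the t distribution, which is not in the Python standard library, so that step is left out of this sketch.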

Inoculating plants with bacteria may allow *Scaptomyza* species to subvert anti-herbivore defense by exploiting mutual antagonism between plant defense pathways, including the canonical plant defense hormones salicylic acid (SA) and jasmonic acid (JA). The SA pathway, typically triggered by bacterial infections, represses the JA pathway, which is typically triggered by chewing herbivores (Thaler et al., 2012). Bacterial infection can thus alter plant chemistry in ways that also affect herbivores, including via mechanisms independent of SA–JA antagonism (Cui et al., 2002, 2005; Groen et al., 2013). The Colorado potato beetle was recently shown to locally disable plant defenses by secreting bacteria (including *Pseudomonas syringae*) into plant tissues while feeding (Chung et al., 2013), likely via SA–JA antagonism. Actively suppressing host plant defenses with bacteria may be a common behavior among herbivorous insects, including *Scaptomyza*.

Plant interactions with leaf-colonizing bacteria are ubiquitous, and the phenotypic impacts of bacteria on plants have likely always been a part of the context in which insect herbivory evolves. Indirect interactions are numerous in diverse ecological communities, and can impact selection on focal herbivore traits such as feeding preference both within (Utsumi et al., 2012) and between plant host individuals (Tack et al., 2012). Specifically, herbivores within the Hawaiian Drosophilidae hold great potential to shed light on the role of plant-mediated indirect effects in the ecology and evolution of herbivore traits, given that the focal players are themselves, or are related to, genetic model organisms (Whiteman et al., 2011).

#### **HOST SHIFTS AND SYMBIOSES**

Plant defenses, especially secondary compounds, are key obstacles to host plant switching for herbivorous insects (Ehrlich and Raven, 1964). The tendency of herbivorous insect lineages to specialize upon plants with similar secondary chemistry suggests that mechanisms to overcome these defenses may be difficult for already-specialized insects to evolve (Berenbaum et al., 1989). However, many microbes are known to degrade plant secondary compounds and structurally similar chemicals (Winkelmann, 1992). Freeland and Janzen (1974) hypothesized that mammalian herbivores might rely on gut bacteria to detoxify secondary compounds of novel plants during host shifts, and a similar bacterial role has been suggested for insects (Broderick et al., 2004; Dillon and Dillon, 2004; Janson et al., 2008). External microbial associates, including yeasts, also have great detoxifying potential for saprophagous species that must also contend with plant secondary chemistry. Because associations with symbiotic microbes are likely to be more evolutionarily labile than endogenous detoxification mechanisms, symbiosis might facilitate colonization of and adaptation to novel plant substrates.

The detoxification abilities of *Drosophila*-associated yeasts have been extensively demonstrated in the cactophilic *Drosophila*. *Dipodascus* and *Pichia* species found on *Stenocereus thurberi* cacti hydrolyze plant lipids that inhibit both larval fly and yeast growth (Starmer, 1982), while *Candida* and *Cryptococcus* species consume byproducts of cactus fermentation (2-propanol and acetone), which are toxic to *Drosophila mojavensis* (Starmer et al., 1986). Fungi associated with other insects degrade a wide variety of common plant secondary compounds, including tannins, terpenes, chlorinated hydrocarbons, and phenolics, among others (Dowd, 1992). The Hawaiian *Drosophila* utilize a chemically diverse collection of host plants and may similarly benefit from the detoxifying activity of yeasts.

Examples of detoxification by insect gut bacteria are limited, but several recent reports suggest this phenomenon may be more widespread than is currently appreciated. Terpene-degrading bacteria are associated with pine bark beetles (*Dendroctonus ponderosae*) that mine galleries in terpene-rich subcortical tissues of pines (Adams et al., 2013; Boone et al., 2013), and gypsy moths (*Lymantria dispar*) rely upon plant-derived bacteria to supplement endogenous detoxification of phenolic glycosides (Mason et al., 2014). These results suggest that environmentally acquired bacteria can be important contributors to insect fitness by directly detoxifying plant compounds, which may facilitate invasion of novel niches.

Plant-associated bacteria may also contribute to insect detoxification capacities via horizontal gene transfer from ingested bacteria to gut residents. Such a transfer has been described in humans, where the bacterium *Bacteroides plebeius* from Japanese populations apparently acquired genes from marine Bacteroidetes that degrade seaweed polysaccharides (Hehemann et al., 2010). Like human guts, insect guts are hotspots of horizontal gene transfer (reviewed in Dillon and Dillon, 2004). Conjugative plasmids are shared promiscuously within the guts of silkworm larvae (Watanabe et al., 1998), although gut conditions may not be conducive to natural transformation (Ray et al., 2007). Some pathways for plant secondary compound detoxification are encoded in small genomic regions that might facilitate their transfer, such as genes in the pyrrolidine pathway responsible for nicotine catabolism in plant-associated *Pseudomonas putida* (Tang et al., 2013). Whether gene transfer from environmental bacteria to *Drosophila* gut bacteria is common is not yet known.

#### **CONCLUDING REMARKS**

Understanding the factors that generated and maintain the staggering diversity of the Hawaiian Drosophilidae has informed general hypotheses of how other organisms diversify. The diversity of these flies appears to be due to many different factors including geography, mating behaviors, and ecology. We propose that the interaction between the Hawaiian Drosophilidae, their host plants, and host-associated microbes is another important aspect driving the diversification of the Hawaiian Drosophilidae and/or maintaining this diversity. The role of microbial associates in nearly every aspect of drosophilid ecology—development, nutrition, host finding, and reproduction—presents many opportunities for those microbes to influence diversification. Experimental data from other *Drosophila* species suggest that bacteria, yeast and other fungi may allow host shifts and even trophic shifts, as well as instigate major changes in mating behaviors that subdivide populations in the Hawaiian Drosophilidae. Because yeasts are critical to host finding and substrate processing, Hawaiian *Drosophila* may not be radiating on host plants directly, but instead on fungal diversity. Additional studies are poised to expose the importance of these interactions.

#### **ACKNOWLEDGMENTS**

Support for this research was provided by the National Science Foundation (DEB-1256758 to Noah K. Whiteman; DDIG DEB-1309493 to Parris T. Humphrey; GRF and IGERT to Timothy K. O'Connor), the John Templeton Foundation (Grant ID #41855 to Noah K. Whiteman), the National Institutes of Health (K12-GM000708 PERT to Richard T. Lapoint) and the National Geographic Society (9097-12 to Noah K. Whiteman).

#### **REFERENCES**


of plant defense compounds. *J. Chem. Ecol.* 39, 1003–1006. doi: 10.1007/s10886-013-0313-0


Winkelmann, G. (1992). *Microbial Degradation of Natural Products*. New York: VCH.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 May 2014; accepted: 29 October 2014; published online: 18 December 2014.*

*Citation: O'Connor TK, Humphrey PT, Lapoint RT, Whiteman NK and O'Grady PM (2014) Microbial interactions and the ecology and evolution of Hawaiian Drosophilidae. Front. Microbiol. 5:616. doi: 10.3389/fmicb.2014.00616*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 O'Connor, Humphrey, Lapoint, Whiteman and O'Grady. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Structural and functional changes in the gut microbiota associated to *Clostridium difficile* infection

## *Ana E. Pérez-Cobas 1,2, Alejandro Artacho1, Stephan J. Ott 3,4, Andrés Moya1,2, María J. Gosalbes 1,2† and Amparo Latorre1,2\*†*

*<sup>1</sup> Unidad Mixta de Investigación en Genómica y Salud de la Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valenciana (FISABIO) y el Instituto Cavanilles de Biodiversidad y Biología Evolutiva de la Universitat de València, València, Spain*

*<sup>2</sup> CIBER en Epidemiología y Salud Pública, Madrid, Spain*

*<sup>3</sup> Institute for Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany*

*<sup>4</sup> Department for Internal Medicine, University Hospital Schleswig-Holstein, Kiel, Germany*

#### *Edited by:*

*Monica Medina, Pennsylvania State University, USA*

#### *Reviewed by:*

*Sébastien Duperron, Université Pierre et Marie Curie, France; Elisabeth Margaretha Bik, Stanford University School of Medicine, USA*

#### *\*Correspondence:*

*Amparo Latorre, Instituto Cavanilles de Biodiversidad y Biología Evolutiva, Universitat de València, C/ Catedrático José Beltrán 2, 46980 Paterna (València), PO Box 46071, València, Spain e-mail: amparo.latorre@uv.es*

*†These authors have contributed equally to this work.*

Antibiotic therapy is a major cause of severe disturbances in microbial communities. In healthy individuals, the gut microbiota prevents infection by harmful microorganisms through direct inhibition (release of antimicrobial compounds), competition, or stimulation of the host's immune defenses. However, widespread antibiotic use has resulted in short- and long-term shifts in gut microbiota structure, leading in some cases to a loss of colonization resistance. Consequently, some patients develop *Clostridium difficile* infection (CDI) after taking an antibiotic (AB) and, at present, this opportunistic pathogen is one of the main causes of antibiotic-associated diarrhea in hospitalized patients. Here, we analyze compositional and functional differences in the gut microbiota of *C. difficile*-infected vs. non-infected patients, both groups having received AB therapy. To do so, we used 16S rRNA gene and metagenomic 454-based pyrosequencing approaches. Samples were taken before, during and after AB treatment and were checked for the presence of the pathogen. We performed different analyses and comparisons between infected (CD+) and non-infected (CD−) samples, which allowed us to propose candidate taxa and functions that might protect against *C. difficile* colonization. Most of these potentially protective taxa belonged to the Firmicutes phylum, mainly to the order Clostridiales, while some candidate protective functions were related to aromatic amino acid biosynthesis and stress response mechanisms. We also found that CDI patients showed, in general, lower diversity and richness than non-infected patients, as well as an overrepresentation of members of the families Bacteroidaceae, Enterococcaceae and Lactobacillaceae and of Clostridium clusters XI and XIVa. Regarding metabolic functions, we detected a higher abundance of genes involved in the transport and binding of carbohydrates, ions, and other compounds, as a response to an antibiotic environment.

**Keywords: Gut microbiota, bacterial composition, metabolic functions,** *C. difficile* **infection, colonization resistance**

## **INTRODUCTION**

The human intestinal microbiota is involved in many host functions, such as food processing, regulation of intestinal epithelium growth, immune system development, synthesis of essential vitamins, and protection against pathogens (Hooper et al., 2002; Guarner and Malagelada, 2003; Hattori and Taylor, 2009; Leser and Mølbak, 2009; Montalto et al., 2009). Because of its role in human health, imbalances in the gut microbiota have been associated with pathologies such as inflammatory bowel disease, diabetes, obesity, or Crohn's disease (Kang et al., 2010; Sekirov et al., 2010; Morgan et al., 2012; Shanahan, 2013). Antibiotic (AB) therapy has been crucial for treating bacterial infections for over half a century, but it strongly disturbs the gut commensal bacteria and, consequently, the beneficial functions they perform (Jernberg et al., 2010; Willing et al., 2011; Pérez-Cobas et al., 2013a). In fact, AB usage has been associated with short- and long-term changes in the intestinal microbiota that reduce colonization resistance to opportunistic pathogens such as *Clostridium difficile* (Vollaard and Clasener, 1994; Bartlett, 2002; Jernberg et al., 2010; Reeves et al., 2011; Britton and Young, 2012). *C. difficile* is an anaerobic, spore-forming, Gram-positive, toxin-producing bacterium and the most common cause of nosocomial diarrhea; broad-spectrum ABs constitute one of the primary risk factors for infection by this pathogen (Hookman and Barkin, 2009; Cohen et al., 2010). Under normal conditions, the human gut microbiota is able to prevent pathogen invasion through general mechanisms such as direct inhibition (by releasing inhibitory compounds such as bacteriocins), nutrient depletion (consuming growth-limiting nutrients) or stimulation of host immune defenses (reviewed in Stecher and Hardt, 2011). The exact mechanism by which the microbiota protects against *C. difficile* infection (CDI), preventing its growth and virulence, is still unknown. In this regard, direct antagonism has been found *in vitro*, since *C. difficile* is a target of a bacteriocin produced by an intestinal strain of *Bacillus thuringiensis* (Britton and Young, 2012). Since the gut microbiota participates actively in the fermentation of dietary carbohydrates, amino acid and lipid metabolism and protein digestion, Theriot et al. used a metabolic model of CDI in mice and found that ABs affect all these functions, leading to a disturbed functional state of the microbiota that favors *C. difficile* germination and growth (Theriot et al., 2014). Moreover, gut microorganisms participate in the transformation of bile acids, which otherwise stimulate *C. difficile* spore germination and growth (Britton and Young, 2012). Thus, the loss of key taxa that play these roles can trigger a structural and functional imbalance, allowing colonization by this opportunistic pathogen.

In recent years, high-throughput molecular techniques, such as 16S rRNA gene sequence analyses (taxonomic composition of microbial communities), metagenomics (genetic potential of microbial communities) and other meta-"omics" (metatranscriptomics, metaproteomics, metabolomics), have extended our knowledge of intestinal microbiota diversity and functions (Gill et al., 2006; Kurokawa et al., 2007; Zoetendal et al., 2008; Tap et al., 2009; Gosalbes et al., 2011; Pérez-Cobas et al., 2013a,b). Some of these approaches have recently been used to address the effects of ABs on the gut ecosystem (Dethlefsen et al., 2008; Antonopoulos et al., 2009; Dethlefsen and Relman, 2010; Jakobsson et al., 2010; Antunes et al., 2011; Pérez-Cobas et al., 2013b), showing that ABs considerably alter the gut microbial ecology and host-microbiota interactions (Pérez-Cobas et al., 2013a). The response of the microbiota to ABs is related to properties of the agent, such as its antimicrobial effect, mode of action, dosage, duration of treatment, and route of administration (Jernberg et al., 2010; Pérez-Cobas et al., 2013b). In addition, biological factors of the host-microbial ecosystem itself, such as taxonomic and functional composition, resistance gene reservoir, or host immune homeostasis, also contribute to the microbial community shifts associated with AB therapy (Jernberg et al., 2010; Willing et al., 2011; Relman, 2012). To date, few studies have aimed to ascertain whether specific changes in microbiota composition due to AB therapy lead to CDI. Past surveys have shown that the diversity of the intestinal microbiota is significantly reduced in patients prior to and/or during CDI, and have reported important structural changes associated with infection (Antharam et al., 2013; Vincent et al., 2013).

The main goal of the present follow-up study is to gain insight into the development of CDI and its relation to an altered human gut microbiota. We used 16S rRNA gene and metagenomic approaches to characterize the structure and functions of the intestinal microbiota before, during and after broad-spectrum AB therapy in patients who developed CDI. In two previous studies we explored the effect of broad-spectrum ABs on human gut microbiota composition and function in patients who did not develop CDI at any time (Pérez-Cobas et al., 2013a,b). Comparative analyses of these two groups of patients identified bacterial taxa and metabolic functions associated with infection status, as well as specific taxa and functions that could protect against *C. difficile* and thus contribute to the colonization resistance of the human gut microbiota.

## **MATERIALS AND METHODS**

#### **SAMPLE COLLECTION**

Three patients under AB therapy at the Department of Internal Medicine of the University Hospital Schleswig-Holstein, Kiel, Germany, were recruited for the study because they were positive for *C. difficile* at some time point during treatment. The patients were older than 65 years and had received no antibiotic therapy in the 6 months before hospital admission. The diagnoses on admission were ischaemic colitis, sigmoid diverticulitis and infection of unknown origin for the patients referred to as F, G, and H, respectively. The patients stayed in the hospital during the AB therapy. Written, informed consent was obtained from all the subjects.

Fecal samples were collected from the three patients before, during and after AB treatment in sterile tubes and stored at −80 °C until all samples were processed together. Fecal samples were screened by multiplex PCR for the *C. difficile* toxin genes *tcdA* and *tcdB* and the triose phosphate isomerase gene (*tpi*); samples positive for all three genes were considered *C. difficile* positive (referred to as CD+, whereas CD− is used for the remaining samples). Patients F and H were found positive after 16 and 35 days of AB treatment, respectively, whereas patient G was found positive on entering hospital (**Table 1**). The three patients presented diarrhea during AB therapy.

In two previous studies we evaluated the effect of broad-spectrum antibiotics on five patients (A, B, C, D, E) using approaches similar to those presented in this work (16S rRNA gene and metagenomics), as part of the same research survey (Pérez-Cobas et al., 2013a,b), which was approved by the Ethical Committee of the University Hospital Schleswig-Holstein. None of these patients developed CDI (they were negative in the multiplex PCR for the *C. difficile tcdA*, *tcdB*, and *tpi* genes) or presented diarrhea. The main features and therapy of all patients (A, B, C, D, E, F, G and H) are shown in **Table 1**. For patients A, B, C, D, and E of the previous studies (all CD− samples), only the time points used in this study are shown.

#### **DNA EXTRACTION AND SEQUENCING PROCESS**

The fecal samples were resuspended in sterile PBS [containing, per liter, 8 g of NaCl, 0.2 g of KCl, 1.44 g of Na<sub>2</sub>HPO<sub>4</sub>, and 0.24 g of KH<sub>2</sub>PO<sub>4</sub> (pH 7.2)] and centrifuged at 1250 × *g* and 4 °C for 2 min to remove fecal debris. The supernatants were centrifuged at maximum speed at 4 °C for 5 min to pellet the cells. DNA was extracted with the QIAamp® DNA Stool Kit (Qiagen) following the manufacturer's instructions. Total DNA integrity was checked by standard agarose gel electrophoresis and the concentration was quantified with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). For each sample, except F\_after, which did not yield enough DNA, the total DNA (metagenome) was directly pyrosequenced with a Roche GS FLX sequencer and Titanium chemistry at the Center for Public Health Research (FISABIO-Salud Pública) (Valencia, Spain). Thus, a total of 12 metagenomes were analyzed. We obtained a mean of 78,976 reads per sample with an average length of 374 bp.

#### **Table 1 | Description of the patients involved in the study.**


*CD (+/−), positive and negative detection for C. difficile. AB, antibiotic.*

#### **16S rRNA GENE AMPLIFICATION**

A region of the 16S rRNA gene (V1, V2, and V3) was amplified by polymerase chain reaction (PCR) for each sample. The primers were the universal E8F (5′-AGAGTTTGATCMTGGCTCAG-3′) with adaptor A and 530R (5′-CCGCGGCKGCTGGCAC-3′) with adaptor B, using the sample-specific Multiplex Identifier (MID) for pyrosequencing according to 454 standard protocols. For each sample a 50 µl PCR mix was prepared, containing 5 µl of Taq buffer (10X) with 20 mM MgCl<sub>2</sub>, 2 µl of dNTPs (10 mM), 1 µl of each primer (10 mM), 0.4 µl of FastStart Taq polymerase (5 U/µl), 39.6 µl of nuclease-free water and 1 µl of DNA template. PCR conditions were: 95 °C for 2 min, followed by 25 cycles of 95 °C for 30 s, 52 °C for 1 min and 72 °C for 1 min, and a final extension step at 72 °C for 10 min. The amplification products were checked by electrophoresis in an agarose gel (1.4%). PCR products were purified using the NucleoFast® 96 PCR Clean-Up Kit (Macherey-Nagel) and quantified with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). The pooled PCR products were directly pyrosequenced in the same way as the total DNA (described above). We obtained an average of 5394 reads per sample.
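As a quick consistency check on the reaction mix and cycling program described above, the arithmetic can be sketched in a few lines of Python. The variable and function names are illustrative only, the final concentrations follow from simple dilution (stock × volume / total volume), and the run time ignores ramping between temperatures.

```python
# Volumes (µl) of each component of the PCR mix, as listed in the text.
volumes_ul = {
    "Taq buffer (10X, with 20 mM MgCl2)": 5.0,
    "dNTPs (10 mM)": 2.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "Taq polymerase (5 U/µl)": 0.4,
    "nuclease-free water": 39.6,
    "DNA template": 1.0,
}


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after diluting `volume_ul` of a stock into `total_ul`."""
    return stock * volume_ul / total_ul


total_ul = sum(volumes_ul.values())

# MgCl2 comes with the 10X buffer, so it is diluted tenfold in the reaction.
mgcl2_final_mM = final_concentration(20.0, 5.0, total_ul)
dntp_final_mM = final_concentration(10.0, 2.0, total_ul)
taq_units = 0.4 * 5.0  # enzyme units per reaction

# Cycling program in minutes: initial denaturation, 25 cycles, final extension.
program_minutes = 2 + 25 * (0.5 + 1 + 1) + 10

print(f"total volume: {total_ul:.1f} µl")
print(f"MgCl2: {mgcl2_final_mM:.1f} mM, dNTPs: {dntp_final_mM:.1f} mM, Taq: {taq_units:.1f} U")
print(f"program length without ramping: {program_minutes:.1f} min")
```

The component volumes add up to the stated 50 µl, which is a useful sanity check when a protocol is transcribed.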

#### **ANALYSIS OF THE 16S rRNA GENE READS**

We used the Ribosomal Database Project (RDP) pipeline (Cole et al., 2009) to trim off the MID and primers and also to filter sequences by quality. Sequences with a phred quality score below 20 (Q20) or a short length (<250 bp) were discarded. The sequences were denoised with the usearch program in the QIIME pipeline (Caporaso et al., 2010). Pyrosequencing chimeras were then discarded using the uchime filter, also in the QIIME pipeline. OTUs were subsequently defined at 97% sequence similarity by clustering with the usearch program in the QIIME pipeline. The taxonomic assignment of the amplicons was based on the RDP database. We included only annotations obtained with a confidence level (bootstrap cut-off) greater than 0.8, leaving the assignment at the last well-identified level and the subsequent levels as unclassified (uc).
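The two thresholding rules above (read filtering and truncation of the taxonomic assignment) can be illustrated with a minimal Python sketch. This is not the RDP or QIIME code: the function names are hypothetical, the quality criterion is assumed to be the mean Phred score of the read, and the `uc_<name>` label is modeled on the labels used later in the text (for example uc\_Lactobacillaceae).

```python
MIN_MEAN_PHRED = 20    # Q20 threshold from the text
MIN_LENGTH = 250       # minimum read length in bp
MIN_BOOTSTRAP = 0.8    # an assignment is kept only if its support is greater than this


def passes_filter(sequence, phred_scores):
    """Keep a read if it is long enough and its mean Phred score is at least 20.

    Assumption: the text does not say whether Q20 refers to the mean score
    or to a sliding window; the mean is used here for illustration.
    """
    if len(sequence) < MIN_LENGTH or not phred_scores:
        return False
    return sum(phred_scores) / len(phred_scores) >= MIN_MEAN_PHRED


def truncate_lineage(lineage):
    """Truncate a lineage at the last well-supported level.

    `lineage` is a list of (name, bootstrap) pairs ordered from the highest
    rank to the lowest. Levels below the last confident one are labeled
    'uc_<last confident name>' (unclassified).
    """
    confident = []
    for name, bootstrap in lineage:
        if bootstrap <= MIN_BOOTSTRAP:
            break
        confident.append(name)
    label = f"uc_{confident[-1]}" if confident else "uc"
    return confident + [label] * (len(lineage) - len(confident))


example = [
    ("Firmicutes", 1.0),
    ("Bacilli", 0.99),
    ("Lactobacillales", 0.95),
    ("Lactobacillaceae", 0.90),
    ("Lactobacillus", 0.60),  # below the cut-off, so the genus stays unclassified
]
print(truncate_lineage(example))
```

In this example the family is retained and the genus is reported as unclassified within that family, which is how a label such as uc\_Lactobacillaceae arises.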

#### **BIODIVERSITY AND ECOLOGICAL ANALYSIS**

To analyze the microbial community structure at OTU level (97%) we calculated two diversity parameters, the number of OTUs and the Shannon index (Shannon, 1948), and two richness estimators, Chao1 (Chao, 1984) and abundance-based coverage (ACE) (Chao et al., 2000). These estimators are implemented in the Vegan package (Oksanen et al., 2011) under the R software (http://cran.r-project.org) (R Development Core Team, 2011). To statistically compare the mean ranks of the biodiversity measures between groups, we used the Wilcoxon signed-rank test implemented in the R software.
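For readers unfamiliar with these measures, the following Python sketch computes the Shannon index and a Chao1 estimate from a vector of OTU counts. It is an illustration rather than the Vegan implementation: the function names are hypothetical, the Shannon index uses the natural logarithm, and Chao1 is written in its bias-corrected form, which remains defined when no doubletons are present.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Toy OTU table for one sample (invented counts, for illustration only).
even_community = [10, 10, 10, 10]
skewed_community = [5, 3, 1, 1, 2, 0]

print(shannon_index(even_community))   # equals ln(4) for four equally abundant OTUs
print(chao1(skewed_community))         # 5 observed OTUs plus the correction term
```

A perfectly even community of four OTUs gives H = ln 4, and rare OTUs (singletons) push the Chao1 estimate above the observed OTU count.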

We also performed a clustering based on OTU composition to study the similarity between samples using the pvclust library (Suzuki and Shimodaira, 2006) in the R software. This analysis assesses the uncertainty in hierarchical clusters using bootstrap resampling techniques. We used the approximately unbiased (AU) *p*-value with 10,000 bootstraps to estimate the probability of each cluster. This AU *p*-value indicates how strongly the cluster is supported by the data.

#### **FUNCTIONAL CLASSIFICATION OF METAGENOMES**

We used the 454 Replicate Filter Program (Gómez-Alvarez et al., 2009) to eliminate artifact replicate reads of pyrosequencing with the following parameters: sequence identity cutoff = 1; length difference requirement = 0; number of beginning base pairs to check = 10. Reads were compared against the human genome database using BLASTN (Altschul et al., 1990) with an *e*-value of 10<sup>−10</sup> to eliminate possible contamination with human sequences. To identify the sequences encoding the ribosomal 16S rRNA and 23S rRNA genes we compared the dataset against the Small Subunit rRNA Reference Database (SSUrdb) and the Large Subunit rRNA Reference Database (LSUrdb) described in Urich et al. (2008) using BLASTN with an *e*-value of 10<sup>−16</sup> and 10<sup>−4</sup>, respectively. After removing the ribosomal genes, the remaining reads were compared to the NCBI-nr protein database using BLASTX (Altschul et al., 1990) to identify the protein-coding genes, and then we performed an Open Reading Frame (ORF) search with the FragGeneScan program from the metagenomic analysis web server (WebMGA) (Wu et al., 2011). The predicted ORFs were functionally annotated by HMMER 3.0 (Durbin et al., 1998) against the TIGRFAM database (Haft et al., 2003) using default parameters.

#### **STATISTICAL ANALYSIS**

Canonical correspondence (CCA) and detrended correspondence (DCA) analyses were performed to explore the relationship between different groups of samples, with *C. difficile* infection as a variable that could explain the variability pattern. To statistically assess the effect of that variable in explaining differences in bacterial composition, a multivariate ANOVA based on dissimilarity tests (Adonis) was applied, as implemented in the Vegan package under the R software. These approaches were applied at two different levels: the taxonomy based on the 16S rRNA gene, and the biological functions based on the TIGRFAM annotations. We used the ShotgunFunctionalizeR package (Kristiansson et al., 2009) in the R software to statistically compare samples at diversity and functional levels. The differences in composition between samples were addressed by comparing groups of multiple samples with the function "testGeneFamilies.dircomp." In addition, we applied the "testGeneCategories.dircomp" function to compare the distribution of functional categories between groups of samples. This function compares each gene family from a higher functional category to decide whether the category as a whole differs significantly between two groups of samples. All tests were based on Poisson models.

#### **SEARCHING FOR PUTATIVE CANDIDATE TAXA AND METABOLIC FUNCTIONS TO PROTECT AGAINST CDI**

We also used the "testGeneCategories.dircomp" test to identify taxa and metabolic functions that could play a protective role against *C. difficile* colonization. Specifically, we performed three comparisons between groups of samples to identify taxa and functions that were significantly over-represented in CD− compared to CD+ samples. The taxa and functions resulting from the different comparisons were intersected to define the candidate protective group. For this purpose, we performed the following comparisons:

Comparison 1. Since patients F and H were negative for the pathogen before treatment but positive during therapy, we compared the CD− samples before AB (F\_before and H\_before) against the CD-positive samples (CD+) during AB (F16\_D, H35\_D and H38\_D) (**Table 1**). We aimed to identify taxa and functions that significantly decreased (*p*-value < 0.1) due to treatment, presumably allowing *C. difficile* overgrowth.

Comparison 2. Since patients A, B, C, D and E did not develop CDI, we performed a comparison of the samples before AB treatment against their samples during treatment (**Table 1**). We aimed to identify taxa and functions that significantly increased (*p*-value *<* 0.1) due to therapy or that changed less drastically than those in Comparison 1, since their presence could play a role in preventing *C. difficile* infection.

Comparison 3. Since patient H was negative for the pathogen 26 days after AB, we carried out a comparison of the CD+ samples of patient H (H35\_D and H38\_D) against the CD− sample taken after AB (H\_after) (**Table 1**). We aimed to identify taxa and functions whose significant increase (*p*-value *<* 0.05) could be incompatible with pathogen overgrowth as this was not detected.

Finally, we intersected all these results to obtain a group of candidate taxa and functions that could participate in *C. difficile* colonization resistance.
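The selection logic of the three comparisons can be summarized as a set intersection. The Python sketch below is a simplified illustration under stated assumptions: the function name and the placeholder labels (`taxon_A`, etc.) are hypothetical, the *p*-values are invented for the example, and each input is assumed to already contain only features that changed in the direction of interest for that comparison (the "changed less drastically" criterion of Comparison 2 is not modeled).

```python
# Significance thresholds stated in the text for each comparison.
ALPHA = {"comparison_1": 0.1, "comparison_2": 0.1, "comparison_3": 0.05}


def candidate_features(p_values_by_comparison, alpha=ALPHA):
    """Return the features that are significant in every comparison.

    `p_values_by_comparison` maps a comparison name to a dict of
    feature -> p-value for that comparison.
    """
    selected = [
        {feature for feature, p in p_values.items() if p < alpha[comparison]}
        for comparison, p_values in p_values_by_comparison.items()
    ]
    return set.intersection(*selected) if selected else set()


# Invented p-values for three placeholder features (not results from the study).
toy_results = {
    "comparison_1": {"taxon_A": 0.01, "taxon_B": 0.20, "taxon_C": 0.05},
    "comparison_2": {"taxon_A": 0.09, "taxon_C": 0.03},
    "comparison_3": {"taxon_A": 0.04, "taxon_C": 0.07},
}
print(candidate_features(toy_results))
```

Here only `taxon_A` passes all three thresholds: `taxon_B` fails Comparison 1, and `taxon_C` fails the stricter 0.05 cut-off of Comparison 3.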

#### **CO-OCCURRENCE BAYESIAN NETWORKS OF CANDIDATES (TAXA AND METABOLIC FUNCTIONS) IN CD− SAMPLES**

To find positive correlations between the candidate protective taxa or functions found in the previous analyses and other taxa and functions obtained for samples from patients A, B, C, D and E during AB treatment (all CD− samples), we inferred Bayesian networks based on their relative abundance. The Bayesian networks were inferred using the bnlearn package (Scutari, 2010) in the R software. The optimal network inference was constrained so that only those interactions exhibiting a Spearman correlation *p*-value below 0.01 were included in the network. Correlations and *p*-values were computed using the Spearman method implemented in the R software.
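The correlation constraint described above can be sketched as a pre-filter that lists the feature pairs allowed as network edges. The Python example below is an illustration only and does not reproduce the bnlearn inference: all names are hypothetical, the abundance values are invented, and the *p*-value is obtained with a Monte Carlo permutation test, which is swapped in here in place of the test used by the R software.

```python
import itertools
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo p-value for Spearman rho (illustrative substitute)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    hits = 0
    for _ in range(n_permutations):
        shuffled = ry[:]
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


def allowed_edges(abundance, alpha=0.01):
    """Pairs with a positive Spearman correlation and p-value below alpha."""
    edges = []
    for a, b in itertools.combinations(sorted(abundance), 2):
        if spearman(abundance[a], abundance[b]) > 0 and \
                permutation_p_value(abundance[a], abundance[b]) < alpha:
            edges.append((a, b))
    return edges


# Invented relative abundances across 12 samples (placeholders, not study data).
toy_abundance = {
    "taxon_A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "taxon_B": [v ** 2 for v in range(1, 13)],            # monotone with taxon_A
    "taxon_C": [6, 1, 11, 4, 9, 2, 12, 7, 3, 10, 5, 8],   # weakly related
}
print(allowed_edges(toy_abundance))
```

Only the monotonically related pair survives the filter, so a network built on these toy data could contain an edge between `taxon_A` and `taxon_B` but none involving `taxon_C`.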

#### **DATA ACCESSION NUMBER**

All sequences have been entered in the European Bioinformatics Institute database, under accession numbers: ERP002192 (patients A, B, C, and D), ERP001506 (patient E) and PRJEB5771 (patients F, G, and H).

## **RESULTS**

#### **MICROBIAL DIVERSITY AND BACTERIAL COMPOSITION IN PATIENTS DEVELOPING CDI**

Analysis of the gut microbiota of the three CDI patients (F, G, and H) showed large variations in bacterial composition during therapy (**Figure 1**).

Before AB treatment, the bacterial composition of patient F was dominated by the genus *Akkermansia* (30.6%), which belongs to the family Verrucomicrobiaceae. Other bacterial families were also abundant, such as Ruminococcaceae (20.8%), Oscillibacteriaceae (*Oscillibacter*, 11.7%) and Bacteroidaceae (*Bacteroides*, 14.8%). When *C. difficile* was detected, at day 16 of AB treatment, all these taxa were almost absent from the community except *Bacteroides*, which had increased to 41.9%, becoming a predominant genus of the gut ecosystem. Clostridium cluster XIVa increased dramatically (from 0.7% before the AB course to 46.8% at day 16) and was the most abundant group at this time point. After treatment, the abundance of the main taxa of the microbial community changed again, the predominant groups being Enterococcaceae (*Enterococcus*, 48.3%), Streptococcaceae (*Streptococcus*, 43.2%), Staphylococcaceae (*Staphylococcus*, 4.1%) and Clostridium cluster XI (3.5%).

Patient G was positive for *C. difficile* before, during, and after AB treatment and showed the most similar bacterial composition across the three time points, although there were some notable differences. The initial composition (G\_before) consisted mostly of Bacteroidaceae (*Bacteroides*, 36.7%) and Ruminococcaceae (*Faecalibacterium*, 29.6%). During AB (G4\_D), although *Bacteroides* decreased in abundance to 25%, it remained the most abundant genus, while *Faecalibacterium* decreased sharply (to 5.9%). In contrast, *Enterococcus* increased during AB (from 1.3 to 14.9%). After therapy (G\_after), Clostridium cluster XI became the predominant group (62.4%), whereas the genus *Streptococcus* decreased progressively at each time point (3, 2.2, and 0.2%, respectively).

Patient H had a very unusual gut microbiota before AB treatment: it was dominated (85.7%) by the family Enterobacteriaceae, mainly the genus *Escherichia*, but its abundance decreased dramatically, reaching its lowest values at days 35 and 38 of the broad-spectrum AB treatment (4.2 and 2.8%, respectively), when *C. difficile* was detected. On days 7 and 14 of AB treatment the genus *Bacteroides* showed its highest abundance values (20.4 and 34.8%); however, this taxon decreased on day 20 and became undetectable by days 35 and 38. The genus *Streptococcus* increased slightly in the two CD+ samples (1.5 and 4.5%, respectively). The most striking shift occurred in the family Lactobacillaceae (genus *Lactobacillus*), whose frequency increased from less than 1% at the beginning of treatment to 83.3 and 70% on days 35 and 38 of the AB course, and fell to 15.5% after AB. We performed a statistical comparison to evaluate the differences in bacterial composition between the samples immediately prior to *C. difficile* detection (H14\_D, H20\_D) and the CD+ samples (H35\_D, H38\_D) (Table S2). The main significantly overrepresented taxa were *Lactobacillus*, *Streptococcus*, *Proteus*, *Sutterella* and the uc\_Lactobacillaceae, while Clostridium cluster XIVa, *Enterococcus*, *Bacteroides*, *Escherichia*, *Klebsiella*, and *Roseburia* were the least abundant taxonomic groups.

The three individuals exhibited great fluctuations in the number of observed OTUs, as well as in the diversity parameters analyzed (Table S1). The diversity of patient F (based on the Shannon, Chao1 and ACE estimators) was reduced in the CD+ samples, reaching a minimum after therapy. The microbial diversity of patient G also reached its lowest values after treatment. The decreased diversity after the AB course in these two patients could be due to the effects of both the AB and CDI. However, patient H, who recovered from the infection after therapy, presented the lowest diversity parameters before AB treatment, which could be due to the massive presence of members of the Enterobacteriaceae family detected in this sample, and also during CDI (**Figure 1**).

Finally, we performed a cluster analysis to find similarities in microbiota composition between samples at OTU level (97%) (**Figure 2**). The three samples corresponding to patient G (G\_before, G4\_D and G\_after) clustered with F\_after, all of them being CD+ samples. This cluster was closer to the other two samples of patient F (F\_before and F16\_D). Patient H samples formed two clear groups. One of the clusters included the pre-infection samples (before AB and 7, 14 and 20 days during AB), whereas the CD+ samples (days 35 and 38), which are the most similar samples, grouped in a second cluster with the sample after treatment (H\_after). The clustering shows that both the individual and the presence of *C. difficile* contributed to explaining the similarity pattern of the samples.

#### **DIFFERENCES IN MICROBIAL STRUCTURE BETWEEN** *C. DIFFICILE***-INFECTED AND NON-INFECTED PATIENTS**

In previous studies we analyzed changes in bacterial composition in AB-treated patients that did not develop *C. difficile* infection (A, B, C, D and E), and thus all samples were CD−. To search for differences in microbiota composition possibly related to infection, we compared the 15 time points during the AB therapy of these CD− patients with samples from patients that were positive for *C. difficile* detection (CD+) (F16\_D, F\_after, G\_before, G4\_D, G\_after, H35\_D and H38\_D) (**Table 1**).

First, we compared the Shannon index distributions between CD+ and CD− samples (Figure S1). We found a lower diversity for CD+ samples, with an average of 3.1 ± 1.0, compared to 3.9 ± 0.8 for CD− samples. The richness estimator Chao1 showed great variation in both groups; even so, the mean was also lower in the CD+ samples, with values of 210 ± 132 vs. 287 ± 157 in CD− samples. The Wilcoxon signed-rank test performed to compare the diversity measures between the two groups was not significant for either the Shannon index (*p* = 0.14) or the Chao1 estimator (*p* = 0.33). The gut microbiota of CD+ samples seems to be more heterogeneous and less rich than that of the CD− samples from patients who did not develop CDI, but a larger number of samples would be required to support this observation.

Second, we performed a detrended correspondence analysis (DCA) to explore the variations in bacterial composition between the same CD+ and CD− samples tested above (**Figure 3A**). The two axes explained 26.7% of the total variance, and there was large variability in the microbiota of both groups. Despite this variability, two clusters can be distinguished with minimal overlap. We applied the Adonis test to evaluate whether developing *C. difficile* infection is a factor that influences the microbiota structure. The factor proved to be significant, with a *p*-value of 0.005. Thus, although the CD+ samples do not form a well-defined cluster, they share some features in their microbiota composition that differ from CD− samples.

Finally, we performed a statistical test to find the taxa that explained the differences in composition between the CD− and CD+ groups. In CD+ samples there was significant over-representation of the genera *Lactobacillus*, *Bacteroides*, *Enterococcus* and *Faecalibacterium*, the family Lachnospiraceae incertae sedis, and the Clostridium clusters XIVa and XI, the latter of which includes *C. difficile*. However, commensal members of the intestinal community, such as the genera *Roseburia*, *Coprococcus*, *Blautia*, and *Subdoligranulum* and the families Erysipelotrichaceae and Ruminococcaceae, were underrepresented (**Table 2**).

#### **CANDIDATE TAXA INVOLVED IN** *C. DIFFICILE* **COLONIZATION RESISTANCE**

In order to obtain a subset of candidate bacteria that could be involved in *C. difficile* colonization resistance, we performed statistical comparisons between different groups of samples (see Materials and Methods for the three specific comparisons). The three comparative analyses gave a number of statistically significant taxa (**Table 3**), and the intersection of the results of the three analyses indicated which taxa may participate in colonization resistance to *C. difficile*. We found that most of these taxa belonged to the order Clostridiales (Firmicutes), specifically to the families Ruminococcaceae (*Ruminococcus*, *Subdoligranulum*, and *Gemmiger*), Oscillibacteraceae (*Oscillibacter*) and Eubacteriaceae (*Anaerovorax*). We also found unclassified Ruminococcaceae and Erysipelotrichaceae, belonging to the Clostridiales and Erysipelotrichales orders, respectively, as well as other Clostridia and Clostridiales members. Finally, the genus *Escherichia*, from the family Enterobacteriaceae (phylum Proteobacteria), was also detected.

**FIGURE 3 | Correspondence analyses. (A)** Detrended correspondence analysis (DCA) based on taxa abundance and composition of CD+ samples of patients (F, G, and H) (red square) and CD− samples of patients (A, B, C, D,

**Table 2 | Differential taxa abundance between CD− (during time points of A–E patients) and CD+ (F16\_D, F\_after, G\_before, G4\_D, G\_after, H35\_D and H38\_D) samples.**

Once the candidate protective taxa had been detected, we inferred a Bayesian network (see Materials and Methods) to find other related members of the bacterial community that are hence also putatively involved in pathogen colonization resistance (**Figure 4**). *Gemmiger*, *Subdoligranulum* and uc\_Erysipelotrichaceae did not show any significant correlation and thus are not represented in the figure. It is worth noting that most taxa showing a positive and significant correlation with the candidates were phylogenetically related to them, mainly belonging to the Clostridiales order, such as *Roseburia* and *Coprococcus* (Lachnospiraceae family) and *Anaerotruncus* (Ruminococcaceae family).

#### **FUNCTIONAL CHANGES IN PATIENTS DEVELOPING CDI**

In the present work, we performed the functional annotation of the 12 metagenomes sequenced (the metagenome of sample F\_after could not be analyzed, see Materials and Methods) by comparison against the TIGRFAM database, obtaining the following hierarchical classification: main roles (the highest functional level), sub roles (more specific metabolic functions for each of the main roles) and genes (metabolic functions) for all the reads. Figure S2 shows great homogeneity in the main role distribution across samples for all three patients (F, G, and H). On average, the most abundant main role categories were: energy metabolism (12.3 ± 2.1%), protein synthesis (12 ± 2%), transport and binding proteins (8.6 ± 2.5%) and cell processes (8.6 ± 1.3%). A similar main role distribution was described for patients A, B, C, and D in our previous study (Pérez-Cobas et al., 2013b), which is expected given the importance of these housekeeping functions for the survival and growth of gut bacteria.


**Table 3 | Significant taxa and associated** *p***-value resulting from the three comparative analyses to find protective candidate taxa.**

*In bold are the candidate taxa that were significant in the three comparisons. (a) (F\_before and H\_before) vs. (F16\_D, H35\_D, and H38\_D).*

*(b) A, B, C, D, and E samples before vs. during AB treatment.*

*(c) (H35\_D and H38\_D) vs. (H\_after).*

In patient F, there were 51 significantly different sub roles between the samples taken before and during AB treatment (F\_before vs. F16\_D) (Table S3). The most significantly over-represented categories in AB-treated samples were: DNA metabolism/chromosome-associated proteins; cellular processes/DNA transformation; cell envelope/biosynthesis and degradation of surface polysaccharides and lipopolysaccharides; and energy metabolism/pentose phosphate pathway. The underrepresented categories were: protein synthesis/tRNA aminoacylation; transport and binding proteins/amino acids, peptides and amines; and cell envelope/surface structures.

Three different comparisons were made for patient H: (i) before vs. during treatment but before *C. difficile* detection (H\_before vs. H7\_D, H14\_D and H20\_D); (ii) before vs. CD+ samples (H\_before vs. H35\_D and H38\_D) and (iii) CD− prior to CDI vs. CD+ samples (H14\_D and H20\_D vs. H35\_D and H38\_D). In total, we found 37 sub roles that were significantly increased or decreased in CD− samples. (i) Those that increased during AB treatment but before CDI were mainly involved in "cell processes/DNA transformation" and "protein synthesis/translation factors," whereas we observed a significant decrease in "amino acid biosynthesis/folic acid"; "mobile and extrachromosomal elements function/plasmid functions"; "signal transduction/PTS" and "transport and binding proteins/carbohydrates, organic alcohols and acids." (ii) A similar functional profile was found when we compared before vs. CD+ samples. (iii) Finally, when we specifically compared the two samples previous to infection with the two CD+ samples, we found 54 significant sub roles. The most significantly over-represented in the CD+ samples were: "signal transduction/PTS"; "transport and binding proteins/carbohydrates, organic alcohols, and acids"; "transport and binding proteins/amino acids, peptides and amines" and "cell envelope/biosynthesis and degradation of murein sacculus and peptidoglycan" (Table S3).

Patient G proved to be infected by the pathogen throughout the study. Thus, we compared the sub role distribution before vs. during treatment to find those functions that could be AB-related. The comparison showed that only two categories changed during AB treatment: "amino acid biosynthesis/serine family" decreased (*p* = 0) while "cell envelope/other" increased (*p* = 0.04).

#### **DIFFERENCES IN THE FUNCTIONAL PROFILE BETWEEN** *C. DIFFICILE***-INFECTED AND NON-INFECTED PATIENTS**

To compare the whole functional composition of the CD+ samples of patients F, G, and H with the CD− samples taken during treatment of patients A, B, C, D, and E, we applied a correspondence analysis based on the abundance of TIGRFAM functions, with the two axes together explaining 49.3% of sample variance. The analysis did not show a clear differential functional pattern between the CD+ and CD− groups, given that the CD+ samples appear to be a subset of the CD− group (**Figure 3B**). We also used the Adonis test to evaluate whether ABs structured the functional profile of the microbial community differently in the two groups (CD+ and CD−). The factor was not significant at the hierarchical levels of sub roles and metabolic functions, the *p*-values being 0.63 and 0.73, respectively. To find specific sub roles that could be associated with CD+ samples, we compared the functional profiles of the same samples (**Table 4**), finding significant enrichment in "transport and binding proteins," mainly for "carbohydrates, organic alcohols and acids," and "signal transduction" by the phosphotransferase system (PTS). However, "mobile and extrachromosomal element functions" and "aromatic amino acid family biosynthesis" were significantly underrepresented.
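The Adonis test is a permutational multivariate analysis of variance. The sketch below shows the general idea for a single grouping factor: a pseudo-F statistic is computed from pairwise distances and compared with its distribution under random relabeling of the samples. It is a simplified illustration and not the authors' pipeline; the use of Bray-Curtis dissimilarity, the function names, and the toy profiles are assumptions made for the example.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (assumed metric)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """Pseudo-F for one factor, computed from squared pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(profiles, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    n = len(profiles)
    dist = [[bray_curtis(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between group label and profile
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Invented functional profiles for two hypothetical groups (not data from the study).
toy_profiles = [
    [30, 5, 5], [28, 6, 6], [32, 4, 4], [29, 7, 4],   # labeled "CD-"
    [5, 30, 5], [6, 28, 6], [4, 32, 4], [7, 29, 4],   # labeled "CD+"
]
toy_labels = ["CD-"] * 4 + ["CD+"] * 4

f_stat, p_value = permanova(toy_profiles, toy_labels)
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

In this toy case the groups are deliberately well separated, so the p-value is small; a non-significant result such as those reported above would correspond to an observed pseudo-F that is unremarkable among the permuted values.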

#### **CANDIDATE FUNCTIONS INVOLVED IN** *C. DIFFICILE* **COLONIZATION RESISTANCE**

Just as in the 16S rRNA gene survey, we performed three comparative analyses to find (in the intersection) those metabolic functions that may play a role in colonization resistance. **Table 5** shows the roles, sub roles, and functions that may be protective. Those with a clearly assigned role are involved in "aromatic amino acid biosynthesis (chorismate mutase)," "endospore formation (stage IV sporulation protein B and anti-sigma F factor)," "metabolism of amino groups (agmatine deiminase)," and "stress response mechanisms (rrf2 family protein, redox-active disulfide protein 2 and glutamate decarboxylase)." The doubled CXXCH domain belongs to a protein of unknown function, but it is postulated to be part of c-type cytochromes that participate in electron transfer. UDP-N-acetylglucosamine 4,6-dehydratase participates in the biosynthesis of pseudaminic acid. No sub-roles were assigned to indolepyruvate ferredoxin oxidoreductase and RNA polymerase sigma-70 factor.
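The intersection step described above can be sketched as a set operation: a function is retained as a candidate only if it is significant in every comparison. The significance threshold of 0.05, the function names, and the p-values below are assumptions for illustration only, and the sketch ignores the direction of change, which a full analysis would also need to check.

```python
def significant(comparison, alpha=0.05):
    """Names of features whose p-value is below alpha (alpha is an assumed threshold)."""
    return {name for name, p in comparison.items() if p < alpha}


def protective_candidates(comparisons, alpha=0.05):
    """Features that are significant in every one of the supplied comparisons."""
    sets = [significant(c, alpha) for c in comparisons]
    return set.intersection(*sets) if sets else set()


# Invented p-values for hypothetical functions in three comparisons.
comparison_1 = {"function_A": 0.01, "function_B": 0.20, "function_C": 0.03}
comparison_2 = {"function_A": 0.04, "function_B": 0.01, "function_C": 0.02}
comparison_3 = {"function_A": 0.02, "function_B": 0.03, "function_C": 0.30}

print(sorted(protective_candidates([comparison_1, comparison_2, comparison_3])))
```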

**Table 4 | Comparisons of sub-roles distribution between CD− (during time points of A–E patients) and CD+ (F16\_D, F\_after, G\_before, G4\_D, G\_after, H35\_D and H38\_D) samples.**

*Arrows indicate the sub-roles significantly over-represented (upward) and underrepresented (downward) in the CD*+ *samples.*

#### **Table 5 | Candidate functions involved in** *C. difficile* **colonization resistance.**

*\*RNA polymerase sigma-70 factor, Bacteroides expansion family 1.*

We also built a Bayesian network to find significant and positive associations between the candidate protective functions and other functions that may be important in pathogen infection resistance. **Figure 5** shows the functional network according to hierarchical categories. Overall, most of the candidate functions were connected with several different sub roles, and correlations between candidates were also found. The most frequently connected functions were the doubled CXXCH domain (26 correlations) and chorismate mutase (25 correlations). Additionally, these two candidate functions shared some connections whose nodes are involved in different roles, the majority being related to energy metabolism, protein synthesis and fate, as well as amino acid biosynthesis. UDP-N-acetylglucosamine 4,6-dehydratase showed 21 correlations, mainly with cell envelope, protein fate and transport system roles. This function was also connected to another important candidate, glutamate decarboxylase, with which it shares some correlations. The redox-active disulfide protein 2 and glutamate decarboxylase presented 17 correlations each. The former, which is correlated with the two candidates chorismate mutase and indolepyruvate ferredoxin oxidoreductase, showed associations with energy metabolism and protein synthesis, while glutamate decarboxylase is correlated with protein fate, regulatory and transport functions.
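Counts of correlations per function, and the connections shared by two candidates, can be read directly from the list of significant edges in such a network. The sketch below assumes the network has already been reduced to an undirected edge list; the node names and edges are placeholders and do not reproduce Figure 5.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree_ranking(adjacency):
    """Nodes sorted by number of connections, most connected first."""
    return sorted(
        ((node, len(neighbours)) for node, neighbours in adjacency.items()),
        key=lambda item: (-item[1], item[0]),
    )


def shared_neighbours(adjacency, a, b):
    """Nodes connected to both a and b."""
    return adjacency[a] & adjacency[b]


# Placeholder edges between hypothetical candidate and non-candidate functions.
toy_edges = [
    ("candidate_1", "energy_1"),
    ("candidate_1", "protein_synthesis_1"),
    ("candidate_1", "amino_acid_1"),
    ("candidate_2", "energy_1"),
    ("candidate_2", "protein_synthesis_1"),
    ("candidate_1", "candidate_2"),
]

network = build_adjacency(toy_edges)
print(degree_ranking(network)[:2])
print(sorted(shared_neighbours(network, "candidate_1", "candidate_2")))
```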

#### **DISCUSSION**

In this study, we have analyzed changes in the bacterial composition and functional profile of the gut microbiota of two patients (F and H) who were positive for *C. difficile* (CD+ samples) after AB treatment and one patient (G) who, despite not having taken AB, was already CD+ when admitted to the hospital. Patients F and H had an unusual microbiota at the start of the study (before AB treatment), enriched in the genus *Akkermansia* (30.6%) and with a high abundance of the genus *Escherichia* (85.7%), respectively. We also compared the gut microbiota of those three patients with that of five individuals from two previous studies (Pérez-Cobas et al., 2013a,b), who were also treated with AB but did not develop CDI. All the patients fit the same inclusion criteria. Despite the heterogeneity of the samples and the fact that only 15 time points are compared overall, we consider that the results obtained with the different analyses provide new insights into the effect of CDI on the structure and metabolic functions of the human gut microbiota. Furthermore, we identified members of the bacterial community and metabolic functions that are differentially present in the CD− samples compared to the CD+ samples and thus could be involved in resistance to *C. difficile* colonization.

The gut microbiota of the three CDI patients showed large variations in bacterial composition and diversity throughout the therapy, confirming that antibiotics disturb the ecological equilibrium of microbial communities. Previous studies showed great fluctuations and low diversity of the human gut microbiota under the effects of a wide variety of ABs, although patients did not develop CDI (Dethlefsen et al., 2008; Dethlefsen and Relman, 2010; Pérez-Cobas et al., 2013a,b). In addition to the influence of AB on the microbiota structure, this survey found that CDI contributes to decreasing bacterial diversity, since the infected samples showed, in general, lower biodiversity index values and richness estimators than non-infected samples. In this respect, a mouse colitis model-based study has suggested that intestinal inflammation during colonization by some pathogens, including *C. difficile*, affects microbiota equilibrium (reviewed in Stecher and Hardt, 2011), contributing to reduced microbial diversity.
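For readers less familiar with these measures, the sketch below computes one common biodiversity index and one common richness estimator from a vector of taxon counts. The choice of the Shannon index and the bias-corrected Chao1 estimator is an assumption made for illustration, since this section does not name the specific measures used, and the counts are invented.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Invented counts: an even community and one dominated by a single taxon.
even_community = [25, 25, 25, 25]
dominated_community = [97, 1, 1, 1]

print(f"even: H = {shannon(even_community):.2f}")
print(f"dominated: H = {shannon(dominated_community):.2f}")
```

A community dominated by one taxon yields a lower Shannon value than an even community with the same number of taxa, which is the pattern described for the infected samples.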

Similarly, significant alterations in the abundance of some taxa (mainly from the Firmicutes phylum) and a decrease in microbial diversity and species richness were found in individuals with CDI (Antharam et al., 2013).

We have found that the microbiota of the infected samples (CD+) share some common features, being depleted in commensal genera such as *Ruminococcus*, *Roseburia*, *Subdoligranulum*, *Blautia* or *Coprococcus* and enriched in *Lactobacillus*, *Enterococcus*, and Clostridium clusters XIVa and XI, the latter being the phylogenetic cluster that contains the *C. difficile* species (Collins et al., 1994). Although the relative abundance of cluster XI was variable between the infected samples, its presence is higher in CD+ than in CD− samples, probably due to the high abundance of *C. difficile*. The higher abundance of Clostridium cluster XIVa could be a consequence of the microbiota imbalance, since members of this group have been characterized as opportunists (Lozupone et al., 2012). This may also be the case for *Enterococcus*, which is a common opportunistic pathogen that becomes dominant when the normal gut microbiota is disturbed by ABs (Donskey, 2004; Ubeda et al., 2010). *Enterococcus* was also over-represented in samples of reduced biodiversity in other CDI studies (Antharam et al., 2013; Vincent et al., 2013). The higher abundance of *Lactobacillus* in the CD+ samples is also interesting. For example, a murine model-based study found that Lactobacillaceae was dominant in CDI samples (Rea et al., 2011), as did a study of CDI in humans (Antharam et al., 2013). Although *Lactobacillus* has been described as an intestinal probiotic genus, different studies show that only specific strains (e.g., *L. delbrueckii*) can inhibit *C. difficile* growth (Naaber et al., 2004; Banerjee et al., 2009). Further research would be needed to clarify the role of *Lactobacillus* strains in gut colonization by *C. difficile*.

The three comparisons performed enabled us to identify taxa that were significantly over-represented in CD− samples, due to AB therapy, in individuals that either did not develop CDI (comparison 2) or recovered from CDI (comparison 3), but decreased in the CD+ samples (comparison 1). Thus, *Anaerovorax*, *Escherichia*, *Gemmiger*, *Oscillibacter*, *Ruminococcus*, *Subdoligranulum*, uc\_Clostridia, uc\_Clostridiales, uc\_Erysipelotrichaceae, and uc\_Ruminococcaceae were found as candidates for protecting against *C. difficile* colonization. Bayesian correlation networks are a powerful tool to search for and study ecological or metabolic associations in microbial communities (Durbán et al., 2012), and thus we used them to look for other taxa associated with the above, which may also be indirectly involved in resistance by ecologically interacting with the candidates. Most of the taxa in the network belonged to Clostridia: *Ruminococcus*, *Subdoligranulum*, *Oscillibacter*, *Anaerovorax*, *Roseburia*, *Coprococcus*, *Anaerotruncus*, *Gemmiger* and other unclassified members of the Lachnospiraceae and Ruminococcaceae families. It has been proposed that competition of normal gut microbiota members with their related pathogens for limiting resources or sites, called "niche exclusion," could be a colonization resistance mechanism (reviewed in Britton and Young, 2012). Thus, this niche hypothesis could explain the role of these related taxa belonging to Clostridiales in protecting against CDI. In this regard, some studies in mice have shown that Clostridia members, such as Lachnospiraceae, are *C. difficile* antagonists and restore the microbiota when fed to infected mice (Itoh et al., 1987; Reeves et al., 2011, 2012; Lawley et al., 2012). Other studies in hamsters showed that non-toxigenic *C. difficile* strains were able to protect against the toxigenic pathogen (Sambol et al., 2002; Merrigan et al., 2003), suggesting a more efficient utilization of limiting nutrients (niche exclusion) as the protection mechanism. In human studies, members of the Ruminococcaceae and Lachnospiraceae families were significantly depleted in CDI patients (Antharam et al., 2013).

Some of the Clostridia members found to be associated with the main protective candidate taxa, such as *Roseburia* or *Coprococcus*, are active anaerobic short-chain fatty acid (SCFA) producers (Barcenilla et al., 2000; Pryde et al., 2002). This could be another mechanism through which they protect against CDI, since SCFAs are reported to inhibit *C. difficile* growth and also to decrease toxin production *in vitro* (May et al., 1994). Moreover, it has been postulated that the anaerobic fraction of the microbiota is essential for gut ecosystem stability in healthy individuals, because the butyrate and other SCFAs they produce have anti-inflammatory effects and stimulate the immune system; thus, an imbalance in this fraction increases the risk of *C. difficile* overgrowth (Bartlett, 2002; Roy et al., 2006; Jernberg et al., 2010). However, a recent study in mice found that SCFA production was not correlated with lower levels of *C. difficile* colonization (Reeves et al., 2012). In addition, these authors found that the microbiota composition of CDI mice was partially restored when they used only one isolate of the Lachnospiraceae family for inoculation. Nevertheless, total restoration was obtained when total fecal content was transferred from a wild-type mouse. These results agree with our findings because we have found several putative candidate protective taxa, indicating that more than one bacterial group is involved in pathogen protection. Hence, further research should test *in vivo* the colonization resistance capacity of the specific ensemble we have proposed.

In a previous study, we showed that the metabolic profiles of AB-associated shifts in human gut microbiota were less dramatic than those in bacterial composition, principally when considering main roles. This is due to the functional redundancy of the human gut microbiota, and the fact that it has a very general set of functions (Pérez-Cobas et al., 2013b). We have also found great homogeneity in the distribution of the main roles in all the samples. However, differences appear when considering more specific functional levels (sub-roles and functions). In this study, patients showed different functional responses (sub-roles) to ABs, in agreement with our previous study, where great inter-individual variability was found in AB-treated patients. Although no significant differences between the two groups of AB-treated patients (CDI and non-infected) as a whole were detected, a specific functional profile was found. Thus, the transport, metabolism, and regulation of sugars such as mannose, fructose, lactose, glucitol, or mannitol were over-represented functions in CDI samples, the major sugar transport system being the phosphotransferase system (PTS). In a previous work, we found that AB treatment increases PTS functions in metagenomes, since they seem to give an advantage to bacteria carrying them under stress conditions (Deutscher et al., 2006; Pérez-Cobas et al., 2013b). The higher presence of these functions in CD+ samples compared to CD− samples is noteworthy, even though both were treated with ABs, because it could be related to the infection, as shown in a metabolomic study in mice that developed CDI (Theriot et al., 2014). The same authors found an increase in carbohydrates like mannose, fructose, lactose, glucitol, or mannitol after AB treatment, and they postulated that these increases favored *C. difficile* germination and growth. Related to this finding, a transcriptomic study revealed that sugars released by an altered microbiota are exploited by enteric pathogens such as *Salmonella enterica* and *C. difficile* (Ng et al., 2013). Thus, *C. difficile* and other opportunistic bacteria can efficiently catabolize the excess of carbohydrates generated by the disrupted microbiota and, in the absence of competitors, increase colonization rates.

Using the same three comparisons, we also found metabolic functions that may play a role in *C. difficile* colonization resistance (**Table 5**). Overall, there was a higher abundance of functions related to aromatic amino acid biosynthesis, with chorismate mutase being the central node of the network, since it was strongly connected to other important functions like energy metabolism or protein fate. Chorismate mutase, which participates in tyrosine, phenylalanine and tryptophan biosynthesis, could be involved in colonization resistance through stimulation of the immune system, since tryptophan metabolites participate in immune system equilibrium and inflammation regulation (Zelante et al., 2013). Future research should be conducted to discover the mechanism by which aromatic amino acid synthesis could protect against colonization by *C. difficile*. Also, some energy metabolism pathways seem important, such as the TCA cycle, electron transport, or fatty acid biosynthesis. A great number of different transporter families, regulator genes, and genes involved in responses to osmotic or acid stress were also highlighted in the network, possibly playing a role in colonization resistance.

Another possible protective pathway was peptide catabolism via tryptophan metabolism. Low abundance of protein digestion markers was associated with susceptibility to CDI in the mouse gut (Theriot et al., 2014). Regarding host immune response, we found polyamine biosynthesis (putrescine or cadaverine) by decarboxylation of amino acids to be another potential protective pathway. A previous study reported that these metabolites interact with the gut microbiota, stimulating the immune system and playing a role in intestinal maturation (Gómez-Gallego et al., 2012). In this regard, Jung et al. (2003) found that glutamate decarboxylase activity, related to polyamines, was also a protective determinant, playing a role in protection against acid stress. It is also relevant that this enzyme is connected to other functions in the network, such as protein fate, transcription regulation, or transport systems, thus reinforcing its protective role. Moreover, other protective gene products regulate metabolic pathways that are important for several cellular physiology processes, like osmotic stress resistance and responses to environmental changes (Wouters et al., 2010; Shepard et al., 2011).

In summary, we found a specific fecal microbiota in CDI patients, enriched in *Lactobacillus*, *Enterococcus*, and Clostridium clusters XIVa and XI but depleted in SCFA-producing bacteria. The latter bacterial group could be involved in *C. difficile* colonization resistance. A group of metabolic processes related to the metabolism of proteins, amino acids and responses to stress would seem to participate in avoiding pathogen invasion in the human gut ecosystem. Further research into these pathways should be undertaken to unravel the mechanism by which they participate in colonization resistance to *C. difficile*. A larger cohort of patients with similar sampling would be needed to define the CDI microbiota more precisely at the taxonomic and functional levels.

#### **AUTHOR CONTRIBUTIONS**

Andrés Moya, María J. Gosalbes, and Amparo Latorre conceived the work. Ana E. Pérez-Cobas performed all the analyses. María J. Gosalbes and Alejandro Artacho helped with some of the analyses. The manuscript was written by Ana E. Pérez-Cobas, María J. Gosalbes, and Amparo Latorre. Andrés Moya and Stephan J. Ott revised the manuscript.

#### **ACKNOWLEDGMENTS**

This work was supported by the Spanish Ministry of Economy and Competitiveness (SAF2012-31187) and by EU ERA-NET project on *C. difficile*. Pyrosequencing was carried out by Dr. Nuria Jiménez in the Sequencing Service of FISABIO-Salud Pública (Valencia, Spain).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2014.00335/abstract



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 March 2014; accepted: 16 June 2014; published online: 04 July 2014. Citation: Pérez-Cobas AE, Artacho A, Ott SJ, Moya A, Gosalbes MJ and Latorre A (2014) Structural and functional changes in the gut microbiota associated to Clostridium difficile infection. Front. Microbiol. 5:335. doi: 10.3389/fmicb.2014.00335 This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Pérez-Cobas, Artacho, Ott, Moya, Gosalbes and Latorre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**PERSPECTIVE ARTICLE** published: 26 September 2014 doi: 10.3389/fmicb.2014.00510

## The symbiont side of symbiosis: do microbes really benefit?

## *Justine R. Garcia\* and Nicole M. Gerardo*

Gerardo Lab, Department of Biology, O. Wayne Rollins Research Center, Emory University, Atlanta, GA, USA

#### *Edited by:*

Monica Medina, Pennsylvania State University, USA

#### *Reviewed by:*

Scott Clingenpeel, Joint Genome Institute, USA Mark Mandel, Northwestern University Feinberg School of Medicine, USA

#### *\*Correspondence:*

Justine R. Garcia, Gerardo Lab, Department of Biology, O. Wayne Rollins Research Center, Emory University, Room 1174, 1510 Clifton Rd. NE, Atlanta, GA 30322, USA e-mail: jrhall2@emory.edu

Microbial associations are integral to all eukaryotes. Mutualism, the interaction of two species for the benefit of both, is an important aspect of microbial associations, with evidence that multicellular organisms in particular benefit from microbes. However, the microbe's perspective has largely been ignored, and it is unknown whether most microbial symbionts benefit from their associations with hosts. It has been presumed that microbial symbionts receive host-derived nutrients or a competition-free environment with reduced predation, but there have been few empirical tests, or even critical assessments, of these assumptions. We evaluate these hypotheses based on available evidence, which indicates that reduced competition and predation are not universal benefits for symbionts. Some symbionts do receive nutrients from their host, but this has not always been linked to a corresponding increase in symbiont fitness. We recommend experiments to test symbiont fitness using current experimental systems of symbiosis and detail considerations for other systems. Incorporating symbiont fitness into symbiosis research will provide insight into the evolution of mutualistic interactions and cooperation in general.

**Keywords: mutualism, microbial fitness, host–microbe interactions, symbiont transmission, endosymbiosis**

#### **INTRODUCTION**

Microbes have been recognized as an important force in eukaryotic evolution (McFall-Ngai et al., 2013), but recognition of the impact of eukaryotes on microbial evolution has lagged behind. Interspecies interactions between microbes and eukaryotic hosts fall on a continuum from parasitism to mutualism. Fitness effects of these interactions are routinely investigated in hosts, but it is necessary to consider both partners to understand how interactions evolve and persist. There is a robust framework for understanding how parasitic interactions promote the fitness of parasitic microbes (pathogens), but the microbe's perspective has largely been ignored in putatively mutualistic interactions, and it is unknown whether most non-parasitic microbes benefit from host association.

Most research on mutualisms has focused on the host, as hosts are larger and usually more tractable experimental organisms. The effect of microbial association on hosts is routinely tested by comparing fitness in hosts with and without symbionts (**Figure 1A**; e.g., Kikuchi et al., 2007). Analogous experiments for symbionts are rarely performed, even in well-described systems. It is often assumed that symbiont fitness is higher in hosts relative to other niches because symbionts receive a competition-free environment, reduced predation, or host-derived nutrients. Population size is a straightforward way to measure microbial fitness (i.e., the replication capacity of a clonal population), but it should be used to quantify symbiont fitness in the same way that it is for hosts: as the difference in replication in the presence and absence of the interacting partner. When tested, some experiments have shown that symbionts suffer deleterious effects or costs such as suppressed growth in hosts (Ahmadjian, 1993; Wooldridge, 2010; Login and Heddi, 2013; Udvardi and Poole, 2013). The presence of some costs in the host relative to other niches does not necessarily preclude the symbiont from gaining a net fitness benefit through host association [e.g., acquiring genetic diversity through horizontal gene transfer (HGT)], but it does suggest an important aspect that should be considered.
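The comparison described above, replication in the presence versus the absence of the partner, can be expressed as a simple calculation. The sketch below is one possible formalization and not a method from the cited studies: it assumes replication is summarized as the number of population doublings between an initial and a final count over the same time interval, and all values are invented.

```python
import math


def doublings(n_start, n_end):
    """Number of population doublings between two counts (assumed fitness proxy)."""
    return math.log2(n_end / n_start)


def host_effect(in_host, free_living):
    """Difference in doublings: symbiont with host minus symbiont without host.

    Positive values suggest a net replication benefit from host association,
    negative values a net cost. Each argument is a (start, end) pair of counts.
    """
    return doublings(*in_host) - doublings(*free_living)


# Invented counts for a hypothetical symbiont measured over the same interval.
benefit = host_effect(in_host=(100, 800), free_living=(100, 200))
cost = host_effect(in_host=(100, 200), free_living=(100, 800))
print(f"benefit case: {benefit:+.1f} doublings, cost case: {cost:+.1f} doublings")
```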

The semantics of symbiosis may be partially to blame for the neglect of microbes. There have been two prominent uses of "symbiosis" over the past century. The first follows from the definition of symbiosis by de Bary as "the living together of unlike organisms" and is applied to interspecies associations regardless of the relationship (parasitism, commensalism, or mutualism; Douglas, 2010; Leigh, 2010). In the second, symbiosis is synonymous with mutualism and indicates a generally beneficial relationship. This is usually applied when it is known that the host benefits from an association and implies that the symbiont does as well. Here we consider any long-term, intimate association to be a "symbiosis" while reserving mutualism for only those interactions known to be beneficial for both partners.

Here we evaluate evidence for reciprocal benefit in presumed mutualistic microbial symbioses, emphasizing environmentally acquired (horizontal) microbial symbionts in eukaryotic hosts. We also re-examine the role of hosts and microbes in symbioses in light of evidence for symbiont benefit. Although it has previously been recognized that symbionts must be more thoroughly investigated (Douglas and Smith, 1989; Bronstein, 2001; Wilkinson and Sherratt, 2001; Kereszt et al., 2011), recent advances in technology and new study systems provide novel tools and opportunities for investigating the symbiont side of symbiosis.

## **AN EVALUATION OF ASSUMED SYMBIONT BENEFITS**

## **COMPETITION**

It is assumed that microbial symbionts benefit from a competition-free environment inside hosts because they live in the absence of other microbes that compete for resources. While some systems have monoclonal symbiont populations (Gage, 2002; Martens et al., 2003; Kubota et al., 2007; Dubilier et al., 2008; Aanen et al., 2009), likely due to bottlenecks during repeated vertical transmission or winnowing during horizontal transmission, not all host-symbiont associations are monoclonal. Within-host competition between strains is important for pathogen fitness (Bell et al., 2006) and some vertical symbionts (Oliver et al., 2006). This is likely also true for horizontal symbionts, as hosts from many systems harbor multiple symbiont genotypes (Baker and Romanski, 2007; Dubilier et al., 2008; Fay et al., 2009; FitzPatrick et al., 2012; Van Horn et al., 2012; Garcia et al., 2014). Even hosts with strict colonization requirements and entry mechanisms, like bobtail squid, which select specific strains of *Vibrio fischeri* from diverse microbes in seawater, contain multiple symbiont genotypes (Wollenberg and Ruby, 2009).

Competition in a polyclonal symbiont population can result in decreased growth for one species or genotype (Elliott et al., 2009; Baker et al., 2013; Engelmoer et al., 2014) or lower symbiont titers (Mouton et al., 2004). Mycorrhizal fungi, for instance, have lower abundance in plant roots when co-inoculated relative to single inoculations. Furthermore, competition between these fungi is stronger within the host compared to the rhizosphere (Engelmoer et al., 2014). Coexistence with other symbionts, however, can be beneficial. Double or triple infections of *Wolbachia* in the wasp *Asobara tabida*, for example, increase the abundance of a specific *Wolbachia* genotype relative to single infections with that genotype only (Mouton et al., 2004). Co-infections, therefore, are a necessary but not sufficient condition for competition, and there is no general framework for predicting the conditions in which co-infections will promote or hinder a symbiont's fitness. Future research on within-host competition is needed, and should be considered in the context of mechanisms, such as partner choice and sanctioning, that may reduce or prevent polyclonal infections and competition (Bull and Rice, 1991).

**FIGURE 1 | (A)** Experimental designs to test the effect of symbiosis on host fitness (left) and symbiont fitness (right). Both experiments involve measuring growth or other fitness parameters (see section Recommendations for Investigating Symbiont Fitness) in the presence and absence of their partner. Experiments on host fitness have been performed in diverse systems, but the equivalent symbiont fitness experiment is rarely performed. **(B)** Experimental design from Wollenberg and Ruby (2012) for measuring the relative growth of two groups of bobtail squid symbionts within naturally infected hosts. Competition assays were performed to test within-host fitness by inoculating the seawater of a hatchling squid with a symbiont strain from each symbiont group (left). A separate experiment confirmed that the symbionts had an equal ability to colonize the squid after single-strain inoculations (not pictured). Symbiont growth was tested in the environment by inoculating filtered (middle) and unfiltered (right) seawater from the natural habitat of the squid and symbiont.

## **PREDATION AND THE HOST IMMUNE SYSTEM**

In non-host environments, microbes are attacked by pathogens and preyed upon by predators such as nematodes, zooplankton, and filter-feeding invertebrates. In hosts, symbionts still face pressures akin to predation. Hosts have potent immune defenses with which both horizontal (Dunn and Weis, 2009) and vertical (Wang et al., 2009; Laughton et al., 2011) symbionts must sometimes contend. These defenses are analogous to predators as they suppress population growth and can eliminate organisms from an environment (Sachs and Wilcox, 2006; Kim et al., 2013). In some cases, a multitude of bacteria enter a host but cannot pass increasingly specific checkpoints to establish within the host (Nyholm and McFall-Ngai, 2004; Kim et al., 2013). Microbes are killed by a range of host immune responses, including phagocytosis, antimicrobial peptides, and reactive oxygen species (Davidson et al., 2004; Login and Heddi, 2013). Hosts can also suppress or regulate established symbiont populations. Carpenter ants reduce bacterial symbiont populations through modulation of an immune response during development (Ratzka et al., 2013). Similarly, weevils express antimicrobial peptides in symbiont-housing cells to regulate symbiont populations (Login et al., 2011). Although it is not known if host control of symbiont growth via immune system "predation" is universal, it is clear that symbionts do not grow unfettered in hosts.

Symbiont growth may also be controlled using mechanisms unconnected to the immune system. Rhizobia root nodule bacteria (Udvardi and Poole, 2013), algal symbionts of corals (Wooldridge, 2010), insect bacterial symbionts (Login and Heddi, 2013), and lichen photobionts (Ahmadjian, 1993) can have lower growth rates relative to their free-living counterparts. The growth of *Symbiodinium* algae is suppressed in corals relative to free-living *Symbiodinium*, but the rate of photosynthesis is comparable in both populations (Muscatine et al., 1984; Falkowski et al., 1993), suggesting algal energy is directed toward producing photosynthate for the host rather than self-growth. In other hosts, proliferating *Symbiodinium* cells are preferentially expelled over non-proliferating cells (Baghdasarian and Muscatine, 2000). However, growth suppression of certain symbiont cells in the host does not single-handedly indicate a deleterious effect on symbionts. The real indicator of a beneficial association is an increased capacity to reproduce in the host relative to the non-host niche, which has not been sufficiently addressed.

#### **HOST-PROVIDED NUTRITION**

There are clear examples in which symbionts receive nutrients like amino acids (Graf and Ruby, 1998; Macdonald et al., 2012) from hosts. Rhizobia bacteria receive numerous compounds from their plant hosts, including amino acids, sugars, and trace ions (Prell et al., 2009; Udvardi and Poole, 2013). However, it is unclear whether any of these nutrients are beneficial to the symbiont. In the case of amino acids, free-living and cultured rhizobia can synthesize branched chain amino acids on their own, but the synthesis of these amino acids is significantly down-regulated in root nodules, and rhizobia in the host rely solely on the plant for these amino acids (Prell et al., 2009). In this state, "symbiotic auxotrophy," bacteria seem to function more as ammonia-producing organelles than as organisms seeking to increase their fitness. Similarly, *V. fischeri*, bobtail squid symbionts, receive amino acids, fatty acids and chitin from their hosts (Graf and Ruby, 1998; Jones and Nishiguchi, 2006; Wier et al., 2010). However, there is evidence that *V. fischeri* benefit from these host-derived nutrients or another aspect of host association, as environmental populations are larger in habitats with squid hosts compared to those without squid (Lee and Ruby, 1994; Jones et al., 2007). Ultimately, measures of microbial growth along with direct tests of the fate of microbes inside and outside hosts are crucial for understanding the effect of host-derived nutrients.

#### **RECOMMENDATIONS FOR INVESTIGATING SYMBIONT FITNESS**

The effect of microbes on hosts has been quantified in many systems by measuring fitness in symbiotic and aposymbiotic hosts, but the effect of host-association on symbionts has been tested far less frequently (**Figure 1A**). One experiment in the squid-*Vibrio* system serves as a model for symbiont experiments using the comparative fitness approach (**Figure 1B**). Wollenberg and Ruby (2012) inoculated bobtail squid, filtered seawater, and unfiltered seawater with *V. fischeri* strains that were either highly prevalent or rare symbionts in squid hosts. The common symbionts grew as well as the rare symbionts in the squid host and in filtered water, but displayed a distinct population decline in unfiltered seawater (Wollenberg and Ruby, 2012), likely due to predation or competition from other seawater inhabitants. This is one of the only experiments demonstrating that symbionts have an increased reproductive capacity and higher fitness within hosts relative to non-host environments. It is important to note that this experiment found an effect because it utilized natural environments (ocean water with diverse microorganisms and nutrients) rather than culture-based conditions.

Population growth is an appropriate measure of fitness for many microbes because growth and offspring production are usually the same, i.e., binary fission. There are many easy and reliable methods for measuring microbial population growth, including counting by culturing (CFUs or OD600), counting labeled cells with a microscope or flow cytometer, and counting gene copies with quantitative polymerase chain reaction (qPCR). However, there are alternative measures of fitness that can also be employed, including future reproduction (Ratcliff et al., 2012), reproductive structures, e.g., fruiting bodies (Huang et al., 2006), sporulation (Pringle and Taylor, 2002), transmission (Huang et al., 2006), and virulence (Bryner and Rigling, 2012). These measures are routinely used to measure pathogen fitness; for instance, virulence measured as the percentage of hosts killed can serve as a proxy for microbial fitness (Parker et al., 2014). These alternative fitness measures may be more appropriate for many symbionts, especially those with complex lifecycles such as fungi (Pringle and Taylor, 2002) and protists (Devreotes, 1989). Certain nodulated rhizobia, for example, undergo multiple rounds of endoreplication, each time doubling the chromosome without completing cell division (Udvardi and Poole, 2013). Therefore, comparing population sizes of rhizobial bacteria in and outside the host using a gene counting method like qPCR would provide an inflated count of population size, and an alternative measure would be more appropriate. Additionally, alternative fitness measures may detect a benefit to symbionts even when their relative growth rate is lower in hosts than in other niches.
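The two quantitative points above, growth as a fitness proxy and the inflation of qPCR counts by endoreplication, can be illustrated with a minimal sketch. The function names, the assumption of exponential growth, and all numerical values are hypothetical and serve only to show the arithmetic; they are not taken from any of the cited studies.

```python
import math


def malthusian_growth_rate(n_start, n_end, hours):
    """Per-capita growth rate (per hour), assuming exponential growth."""
    return math.log(n_end / n_start) / hours


def cells_from_gene_copies(gene_copies, endoreplication_rounds=0,
                           copies_per_genome=1):
    """Convert qPCR gene copies to an estimated cell number.

    Each round of endoreplication doubles the chromosome without cell
    division, so a cell carries 2**rounds genome copies.
    """
    genomes_per_cell = 2 ** endoreplication_rounds
    return gene_copies / (copies_per_genome * genomes_per_cell)


# Hypothetical qPCR result: 8 million copies of a single-copy gene.
naive_cells = cells_from_gene_copies(8.0e6)
# Same signal if every cell had completed three endoreplication rounds.
corrected_cells = cells_from_gene_copies(8.0e6, endoreplication_rounds=3)

# Hypothetical counts: 1,000 cells grow to 1,000,000 cells in 24 h.
r_host = malthusian_growth_rate(1.0e3, 1.0e6, hours=24)

print(naive_cells, corrected_cells, round(r_host, 3))
```

In this illustration the uncorrected estimate is eight times the corrected one, which is the kind of inflation that would bias a host versus non-host comparison based on gene copies alone.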

One challenge of comparative fitness assays is duplicating an appropriate non-host environment. For example, gene expression differences between symbiotic and free-living rhizobia have been investigated in many studies, but they have almost exclusively used cell culture as the "free-living" environment (Barnett et al., 2004; Djordjevic, 2004; Capela et al., 2006; Karunakaran et al., 2009; Tatsukami et al., 2013; Peng et al., 2014). Comparisons between host-associated and cultured symbionts can provide insight into responses to ecologically relevant conditions, such as low oxygen and nutrient limitation, but they cannot duplicate the complexity and heterogeneity of natural conditions. Ideally, fitness experiments would be done in substrate taken directly from the environment, as was the seawater for the *V. fischeri* experiment above. Semi-natural substrates like potting soil or aquarium sea salt mixtures are somewhat more informative than cell culture. In other cases, it may not be known if there is a non-host habitat or what the symbiont's full habitat range is, and coupling symbiosis research with more traditional microbial ecology can inform these experiments (Zahran, 2001; Garcia et al., 2014).

Advances in "omics" technologies (genomics, transcriptomics, etc.) have provided new approaches to investigate symbiont fitness. Although omics approaches do not directly test symbiont fitness, they can illuminate the "terms" of the relationship and hint at benefits. For instance, up-regulation of vitamin production in the host could suggest a nutritional benefit for symbionts, while overexpression of anti-phage proteins may indicate protection of symbionts from pathogens. Omics data can be used to direct and refine comparative fitness assays. For example, simultaneous transcriptome sequencing of *Porites* (a coral) and *Symbiodinium* (its symbiont) revealed that neither partner could synthesize a complete repertoire of amino acids. This, coupled with up-regulation of transport proteins, suggests amino acids are transported between host and symbiont, including amino acids that may be a limiting resource for *Symbiodinium* outside the host (Shinzato et al., 2014). Targeted experiments could test the fitness effect of nitrogen limitation or removal of specific amino acids on *Symbiodinium* growth in the host and seawater. Omics studies may be especially useful when laboratory fitness assays do not reveal any difference between host-associated and free-living microbes (because the benefit depends on a factor not present in the lab).

One disadvantage of growth as a fitness measure is its emphasis on short-term, immediate benefits at the expense of long-term, rare benefits, which could include access to novel genetic diversity or dispersal. HGT is an important source of novel DNA in prokaryotes, and there is considerable evidence that HGT is important in symbiosis (Marchetti et al., 2010; Husnik et al., 2013; McFall-Ngai et al., 2013). HGT is impeded by separation between appropriate donor-recipient pairs, which could be overcome when closely related prokaryotes, which are more likely to be compatible (Popa and Dagan, 2011), come together in a host. HGT is particularly prevalent in proteobacteria (Nielsen et al., 2014), a phylum rife with insect (Kikuchi et al., 2011), marine invertebrate (Dubilier et al., 2008; Bright and Bulgheresi, 2010), and leguminous plant symbionts (Zahran, 2001). Genomic analysis indicates that genes that control host specificity and colonization in the proteobacteria *Xenorhabdus nematophila* (Cowles and Goodrich-Blair, 2008) and *V. fischeri* (Mandel et al., 2009) have likely been acquired via HGT. Although some proteobacterial endosymbionts have lower rates of HGT than their close relatives (Kloesges et al., 2011), this is not true for proteobacteria in mammalian guts (McFall-Ngai et al., 2013). Additionally, HGT may be especially adaptive for horizontal symbionts as they could access novel DNA within hosts, even if host association was detrimental to short-term fitness. Dispersal may be a similarly rare but beneficial event. Mobile hosts such as flying insects or pelagically dispersed coral larvae (Wirshing et al., 2013) may transport symbionts to novel environments or hosts that better support symbiont growth. Dispersal would be of particular benefit in systems where local extinction is possible. These rare benefits may provide small or hard-to-measure fitness gains to symbionts that outweigh other short-term costs associated with inhabiting a host or another niche.

Finally, in order to persist, horizontal symbionts must outlive their host by dispersing to a new host or free-living habitat. In some systems, there is clear release of viable symbionts back into the environment. Bobtail squid expel ∼95% of their symbionts in a daily cycle (Lee and Ruby, 1994), and gene expression studies indicate symbionts prepare for life outside the host before expulsion by up-regulating flagellar genes and making metabolic changes (Jones and Nishiguchi, 2006; Wier et al., 2010). Some legumes (Bright and Bulgheresi, 2010) and marine invertebrate hosts (Sachs and Wilcox, 2006), including coral (Baghdasarian and Muscatine, 2000), also release viable symbionts, though this has primarily been considered a way to rid themselves of poor symbionts (Douglas, 2008). In contrast, some hosts can kill, digest, or otherwise prevent viable symbionts from cycling back into the environment. Some rhizobia have undergone such extreme physiological changes that they are no longer viable outside the host, though they do remain metabolically active (Mergaert et al., 2006). In many systems, it is unknown whether symbionts can leave the host, much less whether they are viable in the environment. Determining whether a symbiont can leave the symbiosis and proliferate is important, as transmission dynamics, the cornerstone of pathogen fitness and evolution (de Roode et al., 2008), undoubtedly play a role in the ecology and evolution of beneficial symbionts as well.

Symbiosis is an important and intensely studied topic in evolution and ecology. However, core concepts including how beneficial symbioses are formed and maintained over evolutionary time are not well developed. The most common hypothesis is that these associations are maintained through mutual benefit. However, in cases where there is no evidence of a symbiont benefit, symbionts may instead be more akin to prisoners or farmed crops than equal partners. Even if symbionts do exhibit increased reproductive ability in hosts, this could ultimately be of little evolutionary benefit, in much the same way cattle populations increase through ranching but, as most cattle are sacrificed prior to reproduction, they do not receive a fitness benefit. Therefore, it is important to determine whether hosts imprison symbionts and whether symbionts have adaptations to evade capture in addition to measuring costs and benefits of presumed mutualisms (Douglas, 2008). Even in this warden-prisoner model of host–microbe association, it is important to recognize there may be both costs and benefits to associating with a host and to identify the short- and long-term fitness consequences for microbes in a variety of contexts. Ultimately, it is clear that progress in symbiosis research requires inclusion of the symbiont side of symbiosis.

## **AUTHOR CONTRIBUTIONS**

Justine R. Garcia and Nicole M. Gerardo developed the ideas presented here. Justine R. Garcia wrote the manuscript and Nicole M. Gerardo revised and edited it. Justine R. Garcia and Nicole M. Gerardo both approve of the final version of this manuscript and take responsibility for all its contents.

#### **ACKNOWLEDGMENTS**

We thank Alice Laughton, Stephanie Chiang, Lynn Griffin, Nelle Couret, Jaap de Roode, Berry Brosi, Les Real, and Todd Schlenke for their critical assessment and insightful discussions. Funding was provided by a National Science Foundation (NSF) Graduate Research Fellowship to Justine R. Garcia and NSF grant IOS-1149829 to Nicole M. Gerardo.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 May 2014; accepted: 10 September 2014; published online: 26 September 2014.*

*Citation: Garcia JR and Gerardo NM (2014) The symbiont side of symbiosis: do microbes really benefit? Front. Microbiol. 5:510. doi: 10.3389/fmicb.2014.00510*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Garcia and Gerardo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Production possibility frontiers in phototroph:heterotroph symbioses: trade-offs in allocating fixed carbon pools and the challenges these alternatives present for understanding the acquisition of intracellular habitats

## *Malcolm S. Hill\**

*Department of Biology, Gottwald Science Center, University of Richmond, Richmond, VA, USA*

#### *Edited by:*

*Monica Medina, Pennsylvania State University, USA*

#### *Reviewed by:*

*Simon K. Davy, Victoria University of Wellington, New Zealand Mathieu Pernice, University of Technology Sydney, Australia*

#### *\*Correspondence:*

*Malcolm S. Hill, Department of Biology, Gottwald Science Center, University of Richmond, Richmond, VA 23173, USA e-mail: mhill2@richmond.edu*

Intracellular habitats have been invaded by a remarkable diversity of organisms, and strategies employed to successfully reside in another species' cellular space are varied. Common selective pressures may be experienced in symbioses involving phototrophic symbionts and heterotrophic hosts. Here I refine and elaborate the Arrested Phagosome Hypothesis that proposes a mechanism that phototrophs use to gain access to their host's intracellular habitat. I employ the economic concept of production possibility frontiers (PPF) as a useful heuristic to clearly define the trade-offs that an intracellular phototroph is likely to face as it allocates photosynthetically-derived pools of energy. Fixed carbon can fuel basic metabolism/respiration, it can support mitotic division, or it can be translocated to the host. Excess photosynthate can be stored for future use. Thus, gross photosynthetic productivity can be divided among these four general categories, and natural selection will favor phenotypes that best match the demands presented to the symbiont by the host cellular habitat. The PPF highlights trade-offs that exist between investment in growth (i.e., mitosis) or residency (i.e., translocating material to the host). Insights gained from this perspective might help explain phenomena such as coral bleaching because deficits in photosynthetic production are likely to diminish a symbiont's ability to "afford" the costs of intracellular residency. I highlight deficits in our current understanding of host:symbiont interactions at the molecular, genetic, and cellular level, and I also discuss how semantic differences among scientists working with different symbiont systems may diminish the rate of increase in our understanding of phototrophic-based associations. I argue that adopting interdisciplinary (in this case, inter-symbiont-system) perspectives will lead to advances in our general understanding of the phototrophic symbiont's intracellular niche.

**Keywords:** *Symbiodinium*, *Chlorella*, investment strategies, endocytobiology, intracellular mimicry, phagosomes

## **BACKGROUND**

Symbiotic associations between different species with conjoined evolutionary trajectories are among the most common ecological interactions in biological communities (Thompson, 2005; Douglas, 2010). They also represent some of the most important evolutionary moments for life on this planet given that the genesis of the Domain Eukarya involved successful invasion of host cells by bacterial endosymbionts (e.g., mitochondria and chloroplasts—Knoll et al., 2006). Symbiotic interactions are exceptionally diverse and include everything from pollinators/mycorrhizal symbionts and their plant hosts, to parasites that castrate snails, to intracellular mutualists and parasites (e.g., Thompson, 2005; Douglas, 2010; Vergara et al., 2013). The evolutionary responses of endocytobiological associations are particularly interesting due to the high degree of intimacy between partners, which has the potential to generate complicated evolutionary patterns as the host and symbiont respond to the selective pressures each places on the other (e.g., Thompson, 2005).

Organisms that occupy intracellular habitats must avoid the host's cellular defenses (e.g., immunological response, phagotrophy—Scott et al., 2003; Martirosyan et al., 2011; Sibley, 2011). Despite the challenges of living inside a cell, many symbionts have successfully invaded this habitat as parasites and mutualists (e.g., Schwarz, 2008; Nowack and Melkonian, 2010; Heinekamp et al., 2013; Romano et al., 2013). The dynamic adaptive landscapes associated with endocytobiological interactions can generate tight integration between the partners such that the evolutionary manifestation is an obligate association for one or both species (e.g., Amann et al., 1997). From an evolutionary perspective, however, the earliest stages of intracellular occupancy must, to some degree, involve facultative associations. It is clear that we do not fully understand nuanced aspects of evolutionary processes that shape many intracellular interactions, and thus the patterns (e.g., host specialization; Thornhill et al., 2014) that emerge from them.

Symbioses between phototrophs and heterotrophs are common in many ecosystems (e.g., lichens, *Chlorella*- and *Symbiodinium*-based symbioses). These ancient associations have been a focus of study for decades in a variety of systems (e.g., Karakashian and Karakashian, 1965; Kremer, 1980; Weis, 1980; Wilkerson, 1980; Brodo et al., 2001; Yuan et al., 2005). The algal partners often contribute substantial energy reserves to their hosts, and in many cases are located intracellularly. In some cases, adaptations for symbiotic life styles have been detected (e.g., Blanc et al., 2010). The dynamics of establishing the partnership from one generation to the next are complex, and depend upon characteristics of the species involved in the association. In many cases, algae re-infect hosts each generation from environmental sources. The route of entry into the host for intracellular partnerships is often phagotrophic (**Figure 1**), but the mechanisms that prevent activation of host defenses as a response to the foreign agent (i.e., symbiont) are often unknown. A common narrative can be found in much of the literature. Host cells lack particular vital nutrients, which they obtain from an endosymbiont. Through its beneficence (e.g., preferentially shutting down immunological or digestive processes in response to appropriate algal partners), the host creates a microhabitat, often within specialized cells, that favors algal growth, but only up to a point. If the symbiont population becomes too large, the host imposes some type of control to maintain symbiont population size near a carrying capacity. Under this scenario, hosts must coordinate a complicated choreography of genetic and cellular events in response to symbiont presence. In this context, algae play a limited role in this association, and some have gone so far as to liken them to prisoners involved in "enforced domestication" (Wooldridge, 2010; Damore and Gore, 2011).

**FIGURE 1 | (A)** An example of phagotrophic entry of a potential phototrophic symbiont into a heterotrophic host cell. *Chlorella* were fed to the freshwater sponge *Ephydatia muelleri* where they were captured through phagocytosis by archaeocytes. Scale bar = 1 µm. **(B)** *Chlorella* cell (C) and bacterial prey (B) within separate vacuoles of an archaeocyte adjacent to a choanocyte in *E. muelleri*. The Arrested Phagosome Hypothesis states that the fate of some algal cells (i.e., symbionts) may differ from other potential prey items (e.g., bacterial prey in the archaeocyte) because algal symbionts can avoid digestion by translocating photosynthate to the host, thus mimicking digesting prey (Hill and Hill, 2012). Scale bar = 2 µm.

Two hypotheses that afford symbionts a larger role in initiating and maintaining populations within host cells were presented recently (Hill and Hill, 2012). One of those hypotheses, the Arrested Phagosome Hypothesis (APH), proposes that phototrophs enter a host cell through phagocytosis (**Figure 1**). However, the APH states that the symbiont can then subvert normal endomembrane processes that lead to exocytosis by mimicking an organelle typically associated with digestion (e.g., the phagosome) through the perpetual release of photosynthetically-derived compounds. Thus, under the APH, symbionts have evolved a strategy involving the release of photosynthate so they may remain within the host cell (i.e., occupy habitat) for extended periods of time. It is important to note that the focus here will be on carbon-based photosynthate. This perspective builds on the work of biologists like Muscatine et al. (1981), who examined carbon contributions that zooxanthellae make to coral animal respiration. A major difference between that earlier work and the ideas presented here is that I will focus on strategies that the symbiont employs to procure its cellular habitat. It is also important to note that symbioses like the ones considered here involve nuanced and complicated host-to-symbiont and symbiont-to-host transactions of material like nitrogen and metals involved in photosynthesis (e.g., Fagoonee et al., 1999; Whitehead and Douglas, 2003; Pernice et al., 2012). While the perspectives presented should apply to any material exchanges between symbiont partners that involve trade-offs, the focus here will be on carbon alone. While the APH was proposed to explain how *Symbiodinium* procure residency within heterotrophic hosts in tropical habitats (Hill and Hill, 2012), the hypothesis should apply to nearly any phototroph:heterotroph symbiosis (e.g., those involving *Chlorella* or cyanobacteria, lichens).

Factors driving algae to occupy a host or host cell may differ depending on the partners and the habitat in which the symbiosis occurs. That is, these associations are likely context dependent. For example, selective pressures generated by the limitation of metals, which are essential for electron transport and can be rare in many environments, may be a factor favoring entry of phototrophs into the host cell habitat (Raven et al., 1999; Saenger et al., 2002; Rutherford and Faller, 2003). In marine systems, for instance, the ratio of magnesium to calcium in modern seawater is approximately 5:1. However, this ratio has shifted as rates of continental spreading and terrestrial erosion have waxed and waned (Garrisson, 2007; Ries, 2010). The second hypothesis that affords symbionts a greater role in initiating and maintaining populations within host cells is the Magnesium Inhibition Hypothesis (MIH), which was proposed to explain why *Symbiodinium* seem to prefer hosts that modify CaCO3 solubilities (Hill and Hill, 2012). The MIH states that *Symbiodinium* favor hosts that have the ability to concentrate or release calcium ions, which would otherwise be limiting in the system. In other systems, different host-derived resources (e.g., other limiting metals like iron) might be targets for intracellular occupancy.

Regardless of explanatory hypotheses like the APH or MIH, translocation of photosynthate is ubiquitous in phototrophic symbioses. Greater attention needs to be focused on the conflict that likely exists between host and symbiont over the quantity and quality of material that is translocated. If hosts benefit from greater translocation and symbionts benefit from translocating less material or material of lesser energetic value, then the antagonism between partners might lead to partner specialization as selection favors strategies that mitigate the conflicts. Here, I argue that selection on symbiont-driven allocation strategies deserves greater attention, and recent methodological and theoretical advances offer interesting avenues for future research. My purpose is to provide a useful heuristic for considering selective pressures algal symbionts and their hosts may face in the context of translocation.

#### **DURABLE vs. CONSUMABLE TRADE-OFFS & THE PRODUCTION POSSIBILITY FRONTIER**

Phototrophic organisms create fixed carbon stores through photosynthesis. The chemical energy (e.g., reduced sugars) is then used to power a variety of physiological functions including basal metabolism and growth/reproduction. Excess energy can be stored for future consumption. Phototrophic endosymbionts face additional debits against their energy budget in the form of material translocated to the host. The APH views the intracellular space as one that can be leased from a host (Hill and Hill, 2012; **Figure 1**). The "cost" of occupying the intracellular space is the fraction of photosynthetically-fixed C that is translocated, which has been reported to reach 95% for some species of corals and dinoflagellate endosymbionts (Falkowski et al., 1984; Muscatine and Weis, 1992; Yellowlees et al., 2008; Muller et al., 2009; Stambler, 2011). While there is little doubt that material is translocated to heterotrophic hosts, Davy et al. (2012) point out that many deficits exist in our current understanding of the quantity and type of material translocated to heterotrophic hosts. The 95% value quoted above is too general and imprecise to be of use for specific symbioses, and greater work is required to create a realistic picture of the material that moves between symbiotic partners. Nonetheless, the physiological characteristics of the host cell would set the price of the space, and the symbiont would have to "pay" at a particular rate and with particular expectations of materials released. It is important to note that in addition to costs required to occupy the endomembrane system, host cells would also have a unique molecular genetic milieu (e.g., immunological responses) that would impose another level of selection on an invading symbiont. 
However, the APH points to a clear life history trade-off from the symbiont's perspective—for every increase in material translocated to the host, the symbiont suffers a reduction in the amount of energy available for other physiological needs like mitosis.
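This trade-off can be written as a simple carbon budget. The symbols below are introduced here purely for illustration and are not notation used by Hill and Hill (2012): let $P_g$ be gross photosynthetic production, $R$ the carbon spent on basal metabolism and respiration, $M$ the carbon invested in mitosis, $T$ the carbon translocated to the host, and $S$ the carbon placed in storage.

$$
P_g = R + M + T + S
$$

If $P_g$, $R$, and $S$ are held constant, any change in translocation must be matched by an equal and opposite change in the carbon available for cell division:

$$
\Delta M = -\Delta T
$$

This is only a bookkeeping identity under the stated assumptions; it says nothing about how efficiently carbon is converted into new cells.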

For many of the algae that form symbioses with heterotrophic hosts, asexual reproduction is the dominant mode of population increase (Pettay et al., 2011; Thornhill et al., 2013), and mitosis is an energy-consuming process because DNA, cellular machinery, organelles, etc. are duplicated to provision each daughter cell. Given that cell division and translocation draw on the same primary production pool of fixed carbon (ignoring for the moment basal metabolism and storage), investing in one process or the other raises the possibility that the two compete for the energy represented by the limited products of photosynthesis. A common graph used in economics provides a useful tool to visualize the trade-off that phototrophic symbionts face (**Figure 2**). The production-possibility frontier (PPF; Gillespie, 2007) is a curve that depicts possible production sets representing the most efficient distribution of two commodities that draw on the same inputs for their production. Points on the curve are known as "Pareto" efficiencies (Gillespie, 2007). The curve also helps define the opportunity costs that exist within a system. A typical example from economics highlights the trade-offs that exist when an agent can decide whether it should produce a durable or a non-durable (i.e., consumable) good. Durable and non-durable goods are often compared in this manner because shifts along the PPF provide information about the investor's "interpretation" of future benefits of current investments. That is, investing in a durable good provides some indication that the investor interprets conditions as conducive for future growth. For algal symbioses, we can consider two commodities in which an algal cell might invest. The first is production of new cells generated through mitosis; new algal cells are analogous to durable goods since they last for a substantial period of time. The second way that energy might be invested involves translocating photosynthate to the host, which is analogous to investing in a non-durable/consumable good that is used up immediately. As with any trade-off, investment in one commodity necessitates reducing investment in the other. In phototrophic symbioses, the PPF represents the set of ratios of cells produced relative to material translocated to the host. Each point along that curve represents the most efficient number of algal cells produced for that amount of fixed carbon translocated to the host (**Figure 2**).

**FIGURE 2 | Production possibility frontier (PPF; red curve) represents trade-offs in investment strategies that phototrophic symbionts may face with the photosynthate they create** *in hospite***.** Algae may use their energy stores to create more cells through mitosis (a durable good; see orange arrow), but this comes at the cost of carbon that is translocated to the host (a consumable good; see gray arrow). It is assumed that natural selection would rapidly remove inefficiencies (star in graph) where more carbon could be translocated or its energetic equivalents used for cell division. Thus, "Pareto efficiencies" that comprise the curve represent evolutionary optima. The tangent to the curve represents opportunity costs associated with producing one commodity over the other. A prediction of the Arrested Phagosome Hypothesis is that symbionts will increase the time they reside in a cell by translocating more material to the host (moving from mutant 2 to 1). However, if a mutant can release less photosynthate without losing its ability to evade host defenses (moving from mutant 1 to 2), then natural selection may favor that strategy as more cells will be available to colonize additional cells and hosts in the environment. If the PPF shifts inward (green curve) due to some major environmental event (e.g., thermal stress), the symbionts are faced with a smaller energy budget. If the amount of photosynthate that must be translocated to meet host demands does not change, fewer cells can be produced (see open points on the red and green curves). This is a scenario that might lead to phenomena like coral bleaching.
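The PPF described here can be made concrete with a small numerical sketch. The article gives no equation for the curve, so the quarter-ellipse form below, the intercepts `t_max` and `n_max`, and the two mutant values are all hypothetical choices made only to obtain a concave frontier. The sketch shows how a point on the frontier, the opportunity cost (the negative slope of the tangent), and an inefficient interior point could be computed.

```python
import math


def cells_on_frontier(translocated, t_max=100.0, n_max=50.0):
    """Largest number of new cells attainable for a given amount of
    translocated carbon, under an ASSUMED quarter-ellipse frontier:
    (translocated / t_max)**2 + (cells / n_max)**2 = 1.
    """
    if not 0.0 <= translocated <= t_max:
        raise ValueError("translocated carbon must lie between 0 and t_max")
    return n_max * math.sqrt(1.0 - (translocated / t_max) ** 2)


def opportunity_cost(translocated, t_max=100.0, n_max=50.0):
    """Cells forgone per additional unit of carbon translocated, i.e. the
    negative slope of the tangent to the assumed frontier at this point."""
    cells = cells_on_frontier(translocated, t_max, n_max)
    if cells == 0.0:
        return math.inf  # the frontier is vertical where it meets the x-axis
    return (n_max / t_max) ** 2 * translocated / cells


def is_inefficient(translocated, cells, t_max=100.0, n_max=50.0):
    """True for interior points (below the frontier), like the star in Figure 2."""
    return cells < cells_on_frontier(translocated, t_max, n_max)


# Hypothetical mutants: mutant 1 translocates more carbon than mutant 2.
mutant_1 = (80.0, cells_on_frontier(80.0))
mutant_2 = (60.0, cells_on_frontier(60.0))
print("mutant 1 (translocated, cells):", mutant_1)
print("mutant 2 (translocated, cells):", mutant_2)
print("opportunity cost at 60 and 80:", opportunity_cost(60.0), opportunity_cost(80.0))
```

Under these assumed numbers, mutant 2 produces more cells than mutant 1, and the opportunity cost of translocation rises as more carbon is given up, which is the qualitative behavior of a concave frontier.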

The only way to increase the number of algal cells produced for a particular investment in translocated material is to shift the PPF outward. But moving the PPF requires a change in the pool of fixed carbon that is available for investment (e.g., an algal mutant that is more photosynthetically efficient, or the environment changes so light levels or nutrient load is higher). An inward shift of the PPF represents a scenario where the pool of carbon available for investment decreases, which might be expected when photosynthetic ability is compromised (e.g., under thermal stress). Under this scenario, only symbionts that could maintain a level of translocation to meet host demands, and establish a rate of population growth that was sustainable, would remain in symbiosis. If we assume, however, that the system is static (i.e., no improvements in technological (i.e., physiological) abilities to increase the fixed carbon pool), then natural selection could act on strategies that algal symbionts employ to gain competitive advantages within a particular host. For example, a mutation that gives its bearer elevated cell division rates (and thus lower translocation rates, mutant 2 vs. mutant 1 in **Figure 2**) might appear in a symbiont population harbored by a single host. Provided that this mutant does not trigger a defensive or digestive response from the host, it would have a competitive advantage over other individuals in the symbiont population (see Frank, 1996). This opens the possibility of evolutionary changes within hosts, and possibly among the other hosts that exist in the habitat (but see Damore and Gore, 2011). Alternatively, if residence time is the phenotype that natural selection favors, a mutant that translocates more fixed carbon (with lower rates of division) might increase in frequency because the host detects it less frequently (e.g., higher translocation rates of mutant 1 vs. mutant 2 in **Figure 2**).

But what evidence exists that phototrophs face the kind of trade-off in translocation vs. mitosis envisioned here? There are many indirect lines of evidence that a trade-off exists. It has long been known that algae translocate carbon and that the dynamics of that translocation process are complicated. For example, almost a half-century ago, Smith et al. (1969) consolidated evidence that metabolite transfer from symbiont to host is a widespread phenomenon in mutualistic and parasitic associations. It has also been known for many years that cultured *Symbiodinium* release only a fraction of their photosynthate compared to algae found in hosts; cultured algae also have distinct morphologies compared to algae in intact symbioses (e.g., Colley and Trench, 1983; Domotor and D'Elia, 1986). Ritchie et al. (1997) found that a commercially available synthetic fungicide stimulated release of fixed carbon products. More recently, Grant et al. (2006) found that a host release factor from the coral *Plesiastrea versipora* stimulated the release of glycerol from its *Symbiodinium* symbiont. The authors argued that the diversion of glycerol from the algae reduced internal stores of triacylglycerols and starch, which in turn would help the host regulate growth of intracellular algae. However, Suescún-Bolívar et al. (2012) provide the most direct test of the existence of trade-offs in phototroph:heterotroph symbioses. They induced release of glycerol from *Symbiodinium* growing in culture by exposing the dinoflagellates to osmotic upshocks. The osmotic treatments did not affect photosystem performance or survivorship, but did reduce population sizes, which the authors attributed to a reduction in cell division rates for the *Symbiodinium* that released glycerol. These results should be interpreted carefully given that glycerol released by *Symbiodinium* may be a response to stress and not a translocated compound (see review by Davy et al., 2012).

Despite the caveats mentioned previously, there is evidence that the type of trade-off envisioned in the PPF (**Figure 2**) exists in phototroph:heterotroph symbioses. Furthermore, there seems to be a significant capacity for modifying the quality and quantity of material translocated. Burriesci et al. (2012) found that highly efficient mechanisms exist for translocation of newly synthesized glucose from *Symbiodinium* to its *Aiptasia* host. Glucose appeared in host tissue as quickly as 2 min after exposing anemones to stable isotopes of CO2 and moving them into the light after rearing them in the dark. Other solutes appeared in host tissues at much later time points. The solutes mannose, inositol, threonine, glutamine, and succinate appeared after 1 h. Other solutes appeared after 1 day (e.g., glycerol, glutamic acid, and pentaric acid) and 1 week (e.g., glycine and β-alanine), though these compounds may represent downstream products of host metabolism (e.g., Starzak et al., 2014). If variability in the release rates of these and other compounds exists among phototrophs within a population of symbionts, then natural selection could operate to favor variants with strategies that optimally match the characteristics of the host cellular machinery, a process that might ultimately lead to host specialization (Thornhill et al., 2014).

One of the most important insights gained from the PPF perspective is a clear statement of the problem of conflicts between partners in phototroph:heterotroph symbioses. As far as translocated carbon is concerned, hosts would appear to favor symbionts that give up more material because host fitness would increase. Symbionts, on the other hand, would appear to favor hosts that demand fewer of their photosynthetically-derived reserves because they could translate those energy gains into additional mitotic events (enhancing within- and among-host competitiveness). The phenotype observed in a particular holobiont combination is thus the manifestation of a tug-of-war between the competing pressures of translocating more material [to reduce the probability of being detected by the host, thus increasing within-cell residence time] and dividing more rapidly [which would produce more cells and might confer a competitive advantage through higher infective/dispersive capabilities compared to slower growing mutants]. It seems that productive research possibilities exist in exploring the precise mechanisms that regulate the evolutionary and ecological interactions between partners in the context of these tradeoffs.

Furthermore, by emphasizing the reciprocal dynamics of host:symbiont interactions in terms of material goods that are exchanged between partners (as the APH does—Hill and Hill, 2012), an opportunity exists to consider one mechanism that might lead to specialization between partners. There are likely many strategies available to symbionts that would lead to faster growth rates (see above), and selection would often favor faster growing symbionts that remain undetected by the host. The symbionts have short generation times, large population sizes (albeit small effective population size due to clonality), and mutation rates—conditions that would provide constant fuel for rapid evolutionary change. The relative fitness of different symbiont strains (created via mutation) constitutes a major force that might drive rapid lineage turnover within a host. The long-term fate of these common genetic changes would depend on the interplay of effective population size and natural selection. Population-level processes such as selection, migration, and recombination will also help shape the genetic diversity of symbionts among and within hosts (Santos et al., 2003; Thornhill et al., 2009, 2013; Andras et al., 2011; Pettay et al., 2011). The genetic footprint of these processes is likely to be complex, but it is clear that opportunities exist for rapid adaptation of symbiont to its host environment. Within each host, symbiont populations might experience diversifying selection driven by pressure to evade immune systems (e.g., Endo et al., 1996) while simultaneously experiencing stabilizing or directional selection in response to the host's energetic expectations for translocated material. Rapid onset of local adaptation by the symbiont to its host (involving selective sweeps) might be expected (e.g., Thornhill et al., 2014).

Contrary to the perspective presented above, some theoretical models find that symbionts become enslaved partners precisely due to the substantial differences in evolutionary rates between partners (e.g., Frean and Abraham, 2004; Damore and Gore, 2011). In these models, the host does not respond to the selective pressures created by the symbiont because its relative evolutionary rate is so much slower than the symbiont's. The rapidly evolving species, typically the symbiont, becomes highly cooperative, while the slowly evolving one, typically the host, does not reciprocate the cooperativeness (Frean and Abraham, 2004; Damore and Gore, 2011). However, these models often assume a quality of interaction (especially from the perspective of the symbiont) that is difficult to defend from biological first principles.

Many are beginning to explore the role of the symbiont in processes of invasion and establishment of intracellular residency, and as agents with independent evolutionary trajectories (e.g., Rodriguez-Lanetty et al., 2006a; Schwarz, 2008; Weis et al., 2008; Davy et al., 2012), but a host-centric lens is still too often applied to understanding the associations in ways that mask possibly important interactions. Much of the difficulty of studying these microscopic symbionts is due to their opaque life histories. One overgeneralizes if host:symbiont interactions involving *Symbiodinium*, *Rhizobium*, *Buchnera*, endomycorrhizae, or phages infecting cyanobacteria are lumped together as if they behave identically, as has been implied in some of the models developed to date (e.g., Frean and Abraham, 2004). These associations involve quite different agents of biological interaction operating at different scales and degrees of intimacy. While the models that have been developed describe interesting dynamics, biologists must determine to which symbioses they apply. For example, Frean and Abraham (2004) state that "Surprisingly, in very few cases have endosymbionts been shown to benefit significantly from their interactions with host organisms. . . . For the putative benefits of symbiotic life as zooxanthellae, dinoflagellates give up their cell wall and their flagella, sacrifice most of their photosynthetic products, and reduce their reproductive rate." Rather than viewing these as losses, it may be more profitable to look at them as strategies for host occupancy and for production of daughter cells to infect new hosts. In some models, the symbionts are assumed to enter a host where they become trapped until the host dies (Frean and Abraham, 2004). This assumption can be rejected for many of the algal symbioses that create stable phototroph population sizes despite constant input from mitotic events, which indicates that each algal cell has a particular residence time within the host and a free-living stage in the environment (Hill and Hill, 2012). Further, the payoff matrices used in some approaches that employ game theory (e.g., the snow drift model; Damore and Gore, 2011) do not map onto or reflect symbioses we find in nature. There are reasons to believe the payoffs experienced by the hosts and symbionts operate on different scales with different magnitudes.

Cooperation and defection might be appropriate terms to describe some interactions (e.g., production of a joint nutrient), but they fail to describe the types of interactions that occur if cellular mimicry is in play, as has been proposed in the APH (Hill and Hill, 2012). That is, how can a host cooperate if it is "unaware" that it is in a game, and the two species are not "fighting" over the benefits of a mutualism? If a host is being duped by a symbiont, the situation described in some theoretical approaches begins to dramatically violate assumptions of the model, which calls into question the generality of the findings (e.g., Damore and Gore, 2011). In this light, the co-evolutionary possibilities become more intriguing, and additional model approaches may be beneficial (see also Frank, 1996; Friesen and Jones, 2012). For example, it may be that modern scleractinian corals are ecologically naïve, and have evolved reduced predatory efficiency because they have been energetically subsidized for millions of years by their *Symbiodinium* symbionts. Plasticity in host feeding, and a strong feedback system between symbiont and host, indicate a continued reliance on heterotrophy by both the host and symbiont (e.g., Grottoli et al., 2006; Ferrier-Pages et al., 2010). However, the assurance of energetic inputs from algae, extrapolated over many millennia, may have weakened the selective pressure on structures and behaviors involved in predation. Contrary to the models described above, it may be that coral hosts have been selected to be, in a sense, highly cooperative.

### **DYNAMIC ENERGY BUDGET PERSPECTIVES AND MECHANISMS OF HOST:SYMBIONT INTERACTION**

The trade-offs articulated above emphasize identifying optimal investment strategies that a symbiont might adopt to reside within heterotroph cells. Another useful perspective is one that looks at the consequences of changing the productive capacity of phototrophs (i.e., shifting the PPF). If efficient strategies exist for persisting in host habitats, then any stressors that decrease the productive capacity of the system would lead to major consequences for the holobiont. The green curve in **Figure 2** represents a scenario where the fixed carbon pool available for investments has diminished greatly. If the intracellular residency costs remain the same, that is, the quantity of photosynthate required by the host stays at a certain level, then the number of cells that could be produced would drop. This is illustrated in **Figure 2** where the open points on both curves are at the same location on the x- but not the y-axis. This might lead to reductions of the total number of symbionts harbored by the host, as observed in bleaching events for hosts with *Symbiodinium* symbionts.
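The consequence of an inward shift can be illustrated with a minimal sketch. It assumes, purely for illustration, a quarter-ellipse frontier whose two intercepts shrink by a common factor `budget_scale`; the function name, the intercepts, and the demand value are hypothetical and are not taken from the article. The host demand is held constant, as for the open points in **Figure 2**.

```python
import math


def cells_at_host_demand(host_demand, budget_scale=1.0,
                         t_max=100.0, n_max=50.0):
    """Cells producible once the host's fixed demand has been translocated.

    ASSUMPTION: the frontier is a quarter ellipse whose intercepts both
    scale with budget_scale (1.0 = original curve, < 1.0 = inward shift).
    Returns None when the reduced budget cannot cover the demand at all.
    """
    if budget_scale <= 0.0 or host_demand < 0.0:
        raise ValueError("budget_scale must be positive and demand non-negative")
    scaled_t_max = budget_scale * t_max
    scaled_n_max = budget_scale * n_max
    if host_demand > scaled_t_max:
        return None
    return scaled_n_max * math.sqrt(1.0 - (host_demand / scaled_t_max) ** 2)


demand = 60.0  # hypothetical host demand, identical under every budget
for scale in (1.0, 0.7, 0.5):
    print("budget scale", scale, "-> cells:", cells_at_host_demand(demand, scale))
```

With these assumed values, the same host demand leaves fewer cells under the reduced budget, and at the smallest budget the demand cannot be met at all, which corresponds to the situation the text links to detection or expulsion.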

A situation where the PPF curve would be shifted inward might occur when environmental conditions change. For example, we can consider the dynamics of host:symbiont interactions in the context of seasonal changes in habitat (**Figure 3**). During the majority of the year, the symbiont can produce a sufficient amount of photosynthate to placate host demands and take care of its other physiological functions. As envisioned in **Figure 3**, the symbiont has some plasticity in its investment in different compartments, and might invest in more mitosis when the host's metabolic rates are low (e.g., in the winter). However, in some seasons costs associated with intracellular occupancy might increase (e.g., as the metabolic demands of the hosts and symbionts increase), which would necessitate consuming more of the total available photosynthate reserve in the service of translocation or basal metabolism. If the thermal stress continues to a point that compromises photosynthetic capability (e.g., PSII damage, Warner et al., 1999), then the amount of primary production that can be invested dwindles, and the energy budget can go into deficit territory. Viewed in this manner, periods of thermal stress that compromise a phototroph's ability to maintain the rate of fixed carbon transfer would elevate detection or expulsion rates. If that stressor persists, phenomena like coral bleaching might be the result ("potential bleaching zone" in **Figure 3**).
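The seasonal argument can be sketched as simple bookkeeping. In the sketch below, every monthly value is invented for illustration, and the priority order (translocation and basal metabolism paid first, any surplus split between storage and mitosis) is an assumption rather than something the article specifies. Months in which obligatory costs exceed the photosynthate pool are flagged as deficit months, loosely corresponding to the "potential bleaching zone".

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def allocate(pool, translocation_demand, bmr, storage_fraction=0.2):
    """Pay obligatory costs first, then split any surplus.

    ASSUMPTION: translocation and basal metabolic rate (bmr) have priority;
    the surplus goes to storage and mitosis in a fixed, hypothetical ratio.
    """
    obligatory = translocation_demand + bmr
    surplus = max(0.0, pool - obligatory)
    return {
        "storage": surplus * storage_fraction,
        "mitosis": surplus * (1.0 - storage_fraction),
        "deficit": max(0.0, obligatory - pool),
    }


def deficit_months(pools, demands, bmrs):
    """Names of the months in which obligatory costs exceed the pool."""
    return [
        month
        for month, pool, demand, bmr in zip(MONTHS, pools, demands, bmrs)
        if allocate(pool, demand, bmr)["deficit"] > 0.0
    ]


# Hypothetical annual series in arbitrary carbon units.
pools = [90, 90, 95, 100, 100, 95, 80, 60, 70, 90, 90, 90]
demands = [40, 40, 45, 50, 55, 60, 65, 65, 60, 50, 45, 40]
bmrs = [15, 15, 15, 20, 20, 25, 25, 25, 20, 20, 15, 15]

print(deficit_months(pools, demands, bmrs))
```

The point of the sketch is only that a deficit can arise from two directions at once: rising demand from host and symbiont metabolism, and a shrinking pool when photosynthesis is compromised.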

Using an energetic budget approach offers important opportunities to examine these symbioses (Lesser, 2013). For example, Muller et al. (2009) used dynamic energy budgets (DEB) to model flows of matter and energy between partners in a phototroph:heterotroph symbiosis. The authors made several simplifying assumptions, including that only excess material (photosynthate or nutrients) is transferred between partners. With the DEB, Muller et al. (2009) found that ambient food density, inorganic nitrogen, and irradiance had little effect on symbiont density whereas light deprivation and nitrogen enrichment caused increases in density. The importance of this type of work is the attempt to compartmentalize physiologically important processes so that nuanced insights might be gained about the nature of the interaction between partners. However, it is important to keep front-and-center the assumptions that these various approaches make, in particular careful consideration of how we describe energy equivalents (Lesser, 2013). Furthermore, several recent studies have measured and modeled the flow of material and energy in coral:*Symbiodinium* symbioses (Tremblay et al., 2012; Gustafsson et al., 2013, 2014). These studies provide detailed perspectives on the dynamism of exchanges that likely occur between partners.

**FIGURE 3 | Hypothesized annual photosynthate budget for an algal symbiont like** *Symbiodinium***.** The thick black line represents the total pool of photosynthate generated through carbon fixation. Four physiological compartments that energy derived from those photosynthates could be invested in include: (1) translocation to host (white), (2) storage (e.g., in lipids; thin dark gray band), (3) mitosis (light gray), and (4) basal metabolic rate (BMR in black). This figure envisions a drastic reduction in primary productive capabilities of the phototroph in the summer months (i.e., an inward shift of the PPF from **Figure 2**). This might be caused, for example, by drastically warmer water. A reduction in the photosynthate reserves might push the symbiont into territory representing energy deficits, which might lead to detection, digestion, or expulsion by the host.

To fully explore any dynamism of energy allocation and the trade-offs proposed above, we require precise information about the molecular and biochemical interactions that occur between the partners. Next-generation sequencing provides opportunities to gain a nuanced and detailed understanding of the interactions that occur between partners at the finest levels of molecular, genetic, and cellular interaction. Recent advances in transcriptomic, proteomic, and metabolomic analyses offer tools to gain fine-scale molecular genetic perspectives on the physiologically important processes mentioned above (e.g., Meyer and Weis, 2012). While bioinformatic tools will expand research opportunities, we also need classic physiological experiments that elucidate meaningful aspects of host:symbiont interactions. Recent work with stable isotopes highlights the promise of precisely documenting the material that is translocated from the symbiont to the host, and that is taken up by the host from the symbiont (Hughes et al., 2010; Weisz et al., 2010; Burriesci et al., 2012; Pernice et al., 2012). Furthermore, the development of aposymbiotic model systems provides a number of empirical possibilities to determine how different symbiont types modulate the relationship with a particular host (e.g., Hambleton et al., 2014; Riesgo et al., 2014).

Efforts to create "model" systems of study will expand empirical opportunities (e.g., Weis et al., 2008; Lehnert et al., 2012), but it is clear that we will gain much if we maintain an explicitly comparative approach to work on these intracellular symbioses. Indeed, adopting an explicitly comparative perspective that unites the findings from different symbiotic partnerships may elucidate common pathways to intracellularity (see below). The consequences extend beyond the phototrophic mutualisms considered here, as any symplesiomorphies identified may be equally valuable for studies of intracellular parasitisms (e.g., malaria, toxoplasmosis). Do parasites release material to secure intracellular habitats in a manner that shares similarities with what we see in phototrophic symbioses? What reciprocal changes might be found in host endomembrane proteins common to associations that involve phototrophs or parasites? What modes of parasite invasion apply to phototrophic associations?

While understanding the mechanisms of interaction at the cellular level is vital, the evolutionary behavior of these associations is relatively unexplored from theoretical perspectives. If the trade-offs described above (**Figure 2**) are important, the specific factors that contribute to particular strategies of persistence within a single host and within a population of hosts need elucidation, especially in the context of holobiont performance. The reciprocal selective pressures that host and symbiont place on each other create interesting evolutionary possibilities. How does specialization evolve in these systems, and do they behave like host:parasite systems that engage in time-lagged, frequency-dependent interactions? Modeling these symbioses from metapopulation perspectives would be particularly interesting. Hosts represent habitat. These habitats have extinction rates that depend on the life history of the host species. For symbioses that involve horizontal acquisition, habitats become available when aposymbiotic propagules appear in the environment. The within-host population may be an asexually derived clonal population, but it is part of a larger metapopulation. Secord (2001) appears to be the first to appreciate this fact.
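One way to begin exploring the metapopulation view is the classic Levins patch-occupancy model, which is brought in here only as an illustration and is not a model proposed in this article. In this reading, hosts are habitat patches, `p` is the fraction of hosts occupied by a symbiont lineage, `colonization` is the rate at which occupied hosts seed aposymbiotic ones, and `extinction` is the rate at which occupied hosts are lost. All parameter values are hypothetical.

```python
def levins_equilibrium(colonization, extinction):
    """Equilibrium fraction of occupied hosts, 1 - e/c, or 0 if e >= c."""
    if colonization <= 0.0:
        raise ValueError("colonization rate must be positive")
    return max(0.0, 1.0 - extinction / colonization)


def simulate_occupancy(p0, colonization, extinction, dt=0.1, steps=5000):
    """Forward-Euler integration of dp/dt = c*p*(1 - p) - e*p."""
    p = p0
    for _ in range(steps):
        p += dt * (colonization * p * (1.0 - p) - extinction * p)
    return p


# Hypothetical rates: colonization exceeds host loss, so the lineage persists.
print(levins_equilibrium(0.5, 0.1), simulate_occupancy(0.05, 0.5, 0.1))
# Hypothetical rates: host loss exceeds colonization, so occupancy decays.
print(levins_equilibrium(0.1, 0.5), simulate_occupancy(0.05, 0.1, 0.5))
```

In this simple formulation, persistence depends only on the ratio of host loss to colonization; a more realistic treatment would need to link both rates to the within-host trade-off between cell division and translocation.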

#### **TERMINOLOGICAL DIFFERENCES: CAUTIONARY TYPOLOGICAL TALES**

"The beginning of wisdom is to call things by their proper name." *Confucius*

". . . [the one] who first seizes the word imposes reality on the other"

*T. Szasz*

It is likely that the trade-offs articulated above apply to any symbiosis that involves a heterotrophic host that harbors a phototrophic symbiont. However, terminological differences among fields compromise our ability to identify common strategies that might exist. For example, in *Paramecium*:*Chlorella* symbioses, algae are located within a perialgal vacuole derived from the host digestive vacuole (Kodama and Fujishima, 2010). In *Hydra*, *Chlorella* populate the perisymbiont space in digestive gastrodermal myoepithelial cells (Rands et al., 1992). In some sponges, specialized cells, termed "cyanocytes," harbor large aggregates of cyanobacteria; other sponge hosts harbor cyanobacteria in digestive vacuoles (Wilkinson, 1978). In non-phototrophic symbioses, e.g., *Trypanosoma* parasitisms, the parasite may briefly reside in acidic parasitophorous vacuoles (Lu et al., 1998; Chen et al., 2008). If the symbiosis under study involves *Symbiodinium*, the dinoflagellate symbiont is harbored within the symbiosome (e.g., Roth et al., 1988). It is possible that these differently named structures share common origins.

Hinde and Trautman (2002) argued for the primacy of the term symbiosome when describing membrane-bound symbionts living intracellularly. However, the symbiosome, if it is a distinct component of the cellular machinery, is a derived trait, and I contend that focusing on the symplesiomorphic traits of the endomembrane system of heterotrophic hosts is a better approach to understanding the shared evolutionary and ecological pressures phototrophs face as they invade heterotrophic host cells. We assume much about biochemical and physiological differences between symbiont-bearing and "normal" endomembrane structures when we erect terms for the former (e.g., "symbiosome"). A useful starting point is to accurately describe the endomembrane system (i.e., the habitat as seen by the invading symbiont) typically present in host cells (see e.g., Kodama and Fujishima, 2010). We stand to learn more about the nature of the association if we understand the endosomal compartments a symbiont targets, and whether the symbiont-bearing structure retains characteristics of the original endomembrane structure. For example, might phototrophs maintain residency within a host cell by mimicking digesting prey via the phagosomal compartments (as hypothesized in Hill and Hill, 2012)? What subtle differences in the chemical characteristics of a membrane can a symbiont modify to appear to the host cell like a particular cell constituent (e.g., late endosome or phagosome) to create habitat space that is stable, persistent, and safe? These guiding questions are not new, and were prominent in earlier work on intracellular symbioses (e.g., Muscatine and Lenhoff, 1963; Trench, 1971; Karakashian and Karakashian, 1973; Karakashian and Rudzinska, 1981; Reisser et al., 1982).

The endomembrane system existed before the symbiosis, and thus understanding "normal" cellular processes will yield major insights into the diverse phototrophic:heterotrophic symbioses that exist on the planet. Despite very similar research objectives and approaches, the different fields can operate in semi-separate circles; it is rare to find citations of the seminal work of, for example, Karakashian (Karakashian and Karakashian, 1973; Karakashian, 1975; Karakashian and Rudzinska, 1981) in publications focused on *Symbiodinium* symbioses, or Trench citations (Trench, 1971; Trench et al., 1981; Colley and Trench, 1983, 1985; Fitt and Trench, 1983; Trench, 1987) in *Chlorella*-based symbiotic research. Muscatine recognized the importance of taking advantage of the methodological tractability of one system (e.g., *Hydra*) to inform the other (e.g., *Symbiodinium*), and appreciated the major lessons that could be learned by paying attention to the findings from different systems (e.g., Hoegh-Guldberg et al., 2007). Davy et al. (2012) recently discussed the importance of comparative work done with different symbiont systems, and provided a review of the contributions of earlier biologists who combined insights from these different systems.

The endocytobiological structures important in intracellular symbioses likely share important biochemical features (i.e., important symplesiomorphies exist), and phototrophic symbionts (perhaps even some non-photosynthesizing parasites like *Trypanosoma* or *Plasmodium*) may co-opt cellular machinery using similar strategies as they invade eukaryotic cells (Schwarz, 2008). For example, Boulais et al. (2010) compared the proteomes of 39 taxa (from amoebas to mice), and identified an ancient core of phagosomal proteins primarily involved in phagotrophy and innate immunity. Looking for similarities between and among the cellular habitats occupied by different symbionts may offer important clues about universal processes that favor invasion of heterotrophic host cells. Recent work elucidating the detailed machinations of the phagosome at fine-scale levels of molecular and genetic resolution (Stuart et al., 2007; Stuart and Ezekowitz, 2008; Trost et al., 2009) highlights the opportunities to shed significant light on intracellular symbioses between phototrophs and heterotrophs.

Parasitophorous vacuoles, symbiosomes, and digestive vacuoles may share similar characteristics because they all target the normal endomembrane process of heterotrophic cells. Insights into diseases like malaria might come from a detailed comparison of the cellular processes operating in mutualisms involving algae. For example, the finding of Kuo et al. (2010) that the proteins GP2 and Niemann-Pick type C2 are upregulated in symbiont-containing *Aiptasia* is intriguing given the role of these genes in modulating immune responses and lysosomal cholesterol transport (Kuo et al., 2010; Werner et al., 2012, respectively). Chen et al. (2004) found that a Rab protein, which is normally a regulator of endocytotic recycling, is recruited to phagosomes containing heat-killed, but not live, *Symbiodinium* introduced to *Aiptasia* hosts. This points to specific molecular genetic pathways (especially the Rab pathway) that permit successful colonization of host habitats by *Symbiodinium* since the symbiont may arrest one of the endomembrane structures in the phagosome position. Similar processes operate for *Trypanosoma* and *Plasmodium* parasitisms (e.g., Batista et al., 2006; Seixas et al., 2012). Several more recent studies have employed transcriptomic and proteomic methods to provide detailed molecular genetic perspectives on the host:symbiont interface (e.g., Rodriguez-Lanetty et al., 2006a,b,c; Sunagawa et al., 2009; Voolstra et al., 2009; De Salvo et al., 2010; Peng et al., 2010; Ganot et al., 2011; Yuyama et al., 2011; Fransolet et al., 2012; Meyer and Weis, 2012). Adopting an explicitly comparative perspective that unites the findings from different symbioses may elucidate pathways common to intracellularity *sensu lato*.

#### **CONCLUSION**

The purpose of this perspective is to focus attention on significant trade-offs that exist for phototrophic symbionts residing in heterotrophic host cells. Constraints exist on investment strategies involving energy that is represented by fixed carbon produced through photosynthesis. The possible phenotypic responses to the trade-offs have significant evolutionary implications. To study these trade-offs, we must understand the cellular environment that the symbionts reside in because important symplesiomorphies likely exist among the various organisms that engage in this type of ecological interaction. One way to achieve success in this area is to increase the dialog that occurs among scientists working with different symbioses. By using names unique to specific hosts to describe endomembranous spaces that phototrophs live in, we may be missing important clues to how symbionts establish stable populations within a particular host. Finally, if we shift attention away from host "control" of the associations, and instead think about the role the symbiont might play in shaping the interactions, we may discover novel theoretical and empirical approaches that have broad explanatory power.

#### **ACKNOWLEDGMENTS**

I thank A. Hill and D. Thornhill for providing helpful comments and perspectives on an early draft of the manuscript. The two reviewers provided very useful suggestions for improving the manuscript. This work was supported by grants from the National Science Foundation (OCE-0647119 and DEB-0829763). I thank C. Davis for her work generating the electron micrographs of *Chlorella* infecting freshwater sponges shown in **Figure 1**.

#### **REFERENCES**


of *Symbiodinium* symbioses. *Biol. Rev.* 87, 804–821. doi: 10.1111/j.1469-185X.2012.00223.x


(Dordrecht: Kluwer Academic Publishers), 45–61.


Peng, S., Wang, Y., Wang, L., Chen, W. U., Lu, C., Fang, L., et al. (2010). Proteomic analysis of symbiosome membranes in cnidaria-dinoflagellate endosymbiosis. *Proteomics* 10, 1002–1016. doi: 10.1002/pmic.200900595

Pernice, M., Meibom, A., Van Den Heuvel, A., Kopp, C., Domart-Coulon, I., Hoegh-Guldberg, O., et al. (2012). A single-cell view of ammonium


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 April 2014; paper pending published: 13 May 2014; accepted: 25 June 2014; published online: 17 July 2014.*

*Citation: Hill MS (2014) Production possibility frontiers in phototroph:heterotroph symbioses: trade-offs in allocating fixed carbon pools and the challenges these alternatives present for understanding the acquisition of intracellular habitats. Front. Microbiol. 5:357. doi: 10.3389/fmicb.2014.00357*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Hill. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Hidden state prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses

## *Jesse R. R. Zaneveld\* and Rebecca L. V. Thurber*

Vega Thurber Laboratory, Department of Microbiology, Oregon State University, Corvallis, OR, USA

#### *Edited by:*

M. Pilar Francino, Center for Public Health Research, Spain

#### *Reviewed by:*

Amparo Latorre, University of Valencia, Spain Anna Carolin Frank, University of California Merced, USA

#### *\*Correspondence:*

Jesse R. R. Zaneveld, Vega Thurber Laboratory, Department of Microbiology, Oregon State University, 220 Nash Hall, Corvallis, OR 97330, USA e-mail: zaneveld@gmail.com

Complex symbioses between animal or plant hosts and their associated microbiotas can involve thousands of species and millions of genes. Because of the number of interacting partners, it is often impractical to study all organisms or genes in these host-microbe symbioses individually. Yet new phylogenetic predictive methods can use the wealth of accumulated data on diverse model organisms to make inferences into the properties of less well-studied species and gene families. Predictive functional profiling methods use evolutionary models based on the properties of studied relatives to put bounds on the likely characteristics of an organism or gene that has not yet been studied in detail. These techniques have been applied to predict diverse features of host-associated microbial communities ranging from the enzymatic function of uncharacterized genes to the gene content of uncultured microorganisms. We consider these phylogenetically informed predictive techniques from disparate fields as examples of a general class of algorithms for Hidden State Prediction (HSP), and argue that HSP methods have broad value in predicting organismal traits in a variety of contexts, including the study of complex host-microbe symbioses.

**Keywords: predictive metagenomics, "virtual" metagenomes, 16S rRNA gene copy number, phylogenetic prediction, systems biology, ecotoxicology**

## **BIOLOGICAL DIVERSITY OFTEN NECESSITATES TRAIT PREDICTION**

The immense scope of biological diversity limits detailed scientific study to a relatively small number of well-characterized model organisms. Because the technical and analytical capabilities needed to catalog the vast number of diverse organisms are limited, important scientific and regulatory decisions must often be made by applying information from well-studied models onto less well-understood organisms (Fagan et al., 2013; Guénard et al., 2013) or genes (Eisen, 1998; Engelhardt et al., 2005).

In some cases, extrapolation of properties across diverse organisms is needed because direct testing on the organism of interest would be unethical, illegal, and/or infeasible. For example, in the realm of ecotoxicology, direct toxicology tests on suitably large cohorts of endangered or threatened species (e.g., spotted owls or marine mammals) may better predict the lethal concentration of a toxicant than tests on other model species, but would be legally problematic and potentially counterproductive from the standpoint of conservation. Instead, data from experiments on model species must generally be extrapolated to predict impacts in rare or hard to access relatives (Guénard et al., 2011).

These problems of vast diversity and limited ability to study all species are particularly apparent in the context of complex symbiotic assemblages like those between metazoans and their associated microbial communities. The human gut lumen has been estimated to contain ∼1000 prevalent microbial species and ∼3.3 million genes based on surveys of European populations (Qin et al., 2010). The total microbial cell count on humans (∼10<sup>14</sup> cells) is even estimated to exceed that of human cells (∼10<sup>13</sup>) by roughly an order of magnitude (Savage, 1977). The connections between the human microbiota and a wide range of variables including diet, autoimmune disease, obesity, and cancer are being actively explored (see Lozupone et al., 2012 for a recent review). Although the human microbiome is a heavily studied system, the diversity of its constituents presents an important challenge to gaining an ecosystem-level understanding of the contribution of each member to the dynamics of the overall system.

This immensity of microbial diversity presents an even larger challenge when considering less well-studied microbial communities. The Greengenes database (McDonald et al., 2012) of microbial 16S ribosomal RNA gene sequences, a popular phylogenetic marker for bacteria and archaea, contains 99,322 microbial operational taxonomic units (OTUs) at a 97% sequence similarity threshold in the present version (13\_8). Even if representatives of 100 OTUs per day could be cultured and assayed for a particular trait (an effort that would require extensive resources and automated methods for high-throughput phenotyping), it would take ∼2.7 years to test this trait across known OTUs (99,322 OTUs at 100 per day is roughly 993 days). Unfortunately, this is slower than discovery of new OTUs, which tripled between 2012 and 2013. Thus even an impressive brute-force effort to study microbial phenotypes by treating each OTU in isolation would actually lose ground to the influx of newly uncovered microbial diversity. Therefore, many conclusions about microorganisms found in the environment will rely on the properties of better-studied model microorganisms for some time, especially if new microbial diversity continues to be uncovered at high rates.

#### **METHODS FOR PHYLOGENETIC PREDICTION**

In this review, we discuss the utility of phylogenetic models for predicting features of understudied organisms, and focus in particular on a group of methods for predicting unknown traits from a phylogeny that we term Hidden State Prediction (HSP) algorithms. These methods have recently been applied to a variety of interesting problems in the study of host-microbe symbioses and microbial ecology.

We define HSP algorithms as phylogenetic methods for predicting unknown character states or character values (i.e., traits) based on a collection of known character states and a phylogenetic tree (**Figure 1**). Thus, HSP is similar to ancestral state reconstruction (ASR) techniques, in which the properties of ancestral organisms are inferred based on traits of their living descendants. However, HSP methods differ from ASR methods in that the properties of modern rather than ancestral organisms are predicted. That is, these methods predict character states for the tips of a phylogeny rather than for its internal nodes. These methods are also closely related to phylogenetic comparative methods (PCMs), which also examine the predictability of character values given a phylogeny, but do so in order to *remove* phylogenetic signal when comparing traits of interest (Harvey and Pagel, 1991; Garland and Ives, 2000). Although the two methods are closely related, we use HSP to distinguish phylogenetic prediction *per se* from methods where a phylogenetically corrected comparison of *measured* trait values is the end goal. Finally, we should be clear that while HSP methods are also sometimes called "phylogenetic predictive methods," they are distinct from standard phylogenetic inference: HSP methods use an inferred phylogeny to predict traits at the tips of the tree rather than *vice versa.*

#### **HIDDEN STATE PREDICTION ALGORITHMS**

Consider the case of predicting the copy number of a gene across many organisms, of which only a portion have been characterized (**Figures 1A,B**). HSP methods start with a set of reference annotations (the gene copy numbers) and a phylogenetic tree relating the entities that were annotated (here the organisms carrying the genes). These reference annotations are mapped onto the corresponding tips of the tree. (Although this step is conceptually simple, it can actually be surprisingly involved when reference databases and the reference phylogeny use different conventions and naming schemes.)
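The name-reconciliation step mentioned above can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the function names, the normalization rule (case-folding and treating underscores and hyphens as spaces), and the example labels and copy numbers. Real reference databases usually need more careful matching, for example through stable accession identifiers.

```python
def normalize(name):
    """Reduce a label to a comparable key (illustrative rule only)."""
    return name.strip().lower().replace("_", " ").replace("-", " ")


def map_annotations_to_tips(tip_names, annotations):
    """Attach reference annotations to tree tips despite naming differences.

    Returns a dict {tip: value or None} and a list of annotation names
    that could not be matched to any tip. None marks a hidden state.
    """
    by_key = {normalize(tip): tip for tip in tip_names}
    mapped = {tip: None for tip in tip_names}
    unmatched = []
    for name, value in annotations.items():
        tip = by_key.get(normalize(name))
        if tip is None:
            unmatched.append(name)
        else:
            mapped[tip] = value
    return mapped, unmatched


# Hypothetical tip labels and gene copy numbers.
tips = ["Escherichia_coli", "Bacillus_subtilis", "Vibrio_cholerae"]
copy_numbers = {"escherichia coli": 7, "Bacillus subtilis": 10, "Unknown sp.": 3}
mapped, unmatched = map_annotations_to_tips(tips, copy_numbers)
```

Tips left as `None` are the hidden states to be predicted, and the unmatched list flags reference records that need manual curation before they can inform the prediction.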

**FIGURE 1 | Hidden State Prediction (HSP). (A)** Evolution of a simulated trait following a Brownian motion model. For example, the copy number of a gene family in each of several microbial genomes can be mapped onto a phylogenetic tree and represented as a continuous trait. (The same method could be used on any continuous evolutionary character.) Here, a trait starting with a value of 4 evolves by a Brownian motion process within a group of organisms A–F. Blue values above each edge of the phylogeny indicate regions of the phylogeny where the trait takes on a value greater than 4 (gain with respect to the ancestor of A–F). Orange values below the edges indicate trait values lower than 4 (loss relative to the ancestor). Numbers by the tips of the tree show the final value of the trait rounded to the nearest integer, as when the trait is taken to represent the copy number for a particular gene. **(B)** Observed Data. In general only a portion of all modern organisms are sampled. In this example trait values have been measured for tips A, C, and E but are unknown for tips B, D, and F. The tips with unknown trait values differ in their proximity to characterized relatives. Tip D is only distantly related to tips with known values, whereas tip B is closely related to tips A and C, for which trait values are known. Thus for B the closest known tip is within 0.12 units of branch length, whereas for D the closest tip is 0.63 units of branch length away. The task of HSP is to estimate trait values for B, D, and F from the values for A, C, and E. Examples of tips for which trait prediction will be more or less accurate are shaded with blue or orange boxes, respectively. This task will be simplest in cases like B in which several close relatives have been assayed and hardest in cases like D where long branches separate unknown tips from known references. **(C)** Ancestral State Reconstruction (ASR). The unknown tips are dropped from the tree (most phylogeny programs cannot handle missing character values) and ancestral character values are calculated for the remaining internal nodes. (An alternative method discussed in the text is to repeatedly reroot the tree at each node of interest, here B, D, and F, and perform standard ASR; Garland and Ives, 2000.) **(D)** Prediction of character values. If prediction via tree rerooting is not used, the inferred ancestral states and evolutionary model must be extended to the tips using another method. For example, the predictive functional profiling software package PICRUSt (which predicts metagenomic counts from marker gene data; see main text) uses exponential weighting by branch length to extend reconstructed states to the tips, and inflates the variance of the reconstructed ancestral state to account for evolution between the ancestor and the tip of interest (Langille et al., 2013). In this example, Tip B, with close references A and C, is assigned correctly. Tips D and F, where such references are either missing (tip D) or available only in a sister group (but not a closely related outgroup; F) are assigned less accurately (both off by two copies). However, D is correctly inferred to have more copies than F. Note that this example is intended to illustrate compactly the algorithm and some examples of success or failure, and should not be taken to represent the average accuracy of these methods, which have been studied in some depth (see Factors Influencing the Accuracy of Hidden State Prediction Algorithms for a summary of major findings).

When the model of evolution is reversible over time (e.g., in a Brownian motion model), it is possible to make phylogenetic predictions for hidden states directly using ASR of a rerooted version of the phylogeny (Garland and Ives, 2000). Because the direction in which time moves is not important in such models, the problem of prediction can be transformed into a standard ASR by rerooting the tree on the parent edge for the node to be predicted, and resolved using any standard ASR method. Fast reconstructions using phylogenetic independent contrasts or generalized least squares (which have been shown to be equivalent for the Brownian motion model specifically) are a popular choice (Grafen, 1989; Martins and Hansen, 1997; Felsenstein, 2004). However, due to technical limitations of most phylogenetic software, it may be necessary to prune all tip nodes with unknown character values prior to prediction. The combined rerooting and pruning operations may incur significant computational costs on extremely large reference phylogenies common in microbial studies (e.g., 10s of 1000s of tips). Nonetheless, this method has been used by several recent phylogenetic prediction studies on large microbial trees (Kembel et al., 2012; Angly et al., 2014).
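The rerooting idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by any of the cited tools: the tree is stored as an undirected adjacency dictionary with strictly positive branch lengths (so "rerooting" amounts to starting the traversal at the tip to be predicted), the model is Brownian motion, and the function name `bm_estimate` and the toy tree are hypothetical. Subtrees without annotated tips contribute nothing, which has the same effect as pruning them.

```python
def bm_estimate(tree, values, node, parent=None):
    """Estimate the trait at `node` from tips reachable without crossing `parent`.

    Returns (estimate, variance in units of sigma^2), or None when no
    annotated tip is reachable. The starting node (parent=None) is always
    treated as hidden, even if it has a known value.
    """
    if parent is not None and node in values:
        return values[node], 0.0
    weighted_sum = 0.0
    precision = 0.0
    for neighbor, length in tree[node].items():
        if neighbor == parent:
            continue
        result = bm_estimate(tree, values, neighbor, node)
        if result is None:
            continue  # unannotated subtree: same effect as pruning it
        estimate, variance = result
        weight = 1.0 / (variance + length)  # precision after adding the branch
        weighted_sum += weight * estimate
        precision += weight
    if precision == 0.0:
        return None
    return weighted_sum / precision, 1.0 / precision


# Hypothetical unrooted tree: tips A, B on node n1; tips C, D on node n2.
tree = {
    "A": {"n1": 0.1},
    "B": {"n1": 0.1},
    "C": {"n2": 0.3},
    "D": {"n2": 0.3},
    "n1": {"A": 0.1, "B": 0.1, "n2": 0.2},
    "n2": {"C": 0.3, "D": 0.3, "n1": 0.2},
}
known = {"A": 4.0, "C": 2.0}  # B and D are hidden

b_estimate, b_variance = bm_estimate(tree, known, "B")
d_estimate, d_variance = bm_estimate(tree, known, "D")
```

In this toy case the prediction for B is pulled toward its close relative A and carries a smaller variance than the prediction for D, which sits on longer branches. The recursion is kept for readability; a tree with tens of thousands of tips would need an iterative traversal.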

Alternatively, a standard ASR can be performed on the pruned tree (**Figure 1C**), then mapped back to the full tree, and the inferred ancestral states and evolutionary model used to predict character values at nodes removed during pruning and the tips of the tree (**Figure 1D**). Under maximum parsimony (Fitch, 1971), which seeks to minimize character state changes over the tree, the most parsimonious hidden state is simply the trait value reconstructed for the last common ancestor of a tip and its closest annotated relative. As an example of this approach, Eisen (1998) suggested the use of a parsimony-based HSP algorithm to predict the gene function of unassigned orthologs within gene families. In a maximum likelihood framework the most likely prediction for the hidden state is the one that maximizes the likelihood of the observed character data given the phylogeny and a particular model of evolution (alternative models may be tested using Akaike information criterion/Bayesian information criterion approaches). For symmetrical models of character evolution (such as the Brownian motion model), this criterion implies that the ML estimate of a hidden state at the tip of the tree will be the same as the ML estimate of the ancestral state for the last common ancestor of the individual in question and an annotated relative. The variance in this trait will be inflated by the product of the variance parameter describing the Brownian motion process (σ<sup>2</sup>) and the branch length to account for evolution along the branch from the last common ancestor to the tip in question. If the model of evolution is asymmetrical, then the maximum likelihood estimate for a tip may differ from its last common ancestor with an annotated tip.
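The variance bookkeeping for the Brownian motion case can be written compactly. The notation below is introduced here for illustration and is not the authors': $\hat{x}_{\mathrm{anc}}$ is the reconstructed state of the last common ancestor, $\hat{x}_{\mathrm{tip}}$ is the predicted hidden state, and $t$ is the branch length from that ancestor to the tip.

$$
\hat{x}_{\mathrm{tip}} = \hat{x}_{\mathrm{anc}}, \qquad \operatorname{Var}(\hat{x}_{\mathrm{tip}}) = \operatorname{Var}(\hat{x}_{\mathrm{anc}}) + \sigma^{2} t
$$

As a worked example with made-up numbers, if $\operatorname{Var}(\hat{x}_{\mathrm{anc}}) = 0.5$, $\sigma^{2} = 2$, and $t = 0.1$, then $\operatorname{Var}(\hat{x}_{\mathrm{tip}}) = 0.5 + 2 \times 0.1 = 0.7$. Assuming approximately normal errors, a 95% interval would be $\hat{x}_{\mathrm{tip}} \pm 1.96\sqrt{0.7} \approx \hat{x}_{\mathrm{tip}} \pm 1.64$, so the point estimate is unchanged while the interval widens with branch length.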

Because HSP methods predict features of modern organisms, the accuracy of these algorithms can be readily tested by cross-validation. When a large number of directly observed character values are known, the accuracy of an HSP method can be assessed by limiting program input to a subset of observed character values, and then testing the ability of the method to predict the rest. (The key conclusions from several such cross-validation studies are discussed in section "Factors Influencing the Accuracy of Hidden State Prediction Algorithms," below.)
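A leave-one-out version of this test can be sketched as follows. To keep the example self-contained, the predictor is a deliberately simple stand-in (copy the value of the nearest annotated relative) rather than any of the HSP methods discussed in the text, and the pairwise distances, trait values, and function names are all hypothetical.

```python
def nearest_relative_predictor(distances):
    """Build a stand-in predictor that copies the closest annotated tip's value."""
    def predict(known, target):
        closest = min(known, key=lambda tip: distances[frozenset((tip, target))])
        return known[closest]
    return predict


def leave_one_out(values, predict):
    """Hide each observed value in turn, predict it from the remaining
    values, and return the mean absolute error."""
    errors = []
    for tip, truth in values.items():
        known = {t: v for t, v in values.items() if t != tip}
        errors.append(abs(predict(known, tip) - truth))
    return sum(errors) / len(errors)


# Hypothetical patristic distances and observed copy numbers.
distances = {
    frozenset(("A", "B")): 0.1, frozenset(("A", "C")): 0.6,
    frozenset(("A", "D")): 0.7, frozenset(("B", "C")): 0.6,
    frozenset(("B", "D")): 0.7, frozenset(("C", "D")): 0.2,
}
observed = {"A": 4, "B": 5, "C": 2, "D": 2}
mae = leave_one_out(observed, nearest_relative_predictor(distances))
```

Because `leave_one_out` only requires a callable taking the known values and a target tip, any prediction method can be substituted and compared on the same held-out data.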

#### *Related prediction methods*

Several related approaches bear mentioning that also aim to extend information from characterized organisms to uncharacterized relatives. Phylogenetic eigenvector maps (PEMs) translate a phylogenetic tree into a matrix of similarities, and then decompose these similarities into orthogonal eigenvectors (Guénard et al., 2013). Some or all of these eigenvectors are then used as predictor variables in statistical analysis. This approach is similar to performing principal coordinates analysis (PCoA) on the similarity matrix of organisms, and then using some or all of the resulting PC axes as variables for statistical analysis. This method is implemented in the MPSEM R package, and has been used to predict the sensitivity of diverse animals to environmental toxicants (Guénard et al., 2011). Other approaches average gene counts across taxa or close phylogenetic relatives to estimate trait values. For example, Okuda et al. (2012) used all neighboring taxa within an empirically defined phylogenetic distance (0.10 16S rRNA subst./site was recommended) to predict metagenome contents from DGGE bands.
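The neighbor-averaging idea can be sketched as below. This is an illustration in the spirit of the approach described above, not the implementation of Okuda et al. (2012): the function name, data layout, gene family labels, and all numbers are hypothetical, and only the 0.10 distance threshold is taken from the text.

```python
def neighbor_average(target, distances, gene_counts, max_distance=0.10):
    """Average each gene family's count over reference taxa that lie within
    `max_distance` of `target`. Returns None if no reference is close enough."""
    neighbors = [ref for ref in gene_counts
                 if distances[(target, ref)] <= max_distance]
    if not neighbors:
        return None
    families = gene_counts[neighbors[0]].keys()
    return {family: sum(gene_counts[ref][family] for ref in neighbors) / len(neighbors)
            for family in families}


# Hypothetical 16S distances (substitutions/site) from query X to three references.
distances = {("X", "ref1"): 0.03, ("X", "ref2"): 0.08, ("X", "ref3"): 0.25}
gene_counts = {
    "ref1": {"famA": 2, "famB": 0},
    "ref2": {"famA": 4, "famB": 1},
    "ref3": {"famA": 0, "famB": 5},
}
predicted = neighbor_average("X", distances, gene_counts)
```

Here `ref3` is excluded because it lies beyond the threshold. Unlike the tree-based methods, all included neighbors are weighted equally regardless of how close they are.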

Finally, taxonomic binning approaches do not use a phylogenetic tree, but instead average values within taxonomic units in order to estimate the chances that uncharacterized taxa share that trait. For example, PanFP<sup>1</sup> (Jun et al., Unpublished) seeks to normalize 16S rRNA copy numbers and predict microbial metagenomes from 16S rRNA data using this method. Careful comparison of the performance of HSP and each of these alternative methods in a variety of scenarios will be a valuable tool in guiding the development of methods for trait prediction.
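For contrast with the tree-based methods, a taxonomic binning estimate can be sketched as a per-taxon mean. This is a generic illustration and makes no claim about how PanFP is implemented: the function name, the taxon labels, and the trait values are hypothetical.

```python
from collections import defaultdict


def taxon_means(reference_traits, taxonomy):
    """Return the mean trait value per taxon across annotated references."""
    totals = defaultdict(lambda: [0.0, 0])  # taxon -> [sum, count]
    for organism, value in reference_traits.items():
        bucket = totals[taxonomy[organism]]
        bucket[0] += value
        bucket[1] += 1
    return {taxon: total / count for taxon, (total, count) in totals.items()}


# Hypothetical 16S copy numbers for three references and one unannotated query.
reference_traits = {"ref1": 7, "ref2": 5, "ref3": 1}
taxonomy = {"ref1": "GenusA", "ref2": "GenusA", "ref3": "GenusB", "query": "GenusA"}

means = taxon_means(reference_traits, taxonomy)
predicted = means.get(taxonomy["query"])  # None if the taxon has no references
```

The prediction depends only on the taxonomic label at the chosen rank, so two queries in the same genus receive the same estimate whatever their branch lengths to the references.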

#### **APPLICATION OF HIDDEN STATE PREDICTION TO UNDERSTAND COMPLEX SYMBIOSES**

In the remainder of the paper we will discuss applications of HSP in the study of microbial symbioses, which range from correcting long-understood biases in 16S rRNA surveys to approximate predictions of the content of microbial genomes and metagenomes from amplicon data.

<sup>1</sup>https://github.com/srjun/PanFP

By allowing more accurate estimation of the composition of microbial communities (see Quality Control of Marker Gene Surveys through Copy Number Normalization), and their functional capabilities (see Phylogenetic Prediction of Microbial Genomes and Metagenomes), HSP methods are being used to study the interactions of complex communities of microorganisms with hosts and one another.

#### **QUALITY CONTROL OF MARKER GENE SURVEYS THROUGH COPY NUMBER NORMALIZATION**

Recently HSP methods have been used to address a long-standing problem in microbial ecology. Enormous progress has been made in exploring complex microbial communities through sequencing of phylogenetically informative marker genes. The 16S rRNA gene is the most widely used marker gene for studies of bacteria and archaea. However, bacteria and archaea vary in 16S rRNA gene copy number from a mode of 1 copy (Angly et al., 2014) to 15 copies in *Photobacterium profundum*. Therefore the relative abundance for certain species and broader taxa inferred from qPCR or 16S rRNA sequencing can be inflated. Bias due to 16S rRNA gene copy number is expected to affect some datasets more than others, depending on the magnitude of differences in 16S rRNA genomic copy numbers for the most abundant organisms.

Recently, several publications have described the use of HSP to correct bias due to variation in 16S rRNA copy number in microbial datasets. Kembel et al. (2012) introduced a method for predicting 16S rRNA copy numbers for uncultured microorganisms, and used the prediction to normalize 16S rRNA marker gene surveys. This method inserts reference taxa into a phylogenetic tree for a particular community, and then uses HSP via rerooting and ASR to estimate missing 16S rRNA copy numbers.

PICRUSt<sup>2</sup>, a newly developed program for prediction of microbial genomes and metagenomes from 16S rRNA data (discussed below), also corrects 16S copy number using HSP and either PIC, ML, or parsimony reconstructions (Langille et al., 2013), but precalculates results on the Greengenes tree rather than using tree-insertion on user datasets.

CopyRighter (Angly et al., 2014) is a third method for estimating and correcting 16S rRNA copy number which, like Kembel et al. (2012), uses the method of phylogenetic contrasts and rerooting to predict hidden states (Garland and Ives, 2000), but, like PICRUSt, pre-calculates predictions for each OTU.

These and related methods will likely prove to be a common quality-control step in 16S rRNA-based microbial ecology pipelines.
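The effect of copy number correction on community composition can be seen in a minimal sketch. The function name, OTU labels, read counts, and copy numbers are hypothetical; in practice the copy numbers would come from an HSP prediction rather than being supplied by hand.

```python
def normalize_by_copy_number(read_counts, copy_numbers):
    """Divide each OTU's read count by its 16S rRNA gene copy number, then
    rescale so that the corrected relative abundances sum to 1."""
    corrected = {otu: read_counts[otu] / copy_numbers[otu] for otu in read_counts}
    total = sum(corrected.values())
    return {otu: value / total for otu, value in corrected.items()}


# Two OTUs with equal read counts but very different (hypothetical) copy numbers.
reads = {"otu1": 100, "otu2": 100}
copies = {"otu1": 1, "otu2": 10}
abundance = normalize_by_copy_number(reads, copies)
```

Although both OTUs contribute half of the reads, the corrected profile assigns about 91% of organisms to `otu1`, which illustrates how uncorrected surveys can overstate taxa with many 16S copies.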

#### **PHYLOGENETIC PREDICTION OF MICROBIAL GENOMES AND METAGENOMES**

The HSP methods used to predict 16S rRNA gene copy number for normalization purposes have also been extended across all genes in bacterial genomes to predict the genome contents of uncultured bacteria and archaea.

By combining gene predictions for each OTU with 16S rRNA copy number normalization it is possible to estimate the metagenome contents of entire microbial communities. This is useful because although metagenomic data can be collected directly using shotgun sequencing, this is presently quite expensive relative to surveys of a particular amplicon. (Metagenomic sequencing is so expensive relative to amplicon sequencing in part because sequencing depth must be sufficient to cover both genes and taxa, rather than just taxa.)

The PICRUSt software package uses HSP to estimate the copy number of each gene family across all OTUs in a reference phylogeny (by default the reference Greengenes 16S rRNA phylogeny). PICRUSt can use several different ASR methods at the user's discretion including Wagner parsimony, Maximum Likelihood or phylogenetic independent contrasts. 16S rRNA copy numbers for each OTU are also estimated using HSP.

The product of these initial steps is a table of predicted gene and 16S rRNA copy numbers for each microorganism in the reference tree, including the many OTUs for which no genome sequence data is available. 95% confidence intervals for these gene copy numbers can also be constructed, based on the model of evolution for each gene. The resulting estimates of gene family and 16S rRNA copy number in each of the OTUs on the Greengenes tree can then be combined to predict "virtual" metagenomes from 16S rRNA data.

To do so, the observed count of 16S rRNA sequences in each OTU from a 16S rRNA amplicon library is simply divided by the predicted 16S copy number. As described above, this step produces an estimate for the relative abundance of each microbial OTU. The normalized counts for each OTU are then multiplied by the vector of gene abundances to produce an estimate of the count of each gene family in the metagenome.
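The two steps just described (divide by 16S copy number, then multiply by per-OTU gene counts) can be sketched as follows. This is a simplified illustration of the arithmetic only, not PICRUSt code: the function name, table layout, gene family labels, and numbers are hypothetical.

```python
def predict_metagenome(read_counts, copy_numbers, gene_table):
    """Estimate gene family counts in a community from 16S read counts.

    read_counts:  {otu: observed 16S reads}
    copy_numbers: {otu: predicted 16S copies per genome}
    gene_table:   {otu: {gene_family: predicted copies per genome}}
    """
    totals = {}
    for otu, reads in read_counts.items():
        organisms = reads / copy_numbers[otu]  # copy-number-normalized abundance
        for family, copies in gene_table[otu].items():
            totals[family] = totals.get(family, 0.0) + organisms * copies
    return totals


# Hypothetical amplicon library with two OTUs.
reads = {"otu1": 60, "otu2": 40}
copies = {"otu1": 2, "otu2": 4}
gene_table = {
    "otu1": {"famA": 1, "famB": 0},
    "otu2": {"famA": 2, "famB": 3},
}
metagenome = predict_metagenome(reads, copies, gene_table)
```

With these inputs the normalized abundances are 30 and 10 organisms, giving 50 copies of `famA` and 30 of `famB`. Uncertainty in the predicted copy numbers is ignored here and would need to be propagated separately.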

Estimation of metagenome contents using HSP has already been applied to several studies of host-microbial symbiosis. McHardy et al. (2013) used PICRUSt to compare microbial diversity, predicted gene function, and observed metabolomic profiles in the human gut, and found PICRUSt predictions largely concordant with bulk metabolite profiles obtained by mass spectrometry. Rooks et al. (2014) applied PICRUSt to study gene families enriched in the gut microbiome in colitis, and found significant enrichment of several gene families previously implicated in the literature. Davenport et al. (2014) examined the role of diet on the gut microbiota in Hutterite populations. Summer diets containing more fruits and vegetables corresponded to higher levels of genes in the Glycan biosynthesis and degradation KEGG category. This increase was attributable to an increased relative abundance of Bacteroidetes, consistent with previous dietary studies showing tradeoffs between Firmicutes and Bacteroidetes in obese adults (Ley et al., 2005) and children (Bervoets et al., 2013). Other studies have investigated functional shifts in the salivary microbiome following probiotic administration (Dassi et al., 2014), differences in the pulmonary microbiome of HIV patients in San Francisco vs. Uganda (Iwai et al., 2014), and interplay between human mutation, Crohn's disease, and the functional repertoire of the gut microbiota (Tong et al., 2014). Finally, HSP methods are also being used to study non-model systems (Loudon et al., 2013; Polónia et al., 2014). For example, significant differences in several gene categories, including tetracycline production, were inferred for two sponge microbiotas vs. nearby seawater or sediment (Polónia et al., 2014).

<sup>2</sup>http://picrust.github.io/picrust/

These examples illustrate how HSP methods are being used in studies that seek to unravel the complex interactions between host factors (genetics, immune function, diet), harmful or helpful microbial symbionts, and downstream functional consequences in diverse organisms.

#### **FACTORS INFLUENCING THE ACCURACY OF HIDDEN STATE PREDICTION ALGORITHMS**

Hidden State Prediction algorithms will function best when traits exhibit strong, positive phylogenetic autocorrelation (**Figure 2**). Such correlations have been shown for many morphological traits of ecological interest (Freckleton et al., 2002). Comparative genomic studies have also shown phylogenetic autocorrelation for the content of microbial genomes by linking organismal phylogenetic divergence to both the collection of genes in bacterial genomes (Konstantinidis and Tiedje, 2005; Chaffron et al., 2010; Zaneveld et al., 2010; Okuda et al., 2012) and gene order (Tamames, 2001). Although convergent evolution of gene content within habitats (Chaffron et al., 2010; Zaneveld et al., 2010) and local negative phylogenetic autocorrelation have also been described (Zaneveld et al., 2010), these effects are generally of smaller magnitude. Correlation between organismal and functional diversity was also observed by the Human Microbiome Project (HMP), which found richness of gene function to be correlated with the taxonomic richness of microbial consortia (Consortium, 2012).
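One common way to quantify phylogenetic autocorrelation is Moran's I with weights that decrease with phylogenetic distance. The sketch below assumes inverse-distance weights, which is only one of several possible weighting schemes and is not prescribed by the text; the function name, distances, and trait values are hypothetical. Values above zero indicate that close relatives tend to have similar trait values.

```python
def morans_i(values, distances):
    """Moran's I for a trait across tips, with weight 1/distance per pair."""
    tips = list(values)
    n = len(tips)
    mean = sum(values.values()) / n
    deviation = {tip: values[tip] - mean for tip in tips}
    cross_products = 0.0
    weight_sum = 0.0
    for i in tips:
        for j in tips:
            if i == j:
                continue
            weight = 1.0 / distances[frozenset((i, j))]
            cross_products += weight * deviation[i] * deviation[j]
            weight_sum += weight
    sum_of_squares = sum(d * d for d in deviation.values())
    return (n / weight_sum) * (cross_products / sum_of_squares)


# Hypothetical example: two pairs of close relatives with similar trait values.
trait = {"A": 1.0, "B": 1.0, "C": 5.0, "D": 5.0}
distances = {
    frozenset(("A", "B")): 0.1, frozenset(("C", "D")): 0.1,
    frozenset(("A", "C")): 1.0, frozenset(("A", "D")): 1.0,
    frozenset(("B", "C")): 1.0, frozenset(("B", "D")): 1.0,
}
autocorrelation = morans_i(trait, distances)
```

A significance test, for instance by permuting trait values across tips, would be needed before drawing conclusions from a single value of the statistic.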

Because HSP methods predict features of modern organisms, they are readily testable by cross-validation. Tests have included comparison of sequenced vs. predicted genome contents for known genomes; prediction accuracy for synthetic metagenomes constructed *in silico* from sequenced genomes; cross-validation of annotated 16S rRNA copy numbers; and validations on cell and DNA-based mock communities of known composition. Both HSP and related methods (Okuda et al., 2012) have generally reported high accuracy (Kembel et al., 2012; Langille et al., 2013; Angly et al., 2014) with certain important exceptions summarized below.

Specific features that have been shown to compromise the accuracy of HSP methods in particular cases include: (a) low availability of reference data for phylogenetically diverse organisms (Okuda et al., 2012; Langille et al., 2013), and (b) lineages that follow an evolutionary process that differs strongly from the evolutionary model used in inference (especially genome reduction in intracellular endosymbionts; Zaneveld et al., 2010; Langille et al., 2013). Other factors that have a more modest (though still statistically significant) effect on accuracy include: (a) differences in classes of gene function thought to correspond to rates of lateral transfer (Langille et al., 2013), (b) local error in the phylogeny (Stone, 2011; Langille et al., 2013), (c) the choice of ASR method (Langille et al., 2013), and (d) substitution of detailed taxonomic trees (with unit length branches) instead of a phylogeny (Angly et al., 2014). Finally, because HSP methods rely on the structure of the phylogenetic tree rather than taxonomy at a particular rank, they are robust to taxonomic labels that in some cases may not adequately reflect ecological strategy (Philippot et al., 2010).

## **CONCLUSION**

For many traits, phylogeny provides a useful framework for summarizing knowledge gained by studying model taxa. While methods that use phylogeny to predict traits have been available for decades, it is only relatively recently that these methods have been applied at high-throughput to summarize our understanding of key players in microbial symbioses. Several exciting directions are likely to both further improve the accuracy of HSP algorithms in the domain of microbial trait prediction, and open new avenues of research.

The strongest single factor limiting the accuracy of predictions made by HSP is the availability of phylogenetically diverse reference data. In the case of using HSP to predict genome features, the relevant reference data are genome sequences. Genome sequences are used to calculate counts of gene families across microorganisms, which are then used as evolutionary characters in the algorithm. However, the vast majority of genome sequences are incomplete, and therefore cannot be used with existing HSP techniques. For example, as of this writing the PATRIC resource hosts 13,091 partial bacterial genomes vs. 2,544 complete genomes (Wattam et al., 2014). Lack of complete sequencing introduces uncertainty into the counts of gene families in that organism, and thereby complicates use of these sequences as input data for HSP. Statistical tests are needed to determine whether read depth is sufficient to conclude that absence of evidence for a particular gene family in the partial genome sequence represents genuine absence vs. missing data. Further extension of these methods to single cell genomic data (most often incomplete) could potentially allow incorporation of information from many uncultured and understudied phyla ("microbial dark matter"; Rinke et al., 2013). Algorithmic improvements that allow incorporation of information on gene content from partial genome sequences will be an important direction for future HSP algorithms in microbial ecology. For example, Bayesian HSP methods might integrate over distributions of possible copy numbers in partial genomes (derived from analysis of read depth). This is similar to existing strategies for incorporating uncertainty in the parameters of the evolutionary model or the topology of the phylogenetic tree. Such methods would allow incorporation of much more comprehensive input datasets, and thus will likely represent an important advance in the accuracy of predictive functional profiling with HSP.

In systems where many sequenced genomes are available, HSP might be used to extend metabolic predictions of species interactions (Levy and Borenstein, 2013) where sequence information is lacking. This in turn may help to identify cases in which co-occurrence patterns (i.e., correlated abundance across samples) between two microorganisms may be driven by syntrophic mutualism. Recent advances suggest that this approach will be fruitful. Functional profiles imputed using HSP have been correlated with metabolomics profiles (McHardy et al., 2013), and metabolic modeling applied to test ideas about the processes driving co-occurrence patterns in the human gut microbiome (Levy and Borenstein, 2013). A key test of the utility of HSP methods for this application will be a comparison of the accuracy of metabolic networks built from sequenced genomes vs. the HSP prediction for the genes in that genome. If the loss in accuracy is modest, then HSP could provide a rough outline of potentially interesting metabolic interactions at the level of entire microbial communities that could then be targeted for experimental confirmation.

organism. This phenomenon is known as positive phylogenetic autocorrelation (top). Phylogenetic autocorrelation has been observed in a wide variety of traits (∼60% of 103 traits examined in Freckleton et al., 2002). For example, closely related mammals have more similar body sizes than distantly related mammals (Gittleman et al., 1996). Phylogenetic autocorrelations are often positive (top left), but may in some cases be negative (top right). For example, closely related organisms inhabiting the same niche may diversify traits to escape competitive exclusion and exploit new resources. This may produce negative phylogenetic autocorrelation in closely related, cohabiting species. In contrast, trait correlation occurs when traits are linked to the evolution of other traits. For example, in RNAs where secondary structure is important, some nucleotide positions must

a discrete trait, many of these traits will be correlated with one another. Continuous traits may also be correlated with one another (bottom left). Finally, some traits may have undergone convergent evolution (bottom right). Examples of convergence are plentiful, ranging from the similar morphology of cacti and euphorbia, which have independently adapted to arid climates, to skull morphology in diverse herbivorous vs. carnivorous lizards (Stayton, 2006). Existing HSP methods work best when traits exhibit strong positive phylogenetic autocorrelation. Statistical methods that account for observed convergent evolution of gene content within habitats (Chaffron et al., 2010; Zaneveld et al., 2010) and negative phylogenetic autocorrelation among co-occurring strains of the same OTU (Zaneveld et al., 2010) remain an important topic for future development.

## **ACKNOWLEDGMENTS**

This work was supported by NSF OCE #1130786 to Rebecca L. V. Thurber. The authors would like to thank Emily Monosson for conversations on evolutionary ecotoxicology, and both Se-Ran Jun and Mike Robeson for discussions of the PanFP software package.

#### **REFERENCES**

Angly, F. E., Dennis, P. G., Skarshewski, A., Vanwonterghem, I., Hugenholtz, P., and Tyson, G. W. (2014). Copyrighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction. *Microbiome* 2, 11. doi: 10.1186/2049-2618-2-11 [Epub ahead of print].


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 May 2014; accepted: 31 July 2014; published online: 25 August 2014. Citation: Zaneveld JRR and Thurber RLV (2014) Hidden state prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses. Front. Microbiol. 5:431. doi: 10.3389/fmicb.2014.00431*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Zaneveld and Thurber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Symbiote transmission and maintenance of extra-genomic associations

## *Benjamin M. Fitzpatrick\**

*Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA*

#### *Edited by:*

*Monica Medina, Pennsylvania State University, USA*

#### *Reviewed by:*

*Carl J. Yeoman, Montana State University, USA Martin Zimmer, Universität Salzburg, Austria*

*\*Correspondence: Benjamin M. Fitzpatrick, Department of Ecology and Evolutionary Biology, University of Tennessee, 569 Dabney, Knoxville, TN 37996, USA e-mail: benfitz@utk.edu*

Symbiotes can be transmitted from parents to offspring or horizontally from unrelated hosts or the environment. A key question is whether symbiote transmission is similar enough to Mendelian gene transmission to generate and maintain coevolutionary associations between host and symbiote genes. Recent papers come to opposite conclusions, with some suggesting that any horizontal transmission eliminates genetic association. These studies are hard to compare owing to arbitrary differences in modeling approach, parameter values, and assumptions about selection. I show that associations between host and symbiote genes (extra-genomic associations) can be described by the same dynamic model as conventional linkage disequilibria between genes in the same genome. Thus, covariance between host and symbiote genomes depends on population history, geographic structure, selection, and co-transmission rate, just as covariance between genes within a genome. The conclusion that horizontal transmission rapidly erodes extra-genomic associations is equivalent to the conclusion that recombination rapidly erodes associations between genes within a genome. The conclusion is correct in the absence of population structure or selection. However, population structure can maintain spatial associations between host and symbiote traits, and non-additive selection (interspecific epistasis) can generate covariances between host and symbiote genotypes. These results can also be applied to cultural or other non-genetic traits. This work contributes to a growing consensus that genomic, symbiotic, and gene-culture evolution can be analyzed under a common theoretical framework. In terms of coevolutionary potential, symbiotes can be viewed as lying on a continuum between the intimacy of genes and the indifference of casually co-occurring species.

**Keywords: vertical transmission, horizontal transmission, symbiosis, gene-culture, coevolution, dual-inheritance theory, interspecific disequilibrium, metagenome**

## **1. INTRODUCTION**

The view that organism phenotypes can be described in terms of a dichotomy between inherited genes and non-inherited environmental factors has been an enormously useful simplification in the development of quantitative genetics and evolutionary theory (Lynch and Walsh, 1998; Futuyma, 2009). However, several fields now present important opportunities to understand the prevalence and importance of additional influences. For example, studies of gene-culture coevolution (Feldman and Zhivotovsky, 1992; Henrich et al., 2008) and symbiosis (Bright and Bulgheresi, 2010; Gilbert et al., 2010) have emphasized the roles of factors with a mixture of horizontal and vertical transmission on development and evolution. Formation of intimate symbioses has contributed to major transitions in eukaryotic evolution and community ecology (Maynard Smith and Szathmary, 1995; Selosse et al., 2006), while gene-culture coevolution can promote speciation in learning animals (Vallin and Qvarnstrom, 2011) and is associated with the emergence of *Homo sapiens* as a global ecosystem engineer (Maynard Smith and Szathmary, 1995; Vitousek et al., 1997).

The degree to which symbiotic associations and cultural traits are passed vertically from parent to offspring vs. horizontally between unrelated individuals can be critical in determining the evolutionary trajectories of traits affecting cooperation, resource use, and the functional integration of systems (symbiotic or social) more inclusive than the individual organism. Vertical transmission is thought to promote partner fidelity feedbacks between genes (Frank, 1994). That is, when host and symbiote genes maintain a statistical association across generations, the evolution of stable mutualism is likely (Frank, 1994; Doebeli and Knowlton, 1998; Fletcher and Doebeli, 2009; Wyatt et al., 2013). When vertical transmission is perfect, the evolutionary dynamics converge on the dynamics of coevolution between genes within the same genome.

Several studies have sought to extend the tools and concepts of population genetics to analyze host-symbiote and gene-culture coevolution. For example, such efforts have been very successful in advancing understanding and manipulation of incompatibility-inducing endosymbiotes (Turelli, 1994; Hoffmann et al., 2011). Analyses of gene-culture coevolution have been illuminating but controversial when applied to human culture (Cavalli-Sforza and Feldman, 1981; Boyd and Richerson, 1985; Richerson and Boyd, 2005; Ackland et al., 2007; Claidiere and Andre, 2012; Houkes, 2012). Even genetic models of speciation can be extended to include interactions between host and symbiote genomes (Brucker and Bordenstein, 2012, 2013). Important technical advances include the recognition that gene-culture covariance and interspecific disequilibrium (covariance between host and symbiote genes) resemble conventional linkage disequilibrium and cytonuclear disequilibrium (Feldman and Cavalli-Sforza, 1984; Sanchez et al., 2000). Further, the notion of "generalized epistasis" (Feldman and Cavalli-Sforza, 1984; Feldman and Zhivotovsky, 1992) emphasizes the potential for non-additive interactions between genotypes and cultural traits to determine the direction of evolution.

These results have inspired a generalized concept of "nongenetic inheritance" to include symbiotes, cultural traits, and potentially other non-genetic or epigenetic traits with some possibility of both vertical and horizontal transmission (Bonduriansky and Day, 2009). Proponents of this expanded view of inheritance have generally concluded that there are important similarities between models of non-genetic inheritance and classic genetic models, and that extra-genomic traits can be significant factors in evolution (Feldman and Zhivotovsky, 1992; Jablonka and Lamb, 1998; Bonduriansky and Day, 2009; Day and Bonduriansky, 2011; Odling-Smee et al., 2013). However, attempts to extend classical population and quantitative genetics theory to accommodate non-genetic inheritance have been relatively complex, making it difficult to establish general principles (Feldman and Zhivotovsky, 1992; Santure and Spencer, 2006; Tal et al., 2010; Johannes and Colomé-Tatché, 2011; Bonduriansky, 2012). Moreover, the general importance of non-genetic traits in evolution remains debated (Jablonka and Lamb, 1998; Haig, 2007; Dickins and Dickins, 2008; Bonduriansky, 2012).

The opposing conclusions from recent theoretical studies most likely arise from differing premises, assumptions, and parameter spaces. Early work on gene-culture coevolution adapted classical population genetic models to assumptions about cultural transmission, and tended to emphasize effects of selection on cultural traits (Feldman and Cavalli-Sforza, 1984; Boyd and Richerson, 1985; Feldman and Cavalli-Sforza, 1989; Feldman and Zhivotovsky, 1992). Studies of genomic imprinting have generally used a quantitative genetic framework emphasizing phenotype distributions in families (Santure and Spencer, 2006; Tal et al., 2010; Johannes and Colomé-Tatché, 2011). Day and Bonduriansky (2011) used the Price equation to develop expressions for phenotypic change owing to selection, genetic inheritance, and non-genetic inheritance. Their approach appears to be quite general, but as a trade-off for generality, it does not immediately reveal the importance of any particular mechanism of non-genetic inheritance, nor any particular measure of coevolutionary association. In contrast, Brandvain et al. (2011) made an explicit population genetic model focused on how maternal transmission of a symbiote affects interspecific covariance (disequilibrium) in a panmictic population with no selection.

Under those conditions, Brandvain et al. (2011) showed that genetic covariances (interspecific disequilibria) between neutral organelle and symbiote alleles decay rapidly with even a little horizontal transmission in a panmictic population. They concluded that imperfect vertical transmission would leave negligible statistical signature in molecular marker data and that the potential for interactions between host and symbiote genes to affect evolution would be limited. Qualitatively, the same can be said for conventional nuclear genes or cytoplasmic and nuclear genomes; linkage disequilibria (covariances) decay rapidly in panmictic populations (Lewontin, 1974; Hartl and Clark, 1997). However, conditions favoring persistent covariance between nuclear genes or between cytoplasmic and nuclear genes are common in nature (Arnold, 1993; Hewitt, 2001; Zapata et al., 2001; Laurie et al., 2007). Moreover epistasis is widely acknowledged as an important factor in genome evolution and speciation (Wolf et al., 2000; Coyne and Orr, 2004; Petkov et al., 2005; Weinreich et al., 2005; Muir and Moyle, 2009; Rohlfs et al., 2010), and Drown et al. (2013) showed conditions under which interspecific epistasis would favor an evolutionary transition from horizontal to vertical transmission in a host-symbiote system.

The apparently unresolved question is whether associations between host and symbiote genes have fundamentally different dynamics from associations between genes in the same genome (as implied by Brandvain et al., 2011), or whether fundamental dynamic similarities trump a superficial distinction between genetic and non-genetic inheritance (Day and Bonduriansky, 2011). Here, I attempt to address this question by deriving a few very simple relationships, and then illustrating some of their implications with simulations of a few biologically interesting scenarios. My conclusion might not be surprising to theoretical population geneticists; extra-genomic traits like symbiotes can be studied by extending the tools of classical population genetics (Cavalli-Sforza and Feldman, 1981; Boyd and Richerson, 1985; Sanchez et al., 2000; Day and Bonduriansky, 2011). However, my analysis might be helpful for microbial ecologists. I shed some new light on the early work on associations (disequilibria) between classical alleles and non-genetic traits (Feldman and Cavalli-Sforza, 1984) and contradict the inferred conclusions (though not the mathematical results) of Brandvain et al. (2011). My results help illustrate the value of a unified view of inheritance and coevolution (Day and Bonduriansky, 2011), with symbiosis falling on a continuum of intimacy including genes, organelles, free-living species, and even environmental factors (Lewontin, 1983; Keeling and Archibald, 2008; Odling-Smee et al., 2013).

## **2. METHODS**

Brandvain et al. (2011) made two claims against the evolutionary relevance of associations between host organelle and symbiote alleles (interspecific disequilibrium). First, high rates of vertical transmission of symbiotes do not result in high levels of genetic association. Second (consequently), host-symbiote interactions will not respond to natural selection on non-additive fitness effects (interspecific epistasis). The basis of these claims is a model showing rapid decay of covariances between organelle and symbiote genomes in a single randomly mating population with no selection. Therefore, I first evaluate the veracity of their result using a simpler, more traditional population genetics model.

The model follows earlier work generalizing the traditional measure of association between genes in the same genome (linkage disequilibrium) to extra-genomic associations (Feldman and Cavalli-Sforza, 1984; Feldman and Zhivotovsky, 1992), but I provide a new result, a closed-form solution for the evolutionary dynamics of extra-genomic associations. To put a finer point on the question of whether extra-genomic associations can be informative about the transmission process, I explicitly compare the dynamics of symbiote-organelle covariance with the dynamics of symbiote-nuclear covariance. To my knowledge, this has not been done previously.

Second, I evaluate the effects of population structure on patterns of extra-genomic covariance. Population structure is well known to promote within-genome covariance (linkage disequilibrium), but the impact of subdivision and gene flow on the dynamics of extra-genomic traits has rarely been investigated (Ackland et al., 2007). I derive new analytical results for equilibrium covariance in a simple admixture model (Asmussen and Arnold, 1991) and then show numerical results for a more complicated hybrid zone model modified from Dakin (2006).

Finally, I test whether epistatic interactions between host and symbiote genes can affect how a population responds to selection. I illustrate the earlier results of Feldman and Cavalli-Sforza (1989), which demonstrate an evolutionary response to selection on epistatic effects between a gene and a cultural trait. I show conditions under which host-symbiote genetic covariance is built up by selection despite the tendency for horizontal transmission to reduce covariance. I also evaluate effects of hitchhiking between host and symbiote genes in a single population and across a hybrid zone.

### **3. RESULTS**

#### **3.1. COVARIANCE BETWEEN SYMBIOTE AND ORGANELLE GENOMES**

To derive the expected covariance between organelle (cytoplasmic) and symbiote genomes, consider a system where each genome has two types (haplotypes or haplotype groups): *C* and *c* for the cytoplasmic organelle, and *S* and *s* for the symbiote. If $x_{ij}$ is the frequency of individuals with cytotype *i* and symbiote *j*, and $p_i$ and $p_j$ are the marginal frequencies of cytotype *i* and symbiote *j*, the covariance (cyto-symbiote disequilibrium, *D*) is

$$D = x_{CS} - p_C p_S. \tag{1}$$

Equation (1) is the covariance between two binary random variables, and can be applied to phenotypes, genes, symbiotic states or any pair of binary categorical variables (Clark, 1984; Feldman and Cavalli-Sforza, 1984). Statistical association or disequilibrium can also be expressed as a correlation ($r = D/\sqrt{p_S p_s p_C p_c}$). If the probability of vertical (maternal) transmission of the symbiote is *v*, the set of frequencies $x = (x_{CS}, x_{Cs}, x_{cS}, x_{cs})$ is expected to change across generations according to the transition matrix in **Table 1**. Therefore, the expected cyto-symbiote covariance in the next generation is

$$\begin{split} D' &= x'_{CS} - p_C p_S \\ &= x_{CS} v + (x_{CS} + x_{Cs})(1 - v) p_S - p_C p_S \end{split} \tag{2}$$

Substitute $p_C = x_{CS} + x_{Cs}$:

$$\begin{aligned} D' &= x_{CS} v - p_C p_S v \\ &= D v \end{aligned}$$

#### **Table 1 | Transition matrix for joint symbiote and organelle genotypes.**


And the general solution is $D_t = D_0 v^t$. This novel result is directly comparable to the classic result for covariance between nuclear genes, $D_t = D_0 c^t$, where *c* is the probability of cosegregation (1 − recombination rate) (Lewontin, 1974).
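The geometric decay implied by Equation (2) can be checked numerically. The Python sketch below is illustrative only: the function names are assumptions of this example, and the update rule for the three genotype classes other than *CS* is inferred by symmetry from the expression for the *CS* class rather than taken from Table 1.

```python
def covariance(x):
    """Cyto-symbiote covariance D = x_CS - p_C * p_S (Equation 1)."""
    x_CS, x_Cs, x_cS, x_cs = x
    p_C = x_CS + x_Cs
    p_S = x_CS + x_cS
    return x_CS - p_C * p_S


def next_generation(x, v):
    """One generation with vertical transmission probability v.

    Offspring keep the maternal cytotype. With probability v they also keep
    the maternal symbiote; otherwise they receive a symbiote drawn at random
    from the population (assumed rule, by symmetry with Equation 2).
    """
    x_CS, x_Cs, x_cS, x_cs = x
    p_C = x_CS + x_Cs
    p_c = 1.0 - p_C
    p_S = x_CS + x_cS
    p_s = 1.0 - p_S
    return (
        v * x_CS + (1 - v) * p_C * p_S,
        v * x_Cs + (1 - v) * p_C * p_s,
        v * x_cS + (1 - v) * p_c * p_S,
        v * x_cs + (1 - v) * p_c * p_s,
    )


def simulate(x0, v, generations):
    """Return the covariance D for generations 0..generations."""
    x = x0
    trajectory = [covariance(x)]
    for _ in range(generations):
        x = next_generation(x, v)
        trajectory.append(covariance(x))
    return trajectory


if __name__ == "__main__":
    x0 = (0.4, 0.1, 0.1, 0.4)  # arbitrary starting frequencies
    v = 0.9
    d0 = covariance(x0)
    for t, d in enumerate(simulate(x0, v, 5)):
        # Second and third columns should agree: iterated D vs. D0 * v**t
        print(t, d, d0 * v ** t)
```

Because the marginal frequencies do not change under this rule, the iterated covariance and the closed-form value should agree up to floating-point error.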

If the symbiosis is not obligate and exclusive, i.e., if a host can have zero symbiotes or more than one symbiote genotype, the result is similar. Let $x_{CS}$ be the frequency of individuals with cytotype *C* and symbiote genotype *S*, regardless of what other symbiote genotypes are present. Then $x'_{CS} = x_{CS} v (1 - h_S) + p_C h_S$, where $h_S$ is the probability of gaining a symbiote with genotype *S* by horizontal transfer (from the environment or another host). Let $D = x_{CS} - p_C p_S$, where $p_S$ is the fraction of hosts with symbiote *S* (rather than the fraction of symbiotes with genotype *S*). In this case

$$D_t = D_0 [v (1 - h_S)]^t. \tag{3}$$

That is, when the ecological relationship between host and symbiote is facultative, the effective recombination rate is $1 - v(1 - h_S)$, which is greater than in the obligate case ($1 - v$).
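A short numerical check of Equation (3) follows. It is a sketch under stated assumptions: the horizontal acquisition probability `h_S` is held constant across generations, the recursion for the fraction of hosts carrying *S* is obtained by summing the recursion for the *CS* class over both cytotypes, and all function names are hypothetical.

```python
def facultative_step(x_CS, p_C, p_S, v, h_S):
    """One generation for a facultative symbiosis.

    A host carries S if it inherits S from its mother (probability v) or
    acquires S horizontally (probability h_S). Treating these two routes as
    independent gives x' = x * v * (1 - h_S) + p_C * h_S.
    """
    x_next = x_CS * v * (1 - h_S) + p_C * h_S
    # Same rule summed over both cytotypes (assumption of this sketch).
    p_S_next = p_S * v * (1 - h_S) + h_S
    return x_next, p_S_next


def facultative_trajectory(x_CS, p_C, p_S, v, h_S, generations):
    """Covariance D = x_CS - p_C * p_S over time; p_C stays constant."""
    d_values = [x_CS - p_C * p_S]
    for _ in range(generations):
        x_CS, p_S = facultative_step(x_CS, p_C, p_S, v, h_S)
        d_values.append(x_CS - p_C * p_S)
    return d_values


def effective_recombination(v, h_S):
    """Effective recombination rate 1 - v * (1 - h_S)."""
    return 1 - v * (1 - h_S)


if __name__ == "__main__":
    for t, d in enumerate(facultative_trajectory(0.3, 0.5, 0.4, 0.9, 0.1, 5)):
        print(t, d)
```

Setting `h_S = 0` reduces the effective recombination rate to `1 - v`, the obligate case.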

#### **3.2. COVARIANCE BETWEEN SYMBIOTE AND NUCLEAR GENES**

For the covariance between a symbiote genotype and an allele at a nuclear locus (with alleles *A* and *a*), the nuclear-symbiote covariance is $D = x_{AAS} + \frac{1}{2} x_{AaS} - p_A p_S$ (Asmussen et al., 1987). The relevant host-symbiote genotype frequencies in the next generation are given by

$$x'_{AAS} = v \left( x_{AAS} + \frac{1}{2} x_{AaS} \right) p_A + (1 - v) p_S \left( x_{AA} + \frac{1}{2} x_{Aa} \right) p_A$$

and

$$\begin{aligned} x'_{AaS} &= v \left( x_{AAS} p_a + \frac{1}{2} x_{AaS} + x_{aaS} p_A \right) \\ &+ (1 - v) p_S \left( x_{AA} p_a + \frac{1}{2} x_{Aa} + x_{aa} p_A \right) \end{aligned}$$

The nuclear-symbiote covariance in the next generation, $D' = x'_{AAS} + \frac{1}{2} x'_{AaS} - p_A p_S$, simplifies to

$$D' = \frac{1}{2} v D \tag{4}$$

And the general solution is $D_t = D_0 \left(\frac{1}{2} v\right)^t$. A pertinent special case is perfect maternal transmission (*v* = 1), which eliminates the distinction between symbiote and organelle. Covariance between a maternally transmitted organelle and a nuclear gene (cyto-nuclear disequilibrium) decays by one half each generation ($D_t = D_0 \left(\frac{1}{2}\right)^t$), just like covariance between unlinked nuclear genes (Asmussen et al., 1987).

Even though all covariances decay rapidly in a panmictic population (Lewontin, 1974; Sanchez et al., 2000; Brandvain et al., 2011), cyto-symbiote covariance (Equation 2) decays much more slowly (by a factor of 2) than nuclear-symbiote covariance (Equation 4) whenever there is non-zero maternal transmission (**Figure 1**). The key difference between host-symbiote transmission and conventional gene or organelle transmission is that completely free recombination corresponds to a cosegregation probability of $\frac{1}{2}$, whereas completely horizontal transmission corresponds to a vertical transmission probability of zero. Thus, although conventional linkage disequilibrium between unlinked loci is expected to decay no faster than by $\frac{1}{2}$ each generation, host-symbiote covariance can drop to zero in one generation if there is no tendency for parent-offspring transmission. Aside from this quantitative difference, the population genetic principles describing the relationship between nuclear and cytoplasmic genomes (Asmussen et al., 1987; Arnold et al., 1988; Arnold, 1993) can be extended to the relationship between hosts and symbiotes. This insight underscores a fundamental continuity between nuclear genes, organelles, and symbiotes.
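The factor of one half in Equation (4) can be verified by iterating the genotype recursions directly. The following Python sketch assumes random mating, maternal symbiote transmission with probability *v*, and a randomly drawn symbiote otherwise; the recursions for hosts carrying *s* are written by symmetry with those given for *S*, and the dictionary layout and function names are choices made for this example.

```python
GENOTYPES = ("AA", "Aa", "aa")
SYMBIOTES = ("S", "s")


def allele_frequency(x):
    """Frequency of nuclear allele A."""
    return sum(x[("AA", s)] + 0.5 * x[("Aa", s)] for s in SYMBIOTES)


def covariance(x):
    """Nuclear-symbiote covariance D = x_AAS + x_AaS / 2 - p_A * p_S."""
    p_A = allele_frequency(x)
    p_S = sum(x[(g, "S")] for g in GENOTYPES)
    return x[("AA", "S")] + 0.5 * x[("Aa", "S")] - p_A * p_S


def next_generation(x, v):
    """Offspring genotype-by-symbiote frequencies after random mating."""
    p_A = allele_frequency(x)
    p_a = 1.0 - p_A
    offspring = {}
    for s in SYMBIOTES:
        p_sym = sum(x[(g, s)] for g in GENOTYPES)
        # Eggs carrying A or a produced by mothers that host symbiote s.
        egg_A = x[("AA", s)] + 0.5 * x[("Aa", s)]
        egg_a = x[("aa", s)] + 0.5 * x[("Aa", s)]
        # Vertical route keeps the egg-symbiote pairing; the horizontal
        # route pairs a random egg with a random symbiote.
        A_with_s = v * egg_A + (1 - v) * p_sym * p_A
        a_with_s = v * egg_a + (1 - v) * p_sym * p_a
        # Sperm contribute A with probability p_A and a with probability p_a.
        offspring[("AA", s)] = A_with_s * p_A
        offspring[("Aa", s)] = A_with_s * p_a + a_with_s * p_A
        offspring[("aa", s)] = a_with_s * p_a
    return offspring


if __name__ == "__main__":
    x = {("AA", "S"): 0.30, ("Aa", "S"): 0.15, ("aa", "S"): 0.05,
         ("AA", "s"): 0.05, ("Aa", "s"): 0.15, ("aa", "s"): 0.30}
    v = 0.9
    for t in range(5):
        d_now = covariance(x)
        x = next_generation(x, v)
        # The ratio of successive covariances should equal v / 2.
        print(t, d_now, covariance(x) / d_now)
```

With *v* = 1 the same code describes cyto-nuclear disequilibrium, for which the per-generation ratio is one half.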

#### **3.3. HOST-SYMBIOTE COVARIANCE IN STRUCTURED POPULATIONS**

#### *3.3.1. Continent-island structure*

Admixture or immigration between populations with divergent allele frequencies is a common cause of within-genome association (conventional linkage disequilibrium) in nature (Conner and Hartl, 2004). To assess the effect of this kind of population structure on host-symbiote genetic associations, first consider a simple continent-island admixture model (Asmussen and Arnold, 1991) in which a focal population receives a proportion *m* of immigrants from two or more divergent source populations (**Figure 2**). Let the frequency of immigrants with cytoplasmic allele *C* and symbiote *S* be $\pi_{CS}$. After immigration, the covariance between cytoplasmic and symbiote genotypes (cyto-symbiote covariance $D_{CS}$) is

$$\begin{split} D'_{CS} &= (1 - m) x'_{CS} + m \pi_{CS} - \bar{p}_C \bar{p}_S \\ &= v (1 - m) D_{CS} + m D_m + \text{cov}(p_C, p_S) \end{split} \tag{5}$$

where $D_m$ is the covariance among immigrants. The averages ($\bar{p}_C$ and $\bar{p}_S$) and covariance of the allele frequencies are over immigrant and resident-born sets with weights *m* and 1 − *m*. Exactly the same dynamics can be derived for conventional nuclear or cyto-nuclear linkage disequilibria (Asmussen and Arnold, 1991). If we assume the source populations are themselves unchanged by gene flow, so that $D_m$ and the allele frequencies are constant, this difference equation has a non-zero equilibrium at

$$\hat{D}_{CS} = \frac{m D_m}{1 - v (1 - m)}. \tag{6}$$

Likewise for nuclear-symbiote covariance,

$$\hat{D}_{NS} = \frac{m D_m}{1 - \frac{1}{2} v (1 - m)}. \tag{7}$$

Setting *v* = 1 in Equation (7) recovers Asmussen and Arnold's (1991) solution for cyto-nuclear disequilibrium. Thus, interspecific genetic covariances can be maintained by immigration. Moreover, $\hat{D}_{CS} > \hat{D}_{NS}$; the expected cyto-symbiote covariance is greater than the expected nuclear-symbiote covariance at equilibrium between immigration and recombination.
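The equilibria in Equations (6) and (7) can be compared with direct iteration of the difference equation. In this Python sketch the allele-frequency covariance term of Equation (5) is set to zero, which is an assumption appropriate once resident and immigrant allele frequencies coincide; the parameter values and function names are arbitrary illustrations.

```python
def iterate_island(d0, d_m, m, retention, generations):
    """Iterate D' = retention * (1 - m) * D + m * D_m.

    `retention` is v for cyto-symbiote covariance and v / 2 for
    nuclear-symbiote covariance. The cov(p_C, p_S) term of Equation (5)
    is assumed to be zero in this sketch.
    """
    d = d0
    for _ in range(generations):
        d = retention * (1 - m) * d + m * d_m
    return d


def equilibrium(d_m, m, retention):
    """Closed-form equilibrium m * D_m / (1 - retention * (1 - m))."""
    return m * d_m / (1 - retention * (1 - m))


if __name__ == "__main__":
    m, v, d_m = 0.1, 0.9, 0.2  # illustrative values only
    for label, retention in (("cyto-symbiote", v), ("nuclear-symbiote", v / 2)):
        iterated = iterate_island(0.0, d_m, m, retention, 500)
        print(label, iterated, equilibrium(d_m, m, retention))
```

Passing `retention = 0.5` (that is, *v* = 1 in the nuclear case) gives the cyto-nuclear equilibrium form.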

#### *3.3.2. Stepping-stone structure*

To investigate the effects of more complicated population structure on cyto-symbiote and nuclear-symbiote covariances, I added a symbiote to the stepping stone model analyzed thoroughly for cyto-nuclear covariance by Dakin (2006). The basic framework is a line of demes between two source populations. Dispersal occurs only between adjacent demes and mating is random within demes (Kimura and Weiss, 1964; Goodisman et al., 1998; Dakin, 2006). This framework can be modified to include some other classic models as special cases, including the continent-island (above) and two-population intermixture models (Nei and Li, 1973; Asmussen and Arnold, 1991).

First, I determined how to incorporate a symbiote with vertical transmission probability *v* into Dakin's (2006) deterministic recursion equations (Appendix Equation 9). I then iterated the system of equations to illustrate transient and equilibrium patterns. I followed Dakin (2006) by initializing ten stepping stones, the first five (nearest source 1) fixed for *A*, *C*, and *S*, and the second five (nearest source 2) fixed for *a*, *c*, and *s*. This simulates secondary contact. Each time step began with symmetrical dispersal between neighboring demes, followed by random mating within demes. In one set of simulations, I followed Dakin (2006) in assuming infinite source populations unaffected by gene flow (remaining fixed for their respective alleles for all time). In a second set, I allowed gene flow into the source populations, assuming they received half the number of immigrants (because they have only one rather than two neighbors). I report the covariances among loci in the adults (after dispersal, before mating); covariances in the offspring are lower in magnitude (by factors of $v$, $\frac{v}{2}$, and $\frac{1}{2}$) but the differences between cyto-symbiote, nuclear-symbiote, and cyto-nuclear associations are greater.
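The structure of such a simulation can be sketched in a few lines of Python. This is a simplified illustration and not Dakin's (2006) recursions: it tracks only the cytotype and the symbiote (the nuclear locus is omitted), assumes the infinite-source case, and assumes each deme exchanges a fraction *m*/2 of its adults with each neighbor. All function names are hypothetical.

```python
def reproduce(x, v):
    """Offspring frequencies (x_CS, x_Cs, x_cS, x_cs) within one deme."""
    x_CS, x_Cs, x_cS, x_cs = x
    p_C = x_CS + x_Cs
    p_S = x_CS + x_cS
    return (
        v * x_CS + (1 - v) * p_C * p_S,
        v * x_Cs + (1 - v) * p_C * (1 - p_S),
        v * x_cS + (1 - v) * (1 - p_C) * p_S,
        v * x_cs + (1 - v) * (1 - p_C) * (1 - p_S),
    )


def disperse(demes, m, source_left, source_right):
    """Symmetric dispersal: each deme receives m / 2 from each neighbor."""
    padded = [source_left] + demes + [source_right]
    adults = []
    for i in range(1, len(padded) - 1):
        adults.append(tuple(
            (1 - m) * padded[i][k]
            + 0.5 * m * (padded[i - 1][k] + padded[i + 1][k])
            for k in range(4)
        ))
    return adults


def adult_covariances(n_demes=10, m=0.1, v=0.9, generations=100):
    """Cyto-symbiote covariance in adults (after dispersal, before mating)."""
    fixed_CS = (1.0, 0.0, 0.0, 0.0)  # source 1, fixed for C and S
    fixed_cs = (0.0, 0.0, 0.0, 1.0)  # source 2, fixed for c and s
    half = n_demes // 2
    demes = [fixed_CS] * half + [fixed_cs] * (n_demes - half)
    adults = demes
    for _ in range(generations):
        adults = disperse(demes, m, fixed_CS, fixed_cs)
        demes = [reproduce(x, v) for x in adults]
    return [x[0] - (x[0] + x[1]) * (x[0] + x[2]) for x in adults]


if __name__ == "__main__":
    for deme, d in enumerate(adult_covariances(), start=1):
        print(deme, d)
```

Allowing gene flow into the sources, or adding the nuclear locus, would require extending the state tracked per deme; neither is attempted here.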

Dynamics of the stepping stone model over the first 100 generations of contact are illustrated in **Figure 3**. Over this time frame, the difference between the infinite source and finite source models was negligible. With moderate immigration (*m* = 0.10 shown) the central demes (at the initial contact front) show transient covariances many times higher than their equilibrium values, as shown previously for cyto-nuclear covariances (Asmussen and Arnold, 1991). Most important, with high but imperfect vertical transmission (*v* = 0.90 shown), the cyto-symbiote association is greater and shows larger changes in time and space than the nuclear-symbiote or cyto-nuclear associations.

Allele and genotype frequencies were very close to their asymptotic values within a few hundred generations. For the finite source model, all covariances approached zero, as expected (Asmussen and Arnold, 1991). Asymptotic results for the infinite source model (after 1000 generations), presumed to reflect equilibrium, are summarized in **Figure 4**. Cyto-nuclear results for the allelic and genotypic covariances agree with those of Dakin (2006). Cyto-symbiote associations are many times higher, as expected from Equations 2 and 4 for a high probability of vertical transmission (*v* = 0.90 shown).

To check the effects of genetic drift on these predictions, I also modeled the system with finite populations. At each generation in each deme, I sampled *N* = 1000 adults from the genotype distribution in the focal deme with probability 1 − *m* and from each neighboring deme with probability *m*/2. Then I sampled *N* = 1000 offspring given the adult genotype frequencies and transmission probabilities. Results for this stochastic version were similar to the deterministic results for asymptotic covariances (**Figure 4**). Drift tends to increase genetic associations, but this is somewhat masked by its effects on allele frequencies. The correlation ($r = D/\sqrt{p_S p_s p_C p_c}$) is more sensitive than the covariance (**Figure 4**) because *r* accounts for local allele frequencies.
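The two sampling steps described above might be sketched as follows for a single deme. This is a hypothetical illustration using only the cytotype and symbiote classes and Python's standard `random` module; it is not the code used to produce Figure 4, and the function names are assumptions.

```python
import random
from collections import Counter


def sample_frequencies(probabilities, n, rng):
    """Draw n individuals from a genotype distribution; return frequencies."""
    classes = range(len(probabilities))
    counts = Counter(rng.choices(classes, weights=probabilities, k=n))
    return tuple(counts[k] / n for k in classes)


def stochastic_deme_update(focal, left, right, m, v, n, rng):
    """One generation in one deme: sample adults, then sample offspring.

    Each adult comes from the focal deme with probability 1 - m and from
    each neighbor with probability m / 2, which is equivalent to sampling
    from the corresponding mixture of genotype distributions.
    """
    mixture = tuple(
        (1 - m) * f + 0.5 * m * (l + r) for f, l, r in zip(focal, left, right)
    )
    adults = sample_frequencies(mixture, n, rng)
    x_CS, x_Cs, x_cS, x_cs = adults
    p_C = x_CS + x_Cs
    p_S = x_CS + x_cS
    # Expected offspring distribution given the sampled adults.
    expected = (
        v * x_CS + (1 - v) * p_C * p_S,
        v * x_Cs + (1 - v) * p_C * (1 - p_S),
        v * x_cS + (1 - v) * (1 - p_C) * p_S,
        v * x_cs + (1 - v) * (1 - p_C) * (1 - p_S),
    )
    offspring = sample_frequencies(expected, n, rng)
    return adults, offspring


if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so runs are repeatable
    fixed_CS = (1.0, 0.0, 0.0, 0.0)
    fixed_cs = (0.0, 0.0, 0.0, 1.0)
    adults, offspring = stochastic_deme_update(
        fixed_CS, fixed_CS, fixed_cs, m=0.1, v=0.9, n=1000, rng=rng
    )
    print(adults)
    print(offspring)
```

Repeating the update over all demes and many generations, and averaging over replicate runs, would give stochastic counterparts to the deterministic covariances.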

#### **3.4. GENERALIZED EPISTASIS AND SELECTION: CATTLE AS SYMBIOTES**

Feldman and Cavalli-Sforza (1984) pointed out that the covariance between genes and a culturally transmitted trait could be described by a disequilibrium (*D* or *r*) in exactly the same way as the standard covariance between genes, and Sanchez et al. (2000) observed that gene-culture covariance and host-symbiote covariance were formally analogous to cytonuclear disequilibrium (all being covariances between binary random variables). My analysis shows that this similarity is deeper than the summary statistics; the evolutionary associations in host-symbiote and gene-culture systems follow the same kind of dynamics as Mendelian genetic systems. It follows that evolutionary relationships between host and symbiote genes should respond to epistatic selection (contra Brandvain et al., 2011). That is, the notion of epistasis can be generalized to include interactions between traits with different inheritance mechanisms (Feldman and Zhivotovsky, 1992), just as the notion of coevolution can be extended to include interactions between genes in the same genome (Lovell and Robertson, 2010).

To underscore this fundamental correspondence, consider the coevolutionary model of human lactose absorption and dairy farming investigated by Feldman and Cavalli-Sforza (1989). Their model included an autosomal (diploid) gene affecting lactase activity, and milk use as a binary cultural trait. This is a model of interspecific "generalized epistasis" because the fitness effect of a genotype depends on the presence or absence of the interaction between humans and cattle (Feldman and Cavalli-Sforza, 1989; Feldman and Zhivotovsky, 1992). The change in frequency of the absorption allele depended strongly on the degree of vertical transmission of the cultural trait (the probability that children of milk users would become milk users). The model applies equally well if we characterize dairy farming as a symbiotic association between humans and cattle. If cattle or farms tended to be passed from parent to offspring, the gene-culture covariance describing an association between human genotype and behavior (milk use) is exactly the same as host-symbiote covariance describing an association between human genotype and symbiote (cattle) presence.

**probability** *v* **= 0.90.** Only the middle six demes are shown; demes 1, 2, 9, and 10 have essentially flat lines at this scale.

I implemented the Feldman and Cavalli-Sforza (FCS) model as follows (see the online appendix and Feldman and Cavalli-Sforza, 1989 for mathematical and coding details). The first step is random mating with imperfect mother-offspring transmission of the symbiote (*U* vs. *u* for present or absent) and Mendelian transmission of an autosomal diploid locus determining whether a host benefits or suffers a net fitness cost from the symbiosis (**Table 2**). In the second step, following Feldman and Cavalli-Sforza (1989), offspring have a chance of picking up the symbiote via horizontal transmission. That is, each individual without the symbiote has probability $f p_U$ of becoming infected ($p_U$ is the frequency of infection/symbiosis among other hosts in the population). The third step is selection according to **Table 2**, and the resulting genotypes mate at random to produce the next generation.

stochastic version of the model (1000 individuals per deme) and lines show the deterministic (no drift) expectations. Cyto-symbiote associations (black circles, solid line) are several times greater than nuclear-symbiote (open circles, dashed line) and cyto-nuclear (gray) associations. The correlation (*r*<sup>2</sup>) is affected more by drift.

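**Table 2 | Fitness of host genotypes participating in the symbiosis (***U***) or not (***u***) in the Feldman and Cavalli-Sforza (FCS) model.**

| Host genotype | With symbiote (*U*) | Without symbiote (*u*) |
|---|---|---|
| *AA* | 1 + *s*<sub>1</sub> | 1 |
| *Aa* | 1 + *s*<sub>1</sub> | 1 |
| *aa* | 1 − *s*<sub>2</sub> | 1 |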

*If both s<sub>1</sub> and s<sub>2</sub> > 0, there is sign epistasis because the effect of the EGT is negative or positive depending on the host genotype.*

To illustrate that selection can operate on interspecific epistatic effects despite imperfect vertical transmission, I present numerical results for a range of vertical transmission rates given selection *s*<sub>1</sub> = 0.05 and *s*<sub>2</sub> = 0.15, and horizontal transmission rate *f* = 0.5. These values are among those used by Feldman and Cavalli-Sforza (1989) in their discussion of the coevolution of lactase and dairying, and I chose them by trial and error for the simple purpose of providing a counterexample to the conjecture that imperfect transmission will make interspecific epistatic effects unresponsive to selection (Brandvain et al., 2011).

Perfect vertical transmission is not required. Rather, epistatic fitness effects can drive coevolution between host and symbiote genes if the vertical transmission rate is above some threshold value determined by the strength and mode of selection and the probability of horizontal transmission (**Figure 5**). In the specific example depicted (**Figure 5**), there is a strong deleterious effect of the symbiosis (or cultural trait) on the "wild-type" genotype (*s*<sub>2</sub> = 0.2) relative to its benefit for the mutant (*s*<sub>1</sub> = 0.05); that is, the additive effect of the symbiote is negative when the *A* host allele is rare. Nonetheless, the symbiosis can spread (along with the *A* allele) if vertical transmission is strong enough to maintain an association between the symbiote and the host allele that makes it advantageous (*v* > 0.76 in this case, with *f* = 0.5).

Feldman and Cavalli-Sforza (1989) allowed transmission (*v* and *f*) to vary with genotype. The resulting dynamics differ only quantitatively from the simplified version used here. Day and Bonduriansky (2011) analyzed a version of the FCS model including self-learning. Individuals could (with some probability) choose the cultural practice that best suited their genotype (analogous to partner choice in symbiosis). They solved for a threshold strength of selection above which milk use increases when rare. This threshold was lowered by self-learning. High rates of vertical transmission lowered the threshold when self-learning was weak, but increased it when self-learning was strong.

#### **3.5. HITCHHIKING EFFECTS**

To evaluate the potential for selection on extra-genomic traits to drive changes in organelle genotype frequencies via hitchhiking, I evaluated the correlated response of organelle genotype frequencies to selection in the basic model, the FCS model, and the hybrid zone model. The effect of an advantageous, maternally transmitted symbiote on a neutral cytoplasmic marker can be found for the basic model (Equation 2 and **Table 1**) by assigning relative fitness 1 + *s* to individuals with symbiote *S*. Then the frequency of the cytoplasmic allele *C* changes as

$$p_C' = \frac{x_{CS}(1+s) + x_{Cs}}{\overline{W}} = p_C + \frac{sD}{\overline{W}} \tag{8}$$

where *W̄* = 1 + *sp<sub>S</sub>* is mean fitness.
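The two forms of Equation (8) can be checked numerically. The following is a minimal Python sketch, not the author's R code; it assumes that *D* denotes the cyto-symbiote covariance *x<sub>CS</sub>* − *p<sub>C</sub>p<sub>S</sub>*, and the genotype frequencies and the value of *s* are arbitrary illustrative choices.

```python
def marker_after_selection(x_CS, x_Cs, x_cS, x_cs, s):
    """Return the marker frequency after selection, computed two ways (Equation 8).

    Assumes D = x_CS - p_C * p_S (cyto-symbiote covariance) and relative
    fitness 1 + s for individuals carrying symbiote S.
    """
    p_C = x_CS + x_Cs                # frequency of cytoplasmic allele C
    p_S = x_CS + x_cS                # frequency of symbiote S
    D = x_CS - p_C * p_S             # assumed definition of the covariance
    w_bar = 1 + s * p_S              # mean fitness
    direct = (x_CS * (1 + s) + x_Cs) / w_bar
    via_covariance = p_C + s * D / w_bar
    return direct, via_covariance


# Arbitrary illustrative frequencies (they sum to 1).
direct, via_covariance = marker_after_selection(0.30, 0.10, 0.05, 0.55, s=0.1)
print(direct, via_covariance)
```

Both expressions agree to floating-point precision, and when the covariance is zero the marker frequency is unchanged by selection on the symbiote.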

To evaluate hitchhiking effects of a non-obligate symbiote, I added a maternally inherited cytoplasmic marker to the FCS model. I numerically iterated the FCS model to illustrate hitchhiking effects for three kinds of conditions.

First, for comparison to the standard case of a neutral marker linked to an advantageous mutation (Maynard Smith and Haigh, 1974; Barton, 2000), I used a single panmictic population with a uniformly advantageous symbiote in the FCS framework (*s*<sub>1</sub> = −*s*<sub>2</sub> = *s*). The symbiote was initially in complete disequilibrium with a cytoplasmic marker; that is, there were two kinds of individuals: *CU* and *cu*. Each *CU* produced *CU* offspring with probability *v* and *Cu* offspring with probability 1 − *v*. *Cu* and *cu* offspring

**FIGURE 5 | Coevolution of a "host" gene and presence or absence of a cultural/symbiotic trait in the Feldman and Cavalli-Sforza (FCS) model for a single population.** Panel **(A)** shows prevalence of hosts (e.g., humans) with the symbiote (e.g., dairy cattle) over time for six different vertical transmission rates (see legend in **B**). Blue and black lines approach equilibrium values between 0.5 and 1.0 (note the logarithmic y-axis). **(B)** shows concurrent changes in frequency of the allele affecting fitness of hosts with and without the symbiotic relationship (e.g., a lactose absorption allele). **(C)** shows the correlation (interspecific disequilibrium) between the host allele and symbiote presence.

were converted to *CU* and *cU* with probability *fp<sub>U</sub>*, where *p<sub>U</sub>* was the frequency of individuals with the symbiote at that time. Individuals with the symbiote had relative fitness 1 + *s*. For comparison to the standard Maynard Smith and Haigh (MSH) model, I calculated the expected final frequency of the cytoplasmic marker (*p*<sub>*C*,∞</sub>) in terms of initial frequencies of *C* and *U* and selection (*s*) from Barton (2000) Equation (1): *p*<sub>*C*,∞</sub> = *p*<sub>*C*,0</sub> + (1 − *p*<sub>*C*,0</sub>)*p*<sub>*U*,0</sub><sup>*r*/*s*</sup>, with the approximate effective recombination rate *r* = 1 − *v*(1 − *f*) from Equation (3). This MSH expectation is not perfectly identical to the cyto-symbiote hitchhiking model (there are differences in ploidy, initial conditions, and the "effective recombination rate" is only an approximation), but the magnitude of hitchhiking should be comparable if the claim is true that the FCS model produces evolutionary dynamics for host-symbiote gene pairs that are comparable to dynamics for conventional genes.
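The single-population comparison described above can be sketched in a few lines of Python. This is an illustrative reading of the verbal description, not the author's R code: it assumes a haploid marker, assumes that *p<sub>U</sub>* in the horizontal step is the symbiote frequency after vertical transmission, and uses hypothetical parameter values. The function names `hitchhike` and `msh_expectation` are introduced here for illustration only.

```python
def hitchhike(p0, v, f, s, generations=100):
    """Iterate cyto-symbiote frequencies and return the marker frequency p_C.

    Types are CU, Cu, cU, cu; the population starts in complete
    disequilibrium (only CU and cu), with CU at frequency p0.
    """
    CU, Cu, cU, cu = p0, 0.0, 0.0, 1.0 - p0
    for _ in range(generations):
        # Vertical transmission: carriers pass the symbiote on with probability v.
        CU, Cu = v * CU, Cu + (1 - v) * CU
        cU, cu = v * cU, cu + (1 - v) * cU
        # Horizontal transmission: symbiote-free offspring acquire it
        # with probability f * p_U (assumed: p_U after vertical transmission).
        gain = f * (CU + cU)
        CU, Cu = CU + gain * Cu, (1 - gain) * Cu
        cU, cu = cU + gain * cu, (1 - gain) * cu
        # Selection: individuals with the symbiote have relative fitness 1 + s.
        w_bar = 1 + s * (CU + cU)
        CU, cU = CU * (1 + s) / w_bar, cU * (1 + s) / w_bar
        Cu, cu = Cu / w_bar, cu / w_bar
    return CU + Cu


def msh_expectation(p_C0, p_U0, v, f, s):
    """Expected final marker frequency using the formula quoted in the text."""
    r = 1 - v * (1 - f)  # approximate effective recombination rate
    return p_C0 + (1 - p_C0) * p_U0 ** (r / s)


# Hypothetical parameter values chosen only to exercise the functions.
for v in (0.5, 0.9, 0.99):
    simulated = hitchhike(p0=0.001, v=v, f=0.5, s=0.1)
    expected = msh_expectation(0.001, 0.001, v, 0.5, 0.1)
    print(f"v={v}: simulated p_C={simulated:.4f}, MSH expectation={expected:.4f}")
```

With perfect vertical transmission and no horizontal transmission (*v* = 1, *f* = 0), the marker is effectively completely linked to the symbiote and both calculations give fixation of *C*; with *s* = 0 the marker frequency does not change.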

As expected based on standard population genetics and Equations (2, 3, and 8), the hitchhiking effect of a spreading symbiote depends on the strength of selection, the co-transmission rate, and the initial conditions (**Table 3**). The examples in **Figure 6** illustrate the similarity between cyto-symbiote hitchhiking (or extra-genomic hitchhiking in general) and the hitchhiking effect of one nuclear locus on another (Maynard Smith and Haigh, 1974; Barton, 2000). Graphs for other parameter combinations are available as Appendix A.

**Table 3 | Cytoplasmic marker frequencies after 100 generations of cyto-symbiote hitchhiking in the FCS model for various vertical transmission rates (***v***) and fitness effects.**


*The first case (p<sub>0</sub> = 0.001) represents an initially rare symbiote in a panmictic population, and the second (p<sub>0</sub> = 0.50) represents 1:1 admixture (the cytoplasmic marker was initially perfectly associated with the advantageous symbiote). The third and fourth cases represent contact zones (mean allele frequencies of demes 50 and 51 out of 100 total stepping stones). In the first three cases, the symbiote is universally advantageous (s = s<sub>1</sub> = −s<sub>2</sub>). In the final, epistatic case, s<sub>2</sub> = s<sub>1</sub>/2.*

Second, to evaluate the dynamics of secondary contact with a universally advantageous symbiote, I used the stepping-stone model described above with inheritance and selection according to the FCS model after dispersal. I used 100 stepping stones and iterated the deterministic difference equations for 100 generations after secondary contact for several combinations of transmission rates (*v* and *f*) and selection strengths (*s*<sub>1</sub> = −*s*<sub>2</sub> = *s*).

Finally, I repeated the simulations of secondary contact with selection according to the FCS model with sign epistasis (Weinreich et al., 2005). That is, the effect of the symbiote is positive or negative according to the host genotype, as in the original formulation (Feldman and Cavalli-Sforza, 1989). For the examples depicted, I varied the strength of selection but kept the ratio constant (*s*<sub>2</sub> = *s*<sub>1</sub>/2).

Cyto-symbiote hitchhiking can cause significant differential introgression of cytoplasmic DNA (**Figure 7**, **Table 3**). The spatial cline for the allele frequency of a neutral nuclear marker is unaffected by the spread of a symbiote across a hybrid zone, but there is a significant spatial displacement of the cline for a neutral cytoplasmic marker owing to the slow decay of cyto-symbiote covariance. As it does for intragenomic hitchhiking (Barton, 2000), spatial structure reduces the overall hitchhiking effect (**Table 3**) by slowing the response to selection and allowing more time for recombination. Introgression is generally slower with epistasis (**Table 3**), in part because the overall selective advantage of the symbiote is reduced when *s*<sub>2</sub> > −*s*<sub>1</sub> (**Table 3**), and in part because the advantage of the symbiote depends on cointrogression of the epistatic nuclear allele, producing a steeper, more slowly advancing wave front (**Figure 7**).

#### **4. DISCUSSION**

The simple models presented here extend previous theoretical work on nuclear, cyto-nuclear, and extra-genomic covariances (Nei and Li, 1973; Asmussen and Arnold, 1991; Feldman and Zhivotovsky, 1992; Goodisman et al., 1998; Sanchez et al., 2000; Dakin, 2006; Brandvain et al., 2011). My results illustrate a very important conceptual and mathematical correspondence between the dynamics of associations between genes in different genomes and associations between genes in the same genome. Although there have been doubts about the stability of coevolutionary associations between host and symbiote genes with less than 100% vertical transmission (Brandvain et al., 2011), such concerns are

no more or less valid for extra-genomic associations than for associations between genes within a genome. This inference can be extracted from the general frameworks of dual-inheritance theory (Cavalli-Sforza and Feldman, 1981; Feldman and Cavalli-Sforza, 1984; Boyd and Richerson, 1985; Feldman and Zhivotovsky, 1992) or non-genetic inheritance (Bonduriansky and Day, 2009; Day and Bonduriansky, 2011). My analysis uses more specific models to put a finer point on the idea that coevolution between genomes and within genomes share fundamental similarities.

The study of covariance between genes in symbiotic species is important because assortment between mutually beneficial genotypes is a key to the evolutionary origin and stability of mutualism (Frank, 1994; Fletcher and Doebeli, 2009; Wyatt et al., 2013). Vertical transmission does not guarantee the evolution of mutualism; persistent conflicts are sometimes evident within genomes or between host and organelle or endosymbiote genomes (Burt and Trivers, 2006). However, cross-generation covariance between host and symbiote genes can strongly affect the direction of evolution.

My analysis shows that extra-genomic covariances follow the same dynamics as conventional linkage disequilibria. In situations where conventional linkage disequilibria tend to persist (e.g., structured populations), extra-genomic covariances can also be expected if there is some level of vertical transmission. Therefore, extra-genomic covariances might often be informative about rates of vertical vs. horizontal transmission. In addition, the potential for non-additive fitness effects to generate significant evolutionary change at the level of symbiotic or cultural systems can be similar to the potential for epistasis to affect change at the level of conventional genomes.

#### **4.1. SIMILARITIES AND DIFFERENCES BETWEEN SYMBIOTE TRANSMISSION, CULTURAL TRANSMISSION, AND GENE TRANSMISSION**

The derivations of Equations (2–7) expose a striking dynamic similarity of within-genome and between-genome covariances. Disequilibria (covariances) are expected to decay as exponential functions of time. The consequence in a large, panmictic population is that we expect covariances to become negligible after a few generations (Lewontin, 1974; Sanchez et al., 2000; Brandvain et al., 2011). However, even in such an idealized population, the rate of decay depends on the probability of cotransmission. Therefore, maternally transmitted symbiotes will tend to have higher covariance with maternally transmitted genes (e.g., cytoplasmic genomes) than with nuclear genes (**Figure 1**).

Two differences between intra-genomic and extra-genomic associations can modify the exponential decay pattern. First, zero linkage corresponds to a cotransmission probability of 0.5 for a pair of genes within a genome, whereas zero vertical transmission corresponds to a cotransmission probability of 0.0 for host and symbiote genes. This means the decay of extra-genomic covariance can be instantaneous (when *v* = 0). Second, if the symbiotic relationship is not obligate, some offspring can be without the symbiote (i.e., they neither inherited it from a parent nor picked it up from someone else or the environment). This effectively increases the decay of extra-genomic covariance and causes a systematic tendency for the symbiosis to be lost in the absence of selection or an environmental source (Feldman and Cavalli-Sforza, 1989; Lipsitch et al., 1995; Sanchez et al., 2000).

#### **4.2. USING EXTRA-GENOMIC COVARIANCE TO MAKE INFERENCES ABOUT VERTICAL TRANSMISSION**

Given the rapid decay of covariances in panmictic populations, even for physically linked genes, lack of evident covariances cannot be interpreted as evidence against vertical transmission (Lewontin, 1974; Sanchez et al., 2000; Brandvain et al., 2011). However, when present, covariances can be informative. My analysis shows how even imperfect maternal transmission of symbiotes generates greater cyto-symbiote association than cytonuclear or nuclear-symbiote association (**Figure 1**). This result suggests that extra-genomic covariance can be used to make inferences about rates of vertical transmission in structured populations. Cytonuclear covariance and covariance between nuclear genes with known linkage relationships can be used as benchmarks for the detectability of vertical transmission of symbiotes. In particular, maternally inherited symbiotes are expected to have greater covariance with maternally inherited genes than with biparentally inherited genes (by a factor of 2<sup>*t*</sup> in the simple case of neutral decay in a panmictic population). Maternal transmission of bacteria might be common in animals (Funkhouser and Bordenstein, 2013). Nuclear and cytonuclear covariances are frequently observed in nature, particularly in zones of admixture or hybridization (Arnold, 1993; Hewitt, 2001; Zapata et al., 2001; Laurie et al., 2007). Interspecific covariances should be common in those same situations for symbiotes with substantial vertical transmission.
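The factor of 2<sup>*t*</sup> follows from simple geometric decay. The Python sketch below assumes the per-generation decay factors quoted in Appendix A for the neutral model (*v* for cyto-symbiote, *v*/2 for nuclear-symbiote, and 1/2 for cyto-nuclear covariance) in a panmictic population; the starting covariance and the values of *v* and *t* are arbitrary illustrations.

```python
def covariance_after(d0, factor, t):
    """Covariance after t generations of geometric (exponential) decay."""
    return d0 * factor ** t


v, t = 0.9, 5          # illustrative transmission rate and number of generations
d0 = 0.25              # initial covariance under 1:1 admixture

cyto_symbiote = covariance_after(d0, v, t)
nuclear_symbiote = covariance_after(d0, v / 2, t)
cyto_nuclear = covariance_after(d0, 0.5, t)

# The ratio of cyto-symbiote to nuclear-symbiote covariance is (v / (v/2))**t = 2**t.
ratio = cyto_symbiote / nuclear_symbiote
print(cyto_symbiote, nuclear_symbiote, cyto_nuclear, ratio)
```

Because the ratio does not depend on *v*, the relative excess of cyto-symbiote over nuclear-symbiote covariance grows with time even as both absolute values shrink.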

Recently, Thierry et al. (2011) documented extra-genomic covariance between whitefly genotypes and prevalence of several endosymbiotic bacteria in a zone of recent contact between native and introduced whiteflies. These endosymbiotes are maternally transmitted and showed strong correspondence between mtDNA and symbiotypes in hybrids (Thierry et al., 2011). However, the symbiotes have varying degrees of horizontal transmission and only one of three is obligate (Ahmed et al., 2013). Despite imperfect vertical transmission, host mtDNA and symbiote genotypes covary in the admixture zone (Thierry et al., 2011) and at a regional scale (Ahmed et al., 2013). These observations from nature indicate that it will be feasible to test for patterns of extra-genomic covariance consistent with maternal transmission in many wild systems. In the whitefly system, it will be interesting to compare patterns of nuclear gene flow with those of mtDNA, primary endosymbiotes, and secondary endosymbiotes.

#### **4.3. IMPLICATIONS FOR SYMBIOSIS AND EXTRA-GENOMIC COEVOLUTION**

Selection on non-genetic traits such as symbioses or learned behaviors can produce hitchhiking effects and extra-genomic covariances even when vertical transmission is less than perfect. Even non-additive interactions (generalized epistasis) between genes in different genomes can affect the direction of evolution. This has been shown before, particularly in models of gene-culture evolution (Feldman and Cavalli-Sforza, 1989; Ackland et al., 2007; Day and Bonduriansky, 2011), but the continuity of coevolution within genomes, between genomes, and even between genes and culture is underappreciated (Brandvain et al., 2011).

All forms of coevolution depend on consistent associations between partners to produce reciprocal responses to selection, but intimate symbioses are often thought of as special. Recent enthusiasm over the evolutionary ecology of human-microbe interactions has raised broad questions about whether plants and animals should be conceptualized as "holobionts" or "metaorganisms", including hosts and associated microbial communities as units or levels of organization in evolution and ecology (Rosenberg et al., 2007; Doolittle and Zhaxybayeva, 2010). The answer depends, in part, on whether coevolutionary dynamics between genes in different genomes (host and microbe) are similar to dynamics between genes in the same genome. The decay of associations between genomes in a panmictic population with imperfect vertical transmission might be interpreted as leaving little opportunity for non-additive interactions between genomes to drive evolution (Brandvain et al., 2011). However, my analysis and several previous studies illustrate how associations between genomes are affected by the same factors and follow the same kinds of dynamics as associations between genes within a genome. Population structure can generate and maintain associations that might affect local coevolution (Doebeli and Knowlton, 1998; Thompson, 2005; Wyatt et al., 2013) and generalized epistasis can generate associations and determine the outcome of coevolution (Feldman and Cavalli-Sforza, 1984; Feldman and Zhivotovsky, 1992; Drown et al., 2013). Interspecific epistasis can even contribute to host speciation (Brucker and Bordenstein, 2012), for example by impacting mating behavior (Sharon et al., 2010) or hybrid viability (Brucker and Bordenstein, 2013). 
It does not necessarily follow that host-symbiote systems should be conceptualized as "meta-organisms," but the theoretical continuity between coevolution of genes within genomes and between genomes is encouraging for further synthesis between evolutionary genetics and evolutionary ecology.

## **AUTHOR CONTRIBUTIONS**

The sole author (Benjamin M. Fitzpatrick) is responsible for all aspects of this article.

## **FUNDING**

University of Tennessee.

## **ACKNOWLEDGMENTS**

I am grateful to Y. Brandvain, T. Day, and S. Gavrilets for helpful comments and discussion.

## **SUPPLEMENTARY MATERIAL**

Online Appendices B–E (the R code for this article, available in Data Sheet 1) can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2014.00046/abstract

## **REFERENCES**


and invasive (B) biotypes of *Bemisia tabaci*. *Mol. Ecol.* 20, 2172–2187. doi: 10.1111/j.1365-294X.2011.05087.x


Zapata, C., Nunez, C., and Velasco, T. (2001). Distribution of nonrandom associations between pairs of protein loci along the third chromosome of *Drosophila melanogaster*. *Genetics* 161, 1539–1550.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 December 2013; paper pending published: 20 January 2014; accepted: 21 January 2014; published online: 24 February 2014.*

*Citation: Fitzpatrick BM (2014) Symbiote transmission and maintenance of extragenomic associations. Front. Microbiol. 5:46. doi: 10.3389/fmicb.2014.00046*

*This article was submitted to Microbial Symbioses, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Fitzpatrick. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**

## **APPENDIX A: EXPANDED DESCRIPTION OF METHODS FOR SIMULATING EXTRA-GENOMIC TRAIT TRANSMISSION**

#### *Host-symbiont covariance in structured populations*

To investigate the effects of population structure on cyto-symbiont and nuclear-symbiont disequilibria, I added a symbiont to the stepping stone model analyzed for cyto-nuclear disequilibria by Dakin (2006). Within each deme, random mating and inheritance of the nuclear locus *A*, maternally transmitted cytoplasmic locus *C* and imperfectly transmitted symbiont *S* follows the set of deterministic recursions given in Equation (9).

The expected frequencies after dispersal are given by the weighted averages of adjacent demes such that a fraction *m* of each deme is composed of immigrants coming in equal proportions from the two adjacent demes (Kimura and Weiss, 1964; Goodisman et al., 1998; Dakin, 2006). I followed Dakin (2006) by initializing ten stepping stones, the first five (nearest source 1) fixed for *A*, *C*, and *S*, and the second five (nearest source 2) fixed for *a*, *c*, and *s*. This simulates secondary contact. Each time step began with symmetrical dispersal between neighboring demes, followed by random mating within demes. In one set of simulations, I followed Dakin (2006) in assuming infinite source populations unaffected by gene flow (remaining fixed for their respective alleles for all time). In a second set, I allowed gene flow into the source populations, assuming they received half the number of immigrants (because they have only one rather than two neighbors). I report the disequilibria among loci in the adults (after dispersal, before mating); disequilibria in the offspring are lower in magnitude (by factors of *v*, *v*/2, and 1/2) but the differences between cyto-symbiont, nuclear-symbiont, and cyto-nuclear associations are greater.
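The dispersal step can be illustrated with a short Python sketch. It covers only the first set of simulations (infinite source populations that stay fixed) and tracks a single frequency per deme; the same weighted average would apply to each genotype class. The function name `disperse` and the value *m* = 0.1 are hypothetical choices for illustration, not taken from the online appendices.

```python
def disperse(freqs, m, left_source=1.0, right_source=0.0):
    """Return per-deme frequencies after one round of stepping-stone dispersal.

    A fraction m of each deme is replaced by immigrants drawn equally from
    its two neighbours; the sources flanking the chain stay fixed.
    """
    padded = [left_source] + list(freqs) + [right_source]
    return [
        (1 - m) * padded[i] + (m / 2) * (padded[i - 1] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


# Secondary contact: five demes fixed for one type, five for the other.
demes = [1.0] * 5 + [0.0] * 5
after_one_step = disperse(demes, m=0.1)
print(after_one_step)
```

After one step only the two demes adjacent to the contact point change, each exchanging a fraction *m*/2 with its neighbour across the boundary.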

I iterated the system of equations to illustrate transient and equilibrium patterns using the R code in Online Appendix B. To check the effects of genetic drift on these predictions, I also modeled a system of finite populations. At each generation in each deme, I sampled *N* adults from the focal deme with probability 1 − *m* and from each neighboring deme with probability *m/*2. Then I sampled *N* offspring from the frequency distribution described by Equation (9) given the adult genotype frequencies.
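The offspring sampling step that introduces drift can be sketched as a multinomial draw. The Python example below is a simplified illustration with four hypothetical cyto-symbiont classes rather than the full set of genotypes, and the expected frequencies are arbitrary; the random seed is fixed only to make the run repeatable.

```python
import random
from collections import Counter


def sample_offspring(expected, n, rng):
    """Draw n offspring from expected genotype frequencies (multinomial drift)."""
    genotypes = list(expected)
    weights = [expected[g] for g in genotypes]
    counts = Counter(rng.choices(genotypes, weights=weights, k=n))
    # Genotypes that were never drawn get frequency 0.
    return {g: counts[g] / n for g in genotypes}


rng = random.Random(1)
expected = {"CS": 0.4, "Cs": 0.1, "cS": 0.1, "cs": 0.4}  # illustrative values
realized = sample_offspring(expected, n=1000, rng=rng)
print(realized)
```

The realized frequencies scatter around the deterministic expectations, with the scatter shrinking as *N* grows.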

$$\begin{aligned}
X'_{k\sigma AA} &= P_A\left[ v\left(X_{k\sigma AA} + \tfrac{1}{2}X_{k\sigma Aa}\right) + (1-v)\,P_\sigma\left(X_{kSAA} + \tfrac{1}{2}X_{kSAa} + X_{ksAA} + \tfrac{1}{2}X_{ksAa}\right)\right] \\
X'_{k\sigma Aa} &= v\left(P_a X_{k\sigma AA} + P_A X_{k\sigma aa} + \tfrac{1}{2}X_{k\sigma Aa}\right) \\
&\quad + (1-v)\,P_\sigma\left(P_a X_{kSAA} + P_A X_{kSaa} + \tfrac{1}{2}X_{kSAa} + P_a X_{ksAA} + P_A X_{ksaa} + \tfrac{1}{2}X_{ksAa}\right) \\
X'_{k\sigma aa} &= P_a\left[ v\left(X_{k\sigma aa} + \tfrac{1}{2}X_{k\sigma Aa}\right) + (1-v)\,P_\sigma\left(X_{kSaa} + \tfrac{1}{2}X_{kSAa} + X_{ksaa} + \tfrac{1}{2}X_{ksAa}\right)\right]
\end{aligned} \tag{9}$$

for each cytoplasmic type *k* ∈ {*C*, *c*} and symbiont state *σ* ∈ {*S*, *s*}, where *P<sub>A</sub>* and *P<sub>a</sub>* are the nuclear allele frequencies and *P<sub>σ</sub>* is the frequency of symbiont state *σ*.
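The recursions in Equation (9) can be checked with a short Python sketch. It rests on one reading of the equations, labeled here as an assumption: the offspring takes its mother's cytoplasm, takes her symbiont state with probability *v*, and otherwise takes a symbiont state drawn at random from the population, while the nuclear locus follows Mendelian inheritance under random mating. The function names and starting frequencies are hypothetical and this is not the R code of Online Appendix B.

```python
CYTO, SYM, GENO = "Cc", "Ss", ("AA", "Aa", "aa")
DOSE = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}   # frequency of allele A within a genotype


def maternal_terms(x, k, states, p_A):
    """AA, Aa, aa offspring from mothers with cytoplasm k and the given symbiont states."""
    AA = sum(x[k, m, "AA"] for m in states)
    Aa = sum(x[k, m, "Aa"] for m in states)
    aa = sum(x[k, m, "aa"] for m in states)
    p_a = 1.0 - p_A
    return (p_A * (AA + Aa / 2), p_a * AA + p_A * aa + Aa / 2, p_a * (aa + Aa / 2))


def next_generation(x, v):
    """Offspring frequencies from adult frequencies x[(cyto, symbiont, genotype)]."""
    p_A = sum(f * DOSE[g] for (_, _, g), f in x.items())
    p_sym = {s: sum(f for (_, s2, _), f in x.items() if s2 == s) for s in SYM}
    out = {}
    for k in CYTO:
        everyone = maternal_terms(x, k, SYM, p_A)        # mothers of either state
        for s in SYM:
            own = maternal_terms(x, k, s, p_A)           # mothers sharing state s
            for g, a, b in zip(GENO, own, everyone):
                out[k, s, g] = v * a + (1 - v) * p_sym[s] * b
    return out


def disequilibria(x):
    """Return (cyto-symbiont, nuclear-symbiont, cyto-nuclear) covariances."""
    p_C = sum(f for (k, _, _), f in x.items() if k == "C")
    p_S = sum(f for (_, s, _), f in x.items() if s == "S")
    p_A = sum(f * DOSE[g] for (_, _, g), f in x.items())
    x_CS = sum(f for (k, s, _), f in x.items() if k == "C" and s == "S")
    x_SA = sum(f * DOSE[g] for (_, s, g), f in x.items() if s == "S")
    x_CA = sum(f * DOSE[g] for (k, _, g), f in x.items() if k == "C")
    return x_CS - p_C * p_S, x_SA - p_S * p_A, x_CA - p_C * p_A


# Illustrative 1:1 admixture of the two parental types.
adults = {(k, s, g): 0.0 for k in CYTO for s in SYM for g in GENO}
adults["C", "S", "AA"] = 0.5
adults["c", "s", "aa"] = 0.5
offspring = next_generation(adults, v=0.8)
print(disequilibria(adults), disequilibria(offspring))
```

Under this reading, one generation of random mating multiplies the three covariances by *v*, *v*/2, and 1/2, which matches the factors stated for the offspring disequilibria.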

#### *Epistasis and gene-culture coevolution: cattle as symbionts*

I implemented the Feldman and Cavalli-Sforza (FCS) model as follows (Feldman and Cavalli-Sforza, 1989). The first step is mating and mother-offspring transmission of the symbiote with probability *v* (*U* vs. *u* represent presence or absence of the symbiote), such that the offspring gene-culture distribution is

$$\begin{aligned}
X'_{UAA} &= v\,p_A\left(X_{UAA} + \tfrac{1}{2}X_{UAa}\right) \\
X'_{UAa} &= v\left(p_a X_{UAA} + p_A X_{Uaa} + \tfrac{1}{2}X_{UAa}\right) \\
X'_{Uaa} &= v\,p_a\left(X_{Uaa} + \tfrac{1}{2}X_{UAa}\right) \\
X'_{uAA} &= p_A\left(X_{uAA} + \tfrac{1}{2}X_{uAa}\right) + (1-v)\,p_A\left(X_{UAA} + \tfrac{1}{2}X_{UAa}\right) \\
X'_{uAa} &= p_a X_{uAA} + p_A X_{uaa} + \tfrac{1}{2}X_{uAa} + (1-v)\left(p_a X_{UAA} + p_A X_{Uaa} + \tfrac{1}{2}X_{UAa}\right) \\
X'_{uaa} &= p_a\left(X_{uaa} + \tfrac{1}{2}X_{uAa}\right) + (1-v)\,p_a\left(X_{Uaa} + \tfrac{1}{2}X_{UAa}\right)
\end{aligned} \tag{10}$$

where *p<sub>A</sub>* and *p<sub>a</sub>* are the host allele frequencies. In the second step, following Feldman and Cavalli-Sforza (1989), offspring have a chance of picking up the symbiote via horizontal transmission. That is, each individual with trait *u* has probability *fp<sub>U</sub>* of being converted to trait *U* (*p<sub>U</sub>* is the frequency of the symbiote *U* among other hosts in the population). The third step is selection according to

$$\begin{aligned}
W_{UAA} &= 1 + s_1 & W_{uAA} &= 1 \\
W_{UAa} &= 1 + s_1 & W_{uAa} &= 1 \\
W_{Uaa} &= 1 - s_2 & W_{uaa} &= 1
\end{aligned} \tag{11}$$

R code implementing the FCS model is provided in Online Appendix C.
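For readers without access to the online appendix, the three steps can be sketched in Python as follows. This is an illustrative reading of the description above and is not the R code of Online Appendix C. It assumes that *p<sub>U</sub>* in the horizontal step is the symbiote frequency after vertical transmission, and the starting frequencies are hypothetical; the selection and transmission parameters are among those quoted in the main text.

```python
GENO = ("AA", "Aa", "aa")
DOSE = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}   # frequency of allele A within a genotype


def fcs_generation(x, v, f, s1, s2):
    """One generation of the FCS model; x maps (trait, genotype) to frequency."""
    p_A = sum(freq * DOSE[g] for (_, g), freq in x.items())
    p_a = 1.0 - p_A

    def offspring_of(trait):
        # Genotype distribution of offspring born to mothers with this trait.
        AA, Aa, aa = (x[trait, g] for g in GENO)
        return {"AA": p_A * (AA + Aa / 2),
                "Aa": p_a * AA + p_A * aa + Aa / 2,
                "aa": p_a * (aa + Aa / 2)}

    from_U, from_u = offspring_of("U"), offspring_of("u")
    # Step 1: random mating with maternal transmission at rate v, Equation (10).
    y = {}
    for g in GENO:
        y["U", g] = v * from_U[g]
        y["u", g] = from_u[g] + (1 - v) * from_U[g]
    # Step 2: horizontal transmission converts u to U with probability f * p_U.
    gain = f * sum(y["U", g] for g in GENO)
    for g in GENO:
        moved = gain * y["u", g]
        y["U", g] += moved
        y["u", g] -= moved
    # Step 3: selection according to Equation (11).
    w = {("U", "AA"): 1 + s1, ("U", "Aa"): 1 + s1, ("U", "aa"): 1 - s2,
         ("u", "AA"): 1.0, ("u", "Aa"): 1.0, ("u", "aa"): 1.0}
    w_bar = sum(y[key] * w[key] for key in y)
    return {key: y[key] * w[key] / w_bar for key in y}


# Hypothetical starting point: symbiote and allele A both rare and associated.
x = {("U", "AA"): 0.0, ("U", "Aa"): 0.01, ("U", "aa"): 0.0,
     ("u", "AA"): 0.0, ("u", "Aa"): 0.0, ("u", "aa"): 0.99}
for _ in range(200):
    x = fcs_generation(x, v=0.9, f=0.5, s1=0.05, s2=0.15)
p_U = sum(freq for (trait, _), freq in x.items() if trait == "U")
p_A = sum(freq * DOSE[g] for (_, g), freq in x.items())
print(p_U, p_A)
```

Rerunning the loop over a range of *v* values gives trajectories of symbiote prevalence and allele frequency of the kind plotted in **Figure 5**, although the exact curves depend on the assumed initial conditions.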

#### *Hitchhiking effects*

R code for illustrating hitchhiking effects is provided in Online Appendices D and E. These give examples of commands for iterating the FCS model in a series of stepping stones after secondary contact and keeping track of allele and symbiont frequencies.