# RECENT DISCOVERIES IN EVOLUTIONARY AND GENOMIC MICROBIOLOGY

EDITED BY: Anton G. Kutikhin and Arseniy E. Yuzhalin PUBLISHED IN: Frontiers in Microbiology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2015 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

*All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-617-3 DOI 10.3389/978-2-88919-617-3

## About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

## Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

## Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

## What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

## **RECENT DISCOVERIES IN EVOLUTIONARY AND GENOMIC MICROBIOLOGY**

## Topic Editors:

**Anton G. Kutikhin,** Research Institute for Complex Issues of Cardiovascular Diseases, Russian Federation

**Arseniy E. Yuzhalin,** Cancer Research UK and Medical Research Council Oxford Institute for Radiation Oncology, University of Oxford, UK

Scanning electron microscopy image of *Serratia marcescens*, ×15,000. Copyright: Anton Kutikhin.

This collection presents selected discoveries made in evolutionary and genomic microbiology during the past ten years. We attempted to shed light on topical issues of microbial evolution and microbiome biology. In our view, these articles are of excellent quality and may be helpful both for casual readers and for specialists in the field.

**Citation:** Kutikhin, A. G., Yuzhalin, A. E., eds. (2015). Recent Discoveries in Evolutionary and Genomic Microbiology. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-617-3

# Table of Contents


*Genome-wide gene order distances support clustering the gram-positive bacteria*

Christopher H. House, Matteo Pellegrini and Sorel T. Fitz-Gibbon


Isabel Moreno-Indias, Fernando Cardona, Francisco J. Tinahones and María Isabel Queipo-Ortuño


Milton H. Saier Jr and Zhongge Zhang

**Helicobacter pylori** *DNA methyltransferases and the epigenetic field effect in cancerization*

Ramakrishnan Sitaraman

*Calcifying nanoparticles: one face of distinct entities?*

Anton G. Kutikhin, Arseniy E. Yuzhalin, Vadim V. Borisov, Elena A. Velikanova, Alexey V. Frolov, Vera M. Sakharova, Elena B. Brusina and Alexey S. Golovkin

## Editorial: recent discoveries in evolutionary and genomic microbiology

Anton G. Kutikhin<sup>1</sup> \* and Arseniy E. Yuzhalin<sup>2</sup>

*<sup>1</sup> Laboratory for Genomic Medicine, Division of Experimental and Clinical Cardiology, Research Institute for Complex Issues of Cardiovascular Diseases, Kemerovo, Russia, <sup>2</sup> Cancer Research UK/MRC Oxford Institute for Radiation Oncology, Department of Oncology, University of Oxford, Oxford, UK*

Keywords: human microbiome, gut microbiome, oral microbiome, human ecology, evolution, domains of life, human mycobiome, brain-gut-microbe axis

According to current knowledge, the human body can be viewed as a superorganism composed of human cells and resident microbial communities called microbiota (Ley et al., 2008). Human microbiota has a wide range of physiological functions, playing an important role in digestion, immunity, and production of certain vitamins; however, most of these microorganisms cannot be cultured in laboratory conditions, and until recently this was the main obstacle to determining the composition of the human microbiome (Tlaskalová-Hogenová et al., 2011). Advances in the methods of molecular biology, including the emergence of omics technologies, have made it possible to decipher the qualitative and quantitative composition of the microbiome (Tlaskalová-Hogenová et al., 2011). This breakthrough has also stimulated in-depth investigation of host-microbiota interactions (Tlaskalová-Hogenová et al., 2011). Basic comparative research revealed that each individual has a unique microbiota, whose composition is largely determined during the first years of life but can be altered, reversibly or irreversibly, by a number of factors, such as age, environment, diet, drugs, diseases, and others (Zoetendal et al., 2008).

Therefore, the question of human microbiome biology is extremely intriguing, and a large amount of research has been published in recent years. The investigation of gut, oral, respiratory, skin, vaginal, and urinary microbiomes is attracting increasing interest (Sommer and Bäckhed, 2013; Belkaid and Segre, 2014; Xu and Gunsolley, 2014; van de Wijgert et al., 2014; Rogers et al., 2015; Shreiner et al., 2015; Whiteside et al., 2015). We opened this Research Topic with the idea of providing an overview of discoveries that have become milestones in the field.

We sincerely thank all researchers who have agreed to contribute to our Research Topic. This collection is divided into three sections. The first one includes three articles in which Eric Bapteste, Christopher House, Matteo Pellegrini, and Sorel Fitz-Gibbon discuss the general issues of microbial evolution whilst Arshan Nasir, Patrick Forterre, Kyung Mo Kim, and Gustavo Caetano-Anolles analyze the distribution of viruses and their impact on evolution of organisms.

The second section is devoted to the biology of the human microbiome. We highly recommend the excellent article by Sara Quercia and colleagues, who describe the timescales of human gut microbiota adaptation and discuss gut microbiota plasticity that "was strategic to face changes in lifestyle and dietary habits along the course of the recent evolutionary history, that has driven the passage from Paleolithic hunter-gathering societies to Neolithic agricultural farmers to modern Westernized societies". The article by Noah Voreades, Anne Kozil, and Tiffany Weir extends this topic, focusing primarily on diet as "one of the most pivotal factors in the development of the human gut microbiome from infancy to the elderly". Readers interested in the biology of aging will appreciate the paper by Sitaraman Saraswati and Ramakrishnan Sitaraman on the causative role of gut microbiota in this process. As immune and neuroendocrine systems mature together with gut microbes, we are glad to present a review by Sahar El Aidy, Timothy Dinan, and John Cryan, who comprehensively analyze this topic. Gut microbiota also plays a major role in non-communicable diseases, for example, obesity and type 2 diabetes mellitus, and Isabel Moreno-Indias, Fernando Cardona, Francisco Tinahones, and María Isabel Queipo-Ortuño review this subject. Regarding original research and methods, Eamonn Culligan, Julian Marchesi, Colin Hill, and Roy Sleator demonstrate how the combination of metagenomic and phenomic approaches helps us to identify novel genes within the gut microbiome. Further, Hui Chen and Wen Jiang discuss the role of the oral microbiome in health and disease. Finally, there are several papers on the human mycobiome, and Abhishek Saxena along with Ramakrishnan Sitaraman shed light on aspects of its osmoregulation.

The last section of the collection consists of articles on other topics. Milton Saier and Zhongge Zhang discuss an intriguing principle of directed mutation, Ramakrishnan Sitaraman discusses *Helicobacter pylori* DNA methyltransferases and the epigenetic field effect in cancerization, and Anton Kutikhin with colleagues summarize recent discoveries on calcifying nanoparticles, sometimes referred to as nanobacteria.

We created this Research Topic with the hope that it will be useful for a wide audience, particularly immunologists, microbiologists, graduate and undergraduate students of biomedical faculties, as well as their lecturers.

#### Edited and reviewed by:

*John R. Battista, Louisiana State University and A & M College, USA*

> \*Correspondence: *Anton G. Kutikhin, antonkutikhin@gmail.com*

#### Specialty section:

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology*

Received: *07 March 2015* Accepted: *31 March 2015* Published: *16 April 2015*

#### Citation:

*Kutikhin AG and Yuzhalin AE (2015) Editorial: recent discoveries in evolutionary and genomic microbiology. Front. Microbiol. 6:323. doi: 10.3389/fmicb.2015.00323*

## References

after a decade of molecular characterization? PLoS ONE 9:e105998. doi: 10.1371/journal.pone.0105998


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Kutikhin and Yuzhalin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The origins of microbial adaptations: how introgressive descent, egalitarian evolutionary transitions and expanded kin selection shape the network of life

## *Eric Bapteste<sup>1,2</sup>\**

*<sup>1</sup> UPMC, Institut de Biologie Paris Seine, UMR7138 'Evolution Paris Seine', Paris, France*

*<sup>2</sup> CNRS, Institut de Biologie Paris Seine, UMR7138 'Evolution Paris Seine', Paris, France*

*\*Correspondence: eric.bapteste@snv.jussieu.fr*

#### *Edited and reviewed by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

**Keywords: evolution, tree of life, greenbeard genes, black queen theory, lateral gene transfer, symbiosis, mobile genetic element**

## **MICROBIAL EVOLUTION: A GENEALOGICAL PERSPECTIVE**

Protists, bacteria and archaea (the prokaryotes), and their mobile genetic elements populate the microbial world. This world is ancient (several billion years old), numerically huge (with 5 × 10<sup>30</sup> prokaryotic cells and 10–100 times more viruses!), and genetically extremely diversified. Such a large assemblage cannot be ignored in attempts to understand life's history on Earth. But how can biologists account for its evolution?

For a long time, the notion of descent with modification has offered a promising way to classify organisms and species. It describes a process of vertical inheritance that defines a tree-like genealogical pattern, in which the genetic material, modified by some mutations, is transferred from the genome of a last common ancestor to its direct progeny. Thus, the origin of microbial adaptations can be searched for within lineages, in the changes of genetic material inherited from one ancestor. Yet, such studies are strongly constrained. Non-tree-like evolution, which generates a reticulate evolutionary pattern, cannot be analyzed with a genealogical tree. Moreover, viruses do not all originate from a single last ancestor (Lima-Mendez et al., 2008), nor do they all display obvious genealogical relationships with cellular organisms, which hinders the collective study of mobile genetic elements and cells with a single tree.

Genealogy also plays a central role in explaining the main types of behaviors described in the biological world: selfishness, mutualism, altruism, and spite (West et al., 2006). The evolution of these interactions can be understood by accounting for kinship between protagonists, under the standard assumption that genealogical proximity between individuals entails their genetic proximity (Huneman, 2013). Thus, knowing the relative kinship, the benefit of an interaction for its recipient, and the cost for its actor makes it possible to determine when an individual actually maximizes the reproductive success and survival of its own genes by cooperating with kin in ways that enhance their reproduction, or by preventing distantly related members of a population from reproducing (van Baalen, 2013). Microbiologists are thus inclined to embrace the conceptual framework of kin selection to analyze many forms of cooperation (Diggle et al., 2007).
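The condition described in this paragraph is commonly formalized as Hamilton's rule. The article does not spell it out, so the symbols below are assumptions introduced for illustration: $r$ is the genetic relatedness between actor and recipient, $b$ the reproductive benefit to the recipient, and $c$ the reproductive cost to the actor. Under these assumptions, a cooperative behavior is expected to be favored when

$$
r\,b > c .
$$

Read this way, the greenbeard and black queen cases discussed later in the article are situations in which a genome-wide $r$ is a poor predictor of who cooperates with whom.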

However, it is probably not enough to take genealogical relationships into account to explain the diversity, evolution, and interactions in the microbial world. Too systematic a focus on genealogy may even introduce some biases in the explanations of microbial diversity, evolution and interactions, because many crucial biological phenomena result from processes orthogonal to vertical descent.

## **EXPANDING THE ANALYTICAL FRAMEWORK INSPIRED BY GENEALOGY**

The multiplicity of evolutionary processes, their consequences, and the interactions at play within the microbial world are still under-appreciated. The genealogical perspective grounding evolutionary explanations needs to be complemented, because its analytical framework does not accommodate numerous important biological phenomena, which deeply challenge our background knowledge of the (microbial) world and its evolution.

## **INTROGRESSION: A CLASS OF NON-TREE-LIKE EVOLUTIONARY PROCESSES**

Due to introgressive descent, many adaptations originate from outside rather than from within lineages of vertical descent. While in vertical descent, the genetic material of a particular evolutionary unit is propagated by replication inside its own lineage, in introgressive descent, the genetic material of a particular evolutionary unit propagates into different host structures and is replicated within these host structures (Bapteste et al., 2012). Such host structures are genealogically composite, made of components with distinct genealogical origins. Importantly, introgression is very common in the microbial world, affecting entities from the same or different levels of biological organization. New introgressive mechanisms are constantly discovered (Bapteste, 2013). However, these mechanisms and their actors (viruses, plasmids, conjugative elements, outer membrane vesicles, gene transfer agents, nanotubes, membrane fusion, *...*) are largely missing from the traditional evolutionary representation. For instance, a gene sequence can propagate into another gene sequence, creating a novel composite gene, whose components come from two different gene lineages. Similarly, a gene sequence can propagate within a genome, whose ancestor lacked this gene, producing a composite genome with genes originating from different genomes. Likewise, the genome of a mobile element (a virus, a plasmid, etc.) can propagate into a cell born without this element, creating a composite cell with genetic instruction from multiple sources. Or, a part or an entire microbial genome can propagate within a symbiotic association, producing a holobiont with several unrelated genomes. 
Therefore, the recognition of introgression promotes a substantial expansion of the evolutionary research program: a study of the origins (rather than of the origin) of adaptations and species through the description and analysis of a plurality of processes and objects, some unexpected in the traditional genealogical perspective (Doolittle and Bapteste, 2007; Bapteste, 2013).

## **GENE–GENE INTROGRESSION: MASSIVE GENE REMODELING**

Homology guides comparative analysis in evolutionary biology (Haggerty et al., 2013). Sequences or organs are considered homologous when they evolved in a tree-like fashion from an ancestral form. Thus, gene evolution is often described by a tree with one genealogy per gene family. However, many genes originate from the composition of genetic material from sequences belonging to different gene families. Eukaryotes are the main creators of composite genes (in terms of the proportion of composite genes in their genomes) (Haggerty et al., 2013), yet in terms of absolute numbers, mobile genetic elements operate the most massive gene remodeling on Earth (Jachiet et al., 2013). Therefore, numerous genes display family resemblances (Halary et al., 2013): true similarity caused by introgression between non-homologous sequences. Such family resemblances support the study of the origins of genes and of adaptations at a more global scale than delineated by homology.

## **GENE–GENOME INTROGRESSION: REMARKABLE PANGENOMES**

Not all conspecific individuals carry the same gene families. Only six percent of gene families are found in all 60 strains of *Escherichia coli* (Lukjancenko et al., 2010), and experiments showed that only 61 genes out of 246,065 cannot be transferred into *E. coli* (Sorek et al., 2007). Members of this species (and many others) exploit a large DNA pool, called the pangenome, that is larger than any individual genome. Therefore, sequencing one individual genome does not always suffice to describe genetic and functional diversity at the species level. Pangenomes and lateral gene transfer (by small segments or larger chunks) are not restricted to conspecifics (Nelson-Sathi et al., 2012). These observations challenge phylogenetic systematics: they mean that genome evolution is much more than genome genealogy, an increasingly elusive concept, since genomes prove to be ever more composite and their genes do not all coalesce in a single common ancestral genome.
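To make the pangenome idea concrete, the following minimal Python sketch computes the size of the pangenome and the fraction of gene families shared by every strain. It is an illustration only: the function name `pangenome_summary`, the input layout (a mapping from strain name to a set of gene family identifiers), and the toy data are assumptions, not the procedure used in the cited studies.

```python
from collections import Counter


def pangenome_summary(strains):
    """Summarize a toy pangenome.

    `strains` maps a strain name to the set of gene families it carries
    (hypothetical input format chosen for this sketch).
    """
    # Count in how many strains each gene family occurs.
    occurrences = Counter(
        family for families in strains.values() for family in set(families)
    )
    n_strains = len(strains)
    pan = set(occurrences)  # every family seen in at least one strain
    core = {f for f, n in occurrences.items() if n == n_strains}  # seen in all
    return {
        "pan_size": len(pan),
        "core_size": len(core),
        "core_fraction": len(core) / len(pan) if pan else 0.0,
    }


# Toy example with three hypothetical strains.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam3", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
summary = pangenome_summary(toy)
print(summary)
```

In this toy data set only `fam1` is present in every strain, so the core is one family out of five. A small core fraction of this kind is what the six percent figure for *E. coli* expresses at a much larger scale.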

## **FROM THE MOBILOME NETWORK TO THE SOCIAL NETWORK OF LIFE**

Many classes of evolutionary objects (e.g., viruses and plasmids) have fuzzy borders, because many of these objects do not evolve independently at the genetic level. Remarkably, introgression creates novel introgressive mechanisms. Numerous genealogically mosaic mobile elements (autonomous or not: polintons, virophages, R391, phasmids, phage-inducible chromosomal islands, transpovirons, etc.) emerge and evolve through the sharing of mobility functions, defining a genetic pool: the pangenome of mobile elements, which reveals a network of shared genes between these elements (Yutin et al., 2013). This network belongs to a larger one: the social network of life, whose edges describe an important biological structure, "what shares genetic material with what," without prejudice about the processes involved in this sharing (in part vertical descent, but also introgression, since these sequences can be used as common goods by more than one lineage; Halary et al., 2010; McInerney et al., 2011). In this latter network, not all entities are genealogically related, but this does not imply their a priori exclusion from the model. Thanks to its diversity of edges and nodes, the social network of life is more inclusive than the tree of life, which is supposed to be universal but is in fact restricted to one type of relationship within one fraction of biological diversity (Halary et al., 2010).
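A network of "what shares genetic material with what" can be sketched in a few lines of Python. The example below is a simplified illustration, not the method of the cited studies: the function name `sharing_network`, the representation of each entity as a set of gene family identifiers, and the toy entities are all assumptions made for this sketch.

```python
from itertools import combinations


def sharing_network(entities):
    """Build weighted edges between entities that share gene families.

    `entities` maps an entity name (cell, plasmid, virus, ...) to the set of
    gene families it carries. An edge links two entities that share at least
    one family; its weight is the number of shared families. No assumption is
    made about how the sharing arose (vertical descent or introgression).
    """
    edges = {}
    for u, v in combinations(sorted(entities), 2):
        shared = entities[u] & entities[v]
        if shared:
            edges[(u, v)] = len(shared)
    return edges


# Hypothetical toy data: two cells, a plasmid and a virus.
toy = {
    "cell_A": {"g1", "g2"},
    "plasmid_P": {"g2", "g3"},
    "virus_V": {"g3"},
    "cell_B": {"g9"},
}
edges = sharing_network(toy)
print(edges)
```

Here `cell_A` and `virus_V` share no gene family directly, yet they belong to the same connected component through `plasmid_P`. A single genealogical tree cannot represent this kind of indirect relationship, whereas `cell_B`, which shares nothing, simply remains an isolated node rather than being excluded from the model.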

## **THE CHALLENGING MICROBIAL SOCIAL LIFE**

Microbial social life is hard to explain within the framework of kin selection without (at least) deeply expanding this theory. How do bacteria manage to identify their kin and cooperate? In principle, greenbeard genes provide a way to detect other organisms carrying these genes, with which an individual can act cooperatively. However, experimental transfers of the first characterized single prokaryotic greenbeard gene between strains and species of myxobacteria (*M. xanthus* and *M. fulvus*) predictably reprogram their social interactions. For example, when an isogenic *M. xanthus* strain expresses an *M. fulvus traA* allele, both become efficacious partners. Moreover, strains constructed with two alleles of *traA* cooperate with a broadened range of partners (Pathak et al., 2013). Consequently, the notion of a microbial greenbeard gene departs from classical kin selection: the cooperative behavior targets other individuals harboring the same allele, whatever their global genetic proximity. Lateral gene transfer partly uncouples gene and genome evolution, which makes it difficult to conceive of a standard application of kinship within bacterial populations, since bacteria may be similar for some genes without being similar for all. In addition, the transfer of greenbeard genes can induce cooperation between relatively different microbes. Therefore, cooperation between distantly related individuals must be more broadly theorized (Huneman, 2013). The black queen theory provides a good instance of such an explanation, in which genealogical relationships between protagonists play no role (Morris et al., 2012; Sachs and Hollowell, 2012). This theory would explain why a minority of organisms (*Synechococcus* harboring the *katG* gene) is sufficient to reduce hydrogen peroxide (HOOH) in ocean surface waters to a level that allows the dominant types (*Prochlorococcus* and *Candidatus Pelagibacter ubique*, which lost this gene) to thrive.

Considerations on the evolution of social life are fundamental: our inability to grow the vast majority of microorganisms in pure cultures (Staley and Konopka, 1985) may largely come from our too limited knowledge of this topic. Furthermore, our general knowledge in evolutionary microbiology mostly rests on analyses of the rare microbes able to grow in pure cultures. If these organisms are not representative of most of the microbial world, inferences based on a part of this world (e.g., the cultivable microbes) could be mistaken for general conclusions, which one hopes to be relevant for the whole microbial world. Yet, discoveries such as the Pandoraviruses remind us that much that is unknown lives outside our Petri dishes (Philippe et al., 2013). Importantly, alleviating some constraints inspired by the genealogical focus is one way to better see the whole rather than the parts. Typically, sequence comparison free from the constraints of multiple alignment and of a tree-based representation of sequence similarities hints at highly divergent environmental gene forms and lineages, not yet reported in the microbial world (Lynch et al., 2012).

## **EGALITARIAN EVOLUTIONARY TRANSITIONS AND SYSTEMS WITH MICROBIAL COMPONENTS**

The three steps of evolutionary transitions are the association of entities, their stabilization, and their transformation (after which entities originally able to reproduce independently are only able to reproduce as part of a larger whole). They result either in fraternal transitions (which can be explained by traditional kin selection), when higher level units emerge from genealogically like components, or in egalitarian transitions, when higher level units emerge from genealogically different components (Huneman, 2013). These latter transitions are common in the microbiological literature, e.g., *Parakaryon myojinensis* (Yamaguchi et al., 2012), the origins of eukaryotes (Alvarez-Ponce et al., 2013), mutualistic viruses and even virophages, seen as components of larger systems (Espagne et al., 2004; Fischer and Suttle, 2011; Roossinck, 2011). The notion of egalitarian evolutionary transition lends credit to the proposal that some elements of microbiomes deserve to be considered as new organs, providing novel physiologies, despite their genealogically different origin from that of the host of these organs (Stahl and Davidson, 2006). The idea that the microbial component influences the physiology and the behavior of the other components of such systems is becoming increasingly popular (Hoover et al., 2011; McFall-Ngai et al., 2013). For example, the study of the mechanisms of the microbial-brain axis testifies to the fundamental relevance, in terms of adaptations, of the interactions between the microbial and macrobial worlds (Collins et al., 2012).

Consequently, many evolutionary objects previously studied as if they were genealogically cohesive organisms or species would rather constitute genealogically heterogeneous systems, composed in part of microbes or viruses. This microbial contribution to microbe–microbe and microbe–macrobe systems seems a general rule rather than an exception, when one considers the age, abundance, and ubiquity of these minute entities on the planet. This type of discovery raises a novel fundamental issue: how to model the evolution of systems (and their possible physiological, ecological, and developmental impacts during Earth history), which brings microbial evolutionists very far from the usual reconstruction of a genealogical tree.

## **CONCLUSION**

Genealogical tree-thinking should at least be complemented by other perspectives (Doolittle and Bapteste, 2007; Bapteste et al., 2012, 2013). For example, networks can be used to adapt current models to the data rather than forcing the data to fit within pre-existing genealogically constrained models, which were designed to analyze objects and processes far less complex than those affecting the microbial world (Bapteste, 2013).

## **ACKNOWLEDGMENTS**

The author thanks Dr. Philippe Lopez for critical reading of the MS, and F. Bouchard, J. O. McInerney, T. Pradeu, R. Burian, J. Dupré, P. A. Jachiet, C. Bicep, R. Méheust, L. Bittner, and M. Grenié for critical discussions.

## **REFERENCES**


Evolutionary analyses of non-genealogical bonds produced by introgressive descent. *Proc. Natl. Acad. Sci. U.S.A.* 109, 18266–18272. doi: 10.1073/pnas.1206541109


Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. *Science* 341, 281–286. doi: 10.1126/science.1239181


*Received: 22 January 2014; accepted: 16 February 2014; published online: 04 March 2014.*

*Citation: Bapteste E (2014) The origins of microbial adaptations: how introgressive descent, egalitarian evolutionary transitions and expanded kin selection shape the network of life. Front. Microbiol. 5:83. doi: 10.3389/fmicb.2014.00083*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Bapteste. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Genome-wide gene order distances support clustering the gram-positive bacteria

#### *Christopher H. House<sup>1</sup>\*, Matteo Pellegrini<sup>2,3</sup> and Sorel T. Fitz-Gibbon<sup>2,3</sup>*

*<sup>1</sup> Penn State Astrobiology Research Center and Department of Geosciences, The Pennsylvania State University, University Park, PA, USA*

*<sup>2</sup> Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA*

*<sup>3</sup> Department of Molecular, Cell, and Developmental Biology, Institute of Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, USA*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases Under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Russell F. Doolittle, University of California, San Diego, USA*

*Elena Brusina, Kemerovo State Medical Academy, Russia*

#### *\*Correspondence:*

*Christopher H. House, Penn State Astrobiology Research Center and Department of Geosciences, The Pennsylvania State University, 220 Deike Building, University Park, 16802 PA, USA e-mail: chrishouse@psu.edu*

Initially using 143 genomes, we developed a method for calculating the pair-wise distance between prokaryotic genomes using a Monte Carlo method to estimate the conservation of gene order. The method was based on repeatedly selecting five or six non-adjacent random orthologs from each of two genomes and determining if the chosen orthologs were in the same order. The raw distances were then corrected for gene order convergence using an adaptation of the Jukes-Cantor model, as well as using the common distance correction *D*′ = −ln(1 − *D*). First, we compared the distances found via the order of six orthologs to distances found based on ortholog gene content and small subunit rRNA sequences. The Jukes-Cantor gene order distances are reasonably well correlated with the divergence of rRNA (*R*<sup>2</sup> = 0.24), especially at rRNA Jukes-Cantor distances of less than 0.2 (*R*<sup>2</sup> = 0.52). Gene content is only weakly correlated with rRNA divergence (*R*<sup>2</sup> = 0.04) over all distances; however, it is strongly correlated at rRNA Jukes-Cantor distances of less than 0.1 (*R*<sup>2</sup> = 0.67). This initial work suggests that gene order may be useful in conjunction with other methods to help understand the relatedness of genomes. Using the gene order distances in 143 genomes, the relations of prokaryotes were studied using neighbor joining and agreement subtrees. We then repeated our study of the relations of prokaryotes using gene order in 172 complete genomes better representing a wider diversity of prokaryotes. Consistently, our trees show the Actinobacteria as a sister group to the bulk of the Firmicutes. In fact, the robustness of gene order support was found to be considerably greater for uniting these two phyla than for uniting any of the proteobacterial classes together. The results support the idea that Actinobacteria and Firmicutes are closely related, which in turn implies a single origin for the gram-positive cell.

**Keywords: tree of life, gene order, evolutionary distance, genomics, Actinobacteria, Firmicutes, Archaea**

## **INTRODUCTION**

For the past three decades, the comparisons of ribosomal RNA (rRNA) between microorganisms have largely provided the taxonomic and phylogenetic basis for bacteriology (Woese, 1987). During the past 15 years, however, considerable effort has been placed on comparing the similarity of organisms with genomewide methods or, at least, with methods that use more than a single gene. These methods include the estimation of genomic distances based on the content of genomes, either orthologs, homologs, folds, or protein domains (Gerstein, 1998; Fitz-Gibbon and House, 1999; Snel et al., 1999; Tekaia et al., 1999; Wolf et al., 2002; Deeds et al., 2005; Yang et al., 2005; House, 2009). Genomic distance has also been estimated using direct genome-to-genome sequence comparisons using a variety of approaches like average nucleotide identity (ANI) and the genome-to-genome-distance calculator (GGDC) that can approximate traditional DNA-DNA hybridization results (Konstantinidis and Tiedje, 2005; Goris et al., 2007; Deloger et al., 2009; Richter and Rosselló-Móra, 2009; Auch et al., 2010; Tamura et al., 2012; Meier-Kolthoff et al., 2013). Also, ever since Nadeau and Taylor (1984) first identified that gene order information was conserved between humans and mice, there has been growing interest in using gene order to estimate the difference between genomes or to solve phylogenetic problems.

Several gene order methods depend on the presence of orthologs adjacent to each other. Watterson et al. (1982) introduced the *breakpoint distance* between genomes, which is the number of orthologs found paired together in one genome but separated in the other (Blanchette et al., 1999). Early on, Sankoff et al. (1992) estimated mitochondrial gene rearrangements as a means to derive a phylogenetic tree for Eukaryotes. Subsequently, the presence and absence of paired genes has been used to construct trees (Wolf et al., 2001; Korbel et al., 2002) as a gene order method similar in practice to tree building by gene content. A limitation to this approach results from the fact that small groups of laterally transferred genes will be paired after their transfer. Also, a computational method for testing phylogenetic problems using gene order has been presented by Kunisawa (2001). In this method, genomes are searched for cases in which the arrangement of three genes most parsimoniously suggests that a single transposition has occurred. With the use of an outgroup, the method can be used to test phylogenetic hypotheses, such as the branching order within the Proteobacteria (Kunisawa, 2001) or Gram-positive bacteria (Kunisawa, 2003). The strength of this method is that it can be efficiently applied to a large dataset of genomes and that it reveals (a small number of) interesting cases of transposition. Another gene order approach often implemented is calculating the inversion distance. The *inversion distance* is the minimum possible number of inversions needed to transform one genome into the other (Moret et al., 2001). Recently, Belda et al. (2005) have studied a subset of 244 genes universal to the genomes of 30 γ-Proteobacteria using both the breakpoint distance and the inversion distance. They found the two distances highly correlated, suggesting that inversion was the main method of genome rearrangement for these taxa. More recently, models for genome evolution that include rearrangements, duplications, and losses have been developed and tested, and several groups (Swenson et al., 2008; Zhang et al., 2010; Hu et al., 2011; Lin and Moret, 2011; Shao et al., 2013) have each developed algorithms for using gene order for phylogenetic reconstruction. Furthermore, Lin et al. (2013) and Shifman et al. (2014) have used genomewide gene order to produce phylogenetic trees. The latter work produced a tree of 89 diverse microbial genomes using an algorithm for estimating average genome synteny (Shifman et al., 2014).

In this study, we aimed to develop a simple computational method that could estimate a genome-wide gene order distance between two genomes (even when the genomes were highly diverged). Unlike many previous efforts, our intent was to have the gene order distance not rely on genes that are likely to be in the same operon (such as gene pairs). Here, we present a novel simple Monte Carlo method for estimating *distributed gene order distances* between genomes. In this method, we repeatedly randomly select six non-adjacent orthologs from each of two genomes and determine if the genes are in the same order. The distances are then corrected using an adaptation of the Jukes-Cantor model to account for random gene order convergence.
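The sampling step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each ortholog pair is given as a hypothetical `(position_in_A, position_in_B)` tuple of gene indices, that both genomes are circular, that "same order" is judged up to rotation and reflection of the circle, and that "non-adjacent" means more than `exclusion` gene positions apart (the default of 5 is an assumption based on the exclusion zone mentioned in the Results).

```python
import random


def canonical_cycle(labels):
    """Rotation- and reflection-invariant form of a circular order."""
    candidates = []
    for seq in (list(labels), list(reversed(labels))):
        start = seq.index(min(seq))
        candidates.append(tuple(seq[start:] + seq[:start]))
    return min(candidates)


def circular_gap(p, q, genome_len):
    """Number of gene positions separating p and q on a circular genome."""
    d = abs(p - q) % genome_len
    return min(d, genome_len - d)


def pick_is_conserved(picks):
    """True if the picked ortholog pairs share one circular order in both genomes."""
    order_a = sorted(range(len(picks)), key=lambda i: picks[i][0])
    order_b = sorted(range(len(picks)), key=lambda i: picks[i][1])
    return canonical_cycle(order_a) == canonical_cycle(order_b)


def raw_gene_order_distance(pairs, len_a, len_b, iterations=10000,
                            k=6, exclusion=5, seed=0):
    """Fraction of random k-ortholog picks that are NOT in the same order."""
    rng = random.Random(seed)
    misses = 0
    done = 0
    attempts = 0
    while done < iterations:
        attempts += 1
        if attempts > 100 * iterations:
            raise ValueError("too few non-adjacent ortholog pairs")
        picks = rng.sample(pairs, k)
        spaced = all(
            circular_gap(x[0], y[0], len_a) > exclusion
            and circular_gap(x[1], y[1], len_b) > exclusion
            for i, x in enumerate(picks) for y in picks[i + 1:]
        )
        if not spaced:
            continue  # reject picks that fall inside the exclusion zone
        done += 1
        misses += not pick_is_conserved(picks)
    return misses / iterations
```

Under these assumptions, two genomes that differ only by a rotation or a whole-genome reversal of the gene map give a raw distance of 0, while a fully scrambled pair gives a raw distance close to 59/60.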

## **MATERIALS AND METHODS**

Initially, 143 prokaryotic genomes were analyzed (**Table 1**). This represented completed prokaryotic genomes available when the study began in January 2005. All genes from each genome were analyzed as queries using BLAST against each of the other genomes. Ortholog-pairs were identified as cases where two genes from different genomes were each other's BLAST best hit (top hit in both directions). This list of ortholog pairs served as the basis for both calculation of distributed gene order distances and the ortholog gene content distances. As defined by Snel et al. (1999), ortholog gene content similarity (S) was calculated as the number of ortholog pairs found for two genomes divided by the size of the smaller genome. This similarity was then converted to distance as equal to –ln(S), as suggested by Korbel et al. (2002). However, using distance equal to 1-S gives similar correlation results.
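As an illustration of the ortholog and gene content definitions above, the following Python sketch identifies reciprocal best hits and converts the resulting count to a gene content distance. The dictionaries `best_a_to_b` and `best_b_to_a` are hypothetical inputs holding each gene's top BLAST hit in the other genome; parsing BLAST output is outside the scope of the sketch.

```python
import math


def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return (gene_a, gene_b) pairs that are each other's best hit."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


def gene_content_distance(n_orthologs, size_a, size_b):
    """Distance -ln(S), with S = ortholog pairs / size of the smaller genome."""
    similarity = n_orthologs / min(size_a, size_b)
    if similarity == 0:
        return math.inf  # no shared orthologs: distance is unbounded
    return -math.log(similarity)
```

For example, two genomes of 4 and 5 genes sharing 2 ortholog pairs have S = 0.5 and a distance of ln 2.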

Distributed gene order distances were determined using a novel Monte Carlo approach (**Figure 1**). To determine the gene order distance between two genomes, first, six ortholog-pairs

#### **Table 1 | 143 taxa.**



100 lists of distances for use as bootstrap replicates (nexus files for PAUP are available in Supplementary Material).

Recently diverged genomes begin with close to 100% of their genes arranged in the same order, and with time, the synteny between the genomes decreases. Because there are only 60 different ways to arrange six items on a circle, there is a 1/60 probability of two genomes sharing an arrangement of six orthologs by chance. Therefore, the fraction of six ortholog picks found to be in the same order will ultimately approach 1/60 as divergence time goes to infinity. We, therefore, developed a model of gene order evolution based on the Jukes-Cantor concept that divergence is a logarithmic function with time (Jukes and Cantor, 1969).
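The count of 60 arrangements can be checked by brute force. The short Python sketch below enumerates all orderings of *n* items and groups those that are equivalent under rotation or reflection of the circle, which is the equivalence assumed here because it reproduces (6 − 1)!/2 = 60.

```python
from itertools import permutations


def canonical_cycle(labels):
    """Rotation- and reflection-invariant form of a circular order."""
    candidates = []
    for seq in (list(labels), list(reversed(labels))):
        start = seq.index(min(seq))
        candidates.append(tuple(seq[start:] + seq[:start]))
    return min(candidates)


def count_circular_orders(n_items):
    """Number of distinct arrangements of n_items on an unoriented circle."""
    return len({canonical_cycle(p) for p in permutations(range(n_items))})


print(count_circular_orders(6))  # 60
```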

The typical Jukes-Cantor correction (Kimura and Ohta, 1972) for nucleotide distance is:

$$D_{\rm JC} = -(3/4)\ln\left(1 - (4/3)D\right) \tag{1}$$

where *D* = the observed fraction of nucleotides found to be different between two compared genes.

This classical nucleotide Jukes-Cantor correction (Equation 1) accounts for back substitution and is based on a model in which the outcome of any nucleotide substitution can be one of three possibilities. To adapt this logic to gene rearrangements, in which a set of six orthologs can change to any of 59 alternative arrangements, the Jukes-Cantor equation becomes:

$$D_{\rm JC} = -(59/60)\ln\left(1 - (60/59)D\right) \tag{2}$$

where *D* = the fraction of iterations in which the six ortholog-pairs chosen are not in the same order.

The classical Jukes-Cantor nucleotide correction (Equation 1) can only be used for raw D below 0.75. With raw nucleotide distances of 0.75 or greater, the argument of the logarithm is zero or negative, and the correction is undefined. To use data in which D is larger than 0.75, Tajima (1993) presented a method using a Taylor series expansion to avoid the logarithm. In our case, Equation (2) fails whenever the raw D reaches or exceeds 59/60 (or 0.983). To allow corrections for all of our genome pair distances, we have adopted the method of Tajima (1993) as follows:

$$D_{\rm JC} = \sum_{i=1}^{k} \frac{k^{(i)}}{i\,(59/60)^{i-1}\, n^{(i)}} \tag{3}$$

where *k*<sup>(*i*)</sup> = *k*!/(*k* − *i*)!, *n*<sup>(*i*)</sup> = *n*!/(*n* − *i*)!, *k* = the number of times the six orthologs are not in the same order, and *n* = the number of iterations used.
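The two corrections can be compared numerically with a short Python sketch. The function names are illustrative and not from the original work. The logarithmic form is only defined for raw distances below 59/60, whereas the series form accepts any count *k* ≤ *n*. The series is summed term by term, so its cost grows linearly with *k*.

```python
import math

B = 59.0 / 60.0  # probability that two random six-ortholog picks differ


def jc_gene_order(d_raw, b=B):
    """Logarithmic correction: -b * ln(1 - d_raw / b)."""
    if d_raw >= b:
        raise ValueError("raw distance must be below b for the log form")
    return -b * math.log(1.0 - d_raw / b)


def tajima_gene_order(k, n, b=B):
    """Series correction: sum over i of k^(i) / (i * b^(i-1) * n^(i))."""
    if not 0 <= k <= n or n <= 0:
        raise ValueError("need 0 <= k <= n and n > 0")
    total = 0.0
    core = k / n  # k^(i) / (n^(i) * b^(i-1)) for i = 1
    for i in range(1, k + 1):
        total += core / i
        if core == 0.0:
            break  # remaining terms have underflowed to zero
        if i < k:
            core *= (k - i) / ((n - i) * b)
    return total
```

For a raw distance well below 59/60 the two functions agree closely (for example, *k* = 100 and *n* = 1000 versus a raw distance of 0.1), while for *k* = *n* the series returns a very large value, consistent with the long tail of corrected distances described in the Results.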

Partial reanalysis of the work reported here demonstrated that the results are similar when applying *D*′ = −ln(1 − *D*) as the distance correction rather than the Tajima correction (data not shown). Further work evaluating this measure of gene order distance is warranted, as it is computationally much less intensive.

For comparison, Jukes-Cantor corrected rRNA distances were downloaded in spring 2006 from the ribosomal database (Cole et al., 2007). The correlations between distributed gene order, gene content, and rRNA distances were calculated with SPSS 13 (SPSS, Inc. Chicago, IL) for Mac OS X. Taxonomic assignments for taxa were from the NCBI taxonomic server (Bischoff et al., 2007).

Our follow-up analysis used 172 complete genomes with the aim of being a representative sample of prokaryotes. For this follow-on analysis, initiated early in 2014, we used ortholog predictions from the OMA website (Dessimoz et al., 2005). This OMA database is continually updated and includes all chromosomes for each microorganism. The updated analysis here of 172 taxa was done with orthologs downloaded in early 2014. In this case, we also tried searching for five orthologs in the same order rather than six using the same equations, which naturally produces slightly shorter distances overall. In fact, the five gene distances used in this last analysis are functionally the same as those from the easier to calculate *D*′ = −ln(1 − *D*). Based on the promising results here, we recommend this simpler distance calculation for future work.

Neighbor Joining (NJ) trees (Saitou and Nei, 1987) were created from data matrices using PAUP 4.0b (Mac and Unix versions; Sinauer Associates, Sunderland, MA). Later, an agreement subtree analysis, which identifies the largest possible pruned tree that is consistent within a set of trees, was used to limit the taxa list in order to minimize possible adverse effects of including genome pairs with very little or no gene order conservation. The agreement subtrees were identified using PAUP 4.0b (Mac) based on a comparison of all of our NJ trees produced from the 100 replicate distances.
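For readers unfamiliar with neighbor joining, the following Python sketch shows the topology-building step of the Saitou and Nei (1987) algorithm on a small distance matrix. It is a teaching sketch only: it is not PAUP, it omits branch lengths and bootstrapping, and the input format (a dictionary keyed by `frozenset` pairs of taxon names) is an assumption made for brevity.

```python
def neighbor_joining(names, dist):
    """Return an unrooted NJ topology as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Pick the pair minimizing the NJ criterion (ties broken by index).
        _, i, j = min(
            ((n - 2) * get(a, b) - r[a] - r[b], i, j)
            for i, a in enumerate(nodes)
            for j, b in enumerate(nodes)
            if i < j
        )
        a, b = nodes[i], nodes[j]
        joined = (a, b)
        # Distances from the new internal node to every remaining node.
        for c in nodes:
            if c != a and c != b:
                d[frozenset((joined, c))] = 0.5 * (
                    get(a, c) + get(b, c) - get(a, b)
                )
        nodes = [c for c in nodes if c != a and c != b] + [joined]
    return tuple(nodes)
```

Given additive distances from a four-taxon tree in which A pairs with B and C pairs with D, the function returns a topology containing the cluster `('A', 'B')`.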

We also tried using a hierarchical and iterative approach to produce a series of trees (**Table 2**). This novel method was based on the fact that shorter distances are known with higher confidence than greater distances. The goal of this method of tree building is to provide a systematic and objective way to build a tree that includes as many of the pair wise gene order distances as possible without letting very distant (random) pairs adversely influence the observed phylogenetic positions of the more closely related taxa. We started with a list of genome pairs ranked from shortest to largest gene order distance (available in Supplemental Data). Starting at the top of the list, we moved down the list adding each pair to our working group until enough pair wise distances were included to allow for one or more NJ trees to be built.

#### **Table 2 | Steps used in hierarchical tree building.**


This process was continued until we had an exhaustive ranked list of possible unrooted NJ trees starting with the top few very closely related taxa and ending ultimately with a NJ tree of all 143 taxa. Moving down the ranked tree list, we evaluated each tree. A tree (unrooted) was rejected if it was found to be incongruent with an earlier unrooted tree. Congruent trees were pared down in number by removing trees that were fully encompassed by another tree and by combining pairs of compatible trees. Trees were combined by building a new NJ tree with the union set of taxa from the two original trees. The trees were only considered compatible for combining if the process did not cause a disruption of either of the original backbone topologies. For each kept tree, we recorded both the rank of the taxa pair that resulted in its initial formation, and the rank of the last taxa pair added. The largest resulting tree (with 37 taxa) was selected for further study. Additional taxa were added using a process of single taxon addition. In this second round of analysis, moving down the ranked list of genome pairs, we attempted to sequentially add additional taxa to the tree. If the addition of the single taxon disrupted the existing NJ topology, then the taxon was not added.

## **RESULTS AND DISCUSSION**

## **INITIAL TEST OF GENE ORDER AS AN EVOLUTIONARY DISTANCE**

The distribution of raw gene order distances for each of the 10,153 genome pairs for our 143 genomes is plotted in **Figure 2A** (and available in Supplementary Material). As expected, with raw gene order distance of 0 (or near 0), the two genomes for *Chlamydia trachomatis*, and separately the four genomes for *Chlamydophila pneumoniae*, define the far left of the distribution. The bulk of the genome pairs, however, show raw gene order distances of greater than 0.9 with a peak near, but below, the value expected randomly (0.983). 82% of the genome pairs have gene order distances below 0.983. **Figure 2B** shows the same data after an adapted Jukes-Cantor model correction (Equation 2). Using this logarithm-based correction, the gene order distances show a relatively normal distribution with a mean of 7.49 (*SD* = 1.68). This correction, however, is not possible for raw gene order distances larger than 0.983, and so, such divergent data are missing from **Figure 2B**. **Figure 2C** shows a fuller dataset of gene order distances corrected using the method adopted from Tajima (Equation 3). In this case, a very long tail of very large gene order distances is apparent. This tail is caused by large corrections being applied to some dissimilar genome-pairs.


**FIGURE 3 | Comparison of "Jukes-Cantor" distributed gene order distances with ortholog gene content and Jukes-Cantor rRNA distances.** Select genome pairs have been labeled. **(A)** Gene order distance plotted as a function of rRNA distance. Solid line is linear regression of all data (*R*<sup>2</sup> = 0.24). Dashed line is a linear regression for genome pairs with rRNA distances <0.2 (*R*<sup>2</sup> = 0.52). **(B)** Gene content distance plotted as a function of rRNA distance. Solid line is linear regression of all data (*R*<sup>2</sup> = 0.04). Dashed line is linear regression for genome pairs with rRNA distances <0.1 (*R*<sup>2</sup> = 0.67). **(C)** Gene content distance plotted as a function of gene order distance. Solid line is linear regression of all data (*R*<sup>2</sup> = 0.22).

After calculating corrected gene order distances for each genome pair, we compared these values with other measures of genome distance, Jukes-Cantor corrected rRNA distances and logarithmic gene content distances (data used are available in Supplementary Material). **Figure 3A** shows a strong correlation between the "Jukes-Cantor" corrected gene order distances and the Jukes-Cantor rRNA distances (*R*<sup>2</sup> = 0.24), especially at rRNA distances shorter than 0.2 (*R*<sup>2</sup> = 0.52). Gene content distances show much less significant correlations with rRNA distance (**Figure 3B**; *R*<sup>2</sup> = 0.04), and are actually much more strongly correlated with gene order (**Figure 3C**; *R*<sup>2</sup> = 0.22). However, a very strong correlation between gene content and Jukes-Cantor rRNA distance is apparent at rRNA distances shorter than 0.1 (*R*<sup>2</sup> = 0.67). Apparent in **Figure 3B**, there is a cluster of genome pairs with similar gene content (low gene content distance) but divergent rRNA sequences (rRNA distances between 0.1 and 0.3 and gene content distances between 0.0 and 0.2). This population of genome pairs consists of very small genomes in which extreme genome reduction has occurred. In cases of such extensive genome reduction, the proportion of orthologs shared between two genomes is high even when rRNA sequences indicate that the genomes are relatively distant. With an updated list of genomes, we found our Jukes-Cantor gene order distances were correlated with the divergence of conserved protein genes (*R*<sup>2</sup> = 0.13 overall and *R*<sup>2</sup> = 0.23 for cases with less than one amino acid substitution calculated per site). For this comparison, we identified 87 taxa present in both our data (**Table 3**) and those of Lang et al. (2013), and we extracted the aligned amino acid sequences from their published alignment of 841 sequences. Using MEGA5, we computed pairwise distances under the Dayhoff matrix-based model (Schwartz and Dayhoff, 1978).
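The *R*<sup>2</sup> values quoted above come from simple linear regressions, computed over all genome pairs and again over the subset below an rRNA distance cutoff. The Python sketch below shows one way to reproduce that kind of calculation. The original analysis used SPSS, and the function names and the `(rrna_distance, other_distance)` input format here are illustrative assumptions.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def r_squared_below(pairs, x_max):
    """R^2 restricted to genome pairs whose x value (rRNA distance) is < x_max."""
    subset = [(x, y) for x, y in pairs if x < x_max]
    xs, ys = zip(*subset)
    return r_squared(xs, ys)
```

Calling `r_squared_below(pairs, 0.2)` on a list of (rRNA distance, gene order distance) tuples would correspond to the dashed-line regression in panel (A) of Figure 3, under the assumptions stated above.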

In general, these various analyses have shown that the distributed gene order distances are better correlated with sequence-based distances (both rRNA and conserved proteins) than are gene content distances. Assuming rRNA and protein distances are useful approximations of evolutionary divergence, our results suggest that gene order distance may be useful for taxonomy or phylogenetics in a similar way that genome-to-genome sequence comparison has proven useful as a genomic measure to replace laboratory DNA-DNA hybridization. Both gene content and gene order have their specific issues, and so we would not recommend that either genomic method is ever considered the best, but rather that each be used to help inform more traditional molecular sequence analyses when these methods show signal and can be reasonably interpreted. It is also important to note that horizontal gene transfers likely bring in orthologs in the same order, and so gene order does not necessarily avoid complications of integration arising from such transfers, if they are on a large scale.

## **GENE ORDER TREE BUILDING STARTING FROM 143 TAXA USING NEIGHBOR JOINING**

Encouraged by the strong correlation between our distributed gene order distances and those of rRNA, we proceeded to build gene order-based neighbor joining (NJ) trees. **Figure 4** shows a phylogram based on all 143 taxa, although unresolved single taxa are not shown for clarity. Similarly, **Figure 5** shows the bootstrap NJ tree when all taxa are included. The results show that major bacterial taxonomic groups are mostly correctly clustered. With the largest number of resolved taxa, the γ-Proteobacteria show the best resolution, and have a branching order reasonably consistent with published reports (House et al., 2003; Brown and Volker, 2004; Belda et al., 2005). In contrast, Archaea are almost completely unresolved, with only generic-level similarity yielding clustering (e.g., *Pyrococcus*, *Methanosarcina*, *Sulfolobus*, and *Thermoplasma*). Interestingly, the Actinobacteria are resolved with high confidence as a sister group to the bulk of the Firmicutes (including *Bacillus*, *Clostridium*, *Lactobacillus*, *Listeria*, *Oceanobacillus*, *Staphylococcus*, and *Thermoanaerobacter*). In contrast, our bootstrap tree (**Figure 5**) does not cluster any of the proteobacterial classes together. Finding the Actinobacteria and Firmicutes united is interesting because they are the two phyla that comprise the "gram-positive bacteria." While it has long been considered likely that the gram-positive bacteria are a monophyletic group, it has been to date remarkably hard to find supportive molecular sequence data, genetic or genomic (De Rijk et al., 1995; Olsen, 2001; Fu and Fu-Liu, 2002; Deeds et al., 2005).

#### **Table 3 | 172 taxa.**

Next, we tested if the small phylogenetic signal we found with gene order distance was due to the occasional sampling of ribosomal operons, despite the 5 gene exclusion zone. A detailed look at 100,000 randomly sampled gene sets revealed that sets with more than one ribosomal gene do not occur any more frequently for conserved order sets (20%) than non-conserved order sets (21%). Furthermore, there was very little difference in the percentage of each of the following COG-based (Tatusov et al., 2003) protein function categories between the two groups of sets (conserved vs. non-conserved): informational (24 vs. 26%), cellular (17 vs. 17%), metabolism (36 vs. 37%), poorly categorized (15 vs. 14%), no COG match (8 vs. 6%). This suggests that the signal is distributed across many different types of genes and is probably not due to unreliable "jackpot" effects of single operons. We also pruned our data set to remove all ribosomal genes. When this pruned data set was used for building a NJ tree, however, the resolution was reduced, resulting in a topology where some well-established microbial phyla are intertwined. This new NJ result does unite the Actinobacteria and Firmicutes, but with very low confidence. Because the dataset with ribosomal genes removed does not fully reproduce the results shown in **Figure 5**, it remains possible that a notable proportion of the gene order signal is preserved in ribosomal genes, even though the signal overall appears to be distributed across a variety of other gene functional categories. The most likely way to reconcile these apparently divergent conclusions is that the phylogenetic signal in gene order distance is small, and so, the removal of any class of genes (including ribosomal operons) appreciably reduces the robustness of the results.

## **ADDITIONAL GENE ORDER TREE BUILDING STARTING FROM 143 TAXA**

To complement our NJ tree building exercise using all 143 taxa, we aimed to address the fundamental problem that only a portion of our 10,153 pair wise gene order distances were significant and should be useful for tree building. The inclusion of genome pairs that are too diverged with respect to their gene order has the potential to alter the observed position of other taxa on a tree. This concern is not unique to gene order data. It has long been known that with sequence data, the uncertainty on an estimated distance goes up greatly with the magnitude of the divergence (Kimura and Ohta, 1972). Gene order data though provide a dramatic example of how it can be difficult to accurately estimate divergence when organisms are highly diverged. To minimize this problem, we proceeded with two additional tree studies.

We tried developing a novel hierarchical and iterative tree building strategy (see Materials and Methods) based on the principle that our shorter distances are known with a higher degree of confidence than our larger distances. The goal of this approach is to provide a systematic and objective way to build a tree that includes as many of the pair wise gene order distances as possible without letting very distant (random) pairs adversely influence the observed phylogenetic positions of the more closely related taxa. Detailed results of this work are listed in the Supplemental Material. **Figure 6** shows the largest tree formed starting this process with all 143 taxa. The tree in **Figure 7** has 37 taxa added in the initial clustering process and another 8 taxa added during a second phase (single taxon addition). The result shows reasonable clusters representing the α-Proteobacteria, γ-Proteobacteria, Actinobacteria, and Firmicutes, plus a few other taxa from different poorly represented groups. The midpoint-rooted result again shows the Actinobacteria clustering with the bulk of the Firmicutes in a similar fashion to that shown in **Figures 4**, **5**. The other well-supported trees from this analysis either also show such a clustering or do not contain taxa that can address the relationship between the Actinobacteria and Firmicutes. Also observed in **Figure 6** is the splitting of the Firmicutes into two groups, with the Streptococcaceae (*Streptococcus* and *Lactococcus*) falling away from the bulk of the Firmicutes. A similar result was observed in **Figure 4**, albeit with a different ultimate affinity for the Streptococcaceae. The inconsistent placing of this group on the trees found in **Figures 4**, **7**, plus the unresolved placing of this group in **Figure 5** and the exclusion of this group from **Figure 6**, collectively suggests that gene order is unable to confidently place this group on the tree. It therefore remains inconclusive whether the Streptococcaceae belong with the rest of the Firmicutes, or cluster with the gram-positive bacteria but diverged prior to the Actinobacteria. However, the fact that *Lactobacillus* (labeled lp) is consistently clustered with the bulk of Firmicutes suggests that the Lactobacillales (which includes the Streptococcaceae) do belong with the rest of the Firmicutes, and therefore, in this case, the Streptococcaceae appear to be misplaced due to an artifact related to "long branch attraction."

second round of single taxon addition. "Bootstrap values" shown are the [...] found 60 or more times in the largest tree after the initial clustering were: vc (70), cbf (67), vv (65), sty (63), set (61), and vp (60).

Secondly, using our original NJ trees, we identified the agreement subtrees for the 100 replicate NJ trees that had previously been constructed (and used for bootstrap analysis). Starting with the 100 trees, 18 agreement subtrees (each containing 18 taxa) were found. Together, the agreement subtrees contained a total of 23 different taxa. These 23 taxa were then used to build a new NJ tree (**Figure 7**) using the dataset constructed from all 10 million iterations. The result shows with high confidence three microbial groups: the Actinobacteria, the Firmicutes, and the γ-Proteobacteria. This pruned tree is the consistent core of the 100 replicate trees, and indicates that there is significant (but small) gene order conservation between these three taxonomic groups. When this tree is midpoint-rooted, the Actinobacteria and Firmicutes are united as sister groups with high confidence, which further suggests that the gram-positive bacteria might be monophyletic (as long as the assumptions inherent to midpoint rooting are met). Based both on the conservative nature of this agreement subtrees approach and the sensible results that it produces, we think that this is our best option for constructing a large gene order-based tree of prokaryotes.

## **GENE ORDER TREE BUILDING STARTING FROM A MORE REPRESENTATIVE 172 TAXA**


Finally, we repeated our agreement subtrees approach for our updated study of the relations of prokaryotes using 172 complete genomes (**Table 3**) better representing a wider diversity of prokaryotes. With this fuller dataset, starting with 100 replicate NJ trees, the agreement subtree only contained 13 taxa. These 13 taxa were then used to build a NJ tree as before (**Figure 7**). As before, the resultant tree shows with high confidence that Actinobacteria and Firmicutes are sister groups (**Figure 8**). We also repeated this final analysis selecting five orthologs in the same order rather than six. This resulted in a summary agreement subtree with 56 taxa, suggesting there is significantly more genomic gene order signal with five genes than with six. The 56 taxa tree (**Figure 9**), which now includes Archaea and Bacteria, again shows with high confidence that Actinobacteria and Firmicutes are sister groups forming a gram-positive clade. The midpoint rooting of this final tree (**Figure 9**) places Archaea as a sister group to the γ-Proteobacteria. At face value, this suggests there is a little more gene order conservation between the Archaea and the γ-Proteobacteria than with any other bacterial group. Gene order conservation between Archaea and the γ-Proteobacteria would argue against the "neomuran origin" for the archaeal cell (Cavalier-Smith, 2002). A pairing of Archaea with the γ-Proteobacteria, though, should be taken with significant caution because the result is completely dependent on the midpoint rooting, which may incorrectly represent the history of these evolving groups. Using the Archaea as an outgroup would naturally place the Proteobacteria with the other bacterial phyla represented. In either case, though, the tree supports the notion that the gram-positive bacteria (Actinobacteria and Firmicutes) evolved once from a gram-negative relative. It is also notable that the genome-wide synteny tree of 89 microbes published by Shifman et al. (2014) also shows the Actinobacteria and Firmicutes united as sister groups, even though that particular work used different genomes and a different approach to estimate gene order similarity across genomes.

## **IMPLICATIONS OF GENE ORDER CONSERVATION FOUND**

At this point, we can conclude that starting from a large number of genomes, we find, perhaps surprisingly, that there is some gene order conservation between a few major groups, namely Firmicutes, Actinobacteria, and Proteobacteria (**Figures 4**–**9**) and, less robustly, the Archaea and Proteobacteria (**Figure 9**). Comparison of genomes from closely related species reveals that inversions are quite common. Large inversions involving up to half of the genome are found frequently between closely related species (e.g., within the *Pyrococcus* genus, Zivanovic et al., 2002; within the *Yersinia* genus, Darling et al., 2008). Given this potentially very rapid rate of divergence in gene order, it is surprising to find residual phylogenetic signal still uniting such distant groups as the Actinobacteria and the Firmicutes. However, while large inversions are common, they are not random in their distribution. For example, inversions that disrupt the symmetry of the replicons are frequently not tolerated (Eisen et al., 2000; Zivanovic et al., 2002; Darling et al., 2008). Thus, the rapid changes may be restricted in their range, leaving large portions of the genome with potentially conserved gene order over large time scales.

Taken together, our results suggest that the Actinobacteria is a sister group to the Firmicutes, which in turn implies a single origin for the gram-positive cell. Since the first few whole genome sequences were published, some genomic trees have failed to unite these groups (Brown et al., 2001; Fu and Fu-Liu, 2002; Korbel et al., 2002), while others have found weak support for the pairing (House et al., 2003) or have found the pairing under a subset of conditions tried (Deeds et al., 2005). There are three possible disparate causes for these results. First, it is possible that the gram-positive cell has evolved more than once in Earth history. In particular, it has been suggested that *Mycobacterium* may have a close relationship to gram-negative bacteria (Fu and Fu-Liu, 2002). Second, it has been hypothesized that gram-positive bacteria are more primitive than gram-negative bacteria (Gupta, 1998; Errington, 2013). Third, some researchers are of the opinion that the failure of genomic methods to unite the gram-positive bacteria indicates that genomic methods are still inadequate to address this relationship (Olsen, 2001), and that ultimately, we will find that the gram-positive bacteria can be united as a monophyletic group. In particular, the strong similarity in the structure of the cell walls of Firmicutes and Actinobacteria argues for a single origin. The gram-positive cell type, found in both Firmicutes and Actinobacteria, consists of thick layers of peptidoglycan with teichoic acids and a single membrane. Gram-negative bacteria have a thin peptidoglycan layer, lack teichoic acid, and have a second outer membrane with lipopolysaccharides.

Considering that our gene order analyses have consistently produced trees with the Actinobacteria united with the bulk of the Firmicutes to the exclusion of other bacterial groups (mostly the Proteobacteria), our results support the uniting of these groups and argue against multiple origins for the gram-positive cell type. The strongest evidence against a strict monophyletic pairing of the Firmicutes with the Actinobacteria comes from the (unrooted) phylogenetic analyses of 31 concatenated bacterial genes (Wu et al., 2009) and 24 concatenated bacterial genes (Lang et al., 2013), which appear to support a mostly gram-positive clade of Firmicutes, Actinobacteria, Chloroflexi, and Cyanobacteria. Incidentally, Lang et al. (2013) also show the Tenericutes as part of the Firmicutes. At present, we cannot rule out such a larger (primarily) gram-positive clade because it is possible that other phyla (like the Tenericutes) will be included within our Firmicutes/Actinobacteria cluster when taxon sampling increases for gene order studies. Generally, one can argue that because several of our trees (those restricted to agreement subtrees) do not include any taxa from bacterial groups other than the Proteobacteria, we cannot rule out the possibility that one of the other phyla, such as the Cyanobacteria, would break up our Firmicutes/Actinobacteria clade. However, such reasoning requires the taxa within such a phylum to have all scrambled their gene order to the point at which they show no affinity to either the Firmicutes or Actinobacteria in spite of their supposed closer affinity. Our results nevertheless show a uniquely strong conservation of gene order between the Firmicutes and Actinobacteria. We therefore feel that our results are indicative of a tree of life in which most other bacterial phyla diverged prior to the base of a gram-positive cluster (either Firmicutes/Actinobacteria or a larger similar clade). This interpretation in turn implies a single origin for the gram-positive cell. Our results also indicate that the gene order of certain genomes is phylogenetically informative at both low and high taxonomic levels, but that for many other genomes gene order is not conserved over long time scales.

## **ACKNOWLEDGMENTS**

This work was supported by the National Aeronautics and Space Administration (NASA) Exobiology Grant NNG05GN50G and the Penn State Astrobiology Research Center through the NASA Astrobiology Institute (cooperative agreement #NNA09DA76A). We also thank the UCLA Institute of Genomics and Proteomics for funding to Sorel T. Fitz-Gibbon and Matteo Pellegrini.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2014.00785/abstract

## **REFERENCES**


Olsen, G. J. (2001). The history of life. *Nat. Genet.* 28, 197–198. doi: 10.1038/90014


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 July 2014; accepted: 21 December 2014; published online: 20 January 2015.*

*Citation: House CH, Pellegrini M and Fitz-Gibbon ST (2015) Genome-wide gene order distances support clustering the gram-positive bacteria. Front. Microbiol. 5:785. doi: 10.3389/fmicb.2014.00785*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2015 House, Pellegrini and Fitz-Gibbon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The distribution and impact of viral lineages in domains of life

#### *Arshan Nasir<sup>1</sup>, Patrick Forterre<sup>2,3</sup>, Kyung Mo Kim<sup>4</sup> and Gustavo Caetano-Anollés<sup>1</sup>\**

*<sup>1</sup> Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, Illinois Informatics Institute, University of Illinois, Urbana-Champaign, Urbana, IL, USA*

*<sup>2</sup> Unité BMGE, Institut Pasteur, Paris, France*

*<sup>3</sup> Institut de Génétique et Microbiologie, Université Paris-Sud, CNRS UMR8621, Orsay, France*

*<sup>4</sup> Microbial Resource Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea*

*\*Correspondence: gca@illinois.edu*

#### *Edited and reviewed by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### **Keywords: viruses, evolution, replicon, capsids, virion morphotype, diversity, domains of life**

Living organisms can be conveniently classified into three domains, Archaea, Bacteria, and Eukarya (Woese et al., 1990). The three domains are united by several features that support the common origin of life, including the presence of ribosomes, double-stranded DNA genomes, a nearly universal genetic code, physical compartments (i.e., membranes), and the ability to carry out metabolism and oxidation-reduction reactions. In comparison, other types of genetic material and particles (e.g., viruses, plasmids, and other selfish genetic elements) are often excluded from the definition of "life" (for opposing views see Raoult and Forterre, 2008; Forterre, 2011, 2012a). However, they can still influence the evolution of cellular organisms and, in conjunction with them, establish complex life cycles.

Viruses impact our economy, medicine, and agriculture due to their infectious nature. Viral infections transform the host cell either into a virocell, which no longer divides by binary fission but produces more viral particles, or into a ribovirocell, in which the viral and cellular genomes coexist and the cell still divides while producing virions (Forterre, 2011, 2012a). The virosphere (i.e., the collection of all viruses) displays exceptional variability in virion morphologies and replication strategies. Viruses can be classified into DNA viruses, RNA viruses, retroviruses, or intermediate forms depending upon the type of replicon present inside the viral particle. Moreover, replicons can be linear, circular, single-stranded, double-stranded, or even segmented. The unprecedented diversity of replicon types has led to the proposal that viruses first invented DNA as a means to trick the host defense systems (Forterre, 2002, 2005). Viruses can also transfer genes between species and enhance biodiversity (Nasir et al., 2012). Even more importantly, viruses appear to create massive amounts of new genetic information, part of which can transfer to cells (Abroi and Gough, 2011; Forterre, 2011, 2012b). The discovery of "giant" viruses such as mimiviruses (La Scola et al., 2003), megaviruses (Arslan et al., 2011), pandoraviruses (Philippe et al., 2013), and pithoviruses (Legendre et al., 2014) now creates a continuum in genome size and functional complexity between the virosphere and cells. Still, viruses are neglected in phylogenetic studies because they lack a unifying genetic marker, similar to rRNA for cells, and because many biologists underestimate their genetic creativity. As a consequence, their role in the origin and evolution of modern life and their impact on the ecology of our biosphere continue to be for the most part unrecognized (Koonin and Wolf, 2012). In this opinion article, we address the impact of viruses on the evolution of cells.
We argue that viruses likely initiated major evolutionary shifts. Specifically, we consider that gain and loss of viral lineages often lead to divergent evolutionary trends even in closely related species. We emphasize that no evolutionary theory can be complete without accounting for the viral world and that viruses are responsible for ongoing adaptations in the cellular domains (see also Prangishvili et al., 2006; Forterre and Prangishvili, 2013; Koonin and Dolja, 2013).

The association of viral replicon types with cellular hosts is extremely biased. For example, RNA viruses are completely absent in Archaea and are rare in Bacteria. In comparison, vertebrates host numerous RNA viruses and retroviruses. Surprisingly, dsDNA viruses are rare in plants while dsRNA viruses are abundant in fungi. Similarly, retroviruses are integrated into the genomes of multicellular eukaryotes but are completely absent from microbial genomes. In other words, specific relationships exist between the type of viral replicon and the host range. Viruses with a particular replicon may infect one group of organisms but may not replicate in another. Big jumps of viruses from one cellular lineage to another have been observed between eukaryotic "divisions" such as animals (opisthokonts) and plants (viridiplantae), when a virus adapts to an established consortium of ecological partners. The same virus can sometimes infect both plant and animal cells when these are linked by their mode of life. One example is the Fiji disease virus (*Reoviridae*), which can replicate in both its insect vector (Delphacidae) and flowering plants (Kings et al., 2012). However, no modern virus is known to cross the barrier between domains. Therefore, while viruses may be able to jump hosts over short evolutionary time spans, crossing domain boundaries is less likely and is not expected to compromise our inferences.

To obtain a quantitative view of viral diversity and its distribution among cellular domains, we extracted genome data from the Viral Genomes Resource at NCBI (Bao et al., 2004). This resource

#### **FIGURE 1 | Continued**

from the ViralZone web-resource (Hulo et al., 2011) and from Pietilä et al. (2014) and Pina et al. (2011). A keyword-based search was performed on text data to assign the most general morphotypes (e.g., rod-shaped, spherical, droplet-shaped, etc.) to all viruses. More than one viral family with the same morphotype is possible but not made explicit. The diagram does not always imply an evolutionary relationship between viruses harboring a common morphology. For example, archaeal and eukaryal rod-shaped viruses are probably not evolutionarily related (Goulet et al., 2009). Well-studied exceptions are head-tail caudovirales harboring the HK97 capsid fold and polyhedral viruses harboring the double jelly roll fold (Abrescia et al., 2012). <sup>1</sup>*Guttaviridae*; <sup>2</sup>*Ampullaviridae*; <sup>3</sup>*Spiraviridae* [name pending approval by ICTV]; <sup>4</sup>*Fuselloviridae*; <sup>5</sup>*Ascoviridae*; <sup>6</sup>*Nimaviridae*; <sup>7</sup>*Geminiviridae*; <sup>8</sup>*Astroviridae*; <sup>9</sup>*Rhabdoviridae*; <sup>10</sup>*Ophioviridae*; <sup>11</sup>*Polydnaviridae*; (left to right) <sup>12</sup>*Rudiviridae* (Archaea), *Virgaviridae* (Eukarya); <sup>13</sup>*Clavaviridae* (Archaea), *Roniviridae* (Eukarya); <sup>14</sup>*Siphoviridae*, *Myoviridae*, and *Podoviridae* (Archaea and Bacteria); <sup>15</sup>*Microviridae* (Bacteria), *Circoviridae* (Eukarya); <sup>16</sup>*Cystoviridae* (Bacteria), *Reoviridae* (Eukarya); <sup>17</sup>*Lipothrixviridae* (Archaea), *Inoviridae* (Bacteria), *Potyviridae* (Eukarya); <sup>18</sup>*Sulfolobus* turreted icosahedral virus (Archaea), *Tectiviridae* (Bacteria), *Adenoviridae* (Eukarya).

(which is expected to grow with improvements in our ability to isolate viruses from atypical habitats).

Interestingly, all archaeoviruses possess DNA replicons; none has an RNA genome. The complete absence of RNA viruses in Archaea can be linked to the instability of RNA at high temperatures (Forterre, 2013). We speculate that escape from RNA viruses could be one major trigger for the evolution of modern Archaea (Forterre, 2013). Thus, loss of RNA viral lineages likely initiated archaeal migration to harsh environments. One recent study reported the isolation of ssRNA(+) viruses from an archaea-rich community in a hot, acidic spring of Yellowstone National Park (Bolduc et al., 2012). However, their host tropism could not be established with confidence. Finally, four ssDNA viruses were recently isolated from Archaea (Pietilä et al., 2009; Mochizuki et al., 2012; Sencilo et al., 2012). Of these, *Aeropyrum* coil-shaped virus (*Spiraviridae*) is the largest known ssDNA virus and displays a unique coil-shaped virion morphology (**Figure 1B**; Mochizuki et al., 2012).

Bacterioviruses are remarkably successful in Bacteria and are highly abundant. Their virions outnumber their bacterial hosts in oceans, balance microbial populations in marine communities, and regulate biogeochemical cycles (Breitbart and Rohwer, 2005; Suttle, 2007; Rohwer and Thurber, 2009; Zhao et al., 2013). Among the dsDNA bacterioviruses, tailed bacteriophages exhibit extensive similarities with archaeal caudovirales, suggesting that they form a monophyletic group (Krupovic et al., 2010). Archaeal and bacterial caudovirales have indeed been grouped in a single major evolutionary lineage, together with *Herpesviridae*. All of these viruses share the same Hong Kong fold (HK97) in their major capsid proteins and homologous packaging ATPases (Baker et al., 2005; Pell et al., 2009; Krupovic et al., 2010; Abrescia et al., 2012). Notably, it has been found recently that the capsid of *Herpesviridae* exhibits a small tail similar to those of *Podoviridae* (Schmid et al., 2012). These data suggest that viruses of the HK97-like lineage are very ancient and most likely originated prior to the last common ancestor of cells. Another example of a viral lineage shared by the three domains is the so-called "PRD1/Adenovirus lineage" of dsDNA viruses, characterized by a major capsid protein containing the double jelly roll fold and a common packaging ATPase (Abrescia et al., 2012). In comparison, ssDNA bacterioviruses are not as successful in Bacteria and correspond to two major families, *Inoviridae* and *Microviridae* (smallest genomes among DNA viruses; Rosario et al., 2012). Viruses in this group replicate by converting their single-stranded DNA genome into a double-stranded intermediate form synthesized by the host polymerase. These viruses lack their own polymerase and share this property with the ssDNA viruses of Archaea and Eukarya.

In contrast to DNA viruses, RNA viruses are not as successful in Bacteria. Only 5 dsRNA and 11 ssRNA(+) bacterioviruses could be identified. In turn, none of the ssRNA(−) and retrotranscribing viruses were associated with bacterial hosts. Among the RNA bacterioviruses, dsRNA viruses (*Cystoviridae*) encode segmented genomes and infect mostly *Pseudomonas* species (Silander et al., 2005). Interestingly, *Cystoviridae* closely resemble eukaryal dsRNA viruses (i.e., *Reoviridae* and *Totiviridae*) in terms of life cycle and homologous RNA-dependent RNA polymerase gene sequences (a virus hallmark) (Butcher et al., 1997). Unlike Archaea, Bacteria are also infected by ssRNA(+) viruses (*Leviviridae*). These viruses are amongst the simplest and smallest known viruses, and historically yielded useful insights into mRNA function (Bollback and Huelsenbeck, 2001). Because RNA viruses (ssRNA and dsRNA) infect both Bacteria and Eukarya, their ancestors likely originated from a putative ancient world of cells with RNA genomes and RNA viruses (Forterre, 2005, 2006a,b). This points to the ancient existence of RNA viruses and suggests their loss from Archaea (since loss in one domain is more likely than independent gain in two). The instability of RNA at high temperatures supports this hypothesis, since it is likely that the last common ancestor of Archaea was a hyperthermophile (Brochier-Armanet et al., 2011).
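The "one loss versus two gains" reasoning can be illustrated with a small parsimony count. The sketch below is only an illustration under stated assumptions: the tree topology (Archaea and Eukarya as sister groups, with Bacteria branching first), the equal weighting of gains and losses, and all names (`TREE`, `RNA_VIRUSES`, `min_changes`) are hypothetical choices made here, not part of the authors' analysis. It computes, for each possible ancestral state, the minimum number of gain or loss events needed to explain the presence of RNA viruses in Bacteria and Eukarya and their absence in Archaea.

```python
# Assumed rooted topology: ((Archaea, Eukarya), Bacteria). Leaves are strings.
TREE = (("Archaea", "Eukarya"), "Bacteria")

# Presence (1) or absence (0) of RNA viruses in each domain, as stated in the text.
RNA_VIRUSES = {"Archaea": 0, "Eukarya": 1, "Bacteria": 1}


def min_changes(node, states):
    """Return {state: minimum number of gain/loss events in the subtree
    below `node`, given that `node` itself has that state}."""
    if isinstance(node, str):
        # A leaf is only compatible with its observed state.
        return {s: (0 if states[node] == s else float("inf")) for s in (0, 1)}
    children = [min_changes(child, states) for child in node]
    cost = {}
    for s in (0, 1):
        # For each child, pick its cheapest state; a state change costs one event.
        cost[s] = sum(min(c[t] + (s != t) for t in (0, 1)) for c in children)
    return cost


costs = min_changes(TREE, RNA_VIRUSES)
print("ancestor had RNA viruses:", costs[1], "event(s)")
print("ancestor lacked RNA viruses:", costs[0], "event(s)")
```

Under these assumptions, an ancestor with RNA viruses requires a single loss (in Archaea), whereas an ancestor without them requires two independent gains. A different rooting or unequal gain and loss weights could change the count, so this is a reading aid rather than a test of the hypothesis.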

Viruses with all possible types of replicons infect eukaryal organisms. RNA viruses are predominant and cover the entire taxonomic range within Eukarya (**Figure 1A**). Eukaryoviruses also exhibit many unique virion morphotypes not observed in the prokaryotic viruses and are unequally distributed in the major eukaryal groups (**Figure 1**). For example, dsDNA viruses are completely absent in fungi and are rare in plants (i.e., only found in green algae). This suggests that these groups have evolved sophisticated mechanisms to eliminate dsDNA viral infections. A good candidate is the cell wall structure found in plants, fungi, and algae. Differences in cell wall composition and rigidity greatly limit means of viral entry into the cell and serve as barriers to viral infections (Dimmock et al., 2007). However, loss of one viral lineage is apparently offset by the gain of other lineages. This is evident from the high RNA virus distribution among plants and fungi. The origin of the diversity and abundance of RNA viruses in eukaryotes but their near absence in prokaryotes is particularly puzzling (Koonin et al., 2006). For example, ssRNA(−) and retroviruses are highly successful in vertebrates. At first glance, it seems that organism complexity is proportional to the variety of viral infections. For instance, metazoa are infected by a host of retroviruses. Retroviruses can integrate their genomes into host DNA and thus alter gene expression patterns and trigger genomic rearrangements (Arkhipova et al., 2012). These activities can lead to production of novel genes and advanced machineries (Forterre, 2013). In fact, telomerase enzymes are homologous to retroviral proteins and neocentromeres are formed by epigenetic regulation of transposable elements (Singer, 1995; Chueh et al., 2009), both likely transferred from viruses to host cells much earlier in evolution. 
This argument is further supported by the absence of RNA viruses and retroviruses from unicellular eukaryotes such as yeast, whose lifestyle resembles that of prokaryotes (Forterre, 2013). Thus, co-evolution between viruses and their hosts may have led to organism complexity in the eukaryotic domain.

The diversity of eukaryoviruses is intriguing, both in terms of genome structure and virion morphology (see **Figure 1B**). In particular, retrotranscribing, ssRNA(−), and many DNA virus families are only present in eukaryotes. Surprisingly, although Archaea and eukaryotes are very similar in terms of their basic molecular biology, there are no viral lineages specific to these two domains (Forterre, 2013). Virions with rod-shaped morphology are so far specific to Archaea and Eukarya (**Figure 1B**), but they harbor DNA and RNA genomes, respectively, and it is unclear whether their major coat proteins are evolutionarily related (Goulet et al., 2009). The same is probably also true for bacilliform viruses. Notably, the diversity and specificity of eukaryoviruses are difficult to reconcile with the archaeon-bacterium fusion scenarios for the origin of eukaryotes (e.g., Martin and Müller, 1998), as recently argued (Forterre, 2013).

To conclude, the distribution of viral lineages follows an ancient, highly dynamic, and ongoing process that impacts the evolution of organisms. New viral lineages often arise from existing ones and may cross species barriers to infect new hosts (e.g., parvovirus; Shackelton et al., 2005), putting enormous evolutionary pressure on cellular organisms and prompting them to develop molecular and cellular innovations (Forterre and Prangishvili, 2009) in search of either simplicity or complexity.

## **ACKNOWLEDGMENTS**

Arshan Nasir is the recipient of a Chateaubriand fellowship. Research has been supported by grants from the National Science Foundation (OISE-1132791) and the United States Department of Agriculture (ILLU-802-909 and ILLU-483-625) to Gustavo Caetano-Anollés, from the European Research Council to Patrick Forterre under the European Union's Seventh Framework Programme (FP/2007–2013)/Project EVOMOBIL (ERC Grant Agreement no. 340440), and from the KRIBB Research Initiative Program and the Next-Generation BioGreen 21 Program, Rural Development Administration (PJ0090192014) to Kyung Mo Kim.

## **REFERENCES**


of single-stranded RNA bacteriophage (family *Leviviridae*). *J. Mol. Evol.* 52, 117–128. doi: 10.1007/s002390010140


display a helical fold spanning the filamentous archaeal viruses lineage. *Proc. Natl. Acad. Sci. U.S.A.* 106, 21155–21160. doi: 10.1073/pnas.0909893106


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 April 2014; accepted: 11 April 2014; published online: 30 April 2014.*

*Citation: Nasir A, Forterre P, Kim KM and Caetano-Anollés G (2014) The distribution and impact of viral lineages in domains of life. Front. Microbiol. 5:194. doi: 10.3389/fmicb.2014.00194*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Nasir, Forterre, Kim and Caetano-Anollés. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## From lifetime to evolution: timescales of human gut microbiota adaptation

#### *Sara Quercia<sup>1</sup>, Marco Candela<sup>1</sup>, Cristina Giuliani<sup>2</sup>, Silvia Turroni<sup>1</sup>, Donata Luiselli<sup>2</sup>, Simone Rampelli<sup>1</sup>, Patrizia Brigidi<sup>1</sup>, Claudio Franceschi<sup>3,4,5,6,7</sup>, Maria Giulia Bacalini<sup>3,4</sup>, Paolo Garagnani<sup>3,4,8</sup> and Chiara Pirazzini<sup>3,4</sup>\**

*<sup>1</sup> Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy*

*<sup>2</sup> BiGEA, Department of Biological, Geological and Environmental Sciences, Laboratory of Molecular Anthropology & Centre for Genome Biology, University of Bologna, Bologna, Italy*

*<sup>3</sup> DIMES, Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy*

*<sup>4</sup> CIG, Interdepartmental Centre "L. Galvani" CIG, University of Bologna, Bologna, Italy*

*<sup>5</sup> IRCSS, Institute of Neurological Sciences of Bologna, Bologna, Italy*

*<sup>6</sup> IGM-CNR, Institute of Molecular Genetics, Unit of Bologna IOR, Bologna, Italy*

*<sup>7</sup> CNR, Institute of Organic Synthesis and Photoreactivity (ISOF), Bologna, Italy*

*<sup>8</sup> CRBA, Center for Applied Biomedical Research, St. Orsola-Malpighi University Hospital, Bologna, Italy*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases Under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Carl James Yeoman, Montana State University, USA*

*Valerio Iebba, 'Sapienza' University of Rome, Italy*

*Riccardo Calvani, Catholic University of the Sacred Heart School of Medicine, Italy*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases Under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

*Arseniy E. Yuzhalin, University of Oxford, UK*

#### *\*Correspondence:*

*Chiara Pirazzini, DIMES, Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Via San Giacomo, 12 40126 Bologna, Italy e-mail: chiara.pirazzini5@unibo.it*

Human beings harbor gut microbial communities that are essential to preserve human health. Molded by the human genome, the gut microbiota (GM) is an adaptive component of the human superorganism that allows host adaptation at different timescales, optimizing host physiology from daily life to lifespan scales and human evolutionary history. The GM continuously changes from birth up to the most extreme limits of human life, reconfiguring its metagenomic layout in response to daily variations in diet or specific host physiological and immunological needs at different ages. On the other hand, microbiota plasticity was strategic in facing changes in lifestyle and dietary habits over the course of recent evolutionary history, which has driven the passage from Paleolithic hunter-gatherer societies to Neolithic agricultural farmers to modern Westernized societies.

#### **Keywords: gut microbiota, aging, environmental stimuli, co-evolution, biological adaptation**

## **INTESTINAL MICROBIOTA, STRUCTURE, AND ROLE IN HUMAN PHYSIOLOGY**

Human beings co-evolved as superorganisms as the result of the mutualistic relationship with the enormous microbial community that resides in the human gastrointestinal tract (GIT); this ecosystem is better known as the gut microbiota (GM; Turnbaugh et al., 2007). The GM reaches its highest cell concentration in the colon, with a density of 10<sup>12</sup> CFU/g of intestinal content, and represents the most densely populated and biodiverse ecosystem on earth (O'Hara and Shanahan, 2006; Ley et al., 2008a). The GM presents a very particular phylogenetic structure, resulting in a sparsely branched tree with a high degree of radiation at the ends. Indeed, out of the 100 different bacterial phyla detected on our planet, only seven are found in our gut – Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria, Verrucomicrobia, Tenericutes, and Fusobacteria – of which Firmicutes and Bacteroidetes together represent up to 90% of the ecosystem (Costello et al., 2009). Conversely, the GM shows an impressive biodiversity at lower phylogenetic levels, with up to 1000 different species being detected by next generation sequencing-based approaches (Qin et al., 2010). Interestingly, the species-level GM composition varies dramatically among people, and each subject harbors a unique subset of microorganisms that consists of hundreds of the more than 1000 species detected in the GM of the entire human population. The total genome of these microorganisms, often referred to as the intestinal microbiome, has been estimated to contain 150 times more genes than the human genome, providing the host with essential functional traits that human beings have not evolved on their own (Qin et al., 2010). For instance, the carbohydrate-active enzymes encoded in the microbial glycobiome allow the host to extract energy from otherwise indigestible polysaccharides (Gill et al., 2006), complementing the poor human glycobiome diversity. Indeed, the functional assignment of the gut microbiome revealed an extraordinary diversity of gene clusters involved in carbohydrate metabolism (Koropatkin et al., 2012). Moreover, the gut microbiome is enriched in genes involved in the production of vitamins, cofactors, and secondary metabolites, further supporting its important role in host nutrition (Bäckhed et al., 2004; Clemente et al., 2012). The GM is also an active component of the human immune system (Round and Mazmanian, 2009; Maynard et al., 2012). The cross-talk with intestinal microbes has been shown to be essential for the maturation of a correct immune function in early life and to preserve a well-balanced immune homeostasis later in life (Kamada et al., 2013). Finally, a new and only sparsely explored role of the GM in human physiology is its potential to modulate the function of the gut-brain axis. Indeed, accumulating data from studies carried out in mouse models suggest that the GM and its metabolites could affect behavior and pain, in addition to depression, anxiety, and other disorders of the central nervous system (Cryan and Dinan, 2012). Gut commensals are capable of activating neural pathways and modulating signaling to the central nervous system through their metabolite production (Foster and McVey Neufeld, 2013). In particular, studies carried out on mice demonstrated a strategic role for commensal bacteria in programming the hypothalamic–pituitary–adrenal (HPA) stress responsiveness at early developmental stages, when brain plasticity is still preserved (Sudo et al., 2004). Indeed, germ-free mice showed an enhanced HPA response to restraint stress, which was reversed by their re-conventionalization at early stages of development.
According to the authors, both a cytokine-mediated humoral route and a neurally mediated pathway could be involved in the microbiota modulation of the endocrine response in early life. Moreover, as observed by Diaz Heijtz et al. (2011) in mice, the GM can also affect synaptogenesis during the perinatal period.

The recent use of germ-free and gnotobiotic mice has allowed the field to disentangle the complexity of the GM-host transgenomic metabolism, shedding light on the specific role of GM metabolites in host physiology (Nicholson et al., 2012). In particular, the end products of GM polysaccharide fermentation, short-chain fatty acids (SCFAs) – most abundantly acetate, propionate, and butyrate – are key GM metabolites with a multifactorial role in human health and homeostasis. These acids have been shown to play a pivotal role in host nutrition and energy homeostasis, controlling energy production and storage as well as appetite. Butyrate represents the main energetic substrate for the colonic epithelial cells (Russell et al., 2013), and both butyrate and propionate have been reported to activate intestinal gluconeogenesis (IGN) through two complementary mechanisms: butyrate triggers IGN gene expression in the gut via cAMP-dependent mechanisms, whereas propionate activates IGN gene expression through gut-brain neural circuits and itself represents a substrate for IGN (De Vadder et al., 2014). According to the authors, this last propionate-dependent mechanism of IGN induction is strategic in providing the host with several metabolic benefits on body weight and glucose control. Besides their nutritional role, SCFAs have been reported to be involved in the maintenance of immune homeostasis (Arpaia et al., 2013). Through their production, the GM controls the epithelium inflammation rate and drives the production and migration of specific immunological cells. Indeed, propionate governs the *de novo* generation of peripheral regulatory T cells (Tregs) and, together with acetate, drives their homing to the colon. Furthermore, propionate has been implicated in enhancing the hematopoiesis of dendritic cells with an impaired Th2 activation (Trompette et al., 2014).
On the other hand, butyrate has the ability to regulate the production of pro-inflammatory cytokines (Segain et al., 2000), exerting a local immunomodulatory activity. Moreover, it is involved in extrathymic Treg generation. In addition, butyrate has been reported to exert a protective activity on the gut epithelium, as it stimulates the release of mucins (Petersson et al., 2011).

## **GENETICS OF HOST AND MICROBIOTA**

Host genetics and the GM are linked by an intense crosstalk, and this interaction is dynamic throughout the course of our life. Several studies have been conducted to determine the impact of host genetics on the GM composition, with conflicting results. To address how the host genotype and the environment influence the GM composition, a study was conducted on the fecal microbiota of monozygotic and dizygotic twin pairs concordant for leanness or obesity, and of their mothers (Turnbaugh et al., 2009). The authors found that the GM of monozygotic twin pairs had a degree of similarity that was comparable to that of dizygotic twin pairs, indicating that the environment impacts the GM composition more than genetics does. It was also reported that family members harbor a similar GM composition and share a "core microbiome" made of several microbial genes. However, a study conducted on related and unrelated children reported that the highest level of similarity was found in genetically identical twins (Stewart et al., 2005).

Several studies reported that single host genes, i.e., the MEditerranean FeVer gene (*MEFV*), *APOA1*, *NOD2*, and *FUT2*, affect the GM by altering its composition or by reducing the degree of bacterial diversity (Khachatryan et al., 2008; Petnicki-Ocwieja et al., 2009; Zhang et al., 2010; Frank et al., 2011; Wacklin et al., 2014).

Murine models have proved very useful in clarifying the effect of genetics on the GM. One of the first studies focused on the host genetics-GM interaction was based on observations of the GM reconstruction process occurring after a course of antibiotics (Vaahtovuo et al., 2003). It was observed that the colonization of the GM depends on the genotype of the host, and differences in communities between mouse strains were observed, supporting the idea that the gut community is not established by chance but is influenced by the host genetic background. Kovacs et al. (2011) studied particular recombinant inbred mouse strains to assess the relative role of the host genotype in the GM composition, and they reported that the mouse genetic background is a strong determinant in shaping the intestinal microbiota. To address how environmental factors and host genetic factors combine to shape the GM, Benson et al. (2010) explored the associations between host quantitative trait loci (QTL) and the GM composition in mice. Eighteen host QTL showing a significant association with the relative abundance of specific microbial taxa were identified. Even though litter and cohort effects accounted for some of the GM variation, according to the authors host genetics had a greater impact on the GM variability.

Taken together, these studies support the idea that host genetics and the GM interact deeply with each other, and we can speculate that changes in the GM composition could amplify the differences in the genetic makeup of each individual.

## **MICROBIOTA ADAPTATION TO DAILY LIFE: MICROBIOME PLASTICITY IN RESPONSE TO DIFFERENT DIETS AND HOST PHYSIOLOGY**

The human GM is a complex dynamic system with the potential for multistability. Indeed, Faith et al. (2013) found that on average 40% of the microbial strains harbored in an adult's intestine were variable over a 5-year sampling period. In a mutualistic context, the GM makes sudden jumps between different steady states under the pressure of environmental and endogenous factors, such as diet, age, host genetics, and physiological state (Candela et al., 2012).

The most rapid observable response of the gut microbial community is the reaction to daily dietary changes. In a study comparing high-fat low-fiber and low-fat high-fiber diets, changes in the microbiome composition were detected within 24 h of controlled feeding, confirming that the human gut ecosystem can respond efficiently and rapidly to external variables (Walker et al., 2011). However, as reported by Wu et al. (2011), short- and long-term dietary interventions affect the GM composition differently. Indeed, some bacterial groups are more likely to be influenced only by short-term dietary intervention, while others, namely those referred to as human enterotypes (Arumugam et al., 2011), are affected only by long-term intervention.

The correlation between nutrients and the GM composition was investigated in a caloric restriction study carried out in 18 lean subjects over a 4-day period. The outcomes showed that nutritional compounds, such as proteins and fibers, affect the phylogenetic and functional structure of the gut microbial community (Muegge et al., 2011). The connection between the GM phylogenetic profile and the ingestion of a specific nutrient, namely fermentable carbohydrate, was also observed in a recent study conducted in 14 overweight men over a 3-week intervention period (Walker et al., 2011).

It has recently been demonstrated that animal-based and plant-based diets deeply affect the GM (David et al., 2014). Both diets were administered for 5 days to 10 young US adults, and the microbial community composition, metabolic products, and gene expression were analyzed. Interestingly, switching to the plant- or animal-based diet resulted in marked microbiota changes only 1 day after the diet modification. In particular, the plant-based diet was associated with the presence of fibrolytic SCFA producers such as *Roseburia*, *Eubacterium rectale*, and *Faecalibacterium prausnitzii*, while the animal-based diet resulted in an increase of potentially putrefactive microorganisms, such as *Bacteroides* and the bile-tolerant *Bilophila wadsworthia* and *Alistipes*. It was observed that the animal-based diet had a greater impact on the GM structural and functional layout than the plant-based diet. Lower levels of metabolic products resulting from the fermentation of carbohydrates and greater levels of products resulting from the fermentation of amino acids were reported in individuals on the animal-based diet. Finally, the animal-based diet was associated with an increased expression of genes involved in the biosynthesis of vitamins and of genes involved in the metabolism of products resulting from the consumption of meat (David et al., 2014). Interestingly, besides dietary substrates, the GM also relies on host-derived glycans secreted in the mucus as a nutrient source in the gut (Kashyap et al., 2013). Indeed, genetically dictated changes in host mucus glycan composition, such as the presence or absence of terminal fucose residues, have been shown to significantly affect the GM structure and function. This provides a global view in which the diet and the host genotype interact to modulate the GM configuration.

The ability of the GM to re-program itself in response to different stimuli is necessary to adapt to the metabolic requirements of the host in different physiological states. For instance, pregnancy represents a period of deep physiological changes during which the GM composition adjusts according to the growth of the fetus and the lactation period. According to Koren et al. (2012), pregnancy is characterized by a heightened inflammatory tone, reduced insulin sensitivity, and an increase in body fat. These traits are supported by a pregnancy-associated GM profile, whose main features are the expansion of Proteobacteria and Actinobacteria and a decrease in richness. Interestingly, such modifications in the GM composition persist for 1 month after birth, after which the adult-like microbiota configuration is restored.

While the GM varies in response to virtually any change in environmental and endogenous factors, the GM adaptation to extreme conditions, such as an abnormal dietary sugar and fat intake or chronic inflammation, breaks the microbiota-host mutualistic homeostasis, lowering the ecosystem diversity and overcoming the resilience of the microbiota-host symbiosis. The microbiota observed in obese people represents an appropriate example of an unbalanced GM configuration driven by the Westernized diet and lifestyle (Ridaura et al., 2013). The functional annotation of the obese-type gut microbiome revealed a decreased functional diversity and an enrichment in genes involved in carbohydrate, lipid, and amino acid metabolism, showing an overall increased fermentative capacity with respect to the lean-type microbiome (Turnbaugh et al., 2009). Moreover, very recently, Schulz et al. (2014) demonstrated that a high-fat diet mediates shifts in the GM composition that promote intestinal carcinogenesis by compromising the Paneth-cell-mediated antimicrobial host defenses. On the other hand, inflammatory bowel disease (IBD) is a paradigmatic model to elucidate the self-sustained inflammatory loop that is established in the gut as a result of an inflammation-induced microbiota dysbiosis. In particular, inflammation forces the GM to change toward a pro-inflammatory, pathobiont-enriched profile, which consolidates the host inflammatory tone (Round and Mazmanian, 2009).

Therefore, the role of the GM as a plastic factor in response to environmental or endogenous stress is essential for the maintenance of the mutualistic relationship with the host. However, under specific circumstances the microbial community can be forced to shift to a disease-associated configuration, breaking the homeostatic balance.
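Several of the observations above are phrased in terms of richness and diversity. As a minimal sketch, assuming richness is counted as the number of taxa observed and diversity is summarized with the Shannon index (one common choice; the cited studies may use other indices), the two quantities can be computed from a vector of taxon counts. The example counts are invented for illustration.

```python
import math


def richness(counts):
    """Number of taxa with at least one observation."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no observations")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical toy samples: an even community and one dominated by a single taxon.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

print(richness(even_sample), round(shannon_index(even_sample), 3))
print(richness(skewed_sample), round(shannon_index(skewed_sample), 3))
```

Both toy samples have the same richness, but the skewed one has a lower Shannon index, which illustrates why a loss of diversity can occur even when no taxon disappears entirely.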

## **MICROBIOTA ADAPTATION TO DIFFERENT AGES**

The human GM describes an evolutionary trajectory along the course of human life. The GM ecosystem changes its structural and functional layout from early infancy to old age, providing the host with ecosystem services finely calibrated for each stage of life (Yatsunenko et al., 2012). For instance, the peculiar GM composition during infancy exerts specific functions for the infant biology, supporting the immune system education, brain development, and host nutrition (Candela et al., 2013). At weaning, the GM gains diversity and develops new physiological functions, in order to fulfill the adult age-related requirements, such as the need to extract energy from the variable array of complex polysaccharides characterizing the adult diet.

The individual microbial layout begins to form during delivery (Jost et al., 2012). We are born sterile and environmental microbes immediately colonize us (Palmer et al., 2007). Just a few hours after birth, the infant's GIT is first colonized by facultative anaerobic bacteria, i.e., enterobacteria, staphylococci, and streptococci. Over time, the decreased amount of available oxygen allows strictly anaerobic bacteria to settle in the intestine, modifying the intestinal environment (Vael and Desager, 2009). In particular, Jost et al. (2012) analyzed the bacterial composition in feces from seven healthy, vaginally delivered, breast-fed neonates at different times after birth. They observed that, during the first days of life, anaerobes, i.e., *Bifidobacterium* and *Bacteroides*, outnumbered facultative anaerobes in all seven neonates, indicating that anaerobes may become dominant early in life and that the switch from facultative to strict anaerobes may occur at a very early stage. The infant-type microbiota is thus characterized by the dominance of *Bifidobacterium* and the presence of *Staphylococcus*, *Streptococcus*, and *Enterobacteriaceae* as other major components. With a relatively low degree of diversity, the infant-type GM is capable of tremendous fluctuations over time, with an individual-specific temporal pattern of variation in species composition (Palmer et al., 2007; Jost et al., 2012). The delivery mode is one of the factors that most influence the early infant microbiota composition (Dominguez-Bello et al., 2010). Indeed, the authors observed that vaginally delivered infants acquired bacterial communities resembling their own mother's vaginal microbiota, while infants delivered by cesarean section transiently harbored bacterial communities similar to those found on the mothers' skin surface.

Largely dominated by *Bifidobacterium* and *Enterobacteriaceae*, with an extraordinary rate of variation over time, the infant-type microbiota is functionally structured to educate the infant immune system through an intense, yet controlled, immunological dialog. Centanni et al. (2013) demonstrated that in infants the phylogenetic structure of the enterocyte-associated GM fraction was unaffected by the host inflammatory stimulus, probably because the GM of infants is specifically shaped to cope with the dynamic and intense cross-talk with the host immune system that is necessary for immune education. Recently, it has been shown that the diversity of the GM composition in infants is more important than the prevalence of specific bacterial taxa in determining the risk of immunological diseases later in life, i.e., allergic disease and asthma (Bisgaard et al., 2011; Abrahamsson et al., 2013). Furthermore, the infant GM also responds to precise developmental and nutritional needs crucial for the infant, such as the development and functionality of the central nervous system (Sudo et al., 2004; Collins et al., 2012), as well as specific vitamin requirements (Yatsunenko et al., 2012). Recently, fascinating hypotheses extending the GM-dependent immune and metabolic programming to the perinatal period have been advanced (Rautava et al., 2012). However, until confirmed by robust experimental findings, such hypotheses need to be taken with caution (Hanage, 2014).

The infant-type GM is subject to profound fluctuations until weaning when, with the introduction of solid food, it shifts toward the adult-type microbiota, with the progressive acquisition of taxonomic and functional complexity, such as a wide array of carbohydrate-active enzymes. This shift results in a profound change in the GM composition, which goes from a bifidobacteria-enriched community to one dominated by *Firmicutes* and *Bacteroidetes*, increasingly resembling the microbiota of an adult, characterized by increased functionality and stability (Koenig et al., 2011). This adult-type microbiota is functionally structured to metabolize the whole complexity of the plant polysaccharides contained in the adult diet and provides mutual benefits to the host (Vanhoutte et al., 2004). Indeed, the microbiota takes advantage of a warm and nutrient-rich environment in which it can settle, while the host benefits from an easy-fitting metabolic equipment that can provide essential factors and increase the host's digestive capacity (Lozupone et al., 2012). A strong selection toward a readily changeable individual microbiome profile has been shown (Candela et al., 2012). This is the consequence of the inherent degree of plasticity of this bacterial ecosystem in adults, which allows the GM to change in response to environmental and endogenous factors, and of the uniqueness of our physiology, lifestyle, and history (Costello et al., 2012). The result is a peculiar temporal dynamics of the individual GM, which always provides an adaptive response to ensure ecosystem services in the face of personalized physiology, immune system, environmental or dietary exposure, and lifestyle (Candela et al., 2013).

With aging and the onset of pathophysiological conditions (e.g., colon cancer, IBD, obesity, type 2 diabetes, and cardiovascular diseases) the GM-host mutualistic relationship progressively becomes compromised (Biagi et al., 2013). In elderly people diet and lifestyle undergo profound variations that include alterations of taste and smell, of gastrointestinal motility, and of mastication, resulting in a nutritionally imbalanced diet (Biagi et al., 2010, 2013; Claesson et al., 2011; Drago et al., 2012). These age-related modifications, together with immunosenescence, affect the phylogenetic and functional structure of the gut ecosystem, leading to a microbial composition that favors the bloom of pathobionts (*Enterobacteriaceae*) to the detriment of immunomodulatory groups (*Clostridium* clusters IV and XIVa, *Bifidobacterium*). This age-associated configuration, together with the "inflammaging" process, could contribute to the creation of a self-sustained pro-inflammatory loop that is detrimental to host health (Franceschi et al., 2000a,b; Grignolio et al., 2014). Interestingly, the GM of the elderly displays a restricted stability and extreme variability. Recently, a functional description of the aged GM was reported (Rampelli et al., 2013). Using Illumina shotgun sequencing, fecal samples from three centenarians were analyzed and a shift from a saccharolytic to a putrefactive metabolism was reported. Indeed, an increase in the proteolytic potential, a reduction of genes involved in the metabolism of carbohydrates, and a reduction of genes involved in SCFA production were observed. These modifications are in agreement with the age-related enrichment of genes belonging to pathobionts, and the authors hypothesized the existence of a pro-inflammatory loop in which pathobionts actively promote the worsening of health status with age. They also speculated that in centenarians some readjustments could occur to counteract the detrimental effects of pathobiont accumulation.

## **MICROBIOTA ADAPTATION DURING DIETARY SHIFT IN HUMAN EVOLUTION**

Bacteria are part of the evolutionary history of complex organisms and they occupy every ecological niche of our planet. The human GM is the biggest stable symbiont of our body (Costello et al., 2009) and it is characterized by a long adaptive history.

When modern humans moved out of Africa, they had to face different environmental challenges (such as food availability, climate changes, and pathogen loads). The main change in the host-microbiota symbiosis likely occurred almost 10,000 years ago, during the *Neolithic revolution*, also called the "agricultural revolution" (De Filippo et al., 2010; Ottaviani et al., 2011). This revolution consisted of the transition from hunting and gathering to agriculture and permanent settlements. In this period, agriculture and animal husbandry led to changes in human lifestyle and shaped modern human genomes. Given its high plasticity, the GM is able to change its composition and to adapt according to diet and food availability, and the advent of agricultural societies could have favored microbial communities able to ferment complex substrates such as polysaccharides (Hehemann et al., 2010). However, to date, little is known about how the GM has changed during human evolution. One of the most constraining aspects of this research field is the lack of a suitable fossil record. Indeed, the study of changes in the GM in human history is complicated by the difficulty of finding well-preserved fecal or intestinal samples from different periods (Walter and Ley, 2011). Nevertheless, researchers are developing methods to overcome this limitation. In a very recent paper, Sistiaga et al. (2014) applied gas chromatography-mass spectrometry to Neanderthal fecal matter to evaluate sterol and stanol levels. The authors provide the first evidence that, even though Neanderthals predominantly consumed meat, they also had a remarkable plant intake, and they suggest the presence of a specific GM involved in cholesterol metabolism throughout human evolution. On the other hand, a glimpse of the ancestral human GM configuration could be provided by the GM of close primate relatives (Ley et al., 2008b; Moeller et al., 2012).
Interestingly, the GM of modern humans clusters with that of other omnivorous primates, regardless of their affiliation to *Pan* (Ley et al., 2008b). This supports the key role of dietary habits in shaping the composition of the GIT microbial ecosystem.

The dynamics of the GM-host co-evolution and environmental adaptation can be addressed by investigating the GM variability in modern human populations of different cultures (Candela et al., 2012). Indeed, the study of the GM from large healthy human populations of different age and socio-economic, geographic, and cultural settings allows researchers to identify the contribution of these environmental components to the GM variation. In this context, a very recent paper explored the GM of the Hadza of Tanzania, a modern population of hunter-gatherers that still lives much as Paleolithic humans did (Schnorr et al., 2014). This study elucidated the mechanisms of human-GM co-evolution and provided a first map of the microbiota composition of the Hadza, which reflects the functional adaptation to a foraging lifestyle. For instance, the high bacterial diversity and the enrichment in fibrolytic microorganisms (e.g., xylan-degrading *Prevotella* and *Treponema*) characteristic of the Hadza GM represent ecosystem adaptations to provide SCFAs from their heavily plant-based diet. Furthermore, the Hadza show a sex-related divergence in the GM composition reflecting the sexual division of labor and sex differences in diet composition. In particular, the higher relative abundance of *Treponema* found in Hadza women could provide specific functions to deal with their higher intake of tubers and plant foods. In fact, women selectively forage for tubers and plant foods and spend a lot of time in camp, while men are highly mobile foragers and range far from the central camp site to obtain meat and honey. Even though foods are brought back to the camp and shared, men and women tend to consume more of their targeted foods.
Finally, the absence of *Bifidobacterium* and a corresponding enrichment of potential opportunists such as Proteobacteria and Spirochaetes in the Hadza GM probably correspond to a different tolerogenic layout of their immune system, redefining the notion of what we consider a healthy and an unhealthy GM structure. Indeed, the Hadza have relatively low rates of infectious diseases, metabolic diseases, and nutritional deficiencies in comparison with other groups settled in Northern Tanzania (Bennett et al., 1973; Work et al., 1973; Blurton Jones et al., 1992). Moreover, De Filippo et al. (2010) compared the GM of children living in rural Africa with that of European children, and many differences emerged. Children from Boulpon Rural Village in Burkina Faso have a traditional rural African diet that is rich in starch, fibers, and plant polysaccharides and low in fat and animal protein, while European children follow a Western diet. The authors argued that the consumption of sugar, animal fat, and calorie-dense foods in industrialized countries is rapidly limiting the adaptive potential of the microbiota by reducing its richness and functionality. Interestingly, the authors reported that only the GM of African children contains *Prevotella*, *Xylanibacter*, and *Treponema*, which are involved in cellulose and xylan hydrolysis. It was speculated that the high fiber intake characterizing the African diet could change the GM composition to maximize the metabolic energy extraction from ingested plant polysaccharides (De Filippo et al., 2010). Finally, the same approach was applied to fecal samples from 531 children and adults from the Amazonas of Venezuela, rural Malawi, and US metropolitan areas, including parents, siblings, and twins (Yatsunenko et al., 2012). The phylogenetic composition of the GM of these three populations is different, especially for US residents vs. non-US residents (Malawians and Amerindians).

Furthermore, the authors confirmed the importance of *Prevotella* as a discriminatory taxon that distinguishes non-US from US individuals. A meta-analysis of the GM composition in Western populations (i.e., US and Italian citizens), rural Malawi and Burkina Faso populations, and Hadza hunter-gatherers has been carried out (Schnorr et al., 2014). The data allowed the reconstruction of the putative trajectory of GM adaptive evolution that accompanied human beings along the transition from Paleolithic hunter-gatherers to Neolithic rural communities and on to modern Western societies. The diagram reported in **Figure 1** shows the emergence of specific co-abundance groups (CAGs), i.e., groups of microorganisms that correlate and cluster together, along with the most important transition phases in our recent evolutionary history, such as the higher abundance of the *Ruminococcaceae* unclassified CAG that distinguishes the Hadza hunter-gatherers, the emergence of the *Clostridiales* unclassified and *Prevotella* CAGs in rural Malawi and Burkina Faso populations, and the dominance of the *Faecalibacterium* CAG in Western populations (Schnorr et al., 2014).
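Co-abundance groups are built from pairwise correlations between genera; the legend of Figure 1 specifies positive and significant Kendall correlations at a false discovery rate below 0.05. The following is a minimal sketch of that filtering step only, not the authors' pipeline. It assumes Kendall tau-a on tie-free data, a large-sample normal approximation for the p-value, and Benjamini-Hochberg adjustment; the function names and abundance values are hypothetical, and the subsequent clustering of genera into CAGs is not shown.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a for two equal-length sequences (assumes no ties)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def kendall_p_value(tau, n):
    """Two-sided p-value from the normal approximation (assumption)."""
    z = 3 * tau * math.sqrt(n * (n - 1)) / math.sqrt(2 * (2 * n + 5))
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def coabundance_edges(abundance, fdr=0.05):
    """Keep genus pairs with positive tau and adjusted p-value below fdr."""
    pairs = list(combinations(sorted(abundance), 2))
    taus = [kendall_tau(abundance[a], abundance[b]) for a, b in pairs]
    n = len(next(iter(abundance.values())))
    q_values = benjamini_hochberg([kendall_p_value(t, n) for t in taus])
    return [(a, b, t) for (a, b), t, q in zip(pairs, taus, q_values)
            if t > 0 and q < fdr]


# Hypothetical relative abundances across 10 samples (invented values).
toy = {
    "Prevotella": [0.05, 0.08, 0.11, 0.15, 0.18, 0.22, 0.25, 0.29, 0.33, 0.36],
    "Treponema": [0.01, 0.02, 0.03, 0.05, 0.06, 0.08, 0.09, 0.11, 0.12, 0.14],
    "Bifidobacterium": [0.30, 0.27, 0.24, 0.21, 0.18, 0.15, 0.12, 0.09, 0.06, 0.03],
}
edges = coabundance_edges(toy)
print(edges)
```

In a network plot such as the one in Figure 1, each retained pair would become a connection between two genus nodes. Real abundance tables usually contain ties and zeros, for which a tie-aware statistic (tau-b) and an exact or permutation-based p-value would be more appropriate.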

A striking example of gene acquisition by a gut microbe as an adaptation to the local diet is described by Hehemann et al. (2010) in a study of the Japanese population. By comparing the GM of Japanese and North American populations, the authors reported that the GM of Japanese subjects is enriched in genes (probably acquired through contact with marine microbes) that encode enzymes capable of degrading porphyran, which is contained in seaweeds. In Japanese culture, the greatest source of porphyran is nori (edible seaweed), which is commonly used in the preparation of sushi and has been the most produced and consumed seaweed in Japan for centuries.

**FIGURE 1 | Timescale of the intestinal microbiota evolution: from foraging to Western lifestyle, crossing the Neolithic revolution.** The trajectory of the gut microbiome structure of modern populations with different lifestyles mimics the evolution of the relationship between microbes and the human host. Each network plot is a Wiggum plot, published in Schnorr et al. (2014), which indicates patterns of variation of six identified co-abundance groups (CAGs) in the Hadza (orange; Schnorr et al., 2014), Malawi (red; Yatsunenko et al., 2012), Burkina Faso (brown; De Filippo et al., 2010), Italians (blue; Schnorr et al., 2014), US people (green; Yatsunenko et al., 2012), and Italian children (cyan; De Filippo et al., 2010). CAGs are named after their most abundant genera and are color coded as follows: *Faecalibacterium* (cyan), *Dialister* (green), *Prevotella* (orange), *Clostridiales*\_unclassified (yellow), *Ruminococcaceae*\_unclassified (pink), and *Blautia* (violet). Each node represents a bacterial genus and its size is proportional to the mean relative abundance within the population. Connections between nodes represent positive and significant Kendall correlations between genera (false discovery rate <0.05). The central path indicates the transition from the hunter-gatherer (orange) to the Western microbiome (blue), crossing the rural African configuration (red).

## **CONCLUSION**

Human beings have faced tremendous changes in lifestyle and dietary habits over the course of their recent evolutionary history. In about 10,000 years they passed from Paleolithic hunter-gatherer societies to Neolithic farming communities and on to modern Westernized societies, adapting to dramatic changes in diet and lifestyle in a relatively short evolutionary time frame.

Human beings have recently been reconsidered as superorganisms as a result of their close mutualistic relationship with the GM. Recent longitudinal studies have highlighted an adaptive role for the GM in human biology, allowing the superorganism to optimize its metabolic performance in response to diet, lifestyle, and physiological changes such as aging. This raises the question of whether this adaptive GM potential had a role in our recent evolutionary history, allowing adaptation to profound lifestyle changes and describing a real GM-host evolutionary trajectory.

The first and very recent description of the Hadza GM structure has shed important light in this direction. The peculiar structural and functional configuration of the Hadza gut microbial ecosystem suggests that adaptive functional changes of the GM accompanied the evolutionary trajectory of human beings, allowing the optimization of the superorganism's performance in response to the profound changes that characterized our recent evolutionary history.

However, several circumstances characteristic of the Western world are challenging the resilience of the GM-host mutualistic interaction. The high-fat, high-sugar, energy-dense diet, sanitization, and antibiotic usage, all hallmarks of Western societies, are forcing GM adaptive changes to deviate from a mutualistic configuration. This raises the need to better understand the dynamics involved in this process and the variables that must be controlled to preserve the extraordinary mutualistic relationship we have evolved with our microbial counterpart.

## **ACKNOWLEDGMENTS**

We thank the Italian Ministry of University and Research (Project PRIN 2012 to Donata Luiselli) and the European Union's Seventh Framework Programme (FP7/2007-2011) under grant agreement n. 259679 (IDEAL) and ICT-2011-9, n. 600803 (MISSION-T2D).

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 April 2014; paper pending published: 30 April 2014; accepted: 17 October 2014; published online: 04 November 2014.*

*Citation: Quercia S, Candela M, Giuliani C, Turroni S, Luiselli D, Rampelli S, Brigidi P, Franceschi C, Bacalini MG, Garagnani P and Pirazzini C (2014) From lifetime to evolution: timescales of human gut microbiota adaptation. Front. Microbiol. 5:587. doi: 10.3389/fmicb.2014.00587*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Quercia, Candela, Giuliani, Turroni, Luiselli, Rampelli, Brigidi, Franceschi, Bacalini, Garagnani and Pirazzini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 22 September 2014 doi: 10.3389/fmicb.2014.00494

## Diet and the development of the human intestinal microbiome

## *Noah Voreades, Anne Kozil and Tiffany L. Weir\**

*Department of Food Science and Human Nutrition, Colorado State University, Fort Collins, CO, USA*

### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Carl James Yeoman, Montana State University, USA Franck Carbonero, University of Arkansas, USA*

#### *\*Correspondence:*

*Tiffany L. Weir, Department of Food Science and Human Nutrition, Colorado State University, 1571 Campus Delivery, 210 Gifford Building, Fort Collins, CO 80523-1571, USA e-mail: tiffany.weir@colostate.edu*

The important role of the gut microbiome in maintaining human health has necessitated a better understanding of the temporal dynamics of intestinal microbial communities as well as the host and environmental factors driving these dynamics. Genetics, mode of birth, infant feeding patterns, antibiotic usage, sanitary living conditions, and long-term dietary habits contribute to shaping the composition of the gut microbiome. This review focuses primarily on diet, as it is one of the most pivotal factors in the development of the human gut microbiome from infancy to old age. The infant gut microbiota is characterized by a high degree of instability, only reaching a state similar to that of adults by 2–3 years of age, consistent with the establishment of a varied solid food diet. The diet-related factors influencing the development of the infant gut microbiome include whether the child is breast- or formula-fed as well as how and when solid foods are introduced. In contrast to the infant gut, the adult gut microbiome is resilient to large shifts in community structure. Several studies have shown that dietary changes induce transient fluctuations in the adult microbiome, sometimes in as little as 24 h; however, the microbial community rapidly returns to its stable state. Current knowledge of how long-term dietary habits shape the gut microbiome is limited by the lack of long-term feeding studies coupled with temporal gut microbiota characterization. However, long-term weight loss has been shown to alter the ratio of Bacteroidetes to Firmicutes, the two major bacterial phyla residing in the human gastrointestinal tract. With aging, diet-related factors such as malnutrition are associated with microbiome shifts, although the cause-and-effect relationship between these factors has not been established. Pharmaceutical usage is also more prevalent in the elderly and can contribute to reduced gut microbiota stability and diversity.
Foods containing prebiotic oligosaccharide components that nurture beneficial commensals in the gut community and probiotic supplements are being explored as interventions to manipulate the gut microbiome, potentially improving health status.

**Keywords: enterotype, gut microbiome, aging, dietary patterns, colonization**

## **IMPORTANCE OF THE GUT MICROBIOME**

The consortium of single-celled organisms residing in our intestines, the gut microbiome, is rapidly emerging as an important determinant of health. Deterrents to proper bacterial colonization in early life are hypothesized to contribute to food sensitivities, allergic reactions, Type I diabetes, and other autoimmune disorders (Kelly et al., 2007). Association of the microbiome to autoimmune diseases has been explained by the "hygiene hypothesis," which suggests that the absence of a robust microbiome results in defects in development and regulation of the immune system, resulting in a lack of immune tolerance (Okada et al., 2010; Rook, 2012). Later in life, strong evidence supports an important role for intestinal microbiota in weight regulation via contributions to dietary energy harvest and appetite control (Tilg and Kaser, 2011). The gut microbiome has also been implicated in the pathology of several intestinal inflammatory diseases as well as in the development of colorectal, gastric, and prostate cancers and cardiometabolic disorders (Sekirov et al., 2010). Mechanisms giving rise to these conditions include the production of genotoxins by bacterial pathogens, microbial metabolism of dietary components to produce carcinogenic compounds, and inciting local and systemic inflammatory cascades that result in chronic low grade inflammation and damage to affected tissues and organs.

While a dysbiotic microbiota can cause disease, a healthy microbial community is vital to assist the host in maintaining optimal wellness. Thus, there is a need to understand the factors that shape and alter the microbiome throughout the lifespan of an individual. Numerous elements, encompassing environmental exposures, genetics, and other inherent host factors, contribute to the initial colonization of the microbiome in infants and to the subtle shifts that occur in adults, occasionally culminating in microbial decline as observed in frail and unhealthy elderly individuals (Koenig et al., 2011). However, none of these factors may be as important in the development of the microbiome as diet. In this review we will present evidence for the importance of diet in initial colonization events and in determining the composition of a stable adult microbiome. Factors such as malnutrition and pharmaceutical interventions on the aging gut will also be reviewed. Finally, we will discuss potential interventions, including dietary changes that can be used to alter the intestinal microbial community.

## **EARLY MICROBIAL COLONIZATION AND ESTABLISHMENT**

The infant gut is thought to be sterile at birth, although some new research characterizing the placental microbiome challenges that assumption (Aagaard et al., 2014). After birth, initial colonization and early establishment of the infant gut microbiome are influenced by whether delivery was vaginal or caesarean, feeding patterns, sanitary conditions, and antibiotic administration (Marques et al., 2010). The relative importance of these factors on the long-term structure of the intestinal microbial community and associated health outcomes is still debated. Given the constant exposure of the microbiome to food components, it stands to reason that diet is one of the primary drivers shaping the changes that occur during infancy and the structure of the adult microbiome that eventually establishes. This section will focus on the diet's role in shaping the infant gut microbiome from birth to ∼3 years of age. Specifically, the following topics will be explored in detail: (1) the influence of breast vs. formula-feeding in initial colonization, (2) changes related to the beginning of weaning and the introduction of solid foods, and (3) factors contributing to a stable gut microbiome profile (**Figure 1**).

#### **BREAST vs. FORMULA FEEDING**

Following birth, the infant gut microbiome is characterized by low-species diversity and high rates of bacterial flux until ∼2 or 3 years old (Bergström et al., 2014). Facultative anaerobic bacteria including *Staphylococcus, Streptococcus, Escherichia coli* and *Enterobacteria* are thought to be the first colonizers of the gut. Their purpose is to consume oxygen and create an environment for obligate anaerobes to thrive (Palmer et al., 2007; Jost et al., 2012). These are later replaced by facultative anaerobes that dominate the gastrointestinal tract, primarily Actinobacteria and Firmicutes (Turroni et al., 2012). This change in dominant taxa representation can be attributed to the introduction of breast or formula-feeding, signifying the first diet-related colonization event in the infant gut microbiome (Harmsen et al., 2000; Jost et al., 2012). In breast-fed infants, the dominant Actinobacteria are represented by *Bifidobacterium* species, specifically *B. breve*, *B. longum*, *B. dentium*, *B. infantis*, and *B. pseudocatenulatum* (Harmsen et al., 2000; Jost et al., 2012). The Firmicutes phylum is represented principally by lactic acid bacteria such as *Lactobacillus* and *Enterococcus* as well as *Clostridium* species (Turroni et al., 2012; Bergström et al., 2014). More than 700 species of bacteria have now been identified in human colostrum and breast milk, including multiple species of lactic acid bacteria as well as species typically colonizing the oral cavity of infants (Cabrera-Rubio et al., 2012). While this may contribute to the intestinal community of breastfed infants, it is still unclear whether the composition of species in breast milk is driven by transfer from infant to mother. The chemical composition of breast milk does influence the gut microbiome through supplying unique oligosaccharides that are selectively utilized by *Bifidobacterium spp.* (Turroni et al., 2012).

There are conflicting reports regarding differences in the relative abundance of these bacteria between breast and formula-fed infants. Many studies have reported that formula-fed infants display dominance of *Bifidobacterium spp.* similar to what has been observed in breastfed infants (Harmsen et al., 2000; Fallani et al., 2010, 2011). However, another study reported approximately double the count of *Bifidobacterium* in breastfed infants compared to those fed formula (Bezirtzoglou et al., 2011). Formula feeding was also associated with higher levels of *Atopobium* (Bezirtzoglou et al., 2011), which corroborated reports by Fallani et al. (2010), although they only noted *Atopobium* increases in formula-fed infants delivered by Caesarean section or whose mothers had been administered antibiotics. Higher numbers of *Bacteroides spp*. as well as members of the Enterobacteriaceae have also been reported in formula-fed infants (Harmsen et al., 2000; Fallani et al., 2010). Despite significant evidence that *Bifidobacterium* is an important early colonizer in neonates, Palmer et al. (2007) reported that *Bifidobacterium* was not present in significant amounts in the infant gut. However, it is important to highlight that within their cohort, there was a mixture of breast and formula-feeding, antibiotics were provided to infants, and a small subset required specialized hospitalization.

The variability reported with regard to *Bifidobacterium* abundance could be driven by differences in infant formula composition. Formulas supplemented with the prebiotics galacto-oligosaccharide (GOS) and fructo-oligosaccharide (FOS) may account for high levels of *Bifidobacterium* found in many formula-fed infants (Marques et al., 2010; Oozeer et al., 2013). A recent review discusses evidence supporting GOS and FOS supplementation effects on the gut (Oozeer et al., 2013). Gut microbial populations of infants provided with either human breast milk or prebiotic-supplemented infant formula had similar levels of *Bifidobacterium*, whereas gut microbial populations of infants given traditional formula were reported to have about 20% fewer *Bifidobacterium* (Knol et al., 2005). Additionally, the species composition of *Bifidobacterium* was similar between infants given human breast milk and those on prebiotic-supplemented formula. However, traditional formula-fed infants had markedly different gut microbial communities and even the specific *Bifidobacterium* species differed, with higher relative abundances of *B. catenulatum* and *B. adolescentis,* which are typically represented in adult populations. Another potential explanation for the variation in studies reporting bacterial abundances, particularly with regard to breast-feeding, could be differences in the maternal diet (Cabrera-Rubio et al., 2012). Characterization of the placental microbiome suggests that it is colonized by the mother's oral microbiome (Aagaard et al., 2014). Another recent study showed that pre- and post-natal maternal consumption of a high fat diet, independent of obesity in the mother, resulted in dysbiosis of the infant gut in a primate model (Ma et al., 2014). Together, these studies suggest that maternal diet may play a significant but previously unrecognized role in determining early colonization and establishment of the infant microbiome. Randomized trials in which the maternal diet is controlled, or large-scale cross-sectional studies of pregnant mothers adhering to different diets (Western, vegetarian, gluten-free, etc.), are necessary to further develop this hypothesis.

#### **WEANING AND THE SHIFT TOWARD AN ADULT MICROBIOME**

Around the age of 1–2 years old, the infant gut microbiome undergoes its second shift and the stable adult microbiome begins to emerge, further supporting the significant role of the diet in influencing the microbial community (De Filippo et al., 2010; Bergström et al., 2014). One study reported that although there were differences in the microbiome pre- and post-weaning, the impacts of earlier colonization events (delivery mode, formula or breastfed, etc.) were still apparent (Fallani et al., 2011). Another study comparing Italian vs. African children's gut microbiomes showed that after weaning, once solid foods were introduced, there was a significant diet-related shift in the gut microbiome profiles. Prior to the introduction of their respective Western or African diets, the children across both populations that were still breast-feeding clustered together and had similar *Bifidobacterium* species dominance. Only children who were already weaned reliably clustered together into distinct geographic groupings. This study reinforced two important points related to dietary drivers of the gut microbiome development in children. First, breast-feeding, regardless of duration, supports a specific bacterial state that is unique and markedly different from that observed in individuals consuming solid foods. Second, once solid foods are introduced, their role in shaping long-term gut microbiome profiles is so strong that individuals cluster based on diet type over other environmental and physiological factors (De Filippo et al., 2010).

A similarly significant shift was reported by Bergström et al. (2014) in a 3 year Danish study with a cohort of 330 infants. They reported that between 9 and 18 months, the infant gut bacterial abundances changed drastically with the introduction of solid foods. Specifically, *Bacteroidetes-*related species and various species within the Firmicutes phylum increased, whereas *Bifidobacterium* and *Lactobacillus* species and Enterobacteriaceae declined. This bacterial taxa shift is logical given that breast and/or formula-feeding has ceased, depleting the primary fuel source for these bacteria. In addition, butyrate producing bacteria such as *Clostridium leptum* group, *E. hallii*, and *Roseburia* species increased. Typically, butyrate producing bacteria are responsible for the breakdown of otherwise indigestible complex plant polysaccharides and resistant starches. Anecdotally, this study found that the longer infants were breast and/or formula-fed, the lower their levels of butyrate producing bacteria. Additionally, more and different species begin to appear with introduction of solid foods (Koenig et al., 2011; Bergström et al., 2014).

#### **EMERGENCE OF A STABLE GUT PROFILE**

From 18 to 36 months, the infant gut microbiome undergoes its final significant shift to a more stable microbial profile composed primarily of the bacterial phyla Bacteroidetes and Firmicutes. This shift represents a temporal change that can be attributed to the continued influence of a varied solid food diet (De Filippo et al., 2010; Koenig et al., 2011; Bergström et al., 2014). The earlier that solid food is introduced into the diet, the more quickly the gut microbiome begins to resemble a stable adult-like microbiome (Bergström et al., 2014). The specific proportion of Firmicutes and Bacteroidetes is strongly influenced by diet. This was best demonstrated in the previously discussed work by De Filippo et al. (2010) where the distinct microbial signatures of the two groups of children were indicative of their respective dietary habits. The most compelling evidence for this was the dominance of *Prevotella*, capable of digesting complex plant polysaccharides, in African children and its absence in Italian children. Similar diet-driven influences were reported in a detailed temporal study of a single infant. This study demonstrated that introduction of peas, formula, and other solid foods led to an emerging co-dominance between Firmicutes and Bacteroidetes, with the increase in Bacteroidetes potentially resulting from requirements for the breakdown of newly introduced plant polysaccharides (Koenig et al., 2011). The previously mentioned emergence of a stable gut microbiome can be substantially derailed if the infant experiences either severe acute malnutrition or moderate acute malnutrition. Emerging research is demonstrating that either of these malnutrition states has the potential to significantly alter the development of a healthy gut microbiome profile, regardless of diet-based interventions (Subramanian et al., 2014). These recent findings not only support a link between diet and the development of a particular gut microbiota and microbiome, but also illustrate that nutrient quantity can impact development.

## **THE ADULT MICROBIOME**

The typical adult intestinal microbiome is primarily comprised of approximately six or seven different bacterial phyla, of which Bacteroidetes and Firmicutes dominate (Eckburg et al., 2005). Less abundant phyla can include Proteobacteria, Verrucomicrobiota, Actinobacteria, and *Euryarchaeota*. A recent study followed changes in the microbiome of 37 adults for up to 5 years and reported that ∼60–70% of the bacterial strains present remained unchanged over the course of the study and that the most stable members of the microbiome tended to be the most abundant (Faith et al., 2013). They also observed that at the phylum level, Bacteroidetes and Actinobacteria populations were less susceptible to perturbations whereas Firmicutes and Proteobacteria were significantly less stable. These results are fairly consistent with findings from an earlier study that used a microarray-based approach to determine molecular taxonomy and followed a smaller cohort over a longer period of time (Rajilic-Stojanovic et al., 2013). Both studies reported that the taxa present in an individual remain fairly consistent over time, although the relative abundances of these taxa were subject to change. However, data from Rajilic-Stojanovic et al. (2013) suggest that larger fluctuations occur between samples taken at longer intervals while Faith et al. (2013) report the opposite trend, with larger fluctuations occurring in samples taken over shorter periods of time compared to those that are temporally farther apart. Despite this resilience, there is evidence that diet shapes the relative abundance of dominant phyla and that populations of specific bacterial groups are influenced by the composition of macronutrients consumed.
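As an illustration of what "strains remaining unchanged" means in quantitative terms, the following minimal sketch computes the fraction of baseline strains still detected at a later time point. The function name `fraction_retained` and the strain labels are hypothetical and chosen only for this example; the sketch is not the analysis pipeline used in the cited studies.

```python
def fraction_retained(baseline, followup):
    """Fraction of strains seen at baseline that are still detected at follow-up."""
    baseline = set(baseline)
    if not baseline:
        raise ValueError("baseline sample contains no strains")
    return len(baseline & set(followup)) / len(baseline)


# Hypothetical strain labels for one individual sampled at two time points.
baseline_strains = {f"strain_{i}" for i in range(10)}          # strain_0 .. strain_9
followup_strains = {f"strain_{i}" for i in range(3, 12)}       # strain_3 .. strain_11

retained = fraction_retained(baseline_strains, followup_strains)
print(f"{retained:.0%} of baseline strains retained")
```

In this toy example seven of the ten baseline strains persist, which falls within the ∼60–70% range described in the text; newly acquired strains do not affect the value because the measure is anchored to the baseline sample.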

#### **DIET-DRIVEN ENTEROTYPES**

There have been numerous attempts to identify a "core" microbiota, usually defined as bacterial taxa that are shared between 95% of individuals tested (Huse et al., 2012). Identification of a core microbiome is important for defining a "normal" healthy state from which major variations may indicate a dysbiotic system that can result from or contribute to disease development. One barrier to defining an intestinal core microbiome has been the vast degree of variation between individuals. The microbial communities identified in samples collected from an individual over time are more similar to each other than microbial communities between two individuals, although related persons share more bacterial strains than unrelated individuals (Palmer et al., 2007; Yatsunenko et al., 2012; Faith et al., 2013). Although a consensus for what constitutes a core gut microbiome has been elusive, one report suggested that an international cohort of 39 individuals could be assigned to one of three distinct clusters or "enterotypes" based on metagenomic sequences (Arumugam et al., 2011). They found that each cluster was dominated by a particular bacterial genus (*Bacteroides*, *Prevotella*, and *Ruminococcus*) with positive or negative associations with a number of other genera in the community. They also reported that each cluster was enriched for specific gene functions that reflected different microbial trophic chains. Two of the three original enterotypes, *Bacteroides* and *Prevotella*, were later confirmed and long term dietary patterns were identified as the primary predictor of an individual's enterotype (Wu et al., 2011). The *Bacteroides* enterotype was associated with a Western-type diet high in proteins and fat, while the *Prevotella* enterotype was associated with plant fiber consumption. These enterotypes appear to be extremely stable, and several studies utilizing short-term interventions failed to result in a change in the assigned enterotype of participants (David et al., 2014; Roager et al., 2014).
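The prevalence-based definition of a core microbiota given above (taxa shared between 95% of individuals) can be sketched in a few lines. The function name `core_taxa`, the individual identifiers, and the presence/absence data below are all invented for illustration; real studies additionally need to decide on detection thresholds and taxonomic level, which this sketch ignores.

```python
def core_taxa(samples, prevalence=0.95):
    """Return taxa detected in at least `prevalence` of individuals.

    `samples` maps an individual ID to the collection of taxa detected
    in that individual's sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("no samples provided")
    counts = {}
    for taxa in samples.values():
        for taxon in set(taxa):  # count each taxon once per individual
            counts[taxon] = counts.get(taxon, 0) + 1
    return sorted(t for t, c in counts.items() if c / n >= prevalence)


# Hypothetical presence/absence data for 20 individuals.
samples = {}
for i in range(20):
    taxa = {"Bacteroides"}                 # present in all 20
    if i != 0:
        taxa.add("Faecalibacterium")       # present in 19 of 20
    if i < 8:
        taxa.add("Prevotella")             # present in 8 of 20
    samples[f"subject_{i}"] = taxa

print(core_taxa(samples))
```

Lowering the `prevalence` argument enlarges the core set, which is one reason why reported core microbiota differ between studies that adopt different cutoffs.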

The existence of enterotypes provided a convenient way of classifying individuals based on their fecal microbiota (although some argue a more appropriate term would be "faecotype") and speculation has begun as to whether enterotypes can be used as a predictor of long term health risks. However, a microbial survey of several body sites, including stool, conducted with more than 200 individuals showed only minimal segregation into the *Bacteroides* and *Prevotella* enterotypes rather than the distinct and well separated clusters previously reported (Huse et al., 2012). These discrepancies could be due to the fact that the method for assigning enterotypes is not consistent across studies. An analysis of archived 16S sequences also showed that enterotype determination is sensitive to the clustering methods and distance metrics used and that there is a continuum of *Bacteroides* abundances across samples rather than a bimodal distribution (Koren et al., 2013). These studies suggest that the enterotype concept may not be as clear cut as previously believed, and that standard methods for defining enterotypes should be developed and employed before they can be meaningfully tied to clinical outcomes.
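To make the point about distance metrics concrete, the sketch below computes two widely used between-sample distances, Bray-Curtis dissimilarity and the Jensen-Shannon distance, on the same pair of genus-level profiles. The function names and the relative abundance values are assumptions made for this example and are not taken from any of the cited studies; the Jensen-Shannon function assumes each profile is expressed as relative abundances summing to 1.

```python
import math


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (dicts)."""
    taxa = set(p) | set(q)
    num = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    den = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return num / den


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2).

    Assumes p and q hold relative abundances that each sum to 1.
    """
    taxa = set(p) | set(q)
    pf = {t: p.get(t, 0.0) for t in taxa}
    qf = {t: q.get(t, 0.0) for t in taxa}
    m = {t: 0.5 * (pf[t] + qf[t]) for t in taxa}

    def kl(a):
        # Kullback-Leibler divergence of a from the midpoint profile m.
        return sum(a[t] * math.log2(a[t] / m[t]) for t in taxa if a[t] > 0)

    return math.sqrt(max(0.5 * kl(pf) + 0.5 * kl(qf), 0.0))


# Hypothetical genus-level relative abundances for two stool samples.
sample_a = {"Bacteroides": 0.60, "Prevotella": 0.05, "Ruminococcus": 0.35}
sample_b = {"Bacteroides": 0.10, "Prevotella": 0.65, "Ruminococcus": 0.25}

print("Bray-Curtis:", round(bray_curtis(sample_a, sample_b), 3))
print("Jensen-Shannon:", round(jensen_shannon_distance(sample_a, sample_b), 3))
```

Because the two metrics weight abundant and rare taxa differently, a clustering algorithm fed one distance matrix need not return the same groups as when fed the other, which is consistent with the sensitivity described above.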

#### **LONG TERM DIETARY PATTERNS AND THE MICROBIOME**

Whether enterotypes truly exist or not, it is clear that diet is an important factor in shaping the microbiome (**Figure 2**). In addition to the divergence in microbial composition of Italian children and those from Burkina Faso shortly after weaning (De Filippo et al., 2010), other studies have shown microbiota segregation of individuals from Malawi, Venezuela, and the United States (Yatsunenko et al., 2012), children from Bangladesh and the United States (Lin et al., 2013), and rural Africans and African Americans (Ou et al., 2013) that is at least partially diet-driven. In the Yatsunenko et al. (2012) study, metagenomic sequences revealed that enzyme classifications associated with protein degradation and bile salt metabolism were enriched in samples from the U.S. population where protein and fat consumption is high. Conversely, glutamate synthase and starch degrading enzymes were more abundant in the Amerindian and Malawian samples, consistent with protein poor diets of corn and cassava. This has been further demonstrated in a recent study of the diversity and metabolism of the microbiome of a Tanzanian hunter gatherer tribe, the Hadza. This study identified differences in the microbiome between the sexes which were consistent with their division of labor with regard to foraging (Schnorr et al., 2014). The Hadza also harbor many bacterial species associated with fermentation of plant-based fibers and are completely deficient in *Bifidobacterium*, which was hypothesized to result from the lack of meat and dairy in the diet, substrates that allow these bacteria to continue to colonize Westerners into adulthood. Although comparative studies between populations with different diets have been useful in identifying how dietary patterns shape the microbiome, these studies have utilized international cohorts that introduce confounding factors such as extreme differences in culture and environment. Relatively few studies have been conducted that examine the effects of diet on homogeneous populations. One study looked at correlations between specific dietary components and microbial function and structure in the intestines of a human cohort known for keeping meticulous diet logs (Muegge et al., 2011). They found that there were significant correlations between microbial gene function (KEGG orthologs) and protein intake, confirming the difference that was seen across multiple *mammalian* species between carnivores and herbivores. They also reported a correlation between insoluble fiber consumption and bacterial community membership. A large-scale microbiome sequencing effort called the American Gut Project is currently underway and is attempting to address the effects of diet on the adult microbiome by capturing extremes within the American diet (i.e., vegan, paleo, etc.) where cultural and environmental factors will be minimized.

## **DIETARY INTERVENTIONS INTRODUCE TRANSIENT AND SUBTLE CHANGES IN THE MICROBIOME**

Short-term dietary interventions that include introducing novel food components or altering macronutrient levels have also been examined for their effects on intestinal microbial populations. The first of these studies followed obese individuals partitioned to restricted calorie diet groups that controlled for either fat or carbohydrate intake (Ley et al., 2006). Regardless of the macronutrient composition of the diet, individuals that lost a significant amount of body weight had a change in their ratio of Bacteroidetes to Firmicutes, driven by increases in the Bacteroidetes. Weight loss-driven changes in the microbiome were recently confirmed in individuals consuming a calorie restricted liquid diet where it was demonstrated that weight stability of an individual was a better predictor of fecal microbiome stability than time between sample collections (Faith et al., 2013). However, this and another study (Duncan et al., 2008) noted changes in members of the Firmicutes rather than an increase in Bacteroidetes when corresponding weight loss occurred. Calorie restriction in obese and overweight individuals has also been shown to increase microbial gene richness, a parameter that was correlated to improved metabolic parameters (Cotillard et al., 2013; Le Chatelier et al., 2013).
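The Bacteroidetes to Firmicutes ratio discussed above is a simple quotient of phylum-level abundances, and the following minimal sketch shows how a shift in that ratio could be computed from read counts. The function name `phylum_ratio` and the counts are hypothetical and are not data from the cited weight loss studies.

```python
def phylum_ratio(counts, numerator="Bacteroidetes", denominator="Firmicutes"):
    """Ratio of two phylum-level abundances from a dict of read counts."""
    den = counts.get(denominator, 0)
    if den == 0:
        raise ValueError(f"no reads assigned to {denominator}")
    return counts.get(numerator, 0) / den


# Hypothetical phylum-level read counts for one individual.
before_diet = {"Bacteroidetes": 300, "Firmicutes": 9000, "Other": 700}
after_diet = {"Bacteroidetes": 1500, "Firmicutes": 8000, "Other": 500}

ratio_before = phylum_ratio(before_diet)
ratio_after = phylum_ratio(after_diet)
print(f"B:F ratio before = {ratio_before:.3f}, after = {ratio_after:.3f}")
```

Note that the same increase in the ratio can arise from a gain in Bacteroidetes, a loss of Firmicutes, or both, so the ratio alone does not distinguish between the patterns reported by the different studies.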

Several studies have noted rapid but transient changes in fecal microbial composition immediately following the start of a dietary intervention study. Wu et al. (2011) conducted a controlled feeding experiment in ten individuals randomized to high fat/low fiber or high fiber/low fat diets and found that although there was no increase in community similarity between individuals on the same diet over a period of 10 days, the first 24 h period was considered an outlier because transient dramatic shifts occurred in the fecal communities of all individuals. Switching between animal and plant-based diets produced similar results (David et al., 2014). Another interesting finding of the David et al. (2014) study was that foodborne microbes transiently colonized the gut, introducing the idea that food may not only select for commensal bacterial species, but also serve as a reservoir for new microbial introductions. Intentional introduction of food-borne microorganisms (probiotics) as well as prebiotic food ingredients and foods high in fiber can also be a means of subtly changing the relative abundance of bacterial species in the gut (Preidis and Versalovic, 2009). Thus, despite the inherent stability of the microbiome over time, changes related to weight loss and diet composition continue to subtly alter the composition and relative abundance of our commensal organisms, driving the development of our gut microbiome throughout adulthood.

## **THE AGING GUT**

As a person ages, the stability and diversity of their gut microbiota decline with the state of their health. If health remains intact, however, microbiota composition often retains the stability and compositional make-up of a healthy younger adult (Claesson et al., 2012). The most prevalent age-related factors influencing the microbial population of the gut are: (1) physiological changes, (2) dietary choices and malnutrition, (3) living situation (community-dwelling, hospitalized, or long-term care), and (4) use of antibiotics (Bartosch et al., 2004; Woodmansey, 2007; Claesson et al., 2012) and other prescription drugs (Qato et al., 2008). This section will explore dietary alterations and antibiotic usage as drivers of change in the elderly gut microbiome and discuss the use of probiotics and prebiotics as potential solutions for the restoration of a healthy gut.

Diet is a major influence on the bacterial makeup of the aging gut. Physiological changes, such as loss of taste and smell, difficulty chewing or swallowing, impaired digestive function, and lack of physical mobility can leave elderly individuals consuming a narrow and nutritionally imbalanced diet, setting the stage for malnutrition (Bartosch et al., 2004; Claesson et al., 2012). Relocation from an in-home community setting to a long-term care facility can change dietary intake as well. The move often contributes to a greater consumption of fat and a decreased intake of fiber, fruits, vegetables, and meat. These dietary alterations are associated with a decrease in microbial diversity and increased frailty (Claesson et al., 2012).

The use of antibiotics in elderly populations is especially prevalent in hospital and long-term care facilities. Antibiotics create an environment of instability by diminishing the population of total and commensal bacteria and opening the door for pathogenic bacteria to overpopulate (Claesson et al., 2011). The use of broad-spectrum antibiotics is associated with the overgrowth of *Clostridium difficile*, which flourishes in the antibiotic-weakened gut, often resulting in a life threatening infection (Macfarlane, 2014). As health issues compound and antibiotic use increases, elderly individuals often see a decline in commensal anaerobes (*Bacteroides*, *Lactobacillus*, and *Bifidobacterium*) accompanied by a rise in proteolytic and pathogenic bacteria (*Fusobacteria*, *Propionibacteria*, *Clostridia*, and *E. coli*; Wu et al., 2011). Studies indicate that probiotics may have potential as a therapeutic tool to replenish and recolonize beneficial bacterial species like *Bifidobacterium* and *Lactobacillus*, bringing the elderly gut back into balance (Likotrafiti et al., 2014).

#### **EFFECTS OF DIET AND MALNUTRITION ON THE ELDERLY MICROBIOME**

A number of proposed factors contribute to alterations in the elderly gut ecosystem and diet is a significant driver of change (Claesson et al., 2012). Dietary intake can change for a number of reasons with advanced age. Decline in physical mobility may limit access to the grocery store or inhibit the ability to cook. Some elderly lose the desire to eat due to loss of smell and taste or due to slow digestion and prolonged satiety (Britton and McLaughlin, 2013). Malnutrition is often an unintended consequence of age-related physiological changes that can lead to changes in the elderly gut microbiome. Furthermore, studies have shown that compositional dietary changes can result in almost immediate alterations in microbial populations. Wu et al. (2011) found that changes in microbiome composition were detectable within 24 h of dietary alteration and occurred even faster than transit time of food through the gut. In an infant population, malnutrition was shown to delay the maturation of the intestinal microbiota (Subramanian et al., 2014), and it is likely to have consequences of a similar magnitude in the elderly gut.

Dietary changes that come with age are also impacted by living situation. Claesson et al. (2012) found distinct dietary differences between elderly individuals living in a traditional community setting compared to those in long-term care facilities. Community-dwellers may be healthier than their institutionalized counterparts for a number of reasons, but the authors broadly stated that community-dwellers eat a healthier and more diverse diet and have a distinct microbiota from those in long-term care facilities (Claesson et al., 2012). The largest dietary differences were seen in consumption of fruits, vegetables, and meat. Community-dwellers correlated 98% with a moderate fat/high fiber diet and long-term care dwellers correlated 83% with a high fat/low fiber diet (Claesson et al., 2012). The gut microbiota of community-dwellers was more diverse than that of long-stay subjects and grouped more closely with that of healthy young adults, indicating that age itself is not the driving factor of microbial change. Similar to young adults, community-dwellers had a higher proportion of phylum Firmicutes and unclassified bacteria, and abundant populations of genera *Coprococcus*, *Roseburia*, *Ruminococcus*, and *Butyricoccus* when compared to long-stay individuals. Long-stay subjects had a higher incidence of frailty accompanied by a proportional increase in Bacteroidetes and an increased abundance of *Alistipes* and *Oscillibacter* when compared to healthier community-dwelling elderly (Claesson et al., 2012). Increasingly frail individuals showed a significant 26-fold reduction in the number of *Lactobacillus* and a significant sevenfold increase in the number of *Enterobacteriaceae* compared to less frail subjects (van Tongeren et al., 2005).
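Statements that one microbiota is "more diverse" than another are usually backed by a diversity index. As a minimal sketch, the code below computes the Shannon index, one common choice, for two invented genus-level count tables. The function name `shannon_index` and all counts are assumptions for illustration only; they are not data from Claesson et al. (2012), and the cited study may have relied on other diversity measures.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Invented genus-level read counts (illustration only).
community_dweller = {"Coprococcus": 25, "Roseburia": 25,
                     "Ruminococcus": 25, "Butyricoccus": 25}
long_stay = {"Alistipes": 85, "Oscillibacter": 10, "Roseburia": 5}

print("community-dweller H:", round(shannon_index(community_dweller), 3))
print("long-stay H:", round(shannon_index(long_stay), 3))
```

The index rises both when more taxa are present and when their abundances are more even, so a community dominated by a single genus scores lower than an evenly distributed one.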

#### **ANTIBIOTICS**

The compounded effects of poor diet, ailing health, and prolonged stays in a hospital or long-term care facility reduce the prevalence of protective gut microbiota and give way to detrimental populations (Bartosch et al., 2004; Wu et al., 2011). This leaves the elderly individual vulnerable to infection and disease and a prime candidate for antibiotic usage. Unfortunately, antibiotic therapies only exacerbate the flux and instability of the already fragile gut microbiome in unhealthy elderly. The use of antibiotics in elderly populations is especially prevalent in hospital and long-term care facilities and it is estimated that nearly 20% of elderly patients in hospitals are receiving antibiotic treatment at any given time (Bartosch et al., 2004).

Antibiotics cause significant disturbances in gut microbiota resulting in the suppression of both beneficial and pathogenic species, allowing the overgrowth of antibiotic-resistant strains. In young, healthy volunteers administered two separate courses of the antibiotic ciprofloxacin, a dramatic change in the microbiota was noted, followed by the return to an alternative stable state of undetermined consequences (Dethlefsen and Relman, 2011). Use of broad-spectrum antibiotics is associated with overgrowth of the opportunistic bacterium *Clostridium difficile*, which flourishes in the antibiotic-weakened gut and results in severe diarrhea (Macfarlane, 2014). Elderly hospital patients and others with fragile immune systems are especially susceptible to this life-threatening infection.

Most often, elderly individuals exposed to antibiotics see an increased relative abundance of Bacteroidetes and a significant increase in Bacteroidetes:Firmicutes ratio (Claesson et al., 2011). Beneficial anaerobic species in the colon such as *Bifidobacterium*, *Lactobacillus*, and *Bacteroides* can be drastically reduced or even eradicated with the use of antibiotics (Bartosch et al., 2004). *Bifidobacterium* and *Lactobacillus* are producers of short chain fatty acids (SCFAs), nutrients vital to the proper function of intestinal cells; the loss of these bacteria can be especially detrimental. A study examining the differences in bacterial colonies between healthy elderly, hospitalized patients, and hospitalized patients receiving antibiotics found that the hospitalized patients receiving antibiotics saw a significant reduction in the numbers of *Bifidobacterium spp*. and an increased relative abundance of *Enterococcus faecalis* compared to the other two groups. In some patients, the antibiotic treatment eliminated certain bacterial communities altogether (Bartosch et al., 2004).

Effects of antibiotic treatment on gut microbiota can differ significantly with the type and dose of antibiotic administered. A study by Bartosch et al. (2004) following elderly patients receiving antibiotics, found that the same antibiotic, clarithromycin, had different effects on gut microbiota at different doses. A low dose of the antibiotic decreased the proportion of Bacteroidetes (*Bacteroides* and *Parabacteroides*) and increased Firmicutes (*Alistipes*) and a high dose increased the proportion of Bacteroidetes (*Parabacteroides*) and decreased the proportion of Firmicutes (*Alistipes*; Claesson et al., 2011). Countless variables must be considered with the use of antibiotics in elderly individuals. What seems like a lifesaving drug may have detrimental effects on the aging microbiome and the health of the individual. Additional research is needed to inform practitioners on the safest ways to use antibiotics on the elderly while supporting their potentially fragile gut microbiota.

## **PROBIOTICS AND PREBIOTICS**

Probiotics and prebiotics, when taken together or individually, may be particularly beneficial in restoring the proper microbial balance to the elderly gut microbiota, helping to mitigate the detrimental effects of antibiotic usage and undernutrition. Probiotics are live microbes that, when administered in sufficient quantities, are beneficial to the host. Prebiotics are non-digestible food ingredients such as inulin or various oligosaccharides, which have been shown to selectively stimulate growth of beneficial bacterial populations in the large intestine. Probiotic foods and supplements often contain *Bifidobacterium* and/or *Lactobacillus* organisms, both of which are extremely important to proper function of the intestine (Duncan and Flint, 2013). *Bifidobacterium* and *Lactobacillus* are often depleted in elderly individuals as health deteriorates. Research shows that consumption of probiotics containing these strains can result in a notable rise in their abundance along with a reduction of more pathogenic microorganisms in the gut (Toward et al., 2012). Prebiotics may support the *Bifidobacterium* and *Lactobacillus* species delivered via probiotic supplementation by providing a fermentable food source for these bacteria, allowing them to flourish. More specifically, it has been reported that prebiotics have the ability to exert a bifidogenic effect on human subjects (O'Connor et al., 2014).

A recent *in vitro* study showed promise that the elderly gut microbiota can in fact be modulated with appropriate probiotics. Species of *Bifidobacterium* and *Lactobacillus* along with two prebiotics were added to the fecal batch culture of elderly participants. The addition of the beneficial bacteria significantly increased the *Bifidobacterium* and decreased the *Bacteroides* count after fermentation (Likotrafiti et al., 2014). Both probiotic/prebiotic combinations added to the culture increased the *Bifidobacterium* and *Lactobacillus* count in the vessel representing the distal colon. These results represented a major shift in the gut microbiota toward a healthier colon (Likotrafiti et al., 2014). However, prebiotics alone have also been shown to improve the health and alter the gut microbial composition of elderly populations. A study providing inulin supplementation to an elderly cohort increased *Bifidobacterium* levels (Guigoz et al., 2002). Multiple studies using either fructo-oligosaccharides or galacto-oligosaccharides (GOSs) demonstrated both bifidogenic effects and beneficial immune-related effects. Specific immune-related effects included a reduction in pro-inflammatory cytokines and an increase in the anti-inflammatory cytokine IL-10.

While probiotic supplementation has become a widely utilized tool to positively impact health by assisting with digestion, bolstering intestinal barrier function and coordinating with the body to regulate both the innate and specific immune responses, the mechanisms by which probiotics exert these beneficial effects are poorly understood (Siciliano and Mazzeo, 2012). Proteomic-based probiotic research is beginning to inform both researchers and industry that adaptation and adherence properties specific to probiotic strains influence their ability to colonize the host (Siciliano and Mazzeo, 2012; van de Guchte et al., 2012). Additionally, these adaptation and adherence mechanisms have been reported to be potentially strain specific, making it difficult to apply them globally to all probiotic bacterial strains (Siciliano and Mazzeo, 2012).

Experimentation on the effects of probiotics and prebiotics on the elderly gut microbiome is still limited, but results of the available research lend merit to the notion that beneficial bacteria in the form of probiotics and the indigestible fibers of prebiotics have the potential to help restore stability, increase diversity and beneficially alter the immune system in the aging gut (Vulevic et al., 2008). However, these beneficial effects must be placed in perspective given the lack of mutually agreed-upon selection criteria, evaluation methodologies and a clear mechanistic model. With the reduced cost of sequencing and continued proteomic research, researchers will hopefully be able to speak with increased certainty as to the reasons probiotics can be beneficial to the human host.

## **CONCLUSION**

The microbes that reside in our gastrointestinal tract comprise a dynamic community that changes throughout the lifespan of an individual. The early years of infancy and childhood are characterized by a microbial state that has been described as chaotic because of the rapid and dramatic fluctuations observed. While the microbiota of small children begins to resemble that of adults at a very early age, there is a paucity of studies examining temporal microbial community shifts in children beyond infancy, so the stability of their microbiota is not known. Once stable dietary patterns are established, the microbiota of adults remains relatively unaltered; however, significant weight changes have been associated with a higher degree of microbial instability. Finally, factors related to aging, including increased use of pharmaceuticals and changes in diet, likely play an important role in shaping the microbial communities residing in the elderly. Changes in physical activity and hormone levels may also be important determinants of the elderly microbiome, but they have not yet been investigated in sufficient depth. Some evidence suggests that the microbial communities of healthy elderly individuals are similar to those of younger adults, but whether the health of the individual contributes to microbial stability or vice versa is not known. Current data suggest that diet is an important driver in the development of the gut microbiome and could serve as a means of therapeutic intervention for prevention of diseases. Studies linking the composition and function of the gut microbiome and disease development certainly highlight the need for a better understanding of temporal microbiome dynamics and their predictors.

### **ACKNOWLEDGMENTS**

The authors would like to acknowledge support from NIH R21CA161472, the Colorado Agricultural Experiment Station, and Colorado State University Libraries Open Access Research and Scholarship Fund.

## **REFERENCES**


the human gut microbiome. *Nature* 505, 559–563. doi: 10.1038/nature12820


children from Bangladesh and the United States. *PLoS ONE* 8:e53838. doi: 10.1371/journal.pone.0053838


Marques, T. M., Wall, R., Ross, R. P., Fitzgerald, G. F., Ryan, C. A., and Stanton, C. (2010). Programming infant gut microbiota: influence of dietary and environmental factors. *Curr. Opin. Biotechnol.* 21, 149–156. doi: 10.1016/j.copbio.2010.03.020


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 July 2014; accepted: 02 September 2014; published online: 22 September 2014.*

*Citation: Voreades N, Kozil A and Weir TL (2014) Diet and the development of the human intestinal microbiome. Front. Microbiol. 5:494. doi: 10.3389/fmicb.2014.00494 This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Voreades, Kozil and Weir. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Aging and the human gut microbiota—from correlation to causality

#### *Sitaraman Saraswati<sup>1</sup> and Ramakrishnan Sitaraman<sup>2</sup>\**

*<sup>1</sup> Department of Biochemistry, Dayananda Sagar College of Arts, Science, and Commerce, Bangalore, India*

*<sup>2</sup> Department of Biotechnology, TERI University, New Delhi, India*

*\*Correspondence: minraj@gmail.com*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Suleyman Yildirim, Istanbul Medipol University, Turkey; Carl James Yeoman, Montana State University, USA; Ranjit Kumar, University of Birmingham at Alabama, USA*

**Keywords: gut microbiota, microbiome, ecological succession, co-evolution, inter-species/kingdom signaling, elderly, aging, effector molecules**

## **INTRODUCTION**

The human gastrointestinal (GI) tract harbors the largest number and concentration of microbes found in the human body. Perturbations in the gut microbial ecosystem have also been associated with conditions as diverse as chronic GI diseases (e.g., Crohn's disease, ulcerative colitis), metabolic disorders (e.g., diabetes types 1 and 2, obesity) and antibiotic use (for a review see Sekirov et al., 2010). Metagenomic culture-independent methods have enabled the unraveling of the complexity of the gut microbiota (Rajilić-Stojanović et al., 2009). Given the considerable inter-individual diversity in the actual composition of the microbiota, significant collaborative attempts have been made to systematize the available knowledge (Arumugam et al., 2011; Human Microbiome Project Consortium, 2012) and identify "core" microbiota that are conserved among humans to facilitate meaningful comparisons (Huse et al., 2012). Changes in the microbial composition also take place with age, with a high degree of variability at the two extremes of infancy and old age, punctuated by comparative stability during adulthood (for reviews, see Woodmansey, 2007; O'Toole and Claesson, 2010). Given that increases in life expectancy will likely result in an increase in the elderly population worldwide, analysis of the contribution of the microbiota to healthy aging assumes greater significance (for a recent review, see Tiihonen et al., 2010).

Age-related spatio-temporal variations in the microbiota are best viewed within an ecological-evolutionary framework (see review by Costello et al., 2012). Diet is a major, controllable environmental factor influencing the composition of the host microbiome, with the high-fat, sugar-rich Western diet contributing to a *Bacteroides*-dominant microbiome and a high-fiber diet to one dominated by Firmicutes, and with a strong correlation between long-term diet and enterotypes (Wu et al., 2011). In terms of ecological succession, the *Bifidobacterium*-dominated microbiota of the infant changes over time into the Bacteroidetes- and Firmicutes-dominated microbiota of the adult (Ottman et al., 2012), remaining fairly stable through adulthood in the absence of perturbations like long-term dietary changes or repeated antibiotic intervention. Pathogens may then be viewed as invasive species in the ecological sense, constantly testing the resilience of the native ecosystem, resulting in their elimination, low-level persistence (enabling future opportunism), or establishment causing disease.

## **AGE- AND ENVIRONMENT-RELATED CHANGES IN THE GUT MICROBIOTA**

The most noticeable feature in the microbiota of elderly individuals is an alteration in the relative proportions of the Firmicutes and the Bacteroidetes, with the elderly having a higher proportion of Bacteroidetes while young adults have higher proportions of Firmicutes (Mariat et al., 2009). Significant decreases in bifidobacteria, *Bacteroides*, and *Clostridium* cluster IV have also been reported (Zwielehner et al., 2009). Variability among individuals ranges from 3% to 92% for Bacteroidetes and between 7% and 94% for Firmicutes. The microbiota of individual subjects, however, exhibits less temporal variability (Claesson et al., 2011).
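To make the notion of "relative proportions" concrete, the following is a minimal Python sketch of how phylum-level relative abundances and a Bacteroidetes:Firmicutes ratio could be derived from read counts. The function names and the counts are hypothetical illustrations, not data or methods from the studies cited above.

```python
def relative_abundance(counts):
    """Convert raw per-taxon counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {taxon: n / total for taxon, n in counts.items()}


def phylum_ratio(counts, numerator="Bacteroidetes", denominator="Firmicutes"):
    """Ratio of two phyla, computed from their relative abundances."""
    props = relative_abundance(counts)
    if props.get(denominator, 0) == 0:
        raise ZeroDivisionError(f"no counts recorded for {denominator}")
    return props.get(numerator, 0) / props[denominator]


# Hypothetical phylum-level read counts (illustrative only).
young_adult = {"Firmicutes": 600, "Bacteroidetes": 300, "Other": 100}
elderly = {"Firmicutes": 350, "Bacteroidetes": 550, "Other": 100}

print(phylum_ratio(young_adult))  # ratio below 1: Firmicutes dominate
print(phylum_ratio(elderly))      # ratio above 1: Bacteroidetes dominate
```

Because the ratio is built from proportions, it is insensitive to sequencing depth, although it can still be inflated or deflated by changes in the phyla grouped under "Other".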

Changes occurring in the microbiota during aging can have an impact on host health. van Tongeren et al. (2005) studied the relationship between microbial diversity and frailty scores in elderly individuals. A significant reduction in the proportion of lactobacilli, *Bacteroides/Prevotella* and *Faecalibacterium prausnitzii*, and an increase in the proportion of *Ruminococcus*, *Atopobium*, and Enterobacteriaceae was seen in individuals with high frailty scores. Recently, Claesson et al. (2012) studied the relationship between diet, host health, environment, and the gut microbiota. Specifically, an association was observed between microbial diversity and the functional independence measure (FIM), the Barthel index (used to evaluate performance in daily routine activities) and nutrition. Decreased microbial diversity correlated with increased frailty, decreased diet diversity and health parameters, and with increased levels of inflammatory markers. Individuals living in a community had the most diverse microbiota and were healthier as compared to those in short- or long-term residential care. Bartosch et al. (2004) reported that hospitalization itself appeared to result in a decreased abundance of the *Bacteroides-Prevotella* group. Later studies by Claesson et al. (2012) further detailed the effects of residence location on gut microbial diversity. Residence location also affects the microbiota of patients on antibiotic treatment, with highest levels of bifidobacteria in the community-dwelling group and lowest in those in long-term residential care. The levels of *Lactobacillus* in the antibiotic-untreated group were higher in rehabilitation (hospital stay *<* 6 weeks) as compared to long-stay or community-dwellers.
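"Microbial diversity" can be quantified in several ways, and the text above does not specify which measure each study used. As one common choice, the Shannon index $H = -\sum_i p_i \ln p_i$ (where $p_i$ is the proportion of taxon $i$) rises with both the number of taxa and the evenness of their abundances. The sketch below is an illustrative assumption with hypothetical counts, not the method of the cited studies.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Taxa with zero counts contribute nothing and are skipped to avoid log(0).
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


# Hypothetical genus-level counts (illustrative only).
even_community = [25, 25, 25, 25]    # four equally abundant taxa
skewed_community = [85, 5, 5, 5]     # one dominant taxon

print(shannon_index(even_community))    # equals ln(4) for four even taxa
print(shannon_index(skewed_community))  # lower: dominance reduces diversity
```

For a fixed number of taxa, the index is maximal when all taxa are equally abundant, so a community dominated by a single genus scores lower even if no taxon has been lost.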

Predictably, antibiotic treatment has been reported to affect both richness and diversity of the microbiota and is associated with decreases in bifidobacteria (Bartosch et al., 2004; Woodmansey et al., 2004; O'Sullivan et al., 2013) as well as the *Bacteroides-Prevotella* group (Bartosch et al., 2004; Woodmansey et al., 2004). Lactobacilli, however, are observed to have increased in antibiotic-treated elderly subjects (Woodmansey et al., 2004; O'Sullivan et al., 2013); similarly, an increase in clostridial diversity has also been reported (Woodmansey et al., 2004). The changes taking place in response to antibiotics are more apparent at the genus rather than the family or phylum levels (O'Sullivan et al., 2013). Treatment with antibiotics can result in *Clostridium difficile* infection in the elderly, manifesting as *C. difficile*-associated diarrhea (CDAD). Reduced species diversity in CDAD patients compared to healthy elderly and young adults accompanied by a large reduction in bifidobacteria, *Bacteroides,* and *Prevotella* has been reported (Hopkins and MacFarlane, 2002). However, an increase in facultative bacteria along with an increase in diversity of clostridial and lactobacilli species in CDAD patients was reported in the same study. A recent study also detected differences at the genus level between *C. difficile* -negative and -positive subjects, and patients with CDAD (Rea et al., 2012). Incidentally, the isolation of the hypervirulent *C. difficile* R027 ribotype from one asymptomatic individual in this study who exhibited greater microbial diversity compared to CDAD patients, serves to highlight the importance of an intact and unperturbed gut ecosystem in resisting colonization by pathogens. Restoration of the microbiota and curing CDAD by fecal microbiota transplantation (FMT) in recent years presents a novel therapeutic strategy that is under intense scrutiny (for a discussion see Vrieze et al., 2013). 
In contrast to antibiotics, the common usage of non-steroidal anti-inflammatory drugs (NSAIDs) does not appear to significantly perturb the microbiota (Tiihonen et al., 2008; Mäkivuokko et al., 2010).

Interestingly, centenarians harbor less diverse microbiota, though Bacteroidetes and Firmicutes still constitute the dominant phyla (Biagi et al., 2010), with enrichment for potentially pathogenic Proteobacteria. Biagi et al. (2010) reported higher levels of *Akkermansia* in the elderly, compared to young adults, in contrast to an earlier study (Collado et al., 2007) that reported a decrease in this genus with age. Subsequent functional microbiome profiling of selected, well-characterized samples from this cohort indicated increased abundance of genes involved in aromatic amino acid metabolism, decreased abundance of those involved in short-chain (≤6) fatty acid production and an enrichment of "pathobionts"—low-abundance microbiota that promote and sustain pro-inflammatory conditions (Rampelli et al., 2013). This supports an earlier finding by Collino et al. (2013) that increased levels of phenylacetylglutamine (PAG) and *p*-cresol-sulfate (PCS), derived from the catabolism of aromatic amino acids, were excreted in the urine of centenarians. Thus, the changes in the gut microbiota of the elderly are reflected in the changes in the microbial metabolism.

## **FROM CORRELATION TO CAUSALITY—SOME GENERAL CONSIDERATIONS**

High-throughput analytical tools and meta-"omics" enable probing of the host-microbiota relationship at high resolution, helping correlate healthy or diseased states with the detailed composition of the microbiota, and informing the use of well-characterized (e.g., probiotic) or largely unknown (e.g., stool transplants) mixtures of microorganisms for restorative or maintenance purposes. However, complicating matters further is the existence of distinct ecological niches all along the alimentary canal, indicating that the common (and convenient) method of fecal sampling for microbiota studies may not adequately reflect the situation *in vivo* (Li et al., 2011). Ideally, we would like to determine the identity of the molecules that mediate host-microbiota interactions, and how their deployment is regulated. Here, information about host-pathogen interactions and general microbiology offers insights into the range of intra- and inter-species interactions, and even inter-kingdom ones (**Table 1**). However, given our current inability to convincingly delineate the contextually most significant effector mechanisms involved in the host-microbiota interaction over a lifetime, it is difficult to tease apart causality from correlation. Moreover, the host and the microbiota impact each other reciprocally, and the microbiota themselves interact in many modes among themselves. While current host signals may modulate the microbiota, it is an open question whether these signals themselves were induced, at least in some measure, by components of the microbiota themselves. Theoretically, the host could also modulate the microbiota so that microbial responses are, in turn, beneficial to itself. The landmark study of Claesson et al. (2012) points to the possibility of such a reciprocal (and more confusingly, recursive) relationship between host health and microbial diversity.

From an evolutionary standpoint, we would also like to know how much these interactions and associations are modulated over the host lifetime and during coevolution in order to benefit both partners. Their persistence is also dependent on the forces of selection operative at a given time (Sancar, 2008; Lukeš et al., 2011), such as bacteriophage infection (Reyes et al., 2012; Koskella, 2013). The recent discovery that an unknown secreted protein from human intestinal cells decreases conjugation efficiency in *E. coli* indicates that the host can potentially influence the composition, the rate of evolution and lateral gene transfer among its microbiota (Machado and Sommer, 2014). An unexplored consideration is the potential influence of host hormones and their changing levels over age on the microbiota. Additionally, the relative abundance of a given enterotype or species may also not be an unambiguous predictor of relative importance in the ecological sense. Therefore, identifying keystone species that might have an effect on the ecosystem disproportionate to their abundance would be very valuable for focused studies of the microbiota. We surmise that the benefits arising from probiotic administration could be due to their temporary assumption of such a "keystone" role. Notably, current studies of microbiota concentrate solely on the doubtlessly important bacteria, omitting archaea and clinically relevant eukaryotes (fungi), an approach that could potentially miss less abundant but nevertheless important species. A recent finding on the important role of Dectin-1, a C-type lectin receptor and an innate immune sensor of fungi, in preventing intestinal colitis-associated pathology underlines the importance of interactions between the human host and the numerically less abundant intestinal fungi (Iliev et al., 2012).

**Table 1 | Examples of effector mechanisms involved in inter-species and inter-kingdom interactions.**

Natural selection operates simultaneously at multiple ecological levels, ranging from single unicellular organisms to entire communities and ecosystems. The magnitude and relative importance of multiple, and often stochastic, selection pressures acting over a human lifetime therefore need careful consideration. Thus, it could be inaccurate to ascribe a given microbiota profile solely to a single factor (e.g., diet) even though there may be some correlation between the said profile and a single contributing factor (see review by Yeoman et al., 2011). Additionally, non-equilibrium (co-)evolutionary processes may not necessarily result in optimality. Rather, they are governed by the actual functionality of the given arrangement ("phenotype") and its ability to propagate itself (fitness), not by the details of the arrangement itself (components, genotypes etc.). Microbiota research, in the context of aging or otherwise, will greatly benefit from the integration of several disparate pieces of mechanistic information within an evolutionary-ecological framework in order to determine the causes underlying our observations, and from the formulation of plausible mechanistic models describing how these causes result in the observed effects.

## **ACKNOWLEDGMENTS**

We dedicate this paper to our parents, Mr. G. Sitaraman and Mrs. Indu Bala, for unstintingly supporting our academic endeavors.

## **REFERENCES**


beyond: gut microbiota and inflammatory status in seniors and centenarians. *PLoS ONE* 5:e10667. doi: 10.1371/journal.pone.0010667


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 August 2014; accepted: 15 December 2014; published online: 12 January 2015.*

*Citation: Saraswati S and Sitaraman R (2015) Aging and the human gut microbiota—from correlation to causality. Front. Microbiol. 5:764. doi: 10.3389/fmicb.2014.00764*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2015 Saraswati and Sitaraman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Immune modulation of the brain-gut-microbe axis

#### *Sahar El Aidy<sup>1,2</sup>\*, Timothy G. Dinan<sup>1,3</sup> and John F. Cryan<sup>1,4</sup>*

*<sup>1</sup> Laboratory of Neurogastroenterology, Alimentary Pharmabiotic Centre, University College Cork, Cork, Ireland*

*<sup>2</sup> Department of Industrial Biotechnology, Genetic Engineering and Biotechnology Research Institute, Sadat City University, Sadat City, Egypt*

*<sup>3</sup> Department of Psychiatry, University College Cork, Cork, Ireland*

*<sup>4</sup> Department of Anatomy and Neuroscience, University College Cork, Cork, Ireland*

*\*Correspondence: s.elaidy@ucc.ie*

#### *Edited and reviewed by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

**Keywords: neuropeptides, immune cells, gut microbiota, nervous system, HPA**

Only recently have we fully appreciated that the classically separate domains of neurology, endocrinology, immunology and microbiology, with their various organs (the brain, glands, gut, immune cells and microbiota), could actually be joined to each other in a multidirectional network of communication in order to maintain homeostasis. For example, local and systemic immune activation have profound neural and behavioral effects (Campos-Rodríguez et al., 2013), neuroendocrine hormones regulate immune cytokines, and the nervous system and immune system work in synergy to protect the body from infection (Steinman, 2004). Analogously, the gut microbes greatly impact the immunological, psychological, and overall well-being of the host (Collins and Bercik, 2013; El Aidy and Kleerebezem, 2013a; Wang and Kasper, 2013; Moloney et al., 2014). However, definitive mechanisms that orchestrate a functionally relevant communication within this network, in particular during early-life development, are yet to be elucidated. A potential unifying mechanism very likely involves multifunctional molecules and their receptors, as they are produced by, act upon and move from one system to another, linking the brain, gut, immune system, and microbiota. These messenger molecules include (among others) neurotransmitters, neuropeptides, endocrine hormones, and cytokines.

The maturation of the immune and neurological systems and microbial colonization, both initiated within the fetal period, are dynamic processes that continue through the first months and even years of human life. The mode of delivery, be it vaginal birth or caesarean section, has recently been shown to be critical in determining the pioneer microbial composition of neonates (Dominguez-Bello et al., 2010). Particularly over the first few years of life, as the microbiota develops, there is a greater potential for disruption of the long-term microbial state upon the repeated use of antibiotics (Lemon et al., 2012). Not only does antibiotic use disrupt the microbial community in healthy infants, but it also amplifies the microbial dysbiosis in pediatric patients with Crohn's disease (Gevers et al., 2014). Several reports have shown that dysbiosis alone, as may result from antibiotic treatment, is sufficient to drive intestinal inflammation (Hooper et al., 2012). Additionally, alterations of the microbial composition are often associated with changes in brain development and plasticity and alterations in motor, anxiety and social behavior (Sudo et al., 2004; Diaz-Heijtz et al., 2011; Neufeld et al., 2011; Clarke et al., 2013; Desbonnet et al., 2014). Consequently, abrupt shifts during the infant's unique developmental path through this early unstable phase may have longer term health implications (Costello et al., 2012). Activation of the (innate) immune response during the primary colonization involves the induction of toll-like receptors (TLRs) (Carvalho et al., 2012; El Aidy et al., 2012a, 2013b) and is linked to the stress induced by colonization, which increases gut permeability (Dinan and Cryan, 2012) to the colonizing microbes and their metabolites.
Although crucial in the recognition of gut microbes, immune activation via TLRs appears not to occur promptly at the initial stage of colonization (El Aidy et al., 2013c), suggesting that the very rapid actions of neurotransmitters and hormones might be important to control the priming and migration of cells of the first line of defense. Several pioneer gut colonizers are able to produce neurotransmitters. *Escherichia* and *Streptococcus*, for instance, can produce norepinephrine and serotonin (5-HT), whereas *Lactobacillus* and *Bifidobacterium* can produce GABA and acetylcholine (Roshchina, 2010; Lyte, 2011). Mobile cells of the immune system express receptors for neurotransmitters (Pert et al., 1985). For example, migration of immature dendritic cells (DCs) to lymph nodes is mediated via α1 adrenergic receptors (Maestroni, 2000), emphasizing the early effects of the sympathetic nervous system (SNS) at the start of a local immune response. Norepinephrine, a neurotransmitter of the SNS (among other molecules stored in sympathetic vesicles), has pro-inflammatory effects at low concentration mediated through binding to α2 adrenoceptors and reduction of cAMP levels (Spengler et al., 1990). On the other hand, acetylcholine, the principal vagal neurotransmitter, attenuates the release of pro-inflammatory cytokines [tumor necrosis factor-alpha (TNF-α), interleukin (Il)-1β, Il-6 and Il-18], but not the anti-inflammatory cytokine Il-10, in lipopolysaccharide (LPS)-stimulated human macrophage cultures (Borovikova et al., 2000). Moreover, choline acetyltransferase (ChAT), the key enzyme in the synthesis of acetylcholine, is expressed by B cells, DCs, and macrophages in the mucosal-associated lymphoid tissue (MALT). Reardon et al. reported that ChAT expression begins after microbial colonization, following birth, and requires MyD88-dependent signaling derived from the intestinal microbiota (Reardon et al., 2013).
Monocytes such as macrophages and other leukocytes travel through the blood and, when they come within scenting distance of a given neurotransmitter, they begin to orient chemotactically toward it and then communicate with other immune cells in the adaptive arm, such as B and T lymphocytes, to ensure a well-coordinated immune response.

In parallel, activation of the nervous system by cytokines, for example via afferent vagus nerve fibers, stimulates neuronal anti-inflammatory responses (Sternberg, 1997). Immune cells can produce various neurotransmitters and other factors, which alert the brain to changes occurring in the body and affect the plasticity of the local and central nervous systems, and can thereby regulate mood and behavior. Leukocytes, for example, synthesize and release corticotropin (ACTH) and endorphins in response to bacterial LPS (Harbour-McMenamin et al., 1985). Mounting an innate immune response during pathogenesis, and transiently during primary gut colonization, results in the production of pro-inflammatory cytokines such as Il-1β, TNF-α, and IFN-γ (El Aidy et al., 2013c). The pro-inflammatory cytokine Il-1β, for instance, is able to inhibit the release of norepinephrine from noradrenergic axon terminals in the intestine via the induction of nitric oxide (Rühl et al., 1994; Rühl and Collins, 1997). Of note, norepinephrine, via β adrenergic signaling, inhibits many aspects of the innate and Th1 mediated immune response (Straub et al., 2006), whereas activated macrophages and other innate immune cells produce nerve repellent factors directed toward sympathetic nerves to counteract the sympathetic inhibitory effect (Miller et al., 2004). Importantly, another possible mechanism underlying the reduced levels of norepinephrine during the mounting of innate and Th1 mediated immune responses is decreased activity of L-DOPA decarboxylase, the enzyme that converts L-DOPA to dopamine, the precursor of norepinephrine, as observed in both inflamed and noninflamed colonic mucosa of Crohn's patients (Magro et al., 2002).

One mechanism through which immune activation or immunomodulation may affect physiology and behavior is via actions on serotonergic systems (Lowry et al., 2007). Analogous to the action of gut microbiota during primary colonization, Lowry et al. found that the nonpathogenic *Mycobacterium vaccae* stimulated the peripheral immune system. The Th1 and Treg, but not Th2, mediated immune activation stimulated a specific subset of serotonergic neurons in the dorsal raphe nucleus of mice and increased serotonin metabolism within the ventromedial prefrontal cortex (Lowry et al., 2007). This finding demonstrates that the type of peripheral immune response is important in determining the effects on serotonergic neurons. Additionally, the study of Lowry and co-workers suggests that the immune-responsive subpopulation of serotonergic neurons in the dorsal raphe plays an important role in the regulation of mood in the healthy state. Afferent fibers within the vagus nerve could be involved in transferring signals of peripheral immune activation to the CNS (Maier et al., 1998). The regulating mechanism appears to involve enhancement of c-Fos expression in dorsal raphe nucleus serotonergic neurons (Hollis et al., 2006). It is therefore tempting to speculate that immune activation under physiological conditions stimulates a subset of serotonergic neurons distinct from those activated under pathological conditions or by uncontrollable stressors. Nevertheless, it is unclear whether this association reflects a causal or a reactive response. Immune activation induces symptoms of depression and anxiety in human patients, including those with disorders such as irritable bowel syndrome (IBS), a disorder of the gut-brain axis (Clarke et al., 2009), and in patients receiving treatment with interferon (Felger et al., 2013). Treatment with serotonergic antidepressant drugs prevents the onset of depressive symptoms in such situations (Capuron and Miller, 2004; Capuron et al., 2004).
Moreover, recent data have shown that the TNF antagonist infliximab reduces depression symptoms in a subset of patients with high baseline inflammatory biomarkers (Raison et al., 2013). Of note, local and systemic depletion of the 5-HT precursor tryptophan (Trp) is associated with elevation of the immunomodulatory enzyme indoleamine 2,3-dioxygenase (IDO), which occurs during immune activation (Moffett and Namboodiri, 2003) and transiently during primary gut colonization (El Aidy et al., 2012b, 2013b). Moreover, 5-HT is altered by the gut microbiota, being elevated in conventionally raised mice only when gut colonization occurs at birth (Clarke et al., 2013). Collectively, these findings provide evidence that serotonergic systems may play an important role in the relationship between immune function, gut microbiota, and psychological state.

Local and systemic elevation of proinflammatory cytokines, in particular Il-1β and Il-6, and altered 5-HT levels cause activation of the hypothalamic-pituitary-adrenal (HPA) axis and production of corticotrophin releasing factor (CRF). CRF, which is also elevated in the stress response, disturbs other neuropeptides, leading to changes in mood and behavior (Dinan and Cryan, 2012). CRF stimulates the anterior pituitary gland to release the stress hormone ACTH, which in turn stimulates release of cortisol from the adrenal gland. Cortisol, through a feedback loop, regulates the levels of CRF and ACTH. Importantly, immune cells, through the COX-2 pathway and production of PGE2 during (pro-)inflammation, stimulate the adrenal gland to produce corticosterone, which exclusively supports the β adrenergic pathway through which the SNS performs its anti-inflammatory effect (Straub et al., 2006). This indicates that under normal conditions the neuroendocrine and immune systems coordinate to ensure maintenance of homeostasis. Indeed, patients with IBS show inadequately low concentrations of the anti-inflammatory steroid hormone cortisol (Straub et al., 2002).

In conclusion, we assume that the intimate communication among the microbiota, immune, and neuroendocrine systems involves multifunctional molecule(s). Those same molecule(s) appear to be produced by, and to signal to, member(s) of the gut microbial community and the neuroendocrine and immune systems in order to mount, or avoid raising, an attack response against the commensals. This would be of crucial importance early in life. During this vulnerable period, a narrow window is believed to exist during which colonization with a "healthy" microbiota exerts effects that may decrease susceptibility to diseases and ensure normal development of mucosal and systemic immunity and metabolism, as well as development of the HPA axis, which impacts on the gut through its action on the enteric nervous system, the immune system, and the CNS (Sudo et al., 2004; Shreiner et al., 2008). The most important question to be answered at this point is: what are the molecular mechanisms underlying the intimate cross-talk between the immune system and the microbiota-gut-brain axis at its various nodes of interaction?

## **REFERENCES**


has indomethacin-sensitive actions on Fos expression in topographically organized subpopulations of serotonergic neurons. *Brain Behav. Immun.* 20, 569–577. doi: 10.1016/j.bbi.2006.01.006


Wang, Y., and Kasper, L. H. (2013). The role of microbiome in central nervous system disorders. *Brain. Behav. Immun.* doi: 10.1016/j.bbi.2013.12.015. [Epub ahead of print].

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 March 2014; accepted: 20 March 2014; published online: 07 April 2014.*

*Citation: El Aidy S, Dinan TG and Cryan JF (2014) Immune modulation of the brain-gut-microbe axis. Front. Microbiol. 5:146. doi: 10.3389/fmicb.2014.00146 This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 El Aidy, Dinan and Cryan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Impact of the gut microbiota on the development of obesity and type 2 diabetes mellitus

## *Isabel Moreno-Indias<sup>1,2</sup>, Fernando Cardona<sup>1,2</sup>\*, Francisco J. Tinahones<sup>1,2</sup> and María Isabel Queipo-Ortuño<sup>1,2</sup>\**

*<sup>1</sup> Unidad de Gestion Clínica de Endocrinología y Nutrición, Laboratorio del Instituto de Investigación Biomédica de Málaga (IBIMA), Hospital Universitario de Málaga (Virgen de la Victoria), Málaga, Spain*

*<sup>2</sup> Centro de Investigación Biomédica en Red de Fisiopatología de la Obesidad y la Nutrición, Madrid, Spain*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

*Arseniy E. Yuzhalin, University of Oxford, UK*

#### *\*Correspondence:*

*María Isabel Queipo-Ortuño and Fernando Cardona, Unidad de Gestion Clínica de Endocrinología y Nutrición, Laboratorio del Instituto de Investigación Biomédica de Málaga (IBIMA), Hospital Universitario de Málaga (Virgen de la Victoria), Campus de Teatinos s/n, 29010 Málaga, Spain e-mail: maribelqo@gmail.com; fernandocardonadiaz@gmail.com*

Obesity and its associated disorders are a major public health concern. Although obesity has been mainly related to perturbations of the balance between food intake and energy expenditure, other factors must nevertheless be considered. Recent insight suggests that an altered composition and diversity of gut microbiota could play an important role in the development of metabolic disorders. This review discusses research aimed at understanding the role of gut microbiota in the pathogenesis of obesity and type 2 diabetes mellitus (TDM2). The establishment of gut microbiota is dependent on the type of birth. From this point on, gut microbiota remain quite stable, although changes take place between birth and adulthood due to external influences, such as diet, disease and environment. Understanding these changes is important to predict diseases and develop therapies. A new theory suggests that gut microbiota contribute to the regulation of energy homeostasis, provoking the development of an impairment in energy homeostasis and causing metabolic diseases, such as insulin resistance or TDM2. Metabolic endotoxemia, modifications in the secretion of incretins, and butyrate production might explain the influence of the microbiota in these diseases.

#### **Keywords: gut microbiota, obesity, type 2 diabetes mellitus, inflammation, LPS, SCFA**

## **INTRODUCTION**

The prevalence of obesity and its associated disorders, such as type 2 diabetes mellitus (TDM2), has increased substantially worldwide over the last decades. Recent insight suggests that an altered composition and diversity of gut microbiota could play an important role in the development of metabolic disorders. Most of the gut microorganisms reside in the large intestine, which contains an estimated 10<sup>11</sup>–10<sup>12</sup> bacteria per gram of content (Leser and Molbak, 2009). These gut microbiota play a number of physiological roles involving digestion, metabolism, extraction of nutrients, synthesis of vitamins, prevention of colonization by pathogens, and immunomodulation (Jumpertz et al., 2011; Purchiaroni et al., 2013). In addition to an increased energy harvest from the diet, several mechanisms, including chronic low-grade endotoxemia, regulation of biologically active fatty acid tissue composition, and the modulation of gut-derived peptide secretion, have been proposed as links between gut microbiota and obesity (Musso et al., 2010). However, the contribution of gut microbiota to obesity and diabetes in humans is unclear. There are probably several reasons for this: the heterogeneous etiology of obesity and diabetes can be associated with different microbes; studies have involved participants of diverse ethnic origin and food habits; there is large inter-individual variation in the composition of gut microbiota; and, in particular, different methods have been used to profile the microbiota in these studies (Tremaroli et al., 2012). On the other hand, the differences between gut microbiota in lean and obese individuals, as well as the impact of diet on the composition of the gut microbiome, are still not wholly understood. Thus, manipulation of the gut microbiome represents a novel approach to treating obesity, although it is in no way a substitute for diet and exercise. This review discusses the research conducted to understand the role of gut microbiota in the pathogenesis of obesity and TDM2.

### **GUT MICROBIOTA COMPOSITION**

Microorganisms colonize all surfaces of the human body that are exposed to the environment, with most residing in the intestinal tract. Bacterial communities at a particular body site are more similar among different subjects than communities at different body sites of the same subject; i.e., there is more similarity between oral bacterial communities of different individuals than between the bacterial communities of the skin and the mouth in a single individual (Costello et al., 2009), although there is also considerable inter-individual variability (Costello et al., 2009; Robinson et al., 2010). The bacterial component of the microbiota has been intensively studied in recent years, driven by large-scale projects such as the Human Microbiome Project (Turnbaugh et al., 2007; Peterson et al., 2009) and MetaHIT (Qin et al., 2010). Research on gut microbiota, mainly using 16S ribosomal RNA and whole-genome shotgun (WGS) sequencing (Turnbaugh et al., 2009b), has provided a general view of the commensal microbial communities and their functional capacity. For instance, in 2010, a catalog of 3.3 million gut microbial genes was established (Qin et al., 2010), with another, wider catalog published soon after (Human Microbiome Project Consortium, 2012a,b). These studies have shown great variability in microbiota composition among healthy subjects, even between twins, who share less than 50% of their bacterial taxa at the species level (Turnbaugh et al., 2010). However, this does not mean that genetics has no role in the establishment and shaping of the gut microbiota, and it has been demonstrated that bacterial community composition is influenced by host-specific genomic loci (Benson et al., 2010; Koenig et al., 2011). Metagenomic studies have established that, in spite of the high interpersonal variability, some bacterial groups share functionalities (Turnbaugh et al., 2009b; Burke et al., 2011).
The main bacterial phyla are Firmicutes (Gram-positive), Bacteroidetes (Gram-negative), and Actinobacteria (Gram-positive). Firmicutes is found in the highest proportion (60%), with more than 200 genera, the most important of which are *Mycoplasma*, *Bacillus*, and *Clostridium*; Bacteroidetes and Actinobacteria each comprise about 10% of the gut microbiota, with the rest belonging to over 10 minority families. In total there are more than 1000 different species in the gut. It has also been suggested that the microbiota of most individuals can be categorized into three predominant enterotypes dominated by three different genera, *Bacteroides*, *Prevotella*, and *Ruminococcus*, which are independent of age, gender, ethnicity, or body mass index (BMI; Benson et al., 2010; Arumugam et al., 2011). Nevertheless, an important debate has recently started about the concept of enterotypes (Jeffery et al., 2012; Yong, 2012), with a number of studies failing to identify the three distinct categories described by Arumugam et al. (2011) (Claesson et al., 2012; Huse et al., 2012).

### **MICROBIOTA ESTABLISHMENT**

Our microbiota changes from birth to adulthood. The fetal intestinal tract is sterile until birth, after which the newborn tract begins to be colonized. Infants are exposed to a great variety of microorganisms from different environments during and immediately after birth, either through contact with the maternal vagina or with cutaneous microorganisms, depending on the type of delivery (Adlerberth and Wold, 2009; Dominguez-Bello et al., 2010). Infants born vaginally have communities similar to those found in the vaginal microbiota of their mothers. In contrast, those born by Caesarean section have the characteristic microbiota of the skin, with taxa such as *Staphylococcus* and *Propionibacterium* spp. (Dominguez-Bello et al., 2010). Moreover, these infants have lower intestinal bacterial counts with less diversity in the early weeks of life (Grölund et al., 1999; Axad et al., 2013). Another factor influencing the microbiota is the method of feeding. The microbiota of breast-fed infants is dominated by *Bifidobacterium* (Turroni et al., 2012; Yatsunenko et al., 2012) and *Ruminococcus* (Morelli, 2008), with significantly lower rates of colonization by *Escherichia coli*, *Clostridium difficile*, *Bacteroides fragilis*, and *Lactobacillus* than those observed in exclusively formula-fed infants (Penders et al., 2006). The microbiota of formula-fed infants is more complex and includes enterobacterial genera, *Streptococcus*, *Bacteroides*, and *Clostridium*, as well as *Bifidobacterium* and *Atopobium* (Bezirtzoglou et al., 2011). However, the composition of the microbiota changes with the introduction of solid foods, and a more complex and stable community similar to the adult microbiota becomes established at 2–3 years of age (Palmer et al., 2007; Koenig et al., 2011; Ravel et al., 2011; Yatsunenko et al., 2012), with Firmicutes and Bacteroidetes predominating. During adulthood the microbiota is relatively stable until old age, when this stability is reduced (McCartney et al., 1996). The ELDERMET consortium studied the microbiota of elderly Irish subjects and found a characteristic microbiota composition different from that of young persons, particularly in the proportions of *Bacteroides* spp. and *Clostridium* groups (Claesson et al., 2011).

## **EFFECT OF DIET ON THE TEMPORAL DYNAMICS OF MICROBIOTA**

Human-related microorganisms have been enumerated and categorized (Costello et al., 2009) and their temporal dynamics have been described (Caporaso et al., 2011). Understanding the stability of the microbiota within an individual over time is an important step toward predicting diseases and developing therapies to correct dysbiosis (microbial community mismatches). Data from longitudinal studies show that the microbiota composition is relatively stable in healthy adults over time and is only transiently altered by external disturbances such as diet, disease, and environment (Delgado et al., 2006). In particular, changes in diet have shown important effects on the composition of the intestinal microbiota. Indeed, dietary changes could explain 57% of the total structural variation in gut microbiota, whereas changes in genetics accounted for no more than 12% (Zhang et al., 2010). Diet provides nutrients for both the host and the bacteria of the gastrointestinal tract. Changes in the composition of the gut microbiota in response to dietary intake take place because different bacterial species are better equipped genetically to utilize different substrates (Scott et al., 2008). Many studies have demonstrated that an increase in fat intake produces an increase in the Gram-negative/Gram-positive index of our microbiota. Recent studies have found that humanized germ-free (GF) mice switched from a diet low in fat and rich in plant polysaccharides to a diet rich in fat and sugar and low in plant polysaccharides (western diet) changed their microbiota in just 1 day. Mice on the "western diet" experienced an increase in the abundance of bacteria of the phylum Firmicutes and a decrease in the abundance of those of the phylum Bacteroidetes (Turnbaugh et al., 2009a,b). Hildebrandt et al.
(2009) also found important changes in the abundance of the gut microbiota of mice after changing from a standard chow to a high-fat diet, which was associated with a decrease in the abundance of bacteria of the phylum Bacteroidetes and an increase in that of both the Firmicutes and Proteobacteria phyla. Moreover, murine studies have shown that carbohydrate-reduced diets result in enriched populations of bacteria from the Bacteroidetes phylum (Walker et al., 2011), while calorie-restricted diets prevent the growth of *C. coccoides*, *Lactobacillus* spp., and *Bifidobacterium* spp., which are all major butyrate producers required for colonocyte homeostasis (Santacruz et al., 2009). Only a limited number of human clinical trials have assessed the effects of changes in dietary patterns on the intestinal microbiota (De Palma et al., 2009; Muegge et al., 2011; Walker et al., 2011). In a controlled-feeding study with humans consuming a high-fat/low-fiber or low-fat/high-fiber diet, notable changes were found in gut microbiota in just 24 h, highlighting the rapid effect that diet can have on the intestinal microbiota (Wu et al., 2011). Interestingly, De Filippo et al. (2010) found that European children have a microbiota depleted of Bacteroidetes and enriched in Enterobacteriaceae compared to rural African children, which the authors attributed to low dietary fiber intake by Europeans (Wu et al., 2011). These authors postulated that gut microbiota coevolved with the plant-rich diet of the African children, allowing them to maximize energy extraction from dietary fiber while also protecting them from inflammation and non-infectious intestinal diseases (De Filippo et al., 2010). Another study demonstrated that subjects consuming a vegan or vegetarian diet had a lower stool pH and significantly lower total counts of culturable *Bacteroides* spp., *Bifidobacterium* spp., *E. coli*, and Enterobacteriaceae spp. than controls (Zimmer et al., 2011). A vegetarian diet has also been shown to decrease the amount and change the diversity of *Clostridium* cluster IV and *Clostridium* clusters XIV and XVII (Liszt et al., 2009). However, large well-controlled trials are needed to elucidate the mechanisms that link dietary changes to alterations in microbial composition as well as the implications of key population changes for health and disease.

## **MODULATION OF GUT MICROBIOTA DIVERSITY BY ANTIBIOTICS**

Much evidence now exists of an important change in our microbiota over recent decades, with some species increasing and others decreasing; one of the most striking findings is that in developed countries there is a loss in the diversity of our microbiota. One of the most important factors that can disturb microbiota composition is the increased use of antibiotic treatment. There is evidence of important alterations in microbiota after antibiotic treatment (Sullivan et al., 2001; Jernberg et al., 2007; Dethlefsen et al., 2008). Although affected taxa vary among subjects, some taxa are not recovered even several months after treatment, and in general there is a long-term reduction in bacterial diversity after the use of antibiotics (Jernberg et al., 2010; Dethlefsen and Relman, 2011). A correlation has recently been proposed between the increasing global use of antibiotics and weight gain or obesity in humans (Thuny et al., 2010). Several studies have indicated that some antibiotics are associated with weight gain in malnourished children, neonates, and adults (Ajslev et al., 2011; Trehan et al., 2013), but the precise mechanisms by which antibiotics increase weight are not well characterized. It has been suggested that antibiotics such as avoparcin (a glycopeptide structurally related to vancomycin) exert selective pressure on Gram-positive bacteria, and that gut colonization by *Lactobacillus* spp., which are known to be resistant to glycopeptides, are used as growth promoters in animals, and are found at high concentrations in the feces of obese patients, could be responsible for the weight gain observed in patients who had been treated with vancomycin. These data suggest that nutritional programs and follow-up of weight should be undertaken in patients under such treatment (Thuny et al., 2010). Other recent studies have also demonstrated beneficial effects of antibiotics on metabolic abnormalities in obese mice, giving rise to reduced glucose intolerance, body weight gain, metabolic endotoxemia, and markers of inflammation and oxidative stress (Bech-Nielsen et al., 2012). Moreover, these effects were associated with a reduced diversity of gut microbiota (Murphy et al., 2013). Antibiotic treatment combined with a protective hydrolyzed casein diet has been found to decrease the incidence and delay the onset of diabetes in a rat model (Brugman et al., 2006). A recent study also reported that antibiotic-treated humans showed greater and less balanced sugar anabolic capabilities than non-treated individuals (Hernandez et al., 2013). However, the majority of clinical studies focus primarily on characterizing the composition and diversity of gut microbes, and it remains uncertain whether antibiotic-induced gut microbiota alteration in human subjects with metabolic disorders is associated with improvements in metabolic derangements, as observed in animal studies.
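The "reduction in bacterial diversity" discussed above is usually quantified with an alpha-diversity index. The review does not say which index the cited studies used, so the following is only a minimal sketch assuming the widely used Shannon index, H' = −Σ p<sub>i</sub> ln p<sub>i</sub>, where p<sub>i</sub> is the relative abundance of taxon i. The function name and the taxon counts are hypothetical and invented for illustration; they are not data from any cited study.

```python
import math


def shannon_index(counts):
    """Return the Shannon diversity index H' = -sum(p_i * ln(p_i)).

    `counts` is a sequence of non-negative per-taxon counts (for example,
    sequencing reads assigned to each taxon). Taxa with a zero count
    contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical per-taxon counts, invented for illustration only.
before_treatment = [250, 250, 250, 250]  # four evenly represented taxa
after_treatment = [940, 40, 20, 0]       # one dominant taxon, one taxon lost

h_before = shannon_index(before_treatment)
h_after = shannon_index(after_treatment)
print(f"Shannon index before: {h_before:.3f}, after: {h_after:.3f}")
```

In this toy example the index falls after treatment because one taxon disappears and another comes to dominate, which is the kind of pattern the studies above describe in qualitative terms.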

## **ROLE OF GUT MICROBIOTA IN METABOLIC DISEASES**

Recent decades have seen an increase in the prevalence of metabolic diseases in developed countries. Environmental factors, such as the increase in energy intake and the decrease in physical activity, have been considered causes of this spectacular increase. However, even when energy intake does not increase and physical activity does not decrease, the prevalence continues to grow exponentially, so other environmental factors must be taken into account, including changes in gut microbiota. One of the challenges is to elucidate the molecular origin of metabolic diseases, though the great diversity and social differences among humans make this difficult. During the last half century, with the advances in molecular biology, researchers have been investigating the genetics of metabolic diseases. In spite of great efforts and the identification of some mutations in the genome, no global view has yet been established. The discovery of candidate genes in genome-wide association studies (GWAS) has helped to identify new genes associated with sensitivity/resistance to diabetes and extreme metabolic phenotypes (Jacquemont et al., 2011). However, genetics alone cannot explain the global diversity of metabolic diseases, especially given the studies in monozygotic twins discordant for TDM2 and obesity (Medici et al., 1999; Beck-Nielsen et al., 2003).

A second step toward understanding the origin of metabolic diseases involves epigenetic and environmental factors. A drastic change in feeding habits, in which dietary fiber has been replaced by a high-fat diet, contributes to the origin of metabolic diseases. However, this simple concept cannot explain why some people are sensitive and others are resistant to the development of these metabolic diseases. In mice, a metabolic adaptation is frequently observed (Burcelin et al., 2002). Genetically identical mice housed in the same cage and fed a fat-rich diet for 6–9 months can develop both obesity and diabetes, or only one of the diseases. There is a need to find a new paradigm that takes into account genetic diversity, the impact of environmental factors, the rapid development of metabolic diseases, and the individual propensity to develop diabetes and obesity. The conclusion reached concerns the concept of personalized medicine, in which individual characteristics should be identified in order to adapt a suitable therapeutic strategy for small patient groups.

## **INFLUENCE OF GUT MICROBIOTA COMPOSITION IN THE DEVELOPMENT OF OBESITY**

Studies during the last decade have associated the gut microbiota with the development of metabolic disorders, especially diabetes and obesity. Although incompletely understood, the gut microbiota is implicated in the programming and control of many physiological functions, including gut epithelial development, blood circulation, and innate and adaptive immune mechanisms (Mackie et al., 1999; Dethlefsen et al., 2006). A new theory proposes the microbiota as a contributor to the regulation of energy homeostasis. Thus, together with environmental vulnerabilities, gut microbiota could provoke an impairment in energy homeostasis, causing metabolic diseases.

The first discovery was related to the fact that mice with a mutation in the leptin gene (metabolically obese mice) have different microbiota as compared with other mice without the mutation (Ley et al., 2005). In this obese animal model, the proportion of the dominant gut phyla, Bacteroidetes and Firmicutes, is modified with a significant reduction in Bacteroidetes and a corresponding increase in Firmicutes (Ley, 2010). Ley et al. (2006) were the first to report an altered gut microbiota similar to that found in obese mice (a larger proportion of Firmicutes and relatively fewer Bacteroidetes) in 12 obese subjects compared with 2 lean controls. Later, Armougom et al. (2009) confirmed a reduction in Bacteroidetes accompanied by a rise in *Lactobacillus* species belonging to the Firmicutes phylum. Turnbaugh et al. (2009b) and Furet et al. (2010) showed a different pattern based on a lower representation of Bacteroidetes (*Bacteroides/Prevotella*) in obese individuals with no differences in Firmicutes phylum. Collado et al. (2008) reported increases in species belonging to both Firmicutes (*Staphylococcus aureus*) and Bacteroidetes (*Bacteroides/Prevotella*) in overweight women. Million et al. (2012) described changes in the composition of Firmicutes based on an increase in *Lactobacillus reuteri* coupled with a reduction in *L. paracasei* and *L. plantarum*. Finally, other studies have found no differences between Firmicutes and Bacteroidetes at the phylum level (Duncan et al., 2008; Mai et al., 2009; Jumpertz et al., 2011).
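Many of the comparisons above come down to the relative abundance of two phyla, often summarized as a Firmicutes/Bacteroidetes ratio. As a rough illustration of how such a ratio is computed from phylum-level read counts, here is a minimal sketch; the function name, the dictionary layout, and the read counts are assumptions made for this example and are not taken from any of the cited studies.

```python
def firmicutes_bacteroidetes_ratio(phylum_counts):
    """Return the Firmicutes/Bacteroidetes ratio for one sample.

    `phylum_counts` maps phylum names to read counts (or relative
    abundances). A ratio is undefined when no Bacteroidetes are detected,
    so that case raises an error rather than returning infinity.
    """
    firmicutes = phylum_counts.get("Firmicutes", 0)
    bacteroidetes = phylum_counts.get("Bacteroidetes", 0)
    if bacteroidetes <= 0:
        raise ValueError("Bacteroidetes count must be positive")
    return firmicutes / bacteroidetes


# Hypothetical phylum-level read counts, invented for illustration only.
lean_sample = {"Firmicutes": 600, "Bacteroidetes": 300, "Actinobacteria": 100}
obese_sample = {"Firmicutes": 750, "Bacteroidetes": 150, "Actinobacteria": 100}

lean_ratio = firmicutes_bacteroidetes_ratio(lean_sample)
obese_ratio = firmicutes_bacteroidetes_ratio(obese_sample)
print(f"F/B ratio, lean: {lean_ratio:.1f}; obese: {obese_ratio:.1f}")
```

A phylum-level ratio like this hides genus- and species-level shifts, which may be one reason the studies summarized above do not agree with one another.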

The shift in the relative abundance observed in these phyla is associated with an increased capacity to harvest energy from food and with increased low-grade inflammation. The increase in Firmicutes and the decrease in the proportion of Bacteroidetes observed in obese mice could be related to the presence of genes encoding enzymes that break down polysaccharides that cannot be digested by the host, increasing the production of monosaccharides and short-chain fatty acids (SCFA) and the conversion of these SCFA to triglycerides in the liver (**Figure 1**). These SCFAs are able to bind and activate two G-protein-coupled receptors (GPR41 and GPR43) of the gut epithelial cells. The activation of these receptors induces peptide YY secretion, which suppresses gut motility and retards intestinal transit. By this mechanism of SCFA-linked G-protein-coupled receptor activation, the gut microbiota may contribute markedly to increased nutrient uptake and deposition, contributing to the development of metabolic disorders (Erejuwa et al., 2014). Moreover, gut microbiota have also been shown to decrease the production by intestinal cells of the fasting-induced adipose factor [FIAF; a secreted lipoprotein lipase (LPL) inhibitor]; because FIAF inhibits LPL activity, its suppression increases the storage of liver-derived triglycerides (Backhed et al., 2007).

Turnbaugh et al. (2006), in a study using ob/ob mice, found a reduced calorie content in the feces of obese mice as compared with lean mice. Other studies have suggested that obese subjects might be able to extract more energy from nutrients due to hydrogen transfer between taxa. In fact, a simultaneous increase in both hydrogen-producing Prevotellaceae and hydrogen-utilizing methanogenic Archaea has been previously associated with obesity by Zhang et al. (2009), suggesting a higher energy harvest in obese patients. For instance, intestinal starch digestion produces hydrogen, the accumulation of which inhibits digestion; methanogenic Archaea are able to transform this hydrogen into methane (**Figure 1**). Thus, there is a specific microbiota that obtains more energy from the same energy intake (Turnbaugh et al., 2009a). These findings agree with the observation that GF mice fed a fat-rich diet gained less weight than conventional mice (Backhed et al., 2004).

The most relevant experiment dealing with the causality between microbiota and obesity was done by Turnbaugh et al. (2006). In this study, they demonstrated that microbiota transplantation from genetically obese mice to axenic mice provokes a very significant weight increase compared with the axenic mice transplanted with the microbiota from lean mice.

Surprisingly, the phenotype with increased capacity for energy harvest is transmitted simply by transplanting the obesity-associated gut microbiota into healthy, lean recipients (Turnbaugh et al., 2006, 2008). Within a phylum, however, not all genera play the same role, and genera of the same phylum have been associated with either beneficial or harmful characteristics. Kalliomäki et al. (2008) undertook a prospective study in which they followed 49 children from birth to 7 years of age. Stool was collected at 6 and 12 months of life, and the children who had a normal weight at 7 years of age were found to have had a higher number of *Bifidobacterium* spp. and a smaller number of *Staphylococcus aureus* than the children who became overweight several years later. The authors concluded that the alteration in the microbiota precedes the alteration in weight, an observation that is relevant for obesity prevention. The authors also proposed that *Staphylococcus aureus* may act as a trigger of low-grade inflammation, contributing to the development of obesity (Kalliomäki et al., 2008).

On the other hand, *Lactobacillus* spp. and bifidobacteria represent a major bacterial population of the small intestine, where lipids and simple carbohydrates are absorbed, especially in the duodenum and jejunum. Recent publications reveal that *Bifidobacterium* and *Lactobacillus* are not all the same and may have different characteristics according to the species. For example, within the genus *Lactobacillus*, *L. plantarum* and *L. paracasei* have been associated with leanness, whereas *L. reuteri* is associated with obesity (Million et al., 2012). Moreover, Drissi et al. (2014) have shown that weight gain-associated *Lactobacillus* spp. encode more bacteriocins and appear to lack enzymes involved in the catabolism of fructose, defense against oxidative stress, and the synthesis of dextrin, L-rhamnose and acetate, in contrast to weight protection-associated *Lactobacillus* spp., which encode a significant number of glucose permeases. Regarding lipid metabolism, thiolases were only encoded in the genome of weight gain-associated *Lactobacillus* spp. The results of this study revealed that weight protection-associated *Lactobacillus* spp. have developed mechanisms for enhanced glycolysis and defense against oxidative stress, while weight gain-associated *Lactobacillus* spp. possess a limited ability to break down fructose or glucose and might reduce ileal brake effects (Drissi et al., 2014).

## **MICROBIOTA AND ITS RELATIONSHIP WITH TYPE 2 DIABETES MELLITUS**

Type 2 diabetes mellitus is the consequence of an increase in the production of glucose in the liver and a deficit in the secretion and action of insulin. Other physiological functions are altered, such as the central and autonomic nervous systems, leading to an impaired secretion of hormones like glucagon and incretins. However, a common feature of obesity and TDM2 is the presence of a low-grade inflammatory component described in tissues involved in metabolism regulation, such as the liver, adipose tissue, and muscles (Pickup and Crook, 1998). This metabolic inflammation is characterized by a moderate excess in cytokine production, including interleukin (IL)-6, IL-1, or tumor necrosis factor alpha (TNF-α), that impairs cellular insulin signaling and contributes to insulin resistance and diabetes (Hotamisligil, 2006; Shoelson et al., 2006). Weight increase would be an initiating factor of low-grade inflammation. When adipocyte hypertrophy occurs as a response to excess energy intake, TNF-α production in the adipose tissue also increases, and this stimulates the production of chemotactic factors, resulting in adipose tissue being infiltrated by proinflammatory macrophages that increase the production of IL-6 and IL-1. Recently, two studies have shown that the intestinal microbiome might be an important contributor to the development of TDM2. Both studies also showed that TDM2 subjects were characterized by a reduction in the number of Clostridiales bacteria (*Roseburia* species and *Faecalibacterium prausnitzii*), which produce the SCFA butyrate (Qin et al., 2012; Karlsson et al., 2013). Also, another study found microbiota changes in patients with diabetes or insulin resistance as compared with subjects without alterations in carbohydrate metabolism (Serino et al., 2013). In addition, changes in the amount of *Bifidobacterium*, *Lactobacillus*, and *Clostridium* as well as a reduced Firmicutes to Bacteroidetes ratio in gut microbiota have also been recently reported in type 1 diabetic children. This study also showed that bacteria involved in the maintenance of gut integrity were significantly lower in diabetic patients than in healthy controls (Murri et al., 2013). Similar changes in the composition of intestinal microbiota have also been reported in TDM2 patients (Larsen et al., 2010; Qin et al., 2012).
Several other studies linking the gut microbiota to metabolic disorders, such as obesity, insulin resistance and diabetes mellitus, have been reviewed by other authors (Caricilli and Saad, 2013; Stachowicz and Kiersztan, 2013; Tagliabue and Elli, 2013). Moreover, probiotic (Amar et al., 2011) and prebiotic treatments (Cani et al., 2007b) control gut microbiota and metabolic diseases.

Various mechanisms have been proposed to explain the influence of the microbiota on insulin resistance and TDM2, such as metabolic endotoxemia, modifications in the secretion of the incretins and butyrate production.

The lipopolysaccharides (LPS) are endotoxins commonly found in the outer membrane of Gram-negative bacteria that cause metabolic endotoxemia, which is characterized by the release of proinflammatory molecules (Manco et al., 2010). A rise in LPS levels has been observed in subjects who increased their fat intake (Amar et al., 2008). Similar results were found in mice (Cani et al., 2007b) and in mutant mice (such as leptin-deficient mice) even when fed a normal diet (Cani et al., 2008), which suggests that the rise in serum LPS is produced by a change in the proportion of Gram-negative bacteria in the gut or by a change in gut permeability (Cani et al., 2008, 2009b); this increase is directly related to the degree of insulin resistance. Cani et al. (2007a,b) reported that modulation of the intestinal microbiota by using prebiotics in obese mice acts favorably on the intestinal barrier, lowering the high-fat diet-induced LPS endotoxemia and systemic and liver inflammation (**Figure 2**). LPS are absorbed by enterocytes and are conveyed into plasma coupled to chylomicrons (Clemente Postigo et al., 2012). In this way, dietary fats can be associated with increased absorption of LPS, which in turn can be related to changes in the gut microbiota distinguished by a decrease in the *Eubacterium rectale*–*C. coccoides* group, Gram-negative *Bacteroides* and *Bifidobacterium* (Caricilli and Saad, 2013). This causal role of LPS was demonstrated by infusing LPS in mice fed a normal diet, which induced hepatic insulin resistance, glucose intolerance, and an increase in the weight of adipose tissue (Cani et al., 2007a). It has been recently shown that the LPS-induced signaling cascade via Toll-like receptor 4 (TLR4) impairs pancreatic β-cell function via suppressed glucose-induced insulin secretion and decreased mRNA expression of pancreas-duodenum homeobox-1 (PDX-1; Rodes et al., 2013). LPS binds to the CD14/TLR4 receptor present on macrophages and increases the production of proinflammatory molecules. When LPS injections were administered to mice with a genetic absence of the CD14/TLR4 receptor, they did not develop these metabolic characteristics and there was no onset of TDM2 or obesity, showing the important role of CD14/TLR4 in mediating the effects of LPS. Moreover, knockout CD14/TLR4 mice were even more sensitive to insulin than wild type controls (Cani et al., 2007a; Poggi et al., 2007). LPS can also promote the expression of NF-κB (nuclear factor kappa-light-chain-enhancer of activated B cells) and activation of the MAPK (mitogen-activated protein kinase) pathway in adipocytes with several target genes (Chung et al., 2006).

An increase of *Bifidobacterium* spp. modulates inflammation in obese mice by increasing the production of incretins like the glucagon-like peptide (GLP), also reducing intestinal permeability (Cani et al., 2009b). There is evidence that the rise in *Bifidobacterium* spp. produced by some prebiotics is accompanied by an increase in GLP1 and peptide YY secretion by the intestine. These two molecules have favorable effects, decreasing insulin resistance and improving the functionality of beta cells (Cani and Delzenne, 2009). In addition, modulation of the gut flora with prebiotics increases GLP2 production in the colon, and this increase in GLP2 production is associated with higher expression of zonula occludens-1 (ZO-1), which improves the mucosal barrier function, leading to a decrease in plasma LPS (Cani et al., 2009a,b). The study by Qin et al. (2012) showed that subjects with TDM2 suffered from a moderate intestinal dysbiosis and an increase in the number of various opportunistic gut pathogens, rather than a change in a specific microbial species, having a direct association with the pathophysiology of TDM2. Specifically, they experienced a decrease in their butyrate-producing bacteria (Qin et al., 2012). This is significant because butyrate is the preferred source of energy for cells of the human digestive system and supports their repair and health. In the colon, the predominant butyrate-producing bacteria are the *C. coccoides* and the *Eubacterium rectale* groups. These changes in intestinal bacteria have recently been reported in patients with colorectal cancer (Wang et al., 2012) and in elderly people (Biagi et al., 2010). Thus, butyrate-producing bacteria could have a protecting role against a functional dysbiosis.
Moreover, as other intestinal diseases show a loss of butyrate-producing bacteria with a commensurate increase in opportunistic pathogens, a possible hypothesis is that this change in the microbiota can cause an increase in susceptibility to a wide variety of diseases. The analysis of genetic bacterial functions shows an increase in functions related to the response to intestinal oxidative stress. This is of interest, because previous studies have shown that a high oxidative stress level is related to a predisposition to diabetic complications (Kashyap and Farrugia, 2011).

## **CONCLUSION**

Metabolic diseases are caused by many factors, including a higher consumption of energy-rich diets, reduced physical activity, and a hereditary disposition. Over the past 6 years, much evidence has suggested that gut microbiota may play an important role in the regulation of energy balance and weight in animal models and in humans. However, although metagenomic tools have provided an important amount of data concerning the characterization and the potential role of this gut microbiota in the development of human obesity and TDM2, the causal relationship between this microbiota and obesity still needs to be confirmed in humans. In the future, larger human studies conducted at the species level and taking into account all of the possible confounding variables (such as age, gender, ethnicity, diet, and genetic factors) are needed to allow us to use the gut microbiota composition and modulation as novel diagnostic or therapeutic strategies to treat obesity and TDM2.

## **ACKNOWLEDGMENTS**

We gratefully acknowledge the help of Ian Johnstone for his expertise in preparing this manuscript. The research group belongs to the "Centros de Investigación en Red" [CIBER, CB06/03/0018] of the "Instituto de Salud Carlos III." Isabel Moreno-Indias was supported by a "Sara Borrell" Postdoctoral contract (CD12/00530), María Isabel Queipo-Ortuño acknowledges support from the "Miguel Servet Type I" program (CP13/00065) and Fernando Cardona acknowledges support from the "Miguel Servet Type II" program (CP13/00023) from the Instituto de Salud Carlos III, Madrid, Spain.

## **REFERENCES**


in the development of type 1 diabetes? *Diabetologia* 49, 2105–2108. doi: 10.1007/s00125-006-0334-0


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 March 2014; accepted: 08 April 2014; published online: 29 April 2014. Citation: Moreno-Indias I, Cardona F, Tinahones FJ and Queipo-Ortuño MI (2014) Impact of the gut microbiota on the development of obesity and type 2 diabetes mellitus. Front. Microbiol. 5:190. doi: 10.3389/fmicb.2014.00190*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Moreno-Indias, Cardona, Tinahones and Queipo-Ortuño. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Combined metagenomic and phenomic approaches identify a novel salt tolerance gene from the human gut microbiome

## *Eamonn P. Culligan1,2, Julian R. Marchesi 1,3,4\*, Colin Hill 1,2\* and Roy D. Sleator 1,5\**

*<sup>1</sup> Alimentary Pharmabiotic Centre, Biosciences Institute, University College Cork, Cork, Ireland*

*<sup>2</sup> School of Microbiology, University College Cork, Cork, Ireland*

*<sup>3</sup> Cardiff School of Biosciences, Cardiff University, Cardiff, UK*

*<sup>4</sup> Department of Surgery and Cancer, Centre for Digestive and Gut Health, Imperial College London, London, UK*

*<sup>5</sup> Department of Biological Sciences, Cork Institute of Technology, Cork, Ireland*

#### *Edited by:*

*Anton G. Kutikhin, Kemerovo State Medical Academy, Russia*

#### *Reviewed by:*

*Anton G. Kutikhin, Kemerovo State Medical Academy, Russia Arseniy E. Yuzhalin, University of Oxford, UK*

#### *\*Correspondence:*

*Julian R. Marchesi, School of Biosciences, Cardiff University, Museum Avenue, Cardiff CF10 3AX, UK*

*e-mail: marchesijr@cardiff.ac.uk; Colin Hill, Alimentary Pharmabiotic Centre, School of Microbiology, University College Cork, Cork, Ireland*

*e-mail: c.hill@ucc.ie;*

*Roy D. Sleator, Department of Biological Sciences, Cork Institute of Technology, Rossa Avenue, Bishopstown, Cork, Ireland e-mail: roy.sleator@cit.ie*

In the current study, a number of salt-tolerant clones previously isolated from a human gut metagenomic library were screened using Phenotype MicroArray (PM) technology to assess their functional capacity. PMs can be used to study gene function, pathogenicity, metabolic capacity and identify drug targets using a series of specialized microtitre plate assays, where each well of the microtitre plate contains a different set of conditions and tests a different phenotype. Cellular respiration is monitored colorimetrically by the reduction of a tetrazolium dye. One clone, SMG 9, was found to be positive for utilization/transport of L-carnitine (a well-characterized osmoprotectant) in the presence of 6% w/v sodium chloride (NaCl). Subsequent experiments revealed a significant growth advantage in minimal media containing NaCl and L-carnitine. Fosmid sequencing revealed putative candidate genes responsible for the phenotype. Subsequent cloning of two genes did not replicate the L-carnitine-associated phenotype, although one of the genes, a σ<sup>54</sup>-dependent transcriptional regulator, did confer salt tolerance to *Escherichia coli* when expressed in isolation. The original clone, SMG 9, was subsequently found to have lost the original observed phenotype upon further investigation. Nevertheless, this study demonstrates the usefulness of a phenomic approach to assign a functional role to metagenome-derived clones.

**Keywords: metagenomics, functional metagenomics, gut microbiome, microbiota, salt tolerance, BIOLOG, phenotype microarray, transcriptional regulator**

**INTRODUCTION**

The ability to adapt to and tolerate increases in extracellular osmolarity is an important characteristic that enables bacteria to survive in stressful environments. Increased osmolarity, caused by sodium chloride (NaCl) for example, initiates a phased response in bacteria. First, during the primary response, potassium ions are rapidly accumulated within the bacterial cell to offset the detrimental effects of water loss and influx of toxic sodium and chloride ions (Sleator and Hill, 2002; Epstein, 2003). Once the cell has been stabilized, the secondary response begins and involves the synthesis or, more often, the energetically more favorable uptake of osmoprotectant compounds (also termed compatible solutes or osmolytes). Osmoprotectants are compatible with cellular functions, can accumulate to very high concentrations within the cell, and function to protect proteins and to restore cell volume and thus turgor pressure (Kempf and Bremer, 1998; Sleator and Hill, 2002; Kunte, 2006). Osmoprotectants can be grouped broadly as amino acids, polyols, sugars, trimethyl- and quaternary-ammonium compounds and their derivatives (Kempf and Bremer, 1998). The most widely utilized and best characterized osmoprotectants are glycine betaine, carnitine, proline, and ectoine. Numerous studies have shown carnitine to be important not only for salt tolerance, but also for survival *in vivo* and pathogenesis of infection (Sleator et al., 2001; Wemekamp-Kamphuis et al., 2004). Carnitine is also found abundantly in animal tissues and red meat and is an important compound in the host environment; for the human pathogen *Listeria monocytogenes*, carnitine and its uptake system OpuC are critical for infection in mice (Sleator et al., 2003). In addition to its osmoprotective properties, carnitine may also be catabolised as a carbon or nitrogen source to generate energy (Wargo and Hogan, 2009).

"Omics" technologies, an umbrella term that includes analyses such as genomics, metagenomics, transcriptomics, proteomics, metabolomics, and phenomics, to name a few, have been used to gain valuable information about the functions and interactions of various biological systems as a whole and can provide more information than more traditional and reductive approaches to biological problems. Sequence-based and functional metagenomic approaches have led to the discovery of many novel and diverse genes (Beja et al., 2000; Gillespie et al., 2002; Banik and Brady, 2008; Culligan et al., 2013; Yoon et al., 2013). While identifying clones which display a specific phenotype through functional metagenomic screening yields worthwhile results, characterizing the functional mechanisms responsible for the observed phenotype can sometimes be difficult, because a large proportion of metagenome-derived genes will be annotated as hypothetical proteins or have no known function or homology to existing proteins (Bork, 2000; Qin et al., 2010). Phenomic approaches can be used to study hundreds of phenotypic profiles of different bacterial strains concurrently. Comparing phenomic profiles of wild-type and mutant derivatives, or of host strains and clones identified through metagenomic screening, can reveal differences between strains relating to gene function, pathogenicity and metabolism, for example (Bochner et al., 2001; Bochner, 2009).

With this in mind, we have utilized combined metagenomic and phenomic approaches in this study to characterize a salt tolerant clone identified from a human gut microbiome metagenomic fosmid library. An overview of the study design using this combined approach can be seen in **Figure 1**. The BIOLOG phenotype microarray (PM) system was used to compare phenotypes between metagenomic clones and the cloning host (carrying an empty fosmid vector). PM plates measure cellular respiration colorimetrically via reduction of a tetrazolium dye with electrons from NADH generated during the process of respiration. Strongly metabolized substrates generate a more intense purple color, which is recorded with a camera on the Omnilog instrument. If desired, thousands of phenotypes may be monitored simultaneously using the different available PM plates, which can be grouped as those that measure carbon, nitrogen, phosphorus and sulfur metabolism, and response to different pH conditions, osmolytes and chemicals such as antibiotics. The full range of PM plates and their constituents for investigation of bacterial phenotypes can be found at: http://www.biolog.com/products-static/phenotype\_microbial\_cells\_use.php

As our primary interest is salt tolerance we used the PM osmolyte plate (PM9) for analysis. The PM screen indicated that clone SMG 9 was positive for L-carnitine utilization in the presence of 6% w/v NaCl. Sequencing of the fosmid insert and cloning of two genes identified a novel salt tolerance gene, but did not replicate the carnitine-associated phenotype originally observed. Subsequent investigation revealed SMG 9 had lost this phenotype. Notwithstanding this phenomenon, this study demonstrates the usefulness of a phenomic approach to assign a functional role to metagenomic library-derived clones.

## **MATERIALS AND METHODS**

## **BACTERIAL STRAINS AND GROWTH CONDITIONS**

Bacterial strains, plasmids, and oligonucleotide primers (Eurofins, MWG Operon, Germany) used in this study are listed in **Table 1**. *Escherichia coli* EPI300::pCC1FOS (Epicentre Biotechnologies, Madison, WI, USA) was grown in Luria-Bertani (LB) medium containing 12.5 μg/ml chloramphenicol (Cm). *E. coli* MKH13 was grown in LB medium and in LB medium supplemented with 20 μg/ml Cm for strains transformed with the plasmid pCI372. LB medium was supplemented with 1.5% w/v agar when required. All overnight cultures were grown at 37°C with shaking.

#### **METAGENOMIC LIBRARY CONSTRUCTION AND SCREENING**

A previously constructed fosmid clone library (Jones and Marchesi, 2007; Jones et al., 2008), created from metagenomic DNA isolated from a fecal sample from a healthy 26-year-old Caucasian male, was used to screen for salt-tolerant clones. The library was screened as outlined previously (Culligan et al., 2012). Briefly, a total of 23,040 clones from the library were screened on LB agar supplemented with 6.5% (w/v) NaCl (a concentration which inhibits the growth of the cloning host, *E. coli* EPI300) and 12.5 μg/ml Cm using a Genetix QPix 2 XT colony picking/gridding robotics platform to identify clones with an increased salt tolerance phenotype compared to the cloning host (*E. coli* EPI300) carrying an empty fosmid vector (pCC1FOS). Any salt tolerance identified in a clone is therefore most likely due to a gene (or genes) present on the metagenomic insert from the human gut microbiota. Clones were gridded onto Q-trays (Genetix) using the robotics platform. Q-trays were incubated at 37°C for 3 days and checked twice daily for growth of likely salt-tolerant clones. Salt-tolerant clones were subsequently replica plated onto LB agar containing 12.5 μg/ml Cm and 6.5% NaCl and onto LB containing 12.5 μg/ml Cm but no added NaCl (which represented a positive control plate). Each salt tolerant clone identified was streaked on LB agar + 12.5 μg/ml Cm to ensure a pure culture, and all clones were maintained as glycerol stocks at −80°C.

## **PHENOTYPE MICROARRAY (PM) ASSAY**

The PM9 osmolytes microplate was used to compare the cellular phenotypes of SMG 9 and the cloning host, *E. coli* EPI300::pCC1FOS (containing an empty fosmid vector), under 96 different conditions. Preparation of the different IF (Inoculating Fluids; proprietary formulation supplied by BIOLOG) solutions and inoculation of the PM plates was performed according to the BIOLOG PM protocol for *E. coli* and other Gram negative bacteria. Briefly, IF-0 solution was prepared by adding 25 ml of sterile water to 125 ml of 1.2× IF-0. IF-0 + dye mix A solution was prepared by adding 1.8 ml of dye mix A and 23.2 ml of sterile water to another bottle containing 125 ml of 1.2× IF-0. IF-10 + dye mix A solution was prepared by adding 1.5 ml of dye mix A and 23.5 ml of sterile water to a bottle containing 125 ml of 1.2× IF-10. *E. coli* strains were grown overnight on LB agar at 37°C by streaking from a frozen stock. Cells were sub-cultured by streaking again on LB agar and grown overnight again. Isolated colonies were removed from the agar plate using a sterile swab and added to a tube containing 16 ml of IF-0 solution until a cell suspension of 42% T (transmittance) was achieved using the BIOLOG Turbidimeter. A 15 ml aliquot of this 42% T suspension was diluted in 75 ml of IF-0 + dye mix A to achieve 85% T. Then, 600 μl of the 85% T cell suspension was added to 120 ml of IF-10 + dye mix A. Finally, 100 μl of the final cell suspension was inoculated into each well of the PM 9 microplate. Plates were incubated at 37°C for 24 h in the Omnilog plate reader (BIOLOG).
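The dilution arithmetic in this protocol can be checked with a short calculation. The following is a minimal sketch in Python that uses only the volumes quoted above; the helper name `diluted_strength` is hypothetical, and the treatment of dye mix A as plain diluent is an assumption made for illustration.

```python
# Sketch of the dilution arithmetic in the PM inoculation protocol.
# Volumes come from the text; the helper name is hypothetical, and dye
# mix A is treated simply as added volume (an assumption).

def diluted_strength(stock_strength, stock_ml, added_ml):
    """Return the strength of a stock after adding a diluent volume."""
    return stock_strength * stock_ml / (stock_ml + added_ml)

# 125 ml of 1.2x IF-0 plus 25 ml of sterile water
if0 = diluted_strength(1.2, 125.0, 25.0)

# 125 ml of 1.2x IF-0 plus 1.8 ml dye mix A and 23.2 ml water
if0_dye = diluted_strength(1.2, 125.0, 1.8 + 23.2)

# 125 ml of 1.2x IF-10 plus 1.5 ml dye mix A and 23.5 ml water
if10_dye = diluted_strength(1.2, 125.0, 1.5 + 23.5)

# Cell suspension: 15 ml into 75 ml, then 600 ul (0.6 ml) into 120 ml
step1_fold = (15.0 + 75.0) / 15.0
step2_fold = (0.6 + 120.0) / 0.6
total_fold = step1_fold * step2_fold

print(f"IF-0: {if0:.2f}x, IF-0 + dye: {if0_dye:.2f}x, IF-10 + dye: {if10_dye:.2f}x")
print(f"Cell dilution: {step1_fold:.0f}-fold, then {step2_fold:.0f}-fold "
      f"({total_fold:.0f}-fold overall)")
```

Each fluid is brought to a 150 ml final volume, which takes the 1.2× stock to working strength, and the two-step cell dilution amounts to roughly a 1200-fold dilution of the 42% T suspension.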

#### **FOSMID SEQUENCING AND ANALYSIS**

Fosmid DNA was isolated from SMG 9 as described below to a concentration >200 ng/μl (approximately 5 μg in total). Sequencing of the full fosmid insert of SMG 9 was performed by GATC Biotech (Konstanz, Germany) using 454 pyrosequencing on a titanium mini-run of the Roche GS-FLX platform, achieving approximately 65-fold coverage. Sequencing reads were assembled into a single contig by GATC Biotech. The retrieved sequence was analyzed using the FGENESB software program (Softberry) to identify putative open reading frames, and translated nucleotide sequences were subjected to BLASTP analysis to assign putative functions to the encoded proteins and identify homologous sequences from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/blast/Blast.cgi). The full fosmid insert sequence of SMG 9 was submitted to GenBank and assigned the following accession number: KJ524644.

*FP, forward primer; RP, reverse primer; CmR, chloramphenicol resistance; restriction enzyme cut sites are underlined; PstI, CTGCAG; XbaI, TCTAGA; SalI, GTCGAC.*

### **DNA MANIPULATIONS AND CLONING OF** *mfsT* **AND** *sdtR* **GENES**

Extraction of fosmids containing metagenomic DNA: 5 ml of bacterial culture was grown overnight with 12.5 μg/ml Cm. One millilitre of culture was used to inoculate 4 ml of fresh LB broth. To this, 5 μl of 1000× Copy Control Induction solution (Epicentre Biotechnologies) and 12.5 μg/ml Cm were added. The mixture was incubated at 37°C for 5 h with vigorous shaking (200–250 rpm) to ensure maximum aeration. Cells were harvested from the whole 5 ml of induced culture by centrifuging at 2100 × *g* for 12 min. The Qiagen QIAprep Spin mini-prep kit was used to extract fosmids as per the manufacturer's instructions. PCR products were purified with a Qiagen PCR purification kit and digested with *XbaI* and *PstI* (Roche Applied Science) for *mfsT* and with *SalI* and *XbaI* for *sdtR*, followed by ligation using the FastLink DNA ligase kit (Epicentre Biotechnologies) to similarly digested plasmid pCI372. Electrocompetent *E. coli* MKH13 and *E. coli* EPI300 were transformed with the ligation mixture and plated on LB agar plates containing 20 μg/ml Cm for selection. Colony PCR was performed on all resistant transformants using primers across the multiple cloning site (MCS) of pCI372 to confirm the presence and size of the insert.
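Because the cloning strategy depends on primers that carry the *PstI*, *XbaI* and *SalI* recognition sequences, a quick sequence check can be useful before ordering or digesting. The sketch below is illustrative only: the recognition sequences are those given in the Table 1 footnote, while the function name and the example primer are hypothetical and are not the primers used in this study.

```python
# Recognition sequences as listed in the Table 1 footnote.
CUT_SITES = {"PstI": "CTGCAG", "XbaI": "TCTAGA", "SalI": "GTCGAC"}

def find_cut_sites(sequence):
    """Map each enzyme name to the 0-based start positions of its site."""
    seq = sequence.upper()
    hits = {}
    for enzyme, site in CUT_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Step one base forward so overlapping sites are also found
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits

# Hypothetical primer for illustration, NOT one of the study's primers
example_primer = "AAAATCTAGAATGACCGATTACGGCA"
print(find_cut_sites(example_primer))
```

A primer intended for an *XbaI* digest should report exactly one *XbaI* position and none for the partner enzyme used at the other end of the insert; the same function can be run over the amplified gene to confirm that no internal sites would be cut.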

## **CONFIRMATION TESTS FOR OBSERVED PHENOTYPE**

Growth experiments were performed in defined M9 minimal media (M9MM) (Fluka) to confirm the observed phenotype. Single isolated colonies of SMG 9 and EPI300::pCC1FOS were grown overnight in M9MM [containing final concentrations of: D-glucose (0.4%), Bacto™ casamino acids (0.2% w/v) (Becton, Dickinson and Co, Sparks, MD, USA), magnesium sulfate (MgSO4) (2 mM), calcium chloride (CaCl2) (0.1 mM) and 12.5 μg/ml Cm]. Reagents were purchased from Sigma Aldrich (St. Louis, MO, USA) unless otherwise stated. Cells were harvested by centrifugation, washed in ¼ strength Ringer's solution and resuspended in fresh M9MM. A 2% v/v inoculum was sub-cultured in fresh M9MM containing various concentrations of sodium chloride (0–8% w/v NaCl) and 1 mM of L-carnitine when required. Triplicate wells of a 96-well microtitre plate were inoculated with 200 μl of the appropriate cell suspension. Plates were incubated at 37°C for 24–48 h in an automated spectrophotometer (Tecan Genios), which recorded the optical density at 595 nm (OD595 nm) every hour. After 48 h the data were retrieved and analyzed using the Magellan 3 software program, and graphs were created with Sigma Plot 10.0 (Systat Software Inc, London, UK). Results are presented as the average of triplicate experiments, with error bars representing the standard error of the mean (SEM).
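As an illustration of how the plotted values relate to the raw readings, the following sketch computes the mean and SEM for one time point of a triplicate. It assumes the usual definition of SEM (sample standard deviation divided by the square root of n); the OD values are made-up placeholders, not data from this study, and the function name is hypothetical.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate readings."""
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Illustrative OD595 readings for one time point (placeholder values)
triplicate = [0.40, 0.50, 0.60]
mean, sem = mean_and_sem(triplicate)
print(f"mean OD595 = {mean:.3f}, SEM = {sem:.3f}")
```

Applying the function to every hourly reading gives the points and error bars of a growth curve; with only three replicates the SEM is sensitive to a single outlying well.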

## **RESULTS**

## **SCREENING OF METAGENOMIC LIBRARY**

Approximately 23,000 clones from a metagenomic fosmid library from the human gut microbiome were screened previously and resulted in the identification of 53 salt tolerant clones which could grow on LB agar supplemented with 6.5% NaCl (a concentration which inhibits growth of the cloning host, *E. coli* EPI300::pCC1FOS) (Culligan et al., 2012). The salt tolerant clones identified were annotated as SMG 1-53 (Salt MetaGenome 1–53).

## **PM OSMOLYTE PLATE ASSAY**

A number of clones initially identified from the metagenomic library were chosen at random and phenotypically screened using PM 9 osmolytes plate from BIOLOG. The layout and contents of PM9 can be viewed at http://www.biolog.com/pdf/pm\_lit/PM1-PM10.pdf. One clone, SMG 9, gave a positive result under one of the 96 different conditions tested. It was found that SMG 9 had an increased metabolic response (causing a reduction of the tetrazolium dye to generate a purple color, **Figure 2A**) compared to the cloning host containing an empty fosmid vector, *E. coli* EPI300::pCC1FOS (**Figure 2B**) in 6% w/v NaCl supplemented with L-carnitine. The color formation within each well was measured by BIOLOG's Omnilog machine, which produces a color-coded graph. A comparison of the kinetic data output from SMG 9 and EPI300::pCC1FOS can be seen in **Figure 3**.

## **CONFIRMATORY EXPERIMENTATION OF PM ANALYSIS**

In an attempt to confirm and replicate the result from the PM analysis, SMG 9 and EPI300::pCC1FOS were grown in M9MM containing various concentrations of NaCl (0–6% w/v) and supplemented with 1 mM L-carnitine when appropriate. **Figure 4A** shows growth of both clones in M9MM in the presence and absence of 1 mM L-carnitine. In the presence of L-carnitine, SMG 9 displays a growth defect, while growth is similar under all other conditions. The growth defect is alleviated when grown at 4% w/v NaCl + 1 mM L-carnitine and there is no difference in growth between clones either in the presence or absence of L-carnitine (**Figure 4B**). At 5% w/v NaCl however, SMG 9 has a significant growth advantage compared to EPI300::pCC1FOS both in the presence and absence of 1 mM L-carnitine (**Figure 4C**). The positive effect of L-carnitine on the growth of SMG 9 is evident with cells entering logarithmic phase growth sooner and reaching a much higher final optical density (OD595 nm). A similar, positive effect for L-carnitine on growth of SMG 9 is also seen at 6% w/v NaCl (**Figure 4D**).

**FIGURE 2 | Appearance of PM 9 plates after incubation for 24 hours at 37°C. (A)** Control EPI300::pCC1FOS and **(B)** SMG 9. PM plates measure cellular respiration colorimetrically via reduction of a tetrazolium dye with electrons from NADH generated during the process of respiration. Strongly metabolized substrates generate a more intense purple color. Development of a strong purple color can be seen in well B12 in **Figure 2B** (circled in red), which was inoculated with SMG 9, while no color development is visible in B12 of the control plate. This indicates SMG 9 has a greater ability to transport and utilise L-carnitine compared to the EPI300::pCC1FOS host strain.

9 under the conditions in the well (6% NaCl + L-carnitine).

## **SEQUENCING OF SMG 9 FOSMID INSERT AND ANALYSIS**

Fosmid SMG 9 was fully sequenced (454-pyrosequencing) and assembled by GATC Biotech. This generated a total of 2.3 × 10<sup>6</sup> base pairs of sequence data in 6939 sequencing reads. The average read length was 334 base pairs and coverage of 64.5× was achieved. Following vector trimming the length of the insert was approximately 36.5 kb and the %G+C content was 49.43%. Twenty-four putative open reading frames were predicted using Softberry's FGENESB, bacterial operon and gene prediction software (www.softberry.com) (Mavromatis et al., 2007). Translated nucleotide sequences were functionally annotated by homology searches using the BLASTP program to identify homologous sequences and determine their taxonomic origin. All but two of the genes encoded proteins with high identity (98–100% at the amino acid level) to *Bacteroides* sp. CAG:545. A list of the genes on SMG 9, their encoded functions and putative domains is presented in **Table 2**. The full fosmid insert sequence of SMG 9 has been submitted to GenBank and assigned the accession number KJ524644.
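The sequencing figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that mean depth of coverage is estimated as total sequenced bases divided by insert length, and the function name `estimate_coverage` is hypothetical. The inputs are the rounded values from the text, so the result (roughly 63 to 64×) is close to, but not identical with, the reported 64.5×.

```python
def estimate_coverage(n_reads, mean_read_length, target_length):
    """Mean depth of coverage, assumed here to be total bases / target length."""
    total_bases = n_reads * mean_read_length
    return total_bases / target_length


# Values reported in the text (read length and insert length are rounded).
n_reads = 6939
mean_read_length = 334   # base pairs
insert_length = 36_500   # base pairs, "approximately 36.5 kb"

total_bases = n_reads * mean_read_length
coverage = estimate_coverage(n_reads, mean_read_length, insert_length)

print(f"total bases: {total_bases:.2e}")   # about 2.3 x 10^6, as reported
print(f"estimated coverage: {coverage:.1f}x")
```

Because the assembler may have counted bases differently (for example, after quality or vector trimming), this calculation should be read as a plausibility check rather than a reproduction of the reported coverage.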

## **CLONING OF** *mfsT* **AND** *sdtR* **GENES**

Following initial inspection of the proteins encoded on SMG 9, neither an L-carnitine or general osmoprotectant transporter, nor indeed any protein with a functional link to carnitine metabolism, was immediately obvious. Transposon mutagenesis was attempted in order to create a knock-out mutant; this, however, proved unsuccessful. Two genes (gene 5 and gene 18), which we considered likely to have a role in L-carnitine utilization based on bioinformatic analysis, were cloned in isolation to further examine the phenotype. Genes 5 and 18 were annotated *sdtR* for sigma-dependent transcriptional regulator and *mfsT* for major facilitator superfamily transporter, respectively.

**FIGURE 4 | Growth in M9 minimal media with NaCl +/− 1 mM L-carnitine.** Growth of *E. coli* EPI300::pCC1FOS and SMG 9 in **(A)** M9 minimal media, **(B)** M9 minimal media + 4% NaCl, **(C)** M9 minimal media + 5% NaCl and **(D)** M9 minimal media + 6% NaCl. **Legend:** *E. coli* EPI300::pCC1FOS (closed circle); SMG 9 (open triangle); *E. coli* EPI300::pCC1FOS + 1 mM L-carnitine (closed square); SMG 9 + 1 mM L-carnitine (open diamond).

**Table 2 | Proteins predicted to be encoded on SMG 9 fosmid insert.**

Both *mfsT* and *sdtR* were cloned in the vector pCI372 and expressed in both *E. coli* EPI300 and the osmosensitive strain *E. coli* MKH13 (Haardt et al., 1995). The effect of each gene on the growth of each strain under salt stress in the presence and absence of L-carnitine was assessed. The *mfsT* gene had no effect on growth under any of the conditions tested (data not shown). The *sdtR* gene, on the other hand, conferred a significant salt tolerance phenotype on *E. coli* MKH13 when grown in media supplemented with both 3 and 4% w/v NaCl (**Figures 5C,D**, respectively), while growth was similar in media lacking NaCl and in media supplemented with 2% w/v NaCl (**Figures 5A,B**, respectively). Cloning and expression of *sdtR* in EPI300 resulted in an increase in salt tolerance compared to wild-type EPI300 carrying an empty copy of the plasmid pCI372. Addition of 1 mM L-carnitine increased the growth rate and final optical density of both strains, but its effect on the *sdtR*+ strain was not significant relative to the EPI300::pCI372 control (**Figures 6A,B**).

Although *sdtR* did confer a salt tolerance phenotype, neither of the cloned genes replicated the original phenotype related to L-carnitine. We re-examined clone SMG 9 and carried out further studies; however, these revealed that SMG 9 had lost the carnitine-associated phenotype seen originally. Unfortunately, it was therefore not possible to identify the gene(s) responsible.

## **DISCUSSION**

In the present study we have identified a novel salt tolerance gene from a metagenomic library clone from the human gut microbiome using a combined functional metagenomic and PM approach. The clone, SMG 9, was identified from a previous library screen to identify salt tolerant clones (Culligan et al., 2012) and was further characterized in this study using PM osmolyte plates. From the PM screen, SMG 9 showed an increased metabolic profile in the presence of 6% NaCl + 1 mM L-carnitine compared to the control strain carrying an empty fosmid vector (EPI300::pCC1FOS), indicating this clone could utilize or transport L-carnitine. Experiments to confirm the findings of the PM assay showed that SMG 9 displayed an increased growth profile at 5% and 6% NaCl in the presence of 1 mM L-carnitine compared to controls, similar to observations in the PM assay. Transposon mutagenesis was attempted to create phenotypic knock-out mutants using the EZ-Tn*5* system (Epicentre Biotechnologies; Goryshin and Reznikoff, 1998), but this proved unsuccessful. This may be due to the presence of a gene encoding a DNA repair protein MutS on the fosmid insert, which has been associated with transposon excision (specifically Tn5 and Tn10) (Lundblad and Kleckner, 1985).

Next generation sequencing of the full fosmid insert of SMG 9 and functional assignment of the encoded proteins using BLASTP revealed that the sequences shared highest identity with *Bacteroides* sp. CAG:545. Species of *Bacteroides* are commonly found in the human gut, where the resident microbiota is largely composed of species from two dominant phyla, the *Bacteroidetes* and *Firmicutes* (Qin et al., 2010). The %G+C content of the SMG 9 insert was 49.44%, which is close to the reported range of 40–48% for genomes of species of *Bacteroides* (Shah, 1992). Sequencing and subsequent functional annotation did not reveal any obvious genes related to known L-carnitine transport or utilization systems, suggesting a novel mechanism may be involved. We conducted further bioinformatic analysis of the encoded proteins in an attempt to identify any link to salt tolerance or osmoprotectant transport. Gene 5 (*sdtR*) is predicted to encode a σ<sup>54</sup>-dependent transcriptional regulator, which contains a number of domains (see **Table 2**), including a helix-turn-helix 8 (HTH\_8), Fis-family protein domain, while gene 18 (*mfsT*) is predicted to encode a major facilitator superfamily (MFS) protein with a UhpC sugar phosphate permease domain. Both genes were chosen for further study and cloning. Fis is a regulatory protein involved in the regulation of proline (another important osmoprotectant) uptake (Xu and Johnson, 1995, 1997; Typas et al., 2007), and we further reasoned that *sdtR* could be regulating host EPI300 genes, contributing to the L-carnitine-associated phenotype. MFS transporters also play a role in osmoprotectant uptake (Culham et al., 1993; Haardt et al., 1995; Pao et al., 1998). Despite the presence of a sugar phosphate permease domain, indicating sugar transport, MFS transporters are known to have a diverse substrate range (Pao et al., 1998; Saier, 2000; Law et al., 2008).

The *mfsT* gene did not confer salt tolerance or the L-carnitine-associated phenotype on transformed cells (data not shown). The *sdtR* gene also did not confer the L-carnitine-associated phenotype on *E. coli*, but did confer an increased salt tolerance phenotype. *sdtR* therefore represents a novel salt tolerance gene and most likely functions by influencing expression (either positively or negatively) of host *E. coli* genes. Because transcriptional regulators can influence a wide variety of genes, further work, comprising expression studies and microarray analysis, will be required to elucidate the genes involved. Transcriptional regulators are commonly involved in the response to different stresses in bacteria (Hengge-Aronis et al., 1991; Cheville et al., 1996; Battesti et al., 2011; Hoffmann et al., 2013), while σ<sup>54</sup> (RpoN) has been shown to play a role in osmotolerance in *Listeria monocytogenes* (Okada et al., 2006).

The loss of the carnitine-associated phenotype of SMG 9 prevented further characterization of this clone and ultimately the identification of the gene(s) responsible. It is difficult to pinpoint the cause of this phenotypic reversion, but a clue may be evident from **Figure 4A**, where a growth defect for SMG 9 is apparent when grown in M9MM + 1 mM L-carnitine. This indicates L-carnitine may be increasing the metabolic load on the cell and this metabolic stress is only relieved in the presence of NaCl, when L-carnitine may be utilized efficiently in an osmoprotective capacity. If the gene is constitutively expressed, a mutation may have occurred to counteract this phenomenon. The presence of a gene encoding MutS may also be relevant as mutations to MutS can result in a mutator phenotype in *E. coli* cells (Wu and Marinus, 1994). Furthermore, it is possible the original SMG 9 clone acquired a mutation on the fosmid insert that conferred the carnitine-associated phenotype and a subsequent suppressor mutation occurred to silence this mutation, returning the clone to its original phenotype.

In conclusion, we have identified a novel salt tolerance gene from the human gut microbiome using a combined functional metagenomic and PM approach. The gene originates from a species of *Bacteroides* and encodes a putative transcriptional regulator. Overall this study demonstrates the utility of functional metagenomics and phenomics for novel gene discovery and functional characterization of metagenome-derived clones.

## **ACKNOWLEDGMENTS**

The Alimentary Pharmabiotic Centre is a research center funded by Science Foundation Ireland (SFI grant number 07/CE/B1368). We acknowledge the continued financial assistance of the Alimentary Pharmabiotic Centre, funded by Science Foundation Ireland. Julian R. Marchesi acknowledges funding from The Royal Society which supports the bioinformatic cluster (Hive) at Cardiff University, School of Biosciences. Roy D. Sleator is an ESCMID Research Fellow and Coordinator of ClouDx-i, an EU FP7-PEOPLE-2012-IAPP project.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 26 March 2014; paper pending published: 05 April 2014; accepted: 07 April 2014; published online: 29 April 2014.*

*Citation: Culligan EP, Marchesi JR, Hill C and Sleator RD (2014) Combined metagenomic and phenomic approaches identify a novel salt tolerance gene from the human gut microbiome. Front. Microbiol. 5:189. doi: 10.3389/fmicb.2014.00189*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Culligan, Marchesi, Hill and Sleator. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Hui Chen\* and Wen Jiang*

*Department of Conservative Dentistry and Periodontics, Affiliated Hospital of Stomatology, College of Medicine, Zhejiang University, Hangzhou, China*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Maria Paula Curado, International Prevention Research Institute, France*

*Nur A. Hasan, University of Maryland College Park, USA*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

*Arseniy E. Yuzhalin, University of Oxford, UK*

#### *\*Correspondence:*

*Hui Chen, Department of Conservative Dentistry and Periodontics, Affiliated Hospital of Stomatology, College of Medicine, Zhejiang University, No. 395 Yanan Road, Hangzhou, Zhejiang 310006, China e-mail: huic66@hotmail.com*

The oral microbiome is one of the most diverse microbial communities in the human body and is closely related to oral health and disease. As the technique has developed, high-throughput sequencing has become a popular approach for oral microbial analysis. Oral bacterial profiles have been studied to explore the relationship between microbial diversity and oral diseases such as caries and periodontal disease. This review describes the application of high-throughput sequencing to the characterization of the oral microbiota and to the analysis of changes in the microbiome in states of health and disease. A deep understanding of the microbiota will pave the way for more effective preventive dentistry and contribute to the development of personalized dental medicine.

**Keywords: oral microbiome, high-throughput sequencing, dental caries, periodontitis, apical periodontitis**

## **INTRODUCTION**

Improvements in biotechnologies have spurred a large number of studies aimed at obtaining a better understanding of the composition and effects of the microbiota and its associations with various human diseases. Early studies of the human microbiome characterized most of the microbes that occupy the different habitats of the human body and showed that they are ∼10 times more numerous than our own cells; as a result, more attention has been paid to viewing ourselves as a supraorganism (Gill et al., 2006).

Traditional culture-independent methods, such as DNA–DNA hybridization or cloning and sequencing of DNA, have been widely used to identify oral organisms and have found over 700 bacterial species in the oral cavity (Aas et al., 2005). However, they still have significant biases that do not allow microbial diversity to be fully studied, as many low-abundance species cannot be detected.

Recently, a major advance over conventional sequencing techniques has been the development of high-throughput sequencing methods, which are part of the next-generation sequencing (NGS) techniques (Ronaghi, 2001). These methods can largely be grouped into three main types: sequencing by synthesis (Roche 454 pyrosequencing, Illumina, the Ion Torrent system), sequencing by ligation (SOLiD, Polonator G.007 system), and single-molecule sequencing (Helicos, Pacific Biosciences). They have proved amenable for use in massively parallel signature sequencing technologies (Rothberg and Leamon, 2008), including Roche 454 genome sequencers, Illumina sequencers, the Applied Biosystems SOLiD sequencer, Life Technologies Ion Torrent, the Helicos Biosciences HeliScope, and the Pacific Biosciences SMRT DNA sequencer. The most frequently used methods are 454 pyrosequencing (Roche, Branford, CT, USA), Illumina (Illumina, San Diego, CA, USA), and SOLiD (Applied Biosystems, Foster City, CA, USA). Each has its own characteristics. Nucleotide detection in the Illumina and SOLiD systems is performed one at a time; as a result, homopolymer regions can be accurately sequenced. A second advantage is their high output per run compared to 454 pyrosequencing, which soon made them workhorses for whole-genome resequencing applications and for exploring metabolic processing potential and pathway representation in health and disease. Owing to optical signal decay and dephasing, these systems have relatively short read lengths (Zhou et al., 2010). On the other hand, the 454 pyrosequencing platform has a long read length and a relatively short run time. Furthermore, it does not need an extra chemical deblocking step, which reduces the chances of premature chain termination and non-simultaneous extension (Metzker, 2010; Zhou et al., 2010). Despite the drawback of a relatively high cost per megabase of sequencing output, 454 pyrosequencing has become one of the most widely used platforms worldwide (Rogers and Venter, 2005).

The oral cavity is one of the indispensable habitats of the human microbiome. The dynamics of the oral bacterial community are complicated and still far from fully understood. In this paper, we review the composition of the human oral microbiome and its shifts in relation to health and disease, as revealed by high-throughput sequencing.

## **THE CONCEPT OF HUMAN ORAL MICROBIOME**

The oral microbiome, also referred to as the oral microflora or oral microbiota, is defined as all the microorganisms residing in the human oral cavity and their collective genome. The term microbiome was first coined by Lederberg and McCray (2001) "to signify the ecological community of commensal, symbiotic, and pathogenic microorganisms that literally share our body space and have been all but ignored as determinants of health and disease." The oral cavity harbors one of the most diverse microbiomes in the human body. Bacteria predominate in the oral cavity, while other microorganisms are present in relatively low proportions in most circumstances.

The oral microbiome resides on the teeth, gingival sulcus, tongue, cheeks, hard and soft palates, and tonsils, and it is a critical component of oral health and disease.

## **THE DIVERSITY AND COMPOSITION OF ORAL MICROBIOME**

The oral cavity, one of the largest and most complex human-associated microbial habitats, harbors large numbers of bacteria that can have important effects on health. During the past 40 years, a wealth of knowledge has been gathered about these bacteria: over 250 oral species have been isolated and characterized by cultivation, and over 450 species have been identified by culture-independent molecular approaches (Paster et al., 2006). Using 454 pyrosequencing, we studied 120 children at age 6 and found about 2,000 phylotypes, clustered at the 3% dissimilarity level, in each sample from caries-active and caries-free children; there were more phylotypes in saliva than in dental plaque samples from the same groups. At the genus level, sequences from saliva and plaque represented 203 different genera: 153 genera were found in dental plaque (120 in caries-active samples and 116 in caries-free samples) and 156 genera were found in saliva (132 in caries-active samples and 115 in caries-free samples; Ling et al., 2010). No gender differences in oral microbiome diversity were detected. Using whole-metagenome sequencing approaches, Firmicutes, Actinobacteria, Bacteroidetes, Fusobacteria, and Proteobacteria were found to account for 80–95% of the entire oral microbiome. At the genus level, a total of 58 distinct genera were present at 0.1% abundance. The most abundant genera comprise previously characterized oral bacteria: *Actinomyces*, *Prevotella*, *Streptococcus*, *Fusobacterium*, *Leptotrichia*, *Corynebacterium*, *Veillonella*, *Rothia*, *Capnocytophaga*, *Selenomonas*, *Treponema*, and TM7 genera 1 and 5 (Liu et al., 2012).
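To make the idea of clustering phylotypes "at the 3% dissimilarity level" concrete, the following is a minimal sketch of greedy centroid-based clustering. It is not the pipeline used in the cited studies: it assumes pre-aligned reads of equal length, uses the fraction of mismatched positions as the dissimilarity measure, and the function names and toy sequences are hypothetical.

```python
def dissimilarity(a, b):
    """Fraction of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def greedy_otu_clusters(seqs, threshold=0.03):
    """Assign each read to the first centroid within `threshold`;
    a read that matches no centroid starts a new cluster (phylotype)."""
    centroids = []
    clusters = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if dissimilarity(s, c) <= threshold:
                clusters[i].append(s)
                break
        else:
            # No existing centroid is close enough: open a new cluster.
            centroids.append(s)
            clusters.append([s])
    return clusters


# Toy reads (made up for illustration), each 100 nt long.
base = "ACGT" * 25
near = "TT" + base[2:]        # 2 mismatches vs. base (2% dissimilar)
far = "T" * 10 + base[10:]    # 8 mismatches vs. base (8% dissimilar)

clusters = greedy_otu_clusters([base, near, far], threshold=0.03)
print(len(clusters), [len(c) for c in clusters])
```

In this toy case the first two reads fall into one phylotype and the third forms its own. Greedy clustering is order-dependent, which is one reason phylotype counts can differ between analysis pipelines.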

#### **THE COMPOSITION OF ORAL MICROBIOTA IN HEALTH**

The oral microbiome of 60 caries-free children was analyzed with 454 pyrosequencing in one of our studies (Ling et al., 2010). The results were broadly in agreement with other studies, which found more than fourteen phyla in healthy subjects, including Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Spirochaetes, Fusobacteria, Euryarchaeota, Chlamydia, Chloroflexi, SR1, Synergistetes, Tenericutes, Cyanobacteria, OD2, and TM7 (Zaura et al., 2009; Bik et al., 2010; Griffen et al., 2011). Among them, the vast majority of oral bacteria (more than 80% of the taxa) belong to Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Spirochaetes, and Fusobacteria (Ling et al., 2010). At the genus level, over 200 genera were found in the oral microbiota. The most abundant genera include *Streptococcus*, *Prevotella*, *Neisseria*, *Haemophilus*, *Porphyromonas*, *Gemella*, *Rothia*, *Granulicatella*, *Fusobacterium*, *Actinomyces*, and *Veillonella*. At the species level, it has been estimated that the number of species-level phylotypes is between 500 and 10,000 (Keijser et al., 2008; Lazarevic et al., 2009), and each oral niche harbored 266 "species-level" phylotypes on average (Zaura et al., 2009). This is higher than the 10–81 species per site previously reported using a 16S rRNA gene-based microarray (Preza et al., 2009). In the study by Hasan et al. (2014), the microbiota of human saliva was analyzed using the Illumina GAIIx and HiSeq 2000 instruments, and more than 175 bacterial species were found at >90% accuracy, including *Haemophilus influenzae*, *Neisseria meningitidis*, *Streptococcus pneumoniae*, and *Gammaproteobacteria*.

## **THE CORE MICROBIOME IN HEALTH**

In our study, 3530 operational taxonomic units (OTUs) were shared among the oral microbiota of the intact enamel surfaces of 60 children. Five phyla, Proteobacteria, Firmicutes, Actinobacteria, Fusobacteria, and Bacteroidetes, were found in healthy individuals, while nine genera were common to all subjects, including *Actinomyces*, *Capnocytophaga*, *Corynebacterium*, *Derxia*, *Leptotrichia*, *Neisseria*, *Prevotella*, *Streptococcus*, and *Veillonella*. Thus, in agreement with other studies, it was proposed that there might be a core microbiome in the oral environment (Shade and Handelsman, 2012; Jiang et al., 2014; Xu et al., 2014). The core microbiome is shared by most individuals and comprises the predominant species in healthy conditions of the oral cavity (Zarco et al., 2012).

In Zaura's study, 26% of the unique sequences and 47% of the OTUs were shared among the oral microbiomes of three healthy subjects. At the higher taxonomic levels, 72% of all taxa (genus level or above) were common to the oral microbiomes of the three adults, contributing 99.8% of all reads. These results also suggest the existence of a core microbiome (Zaura et al., 2009). Moreover, Lazarevic et al. (2010) found that the salivary microbial community appeared to be stable across different time points (from 5 to 29 days), supporting the concept of a core microbiome in the healthy state. However, we are only beginning to understand whether there is a core microbiome in different oral niches, and more research is needed to verify this concept.
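The notion of a core microbiome as the set of taxa shared by every individual can be illustrated with a set intersection. The sketch below is a simplified illustration with made-up OTU labels; the function names are hypothetical, and the cited studies may have defined and computed their shared fractions differently.

```python
def core_otus(samples):
    """Return the OTUs present in every sample (set intersection)."""
    otu_sets = list(samples.values())
    if not otu_sets:
        return set()
    return set.intersection(*otu_sets)


def shared_fraction(samples):
    """Fraction of all observed OTUs that belong to the core."""
    all_otus = set().union(*samples.values())
    if not all_otus:
        return 0.0
    return len(core_otus(samples)) / len(all_otus)


# Toy data: three subjects with hypothetical OTU labels.
samples = {
    "subject_1": {"OTU1", "OTU2", "OTU3", "OTU4"},
    "subject_2": {"OTU1", "OTU2", "OTU5"},
    "subject_3": {"OTU1", "OTU2", "OTU3", "OTU6"},
}

core = core_otus(samples)
fraction = shared_fraction(samples)
print(sorted(core), round(fraction, 2))
```

A strict intersection like this is sensitive to sequencing depth, since a rare taxon missed in one sample drops out of the core; relaxed definitions (for example, presence in most rather than all individuals) are also used in practice.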

#### **THE SITE-SPECIFICITY MICROBIOME**

According to the research of Huse et al., the oral bacterial microbiota may be site-specific and shows different richness at different sites. The hard palate showed the lowest estimate of total richness, while the gingival plaque showed the highest. The genus *Corynebacterium* had at least eight OTUs with five different profiles that colonized different niches. For instance, *Corynebacterium matruchotii* was present almost exclusively in the supragingival plaque, while *Corynebacterium argentoratense* was found mostly in saliva and to a lesser extent on the hard palate (Huse et al., 2012). Further study also found similar results. This may be due to the shedding of epithelial cells and the shear forces from chewing in the buccal fold and the hard palate. The genera *Eubacterium* and *Prevotella* showed a significant association with the tongue dorsum. The papillary structure and the low redox potential of its surface might explain this significant site-specific bacterial association. *Lautropia mirabilis* was the only species significantly associated with the supragingival plaque, while *Treponema socranskii* was found only in the subgingival plaque (Preza et al., 2009). The anaerobic environment of subgingival plaque may explain this site-specific association. In the oropharynx, the distribution of Firmicutes, Proteobacteria, and Bacteroidetes was similar to that of saliva, but with more Proteobacteria than in the mouth (Lemon et al., 2010).

### **THE MICROBIOME VARYING DURING DIFFERENT PERIODS OF AGE**

The oral microbiota that colonizes the oral cavity varies with age and changes with developmental status, such as the primary and permanent dentition. *Veillonella*, *Neisseria*, *Rothia*, *Haemophilus*, *Gemella*, *Granulicatella*, *Leptotrichia*, and *Fusobacterium* were the predominant genera in infant samples, while *Haemophilus*, *Neisseria*, *Veillonella*, *Fusobacterium*, *Oribacterium*, *Rothia*, *Treponema*, and *Actinomyces* were present at higher levels in their parents. The salivary microbiome of adults had greater bacterial diversity than that of infants (Cephas et al., 2011). Crielaard et al. (2011) found a higher proportion of Proteobacteria (*Gammaproteobacteria*, *Moraxellaceae*) than of Bacteroidetes in the deciduous dentition, while Bacteroidetes (mainly genus *Prevotella*), the *Veillonellaceae* family, Spirochaetes, and candidate division TM7 increased with increasing age. This may reflect variation of the oral microbiome driven by biological changes with age (Crielaard et al., 2011). In another study comparing salivary microbiota from healthy children and adults, the mean levels of seven genera including *Moraxella*, *Leptotrichia*, *Peptostreptococcus*, *Eubacterium*, *Neisseriaceae*, *Flavobacteriaceae*, and SR1 differed significantly between children and adults, implying that the microbiome shifts with age (Ling et al., 2013).

## **ORAL MICROBIOTA IN DENTAL CARIES**

#### **THE ORAL MICROBIOTA IN CARIES**

Dental caries is one of the most prevalent chronic infectious diseases worldwide (Petersen et al., 2005). A core theme in characterizing the oral microbiota has been the effort to link particular organisms to tooth decay in a way that implies causation. Most studies have suggested that *Streptococcus mutans* is the major pathogen of dental caries, as it is the bacterium most frequently detected in caries lesions (Hamada and Slade, 1980; Loesche, 1986; Matee et al., 1992).

However, some studies indicate that the relationship between mutans streptococci (MS) and caries is not absolute: high proportions of MS may persist on tooth surfaces without lesion development, and caries can develop in the absence of *Streptococcus mutans* (Bowden, 1997). Researchers have also found that acidogenic and aciduric bacteria other than MS are responsible for the initiation of caries (Kashket et al., 1996). As new pyrosequencing techniques have been applied in oral microbiology, ever greater numbers of bacteria have been identified as being associated with caries. In our recent study, applying high-throughput barcoded pyrosequencing combined with PCR-denaturing gradient gel electrophoresis, around 120 genera were found in the oral microbiota of saliva and supragingival plaques from children aged 3–6 years with and without dental caries. Our study showed that the oral microbiota in children was far more diverse than previous studies had reported, and more than 200 genera belonging to 10 phyla were found in the oral cavity. Six genera (*Streptococcus*, *Veillonella*, *Actinomyces*, *Granulicatella*, *Leptotrichia*, and *Thiomonas*) differed significantly between caries-active and caries-free plaque samples (Ling et al., 2010). Further research also found that three genera, *Streptococcus*, *Granulicatella*, and *Actinomyces*, exhibited a relatively higher abundance in subjects with severe early childhood caries, whereas caries-free subjects exhibited a relatively higher abundance of *Aestuariimicrobium*, indicating that there might be no specific pathogens but rather that shifts in the structure of pathogenic populations lead to the occurrence of dental caries (Jiang et al., 2013). Yang et al. (2012) found that caries microbiomes were significantly more variable and that 147 OTUs were associated with adult dental caries.

## **THE MICROBIOTA SHIFTING IN THE DIFFERENT STAGE OF CARIES**

Our further research also found that the oral microbiota was specific to different stages of caries progression. Gomar-Vercher et al. (2014) collected 110 saliva samples from 12-year-old children and divided them into six groups according to the International Caries Detection and Assessment System II criteria. They found that *Porphyromonas* and *Prevotella* showed an increasing percentage compared to healthy individuals and that bacterial diversity diminished as the severity of the disease increased (Gomar-Vercher et al., 2014). We also used pyrosequencing to study the plaque microbiome of caries-active subjects at different caries stages, including intact enamel, white spot lesions, and carious dentin lesions. The results showed that the total plaque bacterial community was more diverse in healthy subjects than in caries subjects, which is in accordance with the study by Gomar-Vercher et al. Moreover, thirteen genera (including *Capnocytophaga*, *Fusobacterium*, *Porphyromonas*, *Abiotrophia*, *Comamonas*, *Tannerella*, *Eikenella*, *Paludibacter*, *Treponema*, *Actinobaculum*, *Stenotrophomonas*, *Aestuariimicrobium*, and *Peptococcus*) were associated with dental health; eight genera (including *Cryptobacterium*, *Lactobacillus*, *Megasphaera*, *Olsenella*, *Scardovia*, *Shuttleworthia*, and *Streptococcus*) increased significantly in cavitated dentin lesions; *Actinomyces* and *Corynebacterium* were present at significantly high levels in white spot lesions; and *Flavobacterium*, *Neisseria*, *Bergeyella*, and *Derxia* were enriched in the intact surfaces of caries individuals (Jiang et al., 2014). Relatively high proportions of *Atopobium*, *Prevotella*, or *Propionibacterium* with *Streptococcus* or *Actinomyces* dominated in carious dentin lesions in the study by Obata et al. (2014).
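Statements such as "bacterial diversity diminished as the severity of the disease increased" rest on a quantitative diversity index. As a minimal sketch, the example below computes the Shannon index, H = −Σ pᵢ ln pᵢ, from taxon counts. The choice of the Shannon index is an assumption made for illustration (the cited studies may have used other measures), and the counts are made up.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts for four taxa in two samples.
even_community = [25, 25, 25, 25]     # reads spread evenly across taxa
skewed_community = [85, 5, 5, 5]      # one taxon dominates

h_even = shannon_index(even_community)
h_skewed = shannon_index(skewed_community)
print(f"even: {h_even:.3f}, skewed: {h_skewed:.3f}")
```

For a fixed number of taxa the index is maximal (ln 4 here) when reads are evenly distributed and falls as a few taxa come to dominate, which is the pattern a loss of diversity in advanced lesions would produce.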

## **ORAL MICROBIOTA OF APICAL PERIODONTITIS**

Apical periodontitis develops around the apex of the dental root and is caused primarily by root canal infection (Siqueira, 2001). Bacterial biofilm communities established in the apical part of infected root canals are conceivably of the most importance in the pathogenesis of apical periodontitis (Siqueira, 2002). For there was no strong evidence of the specific involvement of a single species with any particular sign or symptom of apical periodontitis been found with advancing technique. By using massive parallel pyrosequencing analysis, 187 bacterial species-level phylotypes, 84 genera and 10 phyla were found in the apical part of root canals of teeth with apical periodontitis. The most abundant and prevalent phyla were Proteobacteria, Firmicutes, Bacteroidetes, Fusobacteria, and Actinobacteria. And the mean number of species-level phylotypes per sample was 37. These results indicate that bacterial communities in apical periodontitis are more diverse than previously demonstrated (Siqueira et al., 2011). Another study investigated the microbial diversity in symptomatic and asymptomatic canals with primary endodontic infections by using GS FLX Titanium pyrosequencing. The result showed that the vast majority of sequences belonged to seven phyla including *Actinobacteria*, *Bacteroidetes*, *Firmicutes*, *Fusobacteria*, *Proteobacteria*, *Spirochetes*, and *Synergistetes*. And *Pyramidobacter*, *Streptococcus*, *Leptotrichia* constituted nearly 50% of microbial community in asymptomatic teeth, whereas *Neisseria*, *Propionibacterium*, and *Tessaracoccus* were frequently found in symptomatic teeth (Lim et al., 2011). Santos et al. (2011) performed barcoded multiplex pyrosequencing to compare the microbiota of dental root canal infections associated with acute or chronic apical periodontitis. 
They found that the most abundant phyla in acute infections were *Firmicutes*, *Fusobacteria*, and *Bacteroidetes*, while in chronic infections the dominant phyla were *Firmicutes*, *Bacteroidetes*, and *Actinobacteria*. The most prevalent genera in acute infections were *Fusobacterium* and *Parvimonas* (Santos et al., 2011). In the pyrosequencing study of Hong et al., the diversity of the intracanal bacterial community did not differ significantly between primary and persistent endodontic infections associated with asymptomatic chronic apical periodontitis, and Bacteroidetes was the most abundant phylum in both primary and persistent infections (Hong et al., 2013). In symptomatic periapical lesions, the most abundant phyla were Proteobacteria and Firmicutes, while the predominant genera were *Fusobacterium*, *Streptococcus*, *Prevotella*, *Corynebacterium*, *Porphyromonas*, and *Actinomyces* (Saber et al., 2012). Another study analyzed endodontic infections by deep-coverage pyrosequencing and found 179 bacterial genera in 13 phyla, among which Bacteroidetes was the most prevalent bacterial phylum (Li et al., 2010).

## **ORAL MICROBIOTA IN PERIODONTITIS**

Periodontitis is an inflammatory disease in which oral bacteria play an important role in disease progression. It is thought to have a polymicrobial etiology, and comprehensive studies have been performed to elucidate the differences between the complex communities in health and disease (Ashimoto et al., 1996). By comparing periodontally healthy controls with subjects with chronic periodontitis, Griffen et al. found that community diversity was higher in disease than in health. They identified 123 species that were significantly more abundant in individuals with chronic periodontitis and 53 species that were more abundant in healthy controls. Spirochaetes, Synergistetes, and Bacteroidetes were more abundant in disease, whereas Proteobacteria were more abundant in healthy controls; Clostridia, *Negativicutes* and *Erysipelotrichia* were associated with disease (Griffen et al., 2012). Sequencing of 16S rRNA genes also revealed significant differences in abundance between the oral microbiome of deep (diseased) and shallow (healthy) sites. In the deep sites, 14 genus-level OTUs, including *Streptococcus*, *Actinomyces*, and *Veillonella*, were decreased, whereas 37 genus-level OTUs, such as *Prevotella*, *Porphyromonas*, *Treponema*, and *Fusobacterium*, were present in increased abundance compared to shallow sites (Ge et al., 2013). Pyrosequencing showed that the gram-negative genera *Selenomonas*, *Prevotella*, *Treponema*, *Tannerella*, *Haemophilus*, and *Catonella* were significantly enriched in periodontal disease, whereas a set of gram-positive genera were significantly enriched in healthy samples (*Streptococcus*, *Actinomyces*, and *Granulicatella*; Liu et al., 2012). In another metagenomic sequencing analysis, Bacteroidetes was the most abundant phylum in samples of periodontal disease, whereas Actinobacteria and Proteobacteria were significantly increased in plaque from periodontally healthy subjects.
At the genus level, the microbial community in periodontal health was dominated by *Streptococcus*, *Haemophilus*, *Rothia*, and *Capnocytophaga*, while the microbiota in periodontal disease exhibited high levels of *Prevotella* (Wang et al., 2013). Another 16S rRNA gene sequencing analysis found that *Fusobacterium*, *Porphyromonas*, *Treponema*, *Filifactor*, *Eubacterium*, *Tannerella*, *Hallella*, *Parvimonas*, *Peptostreptococcus*, and *Catonella* showed higher relative abundances in the periodontitis group (Li et al., 2014).
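The studies above compare community diversity and genus-level relative abundance between healthy and diseased sites. As a minimal sketch of how such comparisons are commonly computed, the Python below derives relative abundances and the Shannon diversity index from read counts. The read counts and function names are hypothetical illustrations and are not taken from any of the cited studies, which may have relied on other diversity measures.

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions (zero counts are dropped)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items() if n > 0}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over the taxa present in the sample."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values())


# Hypothetical genus-level read counts, for illustration only.
shallow_site = {"Streptococcus": 400, "Actinomyces": 300, "Veillonella": 200, "Prevotella": 100}
deep_site = {"Prevotella": 350, "Porphyromonas": 250, "Treponema": 200,
             "Fusobacterium": 150, "Streptococcus": 50}

for name, sample in (("shallow", shallow_site), ("deep", deep_site)):
    print(name, round(shannon_index(sample), 3))
```

A higher index indicates a richer and more even community; comparing the index between groups is one simple way to express the statement that diversity is higher or lower in disease.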

Sequencing on an Ion Torrent Personal Genome Machine showed that the diversity of the bacterial community increased after scaling and root planing (SRP) therapy. The most striking difference was that periodontal pathogenic species, including members of the genera *Porphyromonas*, *Tannerella*, *Treponema*, and *Filifactor*, were removed only in the group treated with SRP and antibiotics (Jünemann et al., 2012). Post-treatment plaque samples retained the highest similarity to pre-treatment samples from the same individual (Schwarzberg et al., 2014).

## **CONCLUSION**

High-throughput techniques have greatly expanded our knowledge of the composition of the bacterial communities associated with health and disease. The oral microbiota is far more diverse than previously thought. The discovery of a number of uncultivated organisms has shed light on the relationship between the oral microbiota and the development of caries, periodontitis and other oral diseases. The new high-throughput methodologies are likely to advance our understanding of bacterial ecology in oral disease.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 April 2014; accepted: 09 September 2014; published online: 13 October 2014.*

*Citation: Chen H and Jiang W (2014) Application of high-throughput sequencing in understanding human oral microbiome related with health and disease. Front. Microbiol. 5:508. doi: 10.3389/fmicb.2014.00508*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Chen and Jiang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Osmoregulation and the human mycobiome

## *Abhishek Saxena and Ramakrishnan Sitaraman\**

*Department of Biotechnology, TERI University, New Delhi, India \*Correspondence: minraj@gmail.com*

#### *Edited and reviewed by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

**Keywords: mycobiome, microbiota, osmoregulation, MAP kinase,** *Candida* **spp., stress response,** *S. cerevisiae*

## **INTRODUCTION**

The last one-and-a-half decades have made it amply clear that the human microbiota have a very significant role to play in health and disease. The human body can (or should) be better viewed as a complex ecosystem inhabited by micro-organisms that outnumber human cells 10 to 1 (Ley et al., 2006). However, most research in this field has been focused on the prokaryotic (specifically, bacterial) component of the microbiota. Sampling, in turn, is carried out mostly from sites that are readily accessible (Human Microbiome Project Consortium, 2012). Such sampling might not be genuinely representative of the situation *in vivo*, even for the bacteria under study. The human alimentary tract is a complex entity that exhibits extensive variation in biological characteristics such as tissue/cell types and secretions, as well as parameters such as temperature, pH, oxygen levels and osmolarity along its entire length. A proteomic study sampling mucosal lavages at multiple colonic sites indicated significant differences in protein profiles between the proximal and distal colon, which was supportive of the concept of their functional and developmental distinctness (Li et al., 2011). The colonization of the human infant by microbes, initially during the process of birth, exhibits an ecological succession of microbial species over time, and plays a prominent role in the maturation of the immune system as well (reviewed in Costello et al., 2012).

The fungal members of the microbiota are not very numerous compared to bacteria. Large-scale metagenomic sequencing of fecal samples of 124 individuals found that only about 0.1% of genes detected were of eukaryotic origin (Qin et al., 2010). The most commonly encountered genera constituting the fungal microbiota or "mycobiome" (Huffnagle and Noverr, 2013) are *Candida*, *Saccharomyces* and *Cladosporium* (Hoffmann et al., 2013). The bacterial microbiota and a functional immune system are thought to keep the numbers of opportunistic fungal pathogens, such as *Candida* spp., under check in the absence of any perturbations. However, information from studies of polymicrobial diseases points to subtler adjustments, dependent on environmental conditions and cross-kingdom signals that eventually influence (positively and negatively) modes and rates of growth (reviewed in Peleg et al., 2010). The sensing of, and responses to, biotic and abiotic stimuli by fungi (as in other organisms) involves multiple signaling pathways that can interact to either augment or attenuate one another, as will be discussed below.

## **OSMOREGULATION AND STRESS RESPONSES IN** *SACCHAROMYCES CEREVISIAE*

*S. cerevisiae*, a well-studied fungus, employs two strategies for responding to stress, both involving extensive signaling by MAP kinases (MAPK, alternatively termed SAPK, stress-activated protein kinase). The first is stress-specific, e.g., the responses involved in pheromone sensing, spore wall formation and adaptation to hyperosmotic conditions (Gustin et al., 1998). The HOG (high osmolarity glycerol) pathway that is activated in response to hyperosmotic conditions leads to the accumulation of compatible solutes (glycerol being the most important) and also results in the closure of the aquaglyceroporin Fps1p, enabling retention of glycerol. The HOG pathway functions through two signaling branches. The SLN1 branch involves the two-component membrane sensor protein Sln1p complexed with Ypd1p and Ssk1p that, under hyperosmotic conditions, is unable to inactivate downstream MAPKKKs (functionally redundant Ssk2p and Ssk22p) by phosphorylation. This results in the dephosphorylation of these kinases that phosphorylate the MAPKK Pbs2p which, in turn, phosphorylates the MAPK Hog1p. Phosphorylated Hog1p moves into the nucleus and interacts with transcription factors such as Hot1p, Msn1p and Msn2p, activating the transcription of, among other genes, those encoding phosphatases (e.g., Ptp2p, Ptp3p) that dephosphorylate Hog1p, causing feedback inhibition that limits the duration of Hog1p activity. The SHO1 branch involves two functionally redundant mucin-like transmembrane osmosensors, Msb2p and Hkr1p, that recruit the Pbs2p MAPKK directly to the cytoplasmic face of the cell membrane as part of a macromolecular complex. Notably, SHO1 branch proteins are shared with other signaling pathways, and this branch is activated when hyperosmolarity occurs as a result of other cellular responses (reviewed in Hohmann, 2002; Hohmann et al., 2007).
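The order of events in the SLN1 branch can be summarized as a purely qualitative Boolean sketch. The Python below is only an illustration of the logic described above, not a quantitative model of the pathway: the class, function and field names are hypothetical, every step is reduced to on/off, and glycerol retention is simply tied to Hog1p activity as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HogOutput:
    mapkkk_active: bool        # Ssk2p/Ssk22p no longer held inactive
    pbs2_phosphorylated: bool  # MAPKK
    hog1_phosphorylated: bool  # MAPK
    hog1_in_nucleus: bool
    glycerol_retained: bool    # Fps1p channel closed (assumed to follow Hog1p activity)


def sln1_branch(hyperosmotic: bool, phosphatases_induced: bool = False) -> HogOutput:
    """Qualitative on/off view of the SLN1 branch of the HOG pathway."""
    # The Sln1p-Ypd1p-Ssk1p relay keeps the MAPKKKs inactive unless the cell
    # experiences hyperosmotic conditions.
    mapkkk_active = hyperosmotic
    # Active MAPKKKs phosphorylate the MAPKK Pbs2p.
    pbs2 = mapkkk_active
    # Pbs2p phosphorylates Hog1p; induced phosphatases (e.g., Ptp2p, Ptp3p)
    # reverse this step, which is the feedback inhibition.
    hog1 = pbs2 and not phosphatases_induced
    return HogOutput(mapkkk_active, pbs2, hog1,
                     hog1_in_nucleus=hog1, glycerol_retained=hog1)


early = sln1_branch(hyperosmotic=True)
# Nuclear Hog1p induces the phosphatase genes, so the later state has feedback on.
late = sln1_branch(hyperosmotic=True, phosphatases_induced=early.hog1_in_nucleus)
print(early.hog1_phosphorylated, late.hog1_phosphorylated)
```

Running the two steps in sequence reproduces the transient nature of Hog1p activity: the kinase is active soon after the shock and is switched off once the phosphatases it induces are present.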

The second strategy for coping with stress is the environmental stress response (ESR) that enables adaptation to the long-term effects of various stresses, in contrast to the more specific and short-term response of other MAPK pathways. The ESR was first described as an increased expression of ∼300 genes and repression of ∼600 genes in response to diverse environmental conditions to which *S. cerevisiae* was subjected (Gasch et al., 2000; Causton et al., 2001). In both these studies, various stress conditions were tested, including temperature, hyper- and hypoosmotic shock, and oxidative (H2O2) stress. Induced genes included those involved in a wide variety of processes, including carbohydrate metabolism, detoxification of reactive oxygen species, cellular redox reactions, cell wall modification, protein folding and degradation, DNA damage repair, fatty acid metabolism, metabolite transport, vacuolar and mitochondrial functions, autophagy, and intracellular signaling (Gasch et al., 2000). Genes encoding cytoplasmic ribosomal proteins, tRNA synthetases, proteins required for processing rRNAs, and a subset of translation initiation factors were repressed (Causton et al., 2001).

The ESR provides a "cross-protective effect" wherein *S. cerevisiae* subjected to mild heat stress as the primary stress becomes adapted to higher levels of heat as well as oxidative (H2O2) stresses (Berry et al., 2011). This is especially relevant as stresses under natural conditions do not occur singly or sequentially, but simultaneously. It may seem at first sight that the mild dosage of primary/initial stress is irrelevant, as the ultimate adaptation to the secondary stress(es) is achieved anyway. However, Berry et al. (2011) demonstrated that distinct subsets of genes were activated due to primary and secondary stresses. Earlier work indicated that the cross-protective effect is not universal, but specific to the primary/secondary stress combination (Berry and Gasch, 2008). The major transcription factors mediating the ESR are Msn2p and Msn4p (Berry et al., 2011). Msn2p and Msn4p (see below) play specific roles depending on the stress combination and are even regulated in a condition-specific manner. In addition, other transcription factors activated during the ESR and subsequent "acquired stress resistance," such as Hsf1p (heat stress), Yap1p (oxidative stress), and Hot1p and Sko1p (hyperosmotic stress), can also activate Msn2p/4p target genes (Berry and Gasch, 2008).

Adding to the mechanistic complexity of stress responses is recent evidence that Hog1p induces transcription of a long non-coding RNA whose presence is required for chromatin remodeling around the *CDC28* gene, which encodes a cyclin-dependent kinase, and for the subsequent induction of that gene. This is accompanied by cell cycle delay, and increased Cdc28p levels ensure faster recovery following the stress (Nadal-Ribelles et al., 2014). Osmoadapted *S. cerevisiae* exhibit HOG activation upon shmooing in response to pheromone (Baltanás et al., 2013). Hog1p also imposes checkpoints on the mating pathway if pheromone is sensed during a period of hyperosmotic stress. It phosphorylates the protein kinase Rck2p that inhibits translation of Fus3p (the MAPK of the mating pathway) by phosphorylating EF2 (elongation factor 2). Ste50p, a shared component of both the HOG and mating pathways, is phosphorylated (and inhibited) by Hog1p (Nagiec and Dohlman, 2012). Thus, the osmoregulation response is not elicited solely by hyperosmolarity *per se*, but is also influenced by the spatiotemporal modulation of cell-cycle events involving stimuli impinging on other signaling pathways.

## **THE STRESS RESPONSES AND ADAPTIVE POTENTIAL OF** *CANDIDA* **SPP.**

*Candida* spp. are dimorphic yeasts that occur practically throughout the alimentary tract, and *C. albicans* and *C. glabrata* are well-known opportunistic pathogens. *C. albicans* can grow as a planktonic unicellular organism (yeast) or as a filamentous organism (hypha). The yeast form is suitable for dissemination, while the hyphal form is more adapted to colonization and biofilm formation. *Candida albicans* also exhibits parasexuality, in which mating-competent (so-called "opaque") diploid strains mate to form tetraploids, whose progeny later undergo chromosome losses to regenerate the diploid state (Hull et al., 2000; Magee and Magee, 2000). More recent work has demonstrated the existence of a viable and stable haploid form generated by chromosome losses, implying that *C. albicans* may not be an "obligate diploid" as originally thought (Hickman et al., 2013). This increases the overall repertoire of genetic diversity in the *Candida* population, which could confer an adaptive advantage on the organism. Incidentally, these polymorphic transitions caution us that metagenomic and quantitative approaches to studying the gut microbiota may not adequately reflect significant populations and sub-populations that arise transiently by random and/or adaptive mechanisms.

*Sc, Saccharomyces cerevisiae; Ca, Candida albicans; Cg, Candida glabrata; Cn, Cryptococcus neoformans.*

Orthologs of the various MAPK pathway genes and also of the factors involved in the ESR have been discovered in *Candida* spp. *C. albicans* is thought to have diverged from *S. cerevisiae* more than 200 million years ago (Kurtzman and Piškur, 2006). The *C. albicans* ESR (CaESR) is not as extensive in genetic terms as in *S. cerevisiae*. Only a small number of genes are involved in the CaESR (∼24 upregulated genes and ∼37 downregulated genes) (Enjalbert et al., 2006; Gasch, 2007). This suggests the absence of a core environmental response/general stress response in *C. albicans* (Enjalbert et al., 2003). Later studies confirmed that CaMsn4p only weakly complements the inability of an *msn2msn4* double mutant in *S. cerevisiae* to activate a STRE-*lacZ* reporter (STRE, stress response element), while CaMln1p (*Candida albicans* Msn2p/Msn4p-like protein) does not complement the defect at all (Nicholls et al., 2004). The transcription factors finally activated in the CaESR have not been conclusively identified. Therefore, a complete picture of the activation and regulation of the ESR in *C. albicans* is as yet unavailable. However, the number of genes activated/repressed during the CaESR was greater in response to oxidative stress (5 mM H2O2) than in response to osmotic stress (0.3 M NaCl) and heavy metal stress (0.5 mM CdSO4) (Enjalbert et al., 2006). Other studies have reported that components of stress signaling pathways may be important in virulence, drug tolerance, or quorum sensing, among other phenotypes. Interestingly, Msn2p homologs in the entomopathogenic fungi *Beauveria bassiana* and *Metarhizium robertsii* augment virulence (Zhang et al., 2009; Liu et al., 2013). **Table 1** lists some components of stress signaling known to influence other phenotypes in pathogenic fungi.

### **CONCLUSIONS**

In contrast to bacteria, a beneficial role for fungi in microbiota-human interactions has not emerged, though their role as opportunistic pathogens (cf. ecologically invasive species) has been extensively studied. The major fungal probiotic in use today, *Saccharomyces cerevisiae* var. *boulardii* (Sb), which is not indigenous to the human gut, provides some examples of potential benefits of fungal proteins for the host (Czerucka et al., 1994; Dahan et al., 2003), and can help in the maintenance and/or restoration of host-microbiota homeostasis.

The strategies employed by the members of the mycobiome to adapt to changing conditions along the length of the gut and to host immune responses may depend significantly on adapting to continuously changing environmental parameters that could also serve as indicators for spatial location, microbiota composition and the physiological state of the host. Signaling pathways that respond to different stimuli are not watertight modules, but can interact in unforeseen ways to produce an integrated behavioral response (Baltanás et al., 2013). Thus, components of the osmoregulatory pathway may also participate in the process of mounting a coordinated response to environmental stimuli.

## **ACKNOWLEDGMENTS**

Mr. Abhishek Saxena has successively received research fellowships from the Department of Biotechnology, Government of India (sanction order number BT/PR12598/PBD/26/215/2009, April, 2011 – February, 2014) and from Novozymes and Henning-Holck Larsen foundation (February, 2014 – July, 2014). Mr. Ratan Jha and Dr. Bharati Paliwal of the TERI University library are gratefully acknowledged for their timely and consistent support in procuring some of the references used herein.

### **REFERENCES**


role for impending stress in yeast. *Mol. Biol. Cell* 19, 4580–4587. doi: 10.1091/mbc.E07-07-0680


eds. P. Sunnerhagen and J. Piskur (Berlin, Heidelberg: Springer), 29–46.


**Conflict of Interest Statement:** Work on osmoregulation in *S. cerevisiae* in our laboratory is supported by a grant from the Department of Biotechnology, Government of India (sanction order number BT/PR12598/PBD/26/215/2009).

*Received: 27 March 2014; accepted: 27 March 2014; published online: 17 April 2014.*

*Citation: Saxena A and Sitaraman R (2014) Osmoregulation and the human mycobiome. Front. Microbiol. 5:167. doi: 10.3389/fmicb.2014.00167*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Saxena and Sitaraman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Transposon-mediated directed mutation controlled by DNA binding proteins in *Escherichia coli*

## *Milton H. Saier Jr\* and Zhongge Zhang*

*Division of Biological Sciences, Department of Molecular Biology, University of California at San Diego, La Jolla, CA, USA \*Correspondence: msaier@ucsd.edu*

#### *Edited by:*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

#### *Reviewed by:*

*Robert Heinzen, NIH/NIAID-RML, USA*

*Claudio Palmieri, Polytechnic University of Marche, Italy*

*Anton G. Kutikhin, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Russia*

**Keywords: directed mutation, transposable genetic element, transposon, insertion sequence, gene activation,** *glpFK* **operon**

## **INTRODUCTION**

It is a basic principle of genetics that the likelihood of a particular mutation occurring is independent of its phenotypic consequences. The concept of directed mutation, defined as genetic change that is specifically induced by the stress conditions that the mutation relieves (Cairns et al., 1988), challenges this principle (Foster, 1999; Rosenberg, 2001; Wright, 2004). The topic of directed mutation is controversial, and its existence, even its *potential* existence, as defined above, has been altogether questioned (Roth et al., 2006).

Part of the justifiable skepticism concerning directed mutation resulted from experiments that were purported to demonstrate this phenomenon, but were subsequently shown to be explainable by classical genetics (Roth et al., 2006). Mutation rates vary with environmental conditions (e.g., growth state) and genetic background (e.g., mutator genes), a phenomenon known as "adaptive" mutation (Wright, 2004; Foster, 2005), but this does not render the mutation "directed." To establish the principle of directed mutation, it is necessary to show that the adaptive mutation is "directed" to a specific site, characterize the mechanism responsible, identify the proteins involved, and provide the evolutionary basis for its appearance.

One frequently encountered type of mutation results from the hopping of transposable genetic elements, transposons, which can activate or inactivate critical genes (Mahillon and Chandler, 1998; Chandler and Mahillon, 2002). For example, activation of the normally cryptic β-glucoside (*bgl*) catabolic operon in *E. coli* can be accomplished by insertion of either of two insertion sequences, IS1 or IS5, upstream of the *bgl* promoter (Schnetz and Rak, 1992).

The *E. coli* glycerol (*glp*) regulon consists of five operons, two of which (*glpFK* and *glpD*) are required for aerobic growth on glycerol (Lin, 1976). Both operons are subject to negative control by the DNA-binding *glp* regulon repressor, GlpR (Zeng et al., 1996), which also binds glycerol-3-phosphate, the inducer of the *glp* regulon. The *glpFK* operon is additionally subject to positive regulation by the cyclic AMP receptor protein (CRP) complexed with cyclic adenosine monophosphate (cAMP; Freedberg and Lin, 1973; Campos et al., 2013), although *glpD* is not appreciably subject to regulation by this mediator of catabolite repression (Weissenborn et al., 1992). The *glpFK* regulatory region contains four GlpR binding sites, *O1–O4*, and two CRP binding sites, CrpI and CrpII, which overlap *O2* and *O3*, respectively (**Figure 1A**). The strong CRP dependency of *glpFK* transcription is reflected by the fact that *crp* and *cya* (adenylate cyclase) mutant cells are unable to utilize glycerol (Zhang and Saier, 2009a). We have found that binding of GlpR and the cAMP-CRP complex to the *glpFK* upstream control region negatively influences IS5 hopping specifically into the single site that strongly activates *glpFK* promoter activity (Zhang and Saier, 2009a, unpublished observations).

## **Glp+ MUTATIONS IN A** *crp* **GENETIC BACKGROUND**

When *crp* cells were incubated on solid glycerol minimal medium, Glp+ colonies appeared. We tested the growth of a *crp* Glp+ strain on glycerol in defined liquid medium. The growth rate was greater than that of wild type (*wt*) *E. coli* (Zhang and Saier, 2009b).

The relative rates of Glp+ mutation were determined in minimal and complex media. On glycerol plates, colonies first appeared after about 3 days (**Figure 1B**) although wild type (*wt*) and *crp* Glp+ *E. coli* cells formed visible colonies in <2 days (**Figure 1C**). When the same cells were plated as before, but small numbers of *crp* Glp+ mutant cells were included with the *crp* cells before plating, colonies appeared from the *crp* Glp+ cells within 2 days, and new Glp+ mutants arose at the same rate as before (**Figure 1C**). This experiment showed that the mutants that arose on the plates had not existed in the cell population when initially plated. Thus, the Glp+ mutants arising from *crp* cells on these plates arose during incubation on the plate, and no growth inhibitor was present. The rate of mutation on glycerol plates proved to be 10 times higher than in minimal sorbitol or complex LB medium, and it was over 100 times higher than in glucose-containing medium (Zhang and Saier, 2009a).

Mutation proved to be due to IS5 hopping to a discrete site, between base pairs 126 and 127 upstream of the transcriptional start site, and always in the same orientation. The increase in mutation rate was specific to the *glpFK* operon and did not occur in three other operons examined (Zhang and Saier, 2009a). Only the downstream 177 bp region of IS5 was required for activation of the *glpFK* operon, and this proved to be due to the presence of a permanent bend and an overlapping IHF binding site, each of which was responsible for half of the activation (Zhang and Saier, 2009b). This mechanism of activation (but not the actual transposon hopping event), presumably involving DNA looping, could also be demonstrated for the lactose (*lac*) operon in a *crp* genetic background of *E. coli* (Zhang and Saier, 2009b).

## **DEPENDENCY OF THE Glp+ MUTATION RATE ON GlpR**

Glycerol is phosphorylated by glycerol kinase (GlpK) to glycerol-3-phosphate which binds to and releases GlpR from its operators (Lin, 1976). When GlpR dissociates from its operators, due to glycerol-3-phosphate binding, a conformational change could be transmitted through the DNA, promoting insertion of IS5 at the target CTAA site upstream of the *glpFK* promoter. In other words, GlpR binding might have two functions: repression of gene expression and suppression of IS5 transposition to the upstream activating site.

To test this possibility, the *glpR* gene was deleted, and the rates of appearance of Glp+ mutations in the *crp glpR* double mutant background were measured in the absence and presence of glycerol. The number of Glp+ cells arising was 10-fold higher in the *crp glpR* double mutant than in the *crp* mutant when glycerol was absent. In the presence of glycerol, the loss of GlpR was without effect. Thus, deletion of *glpR* is equivalent to the inclusion of excess glycerol in the growth medium. High-level overexpression of *glpR* decreased the mutation rate to background levels (Zhang and Saier, 2009a). Clearly, GlpR regulates the increased rate of IS5-mediated insertional activation of the *glpFK* promoter by glycerol.

## **GlpR OPERATORS DIFFERENTIALLY CONTROL** *glpFK* **EXPRESSION AND Glp+ MUTATION RATE**

There are four GlpR binding sites, *O1–O4*, in the upstream *glpFK* operon regulatory region (see **Figure 1A**), identified by DNA footprinting (Freedberg and Lin, 1973; Zeng et al., 1996). We mutated the far upstream site (*O1*) and the far downstream site (*O4*) so they no longer could bind GlpR, and compared the effects on *glpFK* expression using a *lacZ* reporter gene fusion construct vs. mutation rate to Glp+ during growth in LB medium. Mutation of *O4* increased *glpFK* operon expression about 5-fold although mutation of *O1* was almost without effect (Zhang and Saier, 2009a). By contrast, loss of *O1* yielded a 7-fold increase in mutation rate although loss of *O4* had only a 2-fold effect on mutation rate. We confirmed that IS5 was always in the same position and orientation (Zhang and Saier, 2009a). Thus, *O1* primarily controls the rate of IS5 insertion into the activating site, while *O4* primarily controls *glpFK* expression.
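The reasoning behind this assignment of roles can be made explicit by comparing the two fold changes for each operator mutation. The short Python sketch below uses the approximate values quoted above; treating "almost without effect" as a 1.0-fold change is an assumption made here for illustration, and the variable and function names are hypothetical.

```python
# Approximate fold changes on loss of GlpR binding at each operator, as quoted
# in the text. The value 1.0 for O1 expression is an assumed stand-in for
# "almost without effect".
operator_effects = {
    "O1": {"expression_fold": 1.0, "mutation_rate_fold": 7.0},
    "O4": {"expression_fold": 5.0, "mutation_rate_fold": 2.0},
}


def primary_role(effects):
    """Name the process that responds more strongly when the operator is lost."""
    if effects["mutation_rate_fold"] > effects["expression_fold"]:
        return "IS5 insertion rate"
    return "glpFK expression"


roles = {operator: primary_role(e) for operator, e in operator_effects.items()}
for operator, role in roles.items():
    print(f"{operator} primarily controls {role}")
```

The comparison is deliberately crude: it only ranks the two measured effects for each operator and says nothing about the underlying mechanism.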

## **CONTROL OF IS5-MEDIATED** *glpFK* **OPERON ACTIVATION BY THE cAMP-CRP COMPLEX**

As noted above, IS5-mediated activation of the *glpFK* promoter was observed in a *crp* genetic background. Initial attempts in our laboratory and elsewhere to isolate such mutants in a wild type genetic background proved unsuccessful (Ibarra et al., 2002; Honisch et al., 2004; Fong et al., 2005; Zhang and Saier, 2011, unpublished observations; Cheng et al., 2014). Since *crp* mutants are not found in nature, this brought into question the suggestion that our discovery of directed mutation in a *crp* mutant of *E. coli* was relevant to the wild type situation, and hence whether it had actually evolved under the pressures of natural selection.

We consequently undertook a systematic analysis of the cAMP dependency of IS5-mediated *glpFK* activation to understand why *crp* mutants but not *wt* cells gave rise to IS5-activated mutants. The *cya* gene, encoding the cAMP biosynthetic enzyme, adenylate cyclase (Cya), was deleted, and as expected, IS5-mediated *glpFK* activation was observed at high frequency. When sub-mM concentrations of cAMP were added to the growth medium, the frequency of these mutations was drastically reduced; 1 mM exogenous cAMP essentially prevented the appearance of these mutants. When *glpR* was deleted in the *cya* genetic background, the frequencies of IS5 insertion increased by about 20-fold (Zhang and Saier, unpublished observations).

These observations led us to experiment with *wt E. coli* cells. Glycerol utilization in these cells is strongly inhibited by the presence of the non-metabolizable glucose analogs, 2-deoxyglucose and α-methyl glucoside, which also lower cytoplasmic cAMP levels by inhibiting adenylate cyclase activity (Saier and Feucht, 1975; Castro et al., 1976; Saier et al., 1976; Feucht and Saier, 1980; Saier, 1989). Wild type cells were therefore plated on minimal salts medium containing 0.2% glycerol and 0.1% 2-deoxyglucose or 0.5% α-methyl glucoside. Not surprisingly, IS5 insertional directed mutants could be isolated under these conditions (Zhang and Saier, unpublished results). Abolition of the CRP binding sites in the *glpFK* upstream regulatory region greatly enhanced the frequency of these mutants in an otherwise *wt* genetic background, even in the absence of a non-metabolizable glucose analog, showing that binding of the cAMP-CRP complex to the *glpFK* control region negatively regulates IS5 insertion. These experiments showed that directed mutation is negatively regulated by binding of both the glycerol repressor, GlpR, and the cAMP-CRP complex to the *glpFK* control region. This explains why IS5 insertion into the activating site of the *glpFK* regulatory region in wild type cells depends on both high glycerol and low cyclic AMP concentrations.
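The dual negative control described here can be written as a small truth-table sketch. The Python below is a qualitative simplification under stated assumptions: the function and parameter names are hypothetical, each input is reduced to a yes/no value, and "favored" stands for an elevated insertion frequency rather than an all-or-nothing outcome, since a lower background rate is observed experimentally.

```python
def glpr_bound(*, glycerol_present: bool, glpr_deleted: bool = False) -> bool:
    """GlpR occupies its operators unless glycerol is present or glpR is deleted."""
    return not (glycerol_present or glpr_deleted)


def crp_bound(*, camp_high: bool, crp_functional: bool = True,
              crp_sites_intact: bool = True) -> bool:
    """cAMP-CRP binds only with high cAMP, a functional CRP and intact binding sites."""
    return camp_high and crp_functional and crp_sites_intact


def is5_insertion_favored(*, glycerol_present: bool, camp_high: bool,
                          glpr_deleted: bool = False, crp_functional: bool = True,
                          crp_sites_intact: bool = True) -> bool:
    """IS5 insertion at the activating site is favored when neither protein is bound."""
    repressor = glpr_bound(glycerol_present=glycerol_present, glpr_deleted=glpr_deleted)
    activator = crp_bound(camp_high=camp_high, crp_functional=crp_functional,
                          crp_sites_intact=crp_sites_intact)
    return not repressor and not activator


# Wild type cells on glycerol alone versus glycerol plus a glucose analog
# (the analog is modeled here only through its cAMP-lowering effect).
print(is5_insertion_favored(glycerol_present=True, camp_high=True))
print(is5_insertion_favored(glycerol_present=True, camp_high=False))
```

The same function covers the mutant backgrounds discussed above by switching `crp_functional`, `crp_sites_intact` or `glpr_deleted`, which makes it easy to check that each reported condition is consistent with the two-protein model.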

## **DIRECTED MUTATIONS PROMOTING EXPRESSION OF OTHER OPERONS IN** *E. COLI*

Following the reports of Zhang and Saier (2009a,b) cited above, Wang and Wood (2011) demonstrated directed mutation of the operon (*flhDC*) encoding the flagellar transcriptional master switch in *E. coli*, FlhDC. IS5 insertion into the upstream control region of the *flhDC* operon was responsible. Although the mechanism was not established, the frequency of IS5 insertion clearly increased under swarming conditions in soft agar compared to growth in liquid medium or on solid agar plates where swarming does not occur (Wang and Wood, 2011). We have confirmed and extended their results (Zhang et al., 2013). Moreover, preliminary results suggest that the *E. coli fuc* (fucose; propanediol) operon may also be subject to IS5-mediated directed mutation (Zhang and Saier, 2011; Zhang et al., 2013). It seems that transposon-mediated directed mutation will prove to be important to microbial evolution, and possibly to that in other organisms, partly accounting for the prevalence of these genetic elements in virtually all living organisms.

## **CONCLUSIONS AND PERSPECTIVE**

Directed mutation has been defined as genetic change that is specifically induced by the stress conditions that the mutation relieves, but until recently, in no case had such a mechanism been established. We have demonstrated that mutations in the *glpFK* control region, allowing growth of *E. coli crp* or *cya* mutants on glycerol plates, or *wt* cells on glycerol plus 2-deoxyglucose or α-methyl glucoside plates, are specifically induced by the presence of these compounds. The glycerol regulon repressor, GlpR, which binds to its four operators (*O1–O4*) in front of the *glpFK* operon (**Figure 1A**) and is displaced from these sites when α-glycerol phosphate binds allosterically to GlpR (Zeng et al., 1996), not only controls gene expression, but also controls mutation rate. In this case, GlpR binding negatively influences both *glpFK* operon expression and operon activation by IS5. Our results established that binding of GlpR to *O4*, which overlaps the −10 promoter region, primarily controls gene expression, while binding to *O1* primarily controls IS5 hopping into the specific CTAA site, between 127 and 122 base pairs upstream of the *glpFK* transcriptional start site, that activates the *glpFK* promoter. Binding of the cAMP-CRP complex to its two binding sites overlapping *O2* and *O3* also blocks IS5 hopping to the activating site. This dual mechanism may involve changes in DNA conformation or supercoiling. The results serve to dissociate two functions of both GlpR and CRP.

The mechanism of IS5-mediated *glpFK* promoter activation in wild type cells provides relief from starvation when glycerol is present and a cAMP-depressing toxic substance, such as 2-deoxyglucose, is simultaneously present. Such non-metabolizable sugar derivatives are synthesized by many organisms and therefore are present in nature. This example of directed mutation could therefore have been selected for during evolution. It appears to be a genuine example of directed mutation, with mutations arising at a greater rate under conditions that allow benefit to the organism. The fact that mutation rate is influenced by the presence of glycerol in a process mediated by the glycerol repressor, and by cAMP in a process mediated by CRP, provides mechanistic explanations for IS5-mediated directed mutational control. This mechanism, illustrated in **Figure 2**, allows rationalization of the presence of four GlpR binding sites and two CRP binding sites in the control region of the *glpFK* operon. Our studies also provide the rationale for the evolution of this elaborate mechanism of gene activation.

## **ACKNOWLEDGMENT**

We thank Fengyi (Andy) Tang for assistance with the preparation of this manuscript, and the NIH for financial support (GM077402).

## **REFERENCES**


mutation. *Annu. Rev. Genet.* 33, 57–88. doi: 10.1146/annurev.genet.33.1.57


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 April 2014; accepted: 11 July 2014; published online: 01 August 2014.*

*Citation: Saier MH Jr and Zhang Z (2014) Transposon-mediated directed mutation controlled by DNA binding proteins in Escherichia coli. Front. Microbiol. 5:390. doi: 10.3389/fmicb.2014.00390*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Saier and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## *Helicobacter pylori* DNA methyltransferases and the epigenetic field effect in cancerization

## *Ramakrishnan Sitaraman\**

*Department of Biotechnology, TERI University, New Delhi, India \*Correspondence: minraj@gmail.com*

#### *Edited and reviewed by:*

*Anton G. Kutikhin, Russian Academy of Medical Sciences, Russia*

**Keywords:** *Helicobacter pylori***, epigenetic field effect, gastric cancer, DNA methyltransferase, DNA methylation, host-pathogen interactions, gastric microbiota**

## **INTRODUCTION**

*Helicobacter pylori*, a Gram-negative, microaerophilic bacterium, has co-existed with human beings as a prominent member of their gastric microbiota for approximately 10<sup>5</sup> years (Moodley et al., 2012). It infects approximately half the world's population, and most infected individuals are asymptomatic, but histologically exhibit superficial gastritis (The EUROGAST Study Group, 1993). Only a minority of infected individuals develop gastric or duodenal ulcers that necessitate treatment. Prolonged inflammation caused by chronic (often lifelong) infection predisposes a small fraction of infected individuals to develop gastric adenocarcinoma or lymphoma of the mucosa-associated lymphoid tissue (MALT lymphoma) (Passaro et al., 2002). Unfortunately, the prognosis for cases of gastric cancer is very poor, with 5-year survival rates being lower than 15% (Peek and Blaser, 2002).

A mechanism for carcinogenesis ensuing from *H. pylori*-triggered inflammation was first proposed by Pelayo Correa (Correa, 1992; Correa and Piazuelo, 2012). Briefly, chronic inflammation causes superficial gastritis that progresses over time to multifocal atrophic gastritis (MAG), characterized by the destruction of gastric glands. This is followed by intestinal metaplasia, wherein gastric epithelium undergoes an "epithelial-mesenchymal transition" and begins to exhibit an intestinal phenotype. The subsequent stage consists of dysplasia culminating in invasive carcinoma, which completes the "pre-cancerous cascade." The final outcome is also dependent on host and pathogen genotypes, as well as environmental factors such as socioeconomic indicators, a high-salt diet, low fruit/vegetable intake and smoking (Khalifa et al., 2010). Most notably, *H. pylori* is the sole bacterium to be classified by the WHO as a class I carcinogen (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 1994).

## **A MECHANISM FOR EPIGENETIC FIELD CANCERIZATION BY** *H. pylori* **DNA MTases**

Field cancerization is a concept first proposed in 1953 in the context of oral stratified squamous epithelium (Slaughter et al., 1953), and subsequently extended to other tissues. Briefly, upon exposure to a carcinogen at sufficient intensity for a significant duration, grossly normal-looking tissue near tumor sites suffers microscopic (essentially, molecular) changes that eventually result in carcinogenesis. Aberrant methylation of cytosine residues within CpG islands (CGI) in genomic DNA has been reported in a variety of cancers, including gastric cancer (Laird and Jaenisch, 1994; Laird, 2005). This is an epigenetic change that could contribute to cancer development by the process of field cancerization (Ramachandran and Singal, 2012). Chronic *H. pylori* infection in humans is associated with hypermethylation of promoter sequences of different categories of genes, resulting in downregulation of transcription. Some of these are: *CDH1* that codes for E-cadherin, a transmembrane glycoprotein involved in maintaining epithelial integrity (Chan et al., 2003); *GATA4* and *GATA5* encoding transcription factors (Wen et al., 2010); and *TFF2* (encoding trefoil factor 2) (Peterson et al., 2010) and *FOXD3* (encoding a forkhead box transcriptional regulator) that are tumor suppressors (Cheng et al., 2013). However, one unexplored possibility is that one or more of the several functional DNA methyltransferases (MTases) of *H. pylori* could enter host cells and methylate their recognition sequences in chromosomal DNA in an unregulated manner. The result would be the creation of an epigenetic field of cancerization.
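Because the argument above turns on methylation within CpG islands, it may help to see how a candidate CGI is commonly recognized from sequence alone. The Python sketch below computes GC content and the observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G), for a DNA string. The function names are illustrative, and the default thresholds (at least 200 bp, GC content above 50%, ratio above 0.6) are a widely used heuristic assumed here, not values taken from this article.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed heuristic thresholds to a single window."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp


# Toy sequences for illustration only (not real promoter sequences).
print(cpg_stats("CGCG"))
print(looks_like_cpg_island("CG" * 100))
print(looks_like_cpg_island("AT" * 100))
```

A real analysis would slide such a window along a promoter region; the sketch only scores one window at a time.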

## **THE NUMEROUS DNA MTases OF** *H. pylori*

DNA MTases are sequence-specific DNA-binding enzymes that methylate adenine or cytosine residues in the context of their cognate recognition sequences using S-adenosylmethionine as the methyl donor, and are widespread among prokaryotes (Roberts et al., 2010). Depending on the enzyme in question, a methyl group may be added at the *N*<sup>6</sup> position in adenine forming *N*6-methyladenine (m6A) or the *N*<sup>4</sup> or *C*<sup>5</sup> positions in cytosine forming *N*4-methylcytosine (m4C) or *C*5-methylcytosine (m5C) respectively. DNA methylation of regulatory sequences is known to result in changes in gene expression in a wide variety of organisms, both prokaryotes and eukaryotes. Therefore, should DNA MTases encoded by pathogens gain entry into host cells by specific or non-specific means, there is a strong possibility that they could modify host regulatory DNA sequences (nuclear, and perhaps even organellar), making the process at least partially inflammation-independent.
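To make concrete what "cognate recognition sequence" means for a host sequence, the following Python sketch lists every position at which a given motif, or its reverse complement, occurs in a DNA string. It is a minimal illustration: the helper names are hypothetical, only unambiguous A/C/G/T motifs are handled, and GATC (the site recognized by the *E. coli* Dam methylase) is used purely as a familiar example rather than as an *H. pylori* specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, motif):
    """0-based forward-strand positions where the motif occurs on either strand."""
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    width = len(motif)
    return [i for i in range(len(seq) - width + 1)
            if seq[i:i + width] in targets]


# Toy sequence for illustration only.
toy = "AAGATCTTGATC"
print(find_sites(toy, "GATC"))
```

Counting such sites within a promoter of interest gives a first, purely sequence-based estimate of how many positions a given MTase could in principle modify.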

Several pathogens, including *H. pylori*, are known to introduce virulence factors into eukaryotic host cells by a variety of mechanisms. Recently, a type I DNA methyltransferase subunit (HsdM) of *Klebsiella pneumoniae* was found to have a nuclear localization signal (NLS). When expressed in the COS-1 (African green monkey kidney) cell line, HsdM localized to the nucleus. Surprisingly, HsdM was capable of methylating DNA even in the absence of the specificity subunit (HsdS), albeit at much lower levels (Lee et al., 2009). A recent study demonstrated that a transposase (Tnp) of *Acinetobacter baumannii* was targeted to the nuclei of A549 (a human lung carcinoma) and COS-7 (African green monkey kidney) cell lines, and that this resulted in specific CpG methylation of the *CDH1* (E-cadherin) promoter (Moon et al., 2012).

A survey of the database of restriction enzymes (REBASE; http://rebase.neb.com) indicates that *H. pylori* encodes a noticeably large number of DNA MTases, known or putative, ranging from 25 in strain SouthAfrica7 to 37 in strain Puno135. Very few prokaryotes encode such a large number of DNA MTases or restriction-modification (R-M) systems. A majority of the predicted/known DNA MTases encoded by *H. pylori*, both adenine- and cytosine-specific, are type II enzymes (http://tools.neb.com/~vincze/genomes/index.php?page=H). In this class of DNA MTases, the functions of sequence-specific DNA binding and DNA methylation are carried out by the same protein, and do not require any accessory protein factors for full activity. DNA transfer experiments between *H. pylori* strains clearly demonstrated sequence-specific DNA methylation in cell extracts (Donahue et al., 2000). Several studies have indicated that many of the DNA MTases encoded by the *H. pylori* genome are expressed and functional (Vitkute et al., 2001; Takata et al., 2002; Vale and Vítor, 2007; Kumar et al., 2012a), and can affect *H. pylori* protein expression in a strain-specific manner (Donahue et al., 2002; Takata et al., 2002; Kumar et al., 2012a; Vitoriano et al., 2013).

## **ENTRY OF** *H. pylori* **DNA MTases INTO HOST CELLS**

There are at least three mutually nonexclusive routes by which DNA MTases could gain entry into host cells. Firstly, while *H. pylori* is predominantly extracellular, studies have indicated that it may be facultatively intracellular as well (Kwok et al., 2002; Necchi et al., 2007; Liu et al., 2012). As a chronic pathogen, its intracellular persistence could conceivably result in the DNA MTases gaining access to the host cell cytoplasm. Secondly, *H. pylori* is also known to release membrane vesicles containing cellular proteins, and it is possible that DNA MTases could be transported into the host cytoplasm in these vesicles. However, a recent proteomic study of *H. pylori* vesicles failed to detect any DNA MTases in them, indicating that this is unlikely, but it is possible that growth of *H. pylori* on plates or in broth might not correspond to the situation *in vivo* (Olofsson et al., 2010). Thirdly, many *H. pylori* strains carry the *cag* pathogenicity island (*cag* PAI), which encodes components of a type IV secretion system capable of translocating a protein, CagA (Odenbreit et al., 2000), and peptidoglycan (Viala et al., 2004) into host cells. Presently, there are no conclusive data on whether or not DNA MTases or other cell components could be similarly translocated, though a recent computational prediction indicates that 1–3 DNA MTases could be translocated by the *cag* PAI (Wang et al., 2014), based on the model underlying the prediction.

Regardless of the mechanism by which bacterial proteins might enter the host cells, the fact remains that none of the known DNA MTases (or restriction endonucleases) of *H. pylori* possess any recognizable nuclear localization signals, so that the actual mechanism of nuclear translocation required for DNA methylation, if it happens, is still open to question.

## **INFERENCES FROM STUDIES IN THE MONGOLIAN GERBIL (***Meriones unguiculatus***) MODEL**

Humans are the only known natural host for *H. pylori.* However, *H. pylori*-infected Mongolian gerbils reproducibly develop gastric adenocarcinoma upon oral *N*-methyl-*N*-nitrosourea administration, and have therefore been used in animal studies for more than 15 years now (Watanabe et al., 1998). Niwa *et al.* have used this model system to examine DNA methylation in gastric cancer in detail over a duration of up to a year (Niwa et al., 2010, 2013). In their earlier study, they first demonstrated that carcinogenesis is accompanied by hypermethylation of promoters, and that cyclosporin A (CsA), an anti-inflammatory agent, does not interfere with bacterial colonization of the animals, but abrogates DNA hypermethylation significantly. Their studies demonstrated that the inflammatory response to *H. pylori* infection in Mongolian gerbils is associated with an increase in DNA methylation in gastric epithelial cells (GECs). This was taken by them to imply that methylation is not directly attributable to any bacterial effectors such as CagA or DNA MTases (Niwa et al., 2010). However, the same study also observed an unexpected decrease in the transcription levels of the host DNA MTases (Dnmts) in the GECs of infected gerbils compared to uninfected controls. Is it possible that cellular Dnmts are down-regulated in response to the chronic burden of a large number of bacterial DNA MTases, and that the significant association of *H. pylori* with cancer development is due, in some part, to its large complement of DNA MTases? An additional fact to consider is that bacterial DNA adenine MTases, depending on their specificity, could modify adenine residues in regulatory regions in the DNA. More importantly, the specificity of some adenine MTases may also be relaxed, resulting in cytosine methylation at the *N*<sup>4</sup> position (Jeltsch et al., 1999). A cytosine DNA MTase of *H. pylori* (M.HpyAVIB) was found to exhibit relaxed specificity upon mutation (Kumar et al., 2012b). Lastly, given that adenine methylation is not routinely examined in studies targeting promoter hypermethylation in humans on the basis of the very low incidence of m6A in mammalian DNA, it may well have been missed in studies concentrating on CGI methylation.

## **CONCLUSIONS**

While the link between viruses and cancer has been extensively researched, it is notable that *Helicobacter pylori* has remained the best-studied and, for nearly two decades, the sole bacterial pathogen systematically linked with any type of cancer in clinical practice. Some bacterial effector molecules associated with carcinogenesis, such as CagA and VacA, have been studied in great detail. Owing to a combination of unique characteristics (encoding a large number of functional DNA MTases, lifelong persistence in the host and facultative intracellularity), *H. pylori* may well be a unique member of the stomach microbiota that affects its host in unforeseen ways. The investigation of the effects of the entry of DNA MTases (and restriction endonucleases, including methylation-dependent restriction enzymes, that can cause DNA breaks) and other proteins of the microbiome into host cells has the potential to uncover novel interactions between evolutionarily disparate species. More generally, it is possible that these proteins are effectors of inter-specific epigenetic signals that perhaps enable commensals, symbionts and pathogens to adapt to their ecological niches by modulating host gene expression. While housekeeping DNA MTases (e.g., the Dam methylase of *E. coli*) of bacteria, pathogenic or non-pathogenic, are known to be important for bacterial viability (Marinus and Casadesus, 2009), the role of bacterial DNA MTases in infectious diseases and, importantly, in the evolution and maintenance of host-microbiome interactions remains unclear, and perhaps merits fresh consideration in terms of the epigenetic modulation of host physiology.

## **ACKNOWLEDGMENTS**

This paper is dedicated to my parents, Mr. G. Sitaraman and Mrs. Indubala for their active encouragement and support of my studies. Mr. Ratan Jha and Dr. Bharati Paliwal of the TERI University library are gratefully acknowledged for their timely and consistent support in procuring some of the references used herein.

## **REFERENCES**


of *Acinetobacter baumannii* transposase induces DNA methylation of CpG regions in the promoters of E-cadherin gene. *PLoS ONE* 7:e38974. doi: 10.1371/journal.pone.0038974


variability among *Helicobacter pylori* isolates clustered according to genomic methylation. *J. Appl. Microbiol.* 114, 1817–1832. doi: 10.1111/jam.12187


**Conflict of Interest Statement:** Work on *Helicobacter pylori* phospholipases in my laboratory is supported by a grant from the Department of Biotechnology, Government of India (sanction order number BT/PR11740/BRB/10/683/2008).

*Received: 07 March 2014; accepted: 07 March 2014; published online: 26 March 2014.*

*Citation: Sitaraman R (2014) Helicobacter pylori DNA methyltransferases and the epigenetic field effect in cancerization. Front. Microbiol. 5:115. doi: 10.3389/fmicb.2014.00115*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Sitaraman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Calcifying nanoparticles: one face of distinct entities?

*Anton G. Kutikhin1,2,3\*, Arseniy E. Yuzhalin4, Vadim V. Borisov1, Elena A. Velikanova1, Alexey V. Frolov1, Vera M. Sakharova1, Elena B. Brusina1,2 and Alexey S. Golovkin1*

*<sup>1</sup> Division of Experimental and Clinical Cardiology, Research Institute for Complex Issues of Cardiovascular Diseases under the Siberian Branch of the Russian Academy of Medical Sciences, Kemerovo, Russia*

*<sup>2</sup> Department of Epidemiology, Kemerovo State Medical Academy, Kemerovo, Russia*

*<sup>3</sup> Central Research Laboratory, Kemerovo State Medical Academy, Kemerovo, Russia*

*<sup>4</sup> Department of Oncology, Cancer Research UK and Medical Research Council Oxford Institute for Radiation Oncology, University of Oxford, Oxford, UK*

*\*Correspondence: antonkutikhin@gmail.com*

#### *Edited by:*

*Eric Altermann, AgResearch Ltd, New Zealand*

#### *Reviewed by:*

*Parish Paymon Sedghizadeh, University of Southern California, USA*

**Keywords: calcifying nanoparticles, nanobacteria, nanobacteria-like particles, calcification, inflammation, atherosclerosis, diseases, hydroxyapatite**

Calcifying nanoparticles (CNPs, nanobacteria, nanobacteria-like particles) were discovered as cell culture contaminants by Kajander et al. more than 25 years ago, and the first results of their work were published some years later (Kajander et al., 1997; Kajander and Ciftçioglu, 1998). The nature of CNPs has remained obscure so far. Possibly representing a new proposed class of living organisms or inorganic nanostructures, CNPs exist as coccoid, coccobacillar, or bacillary particles of 80–500 nm in diameter, consisting of a central cavity surrounded by a hydroxyapatite shell, and possessing an ability to grow and divide in culture medium, forming biofilms. The phenomenon of CNPs has raised intensive discussion in the scientific community; there are active debates on their nature and potential role in clinical medicine. According to a number of fundamental and clinical studies, CNPs are suspected to cause ectopic calcification-related diseases such as atherosclerosis, heart valve calcification, placental calcification, nephrolithiasis, cholecystolithiasis, type III chronic prostatitis/chronic pelvic pain syndrome, and testicular microlithiasis. Electron microscopy is considered to be the gold standard for the visualization of CNPs; researchers usually observe colonies of CNPs after culturing in DMEM or RPMI-1640 for 4–8 weeks. Other means used by multiple groups for the detection of CNPs are various serological methods, including ELISA, immunohistochemistry, immunofluorescence reaction, immunoblotting, and Ouchterlony immunodiffusion. Although PCR has previously been used to identify a possible genome of CNPs, there are doubts about the credibility of this method, since primers could have been designed based not on putative genome sequences of CNPs but on the genome sequences of contaminating bacteria. The general features of CNPs are indicated in **Table 1**.

Since the beginning of 2012, when our group published a comprehensive review on the role of CNPs in biology and medicine (Kutikhin et al., 2012), a number of newer findings have been reported. Currently, it is possible to divide all researchers studying CNPs into three groups: (1) those that presume that CNPs are living organisms (nanobacteria), (2) those that consider them culprits of pathological calcification, and (3) those that completely reject their pathological role and regard them as the result of physiological calcification.

A group from China investigated the role of CNPs in placental calcification. The authors successfully isolated and cultured CNPs from the majority of calcified placental samples but not from normal placental tissues, concluding that CNPs are associated with this pathology (Guo et al., 2012; Lu et al., 2012a,b). Further, the authors also detected nucleic acid-like materials within individual CNPs, consistent with previous results by Agababov et al. (2007). Despite the fact that sequencing of putative nanobacterial 16S rRNA was successful, the obtained sequences were 99% similar to *Agrobacterium tumefaciens* and *Rheinheimera* spp., which suggests sample contamination (Guo et al., 2012; Lu et al., 2012a,b). In addition, the researchers speculated that the genome of CNPs isolated from calcified placental samples may differ from the putative genome of CNPs isolated from human blood and other sources (Guo et al., 2012); nevertheless, it is debatable whether this assumption can be proved by *in silico* analysis while it remains unresolved whether CNPs contain nucleic acids at all.

The same research group recently carried out a first investigation of the impact of nanohydroxyapatites (nHAPs) and CNPs on growth and viability of cancer cells (Zhang et al., 2014). Both nHAPs and CNPs had considerable cytotoxic effects on the MDA-MB-231 breast cancer cell line, including altered size and morphology, formation of large cytoplasmic vacuoles, inhibition of proliferation, and induction of apoptosis (Zhang et al., 2014). Importantly, CNPs caused more pronounced cytotoxic effects as compared to nHAPs, as they induced both early and late apoptosis and necrosis, whereas nHAPs triggered early apoptosis only (Zhang et al., 2014). After the internalization of CNPs by cancer cells, cell shrinkage, chromatin condensation, nuclear fragmentation, nuclear dissolution, and the formation of apoptotic bodies were detected, whereas nHAPs did not exhibit such an effect (Zhang et al., 2014). No cytotoxic effects were observed in the untreated control group (Zhang et al., 2014).

Another investigation on the role of CNPs in calcification-related diseases was performed by a group from Spain concerning aortic valve calcification (Barba et al., 2012). The authors successfully cultured and isolated CNPs from calcified aortic valves but not from uncalcified control valves, suggesting a possible pathological role of CNPs; however, no metabolic activity was observed in the samples, and the authors failed to detect the presupposed nucleic acids of CNPs by real-time PCR (Barba et al., 2012).

#### **Table 1 | Properties of calcifying nanoparticles (CNPs).**

*Abbreviations: CNPs, calcifying nanoparticles; EDTA, ethylenediaminetetraacetic acid; EGTA, ethyleneglycoltetraacetic acid; DMEM, Dulbecco's modified Eagle's medium; RPMI-1640, Roswell Park Memorial Institute 1640 medium; ELISA, enzyme-linked immunosorbent assay; PCR, polymerase chain reaction. Cited from Kutikhin et al. (2012).*

Other researchers did not attempt to extract CNPs from clinical specimens; instead, they investigated the inorganic properties of CNPs and the conditions of their formation. Wu et al. (2013a) demonstrated that various charged elements and ions may form mineralo-organic nanoparticles (so-called bions) with bacteria-like morphology and similar properties in biological fluids. Upon formation, bions precipitated with phosphate, accumulated carbonate apatite, incorporated additional elements and thus reflected the ionic milieu of the biological fluid in which they formed (Wu et al., 2013a). Bions were able to increase in size and number and to be sub-cultured in fresh culture medium (Wu et al., 2013a). Thus, many morphological and cultural features of bions resemble those typical of CNPs. The authors suggested that bions may represent a part of a physiological cycle that regulates the function, transport, and disposal of mineral ions in the body, and that the accumulation of bions may cause pathological processes in the human body under conditions of altered calcium homeostasis and disturbed clearance mechanisms (Wu et al., 2013a). The proposed hypothesis was therefore that bions may form under both physiological and pathological conditions (Wu et al., 2013a). In addition, the same research group suggested that membrane vesicles secreted by various cells may contain a number of serum proteins and can induce hydroxyapatite precipitation during incubation in cell culture medium, forming nanostructures which morphologically resemble CNPs (Wu et al., 2013b). Treatment of these membrane vesicles with antiphosphatidylserine antibodies resulted in a decrease in their mineral seeding activity, suggesting that phosphatidylserine may provide nucleating sites for calcium phosphate deposition on the vesicles (Wu et al., 2013b).

Using dynamic light scattering, Peng et al. (2013) showed that serum and ion concentrations within the physiological range may form nanoparticles below 100 nm in diameter which can be phagocytosed by macrophages in a size-independent manner. However, only large nanoparticles or their aggregates were able to induce the production of mitochondrial reactive oxygen species, caspase-1 activation, and secretion of interleukin-1β and therefore cause inflammation (Peng et al., 2013). In addition, the authors found that the set of particle-bound proteins does not depend on particle size and curvature (Peng et al., 2013). According to the work of Baum et al. (2012), aggregates of hemoglobin and various salts from the physiological environment can also produce morphological structures resembling CNPs. Finally, Kumon et al. investigated one of the original CNP isolates from urinary stones (P-17) (Kumon et al., 2014). The authors developed anti-P-17 IgM monoclonal antibodies, CL-15, which were specific for oxidized lipids, and combined immunoelectron microscopy with ultrastructural and elemental analysis (Kumon et al., 2014). They suggested that lamellar structures consisting of acidic/oxidized lipids provided structural scaffolds for carbonate apatite and that lipid peroxidation induced by γ-irradiation of fetal bovine serum (FBS) was a major cause of CNP propagation (Kumon et al., 2014). Moreover, it was proposed that oxidized lipids may be a common platform for ectopic calcification in atherosclerosis-prone (ApoE<sup>−/−</sup>) mice; thus CNPs were suggested to be by-products rather than etiological agents of chronic inflammation (Kumon et al., 2014). However, it was also noted that propagation of CNPs largely depended on the amount of oxidized lipids available and therefore could play a role in disease progression (Kumon et al., 2014).

However, it should be clearly stated that distinct processes may lead to the formation of nanostructures which are morphologically similar. Bions, hemoglobin-salt aggregates and oxidized lipids with acidified functional groups may lead to the occurrence of the nanomorphological phenomenon called CNPs; however, there is no reason to discount the many studies in which CNPs were significantly associated with ectopic calcification-related diseases. In patients with these diseases, CNPs were detected significantly more frequently by electron microscopy and cultural features in comparison with control samples. Therefore, although CNPs can emerge through physiological processes such as hemoglobin-salt aggregation, they can also form in greater numbers under pathological conditions such as alteration of metabolic and mineral ion homeostasis or lipid peroxidation, regardless of their nature, in this case being the culprits of ectopic calcification-related diseases. In addition, as demonstrated by Peng et al. (2013), large CNPs or their aggregates may cause chronic inflammation, which plays a major role in the development of all these diseases. CNPs are unlikely to be mere by-products of inflammation since: (a) injection of CNPs caused artery calcification (Schwartz et al., 2008), nephrolithiasis (Hu et al., 2010), cholecystolithiasis (Wang et al., 2006), and type III chronic prostatitis (Shen et al., 2010) in animal models; (b) treatment of CNPs by comET-therapy (tetracycline HCl, EDTA, and a mixture of nutrients) led to a significant decline in the CNP detection rate, decreased calcification, and substantial therapeutic improvement in patients with coronary artery disease (CAD) and type III chronic prostatitis/chronic pelvic pain syndrome (Maniscalco and Taylor, 2004; Shoskes et al., 2005; Zhou et al., 2008). The phenomenon of CNPs should therefore be taken into account when considering the etiology and pathogenesis of ectopic calcification-related diseases.

Regarding the hypothesis that CNPs are the smallest self-replicating life form on Earth (nanobacteria), the absence of a reasonably accurately sequenced genome precludes any discussion of their putative living nature. Nevertheless, the diverse nature of CNPs (bions, hemoglobin-salt aggregates, products of lipid peroxidation, calcified membrane vesicles, other inorganic entities) still leaves room for speculation that certain nanoscale organisms may also be among the entities that constitute CNPs. However, this suggestion should be interpreted with caution.

From our point of view, the immunological and cytotoxic properties of CNPs are still underinvestigated in the light of their potential pathogenic role. The investigations of Zhang et al. (2014) and Peng et al. (2013) are good examples of such work; however, new studies are clearly needed, particularly given the increasing role of nanomedicine in clinical practice (nanovesicles for drug delivery, nanobiosensors for disease diagnosis, therapeutic nanoparticles that may possibly act as nucleation agents, etc.). B cells may possibly produce antibodies to CNPs, at least to large CNPs and their aggregates, and this issue needs to be clarified because of its potential importance for understanding the immune response against CNPs and for their serological detection. Moreover, anti-CNP antibodies may differ in structure, since CNPs may represent different entities, vary in size, and be coated with different proteins. In addition, the significance of CNPs as etiological agents of ectopic calcification-related diseases may also be tested in animal models and clinical trials using anti-CNP treatment. However, current standards of CAD and peripheral artery disease (PAD) therapy do not include drugs specifically directed against calcification. Unfortunately, comET-therapy, which demonstrated certain clinical efficacy in the treatment of CAD and type III chronic prostatitis/chronic pelvic pain syndrome, should not be used for prolonged treatment because of the increasing hazard of antibiotic resistance. Larger clinical trials of comET-therapy analogues that contain no antibiotics but include other calcium-chelating agents may be worthwhile.

To conclude, these two years have revealed some new facts about CNPs:


What is clear is that, at present, there is no universal theory that can entirely characterize CNPs, their properties, and their biological/medical significance. We should not deny that CNPs may form under physiological conditions; however, there is irrefutable evidence that the appearance of CNPs in the living organism may cause ectopic calcification-related diseases, and CNPs definitely should not be considered merely a physiological phenomenon. In addition, the fact that CNPs are more likely to be inorganic structures than life forms should not lead to underestimation of their potential pathogenic role. No doubt, further investigations will shed light on the nature of CNPs, their biological properties, and their role in clinical medicine.

## **REFERENCES**


Zhou, Z., Hong, L., Shen, X., Rao, X., Jin, X., Lu, G., et al. (2008). Detection of nanobacteria infection in type III prostatitis. *Urology* 71, 1091–1095. doi: 10.1016/j.urology.2008.02.041

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 April 2014; accepted: 23 April 2014; published online: 20 May 2014.*

*Citation: Kutikhin AG, Yuzhalin AE, Borisov VV, Velikanova EA, Frolov AV, Sakharova VM, Brusina EB and Golovkin AS (2014) Calcifying nanoparticles: one face of distinct entities? Front. Microbiol. 5:214. doi: 10.3389/fmicb.2014.00214*

*This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.*

*Copyright © 2014 Kutikhin, Yuzhalin, Borisov, Velikanova, Frolov, Sakharova, Brusina and Golovkin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
