# PLANT SINGLE CELL TYPE SYSTEMS BIOLOGY

EDITED BY: Marc Libault and Sixue Chen

PUBLISHED IN: Frontiers in Plant Science

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-948-8 DOI 10.3389/978-2-88919-948-8


# **PLANT SINGLE CELL TYPE SYSTEMS BIOLOGY**

Topic Editors: **Marc Libault,** University of Oklahoma, USA **Sixue Chen,** University of Florida, USA

**Citation:** Libault, M., Chen, S., eds. (2016). Plant Single Cell Type Systems Biology. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-948-8

# Table of Contents


Da-Song Chen, Cheng-Wu Liu, Sonali Roy, Donna Cousins, Nicola Stacey and Jeremy D. Murray

## **Chapter 3. Biochemical analysis of plant single cell types**


Qi Zhao, Jing Gao, Jinwei Suo, Sixue Chen, Tai Wang and Shaojun Dai

*Proteasome targeting of proteins in Arabidopsis leaf mesophyll, epidermal and vascular tissues*

Julia Svozil, Wilhelm Gruissem and Katja Baerenfaller

*The guard cell metabolome: functions in stomatal movement and global food security*

Biswapriya B. Misra, Biswa R. Acharya, David Granot, Sarah M. Assmann and Sixue Chen

*Single cell-type comparative metabolomics of epidermal bladder cells from the halophyte* **Mesembryanthemum crystallinum**

Bronwyn J. Barkla and Rosario Vera-Estrella

# Editorial: Plant Single Cell Type Systems Biology

Marc Libault <sup>1</sup> \* and Sixue Chen<sup>2</sup>

*<sup>1</sup> Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA, <sup>2</sup> Department of Biology, Interdisciplinary Center for Biotechnology Research, Genetics Institute, Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL, USA*

Keywords: systems biology, molecular phenotype, omic analyses, single cell types, root hair, trichome

**Editorial on the Research Topic**

#### **Plant Single Cell Type Systems Biology**

The molecular responses of a plant to stress, and the molecular profiles of plant organs during their development and differentiation, reflect the combined contributions of the different cell types composing the plant and its organs. Hence, a major limitation to understanding cellular and molecular responses in individual cell types is the multicellular complexity of the plants or organs used to decipher them. For instance, as mentioned by Coker et al., changes in gene expression in plant cells infected by pathogenic microbes are diluted by the relative abundance of uninfected cells. This constraint has led plant biologists to select model single plant cell types such as pollen, trichomes, cotton fiber, guard cells of stomata, and various root cell types including root hair cells, and to develop new technologies (e.g., microscopic, biochemical, and omics) to decipher their biology (Dai and Chen, 2012; Misra et al.). However, as mentioned by Schmid et al. (2015), profiling single cell types depends on the quantity and purity of the isolated samples as well as on the use of sensitive and accurate profiling methods. The 12 articles published in this Research Topic highlight the methodologies and biological systems used by plant scientists to advance our knowledge of plant biology using single cell type models.

Working at the level of single cell types is motivated by the need to analyze specific biological information in the relevant cell types, which would otherwise be missed when using tissues or organs (Dai and Chen, 2012; Misra et al.). This is especially true when working on plant-microbe interactions, where only a subset of cells is infected by pathogenic or mutualistic microbes. For instance, to precisely characterize the transcriptional response of *Arabidopsis thaliana* during infection by the oomycete *Hyaloperonospora arabidopsidis* (Hpa), Coker et al. applied fluorescence-activated cell sorting (FACS) to separate haustoriated and non-haustoriated *Arabidopsis* cells for transcriptomic analysis, allowing the discovery of 139 new Hpa-responsive genes and the characterization of the local and systemic responses of the plant cells. Similarly, working on the infection of soybean root hair cells by rhizobium, the nitrogen-fixing symbiotic soil bacterium, Hossain et al. described the integration of transcriptomic, proteomic, phosphoproteomic, and metabolomic datasets to generate a comprehensive network of the early stage of the nodulation process. Another strategy, applied by Chen et al. to gain a better understanding of the nodulation process, is to compare the transcriptomes of different rhizobium-infected plant cell types. Specifically, they looked for the *Medicago truncatula* genes controlling infection thread formation and elongation by analyzing transcriptomic data obtained from inoculated root hair cells and the infection zone of the *M. truncatula* nodule. Studying plant reproduction, another complex biological process, can also benefit from single cell type analyses. Schmid et al. detailed novel methods to analyze the molecular profiles of single cell types such as those of the female gametophyte, which is composed of antipodal, central, egg, and synergid cells. Similarly, working on the male gametophyte, Lu et al. developed a pollen culture system for isolating generative cells, sperm cells and vegetative nuclei from tomato pollen grains. Plant reproduction studies can also benefit from unique single cell type models, such as *Equisetum arvense*, an herbaceous plant that reproduces by spores. Zhao et al. analyzed the cellular and proteomic profiles of *E. arvense* during spore germination, revealing high levels of heterotrophic and autotrophic metabolic activity.

#### Edited by:

*Joshua L. Heazlewood, The University of Melbourne, Australia*

Reviewed by: *Berit Ebert, The University of Melbourne, Australia*

> \*Correspondence: *Marc Libault libaultm@ou.edu*

#### Specialty section:

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> Received: *20 November 2015* Accepted: *11 January 2016* Published: *09 February 2016*

#### Citation:

*Libault M and Chen S (2016) Editorial: Plant Single Cell Type Systems Biology. Front. Plant Sci. 7:35. doi: 10.3389/fpls.2016.00035*

The generation of unambiguous datasets from single cell types is an asset for generating systems biology models, as demonstrated by Kwak et al. (2008), Sun et al. (2014) and Hossain et al. Single cell types are also considered attractive systems to precisely depict molecular phenotypes. As noted by Schiefelbein, access to single cell types now opens a new avenue for phenotyping mutants: the establishment of molecular phenotypes (i.e., distinct molecular profiles between wild-type and mutants and their changes in response to environmental stresses). Such an approach is often limited by the lack of efficient methods to generate high quality single plant cell type samples and by the limited amount of material available for analyzing the molecular phenotype. Thus, technological development must continue in order to address questions at the single cell type level. Nucleic acid sequencing technologies, combined with efficient bioinformatics tools, now enable accurate and sensitive quantification of single cell type transcriptomes and epigenomes. As an example, the analysis of previously published *Arabidopsis* root hair transcriptome data sets allowed the identification of 5409 genes differentially expressed in root hairs versus non-root hair epidermal cells and the generation of a co-expression network (Li and Lan). Similarly, biochemical methods are developing quickly, allowing access to the proteome (Svozil et al.) and metabolome (Barkla and Vera-Estrella; Bartels and Svatos; Misra et al.) of single plant cell types. Specifically, Barkla and Vera-Estrella described the differential metabolome of the specialized trichome cells of *Mesembryanthemum crystallinum*, named epidermal bladder cells (EBC). This analysis can be expected to provide a systems level of understanding of EBC when integrated with the existing proteomic and transcriptomic data sets. Similarly, Misra et al. reviewed the most recent advances in our understanding of the guard cell metabolome. This knowledge is essential to advance our understanding of stomatal opening and closing, which have a major impact on plant transpiration, CO<sub>2</sub> uptake and pathogen immunity. At the proteome level, Svozil et al. applied Meselect, an innovative methodology to isolate leaf epidermal, vascular and mesophyll cells. Using these samples, the authors established a proteome map of each cell type and revealed cell type specific processes. These types of studies are going to be revolutionized by the development of new imaging techniques. For instance, less intrusive and spatially resolved analyses of the metabolomes of single plant cell types, using infrared-laser ablation electrospray ionization (LAESI) and UV-laser desorption/ionization (LDI) methods, are described in this e-book (Bartels and Svatos). These technological developments have greatly enhanced our capabilities in analyzing molecular components in different cells at an unprecedented scope and depth through omics for modeling and hypothesis generation. The integration of hypothesis generation and hypothesis testing in systems biology research will ultimately lead to a holistic view of cellular processes and molecular networks in plants and will create stepping stones toward molecular breeding and biotechnology for enhanced crop stress tolerance, yield and bioenergy.

# AUTHOR CONTRIBUTIONS

ML drafted the manuscript. SC edited the manuscript.

# FUNDING

Research on plant single cell-type regulatory networks in the Libault laboratory has been supported by NSF grants IOS-1453613, IOS-1339194, by DOE grant DE-SC0012629, and by the Oklahoma Center for the Advancement of Science and Technology PS14-025 to ML. Research on single cell-type proteomics and metabolomics in the Chen laboratory has been supported by NSF grants MCB-0818051, MCB-1158000, and MCB-1412547 to SC.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer BE and handling Editor declared their shared affiliation, and the handling editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Libault and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Molecular phenotyping of plant single cell-types enhances forward genetic analyses**

#### *John Schiefelbein\**

*Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI, USA*

Recent advances in the isolation of single cell-types in plants provide an opportunity to conduct detailed analyses of their molecular characteristics at high resolution. This kind of cell-type specific molecular phenotyping is likely to enhance forward genetics studies to dissect the effect of mutations and thereby aid gene function assignment. Recent experimental results support this view, demonstrating that different cell-types exhibit substantial variation in transcript, protein, and metabolite accumulation and that these molecular phenotypes are often sensitive to genetic and environmental alterations. The use of a single cell-type molecular phenotyping approach to define plant gene function is most amenable to cell-types with well-characterized molecular tools and isolation protocols.

#### *Edited by:*

*Marc Libault, University of Oklahoma, USA*

#### *Reviewed by:*

*Stefan Kempa, Helmholtz Association, Germany*

*Jiangxin Wang, Arizona State University, USA*

#### *\*Correspondence:*

*John Schiefelbein, Department of Molecular, Cellular, and Developmental Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109, USA schiefel@umich.edu*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 01 May 2015 Accepted: 25 June 2015 Published: 06 July 2015*

#### *Citation:*

*Schiefelbein J (2015) Molecular phenotyping of plant single cell-types enhances forward genetic analyses. Front. Plant Sci. 6:509. doi: 10.3389/fpls.2015.00509*

**Keywords: genetic analysis, mutant, molecular phenotype, cell biology, transcriptomics**

# **Introduction**

The use of forward genetic analysis has historically been an effective approach for defining gene function. By analyzing the phenotypic effect of a mutated gene, it has been possible to assign hundreds of genes in numerous organisms to particular developmental or metabolic pathways. However, a limitation of this approach is its reliance on the detection and measurement of phenotypic alterations. Indeed, if one cannot observe a change in the phenotype of a genetically altered individual, then little insight is gained concerning the affected gene's role in the process of interest.

The recent development of approaches to isolate and analyze plant single cell-types is likely to help overcome this limitation of forward genetics in plant biology. Specifically, recent studies suggest that individual plant cell-types exhibit distinct molecular characteristics (molecular phenotypes) and that these are differentially affected by genetic or environmental changes. Thus, forward genetic analyses that focus on phenotyping single cell-types are expected to have the capacity to detect and measure molecular alterations and thereby provide greater insight into gene function.

This perspective article summarizes recent studies showing distinct molecular phenotypes in plant single cell-types, their sensitivity to genetic and environmental perturbation, and evidence that forward genetic analyses are aided by single cell-type studies. This article also examines the factors likely to be critical for the successful use of molecular phenotyping to study gene function at the level of plant single cell-types.

# **Molecular Phenotypes in Single Cell-Types of Plants**

The molecular analysis of a bulk cell population yields the composite molecular phenotype of many cell and tissue types, leading to an average assessment of the transcriptome, proteome, or metabolome of the population. Although such bulk data are useful for some purposes, recent studies have shown that individual cell-types within a plant organ possess widely disparate molecular characteristics (Brandt, 2005). For example, extensive microarray analyses of cell-sorted *Arabidopsis* roots show cell-type specific transcript accumulation that differs substantially from the total root RNA population (Birnbaum et al., 2003; Brady et al., 2007). Similar cell-specific expression has been observed in several different isolated cell-types from rice, yielding a "cell-type transcriptome atlas" (Jiao et al., 2009). Further, it has been shown that distinct protein and metabolite accumulation patterns exist in different root cell populations as compared with the entire root organ (Petricka et al., 2012; Moussaieff et al., 2013). Although the evidence for cell-type specific accumulation of biological molecules is greatest for roots, the analysis of other isolated cell-types of plants, including trichomes and stomata, also reveals transcriptome patterns distinct from the larger organ-level accumulation patterns (Lieckfeldt et al., 2008; Adrian et al., 2015).

Furthermore, it is now clear that the molecular phenotype of a given cell-type is altered in a unique manner following genetic or environmental perturbation. That is, individual cell-types appear to respond in different ways to a particular alteration, such as a mutation or an environmental stress. For example, the analysis of single-gene mutants shows distinct transcriptome responses at the cell-type level in roots (Brady et al., 2011; Bruex et al., 2012), cotton fibers (Wan et al., 2014), and trichomes (Jakoby et al., 2008). Further, distinct cell-type-specific alterations in transcriptional profiles are observed following a specific change in the environmental conditions, including iron deprivation, nitrogen availability, or salt stress, that likely reflect the physiological response appropriate for a particular cell-type (Dinneny et al., 2008; Gifford et al., 2008; Geng et al., 2013). These results indicate that single cell-types of plants have distinct molecular programs that are not apparent from the analysis of whole organs or whole plants.
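The masking of a cell-type-specific response in bulk tissue can be illustrated with a small calculation. The sketch below is only a toy model, not a method from the cited studies: it assumes that a bulk measurement behaves as an abundance-weighted average of the per-cell-type levels, and the cell-type fractions, transcript levels, and the `bulk_expression` helper are all hypothetical values and names chosen for illustration.

```python
def bulk_expression(fractions, expression):
    """Return the abundance-weighted mean expression across cell types."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("cell-type fractions must sum to 1")
    return sum(fractions[ct] * expression[ct] for ct in fractions)


# Hypothetical contribution of each cell type to the bulk sample.
fractions = {"root_hair": 0.05, "cortex": 0.60, "stele": 0.35}

# Hypothetical transcript level of one gene (arbitrary units).
# In the mutant, the gene changes 4-fold, but only in root hairs.
wild_type = {"root_hair": 10.0, "cortex": 10.0, "stele": 10.0}
mutant = {"root_hair": 40.0, "cortex": 10.0, "stele": 10.0}

cell_type_fold = mutant["root_hair"] / wild_type["root_hair"]
bulk_fold = bulk_expression(fractions, mutant) / bulk_expression(fractions, wild_type)

print(f"fold change in root hairs: {cell_type_fold:.2f}")
print(f"fold change in bulk organ: {bulk_fold:.2f}")
```

Under these assumed numbers, a 4-fold change confined to a cell type that contributes 5% of the sample appears as a change of only about 1.15-fold in the bulk measurement, which could easily fall below a typical detection threshold.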

An important implication of these studies is that high resolution molecular analyses at the single cell level may be useful to improve forward genetic analyses. That is, although a given mutant individual may not exhibit an observable alteration at the level of the whole plant or from a bulk cell population, it may be possible to detect a molecular phenotype if single cell-type analyses are performed. Indeed, the results of several recent studies support this view. First, the molecular analysis of different wild-type *Arabidopsis* lines (ecotypes) has demonstrated substantial differences in their transcript, protein, and metabolite profiles (Keurentjes et al., 2006; Kliebenstein et al., 2006; Fu et al., 2009; Terpstra et al., 2010), indicating a surprising degree of underlying molecular variation that does not cause observable morphological differences, perhaps due to "phenotypic buffering." Interestingly, these studies also enable the identification of new kinds of quantitative trait loci (QTLs) that likely mediate the responses of large numbers of genes for ecotype-specific traits. Further, the reverse genetic analysis of a collection of stele-enriched *Arabidopsis* transcription factor genes showed that, among the resulting single-gene mutants, 65% exhibited reproducible transcriptome changes whereas only 16% exhibited observable morphological changes in the root (Brady et al., 2011). Together, these studies suggest that a substantial degree of molecular variation exists that does not impact the plant's observable phenotype.

Finally, direct demonstration of the value of single cell-type analyses for forward genetic analyses has recently been reported. Several studies have shown that a comparative transcriptome analysis of single cell-types from mutants versus wild-type provides enhanced resolution for transcript changes. By comparing cotton fiber transcriptomes from lines differing in fiber production, specific transcription factors and metabolic pathways were identified as fiber associated (Wan et al., 2014). Further, transcript profiles from trichomes of wild-type and immature-trichome mutant lines uncovered new genes required for normal trichome formation (Marks et al., 2009). In another study, several single-gene mutants associated with *Arabidopsis* root epidermis development, but lacking an observable morphological phenotype due to redundancy, were found to alter root epidermis transcript profiles in a manner that reflects the known biological role of the genes (Simon et al., 2013). In addition, the function of an uncharacterized gene in this root epidermal network (*TTG2*) was deduced by inspection of the cell-specific transcriptome of its corresponding mutant, which lacked a morphological abnormality (Simon et al., 2013). Interestingly, the transcript changes showed that TTG2 normally promotes root hair cell differentiation, rather than non-hair cell differentiation as previously suspected. This shows that the high resolution analysis of single cell-types can reveal molecular differences associated with gene function.

# **Issues to Consider for Molecular Phenotyping of Single Cell-Types**

The studies described above suggest that it is possible to better understand gene function by conducting detailed molecular phenotyping of genetically altered lines (e.g., mutant lines). In particular, it is notable that even mutants lacking an outward (morphological) phenotype may exhibit a molecular phenotype at the cell-type level. This is exciting because the majority of single-gene knockouts in plants lack observable changes (Pickett and Meeks-Wagner, 1995; Bouche and Bouchez, 2001; Hanada et al., 2009; Perez-Perez et al., 2009; Lloyd and Meinke, 2012), and so this approach may enable their associated genes to be assigned a particular function.

For the successful application of this approach, there are several issues that should be considered. First, the biological material to be analyzed should be as specific as possible. Ideally, a single cell-type or cell state (specific stage of a given cell-type) should be analyzed. This is important as it is becoming apparent from many different studies that there is tremendous cell–cell variation within multicellular organisms. For example, the oligodendrocyte of the mammalian brain has historically been considered to be a single cell-type, but it is now known to be composed of six distinct subpopulations in the adult mouse brain (Zeisel et al., 2015). One of the limiting factors in isolating single cells is the accessibility of individual cell-types, with epidermal cells (e.g., root hairs, trichomes, cotton fibers) being among the first to be analyzed due to their extension from the plant surface (Qiao and Libault, 2013; Becker et al., 2014). The difficulty in isolating other single cell-types is being mitigated by new advances in protoplasting/fluorescence activated cell sorting (FACS), laser capture microdissection (LCM), or nuclear tagging in specific cell-types (INTACT) that enable cells of internal tissues to be purified in plants (Kerk et al., 2003; Deal and Henikoff, 2010). Another limiting factor can be the small amount of biological material collected, which may prevent accurate assessment of molecule accumulation given current transcriptome, proteome, and metabolome methodologies. Indeed, this may prevent the detection of a "phenotype" in some mutants that elicit relatively minor effects. As the sensitivity of these large-scale methods increases, the ultimate goal is to conduct these molecular analyses on individual cells of a single cell-type. For example, it is conceivable that transcriptomes will soon be obtainable from single cells through the development of single cell RNA-seq methods (Saliba et al., 2014). It is likely that the combined technical advances in single cell isolation and molecular analysis methods will be required for future success.

Second, the likely success of using this approach is improved if the investigator has some knowledge of the likely defect in the mutant line. This knowledge enables the investigator to focus their molecular analysis on one (or a limited number of) cell-type(s) in the mutant line. In the absence of some knowledge or interest in a particular cell-type(s), it would be difficult to employ this approach to deduce the role of a gene, due to the time and resources required to survey a large number of cell/tissue types and developmental stages of the mutant plant. Having said this, at some point in the future, it may be feasible to consider the ultimate goal of defining the molecular impact of every gene knockout on every cell-type in the plant. It is likely that this information would profoundly change our view of gene activity and function in plants.

Third, this approach is most likely to be successful if there is a known molecular pathway or molecular markers for the cell-type or trait of interest already available to aid in the assignment of gene function. For example, if some of the genes have been identified in a transcriptional regulatory network for the cell-type of interest, then it is a relatively straightforward exercise to determine whether/how the mutant lines alter specific genes or subdomains of the gene network using transcriptomic approaches. This feature has been exploited in the successful applications of this approach to date, using the detailed knowledge available in the trichome and root hair systems (Marks et al., 2009; Simon et al., 2013).

In summary, recent advances in the isolation and characterization of plant single cell-types show great promise in enhancing the effectiveness of mutant phenotyping. By focusing on single cell-types, rather than entire organs or entire plants, it is possible to detect specific molecular changes in mutant lines. This represents a valuable tool for the future detailed dissection of gene function in plants.

### **References**

downstream of the *Arabidopsis* receptor-like kinase ERECTA. *Plant Physiol*. 154, 1067–1078. doi: 10.1104/pp.110.159996


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Schiefelbein. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **System approaches to study root hairs as a single cell plant model: current status and future perspectives**

#### *Md Shakhawat Hossain <sup>1</sup> , Trupti Joshi <sup>2</sup> and Gary Stacey <sup>1</sup> \**

*<sup>1</sup> Division of Plant Sciences and Biochemistry, National Center for Soybean Biotechnology, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA, <sup>2</sup> Department of Computer Science, Informatics Institute and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA*

#### *Edited by:*

*Marc Libault, University of Oklahoma, USA*

#### *Reviewed by:*

*Chuang Ma, Northwest Agriculture and Forest University, China*

*Maria E. Zanetti, CONICET and Universidad Nacional de La Plata, Argentina*

#### *\*Correspondence:*

*Gary Stacey, Division of Plant Sciences and Biochemistry, National Center for Soybean Biotechnology, Christopher S. Bond Life Sciences Center, University of Missouri, 1201 Rollins Street, Columbia, MO 65211, USA staceyg@missouri.edu*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 17 March 2015 Accepted: 06 May 2015 Published: 19 May 2015*

#### *Citation:*

*Hossain MS, Joshi T and Stacey G (2015) System approaches to study root hairs as a single cell plant model: current status and future perspectives. Front. Plant Sci. 6:363. doi: 10.3389/fpls.2015.00363*

Our current understanding of plant functional genomics derives primarily from measurements of gene, protein and/or metabolite levels averaged over the whole plant or multicellular tissues. These approaches risk masking the response of specific cells that respond strongly to a treatment, because their signal is diluted by the larger proportion of non-responding cells. For example, if a gene is expressed at a low level, does this mean that it is indeed lowly expressed, or is it highly expressed but only in a few cells? In order to avoid these issues, we adopted the soybean root hair cell, derived from a single, differentiated root epidermal cell, as a single-cell model for functional genomics. Root hair cells are intrinsically interesting since they are major conduits for root water and nutrient uptake and are also the preferred site of infection by nitrogen-fixing rhizobium bacteria. Although a variety of other approaches have been used to study single plant cells or single cell types, the root hair system is perhaps unique in allowing application of the full repertoire of functional genomic and biochemical approaches. In this mini review, we summarize our published work and place this within the broader context of root biology, with a significant focus on understanding the initial events in the soybean-rhizobium interaction.

**Keywords: root hair, single cell, rhizobium, soybean, systems biology**

# **Why are Root Hairs an Excellent Single-cell Plant Model for Systems Biology?**

A root hair is a single cell (Wan et al., 2005; Brechenmacher et al., 2009, 2012; Libault et al., 2010a; Qiao and Libault, 2013), a structurally simple, tubular outgrowth of a root epidermal cell (Grierson et al., 2014). Root hairs have a huge absorptive surface area (Hofer, 1991), which evolved to allow the plant to take up water, nutrients and minerals (Minorsky, 2002). They are also a major route for plant-microbe interactions (Oldroyd, 2001), for example the legume-rhizobium interactions that lead to the formation of a new organ, the nodule, where biological N2-fixation takes place (Oldroyd and Dixon, 2014).

In order to adapt to environmental changes and respond to morphological or developmental stimuli, plant cells have evolved complex regulatory networks integrating the response at a variety of levels, such as DNA, RNA, proteins, metabolites and small molecules (Pu and Brady, 2010). While some plant genomes can encode over 50,000 genes, it is clear that many genes are specifically expressed in only a few organs, tissues or cell types. However, it can be technically very challenging to measure the levels of genes, proteins or metabolites in a specific cell type (Rogers et al., 2012). Hence, most studies measure these components in whole plants or tissues, resulting in an averaging of the responses occurring within all cells. In order to overcome these sampling issues, the soybean root hair was proposed as a single cell plant model (Libault et al., 2010a). The size and thickness of the soybean root allow root hair cells to be isolated easily after freezing in liquid nitrogen. This procedure yields pure root hair preparations in gram quantities, allowing the full repertoire of functional genomic methods to be applied (Wan et al., 2005; Brechenmacher et al., 2009, 2012; Libault et al., 2010a; Nguyen et al., 2012; Qiao and Libault, 2013).
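The averaging problem described above can be made concrete with a small arithmetic sketch. The cell counts and expression values below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
def bulk_average(per_cell_levels):
    """Mean expression across all cells, as a whole-tissue assay would report it."""
    return sum(per_cell_levels) / len(per_cell_levels)

# Hypothetical tissue of 100 cells (arbitrary expression units).
# Scenario A: every cell expresses the gene at a low level.
uniform_low = [2.0] * 100
# Scenario B: only 2 cells express the gene, but strongly; the other 98 are silent.
few_high = [100.0] * 2 + [0.0] * 98

avg_a = bulk_average(uniform_low)
avg_b = bulk_average(few_high)

# Both scenarios give the same bulk value, so a whole-tissue
# measurement cannot tell them apart.
print(avg_a, avg_b)
```

Measuring an isolated cell type, as with root hair preparations, removes the non-responding cells from the denominator, which is why the two scenarios become distinguishable.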

Studies of plant root hairs are not new. A literature search identified more than 1,300 articles covering various aspects of root hair biology, including studies of root hair elongation, tip growth, polarized cell expansion, endomembrane trafficking, cytoskeletal organization and cell wall modifications in model and crop plants, such as *Arabidopsis*, *Lotus japonicus*, rice, corn, barley, and tomato (http://www.iroothair.org/; Foreman and Dolan, 2001; Karas et al., 2005; Yokota et al., 2009; Hossain et al., 2012; Peña et al., 2012; Kwasniewski et al., 2013; Grierson et al., 2014). However, recent advances with our methods of single cell root hair isolation in soybean (**Figure 1**; Libault et al., 2010a; Qiao and Libault, 2013), along with the availability of the soybean genome sequence (Schmutz et al., 2010) and high-throughput sequencing, proteomic, metabolomic and epigenetic technologies, make the soybean root hair system particularly attractive for detailed, systems-level studies. The ultimate goal is to use this information for computational prediction and integration of big data sets for network analysis of plant cell function (**Figure 2**; Foreman and Dolan, 2001; Wan et al., 2005; Brechenmacher et al., 2009, 2012; Kwasniewski et al., 2010; Libault et al., 2010a; Peña et al., 2012; Grierson et al., 2014).

# **What have We Learned about Root Hairs Using System Approaches?**

The study of a single cell system provides significantly higher resolution and sensitivity when various functional genomic methods are applied. This has been amply demonstrated by studies in *Arabidopsis* that characterized the transcriptome, proteome and metabolome of various root cell types (Pu and Brady, 2010; Rogers et al., 2012; Moussaieff et al., 2013; Misra et al., 2014). In our laboratory, over the past few years, the full repertoire of functional genomic methods has been applied to studies of soybean root hairs as a single cell plant model (Wan et al., 2005; Brechenmacher et al., 2009; Libault et al., 2010b; Qiao and Libault, 2013). From these efforts, one could argue that the soybean root hair cell is one of the best characterized cell types in plant biology.

# **Root Hair Transcriptome**

A number of genome-wide transcriptome profiling studies have been published using root hairs from several model and crop plants (Jones et al., 2006; Kwasniewski et al., 2010; Libault et al., 2010b; Lan et al., 2013; Libault, 2013; Breakspear et al., 2014). For example, in *Arabidopsis* and barley, researchers used this cell type to study transcriptional regulation mostly focused on root hair morphogenesis, cell fate, cellular growth and differentiation. Since root hairs expand by polar growth, along with pollen tubes, they serve as a model to study this distinctive growth process (Campanoni and Blatt, 2007).

However, in legumes, root hairs are also the primary site for rhizobial infection and, therefore, several studies have sought to define the early events in this infection process. Libault et al. (2010b) studied soybean root hair infection by the symbiotic bacterium *Bradyrhizobium japonicum*. This study used two combined platforms (Affymetrix and Illumina sequencing) and identified 1,973 genes that were differentially expressed in response to bacterial infection, including those involved in the initial rhizobial symbiotic signal, lipo-chitooligosaccharide (Nod) factor perception, plant defense response, modification of cell wall composition, signal transduction, basic metabolic processes, and hormonal regulation (Libault et al., 2010b). Very similar findings came from a more recent investigation in the model legume *Medicago truncatula* (Breakspear et al., 2014). A microarray-based transcriptomics approach identified hundreds of genes regulated in root hair cells in response to *Sinorhizobium meliloti* and bacterial Nod factor application. A comparison of these two studies revealed ∼370 genes differentially regulated by rhizobial inoculation in the two legume species. Among genes responding in both species were those shown previously to be critical for the legume-rhizobial interaction, including *NIN* (Nodule Inception), *PUB1* (Plant U-box Protein 1), *VPY* (Vapyrin), *RPG* (Rhizobium-directed Polar Growth), *NSP1* (Nodulation Signaling Pathway 1), *NSP2* (Nodulation Signaling Pathway 2), *NPL1* (Nodulation Pectate Lyase 1), *FLOT4* (Flotillin-like protein 4), *ERN1* (Ethylene Response Factor Required for Nodulation 1), *ERN2* (Ethylene Response Factor Required for Nodulation 2), *NFYA1* (Nuclear transcription factor Y subunit A-1), and *NMN1* (Nucleolar/Mitochondrial protein involved in Nodulation) (Breakspear et al., 2014). These findings emphasize the utility of root hair studies to identify key genes involved in the rhizobial symbiosis.
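A cross-species comparison of this kind can be sketched as a set operation over differentially expressed genes. The text does not describe how the two studies were actually compared, so the one-to-one ortholog mapping, the function name `shared_responders`, and all gene identifiers below are assumptions made for illustration only.

```python
def shared_responders(de_species_a, de_species_b, orthologs_a_to_b):
    """Genes from species A whose mapped ortholog in species B is also differentially expressed."""
    de_b = set(de_species_b)
    return {gene for gene in de_species_a
            if orthologs_a_to_b.get(gene) in de_b}

# Hypothetical identifiers, not real soybean or Medicago gene names.
soy_de = {"GmGeneA", "GmGeneB", "GmGeneC"}
medicago_de = {"MtGeneA", "MtGeneX"}
# Assumed one-to-one ortholog table; GmGeneC has no mapped ortholog.
soy_to_medicago = {"GmGeneA": "MtGeneA", "GmGeneB": "MtGeneB"}

shared = shared_responders(soy_de, medicago_de, soy_to_medicago)
print(sorted(shared))
```

In practice, legume genomes contain many duplicated genes, so a real comparison would need to handle one-to-many ortholog relationships rather than the simple dictionary lookup used here.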

# **Root Hair Proteome**

While transcriptome studies are rather common, fewer studies have focused on the root hair proteome (Wan et al., 2005; Brechenmacher et al., 2009, 2012; Pang et al., 2010; Nestler et al., 2011; Subramanian and Smith, 2013). One can argue that, while mRNA profiling provides a picture of the potential functions in the cell, only the proteome can give a true picture of which of these functions are likely occurring. The first report on the soybean root hair proteome focused on providing a protein reference map of this single cell type (Brechenmacher et al., 2009) using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), augmented by multidimensional protein identification technology (MudPIT). This study identified 1,492 proteins involved in basic cell metabolism, water and nutrient uptake, vesicle trafficking, and hormone and secondary metabolism. A later study, using the Accurate Mass and Time (AMT) tag approach combined with liquid chromatography-tandem mass spectrometry (LC-MS/MS), identified a total of 5,702 proteins from soybean root hair cell preparations (Brechenmacher et al., 2012). Both studies reported similar functional categories of proteins. A recent study focused on root hairs isolated from the monocot maize, with proteins separated by 1-dimensional PAGE and then subjected to nano LC-MS/MS (Nestler et al., 2011). This study identified 2,573 abundant proteins in maize root hair cells. Interestingly, a comparison of the soybean (dicot) and maize (monocot) datasets identified 252 conserved proteins, pointing to conserved root hair functions in these disparate species.

In order to specifically address protein changes occurring in root hairs upon rhizobial inoculation, Wan et al. (2005) utilized 2D-PAGE to separate proteins, which were then identified by matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) MS. This study identified 37 proteins including enzymes, such as chitinase and phosphoenolpyruvate carboxylase, that appeared to be specific to root hairs (relative to roots stripped of the root hairs), as well as peroxidase, phenylalanine-ammonia lyase, lectin, phospholipase D and phosphoglucomutase, whose expression changed significantly upon rhizobial inoculation. It was previously shown that these proteins played significant roles in root hair deformation, infection and legume nodulation in response to bacterial infection (Halverson and Stacey, 1985; Díaz et al., 1989; Estabrook and Sengupta-Gopalan, 1991; Cook et al., 1995; van Rhijn et al., 1998; den Hartog et al., 2001; Lepek et al., 2002; Mitra and Long, 2004; Yan et al., 2015b).

# **Root Hair Phosphoproteome**

It is now clear that many of the initial events in Nod factor perception and the plant response to rhizobial infection involve activation of a variety of protein kinases (Antolín-Llovera et al., 2014; Liang et al., 2014). Therefore, it is of interest to identify those proteins that are rapidly phosphorylated after rhizobial inoculation. Modern MS-based methods for phosphoproteomic analysis allow for such global analyses. For example, Rose et al. (2012) analyzed the phosphoproteome of *M. truncatula* roots after inoculation with *S. meliloti*. However, although this report identified a variety of interesting proteins, the results again represent an average of the whole root response, not that of specific cells. Hence, our laboratory undertook a similar study to specifically examine the phosphoproteome of soybean root hair cells subsequent to rhizobial inoculation (Nguyen et al., 2012). Again, this is an ideal cell type since it is the site of the initial interaction and penetration of the plant root by the rhizobial symbiont. The root hair phosphoproteome was compared to that of roots stripped of their root hairs. In order to provide accurate quantification of peptide levels, each was labeled with an isobaric tag (eight plex) using the iTRAQ (isobaric tags for relative and absolute quantification) method, followed by phosphopeptide enrichment and LC-MS/MS analysis. A total of 1,625 unique phosphopeptides, spanning 1,659 nonredundant phosphorylation sites, were detected from 1,126 soybean phosphoproteins from both root hairs and stripped roots. Among the identified phosphopeptides, the levels of 273, belonging to 240 phosphoproteins, were found to be significantly regulated upon *B. japonicum* infection, suggesting a complex network of kinase-substrate and phosphatase-substrate interactions in response to rhizobial inoculation (Nguyen et al., 2012). Proteins predicted to play a role in signal transduction (e.g., protein kinases, protein phosphatases, protein phosphatase inhibitors, and G protein-related proteins) and those involved in hormone signaling were among those whose phosphorylation was specifically affected by *B. japonicum* inoculation. The identified phosphoproteins and phosphorylation site data were deposited and are available at the Plant Protein Phosphorylation Database (P<sup>3</sup>DB; http://digbio.missouri.edu/p3db/).
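Relative quantification with isobaric tags comes down to comparing reporter-ion intensities between channels. The following is a minimal sketch of that idea, assuming replicate channels are simply averaged and compared as a log2 ratio. The intensity values and the function name `log2_fold_change` are hypothetical, and the normalization and statistical testing needed to call a phosphopeptide "significantly regulated" are deliberately left out; the procedure actually used by Nguyen et al. (2012) is not described in the text.

```python
import math

def log2_fold_change(treated, control):
    """log2 ratio of mean reporter-ion intensities (treated over control)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2(mean_treated / mean_control)

# Hypothetical reporter-ion intensities for one phosphopeptide,
# two channels per condition (arbitrary units).
inoculated = [400.0, 400.0]
mock = [100.0, 100.0]

lfc = log2_fold_change(inoculated, mock)
# A value of 2.0 corresponds to a four-fold increase after inoculation.
print(lfc)
```

Working on a log2 scale makes increases and decreases symmetric: a four-fold decrease would give -2.0.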

# **Root Hair Metabolome**

As with transcriptomics and proteomics, metabolite measurements from multicellular tissues, organs or the whole plant can be misleading if a specific metabolism is confined to only a few cells or a single cell type. Until recently, a single analytical platform was not available that could measure a significant number of metabolites from a single cell (Fukushima et al., 2009). However, extant methods could be applied to soybean root hairs given our ability to isolate this cell type in a pure form and in quantity (Libault et al., 2010a). Again, the legume-rhizobium interaction was an obvious target for such analyses given the relevance of the root hair system and a variety of publications implicating specific metabolites as playing important roles. A variety of secondary products (e.g., flavonoids), as well as various hormones, were shown to be important regulators of the nodulation process (Matamoros et al., 2006; Gibson et al., 2008; Ding and Oldroyd, 2009). For example, formation of the infection thread by which rhizobia gain entry into the plant root hair cell might require a significant change in the metabolism of both the symbiont and host.

Brechenmacher et al. (2010) investigated the root hair metabolome in an effort to identify metabolites whose levels changed significantly during the first 48 h after rhizobial inoculation. Metabolites were analyzed using both gas chromatography-mass spectrometry (GC-MS) and ultraperformance liquid chromatography-quadrupole time of flight-mass spectrometry (UPLC-MS). Using these combined approaches, a total of 2,610 metabolites were identified in soybean root hair cells. Among these metabolites, 166 were found to be significantly regulated in response to rhizobial infection, including various (iso)flavonoids, amino acids, fatty acids, carboxylic acids, and various carbohydrates, indicating major metabolic changes occurring during *B. japonicum*-root hair interactions. Among these metabolites was trehalose, an α-linked disaccharide of glucose, which has been implicated in a variety of plant processes, including resistance to osmotic stress (Iturriaga et al., 2009). Trehalose levels increased significantly in root hairs upon inoculation by *B. japonicum*. However, through the use of various mutants of *B. japonicum* blocked in the synthesis of trehalose, the majority of this disaccharide could be attributed to the bacteria and did not appear to be predominantly derived from the plant host. The authors postulated that trehalose synthesis by *B. japonicum* may be important to allow the bacteria to survive what may be a stressful osmotic environment within the plant.

# **Root Hair Small RNAome**

MicroRNAs (miRNAs) are small non-protein coding endogenous RNAs, typically 21-24 nucleotides in length. In the past decade, miRNAs have been shown to be key players controlling gene expression by transcript cleavage or translational inhibition in a wide variety of plant biological processes, including growth and development, disease, stress and plant-microbe interactions (Kidner and Martienssen, 2005; Xiao and Rajewsky, 2009; Lauressergues et al., 2012; Sunkar et al., 2012; Weiberg et al., 2013). Indeed, several studies documented an important role for miRNAs in regulated gene expression during the legume nodulation process (Combier et al., 2006; Subramanian et al., 2008; Lelandais-Brière et al., 2009; Wang et al., 2009; Joshi et al., 2010; Li et al., 2010; De Luis et al., 2012; Turner et al., 2013; Yan et al., 2013, 2015a). However, all of these reports came from small RNA profiling of roots and/or nodules and not from more specific studies of the initial infection process within root hairs. In order to target these very early stages of rhizobial-host interaction, Yan et al. (2015b) utilized *B. japonicum*-infected soybean root hair single cells and roots stripped of their root hairs to generate both small RNA and degradome libraries. Sequencing of three small RNA libraries from inoculated root hairs, stripped roots and mock-inoculated control samples identified a total of 114 miRNAs. Among these, 22 were found to be novel miRNAs. Comparative analysis of miRNA abundance identified 66 miRNAs that were differentially expressed between root hairs and stripped roots. A total of 48 miRNAs were differentially regulated in root hairs in response to bacterial infection when compared to the uninfected control root hairs. Sequencing of a Parallel Analysis of RNA Ends (PARE; i.e., degradome) library from similar tissues revealed a total of 405 soybean mRNA targets. This method identifies new 5′ mRNA ends presumably arising from miRNA-mediated cleavage (German et al., 2009; Zhai et al., 2014). The mRNA targets identified were predicted to encode transcription factors or proteins involved in protein modification, protein degradation and enzymes in various hormonal pathways. The root hair data set represents an important starting point for in-depth analysis of the role that specific miRNAs may be playing in the legume-rhizobial symbiosis.

# **Bioinformatics and Data Integration for Network Analysis and Modeling**

Next generation sequencing technologies have empowered researchers to conduct experiments on a whole genome scale and have the potential to completely revolutionize biological research. Big Data have been generated in all domains including transcriptomics (RNA-seq), proteomics, metabolomics, epigenomics, microRNA/smallRNA, and genomic variations, including single nucleotide polymorphisms (SNP) and insertion/deletions (InDels; Mochida and Shinozaki, 2010; Brauer et al., 2014). Most lab experiments now generate anywhere from a few GB to several TB of raw and analyzed data, and have a need to overlay different types of data and information to get a comprehensive understanding. All of these data provide valuable insights into a systems-level understanding of the biology of an organism and need to be mined in an innovative and integrative manner. This is posing a new challenge for researchers and mandating the development of comprehensive, efficient informatic platforms and web resources to facilitate data sharing and collaboration among the research community, while providing advanced techniques for multi-omics integration, computational analysis, and hypothesis generation.
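The need to overlay different data types can be illustrated with a minimal sketch that merges per-gene measurements from several omics layers into one record per gene. This is not how SoyKB or any other resource is implemented; the function name `overlay_layers`, the gene identifiers, and the values are all hypothetical, and a shared gene identifier across layers is assumed.

```python
def overlay_layers(**layers):
    """Merge per-gene measurements from several omics layers into one record per gene."""
    merged = {}
    for layer_name, measurements in layers.items():
        for gene_id, value in measurements.items():
            merged.setdefault(gene_id, {})[layer_name] = value
    return merged

# Hypothetical log2 fold changes keyed by a shared gene identifier.
transcript = {"gene1": 1.8, "gene2": -0.4}
protein = {"gene1": 1.1}
phospho = {"gene2": 2.3}

table = overlay_layers(transcript=transcript, protein=protein, phospho=phospho)

# Genes with evidence from at least two layers are candidates for closer inspection.
supported = [gene for gene, record in table.items() if len(record) >= 2]
print(table)
print(supported)
```

Keeping each layer under its own key preserves the origin of every value, so disagreement between layers (for example, a transcript that changes while its protein does not) stays visible rather than being averaged away.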

Most databases and available tools can handle analysis of a single -omics data-type, but analysis of multiple omics data-types presents a major challenge. No tools are currently available that provide a systematic solution to this problem. In most cases, only by integrated analysis of the expression of mRNA, proteins, metabolites along with miRNA/sRNA, etc. (**Figure 2**) can we draw conclusions about all the involved regulatory mechanisms. With multi-omics data integration techniques, we can gain better insight into the network modules derived and, supported by experimental evidence, utilize this information to build *in silico* computational models for understanding the underlying mechanisms and to generate new hypotheses. The *in silico* models provide templates that automate the identification and generation of network modules by incorporating differentially expressed genes, proteins, and metabolites into the function and pathway enrichment analysis. The models can be visualized using multiple layers of platforms such that data from transcriptomics, proteomics, and metabolomics can each be visualized simultaneously on separate platform layers as shown in **Figure 2**. This presentation enables rapid comparisons that can be further enriched by incorporating available experimental data, literature references, and user inputs to cross-validate results. Users can start with their own hypothesis and add the multi-omics evidence stored in the associated database to continue building the modules. Such *in silico* modeling systems can provide researchers with additional avenues for generating and testing a specific hypothesis by incorporating diverse varieties of data. Bioinformatics plays a key and essential role in bringing these pieces together in the form of one-stop-shop web resources, such as the soybean knowledge base (SoyKB; Joshi et al., 2012, 2014), and facilitating the ability of researchers to get a more complete understanding of the underlying molecular mechanisms.

# **Conclusion**

Our planet is facing increasing challenges related to environment, food security, non-renewable energy sources, water availability and overall sustainability. Plant biological research and its application in agriculture have the potential to mitigate many of these challenges. Thus, we need to have a better understanding of plant biology at a systems level, ultimately reaching the point where computational models can predict biological outcomes. Irrespective of the popularity of systems biology, we are still a long way from achieving this level of understanding. Among the obstacles is the functional complexity of cellular/organismal regulatory networks, as well as our ability to deal with Big Data, especially integration of disparate data types. There is a need to simplify our systems to allow for clearer conclusions, while increasing the resolution and sensitivity of our analysis, without losing contact with real world relevance. The root hair single cell system provides one such route for the study of plant processes. The use of the diverse "omics" data sets developed from the soybean root hair as a single cell type provides an opportunity for comprehensive network analysis and ultimately will help to build an integrated predictive model. Besides the inherent interest of the root hair cell, the hope is that as an illustrative model system, the soybean root hair cell system can contribute to an overall greater understanding of plant biology leading ultimately to improvements in agriculture.

# **Acknowledgments**

Research was funded by a grant from the USA National Science Foundation Plant Genome program (grant no. DBI-0421620 to GS). We are thankful to Joseph M. Batek, University of Missouri for aiding in the literature review. We also thank Dr. Katalin Toth for reading and revising the manuscript.

# **References**

Breakspear, A., Liu, C., Roy, S., Stacey, N., Rogers, C., Trick, M., et al. (2014). The root hair "Infectome" of *Medicago truncatula* uncovers changes in cell cycle genes and reveals a requirement for auxin signaling in rhizobial infection. *Plant Cell* 26, 4680–4701. doi: 10.1105/tpc.114.133496

Antolín-Llovera, M., Petutsching, E. K., Ried, M. K., Lipka, V., Nürnberger, T., Robatzek, S., et al. (2014). Knowing your friends and foes—plant receptor-like kinases as initiators of symbiosis or defence. *New Phytol.* 204, 791–802. doi: 10.1111/nph.13117

Brauer, E. K., Singh, D. K., and Popescu, S. C. (2014). Next-generation plant science: putting big data to work. *Genome Biol.* 15, 301. doi: 10.1186/gb4149

Brechenmacher, L., Lei, Z., Libault, M., Findley, S., Sugawara, M., Sadowsky, M. J., et al. (2010). Soybean metabolites regulated in root hairs in response to the symbiotic bacterium *Bradyrhizobium japonicum*. *Plant Physiol*. 153, 1808–1822. doi: 10.1104/pp.110.157800


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Hossain, Joshi and Stacey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The female gametophyte: an emerging model for cell type-specific systems biology in plant development

#### Edited by:

*Marc Libault, University of Oklahoma, USA*

#### Reviewed by:

*Matthew Mount Stuart Evans, Carnegie Institution for Science, USA Zhenzhen Qiao, University of Oklahoma, USA*

#### \*Correspondence:

*Marc W. Schmid marcschmid@gmx.ch; Anja Schmidt aschmidt@botinst.uzh.ch; Ueli Grossniklaus grossnik@botinst.uzh.ch*

*† These authors have contributed equally to this work.*

#### Specialty section:

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> Received: *09 July 2015* Accepted: *10 October 2015* Published: *03 November 2015*

#### Citation:

*Schmid MW, Schmidt A and Grossniklaus U (2015) The female gametophyte: an emerging model for cell type-specific systems biology in plant development. Front. Plant Sci. 6:907. doi: 10.3389/fpls.2015.00907*

#### Marc W. Schmid\*†, Anja Schmidt\*† and Ueli Grossniklaus\*

*Department of Plant & Microbial Biology and Zurich-Basel Plant Science Center, University of Zurich, Zurich, Switzerland*

Systems biology, a holistic approach describing a system emerging from the interactions of its molecular components, critically depends on accurate qualitative determination and quantitative measurements of these components. Development and improvement of large-scale profiling methods ("omics") now facilitates comprehensive measurements of many relevant molecules. For multicellular organisms, such as animals, fungi, algae, and plants, the complexity of the system is augmented by the presence of specialized cell types and organs, and a complex interplay within and between them. Cell type-specific analyses are therefore crucial for the understanding of developmental processes and environmental responses. This review first gives an overview of current methods used for large-scale profiling of specific cell types exemplified by recent advances in plant biology. The focus then lies on suitable model systems to study plant development and cell type specification. We introduce the female gametophyte of flowering plants as an ideal model to study fundamental developmental processes. Moreover, the female reproductive lineage is of importance for the emergence of evolutionary novelties such as an unequal parental contribution to the tissue nurturing the embryo or the clonal production of seeds by asexual reproduction (apomixis). Understanding these processes is not only interesting from a developmental or evolutionary perspective, but bears great potential for further crop improvement and the simplification of breeding efforts. We finally highlight novel methods, which are already available or which will likely soon facilitate large-scale profiling of the specific cell types of the female gametophyte in both model and non-model species. We conclude that it may take only few years until an evolutionary systems biology approach toward female gametogenesis may decipher some of its biologically most interesting and economically most valuable processes.

Keywords: developmental systems biology, model systems, single cell type isolation, gametophyte, transcriptomics

# 1. SYSTEMS BIOLOGY: AN INTEGRATED APPROACH TO MODEL BIOLOGICAL PROCESSES WITH LARGE-SCALE DATA

Since the foundation of the Institute for Systems Biology in the year 2000 and the formal definition of systems biology at the beginning of the twenty-first century (Ideker et al., 2001; Kitano, 2002), it has been a steadily growing field of research. As an integrative approach, systems biology is markedly different from the reductionistic approach generally used in molecular biology and genetics. Powered by the central dogma of biology, where a gene is transcribed to mRNA, which is then translated into proteins, molecular biology and genetics have successfully identified genes, their functions, and the processes they are involved in. However, the implicit link of a gene to a certain function or a phenotype is an oversimplification of the underlying process. It thus frequently misses important interactions with other cellular or environmental factors (e.g., responses to environmental conditions like a temperature-dependent phenotype of a mutant). In contrast, systems biology may be described as an attempt to quantitatively and/or qualitatively describe and understand the global behavior of a biological entity, emerging from the interactions between its molecular components. Such a comprehensive understanding would allow the prediction and modeling of the biological entity, its precise control, and ultimately the targeted manipulation of a complex biological system (reviewed in Kitano, 2002; Yuan et al., 2008; Fukushima et al., 2009; Chuang et al., 2010; Katari et al., 2010; Weckwerth, 2011).

Systems biology comprises and integrates experimental studies and large-scale data sets derived from high-throughput technologies (omics), such as transcriptomics (RNA profiling), proteomics (analysis of proteins), and metabolomics (profiling of metabolites). However, epigenetic regulatory processes based on the modification of chromatin components or DNA (epigenomics), the translation of mRNAs to proteins (translatomics), complex formation of proteins with proteins or nucleic acids (interactomics), the investigation of protein modifications, e.g., phosphorylation important for the regulation of their activity (phospho-proteomics), and the transport of ions or metabolites (fluxomics) also need to be taken into account to achieve a full picture of the dynamic processes of a cell or organism (reviewed by Sheth and Thaker, 2014). One of the most crucial aspects for systems biology approaches is the comprehensiveness of the omics data (Kitano, 2002). For a given method this includes the number of items that can be measured at once (e.g., transcripts with transcriptomics). For the entire system, it is then important whether the relevant items (e.g., enzymes and metabolites) or processes (e.g., posttranslational modifications) can be accurately measured with a combination of certain methods. An additional level of complexity may be imposed by the requirement of a high spatial and/or temporal resolution. For a single, isolated cell this can refer to specific organelles, subcellular compartments, certain domains of the plasma membrane, and the stage of the cell-cycle. For a unicellular organism like yeast, this may be augmented by studying the cell-to-cell variability within the population (Pelkmans, 2012). In multicellular organisms, each cell (type) has a specific function and position within an organ. Its role and differentiation status may be influenced by local signals as well as systemic signals originating from other organs (e.g., hormones). In addition, the temporal coordinate expands to developmental stages of the organs or the life span of the organism.

Consequently, a complete understanding at the systems level requires highly resolved, quantitative spatio-temporal data on the individual components and their interactions, and the integration of the data into models. On the one hand, integration of these data with computational methods can help to characterize previously unknown components (e.g., genes) of a system, as exemplified for yeast (Brown et al., 2006). On the other hand, the data may be used in a mathematical model describing the system and allowing the prediction of a system's behavior and the formulation of hypotheses (Süel et al., 2007). Finally, the integration of omics data, the formulation of mathematical models, the generation of hypotheses, and the experiments are interlinked and benefit from each other. A possible extension of systems biology is the use of interspecies comparisons to, for example, elucidate the extent to which genotypic variation translates into phenotypic differences (Konstantinidis et al., 2009). Even broader, evolutionary systems biology may be recognized as an approach to describe and understand how biological systems are shaped by evolution and are steering it at the same time (reviewed in Soyer, 2012).

Prior to the understanding of a complex organism composed of many different cell and tissue types, investigations of distinct cell types can lead to an understanding of basic processes governing cellular specification, differentiation, and metabolism. To date, yeast (S. cerevisiae) is a widely used model system appreciated as the currently best understood cell (Boone, 2014). Although only distantly related in evolutionary terms, pathways in yeast have been shown to have considerable similarities to the ones in plants, animals, and humans (Ideker et al., 2001). In addition, yeast serves for the production of food and pharmaceuticals. Due to its simplicity and its importance for biotechnology and biomedical research, yeast has shaped modern molecular biology to a great extent. Indeed, it has been a pioneering organism in systems biology (reviewed in Botstein and Fink, 2011; Österlund et al., 2012; Boone, 2014), starting from gene expression and regulatory networks discovered during early transcriptome studies and their integration with other genome-wide data, over genetic interaction networks obtained by crossing thousands of mutant strains (Costanzo et al., 2010) and modeling of gene expression as a Quantitative Trait Locus (eQTL, Brem et al., 2002), to genome-wide metabolic models. However, given the unicellularity of yeast, it can hardly serve as a developmental model for complex multicellular animals and even less so for plants. In plants, systems biology is less advanced for several reasons, including the higher complexity of most plant genomes, large gene families, the multitude of primary and secondary metabolites, and the lack of suitable in vitro systems or cell lines for most plant tissues. Most efforts in plant research thus require in vivo experiments, making the procedures generally more difficult and less suitable for high-throughput approaches. As a consequence, data generation can be a severely limiting factor for plant systems biology.
On the other hand, the results are of high relevance for the process under investigation.

Despite the above-mentioned obstacles, substantial progress in the analysis of specific cell types in plants has been made over the last decade. Facilitated by advances in high-throughput profiling technologies and methods for the isolation of individual cell types, recent studies focused on the analysis of specific cell types or even single cells (**Figure 1**). To investigate cell type-specific processes in higher plants, root hairs and trichomes have been used as models, both for their physiological importance and their accessibility at the epidermal surface (for details see below; Ishida et al., 2008; Brechenmacher et al., 2009, 2010; Dai et al., 2010; Libault et al., 2010a,b; Schilmiller et al., 2010; Nestler et al., 2011; Van Cutsem et al., 2011; Dai and Chen, 2012; Rogers et al., 2012; Tissier, 2012; Qiao and Libault, 2013). In addition, starting with only a few examples at the beginning of the twenty-first century (Kehr, 2001), cell type-specific transcriptional profiling has become a robust and frequently used method. In the model plant Arabidopsis thaliana, novel insights into plant development and cellular responses to environmental stimuli were, for example, gained through studies on individual cell types of the root, root hairs, trichomes, and guard cells, and by transcriptional profiling during male and female gametogenesis (reviewed in Taylor-Teeples et al., 2011; Schmidt et al., 2012; Wuest et al., 2013). These examples clearly illustrate the importance of cell type-specific investigations for a detailed understanding of differentiation processes and environmental responses of distinct cell types. However, depending on the cell type under investigation, the currently available methods for cell isolation may be challenging, time-consuming, or limited to a subset of omics approaches (e.g., Laser-Assisted Microdissection (LAM) of rare cell types, Wuest et al., 2013).
While studies focusing on specific cell types, which can be isolated in quantities high enough for the full set of omics approaches, can serve as initial models for cell type-specific systems biology in plants (Libault et al., 2010a), the ultimate goal must be that the full set of methods can be applied to any cell type of interest.

# 2. METHODS FOR THE ACQUISITION OF LARGE-SCALE QUANTITATIVE DATA FROM SPECIFIC CELL TYPES

Large-scale profiling of distinct cell types critically depends on the possibility to isolate these cells in sufficient purity and quantity, as well as the sensitivity and accuracy of the profiling methods. Despite the rapid improvements of established and novel tools for systems biology, the demand for fast and easily applicable methodologies for cell type-specific analyses is not yet satisfied. Further challenges are associated with the requirement for normalization and integration of different data types, and the increasing demand for platforms allowing storage and sharing of the rapidly growing amount of large-scale datasets (reviewed by Chuang et al., 2010; Katari et al., 2010; Gomez-Cabrero et al., 2014; Sheth and Thaker, 2014). In brief, three steps are of great importance for cell type-specific systems biology: (i) isolation and purification of the specific cell type, (ii) profiling of the selected molecular compounds, and (iii) data analysis, integration, storage, and sharing. In the following sections, we will present current methods to acquire large-scale quantitative data required for systems biology. We will focus on methods allowing genome-wide cell type-specific analyses and present representative examples. For a discussion on the computational challenges in systems biology, the reader is referred to several recent reviews (Ahrens et al., 2007; Yuan et al., 2008; Fukushima et al., 2009; Chuang et al., 2010; Katari et al., 2010; Liberman et al., 2012; Fukushima et al., 2014; Gomez-Cabrero et al., 2014; Robinson et al., 2014).

# 2.1. Methods for the Isolation of Specific Cell Types

A few cell types in plants are exposed on the surfaces of tissues and can be collected by abrasion or mechanical detachment. Depending on the species, relatively simple mechanical isolation procedures for trichomes and root hairs have made a large spectrum of methods applicable. Mechanical isolation of trichomes allowed transcriptomics and metabolomics in various species (for an integrated database see Dai et al., 2010) and proteomics (Schilmiller et al., 2010; Van Cutsem et al., 2011). Root hairs are another example of an exposed cell type, for which relatively simple isolation procedures facilitated transcriptomics (Libault et al., 2010b), proteomics (Brechenmacher et al., 2009; Nestler et al., 2011), and metabolomics (Brechenmacher et al., 2010). Certain other cell types can be isolated by tissue disruption, followed by centrifugation-based methods or manual isolation of the dissociated cells under a microscope using a micropipette (optionally with a marker for the cell type of interest). Examples include specific cell types from the male or female reproductive lineages, plant mesophyll cells, and guard cells (reviewed by Dai and Chen, 2012; Schmidt et al., 2012; Wuest et al., 2013). Proteomic profiling has, for example, been performed on Brassica napus guard cells and mesophyll cells that could be purified as protoplasts (Zhu et al., 2009).

However, for most cell types these methods are not applicable. Several methods for the isolation of specific cell types embedded in differentiated tissues have been established. Fluorescence-Activated Cell Sorting (FACS) can be used to sort fluorescent cells based on their light scattering characteristics and fluorescence (reviewed by Hu et al., 2011). This method allowed high resolution transcriptional profiling of different cell types in the Arabidopsis root, and, more recently, proteomics (Petricka et al., 2012) and metabolite mapping of selected root cell and tissue types (Brady et al., 2007; reviewed by Benfey, 2012; Moussaieff et al., 2013). Similarly, Fluorescence-Activated Nuclei Sorting (FANS) has been established and, for example, used to isolate endosperm nuclei for profiling of RNA activity or epigenetic modifications (Weinhofer et al., 2010; Weinhofer and Köhler, 2014). Despite the great potential of FACS/FANS for plant cell type-specific systems biology, both approaches have certain limitations: They can only be applied if transgenic lines carrying cell type-specific fluorescent markers can be established, and they are thus not suitable for most non-model species. In addition, depending on the tissue type, longer enzymatic incubations are required to digest the cell walls and to release the protoplasted cells prior to sorting (Evrard et al., 2012). Consequently, changes in, for example, the transcriptome or metabolome cannot be fully excluded. Alternatively, the INTACT method (Isolation of Nuclei TAgged in specific Cell Types) allows the isolation of nuclei expressing a biotinylated nuclear envelope protein by affinity purification with streptavidin-coated beads (Deal and Henikoff, 2011). This method is suitable to study epigenetic modifications (DNA methylation or histone modifications) and to profile the RNA within the nucleus.
To study actively translated mRNAs bound to ribosomes (translatome), small epitope tags can be fused to a ribosomal protein to allow immunopurification of the ribosomes containing the mRNAs with a method named TRAP (Translating Ribosome Affinity Purification; reviewed in Bailey-Serres, 2013). Alternatively, RNAs binding to RNA binding proteins involved in the formation of ribonucleoprotein (RNP) complexes can be profiled by immunoprecipitation of an epitope-tagged protein (RNP ImmunoPurification, RIP; Bailey-Serres, 2013). It has to be noted that the analyses of transcriptome and translatome abundance will not give the same results, because not all mRNAs present in a cell are actively translated at a given time point. In this respect, profiling of mRNAs bound to ribosomes gives complementary results to transcriptome profiling as the readouts are closer to the synthesis of proteins (Bailey-Serres, 2013). Like FACS and FANS, INTACT, TRAP, and cell type-specific RIP also require the use of transgenic lines and pre-existing knowledge about cell type-specific promoters or markers.

An alternative method not requiring any molecular knowledge is LAM (Kerk et al., 2003). For LAM, plant tissues are typically fixed and embedded in paraffin wax (reviewed in Schmidt et al., 2012; Wuest et al., 2013) or resin (Tucker et al., 2012; Okada et al., 2013). Thin sections of the tissues (typically between 6 and 10 µm) are subsequently mounted on metal-framed plastic slides and used to isolate the cell types of interest after removing the wax or resin and drying the tissues on the slides (Okada et al., 2013; Wuest and Grossniklaus, 2014). Alternatively, the tissue may also be embedded in optimal cutting temperature compound for cryosectioning, followed by on-slide tissue dehydration and LAM (Kelliher and Walbot, 2012, 2014). The main constraint of LAM is that harvesting sufficient material for downstream omics methods can be very time-consuming. Furthermore, the suitability for single cell isolation depends on the optical resolution in sectioned tissues and the recognizability of the cell type of interest. In addition, the physical properties of the laser beam of the instrument used can impose limitations on which cell types can be isolated (Schmidt et al., 2012). Thus, the time required for collecting enough material for one sample is largely dependent on the cell type of interest. So far, the applications of LAM for cell type-specific omics have been restricted to transcriptional profiling, e.g., to study cell type specification in the female reproductive lineage in Arabidopsis thaliana, Boechera gunnisoniana, and Hieracium praealtum (Wuest et al., 2010; Schmidt et al., 2011, 2014; Schmid et al., 2012; Okada et al., 2013). However, other applications, such as genome-wide profiling of DNA methylation, are likely feasible (see below).

# 2.2. Methods for Data Acquisition

#### 2.2.1. Transcriptomics

Transcriptome profiling encompasses the identification and quantification of all expressed RNA transcripts at a given time point (mRNA, tRNA, microRNA). However, due to the frequent use of oligo-dT priming during cDNA synthesis or the hybridization to microarrays covering only coding regions of the genome, many studies are restricted to mRNAs or a subset of mRNAs. Several types of microarrays were produced and extensively used for the analyses of gene expression in different plant species, including the model plant Arabidopsis thaliana and different important crop species like maize, rice, and barley (reviewed in Sheth and Thaker, 2014). The Affymetrix ATH1 GeneChip (www.affymetrix.com), the most popular microarray for Arabidopsis, has, for example, been used to profile a large variety of different tissue types (e.g., Schmid et al., 2005), specific cell types of the root isolated through FACS (Birnbaum et al., 2003), and specific cell types of the male and female reproductive lineages (reviewed in Schmidt et al., 2012). In addition to well established tools for data analysis, the wealth of publicly available datasets generated on the same platform makes commonly used microarrays a very valuable tool for systems biology (Katari et al., 2010).

Apart from microarrays, several platforms for Next Generation Sequencing (NGS) have been developed in recent years and are now routinely used for transcriptional profiling (RNA-Seq; see Mardis, 2013, for a review on NGS platforms). RNA-Seq has several advantages as compared to the use of microarrays, including a higher dynamic range, higher sensitivity, and whole-genome coverage allowing the identification of previously unknown transcripts and splice variants (reviewed in Schmidt et al., 2012). A major advantage is the applicability to non-model species, either through de novo assembly of the short reads into transcripts or by the use of a reference transcriptome either produced separately or taken from a public database (e.g., the ongoing effort to sequence 1000 plant transcriptomes, www.onekp.com). Examples for such an approach are the central cells of Arabidopsis thaliana, and cells of the female reproductive lineage in Hieracium praealtum and Boechera gunnisoniana (Schmid et al., 2012; Okada et al., 2013; Schmidt et al., 2014). Several tools for RNA-Seq data analysis are available (see Sheth and Thaker, 2014, for a selection of software tools, and Schmid and Grossniklaus, 2015, for Rcount, a count tool addressing the problem of reads aligning at multiple locations in the genome, or reads aligning at positions where two or more genes overlap). Current challenges are the increasing demand for standardized annotations of datasets and the development of computational methods allowing the integration of data from different studies using different methods and platforms. In the future, the integration of data from different species will be of great value for plant systems biology, allowing researchers to gain insights into conserved common regulatory mechanisms, environmental adaptations, and evolutionary changes.

#### 2.2.2. Proteomics

In addition to the analysis of gene expression and actively translated mRNAs, the investigation of proteins and protein modifications (e.g., phospho-proteomics and glyco-proteomics) adds further levels of complexity. From a systems biology perspective the aim is the combination of cell type-specific proteomics with transcriptomics and metabolomics to elucidate and model regulatory networks (reviewed in Dai and Chen, 2012). In the early days of proteomics, 2D gel electrophoresis was frequently used to separate the proteins in a sample and to identify spots representing proteins differentially occurring in two samples (reviewed by Schulze and Usadel, 2010). However, the protein or protein mixture in one spot could only be identified by excising the spot and analyzing it by Mass Spectrometry (MS). To date, proteomics largely depends on the use of various MS methods in combination with different protein separation procedures. Typically, proteins are first digested with trypsin and subsequently either analyzed directly by MS or first separated by chromatography before MS. MS methods have greatly improved with the development of soft ionization methods like ElectroSpray Ionization (ESI) in solution (typically aqueous or organic solvents) or Matrix Assisted Laser Desorption Ionization (MALDI, Hollenbeck et al., 1999; Schulze and Usadel, 2010). Both methods generate intact gas-phase ions, which are introduced into mass analyzers and sorted according to their mass-to-charge ratio, e.g., using their Time-Of-Flight (TOF, Hollenbeck et al., 1999; for a recent summary of mass analyzers see Lee et al., 2012; for a description of Orbitrap mass analyzers see Perry et al., 2008). However, detection based on peptide mass-to-charge ratios is largely qualitative and can only be used for quantification in two or more samples acquired under standardized conditions (Schulze and Usadel, 2010).
Thus, stable isotope or chemical labeling is frequently applied for quantification in proteomic methods (reviewed in Schulze and Usadel, 2010). While software and algorithms for protein identification are well established, quantitative analysis remains more challenging (Schulze and Usadel, 2010; Sheth and Thaker, 2014, see Sakata and Komatsu, 2014, for a recent survey on proteomics repositories and databases).
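The digestion and mass-to-charge steps described above can be illustrated with a minimal in silico sketch. It is not taken from any of the cited studies: it assumes the commonly used rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) unless the next residue is proline (P), ignores missed cleavages and modifications, and uses standard monoisotopic residue masses. The function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.007276   # mass of the proton carried per positive charge


def tryptic_digest(protein):
    """Split a protein after K or R, except when followed by P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mz(peptide, charge=1):
    """Mass-to-charge ratio of a peptide ion carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for pep in tryptic_digest("MAKRPLKGR"):  # hypothetical sequence
        print(pep, round(peptide_mz(pep, 1), 4), round(peptide_mz(pep, 2), 4))
```

The sketch only yields theoretical m/z values; as noted above, turning measured signals into quantities additionally requires labeling or standardized acquisition.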

To date, only a restricted number of plant cell types have been profiled in a cell type-specific manner by proteomics, including guard cells, mesophyll cells, trichomes, root hair cells, leaf epidermal cells, lily and rice sperm cells, different stages of pollen development in tobacco, Arabidopsis, and tomato, and rice egg cells (Brechenmacher et al., 2009; Grobei et al., 2009; Abiko et al., 2013; Chaturvedi et al., 2013; Ischebeck et al., 2014, and reviewed by Dai and Chen, 2012; Wuest et al., 2013). As compared to transcriptomics approaches, a larger amount of starting material is required. For example, approximately 40 µg of protein were isolated to study the proteome during tobacco pollen development (Ischebeck et al., 2014). In addition, the number of proteins detected is typically in the range of 10–30% of the transcripts identified from the same cell or tissue type, as exemplified by a study on Arabidopsis pollen, in which 3599 proteins as compared to 11,150 expressed genes were reported (Grobei et al., 2009). This quantitative difference largely reflects the difference in the sensitivity of the methods and likely only to a smaller extent meaningful biological differences. Nevertheless, as only a few proteins had been identified in earlier studies, e.g., from maize egg cells (Okamoto et al., 2004), these data reflect a great improvement and a rapid advance since the coining of the term proteomics in 1997 (James, 1997).

#### 2.2.3. Protein-Protein Interactions

For studies of protein-protein interactions, the major methods used are Yeast Two-Hybrid (Y2H) screens, Affinity Purification Mass Spectrometry (AP-MS), or Bimolecular Fluorescence Complementation (BiFC) (reviewed in Zhang et al., 2010). Y2H assays take advantage of the bipartite structure of the yeast GAL4 transcriptional activator consisting of two functional domains, a transcription activation domain and a DNA-binding domain. In Y2H assays, the bait and the target protein are fused to the two functional domains of GAL4, respectively. Upon interaction, they reconstitute the functional GAL4 protein, which binds to its target promoter (UASGAL4) to activate the expression of a downstream gene encoding a selectable marker. Apart from a high false-positive rate, the use of yeast itself is a major drawback of the method. While cell type-specific cDNA libraries can be used to profile pairwise protein interactions, the system does not truly reflect the in vivo state of a specific plant cell (e.g., cofactors of an interaction may be missing). Several systems similar to Y2H assays have been established to specifically study membrane proteins (e.g., split-ubiquitin system; Obrdlik et al., 2004; Chen et al., 2010). For AP-MS, a bait protein is fused to an affinity tag for expression in vivo. The tagged protein of interest is subsequently purified as a complex with interacting proteins or other molecules and assayed by MS. This method is also associated with a relatively high false-positive rate due to protein contaminants. While the method is well suited for cell type-specific studies if the expression of the tagged protein is driven by a cell type-specific promoter, true omics-scale profiling can hardly be achieved, as a precondition would be the cell type-specific tagging of all proteins represented in a cell.
This also holds true for BiFC, where a fluorescent protein (YFP, RFP, or GFP) is split into two non-fluorescent halves that are reconstituted to a fluorescent protein upon interaction of the bait and target proteins they are fused to (reviewed by Zhang et al., 2010). While BiFC has the advantage that spatial and temporal interactions can be resolved, it is also associated with a high false-positive rate. Consequently, methods for true cell type-specific large-scale protein-protein interaction studies in plants are lacking to date. Nonetheless, the currently available data on protein-protein interactions, as for example the recently established membrane protein interactome (Chen et al., 2012), may help to resolve certain dependencies within regulatory networks (see Sheth and Thaker, 2014, for a summary of the available databases).

#### 2.2.4. Protein-DNA Interactions

Interactions between proteins and DNA comprise several functional aspects, for example nucleosome occupancy, specific histone modifications, or transcription factor binding. These interactions may be studied using either Chromatin ImmunoPrecipitation (ChIP, Orlando and Paro, 1993), or DNA adenine methyltransferase IDentification (Dam-ID, van Steensel and Henikoff, 2000). In both cases, the interaction of one protein (variant) with the DNA is monitored genome-wide. During the ChIP procedure, the DNA is cross-linked by formaldehyde to bound proteins before fragmentation by sonication. Chromatin fragments are then isolated with antibodies against the protein (variant) of interest. After recovery of the co-purified DNA by reversing the cross-links, the DNA sequence can be identified using microarray hybridization or high-throughput sequencing (He et al., 2011). Protocols facilitating cell type-specific ChIP (Chromatin Affinity purification from Specific cell Types by ChIP; CAST-ChIP), without the need for purification of the cell type of interest or a protein-specific antibody, have been developed (Schauer et al., 2013). However, these protocols rely on transgenics and specific promoters. In addition, we are not aware of a report where this method has been applied in plants or used to study rare cell types.

For Dam-ID, the protein of interest is fused to an adenine methyltransferase of E. coli (Dam, Greil et al., 2006). Endogenous methylation of adenine is absent in most eukaryotes. Upon expression of the fusion protein, Dam is targeted to the native binding sites of the protein fused to it. This results in a localized methylation of adenines in the GATC sequence context. These regions can then be identified using methylation-sensitive restriction enzymes and microarray hybridization or high-throughput sequencing (Greil et al., 2006; Luo et al., 2011). Tissue or cell type-specific expression of the fusion protein can be used to overcome the need for cell isolation and has been shown to be highly specific (targeted DamID, "TaDa," Southall et al., 2013). The major disadvantages of the method are the requirement for transgenics and specific promoters, as well as the need for optimization of the expression level to avoid untargeted methylation and toxicity of the Dam fusion protein. Thus, both approaches are currently quite laborious and generally only applicable to model species. Nonetheless, transcription factor binding data in particular are of great value for the study of transcriptional networks (Yuan et al., 2008). If cell type-specific data are not available, previously identified transcription factor binding motifs may still help to identify transcriptional modules (Diez et al., 2014).
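Because Dam methylates adenines only within GATC, the positions at which a Dam-ID signal can be read out are fixed by the DNA sequence. The following minimal sketch, which is only an illustration and not part of any cited protocol, lists the GATC sites of a sequence and selects those within a window around a putative binding position. The function names, the example sequence, and the window size are hypothetical.

```python
def gatc_sites(sequence):
    """Return 0-based start positions of all GATC motifs in a DNA sequence.

    GATC is its own reverse complement, so the same positions apply to
    both strands.
    """
    sequence = sequence.upper()
    sites = []
    pos = sequence.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = sequence.find("GATC", pos + 1)
    return sites


def sites_near(sites, binding_position, window):
    """Select GATC sites lying within `window` bases of a binding position."""
    return [s for s in sites if abs(s - binding_position) <= window]


if __name__ == "__main__":
    seq = "AAGATCTTGATCGATC"      # hypothetical sequence
    all_sites = gatc_sites(seq)
    print(all_sites)                          # candidate methylation sites
    print(sites_near(all_sites, 9, 2))        # sites close to position 9
```

The density of GATC motifs therefore limits the spatial resolution with which binding sites can be mapped, independently of the detection method.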

#### 2.2.5. Protein Microarrays

Protein microarrays are a promising tool for proteomics as well as for studying interactions of proteins with other proteins, nucleic acids, cellular surface markers, or posttranslational protein modifications (Yang et al., 2011b; Uzoma and Zhu, 2013). Several different types of protein microarrays can be distinguished. On analytical microarrays, well characterized proteins (e.g., monoclonal antibodies) are spotted to identify a specific set of proteins. Alternatively, less well characterized proteins (e.g., lysates from whole cells) are spotted on functional microarrays to test for interaction partners. Finally, proteome microarrays hold the majority of encoded proteins for an organism (Yang et al., 2011b). While the first proteome microarray for budding yeast was established in 2001 (reviewed by Uzoma and Zhu, 2013), not many applications were reported in plants (Yang et al., 2011b; Uzoma and Zhu, 2013). Nevertheless, protein microarrays have, for example, successfully been used to study 802 transcription factors in Arabidopsis (almost half of all transcription factors annotated in Arabidopsis, Gong et al., 2008). While protein microarrays may have a high potential for applications in systems biology, they are currently still limited by high production costs and laborious production methods (e.g., large-scale cloning of open reading frames, protein purification, and production of high-affinity monoclonal antibodies, Yang et al., 2011b).

#### 2.2.6. Metabolomics

Because plant metabolites derive from both primary and secondary metabolism, the plant metabolome is highly complex. Although by far not comprehensively elucidated to date, about 200,000 different metabolites are estimated to be represented in plants (reviewed by Sheth and Thaker, 2014). While a variety of analysis platforms can in principle be applied for metabolite detection, Nuclear Magnetic Resonance (NMR) and MS are the most frequently used methods (Kueger et al., 2012; Sheth and Thaker, 2014). High resolution mapping of metabolites has recently been achieved in Arabidopsis roots by combining FACS with high resolution MS (Moussaieff et al., 2013). In addition, glandular trichomes have been used as model systems for large-scale metabolome analyses (Tissier, 2012). However, the major limitation of current metabolomics is the lack of a single method allowing comprehensive measurements in terms of qualitative detection, quantitation, and spatio-temporal resolution. This is the case because the metabolites differ significantly in their concentration, chemical properties, and analytical behavior. Two major strategies in metabolome profiling are the use of either targeted or untargeted MS (reviewed in Kueger et al., 2012). Targeted MS relies on previous knowledge about structures and chemical properties of the metabolites of interest and combines chromatographic separation techniques, e.g., High Pressure Liquid Chromatography (HPLC) or Gas Chromatography (GC), with MS techniques. In contrast, untargeted analysis using MS without prior chromatographic separation is used to profile metabolites without prior knowledge about their abundance or structure. This method often only allows the determination of metabolic signatures, as the characterization of a specific metabolite, for example by NMR, is highly challenging.
Therefore, a key problem is the availability of reference spectra and compounds for compound identification and annotation (Kueger et al., 2012). Thus, the need for comprehensive databases including relevant information on the compounds, e.g., spectra, and the requirement for integration of metabolome data with other large-scale omics data has been noted (Fukushima et al., 2009). Current online resources include the Golm Metabolome Database (gmd.mpimp-golm.mpg.de) and the MASSBANK Database (www.massbank.jp).

An alternative method to study, for example, metabolites at spatial resolution without the need for prior cell isolation is MALDI-MS Imaging (MSI, reviewed by Lee et al., 2012). For MSI, a suitable matrix is directly applied to thin tissue sections (e.g., 10–20 µm). The prepared tissue sections are then rasterized with a laser beam coupled to a mass analyzer with high mass resolution (TOF-MS, reviewed in Kaspar et al., 2011). The spot size of the laser thereby determines the spatial resolution. Only recently have technical improvements made it possible to reach the resolution required for the analysis of single cells (<20 µm, reviewed in Kueger et al., 2012; Lee et al., 2012). MSI has rarely been used in plants for proteomics, and only a few studies reported the imaging of metabolites (reviewed in Kaspar et al., 2011; Kueger et al., 2012). Examples for metabolite imaging with MSI include the measurement of wheat grain cell-wall polysaccharides (Veličković et al., 2014, 100 µm spot size), or the lipid measurements in embryos of cotton (Horn et al., 2012, 35 µm spot size). While MSI has a great potential for cell type-specific studies for plant systems biology, it needs to be noted that only thin surface layers of <1 µm are sampled by MALDI (Lee et al., 2012). However, further improvements in MSI are likely to be developed soon, and adaptation of these methods to plant tissues may one day facilitate single-cell proteomics as well as metabolomics in a range of species.

In addition, to study the subcellular localization of specific ions or metabolites and their physiological relocation, e.g., by directed transport, a variety of molecular sensors has recently been developed. Such sensors usually depend on proteins changing their conformation upon binding of a specific substrate. Consequently, the distance between attached fluorescent proteins changes, leading to an alteration in Fluorescence Resonance Energy Transfer (FRET, reviewed by Okumoto, 2012; Okumoto et al., 2012). For spatially and temporally resolved measurements, FRET can be measured by, for instance, Fluorescence Lifetime Imaging Microscopy (FLIM, reviewed by De Los Santos et al., 2015). While being very valuable tools in plant research, these techniques do not readily allow the high-throughput analysis of a large number of compounds in a plant cell and will thus not be discussed in detail in this review.
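Why a small conformational change of such a sensor produces a measurable signal can be seen from the standard Förster relation, in which the transfer efficiency E depends on the sixth power of the donor-acceptor distance r relative to the Förster radius R0: E = 1 / (1 + (r / R0)^6). The short sketch below evaluates this relation; the distances and the Förster radius are hypothetical values chosen for illustration and do not describe any specific sensor from the cited reviews.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """FRET efficiency from the Foerster relation E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Foerster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Foerster radius in nm
    # Hypothetical fluorophore distances before and after substrate binding.
    for label, r in (("unbound", 6.0), ("bound", 4.5)):
        print(label, round(fret_efficiency(r, r0), 3))
```

Because of the sixth-power dependence, the efficiency changes most steeply when the fluorophore distance is close to R0, which is why sensor designs aim to place the two fluorescent proteins in that range.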

#### 2.2.7. DNA Methylation

DNA (cytosine) methylation is a heritable epigenetic modification of the genome and is involved in various cellular and developmental processes in a wide range of species, including animals, fungi, and plants. Several methods for genome-wide profiling of the DNA cytosine methylation status have been established. These include the hybridization onto whole-genome DNA microarrays after digestion of genomic DNA with methylation-sensitive restriction enzymes, or the precipitation of methylated DNA with antibodies targeting methylated cytosines (Methylated DNA ImmunoPrecipitation, MeDIP), followed by either microarray hybridization (MeDIP-chip) or NGS (MeDIP-Seq, reviewed by Su et al., 2011; Ji et al., 2015). The current method of choice for methylome profiling is Whole-Genome Bisulfite Sequencing (WGBS). In brief, DNA is incubated with bisulfite, converting all unmethylated cytosines to uracils, which are identified as thymines during sequencing. In contrast, all methylated cytosines are protected from the conversion, remain unchanged, and are identified as cytosines during sequencing (Ji et al., 2015). Compared to the profiling of other epigenetic marks, such as histone modifications, WGBS has two major advantages. It does not require the use of transgenic plants or antibodies, and recently developed methods allow WGBS on as little as 125 pg of DNA (Post-Bisulfite Adaptor Tagging (PBAT), Miura et al. (2012); 20 pg diluted Arabidopsis DNA with a modified protocol, our unpublished data). WGBS is therefore a very promising method for the profiling of specific cell types in plants.
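The logic of bisulfite sequencing described above can be made concrete with a minimal sketch. It is a simplification under stated assumptions: a single strand is considered, conversion is complete, sequencing is error-free, and the read is already aligned to the reference at the same coordinates. The function names and the example sequence are hypothetical, and real WGBS pipelines additionally handle both strands, alignment, and incomplete conversion.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate the sequenced read of a bisulfite-treated strand.

    Unmethylated C is converted to U and read as T; methylated C
    (0-based positions in `methylated_positions`) remains C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence)
    )


def call_methylation(reference, read):
    """Infer the methylation state of every reference cytosine from one read.

    Returns a dict mapping position -> True (methylated, read shows C)
    or False (unmethylated, read shows T).
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C":
            if read_base == "C":
                calls[i] = True
            elif read_base == "T":
                calls[i] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"                      # hypothetical sequence
    read = bisulfite_convert(reference, [1])    # only position 1 is methylated
    print(read)
    print(call_methylation(reference, read))
```

In practice, many reads cover each cytosine, and the methylation level of a site is estimated as the fraction of reads showing C at that position.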

# 3. SYSTEMS BIOLOGY APPROACH TOWARD PLANT DEVELOPMENT

As evident from the previous examples, plant cell type-specific systems biology is most advanced in cell types that can be isolated relatively easily in amounts large enough for any type of omics approach. For the root hairs of soybean, for example, a promising method to isolate large quantities facilitating any omics analysis has recently been described and will likely be of great use (Qiao and Libault, 2013). The method uses an ultrasound aeroponic system to enhance root hair density, followed by fixation and separation of the root hairs in liquid nitrogen. In addition, for the different cell types of the Arabidopsis root, FACS yields sufficient material for most omics approaches. An advantage of these systems is that, because only one isolation method is used, the variability it imposes can be held constant over all experiments. The use of a single method is also cost-efficient, as optimizing one method requires less time and fewer resources than optimizing several. Due to the relatively easy sample collection and their physiological roles, roots, root hairs, and trichomes are excellent models to study responses to environmental stimuli, host-pathogen/symbiont interactions, metabolic pathways, or the dynamics of cellular specification and cell-cell communication in complex tissues. However, even the root may not be an optimal model to address fundamental questions of developmental systems biology. Its main disadvantages are the long developmental time span, starting very early during embryogenesis, and the complex interplay within and between the different cell types of the root, but also with the above-ground tissues, and biotic and abiotic environmental factors. Ideally, a developmental model system should allow experimental coverage of the entire life span of the organism. 
It would be advantageous if the organism were short-lived and comprised only a limited number of developmental stages and specialized cell and tissue types to reduce complexity and increase the affordability of comprehensive studies. For comparative analyses and evolutionary systems biology approaches, it would be further advantageous if the phylogeny of the model system included a broad range of organisms with gradual phenotypic changes, or with gain, loss, and alternative usage of modular building blocks. Finally, an ideal model system is most beneficial if its understanding can lead to direct applications in, for example, production of food or pharmaceuticals.

An intuitive model for the development of an organism is the embryo. During plant embryogenesis, the basic body organization with an apical-basal and radial pattern is established starting from a single cell, the zygote. The mature embryo already contains the progenitors of the main organizers of plant growth, the primary Shoot and Root Apical Meristems (SAM and RAM), and the hypocotyl and cotyledons with their various tissue types (reviewed in Lau et al., 2012). It is thus already a relatively complex system composed of multiple cell and tissue types. Additional complexity is imposed by the different stages of embryo development, spanning the time between the single-celled zygote and the mature embryo. An in-depth systems biological description of embryogenesis would therefore require sampling of a large variety of cell types at many time points. Nevertheless, while most transcriptional studies published so far focussed on whole tissues or entire embryos (reviewed in Palovaara et al., 2013; Zhan et al., 2015), high-quality cell type-specific transcriptomes of the proembryo and the suspensor of the early stages of the Arabidopsis embryo were recently described (Slane et al., 2014).

Alternative models for the development of organisms, which are far less complex than the embryo, are the gametophytes of flowering plants: the pollen (male) and the embryo sac (female). They are typically formed from one spore (meiotic product) and, at maturity, they consist of only a few cells and cell types, including the male gametes (the sperm cells) and the female gametes (the egg and central cells) (reviewed in Yang et al., 2010; Twell, 2011; Schmidt et al., 2015). Upon double fertilization, the egg cell and the central cell fuse with one sperm each to give rise to the embryo and endosperm, respectively. The latter nurtures the embryo and acts as a storage organ for seed reserves in many species, including the cereals. The endosperm is thus the most important food and feed source.

Given the sheer amount of pollen produced by a single plant, and the relatively simple isolation procedures for some of the specific cell types of the male germline in developing pollen, multiple cell type-specific transcriptome data sets are available from different species, including Arabidopsis thaliana, Oryza sativa (rice), Zea mays (maize), Lilium longiflorum (lily), and Plumbago zeylanica (white leadwort) (**Table 1**; reviewed in Schmidt et al., 2012; Anderson et al., 2013; Dukowic-Schulze et al., 2014; Kelliher and Walbot, 2014), and several cell type-specific proteomes have recently been described for tobacco, Lilium davidii var. unicolor (Lanzhou lily), and tomato (**Table 1**; Abiko et al., 2013; Chaturvedi et al., 2013; Zhao et al., 2013; Ischebeck et al., 2014). Due to their characteristic tip-growth, pollen tubes also serve as an excellent model to study cell elongation and mechanical properties of the cell wall (Vogler et al., 2013). However, pollen development is strikingly uniform in angiosperms (Maheshwari, 1950), and inter-species comparisons would therefore likely be more fruitful in gymnosperms, which show a remarkable variation in terms of the number of cell divisions between meiosis and the subsequent specification of the sperm cells (Fernando et al., 2010). In contrast to pollen, a plant forms far fewer female gametophytes, which are deeply embedded in the maternal floral tissue (e.g., in Arabidopsis, each flower contains around 50 ovules, each of which harbors only one embryo sac). Nonetheless, several cell type-specific transcriptomes (**Table 2**; reviewed in Schmidt et al., 2012; Wuest et al., 2013, and more recent data in Anderson et al., 2013; Okada et al., 2013; Schmidt et al., 2014) as well as a proteome analysis for rice egg cells (**Table 2**; Abiko et al., 2013) are currently available. 
Even though it is more difficult to collect than the pollen, the embryo sac has certain developmental features rendering it a highly interesting model system for plant development: (i) high evolutionary diversity within angiosperms, (ii) syncytial development (i.e., the formation of a multinucleate cell), (iii) specification and differentiation of only three to four distinct cell types, and (iv) a process in which plants can reproduce asexually via seeds (gametophytic apomixis).

TABLE 1 | Summary of transcriptome (top) and proteome (bottom) datasets generated for specific cell types during formation of the male reproductive lineage and gametogenesis.

*In brief, pollen formation starts with a microspore mother cell (or meiocyte) which undergoes meiosis to give rise to a tetrad of reduced spores. Each of these microspores undergoes pollen mitosis I to give rise to a generative and a vegetative cell. The subsequent mitotic division of the generative cell (pollen mitosis II) results in the formation of two sperm cells (Twell, 2011). UNM, uninucleate microspore; GC, generative cell; SC, sperm cell; LC, liquid chromatography.*

*MMC, megaspore mother cell; AIC, apomictic initial cell; AI, aposporous initial cell; egg, egg cell; syn, synergids; cen, central cell; LC, liquid chromatography.*

The mature embryo sacs of angiosperms generally contain at least three distinct cell types: the synergids required for pollen tube attraction and reception, and the two gametes, the egg and the central cell. An exception is, for example, the Podostemaceae, where the central cell seems to degenerate before pollen tube arrival, resulting in a single fertilization event (Sehgal et al., 2014). In addition, antipodal cells are frequently present, but little is known about their function. It has been hypothesized that they might be involved in nutrient transfer from the surrounding tissues to the embryo sac (Raghavan, 1997). Despite the high functional similarity of mature embryo sacs, their formation is highly diverse across different plant taxa (**Figure 2**; Maheshwari, 1950; Huang and Russell, 1992; Baroux et al., 2002; Williams and Friedman, 2004). Reproductive development can be divided into two steps: megasporogenesis and megagametogenesis. Megasporogenesis comprises the formation and maturation of the initial meiotic products (megaspores) from a single selected sporophytic cell, the Megaspore Mother Cell (MMC), and is under the control of the usually diploid sporophytic genome. Megagametogenesis describes the following mitotic divisions, cellularization, cell specification, and maturation of the female gametophyte, which is under the control of the typically haploid gametophytic genome. Both processes exhibit high diversity within angiosperms. Depending on the number of spores that survive and participate in megagametogenesis, megasporogenesis can be divided into monosporic (one spore), bisporic (two spores), and tetrasporic (all four spores). Further variation includes the location of the degenerating spores and the positioning of the spores in the tetrasporic types. 
Likewise, megagametogenesis can vary in the number of mitotic divisions, the arrangement of the nuclei/cells, and late divisions of individual cells after cellularization (e.g., in Amborella, Friedman, 2006). Comparative analysis of the structure of a wide range of embryo sacs and reconstruction of the ancestral state suggest that the embryo sacs of early angiosperms contained only four cells: two synergids, one egg cell and one central cell. It has been hypothesized that duplication of this four-celled module facilitated the emergence of the bi-nucleate central cell that, following fertilization, forms an endosperm with a maternal:paternal genome contribution ratio of 2:1 (Williams and Friedman, 2004; Friedman, 2006; Friedman and Ryerson, 2009). This unequal parental contribution to the endosperm has received a lot of attention over the last century. As a tissue protecting and nourishing the embryo, the endosperm may be subject to adaptive processes and parental conflicts (Haig and Westoby, 1989; Baroux et al., 2002).

FIGURE 2 | Schematic showing several basic types of female gametophyte development in angiosperms and the structural diversity of the mature embryo sacs (after Maheshwari, 1950). The development of the female gametophyte can be divided into two steps: megasporogenesis (orange shading) and megagametogenesis (green shading). During megasporogenesis, a selected sporophytic cell, the megaspore mother cell (MMC), undergoes meiosis to give rise to spores. In most angiosperms, a tetrad of four megaspores is formed, of which three subsequently abort, leaving only one functional megaspore (FMS) to participate in megagametogenesis (e.g., *Polygonum*-type). However, a high diversity of the developmental processes of megasporogenesis and megagametogenesis has been observed in different genera, with variations, for example, including bispory and tetraspory. During megagametogenesis, the mature female gametophyte is formed through mitotic divisions, nuclear migration, and cellularization. For the mature embryo sac, the colors indicate the cell types: egg (pink), synergids (yellow), central cell (blue), and antipodal/lateral cells (white). Cells structurally similar to egg cells or synergids are drawn accordingly, but are colored gray.

An interesting aspect of female gametophyte development (and tetrasporic megasporogenesis) is the formation of a syncytium during the divisions of the nuclei prior to cellularization. In angiosperms, gametogenesis and early stages of endosperm development are the two major examples of syncytium formation. In contrast, the plasmodial tapetum, for example, is formed by degeneration of the cell walls and the fusion of the resulting protoplasts (Furness and Rudall, 1998). Unlike regular cell divisions, where the positions of cells are relatively fixed due to the rigid cell wall, a syncytium allows for nuclear migration and for differentiation according to gradients of positional information. Indeed, determination of cell fate in the embryo sac of Arabidopsis depends on the position of the nuclei as, for example, indicated by the Arabidopsis retinoblastoma-related1 (rbr1) mutant, which produces supernumerary nuclei differentiating according to their position within the female gametophyte (Johnston et al., 2008; Sprunck and Groß-Hardt, 2011). However, the nature of such information is still under debate. Appealing candidates may be gradients of plant hormones, such as cytokinin or auxin. For both, a role in establishing polarity during embryo sac development has been proposed (reviewed in Schmidt et al., 2015), but their role may be rather indirect (Lituiev et al., 2013). An alternative or complementary hypothesis can be formulated using the analogy to the syncytial embryogenesis in Drosophila, where around 70% of the genes expressed during early embryogenesis show a specific subcellular localization of their mRNA in the syncytium. Interestingly, specific subcellular mRNA localization peaks around the transition from syncytial to cellular development, potentially reflecting the high demand for localization mechanisms (Lécuyer et al., 2007). 
Thus, a fascinating possibility is that the specific subcellular localization of mRNAs in the syncytial stage of the developing embryo sac may play a role in determining cell fate. One way to test this hypothesis would be to separately isolate specific subcellular regions (e.g., the two opposing poles) of the developing syncytial female gametophyte and to compare the transcriptional profiles of these regions with each other.
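As a rough illustration of what such a comparison of two subcellular regions could look like computationally, the following Python sketch normalizes read counts from two hypothetical pole samples to counts per million and reports a log2 fold change per gene. All names (`pole_a`, `pole_b`, the gene identifiers) and counts are invented for illustration, and the pseudocount is an arbitrary assumption. A real experiment would need biological replicates and a proper statistical framework for differential expression.

```python
import math


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Compare two expression profiles given as {gene: read count} dicts.

    Counts are scaled to counts per million (CPM) within each sample, and a
    pseudocount avoids division by zero for genes absent from one sample.
    Positive values indicate enrichment in sample A.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    result = {}
    for gene in sorted(set(counts_a) | set(counts_b)):
        cpm_a = counts_a.get(gene, 0) / total_a * 1e6
        cpm_b = counts_b.get(gene, 0) / total_b * 1e6
        result[gene] = math.log2((cpm_a + pseudocount) / (cpm_b + pseudocount))
    return result


# Hypothetical read counts from the two opposing poles of a syncytial embryo sac.
pole_a = {"gene1": 900, "gene2": 50, "gene3": 50}
pole_b = {"gene1": 50, "gene2": 50, "gene3": 900}

lfc = log2_fold_changes(pole_a, pole_b)
for gene, value in lfc.items():
    print(f"{gene}: {value:+.2f}")
```

Transcripts with strongly positive or negative values would be candidates for pole-specific localization, whereas values near zero indicate an even distribution.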

Another interesting variation of reproductive development is gametophytic apomixis. It refers to the process of asexual reproduction through seeds in the absence of fertilization (reviewed in Koltunow and Grossniklaus, 2003). Apomixis occurs in more than 400 plant species from around 40 genera and is likely of polyphyletic origin (Asker and Jerling, 1992; Carman, 1997). Gametophytic apomixis involves the omission or abortion of meiosis (apomeiosis) and the formation of an embryo from an unfertilized egg (parthenogenesis), while the endosperm can be formed by autonomous development of the central cell or dependent on fertilization (pseudogamy). Depending on the mechanism of the formation of the unreduced megaspore, the resulting offspring can be genetically completely identical to the mother plant without any chromosomal rearrangements. It is thereby possible to fix complex genotypes over multiple generations without a loss in heterozygosity. While gametophytic apomixis is absent in major crop plants, engineered apomictic crops would promise great potential and economic value for plant breeding and agriculture (Koltunow et al., 1995; Vielle-Calzada et al., 1996; Grossniklaus et al., 1998). From a developmental perspective, apomixis can be seen as an alteration of the sexual pathway, where certain processes are initiated too early or in the wrong cell type (Koltunow, 1993; Grossniklaus, 2001). Detailed understanding of the molecular processes and pathways governing gametogenesis during sexual and apomictic reproduction is therefore a precondition to engineer apomixis in crop plants. In evolutionary terms, apomixis is a highly interesting trait. On the one hand, it allows the dispersal of seeds without the need for a sexual partner (Smith, 1978) and may therefore be advantageous for the colonization of new habitats (Tomlinson, 1966). On the other hand, the trade-off for this clonal reproduction appears to be very costly. 
Apomicts may accumulate deleterious mutations over many generations (Muller, 1964) and their populations are likely of low genetic variability, which reduces their potential to adapt to a changing environment. Recent proposals, however, suggest that epigenetic variation may also contribute to adaptive potential, which may explain the ecological success of many apomicts (Hirsch et al., 2012).

Given the natural variation in sexual and apomictic species, the female gametophyte of angiosperms can be seen as an excellent model system to study fundamental developmental processes and evolutionary aspects of plant development and biology that are of high importance to agriculture. Its simple organization and the relatively few developmental stages would allow for an in-depth analysis of various species enabling evolutionary comparisons at the whole-genome level. Given the high diversity, inter-species comparisons may identify genes and genetic networks involved in the emergence of evolutionary novelties, such as the unequal genetic contribution of the two parents to the endosperm or gametophytic apomixis. Deciphering the evolutionary mechanisms underlying these processes may also provide an answer to the long-standing question of how useful research on model organisms is for crop improvement. However, the small size and inaccessibility of the cell types of developing and mature embryo sacs make the isolation and subsequent application of omics methods very difficult. Aside from the challenges associated with data integration and analysis, data generation is hence a major limiting factor. In general, the main obstacle with most approaches is the number of cells required for in-depth profiling of a certain molecule (e.g., protein or metabolite). This may be overcome by either increased sensitivity of the profiling method, or through a simplified collection of a large number of cells. However, most high-throughput isolation methods (e.g., FACS/FANS/INTACT) rely on the existence of a specific marker (i.e., a cell type-specific promoter) and the possibility to generate transgenic plants. In addition, typically a certain abundance of the cell type of interest in the sample is required for efficient sorting and purification. 
Given that these preconditions are generally not met by low-abundance cell types of non-model organisms, it is likely that plant systems biology will profit the most from an increase in sensitivity and the development of novel profiling methods. In the following sections, we will therefore focus on a subset of omics approaches, which are readily available or which bear great future potential for routine large-scale in vivo profiling of specific cell types. The examples given are restricted to studies on specific cell types of the female gametophytes of angiosperms.

## 3.1. Transcriptome

Transcriptomics is clearly the most frequently used and currently the most robust omics approach to study female gametophyte and plant reproductive development. Following the early transcriptional profiling with low-throughput technologies [early Expressed Sequence Tag (EST) sequencing projects, reviewed in Wuest et al. (2013)], cell type-specific transcriptomes were generated for the egg cell, the central cell, the synergids, and the MMC of Arabidopsis (Wuest et al., 2010; Schmidt et al., 2011; Schmid et al., 2012), the egg cell and the synergids for rice (Ohnishi et al., 2011; Anderson et al., 2013), all cell types of the mature embryo sac and the Apomictic Initial Cell (AIC) of Boechera gunnisoniana (a close apomictic relative of Arabidopsis thaliana where an AIC is specified instead of a sexual MMC, Schmidt et al., 2014), and the AIC of Hieracium praealtum (hawkweed, where the AIC is formed by an additional sporophytic cell developing adjacent to the sexual reproductive lineage, Okada et al., 2013; **Table 2**). Given the requirement to establish a specific gene expression profile for cell specification and differentiation, transcriptomics is also especially suitable as a first approach toward an unknown species, because it provides a comprehensive snapshot of the cellular instruction machinery. It further enables the identification of cell type-specific markers and can thus provide a basis for other approaches, like detailed molecular and mechanistic studies. The advantage of transcriptional profiling as compared to proteomic studies is the possibility to amplify the material prior to detection. Several RNA-Seq protocols allow transcriptional profiling of single cells corresponding to as little as about 10 pg of total RNA (reviewed in Head et al., 2014). 
This low detection limit facilitates the use of relatively low-throughput isolation methods, such as LAM or manual microdissection, allowing the profiling of specific cell types of embryo sacs in model and non-model species (Okada et al., 2013; Wuest et al., 2013; Schmidt et al., 2014). A current drawback of the amplification strategy is the introduction of potential quantification biases. A possible solution may be Unique Molecular Identifiers (UMIs). These are short sequences of random nucleotides (e.g., 1024 different UMIs with 5 random nucleotides), which are used to label initial cDNA molecules prior to amplification. An excess of UMIs compared to the number of identical cDNAs ensures that each combination of a given UMI with a certain cDNA is unique. After amplification and sequencing, this can be used to differentiate between individual molecules in the initial cDNA pool and duplicates originating from cDNA amplification (i.e., to count molecules instead of reads, Islam et al., 2014). An interesting approach for future studies may be Fluorescent In Situ RNA SEQuencing (FISSEQ), in which stably cross-linked cDNA amplicons are sequenced directly within a biological sample, thereby not only quantifying gene expression, but also detecting the subcellular localization of the transcripts (Lee et al., 2014). Improvement of this method and its adaptation to plant tissues would thus undoubtedly be a major advance in cell type-specific transcriptional profiling.
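The principle of counting molecules instead of reads can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure of Islam et al. (2014): the function names, gene labels, and UMI sequences are hypothetical, reads are assumed to be already assigned to genes, and sequencing errors within the UMIs are ignored.

```python
from collections import defaultdict


def umi_space(length, alphabet_size=4):
    """Number of distinct UMIs of a given length over the four nucleotides."""
    return alphabet_size ** length


def count_molecules(reads):
    """Count reads and unique molecules per gene.

    `reads` is an iterable of (gene, umi) pairs, one per sequenced read.
    Reads sharing both gene and UMI are treated as amplification duplicates
    of a single initial cDNA molecule.
    """
    read_counts = defaultdict(int)
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        read_counts[gene] += 1
        umis_per_gene[gene].add(umi)
    molecule_counts = {gene: len(umis) for gene, umis in umis_per_gene.items()}
    return dict(read_counts), molecule_counts


# Hypothetical reads: geneA was amplified unevenly, geneB was not.
reads = [
    ("geneA", "ACGTA"), ("geneA", "ACGTA"), ("geneA", "ACGTA"),
    ("geneA", "TTGCA"),
    ("geneB", "ACGTA"), ("geneB", "GGCAT"),
]
read_counts, molecule_counts = count_molecules(reads)

print(umi_space(5))      # five random nucleotides give 4**5 distinct UMIs
print(read_counts)       # read-level counts, inflated by amplification
print(molecule_counts)   # molecule-level counts after collapsing duplicates
```

In this toy data set, geneA appears twice as abundant as geneB at the read level, whereas both genes are represented by two initial molecules once duplicates are collapsed.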

## 3.2. Proteome and Metabolome

Proteomics and metabolomics of specific cell types are substantially more challenging than transcriptomics. A current limitation for cell type-specific proteomics is the large discrepancy between the number of detected proteins and the number of expressed genes, which is due to the low sensitivity of proteomics methods towards low-abundance proteins. Additional complexity arises from the presence of a wide range of post-translational modifications, such as phosphorylation or glycosylation. Apart from two early examples, identifying only the major proteins in the egg cells of maize and rice (6 and 4 proteins, Okamoto et al., 2004; Uchiumi et al., 2007), we are only aware of the recent description of the egg cell proteome in rice, where 2138 proteins were identified using around 500 egg cells (Abiko et al., 2013; **Table 2**). In the same study, 2179 proteins were identified starting from 30,000 isolated sperm cells (**Table 1**; Abiko et al., 2013). Given the further improvements of the sensitivity of mass spectrometers, the example demonstrates that proteomics of purified cells of the female gametophyte should be possible for cases where enough material can be collected. Mechanical or manual isolation of female gametes was reported for a variety of species including barley, wheat, rapeseed, maize, tobacco, Torenia, Alstroemeria, and Arabidopsis (Kranz et al., 1991; Holm et al., 1994; Kovács et al., 1994; Katoh et al., 1997; Tian and Russell, 1997; Sprunck et al., 2005; Hoshino et al., 2006; Okuda et al., 2009; Jullien et al., 2012). In most of these species, we anticipate that the protocols would already allow the isolation of sufficient material for MS-based proteomics. Another promising approach for future experiments may be MSI, circumventing the need for (laborious) cell purification.

## 3.3. Methylome

DNA cytosine methylation (5mC) plays an important role in the epigenetic regulation of plant genomes. While WGBS has not yet been reported for isolated cells of the female gametophyte, bisulfite sequencing of specific sequences has already been applied to Arabidopsis central cells and synergids isolated by LAM (Wöhrmann et al., 2012; You et al., 2012). It would likely be possible to combine LAM or manual microdissection with WGBS. This would allow methylome profiling of gametes in model as well as non-model species. Importantly, this may provide novel insights into the molecular basis underlying heterosis (Groszmann et al., 2011), characterized by superior characteristics of F1 hybrid plants as compared to their parents. While epigenetic regulatory pathways are likely important for heterosis, their precise involvement remains elusive to date (Chen, 2013). Understanding the regulatory mechanisms governing heterosis is of great interest for plant breeding and crop production. Notably, gametophytic development and early stages of embryogenesis are likely important for the establishment of heterosis.

# 4. CONCLUSION AND PERSPECTIVES

To date, cell type-specific systems biology in plants is frequently constrained by the difficulties associated with the isolation of the cell type of interest in large enough amounts. Robust and simple isolation methods exist only for a few cell types. Consequently, the comprehensive profiling of all cell types of an organism with different large-scale profiling methods, allowing the detailed understanding of all biological processes ongoing in the biological system, is still an unreached goal. While the in-depth understanding of complex organisms over their lifespan is a major aim for systems biology, the use of simple model organisms bears advantages, given the persisting technical limitations. We introduce the female gametophyte of angiosperms as an attractive model system for future systems biology approaches in plant development. Apart from its relatively simple organization, it is of great biological and agronomical importance, for example with respect to seed production and plant breeding.

Currently, most high-throughput isolation methods with broader application (e.g., FACS/FANS/INTACT) are limited to model organisms (e.g., Arabidopsis thaliana, Oryza sativa). However, a biological system may be best understood in the context of evolution. In addition, a detailed understanding of the cellular processes in major agriculturally important species, including wheat, where an additional challenge is the genome size and its hexaploid nature, is a precondition for targeted crop improvement. Such studies would thus not only be of potential applied value, but would also help to understand the common concepts and divergent mechanisms active in different species. Therefore, methods facilitating large-scale profiling of specific cell types in model as well as non-model organisms are of crucial importance. Parallel high-throughput profiling of several organisms covering a phenotypic gradient, or including gain, loss, and alternative usage of modular building blocks along the phylogeny, will enable evolutionary systems biology. This approach may ultimately help to reconstruct the emergence of evolutionary novelties and to find the underlying genetic and molecular networks. Such an understanding would in turn allow the control of the underlying processes with an unprecedented resolution. In perspective, this can be an important precondition for targeted improvement of crop species, including the engineering of apomixis into crop plants.

Even though the isolation of individual cell types is currently still very challenging, the rapid technical advances observed over the past few years in, for example, transcriptional profiling, are clear indications for the tremendous improvement of large-scale profiling technologies. In this light, we emphasize methods for transcriptomics, proteomics, metabolomics, and methylomics, in which we see great future potential. However, cell type-specificity and single-cell resolution are just one step towards a more comprehensive view on developmental processes and environmental responses. Clearly, monitoring subcellular localization of molecules and their interactions will be essential to understand certain patterning processes and specific cellular functions. In analogy to the hypothesized distribution of mRNA within the syncytial female gametophyte, subcellular localization of mRNA may also occur within the cell types of the mature female gametophyte to, for example, target the proteins they encode to a specific subcellular region. In this respect, technologies based on high resolution imaging, allowing large-scale profiling without prior cell isolation, for example MSI or FISSEQ, are very promising for future applications. The growing amount of data and data types also points to the need for novel computational solutions addressing the problems of data storage, integration, and analysis (see Ahrens et al., 2007; Yuan et al., 2008; Fukushima et al., 2009; Chuang et al., 2010; Katari et al., 2010; Liberman et al., 2012; Fukushima et al., 2014; Gomez-Cabrero et al., 2014; Robinson et al., 2014). The current situation, in which data sometimes remain unpublished, are frequently poorly annotated, and widely dispersed in specialized databases, may be taken as motivation to develop integrative computational platforms specifically focussing on future data. 
Considering the almost exponential growth of biological data over the last years (Ideker et al., 2001; Chuang et al., 2010), these platforms may also ignore data from the past to allow for innovative solutions. In this context, standardized data formats and annotation, easily accessible databases, powerful data mining tools, user-friendly and freely available software, as well as scalable storage platforms are the current and future demands in systems biology (Chuang et al., 2010; Gomez-Cabrero et al., 2014).

# REFERENCES


Asker, S. E., and Jerling, L. (1992). Apomixis in Plants. London, UK: CRC Press.


# ACKNOWLEDGMENTS

Work on gametophyte development, apomixis, and epigenetic gene regulation in UG's laboratory is supported by the University of Zürich, and by grant C11.0060 from the "Staatssekretariat für Bildung und Forschung" in the framework of COST action FA0903 (to UG and AS), grant 31003A\_141245 from the Swiss National Science Foundation (to UG), and the grant No 250358 (MEDEA) from the European Research Council (to UG).

cell to mature pollen provides evidence for developmental priming. J. Proteome Res. 12, 4892–4903. doi: 10.1021/pr400197p


with sperm development and function specialization. J. Proteome Res. 12, 5058–5071. doi: 10.1021/pr400291p

Zhu, M., Dai, S., McClung, S., Yan, X., and Chen, S. (2009). Functional differentiation of Brassica napus guard cells and mesophyll cells revealed by comparative proteomics. Mol. Cell. Proteomics 8, 752–766. doi: 10.1074/mcp.M800343-MCP200

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Schmid, Schmidt and Grossniklaus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Methods to isolate a large amount of generative cells, sperm cells and vegetative nuclei from tomato pollen for "omics" analysis

*Yunlong Lu<sup>1,2</sup>, Liqin Wei<sup>1</sup> and Tai Wang<sup>1</sup>\**

*<sup>1</sup> Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, China, <sup>2</sup> University of Chinese Academy of Sciences, Beijing, China*

The development of sperm cells (SCs) from microspores involves a set of finely regulated molecular and cellular events and the coordination of these events. The mechanisms underlying these events and their interconnections remain a major challenge. Systems analysis of genome-wide molecular networks and functional modules with high-throughput "omics" approaches is crucial for understanding the mechanisms; however, such studies are hindered by the difficulty in isolating a large amount of cells of different types, especially generative cells (GCs), from the pollen. Here, we optimized the conditions of tomato pollen germination and pollen tube growth to allow for long-term growth of pollen tubes *in vitro* with SCs generated in the tube. Using this culture system, we developed methods for isolating GCs, SCs and vegetative cell nuclei (VN) from just-germinated tomato pollen grains and growing pollen tubes and their purification by Percoll density gradient centrifugation. The purity and viability of isolated GCs and SCs were confirmed by microscopic examination and fluorescein diacetate staining, respectively, and the integrity of VN was confirmed by propidium iodide staining. We could obtain about 1.5 million GCs and 2.0 million SCs each from 180 mg initiated pollen grains, and 10 million VN from 270 mg initiated pollen grains germinated *in vitro* in each experiment. These methods provide the necessary preconditions for systems biology studies of SC development and differentiation in higher plants.

Keywords: *Solanum lycopersicum*, generative cell, sperm cell, vegetative nuclei, isolation, Percoll density gradient centrifugation

#### *Edited by:*

*Sixue Chen, University of Florida, USA*

#### *Reviewed by:*

*David Twell, University of Leicester, UK; Mengmeng Zhu, The Pennsylvania State University, USA*

#### *\*Correspondence:*

*Tai Wang, Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, 20 Nanxincun, Xiangshan, Haidianqu, Beijing 100093, China; twang@ibcas.ac.cn*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 14 April 2015; Accepted: 16 May 2015; Published: 02 June 2015*

#### *Citation:*

*Lu Y, Wei L and Wang T (2015) Methods to isolate a large amount of generative cells, sperm cells and vegetative nuclei from tomato pollen for "omics" analysis. Front. Plant Sci. 6:391. doi: 10.3389/fpls.2015.00391*

# Introduction

During the development of sperm cells (SCs, male gametes) from microspores in higher plants, the microspore generated from diploid microsporocytes via meiosis first undergoes asymmetric mitosis to produce a larger vegetative cell (VC) and a smaller generative cell (GC) embedded in the VC. Thereafter, the VC exits the cell cycle and has the potential to generate a polarly growing pollen tube; the GC undergoes a further mitosis to produce two SCs for double fertilization (McCormick, 1993; Twell et al., 1998; Twell, 2011). Depending on the plant species, GC mitosis occurs before anthesis or in growing pollen tubes; therefore, mature pollen released at anthesis is tricellular in some species such as *Oryza sativa*, *Zea mays*, and *Arabidopsis thaliana* (Berger and Twell, 2011) or bicellular in other species such as *Lilium brownii* and *Solanum lycopersicum*. This development process involves a set of fine-tuned molecular and cellular events and the coordination of these events, such as cell cycle regulation, cell differentiation and fate determination, genome stability, and epigenetic reprogramming. Although genetic studies have functionally identified many important genes involved in plant SC development, such as *DUO1*, *DUO3*, *DAZ1*, and *DAZ2* (Borg et al., 2009, 2011, 2014; Brownfield et al., 2009a,b; Twell, 2011), the mechanisms underlying these events and their interconnections remain a major challenge for plant science. Systematic "omics" studies of the development process are essential for understanding the mechanisms.

"Omics" studies of pollen from several plants including *Arabidopsis* and rice have provided insights into the molecular mechanisms of pollen development (Rutley and Twell, 2015). During postmeiotic development from microspores, pollen expresses a set of specific transcripts; the total number of transcripts expressed decreases, but the proportion of pollen-specific or pollen-preferential transcripts increases (Honys and Twell, 2004; Wang et al., 2008; Wei et al., 2010). The composition and expression profile of miRNAs expressed in developing pollen differ from those in sporophytes, and novel and non-conserved known miRNAs are the main contributors to the difference (Wei et al., 2011). In pollen, small RNAs display cell-specific activity, acting by translational repression in the SC and by cleavage-induced mRNA turnover in the VC (Grant-Downton et al., 2013). Small RNAs from the VC are strongly implicated in gene silencing in SCs (Slotkin et al., 2009; Grant-Downton et al., 2013). These findings indicate reprogramming of gene expression during pollen development and the importance of epigenetic signals in this reprogramming. In addition, proteomics and metabolomics studies have revealed the importance for pollen function of proteins presynthesized during pollen maturation (Holmes-Davis et al., 2005; Dai et al., 2006), and differences in proteomes and metabolic pathways between mature and germinated pollen (Dai et al., 2007; Obermeyer et al., 2013). These studies also revealed many important candidate genes whose functional dissection should further clarify the molecular control of pollen development.

Recent studies have isolated SCs from tricellular pollen of rice and *Arabidopsis* and analyzed the transcriptome of SCs (Borges et al., 2008; Russell et al., 2012). The transcriptome of the SC was significantly different from that of the pollen grain, which is consistent with the SC being only a small part of the pollen grain, which is mainly represented by the VC. SC-preferential transcripts showed a prominent functional skew toward epigenetic regulation, DNA repair, and the cell cycle (Borges et al., 2008; Russell et al., 2012). Small RNA-mediated DNA methylation in SCs is associated with epigenetic inheritance, transposon silencing and paternal imprinting (Borges et al., 2008; Calarco et al., 2012). Further systematic "omics" analysis of molecular programs for SC development from its precursors, the GC and microspore, is essential to understand the mechanism of SC development. To achieve this goal, we need to establish conditions to isolate GCs and SCs from the pollen of a species. Because the GC exists only for a short time window *in vivo* and develops asynchronously in different flowers in rice and *Arabidopsis*, isolating a large amount of GCs at high purity from developing pollen of these species for "omics" analysis is difficult.

Tomato is another model plant for studying pollen development (Twell et al., 1990, 1991; Muschietti et al., 1994; Filichkin et al., 2004) and can be an excellent model to achieve the above target because (1) its genome has been sequenced (Sato et al., 2012) and (2) its mature pollen is bicellular. This feature of the pollen makes it possible to isolate GCs from pollen grains or just-germinated pollen grains (JGPGs) and to isolate SCs from pollen tubes in which SCs have formed from GCs via mitosis.

In this study, we optimized the conditions of pollen germination and pollen tube growth to allow for long-term growth of pollen tubes *in vitro* with SCs generated in the tube. Using this culture system, we developed efficient protocols to isolate a large amount of GCs, SCs, and vegetative cell nuclei (VN) at high purity to satisfy the demands of "omics" study.

# Materials and Methods

#### Plant Growth and Pollen Collection

Tomato (*S. lycopersicum*) plants (Heinz 1706) were grown in the greenhouse under long-day conditions (14 h light/10 h dark) at 25–35°C. During anthesis, anthers from opened flowers were collected and dried for 10 h at 28°C in an electrothermal drying closet, then placed into a colander (85 mm × 50 mm) with a mesh of 63-μm pore size; mature pollen was released and collected by shaking the colander vigorously. Pollen grains were used immediately or stored in a 1.5 mL tube with 5–10 particles of Silica gel Rubin (Sigma, 85815) at −20°C.

#### Pollen Germination *In Vitro* and Morphologic Observation

Mature pollen grains (60 mg) were pre-hydrated in a Petri dish (60 mm × 15 mm), which was covered with gauze and then placed in a large Petri dish (150 mm × 25 mm) with 50 mL saturated Na2HPO4 at 25°C for 4–8 h. This arrangement allowed only the gauze to contact the solution and kept the pollen grains from directly contacting either the gauze or the solution. Hydrated pollen grains were incubated in 100 mL germination medium (20 mM MES, 3 mM Ca(NO3)2, 1 mM KCl, 0.8 mM MgSO4, 1.6 mM boric acid, 24% PEG 4000, 2.5% sucrose, pH 6.0; osmotic pressure, 1253.33 ± 2.33 mOsmol/kg H2O) in a Petri dish (150 mm × 25 mm) at 25°C in the dark with shaking at 90 rpm (Tang et al., 2002; Zhao et al., 2013). During germination, 1 mL of medium was taken out at regular intervals and centrifuged to collect germinating pollen grains, which were then transferred to 1 mL Carnoy's fluid (three parts absolute ethyl alcohol, one part acetic acid) and treated for 30 min. Thereafter, the treated pollen grains or tubes were stained with 4′,6-diamidino-2-phenylindole (DAPI; Molecular Probes) and observed under a microscope (Zeiss Axio Imager A1).
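The germination medium recipe can be turned into weighed amounts with simple molarity arithmetic. The sketch below is only illustrative: the salt forms and their molar masses (MES free acid, calcium nitrate tetrahydrate, magnesium sulfate heptahydrate) are assumptions, since the text does not say which forms were used, and the percentages are assumed to be weight/volume.

```python
# Sketch: grams of each component for a chosen volume of germination medium.
# ASSUMPTIONS (not given in the text): the salt forms and molar masses below,
# and that percentages mean weight/volume (g per 100 mL).

MOLAR_MASS = {  # g/mol, for the assumed salt forms
    "MES (free acid)": 195.24,
    "Ca(NO3)2 tetrahydrate": 236.15,
    "KCl": 74.55,
    "MgSO4 heptahydrate": 246.47,
    "boric acid": 61.83,
}

MILLIMOLAR = {  # concentrations taken from the recipe in the text
    "MES (free acid)": 20.0,
    "Ca(NO3)2 tetrahydrate": 3.0,
    "KCl": 1.0,
    "MgSO4 heptahydrate": 0.8,
    "boric acid": 1.6,
}

PERCENT_W_V = {"PEG 4000": 24.0, "sucrose": 2.5}


def grams_needed(volume_ml):
    """Return {component: grams} for volume_ml of medium."""
    grams = {}
    for name, millimolar in MILLIMOLAR.items():
        moles = millimolar / 1000.0 * volume_ml / 1000.0
        grams[name] = moles * MOLAR_MASS[name]
    for name, percent in PERCENT_W_V.items():
        grams[name] = percent * volume_ml / 100.0
    return grams


if __name__ == "__main__":
    for name, g in grams_needed(100).items():
        print(f"{name}: {g:.4f} g")
```

If a different hydrate or salt form is used, only the `MOLAR_MASS` table needs to change; the pH adjustment to 6.0 is not modeled here.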

#### Isolation of GCs

An improved two-step osmotic shock was used to release GCs from JGPGs, and all procedures were performed at room temperature (Zhao et al., 2013). Two aliquots of pollen grains, 90 mg each, were germinated in 100 mL germination medium as described above for ∼20 min until the pollen tubes emerged but were shorter than the diameter of the grains. JGPGs were harvested through a Büchner funnel (100 mm in diameter) with 11-μm hydrated nylon mesh (Millipore, NY1100010) with the help of an aspirator pump, then rinsed with osmotic shock solution (15.3% sucrose, 1% bovine serum albumin [BSA], 531.33 ± 3.84 mOsmol/kg H2O) to remove germination medium, which would affect the result of osmotic shock. The collected JGPGs were immediately transferred to 80 mL fresh osmotic shock solution and incubated for 10 min to burst tubes and release GCs. Cell debris was removed by sieving the mixture through a hydrated 11-μm nylon mesh. The filtrate containing GCs was divided equally between two centrifugation tubes and centrifuged at 850 *g* for 4 min to collect GCs. To avoid loss of GCs, we retained 10 mL of the supernatant in each tube after centrifugation, then added 10 mL of isolation buffer 1 (IB1; 20 mM MES-KOH, 20 mM NaCl, 10 mM EDTANa2, 1 mM spermidine, 0.3 mM spermine, 2 mM DTT, 18% sucrose and 1% BSA, pH 6.0) to suspend GCs. The suspension was supplemented with a stock solution of cellulase "Onozuka" R-10 (Yakult) and macerozyme R-10 (Yakult; 0.4% each in isolation buffer 2 [IB2; 10 mM MES-KOH, 10 mM NaCl, 5 mM EDTANa2, 0.5 mM spermidine, 0.15 mM spermine, 1 mM DTT, 18% sucrose and 1% BSA, pH 6.0]) to a final concentration of 0.04% for each enzyme, mixed gently and incubated for 15 min without shaking, and centrifuged at 850 *g* for 3 min to collect GCs, followed by one wash with IB2. Collected GCs were further purified on a 23/32% Percoll density gradient (2 mL 23% Percoll and 3 mL 32% Percoll in IB2) by horizontal centrifugation at 1000 *g* for 40 min. After centrifugation, GCs partitioned at the interface of 23% and 32% Percoll were collected with a glass pipette and washed twice with 3 × volume IB2 followed by centrifugation at 950 *g* for 3 min each. The viability of isolated GCs was examined by fluorescein diacetate (FDA) staining. The purified GCs were snap-frozen in liquid nitrogen and stored at −80°C.
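The buffer additions in this protocol follow ordinary mixing arithmetic, which can be sketched as below. Two points are assumptions rather than statements from the text: the retained 10 mL of supernatant is treated as plain osmotic shock solution (15.3% sucrose, 1% BSA, no buffer salts), and volumes are treated as additive. The volume of enzyme stock is not given in the text, so the value computed here is only what the stated stock (0.4%) and final (0.04%) concentrations would imply for a 20 mL suspension.

```python
# Sketch of the dilution arithmetic behind the GC isolation steps.
# ASSUMPTIONS: supernatant composition equals the osmotic shock solution,
# and volumes are additive on mixing.

def mix(c1, v1, c2, v2):
    """Concentration after mixing volume v1 at c1 with volume v2 at c2."""
    return (c1 * v1 + c2 * v2) / (v1 + v2)


def stock_volume(c_stock, c_final, v_sample):
    """Volume of stock to add to v_sample so that the mixture reaches c_final."""
    if not 0 <= c_final < c_stock:
        raise ValueError("final concentration must be below the stock concentration")
    return c_final * v_sample / (c_stock - c_final)


IB1 = {
    "MES-KOH (mM)": 20, "NaCl (mM)": 20, "EDTANa2 (mM)": 10,
    "spermidine (mM)": 1, "spermine (mM)": 0.3, "DTT (mM)": 2,
    "sucrose (%)": 18, "BSA (%)": 1,
}
SUPERNATANT = {"sucrose (%)": 15.3, "BSA (%)": 1}  # assumed composition

# 10 mL of IB1 added to 10 mL of retained supernatant
after_ib1 = {k: mix(c, 10, SUPERNATANT.get(k, 0), 10) for k, c in IB1.items()}

# 0.4% enzyme stock brought to 0.04% in the 20 mL suspension
enzyme_stock_ml = stock_volume(0.4, 0.04, 20)

if __name__ == "__main__":
    for name, value in after_ib1.items():
        print(f"{name}: {value:g}")
    print(f"enzyme stock to add: {enzyme_stock_ml:.2f} mL")
```

Under these assumptions, the 1:1 mixture has the same buffer-salt concentrations as listed for IB2 (for example, 10 mM MES-KOH), with sucrose slightly below 18%.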

#### Isolation of SCs

Sperm cells were isolated at room temperature as described by Xu et al. (2002) with modifications. In brief, three aliquots of pollen grains, 60 mg each, were cultured in 100 mL germination medium as described above for 10 h. After germination, the medium was removed with a hydrated 100-μm nylon mesh (Millipore, NY1H00010), and pollen tubes were washed with osmotic shock solution as for isolation of GCs, immediately transferred to low-osmotic enzyme solution (0.4% cellulase "Onozuka" R-10 and 0.2% macerozyme R-10 in osmotic shock solution), and incubated for 5 min to release SCs. Cell debris and ungerminated pollen grains were removed with a hydrated 11-μm nylon mesh, and SCs in the filtrate were collected and washed as described for the isolation of GCs. Thereafter, collected SCs were purified on 5 mL of 23% Percoll in IB2 by horizontal centrifugation at 1000 *g* for 30 min. SCs were enriched at the upper surface of the 23% Percoll layer, harvested with a glass pipette, and washed twice with 3 × volume IB2 followed by centrifugation at 950 *g* for 3 min each. The viability of isolated SCs was examined by FDA staining.

#### Isolation of VN

All operations were performed on ice or at 4°C unless otherwise specified, and all solutions were pre-cooled on ice or at 4°C. Three aliquots of pollen grains, 90 mg each, were cultured in 100 mL germination medium as described above for 1.5 h. Pollen tubes were collected with a 20-μm hydrated nylon mesh (Millipore, NY2009000) at room temperature, rinsed with wash buffer (10 mM MOPS-NaOH, 2.5% sucrose, 9.5% mannitol, 5 mM EDTANa2, 1% BSA, pH 7.2), and treated with 12 mL enzyme solution (0.5% cellulase "Onozuka" R-10, and 0.3% macerozyme R-10 in wash buffer) for 5 min to release VN. After removal of ungerminated pollen grains and cell debris with a 20-μm hydrated nylon mesh, the filtrate containing VN was divided into four equal parts, each loaded onto the surface of 3 mL of 10% Percoll in wash buffer, then centrifuged at 1500 *g* for 30 min. VN at the upper surface of the Percoll layer were collected with a glass pipette, snap-frozen in liquid nitrogen, then stored at −80°C.
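For side-by-side comparison, the key parameters of the three isolation protocols can be collected in a small data structure. This is a reader's summary sketch, not part of the published method: the class and field names are hypothetical, the GC culture time is the approximate "∼20 min" from the text, and only values stated in the Methods are filled in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsolationProtocol:
    """Hypothetical container for the parameters stated in the Methods."""
    target: str
    aliquots: int
    pollen_mg_per_aliquot: float
    culture_time_h: float
    harvest_mesh_um: int
    percoll_layers_percent: tuple
    gradient_g: int
    gradient_min: int
    temperature: str

    @property
    def total_pollen_mg(self):
        return self.aliquots * self.pollen_mg_per_aliquot


PROTOCOLS = [
    IsolationProtocol("GC", 2, 90, 20 / 60, 11, (23, 32), 1000, 40, "room temperature"),
    IsolationProtocol("SC", 3, 60, 10.0, 100, (23,), 1000, 30, "room temperature"),
    IsolationProtocol("VN", 3, 90, 1.5, 20, (10,), 1500, 30, "on ice or at 4 °C"),
]

if __name__ == "__main__":
    for p in PROTOCOLS:
        layers = "/".join(str(x) for x in p.percoll_layers_percent)
        print(f"{p.target}: {p.total_pollen_mg:g} mg pollen, "
              f"{p.culture_time_h:.2f} h culture, {layers}% Percoll, "
              f"{p.gradient_g} g for {p.gradient_min} min, {p.temperature}")
```

The computed totals (180, 180, and 270 mg) match the starting amounts quoted with the reported yields.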

# Results

#### Dynamics of GCs and SCs During Culture *In Vitro*

Our experiments showed that tomato pollen grains stored at low temperature (−20°C) germinated *in vitro* at a low rate without prehydration (Supplementary Table S1). To guarantee a high proportion of synchronously germinated tomato pollen grains after low-temperature storage and long-term growth of pollen tubes to allow generation of SCs in the tube *in vitro*, we optimized the pre-hydration conditions for the stored pollen grains and the culture conditions for pollen tubes. We prehydrated low-temperature-stored tomato pollen grains using a saturated solution of Na2HPO4 to rescue the germination activity. Prehydration with the saturated solution for 4–8 h significantly increased the germination rate of the pollen grains (Supplementary Table S1). Our culture conditions allowed for *in vitro* growth of pollen tubes for >10 h (**Figure 1**).

To determine the suitable time of pollen tube growth for isolating GCs, SCs, and VN, we examined their dynamics during pollen germination and tube growth by DAPI staining. A bulge appeared at the germination aperture of hydrated pollen grains on culture for 20 min, and the bulge emerged as a morphologically visible pollen tube with a length shorter than or equal to the diameter of the pollen grain during 30-min culture (**Figure 1**). With increased culture time, GCs and VN began to move into the tube at 1 h and completely entered the tube at 1.5 h. We used DAPI staining to determine the movement order of VN and GCs (**Figure 2**). Among 126 surveyed pollen grains, for 68, VN entered the tube first, and for 58, GCs entered first. Therefore, during tomato pollen tube growth, VN and GCs may move into the tube in a random order. Furthermore, for GCs, 77.8% completed mitosis to generate SCs at 8 h, 84.2% at 9 h, and 92.4% at 10 h (Supplementary Table S2).
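The text reports the counts (68 versus 58 of 126) but no statistical test. One way to check whether such a split is compatible with a random entry order is an exact two-sided binomial test against a probability of 0.5; the sketch below uses only the standard library. The choice of test is an assumption made here for illustration, and a non-significant result is consistent with, but does not prove, a random order.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null hypothesis.
    """
    probs = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
             for k in range(trials + 1)]
    observed = probs[successes]
    # Small relative tolerance so that equally likely outcomes are included.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# 68 of 126 pollen grains showed VN entering the tube first.
p_value = binomial_two_sided_p(68, 126)

if __name__ == "__main__":
    print(f"two-sided p-value: {p_value:.3f}")
```

With these counts the p-value is well above 0.05, so the data give no evidence against equal probabilities for the two orders.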

FIGURE 1 | The dynamics of generative cells (GCs) and sperm cells (SCs) during pollen germination and tube growth. Pollen cultured for different times was observed by differential interference contrast (DIC) microscopy (upper panel) or after 4′,6-diamidino-2-phenylindole (DAPI) staining (middle panel). Merged images are shown in the bottom panel. Scale bar: 20 μm.

#### Release and Purification of GCs

Tomato pollen grains could not be burst directly by osmotic shock and also could not germinate in a sucrose solution alone (data not shown). So, we developed a modified two-step method. We incubated prehydrated tomato pollen grains in germination medium for 20 min, when a bulge appeared at the aperture (**Figures 1** and **3A**), then osmotically shocked the JGPGs, which were sensitive to the low-osmotic shock (**Figure 3B**).

When JGPGs were transferred into low-osmotic solution, the tube burst, and GCs, along with the cytoplasm, were emitted (**Figures 3C,D**). The just-released GCs underwent a change from spindle- to oval-shaped (**Figures 3C,D**). This change may be associated with the microtubule cytoskeleton, which was dynamic in response to environmental conditions and is important to determine the shape of GCs (Zhou et al., 1990). When incubated in the low-osmotic solution for 10 min, 62.9% (680/1082) of pollen tubes burst (**Figure 3B**). The released GCs were intact in the low-osmotic solution up to 1 h, but most GCs appeared to break after 1 h (data not shown). To maintain GC integrity, we added IB1 into the filtrate containing GCs to neutralize the low osmotic shock in a short time (<1 h). The suspension was used for purifying GCs.

Furthermore, we treated the suspension with cellulase and macerozyme at a low concentration, which had no effect on viability of GCs but increased the efficiency of removing cell debris at subsequent gradient centrifugation. GCs were enriched at the interface of the 23% and 32% Percoll layers, with cell debris on the upper surface of the 23% Percoll layer. The isolated GCs were viable on FDA staining (**Figures 3E,F**) and had no VN contamination, as confirmed by propidium iodide (PI) staining (viable GCs cannot be stained by PI; Supplementary Figures S1A,B). Finally, we obtained about 1.5 million GCs from 180 mg of initiated mature pollen grains (about 18 million grains, given that 0.1 million tomato pollen grains weigh about 1 mg).

#### Release and Purification of SCs


Successful isolation of SCs from *in vitro*-cultured pollen tubes depends on the formation of SCs in the growing tubes. We isolated SCs from 10-h-cultured pollen tubes, in which GCs had completed mitosis to generate SCs (see above, **Figure 4A**; Supplementary Table S2). To decrease the possible contamination of GCs, we collected long pollen tubes using a large pore-size nylon mesh (100 μm), which allowed ungerminated pollen grains and shorter pollen tubes to pass through. We found that osmotic shock alone did not burst the long tubes efficiently (data not shown), and a modified low osmotic solution with cellulase and macerozyme was efficient to burst the tube (**Figure 4B**). After removal of cell debris, SCs in filtrate could be enriched with a layer of 23% Percoll. Finally, we obtained about 2 million viable SCs at high purity from 180 mg initiated pollen grains (**Figures 4C,D**), with no VN contamination (Supplementary Figures S1C,D).

#### Release and Purification of VN

Our results showed that VN was fragile and was disrupted quickly when released into medium at room temperature. Repeated pipetting also led to its complete disruption (Supplementary Figures S2A,B,C). Therefore, no VN contamination was present in isolated GCs and SCs (see above). We solved the bottleneck of VN isolation by (1) keeping all operations at 4°C or on ice, (2) avoiding pipetting as much as possible, and (3) using 1.5-h-cultured pollen tubes in which VN had moved into the tube (**Figures 1** and **5A,B**), for easier release of VN (**Figures 5C,D**). Furthermore, additional washing as well as passing through Percoll on gradient centrifugation disrupted VN (Supplementary Figures S2A,D), so we used only a hydrated nylon net filter to remove pollen grains and cell debris and then enriched VN by using a layer of 10% Percoll. These measures allowed for isolation of VN without GC contamination (**Figures 5E,F**). Using this protocol, we obtained 10 million VN from 270 mg initiated pollen grains.
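The reported yields can be put on a common scale by relating them to the number of pollen grains used, taking the stated conversion of about 0.1 million grains per mg. The sketch below is a back-of-the-envelope calculation, not a figure from the paper: it assumes every grain could in principle contribute one GC, one VN, or two SCs, so the fractions are recoveries relative to that theoretical maximum.

```python
# Sketch: recovery relative to the theoretical maximum number of cells.
# ASSUMPTION: each pollen grain carries one GC and one VN, and gives two SCs
# after GC mitosis; germination and mitosis are treated as complete.

GRAINS_PER_MG = 100_000  # "0.1 million tomato pollen grains is about 1 mg"


def recovery(cells_obtained, pollen_mg, cells_per_grain):
    """Fraction of the theoretically available cells that was recovered."""
    grains = pollen_mg * GRAINS_PER_MG
    return cells_obtained / (grains * cells_per_grain)


gc_recovery = recovery(1.5e6, 180, 1)  # generative cells
sc_recovery = recovery(2.0e6, 180, 2)  # sperm cells, two per grain
vn_recovery = recovery(10e6, 270, 1)   # vegetative nuclei

if __name__ == "__main__":
    for name, value in [("GC", gc_recovery), ("SC", sc_recovery), ("VN", vn_recovery)]:
        print(f"{name}: {value:.1%}")
```

Because incomplete germination and tube bursting lower the number of cells that are actually releasable, the true recovery of released cells would be higher than these fractions.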

# Discussion

We have optimized the conditions allowing for growth of pollen tubes for more than 10 h and generation of SCs in tubes, as well as conditions affecting rupture of pollen grains (tubes) and release of cytoplasm, GCs, SCs and VN into medium. Finally, we developed methods to isolate GCs, SCs, and VN from JGPGs and 1.5-h– and 10-h–cultured pollen tubes, respectively (**Figure 6**). These methods allowed for isolating large amounts of GCs, SCs, and VN at high purity.

#### Culture Conditions for Low-Temperature–Stored Pollen Grains and Long-Term–Cultured Pollen Tubes

A previous study established conditions for *in vitro* germination of fresh tomato pollen grains (Tang et al., 2002), but under those conditions, low-temperature–stored tomato pollen grains did not germinate well (Supplementary Table S1). Generally, pre-hydration is required for rescuing the viability of low-temperature-stored pollen grains, such as those from *Rosa*, *Pistacia vera* L., *Gladiolus* sp. and *Brassica rapa* (Visser et al., 1977; Golan-Goldhirsh et al., 1991; Rajasekharan et al., 1994; Sato et al., 1998). Prehydration is usually performed with water or a saturated salt solution, which generates a fixed relative humidity in a chamber at a given temperature. We found that a saturated solution of Na2HPO4 was suitable for prehydrating tomato pollen grains. This saturated solution was previously used to prehydrate *B. rapa* pollen (Sato et al., 1998) and can produce 95% relative humidity in a chamber at 25°C; tomato pollen grains rescued under this condition germinated synchronously (**Figure 3A**), which was important for synchronous pollen tube growth and GC division. Pollen density also was an important factor affecting germination in tomato, and increased density led to increased germination percentage (Supplementary Figure S3), which agrees with observations in other species such as *Arabidopsis* (Boavida and McCormick, 2007), *Nicotiana*, *B. oleracea*, and *Betula pendula* (Roberts et al., 1983; Jahnen et al., 1989; Pasonen and Käpylä, 1998; Chen et al., 2000). Furthermore, we evaluated the effect of the amount of pollen grains loaded in a given volume of germination medium on the integrity of pollen tubes during long-term culture (>3 h) by examining cell debris in the medium (Supplementary Figure S4). Cell debris was barely observed with ≤4 mg pollen grains (Supplementary Figures S4A–D) but was substantial with 5 mg pollen grains (Supplementary Figure S4E). Thus, the amount of pollen grains in a given volume of germination medium is crucial to the integrity of long-term–cultured pollen tubes, and 3 mg pollen grains loaded in 5 mL germination medium was appropriate for long-term culture of pollen tubes.
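The recommended loading of 3 mg pollen per 5 mL medium can be scaled to other culture volumes by simple proportion, as sketched below. The function names are hypothetical, and linear scaling with volume is an assumption (dish geometry and shaking may also matter).

```python
# Sketch: scale the recommended pollen loading to another culture volume.
# ASSUMPTION: the suitable loading scales linearly with medium volume.

RECOMMENDED_MG = 3.0  # mg pollen grains ...
RECOMMENDED_ML = 5.0  # ... per this volume of germination medium


def pollen_for_volume(volume_ml):
    """Pollen mass (mg) giving the recommended density in volume_ml of medium."""
    return RECOMMENDED_MG * volume_ml / RECOMMENDED_ML


def density_mg_per_ml(pollen_mg, volume_ml):
    """Loading density in mg pollen per mL medium."""
    return pollen_mg / volume_ml


if __name__ == "__main__":
    print(f"100 mL culture: {pollen_for_volume(100):g} mg pollen")
    print(f"density: {density_mg_per_ml(RECOMMENDED_MG, RECOMMENDED_ML):g} mg/mL")
```

For a 100 mL culture this gives 60 mg, which matches the 60 mg aliquots used for the 10 h culture in the SC protocol if each aliquot is read as having its own 100 mL of medium; the shorter GC and VN cultures use 90 mg aliquots.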

#### Methods to Release GCs, SCs, and VN

Four major methods have been used previously to break pollen grains or tubes: mechanical grinding, one-step osmotic shock, two-step osmotic shock, and enzyme digestion (Russell, 1986, 1991; Zhou, 1988; Russell et al., 1990; Theunis et al., 1991; Chaboud and Perez, 1992; Xu et al., 2002; Borges et al., 2008; Zhao et al., 2013). Mechanical grinding had relatively low efficiency for breaking tomato pollen grains and produced a large amount of debris, which interfered with further purification. One-step or two-step osmotic shock is usually used to break pollen grains (tubes) and release target cells. The former breaks pollen grains and releases target cells simultaneously using a low-osmotic solution (Russell, 1986). The latter first germinates the grains in a sucrose solution, then releases target cells by low-osmotic shock with a diluted sucrose solution (Zhou, 1988; Wu and Zhou, 1991). Tomato pollen grains were insensitive to osmotic shock and did not burst even in water, similar to pollen grains from *Vicia faba*, *B. napus*, and *L. davidii* var. *unicolor* (Zhou, 1988; Taylor et al., 1991; Zhao et al., 2013), which suggests that the mechanism of pollen grain bursting differs among species and that methods need to be tailored to each species.

We chose the two-step osmotic shock to release GCs. The components of the osmotic shock solution affected the appearance of the released cytoplasm, which in turn affected the subsequent purification of GCs. The released cytoplasm did not appear to conglutinate when the solution had only 15.3% sucrose (Supplementary Figure S5A) but did with shock solution containing MES or MOPS regardless of concentration (Supplementary Figures S5B–I). Acidic pH could aggravate this clumping (Supplementary Figures S5D,J,L). Cytoplasm clumping was also observed under a high concentration of CaCl2 in a previous study isolating SCs from pollen tubes of *N. tabacum* (Tian and Russell, 1997). We therefore chose sucrose and BSA as the components of the osmotic shock solution. BSA was used in all solutions except the germination medium to protect GCs against damage, because we found that GCs could remain intact in the shock solution with BSA for up to 1 h but for only a few minutes in a shock solution without BSA; in contrast, GCs from several species such as *V. faba* could remain intact in simple sucrose shock solution until being purified (Zhou, 1988).

However, pollen tubes cultured *in vitro* for 10 h, which we used to isolate SCs, were less sensitive to the osmotic shock described above than pollen tubes cultured for a short time. So we developed a modified shock solution containing cellulase and macerozyme. The two enzymes are usually applied to digest hemicelluloses, cellulose and pectin, the main components of the pollen tube wall (Taylor and Hepler, 1997; Cheung and Wu, 2008; Rounds and Bezanilla, 2013). This modified shock solution broke the long pollen tubes and released SCs efficiently (**Figure 4B**). Why pollen tubes differ in sensitivity to osmotic shock after long- and short-term culture needs further study.

In contrast to reports on GCs and SCs, reports of VN isolation are limited (Wever and Takats, 1971; LaFountain and Mascarenhas, 1972; Ueda and Tanaka, 1994; Borges et al., 2012; Calarco et al., 2012). These reports did not describe the stability of isolated VN or the effect of environmental conditions on that stability. We found that VN was fragile *in vitro* (Supplementary Figure S2) and easily ruptured at room temperature but was relatively stable at 4°C. However, at low temperature, pollen tubes were not sensitive to osmotic shock. In this situation, enzyme digestion was efficient in breaking pollen tubes and releasing VN but depended on a suitable buffer. We found that pH was a crucial property of the buffer: it affected the appearance of the released cytoplasm, with conglutination appearing at acidic pH but not at alkaline pH (Supplementary Figure S6).

#### Measures to Guarantee the Purity of GCs, SCs, and VN

Previous reports mainly described isolation of SCs from tricellular pollen or GCs from bicellular pollen (Russell, 1986; Dupuis et al., 1987; Zhou, 1988; Southworth and Knox, 1989; Yang and Zhou, 1989; Xu et al., 2002; Engel et al., 2003), or along with VN (Borges et al., 2012). For these cases, possible contaminants were VN for isolated SCs or GCs and vice versa. Such purity evaluation was relatively simple. Most previous works evaluated the purity of isolated SCs or GCs based on their morphologic features observed on microscopy (Zhou, 1988; Southworth and Knox, 1989; Yang and Zhou, 1989; Xu et al., 2002; Engel et al., 2003). Only the study of *Arabidopsis* used SC- and VN-specific markers to estimate the purity of isolated SCs and VN, because such markers are available for this species (Borges et al., 2008, 2012).

An evaluation of the purity of isolated GCs, SCs, or VN from a species was relatively lacking in early studies. We found that a combination of PI staining and differential interference contrast (DIC) microscopy was efficient for evaluating VN contamination of GCs or SCs and vice versa. VN was not observed on DIC microscopy but was detectable with PI staining. In contrast, viable GCs and SCs were easily observed on DIC microscopy but undetectable with PI staining. Using the combination, we did not find VN contamination in isolated GCs and SCs or GC and SC contamination in isolated VN (Supplementary Figure S1, **Figures 5E,F**). We considered several measures to eliminate this contamination in designing methods to isolate GCs, SCs, and VN. (1) GCs or SCs were stable and isolated at room temperature, but at this temperature, VN was fragile and broke quickly. (2) GCs, SCs and VN had different densities on Percoll gradients and could be enriched with different gradient compositions. These measures could eliminate VN contamination in isolated GCs and SCs and vice versa (Supplementary Figure S1, **Figures 5E,F**).

A major methodological challenge was how to obtain SCs at high purity. Here, besides using highly synchronous pollen tubes, of which 92.4% generated SCs (Supplementary Table S2), and the measures above, we collected long pollen tubes by using a large-pore mesh (100 μm in pore diameter), which let ungerminated pollen grains (diameter of about 20 μm) and short pollen tubes pass through. Thus, the methods could yield SCs at high purity. However, tools to distinguish GCs from SCs are lacking because of their similar morphology under the light microscope in tomato and most other species and the lack of molecular markers.
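The link between the fraction of GCs that have divided and the expected share of SCs among released germ cells can be made explicit with a short calculation. This is an illustrative estimate, not a measurement from the paper: it assumes that each divided GC yields two SCs and that GCs and SCs are released and recovered with equal efficiency; selecting long tubes with the 100-μm mesh could raise the share further.

```python
# Sketch: expected share of SCs among germ cells (SCs plus undivided GCs).
# ASSUMPTIONS: two SCs per divided GC; equal release and recovery of both types.

def expected_sc_fraction(mitosis_completed):
    """mitosis_completed: fraction of GCs that have divided (0 to 1)."""
    sperm_cells = 2 * mitosis_completed
    generative_cells = 1 - mitosis_completed
    return sperm_cells / (sperm_cells + generative_cells)


# Fractions of GCs that had completed mitosis at 8, 9 and 10 h (from the Results).
COMPLETED = {8: 0.778, 9: 0.842, 10: 0.924}
SC_FRACTION = {hours: expected_sc_fraction(f) for hours, f in COMPLETED.items()}

if __name__ == "__main__":
    for hours, fraction in SC_FRACTION.items():
        print(f"{hours} h: {fraction:.1%} of germ cells expected to be SCs")
```

Under these assumptions, about 96% of germ cells at 10 h would be SCs, which illustrates why the longest culture time was the natural choice for SC isolation.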

# Conclusion

Saturated Na2HPO4 solution was suitable for pre-hydration of low-temperature–stored tomato pollen grains, and the prehydrated pollen grains germinated synchronously. A loading of 0.6 mg pollen grains per mL allowed pollen tubes to grow for more than 10 h, and more than 92% of GCs completed mitosis to generate SCs. GCs or SCs were stable and could be isolated at room temperature, whereas at the same temperature, VN was fragile and broke quickly *in vitro*. GCs, SCs, and VN had different densities on Percoll gradients and could be enriched with different gradient compositions. Thus, we have established methods to isolate GCs and VN from just-germinated pollen grains and 1.5-h–cultured pollen tubes, respectively, and SCs from 10-h–cultured pollen tubes. Using these methods, we could obtain 1.5 million GCs and 2 million SCs each from 180 mg initiated pollen grains, and 10 million VN from 270 mg initiated pollen grains, a higher yield than in previous reports for other species.

# Acknowledgments

We thank Xin Zhao for help in developing these methods, and Fan Yang, Bo Yu, Yunyun Song, Yuxia Liu for help in collecting tomato pollen grains. This work was supported by the Chinese Ministry of Science and Technology (grant no. 2012CB910504).

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00391/abstract

# References


germinated pollen in *Oryza sativa*. *BMC Genomics* 11:338. doi: 10.1186/1471-2164-11-338


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Lu, Wei and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Re-analysis of RNA-seq transcriptome data reveals new aspects of gene activity in Arabidopsis root hairs

#### Wenfeng Li<sup>1,2</sup> and Ping Lan<sup>2</sup>\*

*<sup>1</sup> Collaborative Innovation Center of Sustainable Forestry in Southern China of Jiangsu Province, College of Biology and the Environment, Nanjing Forestry University, Nanjing, China, <sup>2</sup> State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing, China*

#### Edited by:

*Marc Libault, University of Oklahoma, USA*

#### Reviewed by:

*Jedrzej Jakub Szymanski, Weizmann Institute of Science, Israel; Chuang Ma, Northwest Agricultural and Forestry University, China*

#### \*Correspondence:

*Ping Lan, Institute of Soil Science, Chinese Academy of Sciences, 71# East Beijing Road, Nanjing 210008, China; plan@issas.ac.cn*

#### Specialty section:

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> Received: *31 January 2015* Accepted: *25 May 2015* Published: *08 June 2015*

#### Citation:

*Li W and Lan P (2015) Re-analysis of RNA-seq transcriptome data reveals new aspects of gene activity in Arabidopsis root hairs. Front. Plant Sci. 6:421. doi: 10.3389/fpls.2015.00421*

Root hairs, tubular outgrowths from root epidermal cells, play important roles in the acquisition of nutrients and water, in interactions with microbes, and in plant anchorage. As a specialized cell type, root hairs, especially in Arabidopsis, provide a pragmatic research system for many kinds of studies. Here, we re-analyzed the RNA-seq transcriptome profile of Arabidopsis root hair cells with the Tophat software and used the Cufflinks program to mine the differentially expressed genes. Results showed that *ERD14*, *RIN4*, and *AT5G64401* were among the most abundant genes in the root hair cells, while *ATGSTU2*, *AT5G54940*, and *AT4G30530* were highly expressed in non-root hair tissues. In total, 5409 genes, with a fold change greater than two (FDR adjusted *P* < 0.05), showed differential expression between root hair cells and non-root hair tissues. Of these, 61 were expressed only in root hair cells. One hundred and thirty-six of the 5409 genes have been reported to be "core" root epidermal genes, which could be grouped into nine clusters according to expression patterns. Gene ontology (GO) analysis of the 5409 genes showed that the processes "response to salt stress," "ribosome biogenesis," "protein phosphorylation," and "response to water deprivation" were enriched, whereas only the process "intracellular signal transduction" was enriched in the subset of 61 genes expressed only in the root hair cells. One hundred and twenty-one unannotated transcripts were identified, 14 of which were differentially expressed between root hair cells and non-root hair tissues, with transcripts XLOC\_000763, XLOC\_031361, and XLOC\_005665 being highly expressed in the root hair cells. The comprehensive transcriptomic analysis provides new information on root hair gene activity and sets the stage for follow-up experiments to verify the biological functions of the newly identified genes and novel transcripts in root hair cell morphogenesis.

Keywords: root hair, novel transcript, RNA-seq, co-expression, Arabidopsis

# Introduction

Root hairs provide a remarkably tractable system for studies of development, cell biology, and physiology, particularly in Arabidopsis thaliana (Dolan et al., 1998; Ryan et al., 2001; Grebe, 2012; Grierson et al., 2014). Over the last 20 years, the mechanisms underlying root hair morphogenesis have been extensively investigated, and the picture of "how and where to build a root hair" has become increasingly complete (Grebe, 2012; Grierson et al., 2014). The fate of epidermal cells is determined in a position-dependent manner: cells spanning the cleft between two underlying cortical cells, the "H position," form hair cells, while cells lying over a single cortical cell, the "N position," remain non-hair cells (Grebe, 2012; Grierson et al., 2014). Molecular genetics studies have shown that only 0.0625% (21 out of 33,602 genes) of Arabidopsis genes are involved in root cell patterning (Grierson et al., 2014). Among them, WEREWOLF (WER), MYB23, MYC1, TRANSPARENT TESTA GLABRA (TTG), GLABRA 3 (GL3)/ENHANCER OF GLABRA 3 (EGL3), and GL2 are critical positive regulators of non-hair cell differentiation through the inhibition of RHD6 expression (Galway et al., 1994; Di Cristina et al., 1996; Masucci and Schiefelbein, 1996; Lee and Schiefelbein, 1999; Bernhardt et al., 2003, 2005; Kang et al., 2009; Bruex et al., 2012; Pesch et al., 2013). GL2 itself is regulated by the regulatory complex TTG-GL3/EGL3/MYC1-WER/MYB23 (Grebe, 2012; Grierson et al., 2014). In contrast, CAPRICE (CPC), TRIPTYCHON (TRY), and ENHANCER OF TRY AND CPC1 (ETC1) have been shown to be positive regulators of root hair cell fate (Wada et al., 1997, 2002; Schellmann et al., 2002; Kirik et al., 2004; Tominaga-Wada and Wada, 2014).
In addition, some upstream genes, such as SCRAMBLED (SCM), HISTONE DEACETYLASE 18 (HD 18), and JACKDAW (JKD), have been identified and well documented as critical elements in cell patterning (Kwak et al., 2005, 2014; Xu et al., 2005; Kwak and Schiefelbein, 2007, 2008; Hassan et al., 2010; Liu et al., 2013). Even after a cell has been specified as a hair cell, whether it finally forms a root hair depends on many internal and external factors (Grierson et al., 2014). More than 45 genes, including ROOT HAIR DEFECTIVE 6 (RHD6), ROOT HAIR DEFECTIVE 2 (RHD2), EXPANSIN A7 (EXPA7), and EXPANSIN A18 (EXPA18), have been shown by molecular genetics studies to be involved in root hair morphogenesis (Grierson et al., 2014). These genes coordinately regulate Rop-GTPase re-localization and the signaling it mediates, vesicle trafficking, cell wall reassembly, the establishment of ion gradients, reorganization of the cytoskeleton (actin and microtubules), and the production and homeostasis of reactive oxygen species (Ishida et al., 2008; Grierson et al., 2014).

During the past 10 years, with the emergence of microarray technology coupled with advanced computational methods, numerous genome-wide transcriptome analyses have been performed in Arabidopsis, either by comparing the transcriptional profiles of root hair-defective mutants with those of wild-type plants, or by directly exploring root hair-specific transcriptional profiles from root hair protoplasts isolated by fluorescence-activated cell sorting (FACS). These studies identified hundreds to thousands of genes involved either in root hair morphogenesis or in the root hair response to abiotic stresses (Jones et al., 2006; Brady et al., 2007; Dinneny et al., 2008; Gifford et al., 2008; Bruex et al., 2012; Hill et al., 2013; Kwasniewski et al., 2013; Simon et al., 2013; Becker et al., 2014; Niu et al., 2014; Tanaka et al., 2014; Wilson et al., 2015). From these omics datasets, supported by previous molecular genetics studies (Ishida et al., 2008; Grebe, 2012; Ryu et al., 2013; Grierson et al., 2014), a subset of 208 "core" epidermal genes has been identified and a gene regulatory network involved in root epidermis cell differentiation in Arabidopsis has been established (Bruex et al., 2012), which provides an advantageous model for studying the roles of both single and duplicate genes in a specific gene network (Simon et al., 2013). However, several technical limitations of microarrays, such as the limited set of gene probes present on the chip, the narrow dynamic range of detectable expression changes, and the inability to distinguish highly similar homologous genes, have prevented them from capturing the dynamics and genome-wide range of transcriptional profiles in root hairs. Fortunately, next-generation sequencing technology has overcome such weaknesses and enabled us to explore whole transcriptomes at single-base resolution in a cost-effective manner.
It has also enabled us to accurately quantify gene expression and identify unannotated transcripts and splicing isoforms via advanced computational methods (Trapnell et al., 2012).

In our previous study, the paired-end reads of each biological repeat were separately mapped to the Arabidopsis genome using the BLAT program (Kent, 2002), and the differentially expressed genes were then identified in each repeat using the custom-made software RACKJ (Lan et al., 2013), which resulted in 1617 differentially expressed genes between root hairs (hereafter referred to as RH) and non-root hair tissues (all root tissues except root hairs; hereafter referred to as NRH). However, although BLAT is a very effective tool for nucleotide alignments between mRNA and genomic DNA, it is slow and not very accurate for mapping RNA-seq reads to the genome. In addition, BLAT is not designed for the alignment of paired-end reads. RACKJ was originally developed for the identification of splicing isoforms, and it was employed to identify differentially expressed genes in each biological repeat via Z-score analysis. Therefore, additional information could be revealed by re-analyzing the RNA-seq data with a more advanced pipeline. Tophat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of RNA-seq data (Trapnell et al., 2012). Tophat was developed specifically for RNA-seq data analysis (Trapnell et al., 2009); it aligns both single-end and paired-end reads to large genomes using the ultra-high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons (Trapnell et al., 2009, 2012). The Cufflinks pipeline contains four programs, which enable not only accurate quantification of the expression of known genes but also the identification and quantification of previously unannotated transcripts, with or without biological repeats (Trapnell et al., 2012). In this study, we extended the previous study by re-analyzing the RNA-seq data using the Tophat-Cufflinks pipeline, with the aim of providing additional information on root hair gene activity.
We revealed more than five thousand genes that were differentially expressed between RH and NRH, more than 4000 of which are reported only in the present study. Moreover, a subset of 14 previously unannotated transcripts was identified as differentially expressed between RH and NRH. The comprehensive transcriptomic analysis expands our knowledge of root hair gene activity and sets the stage for follow-up experiments on the biological functions of the newly identified genes and novel transcripts in root hair morphogenesis.

# Results

#### Digital Information on Gene Expression in Root Hairs and Non-root Hair Tissues at Genome-wide Level

In the previous study, transgenic plants carrying the Expansin7 (EXP7) promoter fused to GFP (Cho and Cosgrove, 2002) were used together with the FACS technique to harvest Arabidopsis root hair protoplasts, and the transcriptome was profiled by RNA-seq in two biological repeats. The RNA-seq data were subsequently analyzed using the BLAT program (Kent, 2002), and the differentially expressed genes were further mined with the custom-made software RACKJ (Lan et al., 2013). In the present study, the RNA-seq data were re-analyzed by aligning the paired-end reads to the Arabidopsis genome (TAIR10 release) with the Tophat program (Trapnell et al., 2009). The differentially expressed genes between RH and NRH were subsequently identified using the Cufflinks pipeline (Trapnell et al., 2010, 2012). Results showed that a total of 19,743 and 19,660 genes were confidently identified (with status "OK") in RH and NRH, respectively (Table S1 in the Supplementary Material). Of these, 19,600 genes were expressed in both RH and NRH. In RH, ERD14, RIN4, and AT5G64401 were among the most abundant genes, with RPKM (Reads Per Kilobase per Million mapped reads) values above 1500 (**Table 1**). Among the 30 most abundant genes, those encoding arabinogalactan proteins were the most enriched, and four of them were highly expressed in RH. Genes encoding glutathione S-transferases, dehydrins, thioredoxins, and proline-rich extensin-like proteins formed the second most enriched group in RH, with at least two members detected from each gene family (**Table 1**). The 30 most abundant genes in NRH included arabinogalactan protein-encoding genes and genes encoding glutathione S-transferases and dehydrins (Table S2 in the Supplementary Material). A comparison of the 30 most abundant genes in RH and NRH resulted in an overlap of 14 genes.
Of these, genes encoding arabinogalactan proteins, glutathione S-transferases, and dehydrins were the most enriched (Table S3 in the Supplementary Material).
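For readers unfamiliar with the unit, the following is a minimal sketch of the textbook RPKM definition expanded above. It is not the abundance model that Cufflinks itself fits, and the function name and the example numbers are purely illustrative assumptions.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Textbook definition only: counts are scaled by gene length (in kb)
    and by sequencing depth (in millions of mapped reads).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 3000 reads on a 2000-bp gene in a library of
# 10 million mapped reads.
example_value = rpkm(3000, 2000, 10_000_000)
print(example_value)
```

Because the value is normalized for both gene length and library size, a threshold such as "RPKM above 1500" can be compared across genes and between the RH and NRH samples.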

Gene Ontology (GO) analysis of the top 30 most abundant genes in RH and NRH revealed that the stress-related processes "cold acclimation," "response to water deprivation," "toxin catabolic process," and "aluminum ion transport" were enriched in both RH and NRH (Table S4 in the Supplementary Material). Other processes, namely "response to oxidative stress," "response to microbial phytotoxin," "defense response to fungus," and "aluminum ion transport," were more enriched in RH than in NRH. In contrast, genes involved in "response to cold" and the "serine-isocitrate lyase pathway" were more pronounced in NRH than in RH (Table S4 in the Supplementary Material).

#### Differentially Expressed Genes Identified Between Root Hairs and Non-root Hair Tissues Using Cufflinks Pipeline

The differentially expressed genes between RH and NRH were identified using the Cuffdiff algorithm in the Cufflinks pipeline with the following criteria for a given gene: (1) the FDR (false discovery rate) adjusted p-value (that is, the q-value) must be less than 0.05; (2) the fold change between RH and NRH must be greater than two; (3) the RPKM value of the gene must be more than one in at least one of the samples. With these criteria, a total of 5409 genes were identified as differentially expressed between RH and NRH (Table S5 in the Supplementary Material). Of these, 2596 genes were expressed at significantly higher levels in NRH than in RH, while the abundance of the other 2813 genes was markedly higher in RH than in NRH (Table S6 in the Supplementary Material). Of the 2813 genes, a subset of 61 genes was detected only in RH (Table S7 in the Supplementary Material). Among the remaining 2752 genes (2813 minus these 61), the most up-regulated were those encoding proline-rich (extensin-like) proteins, extensins, expansins such as EXP7 and EXP18, arabinogalactan proteins, xyloglucan endotransglucosylase/hydrolases, peroxidase superfamily proteins, and others. Other genes, including COBL9, COW1, LRX1, IRE, and IRC1, which have been reported to be required for or associated with root hair development and growth, were also found to be highly induced in RH.
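The three selection criteria can be expressed as a simple filter. The sketch below is an illustration under stated assumptions, not the authors' script: the record type `GeneResult`, its field names, and the example genes are hypothetical and do not correspond to actual Cuffdiff output columns.

```python
import math
from typing import NamedTuple, Optional


class GeneResult(NamedTuple):
    """Hypothetical per-gene summary (not the Cuffdiff file format)."""
    gene_id: str
    rpkm_rh: float
    rpkm_nrh: float
    q_value: float


def de_direction(gene: GeneResult, q_cutoff: float = 0.05,
                 fold_cutoff: float = 2.0,
                 min_rpkm: float = 1.0) -> Optional[str]:
    """Return 'RH' or 'NRH' for the higher sample, or None if not DE."""
    if gene.q_value >= q_cutoff:                      # criterion (1)
        return None
    if max(gene.rpkm_rh, gene.rpkm_nrh) <= min_rpkm:  # criterion (3)
        return None
    low, high = sorted((gene.rpkm_rh, gene.rpkm_nrh))
    # A gene detected in only one sample has no finite ratio.
    fold = math.inf if low == 0 else high / low
    if fold <= fold_cutoff:                           # criterion (2)
        return None
    return "RH" if gene.rpkm_rh > gene.rpkm_nrh else "NRH"


# Hypothetical genes covering each outcome.
genes = [
    GeneResult("geneA", 40.0, 5.0, 0.001),   # higher in RH
    GeneResult("geneB", 3.0, 30.0, 0.010),   # higher in NRH
    GeneResult("geneC", 12.0, 0.0, 0.002),   # detected only in RH
    GeneResult("geneD", 15.0, 10.0, 0.001),  # fold change too small
    GeneResult("geneE", 0.8, 0.2, 0.010),    # below the RPKM floor
    GeneResult("geneF", 50.0, 5.0, 0.200),   # not significant
]
calls = {g.gene_id: de_direction(g) for g in genes}
print(calls)
```

Treating a zero denominator as an infinite fold change is one way to keep genes such as the 61 RH-only genes in the result set; other conventions (for example, adding a pseudocount) would work equally well.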

Comparison of the 1617 differentially expressed genes identified in the previous study (Lan et al., 2013) with the 5409 genes identified in the present study revealed an overlap of 1259 genes (Table S8 in the Supplementary Material). Seventy-eight percent (1259 out of 1617) of the differentially expressed genes identified in the previous study were thus confirmed in the present analysis. By contrast, only 23% (1259 out of 5409) of the differentially expressed genes identified in the present study had been found in the previous analysis; the remaining 77% (4150 out of 5409) were discovered only in the present study using the Tophat-Cufflinks pipeline. Several of these additional genes, such as COBL9, RHS15, and RHS, have been reported to be associated with root hair formation, and some of them were among the most up-regulated genes in RH. The top 100 most up-regulated genes in the present study are listed in **Table 2**.

In addition, among the 1617 differentially expressed genes identified in the previous study, a subset of 635 genes was up-regulated in RH. Of these, 580 genes were also found to be up-regulated in RH in the present study, i.e., 91% (580 out of 635) of the genes up-regulated in the previous study were also identified by the Tophat-Cufflinks pipeline.
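The proportions quoted in this comparison follow directly from the reported gene counts. As a quick check, the arithmetic can be reproduced as follows; the variable names are illustrative, and only the counts themselves come from the text.

```python
# Gene counts as reported in the text.
previous_deg = 1617   # DEGs in the previous (BLAT/RACKJ) analysis
present_deg = 5409    # DEGs in the present (Tophat-Cufflinks) analysis
shared_deg = 1259     # DEGs found by both analyses
previous_up = 635     # up-regulated in RH, previous analysis
shared_up = 580       # up-regulated in RH in both analyses
up_in_rh, up_in_nrh, rh_only = 2813, 2596, 61


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


novel_deg = present_deg - shared_deg

print(f"previous DEGs recovered:  {percent(shared_deg, previous_deg):.1f}%")
print(f"present DEGs seen before: {percent(shared_deg, present_deg):.1f}%")
print(f"present DEGs that are new: {percent(novel_deg, present_deg):.1f}%")
print(f"previous up-regulated genes recovered: "
      f"{percent(shared_up, previous_up):.1f}%")
print(f"up in RH excluding RH-only genes: {up_in_rh - rh_only}")
```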

#### Differential GO Analysis of Differentially Expressed Genes

Differential GO analysis was performed on three gene sets: the 5409 differentially expressed genes identified in this study, the 4150 genes newly identified in this study, and the 1259 genes identified by both the present and the previous study. Results showed that the processes "response to salt stress," "ribosome biogenesis," "protein phosphorylation," "embryo development ending in seed dormancy," and "response to water deprivation" were the most enriched (P ≤ 3.01E-10) in the full set of 5409 differentially expressed genes (Table S9 in the Supplementary Material). In the 4150 subset, the processes "protein phosphorylation," "embryo development ending in seed dormancy," "response to water deprivation," "response to chitin," "cytokinesis," "intracellular signal transduction," "DNA replication," "transport," and "microtubule-based movement" were enriched (P ≤ 5.21E-7). In contrast, the protein synthesis-related processes "ribosome biogenesis" and "translation," and the root hair-related processes of root hair cell differentiation and development, were underrepresented (P ≥ 0.55) in this subset. By contrast, protein synthesis and root hair-related processes, as well as the processes "response to salt stress," "response to cold," and "response to cadmium ion," were dramatically overrepresented (P ≤ 2.97E-11) in the 1259 overlapping genes (Table S9 in the Supplementary Material).

GO analysis of the top 100 most induced genes in RH revealed that the processes "plant-type cell wall organization," "root hair cell differentiation," "trichoblast differentiation," "response to oxidative stress," "unidimensional cell growth," "protein phosphorylation," and "root hair cell tip growth" were enriched, while only the process "intracellular signal transduction" was significantly enriched (P < 0.01) in the subset of 61 genes expressed only in RH (Table S10 in the Supplementary Material).

#### TABLE 2 | List of the 100 most up-regulated genes in root hairs (RH) compared to non-root hair tissues (NRH).

#### Co-Expression Network Construction and Module Identification in RH

The MACCU program (Lin et al., 2011) was used to calculate the Pearson correlation coefficient between any two genes based on the 300 root-related arrays, which were manually identified as previously described, and gene pairs with a coefficient ≥0.83 were selected to build co-expression networks (Lin et al., 2011). The threshold for each co-expression network was selected mainly on the basis of the GO enrichment analysis of the genes involved in the network (Lin et al., 2011). Briefly, a series of threshold values from 0.7 to 0.9 was first used to select gene pairs for co-expression networks. We then performed GO enrichment analysis of the genes in each resulting network and looked for the threshold giving the best enrichment of GO categories (P < 1E-03) among the input genes. The Cytoscape program (http://www.cytoscape.org) was used to visualize the co-expression relationships among genes, and the NetworkAnalyzer tool was employed to extract connected components (sub-networks). In the present study, we mainly focused on the genes up-regulated in RH and on finding modules not seen in the previous study. To this end, co-expression analysis was first performed on the 635 up-regulated genes identified previously. This yielded a network comprising 122 nodes from 124 genes and 367 edges (correlations between genes). The network can be divided into one large and 12 small components (sub-networks), with the large one consisting of 93 nodes from 94 genes and 349 edges (**Figure S1**). Using the MCODE program, two modules containing 20 and nine genes, respectively, were extracted from the large component (**Figures 1A,B**). GO analysis showed that the processes "plant-type cell wall organization," "response to oxidative stress," "oxidation–reduction process," and "trichoblast differentiation" were enriched (P < 0.001) in module 1, while only the process "trichoblast differentiation" was enriched in module 2 (**Figure 1C**). To determine whether this network and its modules were also present in the current dataset, we analyzed the 580 overlapping up-regulated genes and found nearly the same co-expression network, except that four genes were not included (nodes labeled with blue stars in **Figure S1**).
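The core of the network construction described above (pairwise Pearson correlation, a cutoff of 0.83, and extraction of connected components) can be sketched in a few lines. This is only an illustration under stated assumptions and is not MACCU, Cytoscape, or NetworkAnalyzer: the function names and the toy expression profiles are hypothetical, and a real analysis would use the 300 root-related arrays.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:      # a flat profile has no correlation
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_edges(profiles, cutoff=0.83):
    """Gene pairs whose correlation is at or above the cutoff."""
    return [(a, b) for a, b in combinations(sorted(profiles), 2)
            if pearson(profiles[a], profiles[b]) >= cutoff]


def connected_components(edges):
    """Group connected genes into sub-networks (depth-first search)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy expression profiles across five hypothetical arrays.
profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],   # tracks geneA
    "geneC": [3, 1, 4, 1, 5],
    "geneD": [6, 2, 8, 2, 10],   # tracks geneC
    "geneE": [2, 2, 2, 2, 2],    # flat, joins no edge
}
edges = coexpression_edges(profiles)
components = connected_components(edges)
print(edges)
print(components)
```

Genes without any partner above the cutoff, such as `geneE` here, never enter the network, which is why a network can contain far fewer genes than the input list.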

Co-expression analysis was then performed on the 2172 up-regulated genes identified only in the present study. A subset of 264 of the 2172 genes (12%) was co-expressed at the cutoff of 0.83. The resulting network contained 260 nodes from 264 genes and 589 edges, and can be divided into five large (>10 nodes), eight medium-sized (3–10 nodes), and 20 small (2 nodes) sub-networks (**Figure S2**). The largest sub-network contained 70 nodes from 74 genes, the second largest contained 44 genes, and the third contained 29 genes (**Figure S2**). The other two large sub-networks contained 20 and 16 genes, respectively (**Figure S2**). GO analysis showed that processes of energy-related metabolism, such as "ATP synthesis coupled proton transport," "photorespiration," "response to salt stress," "mitochondrial electron transport, ubiquinol to cytochrome c," "response to cadmium ion," and "proton transport," were enriched in the largest sub-network (Table S11 in the Supplementary Material). Other enriched processes were mainly related to stress responses, such as cellular response to cold, cold acclimation, response to wounding, response to chitin, hyperosmotic salinity response, response to karrikin, and response to UV-B. These enriched processes were mainly distributed in sub-networks 3, 4, and 7 (Table S11 in the Supplementary Material). No enriched processes (P = 1) were found in sub-network 9, and only weakly enriched processes (0.01 > P > 0.001) were found in sub-networks 2, 5, 6, and 8 (Table S11 in the Supplementary Material). Four functional modules, containing 12, 13, 7, and 6 nodes, respectively, and various numbers of edges, were extracted from the network (**Figure 2**).
GO analysis showed that the processes "glycogen biosynthetic process," "photosynthetic electron transport in photosystem II," "histone deacetylation," "red, far-red light phototransduction," "defense response signaling pathway, resistance gene-dependent," and "ethylene biosynthetic process" were enriched in module 1 (**Figure 3**), and the processes "ATP synthesis coupled proton transport," "mitochondrial electron transport, ubiquinol to cytochrome c," "purine nucleotide transport," "oxidation–reduction process," and "actin polymerization or depolymerization" were enriched in module 2. In module 3, besides the overrepresented process "ATP synthesis coupled proton transport," the processes "glucose mediated signaling pathway," "Golgi organization," and "proton transport" were enriched. In module 4, the signaling-related processes "small GTPase mediated signal transduction," "photosynthesis, light reaction," and "intracellular signal transduction" were enriched (**Figure 3**).

#### Analysis of Root Hair Regulatory Element in the Differentially Expressed Genes

The presence of the Root Hair Regulatory Element (RHE) cis-element sequence "WHHDTGNNN(N)KCACGWH" (where W = A/T, H = A/T/C, D = G/T/A, K = G/T, and N = A/T/C/G) in the 5409 differentially expressed genes was investigated as previously described (Won et al., 2009). Screening within 3000 bp upstream of the start codon (hereafter named −3000 bp) resulted in 201 RHE hits from 194 genes, with a few genes carrying two or more RHEs (Table S12 in the Supplementary Material). Among the 201 RHE hits, the RHE patterns "AAAGTGTAGAGCACGAT," "ATCTTGGCTTTCACGTT," and "TTCGTGAGTTTCAAATA" were relatively enriched. Subsequent screening within introns identified 43 genes with one RHE in various intron positions (Table S13 in the Supplementary Material). Eighty-nine genes were found to contain one RHE in the CDS (coding DNA sequence) regions, with the sequences "TCCATGGAAGTCACGAT" and "TTTATGGCTGGCACGTA" being prominent among the hits (Table S14 in the Supplementary Material). AT2G31350, encoding glyoxalase 2-5, was shown to contain two RHEs, one in the −3000 bp region and one in the first intron (**Table 3**). Four genes, AT1G18460, AT1G18470, AT2G33320, and AT3G45530, were found to harbor one RHE in the −3000 bp region and another in the CDS region (**Table 3**). Another three genes, AT3G19050, AT4G03500, and AT5G27680, were shown to carry RHEs in both introns and CDS regions but not in the −3000 bp region (**Table 3**).
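A degenerate motif such as the RHE consensus maps naturally onto a regular expression. The following minimal sketch translates the consensus with the base codes defined above and scans a sequence for hits. It is an assumption-laden illustration rather than the screening procedure of Won et al. (2009): the helper names are hypothetical, and only the forward strand is scanned.

```python
import re

# Degenerate base codes as defined in the text.
DEGENERATE = {
    "W": "[AT]",
    "H": "[ACT]",
    "D": "[AGT]",
    "K": "[GT]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a consensus such as 'NNN(N)K' into a regex string.

    Letters in parentheses are treated as optional positions.
    """
    parts, i = [], 0
    while i < len(motif):
        if motif[i] == "(":
            end = motif.index(")", i)
            inner = "".join(DEGENERATE.get(c, c) for c in motif[i + 1:end])
            parts.append(f"(?:{inner})?")
            i = end + 1
        else:
            parts.append(DEGENERATE.get(motif[i], motif[i]))
            i += 1
    return "".join(parts)


RHE_REGEX = motif_to_regex("WHHDTGNNN(N)KCACGWH")
RHE_FULL = re.compile(RHE_REGEX)
# The lookahead lets overlapping candidate sites be reported as well.
RHE_SCAN = re.compile(f"(?=({RHE_REGEX}))")


def find_rhe(sequence):
    """Return (0-based start, matched site) for each forward-strand hit."""
    return [(m.start(), m.group(1))
            for m in RHE_SCAN.finditer(sequence.upper())]


# Hypothetical promoter fragment containing one RHE reported in the text.
fragment = "gggg" + "AAAGTGTAGAGCACGAT" + "cccc"
hits = find_rhe(fragment)
print(hits)
```

In practice the same scan would be repeated on the reverse complement and on each region of interest (upstream 3000 bp, introns, and CDS) to build a table of hits per gene.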

#### TABLE 3 | Distribution of RHE motif in the differentially expressed genes between root hairs and non-root hair tissues.

#### Identification of Conserved Root Epidermal Genes and Associated Co-expression Network

To identify conserved root epidermal genes, the set of 208 "core" root epidermal genes was taken from a previous report (Bruex et al., 2012) and compared with the 5409 differentially expressed genes in this study. The comparison resulted in an overlap of 136 genes (Table S15 in the Supplementary Material), which could be grouped into nine clusters according to expression patterns (**Figure S3**). One hundred and twenty-three of the 136 genes were annotated as hair genes, but only 27 of the 123 genes carry RHEs in their promoters (Table S15 in the Supplementary Material). To obtain the co-expression network specific to conserved root epidermal genes, the 136 genes were loaded as baits, with the rest of the 5409 differentially expressed genes as preys, for co-expression analysis at a correlation coefficient cutoff of 0.83. The final network, composed of 122 nodes (genes) and 306 edges, was generated after discarding edges that linked only two preys. Of the 122 genes, 50 were baits and 72 were preys. This network can be further divided into one large and three small clusters (**Figure S4**). GO analysis of the bait genes in the network showed that the processes "trichoblast differentiation," "plant-type cell wall organization," and "root hair elongation" were the most enriched (**Figure S4**), while the processes "plant-type cell wall organization" and "oxidation–reduction process" were overrepresented in the prey genes (**Figure S5**). One module, containing 15 nodes and 80 edges, was extracted from the network (**Figure 4A**). GO analysis showed that the processes "plant-type cell wall modification involved in multidimensional cell growth," "plant-type cell wall loosening," and "trichoblast differentiation" were the most enriched in this module (**Figure 4B**).
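The bait and prey filtering step, in which edges that link only two preys are discarded, can be sketched as follows. This is a hypothetical illustration and not the MACCU implementation: the function name and gene labels are assumptions, and the input edges are taken to be gene pairs that already passed the 0.83 correlation cutoff.

```python
def bait_prey_network(edges, baits):
    """Keep co-expression edges that involve at least one bait gene.

    Returns the retained edges plus the bait and prey genes that
    remain in the network after prey-to-prey edges are discarded.
    """
    baits = set(baits)
    kept = [(a, b) for a, b in edges if a in baits or b in baits]
    nodes = {gene for edge in kept for gene in edge}
    return kept, nodes & baits, nodes - baits


# Toy example with hypothetical gene labels.
toy_edges = [
    ("bait1", "bait2"),
    ("bait1", "preyA"),
    ("preyA", "preyB"),   # prey-to-prey edge, discarded
    ("preyB", "preyC"),   # prey-to-prey edge, discarded
]
kept_edges, bait_nodes, prey_nodes = bait_prey_network(
    toy_edges, baits={"bait1", "bait2", "bait3"})
print(kept_edges, bait_nodes, prey_nodes)
```

Note that a bait without any co-expressed partner (here `bait3`) drops out of the network, which is consistent with only 50 of the 136 baits appearing among the 122 nodes.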

#### Identification of Unannotated Transcripts

To identify previously unannotated transcripts that are differentially expressed between RH and NRH, we first assembled a new transcript reference on the basis of the annotated transcript reference (TAIR10\_GFF3\_genes\_gff) using the Cuffmerge algorithm in the Cufflinks pipeline (Trapnell et al., 2012). Subsequently, differential expression of the previously unannotated transcripts was analyzed using the Cuffdiff program (Trapnell et al., 2012). Results showed that a total of 121 novel transcripts were identified, and 14 of these 121 unannotated transcripts were differentially expressed between RH and NRH, with transcripts XLOC\_000763, XLOC\_031361, and XLOC\_005665 being the most highly expressed in RH (**Table 4**). XLOC\_005665 is of particular interest: it was highly expressed in RH (**Figure 5**) and is predicted to encode a small peptide of 59 amino acids.

# Discussion

Root hairs in Arabidopsis have been intensively studied in various respects and close to 100 genes involved in the cell fate determination and root hair formation have been identified, which provides numerous advantages for basic studies of development, cell biology, and physiology (Grierson et al., 2014).

In the last decade, high-throughput transcriptome analysis, an approach that differs from traditional molecular genetic analysis, has been adopted extensively to explore genes potentially involved in root hair morphogenesis at the genome-wide level in Arabidopsis (Birnbaum et al., 2003; Jones et al., 2006; Brady et al., 2007; Dinneny et al., 2008; Gifford et al., 2008; Bruex et al., 2012; Lan et al., 2013). In the current study, RNA-seq data sets were re-analyzed with the Tophat-Cufflinks pipeline, and several new aspects of root hair gene expression were revealed. First, the RNA-seq technique made it possible to obtain global "digital" transcriptional information on root hair genes (Table S1 in the Supplementary Material). Of the 19,743 genes detected in RH, ERD14, RIN4, AT5G64401, and others were among the most abundant transcripts (**Table 1**). ERD14 and its homolog ERD10 were previously isolated from a cDNA library of Arabidopsis plants dehydrated for 1 h and are induced by ABA treatment and dehydration (Kiyosue et al., 1994). In this study, both ERD14 and ERD10 were highly expressed in RH and were up-regulated in RH compared to NRH (**Table 1** and Table S2 in the Supplementary Material). This suggests that ERD14 and ERD10 might be important in root hair morphogenesis or in the response to abiotic stresses. RIN4 (RPM1-interacting protein 4) was first reported to interact with Pseudomonas syringae type III effector molecules and is required for RPM1-mediated resistance in Arabidopsis (Mackey et al., 2002). A further study showed that RIN4 can interact with AHA1 and AHA2 both in vitro and in vivo, thus regulating plasma membrane (PM) H(+)-ATPase activity. PM H(+)-ATPase activation/inactivation can regulate the opening or closure of stomata, thereby controlling bacterial entry into the leaf (Liu et al., 2009). AHA2 has been reported to be a major regulator of rhizosphere acidification in response to Fe deficiency (Santi and Schmidt, 2009).
Taken together, it is possible that RIN4 also plays important roles in root hair morphogenesis and in the response to Fe deficiency by regulating PM H(+)-ATPase activity mediated by AHA2. The third most highly expressed gene in root hairs was AT5G64401, which encodes a small peptide of unknown function (**Table 1**).

In the previous study, a subset of 1617 genes showed differential expression between RH and NRH (Lan et al., 2013). In this study, the abundance of 5409 genes was found to differ significantly (Table S5) using the Tophat-Cufflinks pipeline. Comparison of these two sets (5409 vs. 1617) resulted in an overlap of 1259 genes. We thus showed that an additional 4150 genes were differentially expressed between RH and NRH. Genes such as COBL9 (Jones et al., 2006) and RHS15 (Won et al., 2009; Bruex et al., 2012), which have been reported to be required for or associated with root hair development and growth, were detected only in this study (**Table 2**). Moreover, one third of the cell-type patterning genes, namely ECTOPIC ROOT HAIR2 (ERH2/POM1), ECTOPIC ROOT HAIR3 (ERH3), GLABRA3 (GL3), ROOTHAIRLESS2 (RHL2), SCRAMBLED (SCM/SUB), TRANSPARENT TESTA GLABRA2 (TTG2), and WEREWOLF (WER), 63% (31 out of 49) of the root hair morphogenesis-related genes (Grierson et al., 2014), and 45% (five out of 11) of the genes related to hormone action affecting root hair development (Grierson et al., 2014) were identified as differentially expressed between RH and NRH (Table S5). This study complements and extends the previous study by adding new information on the number and activity of root hair genes. Several genes that are highly up-regulated in RH, and that were not reported previously, deserve further investigation.

Co-expression analysis is based on the concept that genes with coordinated expression patterns under diverse conditions are often functionally related (Eisen et al., 1998). This concept allows us to select genes of unknown function for experimental validation and functional prediction on the basis of their co-expression with genes of known function (Aoki et al., 2007; Usadel et al., 2009). Not only did we identify modules from the previous study (**Figure 1** and **Figure S1**), but we also revealed some new modules by co-expression analysis of the subset of 2172 up-regulated genes in RH (**Figure 2** and **Figure S2**). Results showed that only 12% (264 out of 2172) of these differentially expressed genes are involved in the network, forming 589 relationships between genes, which suggests that most of these genes are involved in diverse processes. GO analysis showed that genes associated with energy- and stress-related processes are enriched in the network (Table S11). This further indicates that root hair development and growth are sensitive to environmental stimuli and are energy-dependent. Co-expression analysis of the conserved root epidermal genes against the 5409 genes led to a network composed of 122 nodes (genes) and 306 edges (**Figure S4**). Unexpectedly, only one gene in the module was a prey (shown in red in **Figure 4A**), while the other 14 genes, including EXP7, EXP18, RHS12, RHS13, and RHS19, were core root epidermal genes (**Figure 4A**). Since these core genes have been shown to be required for root hair development and growth, this prey gene may also play an important role in root morphogenesis (**Figure 4B**). These results strongly encourage further investigation of the genes of unknown function associated with the above-mentioned networks.

The analysis of RHEs in the differentially expressed genes (5409) identified only 194 genes that carry one or two RHEs within the 3000 bp upstream of the start codon (Table S12). To determine whether RHEs are located in other positions, we screened for RHEs in both introns and CDS regions. Subsets of 43 and 89 genes harboring one RHE were found, respectively (Tables S13, S14). Further analysis showed that only a few genes carry an RHE in introns and CDS, and none of them carries an RHE in all three types of position (**Table 3**). Similarly, the previous study identified 154 out of 208 "core" epidermal genes in the "H" position, namely root hair genes, but only 33 of them carry an RHE (Bruex et al., 2012). These results suggest that regulatory elements other than the RHE are probably involved in the transcriptional regulation of root hair gene expression.

# Conclusions

In summary, using currently popular RNA-seq analysis programs, we provide genome-wide "digital" information on the transcriptional expression of root hair genes. We detected an additional 4150 genes that are differentially expressed between RH and NRH. We also identified 14 previously unannotated transcripts that are differentially expressed between RH and NRH. The findings of this study complement and extend the previous one. Some of the genes highly up-regulated in root hairs that were not reported in the previous study, such as RIN4 (of known function) or AT5G64401 (of unknown function), are worth further study. Gene clustering and the root epidermal-specific co-expression analysis revealed some potentially important genes, such as AT5G04960, AT4G26010, and AT5G05500, which may function as novel players in root hair morphogenesis.

# Materials and Methods

#### Data Collection and Processing

Transcriptomic data sets were downloaded from a public database (NCBI: SRA045009.1) and analyzed as previously described (Trapnell et al., 2009, 2010). Microarray data of 2671 ATH1 arrays from the NASCarray database (http://affymetrix.arabidopsis.info/) were downloaded and normalized using the RMA function of the Affy package of the Bioconductor software. Three hundred root-related arrays were manually identified as previously described (Lin et al., 2011), and were used as a database for co-expression analysis.

#### Mapping of RNA-seq Reads and Identification of Differentially Expressed Genes

All analyses were carried out using the Tophat-Cufflinks pipeline (Trapnell et al., 2009, 2010), with the following versions: Tophat v2.0.11, Bowtie2 v2.2.2.0, and Cufflinks v2.2.1. The Arabidopsis TAIR10 genome and gene model annotation file (GFF, TAIR10\_GFF3\_genes\_gff) downloaded from TAIR (www.arabidopsis.org) were used as reference.

To align the RNA-seq reads to the genome, we first generated a Bowtie2 index using the TAIR10 genome and then ran Tophat with the following options: -N 2 --read-gap-length 3 --read-edit-dist 3 --read-realign-edit-dist 0 --report-secondary-alignments --coverage-search --microexon-search --library-type fr-unstranded --b2-sensitive. The resulting aligned reads were then used to create a **RABT** (Reference Annotation Based Transcript) assembly using Cufflinks. First, Cufflinks was run in discovery mode to identify previously unannotated transcripts. Assemblies from both RH and NRH were then merged into one file using Cuffmerge, with the TAIR10\_GFF3\_genes\_gff file as the reference annotation, resulting in a **RABT** assembly that was used to quantify transcript abundance. Finally, quantification of transcript abundance (RPKM) and identification of differentially expressed genes were performed using Cuffdiff with default parameters (P < 0.05 and FDR cutoff of 0.05%) and the options -N -u, corresponding to upper-quartile normalization and multi-read correction. Differential transcript abundance for all genes was calculated as the base-2 logarithm of the expression ratio ($\mathrm{RPKM}_{\mathrm{NRH}}/\mathrm{RPKM}_{\mathrm{RH}}$).
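The final step, the base-2 logarithm of the NRH/RH expression ratio, can be illustrated with a minimal Python sketch. This is not the Cuffdiff implementation: the function name `log2_ratio`, the gene labels, and the RPKM values are hypothetical, and returning `None` for a zero RPKM is an assumption made here to avoid taking the logarithm of zero.

```python
import math

def log2_ratio(rpkm_nrh, rpkm_rh):
    """Return log2(RPKM_NRH / RPKM_RH), or None when either value is zero.

    With this orientation, negative values mean higher abundance in root
    hairs (RH) and positive values mean higher abundance in NRH.
    """
    if rpkm_nrh <= 0 or rpkm_rh <= 0:
        return None  # assumption: ratio left undefined rather than infinite
    return math.log2(rpkm_nrh / rpkm_rh)

# Hypothetical RPKM values as (NRH, RH) pairs, for illustration only.
example = {
    "geneA": (2.0, 16.0),   # eight-fold higher in RH
    "geneB": (12.0, 3.0),   # four-fold higher in NRH
    "geneC": (0.0, 5.0),    # not detected in NRH
}
ratios = {gene: log2_ratio(nrh, rh) for gene, (nrh, rh) in example.items()}
print(ratios)
```

Because the ratio is NRH over RH, genes up-regulated in root hairs have negative values under this convention.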

#### Gene Ontology Analysis

GO enrichment analysis using the TopGo "elim" method (Alexa et al., 2006) was based on The Gene Ontology Browsing Utility (GOBU) as previously described (Lin et al., 2006). The elim algorithm iteratively removes the genes mapped to significant terms from higher level GO terms, and thus avoids the increase of unimportant functional categories.

#### Generation of Co-expression Networks Using the MACCU Toolbox

Gene co-expression networks were constructed on the basis of 300 publicly available root-related microarrays using the MACCU toolbox as previously reported (Lin et al., 2011), with a Pearson correlation threshold of 0.83 or greater, chosen on the basis of the GO enrichment analysis. The generated co-expression networks were visualized with Cytoscape (http://www.cytoscape.org), and the Cytoscape tool NetworkAnalyzer was employed to extract connected components (sub-networks).
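The idea of thresholding pairwise Pearson correlations at 0.83 and then extracting connected components can be sketched in plain Python. This is an illustration only and does not reproduce MACCU or NetworkAnalyzer: the function names, gene labels, and expression values are hypothetical, and applying the cutoff to positive correlations only is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def coexpression_edges(profiles, threshold=0.83):
    """Return gene pairs whose correlation is at or above the threshold."""
    genes = sorted(profiles)
    return [
        (a, b)
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
        if pearson(profiles[a], profiles[b]) >= threshold
    ]

def connected_components(edges):
    """Group genes that are linked, directly or indirectly, by edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical expression profiles across five arrays.
profiles = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10.5],
    "g3": [5, 4, 3, 2, 1],
    "g4": [1, 3, 2, 5, 4],
}
edges = coexpression_edges(profiles)
components = connected_components(edges)
print(edges, components)
```

Genes without any edge at the chosen cutoff never enter the adjacency map, so they are absent from the components, much as unconnected genes are absent from a drawn network.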

#### Module Identification of Co-expression Networks

The MCODE plugin in Cytoscape was employed to extract functional modules as previously reported (Rivera et al., 2010). First, a vertex-weighting value was calculated based on the clustering coefficient, $C_i = 2n / [K_i (K_i - 1)]$, where $K_i$ represents the node count of the neighborhood of node $i$, and $n$ indicates the number of edges among the $K_i$ nodes in the neighborhood. Next, the highest-weighted vertex is set as the seed of a region, and the search adds each node $j$ whose weight ratio ($W_j/W_{seed}$) is >0.1. Then, the predicted complexes are filtered out if the minimum degree of the graph is less than the given threshold, and a module is constructed by deleting the searched nodes from the network. The top modules with a node count >5 were selected in the co-expression networks for GO enrichment analysis.
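As a worked illustration of the clustering coefficient defined above, the following Python sketch evaluates 2n/[K(K − 1)] on a small toy graph. It covers only this formula, not the full MCODE weighting or module search; the graph, the node names, and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made for illustration.

```python
def clustering_coefficient(adjacency, node):
    """Compute C_i = 2n / (K_i * (K_i - 1)) for one node.

    K_i is the number of neighbors of the node and n is the number of
    edges that connect those neighbors to each other.
    """
    neighbors = sorted(adjacency[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined cases are reported as zero
    n = sum(
        1
        for i, a in enumerate(neighbors)
        for b in neighbors[i + 1:]
        if b in adjacency[a]
    )
    return 2.0 * n / (k * (k - 1))

# Hypothetical undirected graph: A is linked to B, C and D; B is linked to C.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
for name in sorted(adjacency):
    print(name, clustering_coefficient(adjacency, name))
```

For node A, three neighbors share one edge (B to C), giving 2 × 1 / (3 × 2) = 1/3, whereas the two neighbors of B are fully connected, giving 1.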

#### Acknowledgments

This work was funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB15030103), the Natural Science Foundation of China (31370280, 31470346), the National Science Foundation in Jiangsu Provinces (BK20141470) and Research Fund of State Key Laboratory of Soil and Sustainable Agriculture, Nanjing Institute of Soil Science, Chinese Academy of Science (Y412201446). WL is supported by the Jiangsu Specially-Appointed Professor program. PL is supported by Chinese Academy of Science through its One Hundred Talents Program. We thank Dr. Wen-Dar Lin and Jorge Rodríguez-Celma for their help in using the MACCU software. Dr. Mazen Alazem is most appreciated for English editing of the revision and we are grateful to two reviewers for their invaluable comments and suggestions to substantially improve the manuscript.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00421/abstract

Figure S1 | Co-expression relationships of the 635 up-regulated genes in root hairs (RH) when compared to non-root hair tissues (NRH), with a Pearson correlation coefficient cutoff of 0.83. Blue stars indicate the genes required for or associated with root hair development and growth.

Figure S2 | Co-expression relationships of the 2172 up-regulated genes in root hairs (RH) when compared to non-root hair tissues (NRH), with a Pearson correlation coefficient cutoff of 0.83.

Figure S3 | Hierarchical clustering analysis of changes in transcript abundance of 136 overlapping genes (Table S13 in the Supplementary Material) between 208 "core" root epidermal genes (Bruex et al., 2012) and 5409 differentially expressed genes in this study. Transcript abundance was defined as RPKM (Reads Per Kilobase per Million mapped reads) in the root hairs (RH) and non-root hair tissues (NRH) with two biological repeats. The color key indicates the log2-transformed intensity; gray, which is not in the color key, indicates a missing value.

Figure S4 | The "core" root epidermal gene associated co-expression network of the differentially expressed genes between root hairs (RH) and non-root hair tissues (NRH), with a Pearson correlation coefficient cutoff of 0.83. Genes in green indicate bait genes from the "core" root epidermal genes and genes in red indicate prey genes identified in the present study.

Figure S5 | Gene Ontology (GO) enrichment analysis of the bait and prey genes involved in the "core" root epidermal gene associated co-expression network.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Li and Lan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Spatial dissection of the Arabidopsis thaliana transcriptional response to downy mildew using Fluorescence Activated Cell Sorting

#### Timothy L. R. Coker<sup>1,2</sup>, Volkan Cevik<sup>2†</sup>, Jim L. Beynon<sup>2</sup> and Miriam L. Gifford<sup>2\*</sup>

<sup>1</sup> Systems Biology Doctoral Training Centre, University of Warwick, Coventry, UK, <sup>2</sup> School of Life Sciences, University of Warwick, Coventry, UK

#### Edited by:

Sixue Chen, University of Florida, USA

#### Reviewed by:

Dilip Shah, Donald Danforth Plant Science Center, USA Mi-Jeong Yoo, University of Florida, USA

#### \*Correspondence:

Miriam L. Gifford, School of Life Sciences, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, UK miriam.gifford@warwick.ac.uk

†Present Address:

Volkan Cevik, The Sainsbury Laboratory, Norwich, UK

#### Specialty section:

This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science

> Received: 16 April 2015 Accepted: 29 June 2015 Published: 10 July 2015

#### Citation:

Coker TLR, Cevik V, Beynon JL and Gifford ML (2015) Spatial dissection of the Arabidopsis thaliana transcriptional response to downy mildew using Fluorescence Activated Cell Sorting. Front. Plant Sci. 6:527. doi: 10.3389/fpls.2015.00527

Changes in gene expression form a crucial part of the plant response to infection. In the last decade, whole-leaf expression profiling has played a valuable role in identifying genes and processes that contribute to the interactions between the model plant Arabidopsis thaliana and a diverse range of pathogens. However, with some pathogens such as downy mildew caused by the biotrophic oomycete pathogen Hyaloperonospora arabidopsidis (Hpa), whole-leaf profiling may fail to capture the complete Arabidopsis response encompassing responses of non-infected as well as infected cells within the leaf. Highly localized expression changes that occur in infected cells may be diluted by the comparative abundance of non-infected cells. Furthermore, local and systemic Hpa responses of a differing nature may become conflated. To address this we applied the technique of Fluorescence Activated Cell Sorting (FACS), typically used for analyzing plant abiotic responses, to the study of plant-pathogen interactions. We isolated haustoriated (Hpa-proximal) and non-haustoriated (Hpa-distal) cells from infected seedling samples using FACS, and measured global gene expression. When compared with an uninfected control, 278 transcripts were identified as significantly differentially expressed, the vast majority of which were differentially expressed specifically in Hpa-proximal cells. By comparing our data to previous whole-organ studies, we discovered many highly locally regulated genes that can be implicated as novel in the Hpa response, and that were uncovered for the first time using our sensitive FACS technique.

Keywords: plant-pathogen interactions, oomycete pathogens, biotrophic infection, cell type-specific transcriptomics, Fluorescence Activated Cell Sorting

# Introduction

Unlike mammals, plants do not develop specialized immune cells. Instead, they rely on Pattern-Recognition Receptors (PRRs), which detect conserved molecules or motifs associated with foreign micro-organisms (Zipfel, 2014), and cytoplasmic NOD-Like Receptors (NLRs), which detect more specific pathogen-derived effectors that are delivered into the plant cell (Jones and Dangl, 2006). Perception of a pathogen by these receptors triggers a cascade of cellular signaling events, which culminate at the cell nucleus where transcriptional reprogramming occurs (Tsuda and Somssich, 2015).

Transcriptional reprogramming is a crucial part of the immune response, and this makes it a potential target for interference from pathogens. Manipulation of host gene expression may be particularly important for biotrophic pathogens, which must keep their host cells alive while effectively suppressing the immune system and extracting nutrients. A number of pathogenic effectors from Pseudomonas syringae and Hyaloperonospora arabidopsidis (Hpa) have been shown to localize to the host cell nucleus, or to physically interact with transcriptional machinery (Mukhtar et al., 2011; Caillaud et al., 2012, 2013). Several endogenous Arabidopsis genes have been shown to be involved in disease susceptibility (Lapin and Van den Ackerveken, 2013; Zeilmaker et al., 2015) and expression of these may be induced by a pathogen to aid infection. Thus, being able to understand the transcriptional response to infection is not only important to understand the mechanisms by which plants resist pathogens, but also those by which pathogens suppress the plant immune system and exploit the endogenous molecular machinery of the plant for their own gain.

The pathosystem of Arabidopsis and its downy mildew pathogen Hpa has been an invaluable model in plant pathology over the past two decades for a number of reasons (Coates and Beynon, 2010). Firstly, Hpa is an oomycete, making it phylogenetically distinct from the many bacterial and fungal pathogens that have received extensive study, but more closely related to the agriculturally important potato blight pathogen, Phytophthora infestans. Additionally, the remarkable number of Hpa isolates, along with the number of differentially susceptible and resistant Arabidopsis ecotypes, available for study has made the pathosystem a useful tool for studying gene-for-gene resistance (Holub, 2007). Following this, advancements in genomics have shifted the focus toward large-scale identification of Hpa's RxLR effectors and unraveling their effects on the host (Baxter et al., 2010; Fabro et al., 2011; Mukhtar et al., 2011; Caillaud et al., 2013).

Finally, the pathosystem is perhaps the clearest example of obligate biotrophy in Arabidopsis. Upon landing on a leaf surface, an asexual Hpa conidiospore germinates and forms an appressorium to penetrate the leaf surface. As early as 1 day post-infection, Hpa grows intercellularly as hyphae, before forming lobe-shaped structures called haustoria in almost every cell it contacts during a compatible interaction. These haustoria are invaginations of the plant cell that, while keeping the cell membrane intact, form an intimate interface between host and pathogen that aids nutrient acquisition and the delivery of effectors. Assuming successful infection, Hpa completes its life cycle within around 7 days, producing both asexual spores, which are carried by the tree-like conidiophores that emerge from the stomata, and sexual oospores (Coates and Beynon, 2010).

Whereas progress is being made in identifying the key determinants of pathogenicity in Hpa and their effect on the host Arabidopsis, this progress is limited in comparison to other pathogens such as P. syringae, most notably because Hpa cannot be genetically manipulated. Several studies have looked at transcriptional change in response to Hpa infection (Huibers et al., 2009; Hok et al., 2011; Wang et al., 2011a; Asai et al., 2014), but it has been suggested that many of the key transcriptional events, which may occur exclusively in haustoriated cells, are often diluted by the comparative abundance of non-haustoriated cells when taking whole-organ samples (Huibers et al., 2009; Asai et al., 2014). Moreover, very little is known about the localization of Arabidopsis responses to Hpa, and how events that occur in haustoriated cells may differ from more systemic signaling events on a genome-wide scale. Making this distinction may be crucial in understanding how the haustorial environment influences the behavior of host cells.

In order to identify plant gene expression responses specifically in haustoriated cells, and to compare these to more systemic changes in gene expression during Hpa infection, we developed a method of isolating haustoriated cells from seedlings infected with the compatible Hpa isolate Noks1. The issue of dilution of highly localized pathogen responses has previously been overcome in the Arabidopsis-powdery mildew interaction in one published study, in which isolating infected cells through laser capture microdissection greatly increased the sensitivity of transcriptomic analysis (Chandran et al., 2010). Here, however, we chose to use Fluorescence Activated Cell Sorting (FACS) as it is a rapid way of isolating a large number of cells for gene expression analysis (Karve and Iyer-Pascuzzi, 2015). FACS is a flow cytometry technique that allows sorting of individual cells according to their fluorescence properties (Rogers et al., 2012), and has been a valuable tool for profiling the changing transcriptome of Arabidopsis roots during development at high spatial and temporal resolution (Brady et al., 2007). It has also been used extensively to characterize the cell type-specificity of root responses to environmental/abiotic factors such as nitrogen content (Gifford et al., 2008) and salinity (Dinneny et al., 2008). FACS has also seen limited application to leaves (Grønlund et al., 2012) and to the shoot apical meristem (Yadav et al., 2009), but has not been used before to study plant-pathogen interactions.

Here we used FACS to isolate haustoriated (Hpa-proximal) and non-haustoriated (Hpa-distal) cells from Hpa Noks1 inoculated Arabidopsis seedlings using the Hpa-responsive transgene ProDMR6:GFP at two time points. We demonstrated that the FACS-isolated cells can be used for transcriptional analysis, and identified 278 transcripts that are differentially expressed between the cell types, relative to uninfected controls or between the two time points. Included in these transcripts were many novel responses which may give us new insight into how infection-site-specific events may influence the outcome of downy mildew infection in Arabidopsis.

# Materials and Methods

#### Plant Material and Growth Conditions

A 2.5 kb fragment of the DMR6 [At5g24530, Downy Mildew Resistant 6 (van Damme et al., 2008)] promoter was PCR-amplified from Arabidopsis (ecotype Col-0) using the primers proDMR-F (AAAAAGCAGGCTTCACCGACTCTGTCTGAGTCTGAAGTCCCAAACCATG) and proDMR-R (CAAGAAAGCTGGGTGCCGCCATTTGATGTCAGAAAATTGAAGAAG), followed by a second amplification with pAttB1 (GGGGACAAGTTTGTACAAAAAAGCAGGCT) and pAttB2 (GGGGACCACTTTGTACAAGAAAGCTGGGT), and cloned into the pDONRZeo plasmid (Invitrogen). The entry clone was then recombined with the binary vector pBGWFS7 (Karimi et al., 2002). The resulting plasmid was then introduced into Agrobacterium tumefaciens strain GV3101. Arabidopsis thaliana Col-0 plants were transformed using the Agrobacterium-mediated floral dipping technique (Clough and Bent, 1998), and successful transformant seeds were selected on BASTA. Homozygous T<sub>3</sub> plants with single insertions were used for all experiments. ProDMR6:GFP and Col-0 seeds were stratified for a minimum of 24 h before sowing onto soil, and were loosely covered with plastic film to retain moisture for the first 4 days after sowing. Plants were grown in a growth chamber (Weiss Technik, Vejle, Denmark) at 20°C with 10 h of light. The whole experiment was carried out in triplicate.

#### Hyaloperonospora arabidopsidis (Hpa) Propagation and Inoculation

Hpa isolate Noks1 (Rehmany et al., 2005) was maintained on Arabidopsis Col-0 by weekly transfer to 7-day-old seedlings. Inoculum was collected from seedlings and sprayed at a concentration of 30,000–60,000 spores ml<sup>−1</sup> onto new hosts according to Tomé et al. (2014). Spores were applied to 7-day-old seedlings carrying the ProDMR6:GFP transgene, or Col-0 wild type. These plants were then placed in water-tight propagator trays and incubated in a growth chamber (Weiss Technik, Vejle, Denmark) at 18°C with 10 h of light.

#### Imaging and Microscopy

Images were acquired using a Zeiss LSM 710 confocal microscope, in conjunction with the Zeiss ZEN software.

#### Protoplast Generation and Fluorescence Activated Cell Sorting (FACS)

Protoplasts were generated from seedling leaves according to Grønlund et al. (2012), but with the following alterations: (i) ProtectRNA and Actinomycin D were not used, (ii) vacuum infiltration was omitted, (iii) petri dishes were rotated on an orbital shaker for only 45–60 min, and (iv) only one wash and centrifugation step was performed. FACS was performed according to Grønlund et al. (2012), using a workspace derived from **Figure 3** of the publication. Cells were sorted directly into tubes containing 1 ml RLT cell lysis buffer (Qiagen) containing 1% β-mercaptoethanol, and the samples were then stored at −80°C.

#### RNA Extraction, cDNA Amplification, and Labeling

RNA was extracted using the RNeasy Plant Mini Kit according to the manufacturer's instructions (Qiagen). DNase treatment was performed on-column using TURBO DNase (Life Technologies), with the dose dependent on the approximate number of sorted cells in the sample, as per the manufacturer's instructions: GFP-positive samples, which typically contained ∼20,000 cells, were treated with one unit of TURBO DNase and incubated at 37°C for 20 min; all other samples, which contained >100,000 cells, were given a second equal round of DNase I treatment. cDNA was amplified using the Ovation Pico WTA System (NuGen), then labeled with Cy3 using the One-Color DNA Labeling Kit (NimbleGen) according to the manufacturer's instructions. RNA integrity was measured using a 2100 Bioanalyzer Picochip (Agilent). cDNA and Cy3-labeled cDNA were quantified using a NanoDrop Spectrophotometer (Thermo Scientific).

#### Microarray Hybridization and Data Normalization

Labeled cDNA samples were randomized and hybridized for 18 h on a 12x135k expression array custom designed for the TAIR10 A. thaliana genome annotation (Design ID OID37507; see GEO GSE58046, NimbleGen), then the arrays were washed, dried and scanned according to the manufacturer's instructions. The scanned microarray images were imported into DEVA software, and the data were output as raw .xys files. The data were then imported into R (R Development Core Team, 2008). The Robust Multichip Average (RMA) algorithm was used to normalize the data, taking outlier probes into account, and to summarize expression at the transcript level using median polish (Irizarry et al., 2003). All raw and normalized microarray data have been deposited in GEO (GSE67100).
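The median polish step used for transcript-level summarization can be illustrated with a minimal Python sketch. It is not the R implementation used in the study and omits the background correction and quantile normalization that RMA also performs. The function name, the iteration limit, the tolerance, and the toy log2 intensities (rows are probes, columns are arrays) are assumptions made for illustration.

```python
from statistics import median

def median_polish_summary(log_intensity, max_iter=10, tol=1e-6):
    """Summarize a probes-by-arrays matrix into one value per array.

    Row (probe) and column (array) medians are removed alternately; the
    summary for each array is the overall level plus its column effect.
    """
    residual = [[float(v) for v in row] for row in log_intensity]
    n_rows, n_cols = len(residual), len(residual[0])
    overall = 0.0
    row_effect = [0.0] * n_rows
    col_effect = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row medians and re-center the column effects.
        row_med = [median(row) for row in residual]
        for i in range(n_rows):
            residual[i] = [v - row_med[i] for v in residual[i]]
            row_effect[i] += row_med[i]
        shift = median(col_effect)
        col_effect = [c - shift for c in col_effect]
        overall += shift
        # Sweep out column medians and re-center the row effects.
        col_med = [median(residual[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for i in range(n_rows):
            residual[i] = [v - col_med[j] for j, v in enumerate(residual[i])]
        col_effect = [c + m for c, m in zip(col_effect, col_med)]
        shift = median(row_effect)
        row_effect = [r - shift for r in row_effect]
        overall += shift
        if max(map(abs, row_med)) < tol and max(map(abs, col_med)) < tol:
            break
    return [overall + c for c in col_effect]

# Hypothetical probe set: three probes (rows) measured on three arrays.
probe_set = [
    [8.0, 9.0, 10.0],
    [9.0, 10.0, 11.0],
    [7.0, 8.0, 9.0],
]
summary = median_polish_summary(probe_set)
print(summary)
```

Using medians rather than means keeps a single outlying probe from dominating the summary value for an array.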

#### Microarray Data Analysis

Linear Models for Microarray Data (the limma package in R) was used to fit linear models to pairs of samples (**Figure S3**), identifying genes that contrasted the most between the experimental pairs (Smyth, 2004). Transcripts were considered differentially expressed if they showed an absolute log<sub>2</sub> fold-change of ≥0.75 [a threshold previously used by Huibers et al. (2009)] and a Benjamini-Hochberg adjusted p ≤ 0.05 in at least one comparison. Published data were processed in the same way, except for the data from Huibers et al. (2009), which had been previously normalized. The Cytoscape plugin BiNGO was used to identify gene ontology (GO) terms overrepresented in transcript groups, using the default settings and the "GO full" database, and a significance threshold of Benjamini-Hochberg adjusted p ≤ 0.05. For grouping, transcripts found to be differentially expressed in any pairwise comparison between sample types at either 5 or 7 days post-inoculation (d.p.i.) were placed in order of their ratio of proximal change to distal change, measured as $\log_2(\mathrm{Expression}_{\mathrm{Proximal}}/\mathrm{Expression}_{\mathrm{Control}}) / \log_2(\mathrm{Expression}_{\mathrm{Distal}}/\mathrm{Expression}_{\mathrm{Control}})$, and divided evenly into the final number of groups.
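The selection rule described here (absolute log2 fold-change ≥0.75 and Benjamini-Hochberg adjusted p ≤ 0.05) can be sketched in Python as follows. The sketch covers only the multiple-testing adjustment and the thresholding, not the moderated statistics computed by limma; the function names and the example p-values and fold-changes are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def is_differentially_expressed(log2_fc, adj_p, fc_cutoff=0.75, p_cutoff=0.05):
    """Apply the fold-change and adjusted p-value thresholds from the text."""
    return abs(log2_fc) >= fc_cutoff and adj_p <= p_cutoff

# Hypothetical results for four transcripts in one pairwise comparison.
raw_p = [0.001, 0.02, 0.03, 0.5]
log2_fc = [2.8, 0.5, -1.2, 3.0]
adj_p = benjamini_hochberg(raw_p)
calls = [is_differentially_expressed(fc, p) for fc, p in zip(log2_fc, adj_p)]
print(adj_p, calls)
```

In this toy example the second transcript fails the fold-change cutoff and the fourth fails the adjusted p-value cutoff, so only the first and third are called. In the study a transcript counted as differentially expressed if it met both criteria in at least one comparison.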

# Results

#### ProDMR6::GFP as a Fluorescent Reporter for Host Cells Containing Hyaloperonospora arabidopsidis Haustoria

In order to identify transcriptional events in A. thaliana that occur specifically in cells containing Hpa haustoria, we developed a method of using FACS to isolate haustoriated cells and non-haustoriated cells from Hpa-infected plants. This required a fluorescent reporter that is expressed specifically in haustoriated cells. van Damme et al. (2008) recently characterized the Arabidopsis gene Downy Mildew Resistant 6 (DMR6), which encodes a 2-oxoglutarate (2OG)-Fe(II) oxygenase and is required for susceptibility to Hpa isolate Waco9. By expressing a GUS reporter under the control of the DMR6 promoter, they demonstrated that DMR6 expression is induced specifically in haustoriated cells, in both compatible and incompatible interactions with Hpa (van Damme et al., 2008). In order to assess ProDMR6 as a marker for isolating haustoriated cells using FACS, a construct containing 2.5 kb upstream of DMR6 was fused to the GFP coding sequence and used to transform Arabidopsis Col-0 plants.

To investigate ProDMR6::GFP expression we screened 10-to-14-day-old T<sub>3</sub> seedlings of four independent transformants using confocal microscopy. GFP expression was observed consistently in all transformants upon inoculation with the compatible Hpa isolate Noks1, and all transgenic lines behaved as Col-0 in terms of growth and development. Although van Damme et al. (2008) reported expression of ProDMR6::GUS as early as 2 d.p.i., we observed little or no fluorescence at 3 d.p.i. (**Figure 1A**). Instead, we observed strong fluorescence at 5 (**Figure 1B**) and 7 d.p.i. (**Figure 1C**). Fluorescent cells were observed adjacent to each other, suggestive of the pattern of Hpa infection (**Figure S1**), and this was confirmed to correlate with the visibility of conidiophores on the cotyledon surface at 7 d.p.i. (data not shown). We did not observe green fluorescence in Noks1-infected Col-0 seedlings (**Figure 1D**), or uninoculated ProDMR6::GFP seedlings (**Figure 1E**), at any time point, confirming that the GFP was expressed specifically upon Hpa infection in the marker line.

#### Fluorescence Activated Cell Sorting to Isolate Haustoriated and Non-haustoriated Cells from Infected Tissues

Having isolated an effective and specific marker of Hpa haustoriated cells, we designed an experiment allowing us to study the transcriptional response of Arabidopsis to Hpa Noks1 on a spatial scale (**Figure 2**). Seven-day-old ProDMR6::GFP seedlings were inoculated with Hpa isolate Noks1 and cotyledons sampled at 5 and 7 d.p.i. in three biological replicates. We chose 5 d.p.i., as this was when we could first observe GFP expression under the microscope, and 7 d.p.i., as it represents a point where the Hpa life cycle has completed (Coates and Beynon, 2010). Protoplasts were generated from these samples and cells sorted using FACS to obtain two cell populations: GFP-expressing cells, representing the haustoriated cell population and hereon referred to as "Hpa-proximal cells," and non-GFP-expressing cells, representing the non-haustoriated cell population from infected plants, hereon referred to as "Hpa-distal cells." As a control and baseline for comparison, uninfected ProDMR6::GFP seedlings of the same age were also sampled at both time points, and protoplasts were generated and sorted through FACS.

Protoplasts were generated using a recent protocol for FACS of leaf cells by Grønlund et al. (2012). Immediately prior to FACS, a small subset of the protoplasts derived from infected seedlings expressed GFP, consistent with the proportion of GFP-expressing cells in infected seedling leaves. This GFP expression was detected upon FACS analysis (**Figure S2**). In contrast, GFP-expressing cells were not observed in protoplasts from uninfected seedlings prior to FACS. From the 18 protoplast samples collected (three cell populations × two time points × three biological replicates), RNA was extracted, converted to cDNA, labeled, and hybridized to whole genome oligonucleotide Arabidopsis microarrays.

#### Differential Expression of Genes in Hpa-proximal and Hpa-distal Cells Gives Insight into Local and Systemic Responses to the Pathogen

Microarray gene expression was summarized at the transcript level and normalized using the RMA algorithm (Irizarry et al., 2003) (**Table S1**). In order to identify transcripts which were differentially expressed (DE) in Hpa-proximal cells, and to differentiate these from systemic signaling observed in cells distal to the infection site, we performed pairwise comparisons (**Figure S3**) across cell populations and time points using Linear Models for Microarray Data (LIMMA) (Smyth, 2004). A total of 278 transcripts were identified as differentially expressed at a cutoff of absolute log<sub>2</sub> fold-change ≥0.75 and a Benjamini-Hochberg adjusted p-value ≤ 0.05 in at least one pairwise comparison (**Table S2**).

As a confirmation that the cells isolated by FACS were those that were Hpa-associated, among the 278 DE transcripts was DMR6 (At5g24530), which showed ∼seven-fold upregulation in Hpa-proximal cells relative to uninfected control cells at 7 d.p.i. (Benjamini-Hochberg adjusted p = 0.035), and at 5 d.p.i. (Benjamini-Hochberg adjusted p = 0.061). We also observed upregulation of several other genes which have been previously implicated in the Hpa response, or as more general regulators of plant-pathogen interactions. These include Impaired Oomycete Susceptibility 1 (IOS1, At1g51800), Pathogenesis-Related 4 (PR4, At3g04720), Pathogen and Circadian Controlled 1 (PCC1, At3g22231), Flg22-induced Receptor-like Kinase 1 (FLK1, At2g19190) and WRKY8 (At5g46350) (**Table S2**).

Of the 278 total DE transcripts, 81 and 231 transcripts were DE between the three cell types at 5 d.p.i. and 7 d.p.i. respectively, with 35 transcripts being DE over both time points (**Figure 3A**). A total of 276 transcripts were DE between Hpa-proximal cells and uninfected control cells from the same time point, with 37 transcripts found to be DE between Hpa-proximal and Hpa-distal cells at the same time point (**Figure 3B**). A single transcript, At2g18660.1 (Plant Natriuretic Peptide A, PNP-A), was found to be DE between GFP-negative (Hpa-distal) cells from infected plants and cells from uninfected plants. Together with the detection of previously characterized Hpa-responsive genes, the observation that the vast majority of transcriptional responses are identified in the Hpa-proximal populations, rather than the Hpa-distal populations, from infected plants confirms that Hpa-responsive cells can be isolated using FACS.

In order to discover what types of genes are responding locally vs. systemically, i.e., specifically in Hpa-proximal cells vs. more generally in both Hpa-proximal and Hpa-distal cells, the 278 DE transcripts were grouped according to the localization of their response at each of the time points, and these groups were searched for overrepresentation of GO terms using the Cytoscape plugin BiNGO (Benjamini-Hochberg adjusted p ≤ 0.05; Maere et al., 2005; **Figure 4**; **Table S3**). To take a more granular view of response location, we chose to differentiate local and systemic genes based on the ratio of their Hpa-proximal response (log<sub>2</sub> fold-change relative to uninfected control) to their Hpa-distal response; for a list of the genes within each group, see **Table 1** and **Table S2**.
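The ordering of transcripts by the ratio of proximal to distal response, followed by an even split into groups, can be sketched in Python. This is an illustration under stated assumptions rather than the authors' script: the function names, transcript labels, and expression values are hypothetical, and returning a signed infinity when the distal response is exactly zero is a choice made here because the text does not say how that case was handled.

```python
import math

def proximal_distal_ratio(proximal, distal, control):
    """Ratio of log2(proximal/control) to log2(distal/control)."""
    numerator = math.log2(proximal / control)
    denominator = math.log2(distal / control)
    if denominator == 0:
        # Assumption: no distal change is treated as a fully local response.
        return math.copysign(math.inf, numerator)
    return numerator / denominator

def split_evenly(items, n_groups):
    """Divide an ordered list into n_groups of near-equal size."""
    size, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        groups.append(items[start:end])
        start = end
    return groups

# Hypothetical expression values as (proximal, distal, control).
expression = {
    "t1": (800.0, 200.0, 100.0),
    "t2": (400.0, 400.0, 100.0),
    "t3": (1600.0, 200.0, 100.0),
    "t4": (200.0, 400.0, 100.0),
}
ratios = {t: proximal_distal_ratio(*v) for t, v in expression.items()}
ordered = sorted(ratios, key=ratios.get)
groups = split_evenly(ordered, 2)
print(ratios, groups)
```

A ratio near 1 corresponds to a similar response in proximal and distal cells (systemic), whereas a large ratio indicates a response concentrated in the Hpa-proximal cells (localized).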

The 81 transcripts DE at 5 d.p.i. were split into three groups (**Figure 4A**). For upregulated genes, we were interested in broadly comparing local and systemic responses, so we split the transcripts found to be upregulated at this time point into two groups: one representing systemic induction (almost equal proximal and distal response), and one representing localized induction (strong proximal response, weak distal response). The systemic induction group showed overrepresentation of pathology-related GO terms such as "response to other organism" and "defense response," as well as "systemic acquired resistance," consistent with the systemic expression pattern of the genes in this group. This suggests that, despite the lack of genes DE in Hpa-distal cells relative to the control, this population of cells is capturing systemic signaling in response to Hpa. Genes involved in lipid transport and localization were also overrepresented in this group. Individual genes represented in this group include Enhanced Disease Susceptibility to Erysiphe orontii (EDS16, At1g74710) and AVRPPHB Susceptible 3 (PBS3, At5g13320), which have both been implicated in salicylic acid accumulation in plant defense (Wildermuth et al., 2001; Nobuta et al., 2007), and Lysine Histidine Transporter 1 (LHT1, At5g40780), which has been shown to influence plant defense in a salicylic acid-mediated manner (Liu et al., 2010). The defense genes Pathogenesis-Related 4 (PR4, At3g04720) and Pathogen and Circadian Controlled 1 (PCC1, At3g22231) also fell into this group.

In contrast, the localized induction group did not show overrepresentation of any GO terms, suggesting a diversity of genes within this group. Individual genes represented in this group include the transcription factor WRKY29 (At4g23550), a terpene synthase (TPS4, At1g61120) and a peroxidase superfamily protein (At5g39580). The group also includes the cysteine-rich receptor-like protein kinases ARCK1 (At1g11890) and CRK26 (At4g38830), and a monodehydroascorbate reductase (AtMDAR3, At3g09940) that is crucial for colonization of Arabidopsis by the mutualistic fungus Piriformospora indica (Vadassery et al., 2009).

Due to the small number of transcripts at this time point, downregulated genes could not effectively be split into "systemic" and "local" responding and were thus considered as one group. This group showed overrepresentation for only one GO term: "cytoskeletal part." Downregulated genes include Callose Synthase 3 (At5g13000), peroxidase 12 (At1g71695) and a pathogenesis-related thaumatin superfamily protein (At1g73620).

The larger number (231) of transcripts DE at 7 d.p.i. allowed us to split them into more groups (**Figure 4B**). Transcripts upregulated at this time point were split into three gene groups: systemic induction, local induction and infection-site-specific induction, representing increasing localization of their response, such that genes in the infection-site-specific induction group showed a negligible Hpa-distal response. As with the systemic induction group at 5 d.p.i., the systemic induction group at 7 d.p.i. showed overrepresentation for the GO terms "lipid transport," "systemic acquired resistance" and a number of generic defense-related terms such as "defense response." The GO terms "response to salicylic acid stimulus" and "response to stress" were additionally overrepresented in this group. Individual genes within this group include Pathogenesis-Related 4 (PR4, At3g04720) and 5 (PR5, At1g75040), WRKY59 (At2g21900), WRKY62 (At5g01900) and WRKY8 (At5g46350), Accelerated Cell Death 6 (ACD6, At4g14400), Plant Natriuretic Peptide A (PNP-A, At2g18660) and Late Upregulated in Response to Hyaloperonospora parasitica (LURP1, At2g14560). Surprisingly, DMR6 fell into this group, despite being used as our marker for Hpa-local cells. This could be due to weaker, more systemic signaling of DMR6 that was beyond detection using a GFP marker. As this data set is enriched for responses predominantly in Hpa-local cells, this may also be an indication that even the most systemic responses captured remain fairly localized to the infection site.

The local induction group showed similar GO term enrichment to the systemic induction group at 7 d.p.i., such as the pathology-related terms "response to other organism" and "defense response" and the more generic "response to stress." A number of receptor-like proteins were present in this group, including Flg22-induced Receptor-like Kinase 1 (FRK1, At2g19190), Cysteine-rich Receptor-like Kinase 13 (CRK13, At4g23210), Receptor Like Proteins 9 (AtRLP9, At1g58190) and 52 (AtRLP52, At5g25910), a putative CC-NBS-LRR class disease resistance protein (At1g58400) and a putative TIR-NBS class disease resistance protein (At4g09420). WRKY47 (At4g01720), WRKY72 (At5g15130) and WRKY38 (At5g22570) were also in this group.

The infection-site-specific induction group, representing the most localized genes upregulated at 7 d.p.i., showed overrepresentation of only the GO term "oxidoreductase activity, acting on the CH-NH group of donors." Genes in this group include the transcription factors WRKY36 (At1g69810), NAC3 (At3g29035) and NAC087 (At5g18270), as well as an RNA-binding Suppressor-of-White-APricot (SWAP) protein (At5g06520).

The larger number of downregulated genes at 7 d.p.i., relative to 5 d.p.i., allowed us to split them into two groups representing systemic and local repression. Genes that showed systemic repression were overrepresented for a number of cellular functions such as "cytoskeletal part," "organelle organization," "cell cycle process," and "nucleoside-triphosphatase activity." Genes in this group include a histone H1/H5 family member (At1g48620), metacaspase 3 (MC3, At5g64240) and A. thaliana Kinesins 1 (ATK1, At4g21270) and 12B (ATK12B, At3g23670).

Finally, there was no overrepresentation of GO terms in the localized repression group. Genes in this group included peroxidase 12 (PER12, At1g71695), the receptor protein kinase ERECTA (At2g26330), microtubule-associated protein 65-4 (MAP65-4, At3g60840) and Starch Synthase 3 (ATSS3, At1g11720).
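The GO-term overrepresentation results reported above rest on asking whether a term occurs in a gene group more often than expected from its frequency in the genome. A minimal Python sketch of that calculation follows, assuming a hypergeometric test; the text does not state which test BiNGO was configured to use, and the function name and example counts are hypothetical. Only the genome size of 27,594 GO-annotated genes is taken from the Table S3 legend.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    N: GO-annotated genes in the genome
    K: genes in the genome annotated with the term of interest
    n: genes in the group being tested
    k: genes in the group annotated with the term
    """
    total = comb(N, n)
    # Sum the probability of seeing k, k+1, ... annotated genes in the group.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


# Hypothetical counts for one term in one group (only N comes from the source).
p_raw = hypergeom_upper_tail(k=5, n=40, K=300, N=27594)
```

Each raw p-value obtained this way would still need multiple-testing correction across all terms tested before applying the adjusted p ≤ 0.05 threshold.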

#### Comparison with Published Data Sets

To ask if the FACS approach identifies novel genes in the Arabidopsis response to Hpa infection, we compared our list of differentially expressed genes to previously published microarray data from Huibers et al. (2009), Wang et al. (2011a) and Hok et al. (2011). Data from these publications were retrieved from the relevant public databases and processed in a similar manner to the data we present here, i.e., differentially expressed genes were identified by making pairwise contrasts in LIMMA. From each published dataset we considered only the samples and direct comparisons that were most relevant to our experimental design. Huibers et al. (2009) used two-color CATMA arrays to profile expression in a compatible Arabidopsis-Hpa interaction (Landsberg erecta (Ler) and Cala2) and an incompatible interaction (Ler and Waco9), relative to uninfected controls, at 3 d.p.i. Wang et al. (2011a) performed a 6-day timecourse of infection with the incompatible strain Emwa1, in Col-0 and the susceptible mutant rpp4. Finally, Hok et al. (2011) measured gene expression in Arabidopsis Wassilewskija (WS) seedlings after mock treatment, and treatment with the compatible isolate Emwa, at an early time point (8 and 24 h post-inoculation) and at a late time point (4 and 6 d.p.i.). For the former two datasets, we considered only the Cala2 interaction and the rpp4 interaction, respectively, as they represented compatible interactions that result in a similar outcome to the Col-0 and Noks1 interaction, i.e., completion of the Hpa lifecycle. For the latter two datasets, which have multiple time points, we considered all time points so as to capture as much of the Hpa response as possible.

Our 278 differentially expressed transcripts represent 267 different genes, 128 of which could be detected in the previously published datasets based on our analysis (**Figure 5A**). The remaining 139 genes are thus novel Hpa responses identified by our FACS-based, cell-response-type-specific approach. However, ∼5300 transcripts were previously detected as differentially expressed in one or more of the datasets outlined above, but were not differentially expressed in our dataset. A comparison between previous datasets shows that only a small proportion of these are common between datasets (**Figure 5B**), suggesting that these DE genes arose from differences in experimental design or in the Hpa strain used, or may otherwise be false positives.
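The split of the 267 genes into 128 previously detected and 139 novel genes (128 + 139 = 267) is a set comparison against the union of the published gene lists. A minimal Python sketch of that bookkeeping is given below; the function name and the toy identifiers are hypothetical and do not correspond to genes or datasets from the study.

```python
def split_known_and_novel(our_genes: set, published_sets: dict) -> tuple:
    """Return (known, novel) relative to the union of all published DE gene sets."""
    previously_reported = set().union(*published_sets.values())
    known = our_genes & previously_reported   # detected here and in any earlier study
    novel = our_genes - previously_reported   # detected here only
    return known, novel


# Toy identifiers for illustration only.
ours = {"g1", "g2", "g3", "g4"}
published = {
    "study_a": {"g1", "g9"},
    "study_b": {"g1", "g2"},
    "study_c": set(),
}
known, novel = split_known_and_novel(ours, published)
```

The two returned sets are disjoint and together cover the input list, which is why the counts reported in the text sum to the total number of genes.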

#### TABLE 1 | Differentially expressed genes grouped according to the direction and localization of their response at 5 and 7 d.p.i.

At 5 and 7 d.p.i., transcripts were classified as either "upregulated" or "downregulated," then split into groups according to the ratio of a Hpa-local response (measured as the log<sub>2</sub> fold-change between Hpa-local cells and uninfected cells at that time point) and a Hpa-distal response (the fold-change between Hpa-distal and uninfected cells). See Figure 4 for a graphical representation of their expression pattern, Table S2 for expression values, and Table S3 for GO term analysis details.

In order to compare the sensitivity and specificity of our approach to the previously published data, we compared the average fold-change and Benjamini-Hochberg adjusted p-values for all genes for a number of pairwise comparisons across different datasets (**Figures 5C–F**). We found that our dataset had a larger proportion of genes with significant (≥0.75) log<sub>2</sub> fold changes relative to an uninfected control than in the previously published datasets, and those identified as DE showed DE of a higher magnitude, highlighting that by specifically analyzing Hpa-proximal cells, we observe greater sensitivity in expression changes during infection (compare the width of plots in **Figures 5C,D** to **Figures 5E,F**). Conversely, the published datasets had a larger proportion of genes within the significance threshold of adjusted p ≤ 0.05, but with almost-zero fold-changes (**Figures 5E,F**). This suggests that, relative to the published datasets, although our data shows higher sensitivity, in this instance noise may be a limiting factor in Hpa-responsive gene detection.
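The two thresholds discussed here, a log<sub>2</sub> fold-change of at least 0.75 and a Benjamini-Hochberg adjusted p ≤ 0.05, can be illustrated with a short Python sketch. The adjustment shown is the standard Benjamini-Hochberg step-up procedure; combining both thresholds into a single filter, the function names and the example values are assumptions made for illustration, not a description of the LIMMA workflow used in the study.

```python
def benjamini_hochberg(pvalues: list) -> list:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_de(log2_fc: list, pvalues: list, min_abs_lfc: float = 0.75,
            alpha: float = 0.05) -> list:
    """Flag genes passing both the fold-change and the adjusted p-value cut-offs."""
    adj = benjamini_hochberg(pvalues)
    return [abs(l) >= min_abs_lfc and q <= alpha for l, q in zip(log2_fc, adj)]


# Illustrative values only; these are not measurements from the study.
flags = call_de([1.0, 0.1, -0.9, 2.0], [0.01, 0.04, 0.03, 0.005])
```

A gene with a small adjusted p-value but a near-zero fold-change, as described for the published datasets, fails the first condition and is not flagged.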

#### Discussion

Here we present the novel use of FACS to isolate A. thaliana cells infected by the downy mildew pathogen Hpa. To our knowledge, this is the first use of FACS to specifically isolate plant cells responding to infection, although this has previously been achieved in animal systems (Richman et al., 2002; Thöne et al., 2007).

We demonstrate that cells isolated by FACS of Hpa-infected seedlings can be used for transcriptomic analysis of the local vs. systemic response to Hpa infection. Consistent with expectations that the majority of transcriptional events would occur at the infection site, all differentially expressed genes were either significantly upregulated or downregulated in the Hpa-proximal cell population, over time, or relative to an uninfected control or Hpa-distal cells from infected plants at the same time point. In contrast, only a single transcript showed significant differential expression between Hpa-distal cells and uninfected control cells. The identity of this transcript as Plant Natriuretic Peptide A (PNP-A, At2g18660) is reassuring, as PNP-A has been previously described as a secreted signal working systemically during both abiotic and biotic stress (Wang et al., 2011b). Ideally we would have identified further genes to be significantly differentially expressed in the Hpa-distal population, representing systemic signaling. However, as the Hpa-distal cell population was simply a collection of cells not expressing the haustoriated cell marker ProDMR6::GFP, we might expect this population to be heterogeneous, containing cells at varying proximity to the pathogen, many of which may not be responding to the pathogen at all. To address this potential dilution of systemic responses, we considered that many of the genes differentially expressed in Hpa-proximal cells may also be responding more systemically, and grouped these into the "Systemic Induction" and "Systemic Repression" groups in **Figure 4**. Several of the genes and GO terms associated with these groups are consistent with what is already known about defense signaling in Arabidopsis, such as the role of salicylic acid and salicylic acid-responsive gene expression in systemic acquired resistance (Durrant and Dong, 2004). However, no firm conclusions can currently be made from the analysis in **Figure 4** and further experiments are needed to validate the localization of these responses, and to unravel their significance in the Hpa-Arabidopsis interaction.

The use of FACS to study cells specifically at the site of infection has the potential to increase the sensitivity of transcriptomic or other high-throughput analyses, such as proteomics. We have shown that in general, the magnitude of up- or down-regulation of genes is greater in our FACS-isolated Hpa-proximal cells than in previous whole-leaf datasets, relative to uninfected controls (**Figures 5C–F**). We have also identified a number of genes that are differentially expressed in Hpa-proximal cells and were not previously detected in microarray studies (**Figure 5A**). However, we have also failed to detect many genes previously associated with Hpa infection. While many of these could potentially be attributed to differences in experimental design or the Hpa isolate used, it seems that noise is a major contributing factor. Greater optimization of the FACS protocol will hopefully help to overcome this in the future.

A crucial step in the use of FACS for studying local vs. systemic signaling during Arabidopsis infection will be the development of new cell markers. A key challenge, particularly for the Hpa pathosystem, is that the pathogen and the proteins that it delivers into host cells cannot currently be fluorescently labeled through genetic manipulation. As such, isolation of Hpa-contacting cells relies entirely on pathogen-responsive Arabidopsis promoters, which may not be induced immediately and are likely to show changes in expression over the course of infection. This seems to be an issue with the DMR6 promoter, from which we could not detect GFP expression until 5 d.p.i. This prevented us from studying earlier stages of infection, which is unfortunate as it is at these stages that the use of FACS will be most informative, as the limited spread of the pathogen precludes the use of whole tissue microarrays. An additional caveat, more relevant to this dataset, is that at the later time points (e.g., 5 and 7 d.p.i.) recently haustoriated cells may not fluoresce, and may instead be interpreted as Hpa-distal cells. Characterization of new, early-induced haustoriated cell markers, as well as an in-depth study of their expression patterns, will be crucial in developing a refined FACS approach. Furthermore, to avoid dilution of the systemic response, one could use a second fluorophore to mark cells within a certain range of the pathogen. This could potentially be complex as signals and responses spread over space. In addition to developing new methods to study pathogen signaling at a cell-specific resolution, we must in turn develop theoretical methods to understand the data being generated, and perhaps take into account some of the assumptions and limitations of the FACS approach. As these methods develop, we can better understand the events that occur specifically at the Arabidopsis-Hpa interface, and how these might influence more widespread signaling in the plant.

# Author Contributions

TC, MG, and JB contributed to design of the experiment and interpretation of the data. VC generated the ProDMR6::GFP construct and plant lines, and TC and VC performed microscopy of these plants. Protoplast generation, FACS and microarray analysis were performed by TC. All authors wrote the manuscript.

# Acknowledgments

We thank D. Tomé and J. Steinbrenner for maintenance of Hpa cultures and inoculation of experimental plants. We also thank D. Patel and J. Hulsmans for technical support with FACS. This work was supported by a BBSRC New Investigator grant BB/H109502/1 to MG, a BBSRC grant BB/F001347 to JB and the EPSRC/BBSRC funded Warwick Systems Biology Doctoral Training Centre to TLRC. All microarray data have been deposited in GEO (Series GSE58046), released upon publication.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00527

Figure S1 | Confocal microscopy images of Hyaloperonospora arabidopsidis (Hpa) infection marker ProDMR6::GFP expression in an Arabidopsis cotyledon, 7 d.p.i. with compatible Hpa isolate Noks1. (A,B) Expression of the marker follows the pattern of pathogen spread across the cotyledon. (C–E) Cells expressing the marker appear to contain haustoria.

Figure S2 | Fluorescence expression profiles for cells analyzed and sorted with FACS. (A,B) Dot plots of output from the 580/30 nm vs. 530/40 nm bandpass filters on the BD Influx, using a workspace derived from Grønlund et al. (2012). (A) Protoplasts generated from uninfected, 14-day-old ProDMR6::GFP seedlings, where cells were collected exclusively from the low 580/low 530 (GFP-negative) gate. (B) Protoplasts generated from ProDMR6::GFP inoculated with Hpa isolate Noks1, at 7 d.p.i., where cells were collected from both the low 580/low 530 (GFP-negative) and low 580/high 530 (GFP-positive, ∼0.5%) gates. The high 580/low 530 gate represents cell debris.

Figure S3 | Pair-wise comparisons used to identify differentially expressed genes. Expression at 7 d.p.i. and 5 d.p.i. was compared for each cell type. Genes significantly differentially expressed in either Hpa-distal or Hpa-proximal cells, but not uninfected control cells, over time were considered as differentially expressed. To investigate differential expression between cell types, six comparisons were performed: Hpa-distal cells vs. uninfected control cells, Hpa-proximal cells vs. uninfected control cells, and Hpa-proximal cells vs. Hpa-distal cells, independently for each time point.

Table S1 | Normalized log2 expression data for uninfected control, Hpa-proximal, and Hpa-distal cells at 5 and 7 d.p.i. Gene IDs, names, and descriptions are given based on the TAIR10 genome annotation. Column names are given in the format "Cell type, time point, replicate," where "Control5a" is control cells at 5 d.p.i., replicate 1.

Table S2 | Transcripts differentially expressed between uninfected control, Hpa-proximal, and Hpa-distal cells at 5 and 7 d.p.i. Gene IDs, names and descriptions are given based on the TAIR10 genome annotation. Mean log2 expression values are given in columns D-I–the names of these columns are given in the format "Cell type, time point," where "Control5" is control cells at 5 d.p.i. Columns J-M are fold-changes for distal and proximal cells at 5 d.p.i. and 7 d.p.i., relative to uninfected controls at the respective time points. Columns N ("Group@5dpi") and O ("Group@7dpi") give the names of the groups which that transcript belonged to (see Figure 4), or state "Not DE" if not differentially expressed at that time point. Column P ("Newly detected?") states "Yes" or "No" as to whether the gene is newly detected in our dataset, based on the analysis in Figure 5A.

Table S3 | Overrepresentation of Gene Ontology (GO) terms in Hpa-responsive gene groups. Hpa-responsive genes were grouped according to the localization of their response at 5 and 7 d.p.i. (see Table S2, Figure 4). All GO terms found to be overrepresented (Benjamini-Hochberg adjusted p ≤ 0.05) in the groups are included, other than those that were only represented by a single gene in a group, which were excluded. The percentage frequency of each GO term in the cluster and in the Arabidopsis thaliana genome (of the 27,594 GO-annotated genes), as well as a list of genes annotated with the GO term are included.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Coker, Cevik, Beynon and Gifford. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Identification of a core set of rhizobial infection genes using data from single cell-types

*Da-Song Chen<sup>1</sup>, Cheng-Wu Liu<sup>2</sup>, Sonali Roy<sup>2</sup>, Donna Cousins<sup>2</sup>, Nicola Stacey<sup>2</sup> and Jeremy D. Murray<sup>2</sup>\**

*<sup>1</sup> State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China, <sup>2</sup> John Innes Centre, Department of Cell and Developmental Biology, Norfolk, UK*

Genome-wide expression studies on nodulation have varied in their scale from entire root systems to dissected nodules or root sections containing nodule primordia (NP). More recently, efforts have focused on developing methods for isolation of root hairs from infected plants and the application of laser-capture microdissection technology to nodules. Here we analyze two published data sets to identify a core set of infection genes that are expressed in the nodule and in root hairs during infection. Among the genes identified were those encoding phenylpropanoid biosynthesis enzymes including *Chalcone-O-Methyltransferase*, which is required for the production of the potent Nod gene inducer 4,4′-dihydroxy-2′-methoxychalcone. A promoter-GUS analysis in transgenic hairy roots for two genes encoding *Chalcone-O-Methyltransferase* isoforms revealed their expression in rhizobially infected root hairs and the nodule infection zone but not in the nitrogen fixation zone. We also describe a group of *Rhizobially Induced Peroxidases* whose expression overlaps with the production of superoxide in rhizobially infected root hairs and in nodules and roots. Finally, we identify a cohort of co-regulated transcription factors as candidate regulators of these processes.

Keywords: infection threads, methoxychalcone, medicarpin, CCAAT-box, infection zone, nod genes, Nod factors, nodulation

#### Introduction

Most legumes are able to interact with soil bacteria called rhizobia to form special root structures called nodules within which the bacteria are able to fix atmospheric nitrogen, a process which is fueled by host photosynthate. In *Medicago truncatula*, as with many legumes, the rhizobia infect the host root by colonizing special intracellular tubular invaginations called infection threads that form first in root hairs and then later in mature nodules. This process begins with rhizobial attachment near the tip of growing root hairs which then curl and entrap a rhizobial microcolony in an infection pocket, also called an infection focus. The infection thread then extends as a tubular structure from this pocket through the length of the root hair, becoming colonized by rhizobia as it grows (Fournier et al., 2008). The process of infection thread formation is then repeated in the underlying cell layers allowing the rhizobia access to the inner root layers which have divided to form the NP. Root hair infection involves the upregulation of hundreds of genes that regulate several different processes including host-symbiont signaling and diverse developmental processes including cell growth and engagement of the cell cycle (Libault et al., 2010; Breakspear et al., 2014). As it matures, the nodule forms several developmental zones: an apical meristem (Zone I), an infection zone containing cells that form infection threads (ZII), a nitrogen fixing zone comprised of giant cells filled with endocytosed nitrogen-fixing rhizobia (ZIII), an interzone (IZ), and a senescent zone, where no nitrogen fixation takes place (ZIV). The process of infection, from the initial entry of the rhizobia into the infection threads up until their release into symbiosomes, requires constant communication between the host and symbiont. At the heart of the dialog is the production of bacterial signaling compounds called Nod factors. Nod factors are produced in response to plant flavonoids, phenylpropanoid compounds which are secreted by the roots into the rhizosphere (Peters et al., 1986; Redmond et al., 1986; Maxwell et al., 1989; Kape et al., 1992; Phillips et al., 1994; Zuanazzi et al., 1998; Subramanian et al., 2007). The secreted flavonoids induce the expression of rhizobial nod genes required for the production of Nod factors through activation of the transcription factor NodD (Mulligan and Long, 1985; Rossen et al., 1985). The flavonoids produced by a given host are often specific to certain symbionts, only activating Nod genes in nodulation-competent rhizobia (Györgypal et al., 1988). In *M. truncatula*, several flavonoids capable of inducing Nod genes in its symbiont *Sinorhizobium meliloti* have been identified and one of the most potent is 4,4′-dihydroxy-2′-methoxychalcone (Kapulnik et al., 1987; Maxwell et al., 1989; Orgambide et al., 1994). The enzyme Chalcone *O*-Methyltransferase (ChOMT) in the closely related *M. sativa* is required for production of this compound from isoliquiritigenin (4,2′,4′-trihydroxychalcone; Maxwell et al., 1992). Our recent study has shown that the *M. truncatula* ortholog, *ChOMT1*, and three other close homologs (*ChOMT2*, *ChOMT3*, and *ChOMT4*) were induced during infection of root hairs (Breakspear et al., 2014). Transcripts for two of these isoforms were also detected in the infection zone (Zone II) of mature nodules (Roux et al., 2014), but a spatio-temporal analysis of the expression of these genes during early infection is lacking.

#### *Edited by:*

*Marc Libault, University of Oklahoma, USA*

#### *Reviewed by:*

*Ulrike Mathesius, Australian National University, Australia Md Shakhawat Hossain, University of Missouri, USA*

#### *\*Correspondence:*

*Jeremy D. Murray, John Innes Centre, Department of Cell and Developmental Biology, Norwich Research Park, Norwich, Norfolk NR4 7UH, UK jeremy.murray@jic.ac.uk*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 12 May 2015 Accepted: 13 July 2015 Published: 28 July 2015*

#### *Citation:*

*Chen D-S, Liu C-W, Roy S, Cousins D, Stacey N and Murray JD (2015) Identification of a core set of rhizobial infection genes using data from single cell-types. Front. Plant Sci. 6:575. doi: 10.3389/fpls.2015.00575*

One of the first physiological events occurring during nodulation is the generation of reactive oxygen species (ROS; Cook et al., 1995; Ramu et al., 2002), which is coincident with enhanced flavonoid biosynthesis (van Brussel et al., 1990; Mathesius et al., 1998; Hassan and Mathesius, 2012). Earlier studies on the production and breakdown of ROS by rhizobia suggest that ROS levels must be maintained between certain limits for the symbiosis to be successful (Santos et al., 2000; Jamet et al., 2003, 2007). The expression of *Rhizobial Induced Peroxidase 1* (*RIP1*) was shown to increase in response to rhizobial infection or Nod factors and was found to be correlated with the production of ROS during rhizobial infection (Cook et al., 1995; Ramu et al., 2002). Recently, transcriptomic analysis of root hairs revealed that *RIP1* belongs to a family of 10 *RIPs* that are similarly strongly induced upon inoculation with rhizobia or Nod factors, further suggesting an important role for the modulation of ROS during nodulation (Breakspear et al., 2014).

In this study we compared published data from studies using single-cell types, i.e., microarray data for root hairs collected from seedlings inoculated with rhizobia (Breakspear et al., 2014) and RNAseq data for laser-capture microdissected nodules (Roux et al., 2014) to identify a core set of genes associated with infection. We then investigated the expression of two *ChOMT* genes and the generation of superoxides during the different stages of nodulation in infected root hair and nodule cells. Finally, we identified transcription factors induced during rhizobial infection of root hairs and preferentially expressed in the nodule infection zone as potential regulators of ROS and phenylpropanoid production.

# Materials and Methods

#### Analysis of Transcriptomics Data

The first data set used for our analysis was from Roux et al. (2014). This study generated RNAseq data by laser-capture microdissection of *M. truncatula* cv. Jemalong A17 nodules harvested 15 days post inoculation with *S. meliloti* 2011. This was designated as data set A. A second data set was from a microarray analysis of root hairs harvested from *M. truncatula* cv. Jemalong A17 seedlings 1, 3, and 5 days post inoculation with *S. meliloti* 1021 or treated with Nod factor (24 h post treatment; Breakspear et al., 2014), designated data set B. Roux et al. (2014) assigned all plant and bacterial genes that were differentially expressed between zones into 13 hierarchical clusters based on their expression across the sampled nodule zones. To compare the two data sets we identified all genes in data set B that could be assigned to the above-mentioned clusters and compared their frequencies to those of all genes assigned to clusters in data set A using a chi-squared test with the online GraphPad QuickCalc software (http://graphpad.com/quickcalcs/chisquared1.cfm).
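The cluster-frequency comparison described above can be sketched in Python as a chi-squared goodness-of-fit test, in which the expected count for each cluster is derived from the cluster proportions among all clustered genes in data set A. This is an illustrative reimplementation under stated assumptions, not the GraphPad calculation itself: the function names are hypothetical, every reference count is assumed to be positive, and the p-value helper covers only an even number of degrees of freedom (13 clusters give 12).

```python
from math import exp


def chi_squared_gof(observed: list, reference: list) -> tuple:
    """Chi-squared statistic and degrees of freedom.

    observed:  counts per cluster for the induced genes from data set B
    reference: counts per cluster for all clustered genes in data set A
    """
    total_obs = sum(observed)
    total_ref = sum(reference)
    # Expected counts follow the reference proportions, scaled to the sample size.
    expected = [total_obs * r / total_ref for r in reference]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def chi2_sf_even_df(stat: float, df: int) -> float:
    """Upper-tail p-value of the chi-squared distribution for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this helper supports positive even df only")
    half = stat / 2.0
    term = 1.0
    total = 1.0
    # Closed form: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total
```

With real data, `observed` and `reference` would each hold 13 counts, one per cluster, and `chi2_sf_even_df(stat, 12)` would give the corresponding p-value.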

#### Plant Growth Conditions

Composite plants and seedlings were grown in controlled environment chambers with 16 h day length with a light intensity of 90–130 µmol m<sup>−2</sup> s<sup>−1</sup> and constant temperature of 20°C. For rhizobial inoculation a 24-h culture of *S. meliloti* 1021 rhizobia was spun down and resuspended in buffered nodulation medium to an optical density of 0.02 at 600 nm, and 3 mL of the culture was used to inoculate each of the growing *M. truncatula* plants.

#### Promoter-GUS Analysis

For the promoter-GUS analyses, the promoter regions of *ChOMT2* (Medtr3g021440; from −23 to −1803 bp) and *ChOMT3* (Medtr7g011900; from −23 to −1885 bp) were amplified by PCR using Phusion High Fidelity DNA polymerase (NEB). The fragments were then cloned into pDONR207 and after sequence confirmation were transferred to the destination vector pKGWFS7 upstream of the GUS open reading frame using the GATEWAY cloning system (Life Technologies) as per the manufacturer's recommended protocol. The vector was then transformed into *M. truncatula* (A17) using *Agrobacterium rhizogenes*-mediated hairy root transformation as described previously (Breakspear et al., 2014). The composite plants with transgenic roots were then transferred to a 1:1 mixture of Terragreen (Oil-Dri UK) and silver sand, inoculated with *S. meliloti* 1021, and watered as needed with distilled water. Nodulated roots were then stained for GUS activity at 1, 2, and 3 weeks post inoculation, as previously described (Breakspear et al., 2014).

#### Nitroblue tetrazolium (NBT) Staining

For nitroblue tetrazolium (NBT) staining, *M. truncatula* seedlings were grown in a 1:1 mixture of Terragreen (Oil-Dri UK) and silver sand for 7 days and then inoculated with *S. meliloti* 1021. At 14 days post inoculation the nodulated roots were cut off and stained in 0.1% aqueous NBT solution (Promega UK) for 40 min in the dark at room temperature. Embedding and sectioning of plant tissues was carried out as previously described (Guan et al., 2013).

# Results

Infection threads form both in root hairs and in cells of nodule ZII. To identify genes common to infection thread formation in these two cell types, we compared genes that were induced either by rhizobia or by purified Nod factors in root hairs with the genes expressed in different nodule zones (Roux et al., 2014). To do this we used the RNAseq analysis described by Roux et al. (2014), which assigned genes to 13 clusters based on their expression across tissues sampled from each nodule zone using laser-capture microdissection. The advantage of this approach is that genes that are expressed only in the nodule but not in the root hair, or vice versa, are not considered. For example, the large family of genes encoding nodule-specific cysteine-rich (NCR) peptides, which are highly expressed in the nodule but are not expressed in root hairs, is thereby excluded. A total of 768 genes were identified that met the clustering criteria and were induced by at least one treatment in root hairs. When these genes were assigned to their clusters, a disproportionate number of the rhizobially induced genes in the WT and *sickle* (*skl*) mutant belonged to clusters 2–5, which contain genes that are primarily expressed in the nodule meristem (ZI) and distal ZII with some expression in proximal ZII (**Table 1**). Nod factor-induced genes displayed an overlapping pattern, with clusters 1–4 being over-represented; cluster 1 contains genes with the majority of their expression in the meristem (ZI) and some expression in distal ZII (**Table 1**). For each treatment and for the combined treatments, the observed frequencies in the different clusters were significantly different from the expected frequencies (Chi-squared tests, all *p*-values < 0.0001). This analysis suggests that a core set of genes underlies infection processes in both tissue types.
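The comparison of observed and expected cluster frequencies can be illustrated with a Pearson goodness-of-fit test. The sketch below is an assumption about how such a test might be set up, not the authors' actual procedure: the per-cluster counts are invented placeholders (the body of Table 1 is not reproduced here), and the expected counts are taken to be proportional to a hypothetical background number of genes per cluster. With 13 clusters there are 12 degrees of freedom, so the closed-form tail probability for even degrees of freedom can be used without external libraries.

```python
import math


def chi_squared_gof(observed, background):
    """Pearson goodness-of-fit statistic for observed counts against
    expected counts proportional to the background counts."""
    if len(observed) != len(background):
        raise ValueError("observed and background must have the same length")
    n_obs = sum(observed)
    n_bg = sum(background)
    expected = [n_obs * b / n_bg for b in background]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected


def chi2_sf_even_df(stat, df):
    """Upper-tail probability of the chi-squared distribution.
    Closed form, valid only for even degrees of freedom:
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!"""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # HYPOTHETICAL counts for clusters 1-13 (placeholders, not data from Table 1).
    observed = [30, 95, 120, 110, 90, 40, 35, 50, 45, 38, 42, 36, 37]
    # HYPOTHETICAL number of genes assigned to each cluster in the nodule data set.
    background = [1200, 900, 1000, 1100, 950, 1300, 1250, 1400, 1350, 1150, 1050, 1000, 1100]
    stat, df, _ = chi_squared_gof(observed, background)
    print(f"chi2 = {stat:.1f}, df = {df}, p = {chi2_sf_even_df(stat, df):.3g}")
```

In practice a statistics package would normally be used for the p-value; the closed form is shown only to keep the sketch self-contained.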

We then considered the genes that were induced by rhizobial infection and were found in the over-represented clusters 2–5 (Supplementary Table S1). Amongst these genes were several members of the isoflavonoid biosynthesis pathway analyzed in Breakspear et al. (2014), including *Chalcone Reductase (CHR)*, which is required for the production of isoliquiritigenin, and *ChOMT*, which converts isoliquiritigenin to the even more potent nod gene inducer methoxychalcone. Four *ChOMT* genes are induced in root hairs after rhizobial infection (Breakspear et al., 2014), but their expression patterns during nodulation have not been investigated in detail. We investigated the expression of two of these genes, *ChOMT2* and *ChOMT3*, during the early stages of infection and nodule formation using promoter-GUS analysis in *A. rhizogenes*-transformed roots. We found that both genes were expressed in root hairs undergoing infection (**Figures 1A–D**) and throughout the NP (**Figures 1E,F**). As the nodule matured, the expression of both genes became restricted to the apex (**Figures 1G,H**); expression was also seen in root tips (data not shown). The expression of both genes was typically observed at numerous infection sites along the root (**Figures 1A,B,D,I**). Hand sectioning of the nodules revealed expression of *ChOMT2* in the nodule meristem and the infection zone, with some expression in the IZ, closely matching the published LCM data (**Figures 1J,K**; Roux et al., 2014). *ChOMT3* expression was also found in the nodule meristem but was absent from the IZ and proximal infection zone (**Figures 1L,M**), again closely reflecting the data of Roux et al. (2014). Expression of both genes was absent in the nitrogen fixation zone. *ChOMT3* expression was also strong in the apical part of the nodule vasculature, including the nodule vascular meristems (**Figures 1L,M**).

TABLE 1 | Comparison of the observed number of genes induced by rhizobial inoculation in root hairs belonging to different clusters of nodule expression (Roux et al., 2014).

<sup>1</sup>*Clusters from Roux et al. (2014).*

<sup>2</sup>*Number of genes induced by rhizobia or Nod factor (Breakspear et al., 2014).*

<sup>3</sup>*Percentage of genes in over-represented clusters (shaded cells).*

The expression of *ChOMT2* and *ChOMT3* corresponded well with that of the other genes encoding enzymes for isoliquiritigenin production, which had an average of 65% of their total expression in ZI and ZII (**Figure 2**, Supplementary Table S1). However, some transcripts of these genes (∼10% of the total) were also detected in each of the remaining zones (IZ, ZIII). The rhizobial nod genes required for the biosynthesis and secretion of Nod factors, which are known to be induced by flavonoids, were also strongly expressed in ZI and proximal ZII (on average 55% of the total reads); however, as noted by Roux et al. (2014), they also showed expression (∼25% of total reads) in the nitrogen fixation zone (ZIII; **Figure 2**). Based on our results it seems unlikely that methoxychalcone is responsible for the induction of nod gene expression in ZIII. Instead, this expression could be induced by isoliquiritigenin, for which the required genes are expressed in ZIII (**Figure 2**). In contrast to the Nod factor biosynthesis and transport genes, the three *nodD* genes were mostly expressed in ZIII, with only one (nodD3) having some expression in the nodule apex (ZI; **Figure 2**).
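The zone percentages quoted above are shares of each gene's total normalized reads across the nodule zones. A minimal sketch of that normalization is given below; the zone labels, the function names, and the read counts are illustrative assumptions, not values from Roux et al. (2014) or the supplementary tables.

```python
def percent_by_zone(counts):
    """Convert normalized read counts per zone into percent of the gene's total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total expression must be positive")
    return {zone: 100.0 * value / total for zone, value in counts.items()}


def share_in_zones(counts, zones):
    """Summed percentage of a gene's expression found in the given zones."""
    pct = percent_by_zone(counts)
    return sum(pct[z] for z in zones)


if __name__ == "__main__":
    # HYPOTHETICAL normalized counts for one gene (placeholder values).
    gene = {"ZI": 30.0, "ZII_distal": 25.0, "ZII_proximal": 10.0, "IZ": 10.0, "ZIII": 25.0}
    apex = share_in_zones(gene, ["ZI", "ZII_distal", "ZII_proximal"])
    print(f"{apex:.0f}% of expression in ZI and ZII")
```

Expressing each gene as a percentage of its own total makes spatial patterns comparable between genes, but it hides differences in absolute expression level, which is why a highly expressed gene such as *RIP9* has to be considered separately.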

#### *Rhizobial Induced Peroxidases*

Another gene class identified in the overlap between the inoculated root hair and nodule data sets was the type III peroxidases, which are apoplastic/cell wall-localized enzymes that have a role in cell wall remodeling (Francoz et al., 2015). The genes encoding this family of enzymes have been shown to be inducible by rhizobia and Nod factors in root hairs and were designated *RIP1–10* (Breakspear et al., 2014). Most of the previously identified *RIPs* are also expressed in ZI and ZII (**Figure 2**, data from Roux et al., 2014). Unexpectedly, one of them, *RIP9*, was more highly expressed than all the others combined (Supplementary Table S3) and had a dramatically different expression domain, with >90% of its expression in the IZ and ZIII (**Figure 2**). Since secreted peroxidases are capable of generating reactive oxygen species, we monitored superoxide production at the different stages of nodulation using NBT staining (Doke, 1983). The stain was seen in microcolonies and in infection threads at the early stages of nodulation (**Figures 3A,B**). The nascent NP were also strongly stained (**Figure 3C**) and, as nodules matured, the staining was mainly found in the nodule apex (**Figure 3D**). Nodule sectioning revealed staining in a patchwork of individual cells located mostly in ZII and sometimes in ZIII (**Figures 3E,F**). Closer inspection revealed that the staining was present in cells containing infection threads in ZII (**Figure 3G**) and also in ZIII (**Figure 3H**). There was also staining in a thin layer of non-infected cells near the nodule apex (**Figure 3F**). Staining was also strong in root tips of both mature primary roots and emerging lateral roots (**Figures 3C,I**).

#### Transcription Factors

The transcription factors controlling the expression of genes for flavonoid biosynthesis and ROS production in the nodule are currently not known. A large number of transcription factors (**Table 2**) were shared between the genes in nodule clusters 2–5 and genes induced in root hairs of infected plants (**Table 3**).

FIGURE 2 | Expression in different nodule zones of a family of *Rhizobial Induced Peroxidases (RIPs)* and genes involved in the induction of Nod factor biosynthesis by flavonoids. Genes for isoliquiritigenin biosynthesis: *ChOMT1*, *ChOMT2*; genes *Phenylalanine Ammonia Lyase* (*PAL*), *4-Coumarate:coA Ligase* (*4CL*), *Chalcone Synthase* (*CHS*), *Chalcone Reductase* (*CHR*), *Chalcone Isomerase* (*CHI*); Rhizobial nod genes for Nod factor biosynthesis and transport: nodA, B, C, D, E, F, I, J; *S. meliloti* nod (nodD1, D2, D3) genes for flavonoid-dependent activation of Nod factor biosynthesis; Nod factor receptor genes *Nod Factor Perception* (*NFP*) and *LysM receptor-like kinase* (*LYK3*); *RIPs 1–10.* Data, adapted from Roux et al. (2014), is represented as percent of total normalized transcripts for each gene or group of genes, see Supplementary Table S2 for details.

These included the well-studied *ERN1*, *ERN2*, and *NSP2*, which are required for early infection events and nodule development (Oldroyd and Long, 2003; Kaló et al., 2005; Heckmann et al., 2006; Middleton et al., 2007; Cerri et al., 2012). In addition, four genes encoding CCAAT-box transcription factor subunits (*NF-YA1*, *NF-YA2*, *NF-YC2*, *NF-YB7*) were in this group. *NF-YA1* and *NF-YC2* are important for both rhizobial infection and nodule development (Combier et al., 2006, 2008; Zanetti et al., 2010; Soyano et al., 2013; Laporte et al., 2014), while *NF-YA2* is a close homolog of *NF-YA1*. This co-regulation across root hairs and nodule zones suggests that the encoded subunits may act as part of a heterocomplex along with NF-YB7 to regulate infection. Notably, the *nf-ya1* mutant forms nodules with either reduced or absent meristems that fail to release the bacteria from the infection thread into symbiosomes (Xiao et al., 2014). A summary of all data (Breakspear et al., 2014; Roux et al., 2014) for the entire family of *M. truncatula* CCAAT-box transcription factors is included in Supplementary Table S4.

The roles of the remaining transcription factors in nodulation have not been defined. Of note in this group are the nodule-specific GRF zinc finger gene N20 and the *M. truncatula* ortholog of *Arabidopsis thaliana* Zinc Finger Protein 6 (ZFP6). ZFP6 has been assigned a role in the integration of gibberellic acid and cytokinin signaling (Zhou et al., 2013).

## Discussion

#### Root Hair Infection-Genes are Expressed in Nodule Zone II

Current molecular genetic studies of nodulation rely heavily on gene expression data to provide insight into symbiotic processes. Candidate genes identified in this manner can then be followed up using reverse genetic resources such as the *Tnt1* or *LORE1* transposon mutant collections available in *M. truncatula* or *Lotus japonicus*, respectively (Tadege et al., 2008; Cheng et al., 2011; Fukai et al., 2012; Urbański et al., 2012). Recently, laser-capture microdissection has been used to target the specific regions of indeterminate nodules (Roux et al., 2014), including the infection zone, which contains the cells becoming infected by rhizobia. A subsequent study, which transcriptionally profiled root hairs from seedlings undergoing infection (Breakspear et al., 2014), provides a unique opportunity to compare gene expression responses during infection of these two cell types. A disproportionate number of genes induced in root hairs of infected plants were expressed in the nodule apex. The majority of genes induced in root hairs in response to purified Nod factors were also expressed in the nodule apex (71% belonging to clusters 2–5), mirroring the analysis by Roux et al. (2014), who reported that 63.4% of genes induced by Nod factors in intact root samples in an earlier study (Czaja et al., 2012) belonged to clusters 1–4. This overlap suggests that a core set of genes required to support rhizobial infection in both epidermal and cortical (nodule) tissues is directly induced by Nod factor signaling.
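The overlap described above amounts to intersecting a list of root hair-induced genes with a gene-to-cluster assignment and reporting the fraction that falls in the clusters of interest. The sketch below shows one way this could be computed; the gene identifiers and cluster assignments are hypothetical placeholders, and restricting the denominator to genes that have a cluster assignment is an assumption about how the percentages were derived.

```python
def genes_in_clusters(induced, cluster_of, clusters):
    """Return the induced genes assigned to any of `clusters`, together with
    their fraction among induced genes that have a cluster assignment."""
    assigned = {g for g in induced if g in cluster_of}
    hits = {g for g in assigned if cluster_of[g] in clusters}
    fraction = len(hits) / len(assigned) if assigned else 0.0
    return hits, fraction


if __name__ == "__main__":
    # HYPOTHETICAL gene IDs and cluster numbers, for illustration only.
    induced_in_root_hairs = {"geneA", "geneB", "geneC", "geneD", "geneE"}
    nodule_cluster = {"geneA": 2, "geneB": 3, "geneC": 9, "geneD": 5, "geneF": 2}
    hits, fraction = genes_in_clusters(induced_in_root_hairs, nodule_cluster, {2, 3, 4, 5})
    print(sorted(hits), f"{100 * fraction:.0f}% of clustered, induced genes")
```

Genes without a cluster assignment (here `geneE`) are excluded from the denominator, which mirrors the requirement that genes meet the clustering criteria before being counted.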

#### *ChOMT* Genes are Expressed Specifically in Infected Root Hairs and in the Nodule Apex While Other Flavonoid Biosynthesis Genes are Expressed Throughout the Nodule

Expression of the *ChOMT* genes required for the production of the nod-gene inducing flavonoid methoxychalcone was high in infected root hairs and within the nodule was restricted to the infection zone, matching the pattern of expression of rhizobial Nod factor synthesis/export genes and the Nod factor receptors *Nod Factor Perception (NFP)* and *LysM receptor-like kinase (LYK3)* (**Figure 2**; Roux et al., 2014). This is consistent with the detection of flavonoids in root hairs undergoing infection (Hassan and Mathesius, 2012) and with evidence showing that Nod factor production by the rhizobia is required at all stages of infection thread development (Marie et al., 1992; Den Herder et al., 2007). While *ChOMT* expression was restricted to ZII, transcripts for genes required to make the methoxychalcone precursor isoliquiritigenin, as well as rhizobial nod genes, were also detected in the N-fixation zone (ZIII; **Figure 2**; Roux et al., 2014). Roux et al. (2014) further showed using promoter-GUS analysis that the expression of the nod genes coincided with the relatively infrequent infection threads observed in ZIII. The significance of these sporadic infections, which can also be observed in mature determinate nodules of *L. japonicus* (unpublished results), remains to be determined. Our analysis suggests that the nod gene inducer in the N-fixing zone of the nodule is unlikely to be methoxychalcone, since *ChOMT* expression is tightly confined to the nodule apex. On the other hand, isoliquiritigenin seems to fit the role, as several genes required for its synthesis, including *CHR* and *Chalcone Isomerase*, have moderate levels of expression in ZIII. Another function recently highlighted for flavonoids is as antioxidants, with roles in stomatal closure and drought response (Nakabayashi et al., 2014; Watkins et al., 2014). It seems possible that these compounds could help buffer against damage from ROS generated in cells undergoing infection. Two *Vestitone Reductase* genes needed for the production of medicarpin, one of which is expressed at infection sites in the epidermis (Breakspear et al., 2014), were also found to be expressed in ZI and the IZ (Supplementary Table S2). A potential role for medicarpin is selection against incompatible rhizobia and other opportunistic bacteria, as previously discussed (Breakspear et al., 2014).

FIGURE 3 | Nitroblue tetrazolium (NBT) staining of *M. truncatula* roots nodulated by *S. meliloti* 1021. (A) A microcolony within an infection focus. (B) An infection thread in a root hair. (C) Nodule primordia (NP) and a lateral root primordium (LRP). (D) An intact nodule. (E–H) Nodule sections, 10 µm thick. (I) The primary root tip. In (G) and (H) arrows indicate infection threads. Plants were harvested 14 dpi with *S. meliloti* 1021. Bars = 50 µm (A,B), 500 µm (C,D,I), 100 µm (E,F), and 20 µm (G,H).

TABLE 2 | Transcription factors associated with rhizobial infection.

*Data from Roux et al. (2014).*

*Cell shading indicates relative strength of expression for a given gene across nodule zones.*

<sup>1</sup>*Previously HAP2.1.*

<sup>2</sup>*Previously MtARR8, see notes on family nomenclature (Liu et al., 2015).*

<sup>3</sup>*Chromosome position Mtv4.0.*

<sup>4</sup>*Percentage of normalized RNAseq reads across samples (Roux et al., 2014).*

*Data from Breakspear et al. (2014).*

<sup>1</sup>*dpi, days post inoculation with Sinorhizobium meliloti 1021.*

<sup>2</sup>*NF, 24 h post treatment with 10 nM Nod factors.*

∗*Significantly different compared to control p < 0.05, also shaded light red (see Breakspear et al., 2014 for details).*

The expression of the nodD genes did not correspond with the Nod factor signaling domain that was clearly delineated by the expression of the Nod factor receptors, rhizobial Nod factor biosynthesis genes, and the *ChOMTs*, with only one gene, nodD3, having about 30% of its expression in this domain and nodD1 and nodD2 being mainly expressed in ZIII. The lack of correspondence between nodD expression and the Nod factor signaling domain may reflect the fact that the nodD gene, at least in *Rhizobium leguminosarum*, is constitutively expressed and is not induced by flavonoids (Rossen et al., 1985). Furthermore, the NodD protein is present at relatively low levels in the rhizobia, suggesting that low constitutive expression is sufficient for its activity (Schlaman et al., 1991). While nodD expression is not induced by flavonoids, activation of nod genes by some NodD proteins is flavonoid-dependent; methoxychalcone strongly induces nod gene expression in *S. meliloti* containing extra copies of nodD1 or nodD2 (Hartwig et al., 1990). In contrast, NodD3 induction of nod gene expression is flavonoid-independent and requires the transcriptional regulator syrM (Mulligan and Long, 1989). Accordingly, in nodules syrM expression closely matches that of nodD3 (Roux et al., 2014; Supplementary Table S2). A study that examined all three single nodD mutants showed a small reduction in nodulation in the nodD3 mutant only, but strongly reduced nodulation for all combinations of double mutants, suggesting that all three isoforms are important in the interaction with *M. truncatula* (Smith and Long, 1998). Similar results were obtained with *M. sativa* (Honma and Ausubel, 1987). The available data suggest that nodD1 and nodD2 activation of nod gene expression in the infection zone occurs mainly through the action of methoxychalcone, but that expression of these genes in ZIII may be mediated by another flavonoid, such as isoliquiritigenin, whilst NodD3 operates independently of flavonoids and is sufficient to sustain nodulation.

#### *Rhizobial Induced Peroxidases* have Complementary Patterns of Expression in the Nodule

Also common to epidermal and cortical infection was the strong expression of a subset of *RIPs*. Type III peroxidases have been implicated in the generation of apoplastic reactive oxygen species (Martinez et al., 1998; Bindschedler et al., 2006), with one important role being to promote cell wall hardening (Passardi et al., 2004). A corresponding role for these peroxidases has been proposed in the rigidification of the infection thread cell wall and matrix (Wisniewski et al., 2000). Expression of nine of the ten *RIPs* was strictly limited to Zones I and II of the nodule, while another family member, *RIP9*, showed a complementary pattern, being very highly expressed (∼7 times higher than all other *RIPs* combined) in the IZ and ZIII. The expression of *RIP9* therefore coincides with the low pO<sub>2</sub> in the proximal zones of the nodule; Roux et al. (2014) report that the leghemoglobin genes required for microaerobic conditions are strongly and abruptly upregulated in the IZ and remain highly expressed in ZIII. Indeed, low oxygen availability in these zones might explain the need for a highly expressed, separately regulated peroxidase isoform. While the link between type III peroxidases and rhizobial infection is well established (Cook et al., 1995; Ramu et al., 2002; this study), functional analysis is confounded by the presence of multiple family members with similar expression patterns. *RIP9* may therefore present an opportunity to use genetics to help study the role of these enzymes in nodulation.

Based on co-expression and co-regulation we identified candidate transcription factors involved in rhizobial infection. The group included *ERN1*, *ERN2*, *NSP2*, *NF-YA1*, and *NF-YC2*, all of which have been functionally implicated in rhizobial infection (Oldroyd and Long, 2003; Kaló et al., 2005; Combier et al., 2006, 2008; Heckmann et al., 2006; Middleton et al., 2007; Zanetti et al., 2010; Cerri et al., 2012; Soyano et al., 2013; Laporte et al., 2014). Two other genes encoding CCAAT-box subunits were also identified in the analysis: *NF-YA2*, a close homolog of *NF-YA1*, and *NF-YB7*. As these transcriptional regulators act in heterocomplexes containing one each of the A, B, and C subunits, it is tempting to speculate that these act together to control infection, with NF-YA1 and NF-YA2 acting interchangeably. Among the unstudied members of this group is the GRF zinc finger protein N20, whose expression is highly nodulation-specific but which has only weak amino acid sequence homology to proteins from other legumes. Also of interest is *ZFP6*, which encodes a C2H2 transcription factor that is required for trichome development and has been proposed as an integrator of GA and cytokinin signaling in this process (Zhou et al., 2013). Also found were the cytokinin response regulator *RRA3*, an ERF (Medtr3g090760) with homology to *Arabidopsis Cytokinin Response Factor 4* (**Table 2**), and two genes encoding the GA biosynthetic enzymes Ent-Kaurenoic Acid Oxidase 1 (KAO1) and Gibberellin 3-Oxidase 1 (GA3OX1; Supplementary Table S1), which were part of a larger set of GA- and cytokinin-related genes identified as upregulated after rhizobial inoculation (Breakspear et al., 2014; Liu et al., 2015). This indicates that the regulation of these hormones, along with auxin, is important during infection in both root hairs and within the nodule.

#### Challenges and Future Directions

This study illustrates the power of single-cell type approaches to study the key mechanisms that underlie a biological process. Through the study of gene expression associated with infection thread formation in two different tissues, tissue-specific and background effects can be eliminated and core processes exposed. Improvements can of course still be made; Roux et al. (2014) estimated a 10% rate of contamination of ZII transcripts in the ZI sample, which argues for sampling smaller areas of tissue. Furthermore, it can be difficult to distinguish different cell types, for instance meristematic cells from cells of the distal infection zone (Limpens et al., 2013). Similarly, while the root hair isolation methods used by Breakspear et al. (2014) or Libault et al. (2010) are technically less challenging than laser-capture microdissection, only a small percentage of root hairs in the sample are undergoing infection. Ultimately, a single cell-based approach offers the greatest resolution, having the potential to sub-classify cells from the same tissue. Such an approach would avoid the averaging out of highly localized phenomena such as the production of superoxide reported in this study. Another limitation of the described approach is evident from the analysis of the transcriptional regulators. It is clear that infection thread development comprises numerous processes occurring in parallel, sometimes within the same cells, which cannot easily be separated even using single-cell approaches. This can be addressed in three ways. The first is to use a developmental time series, as employed in Breakspear et al. (2014), which allowed partial resolution of events occurring before and during infection. The second is the use of relevant sensory or chemical inputs that can perturb individual components of the system. For instance, a subset of the genes may be inducible by treatment with ROS, allowing them to be partitioned away from the larger set of co-regulated genes. In this respect the ever-growing gene expression atlases available for *Medicago* and other plants present a useful resource (Benedito et al., 2008; Hruz et al., 2008). The third and most powerful approach is the use of mutants. Careful comparison of specific mutants defective in one or a few processes will provide a clearer picture of the transcriptional network underlying rhizobial infection.

Our work compares two different types of data sets and identifies a core set of genes common to infection in both root hairs and nodules, with specific attention to transcription factors that can serve as a starting point for future studies. In addition, we show for the first time the expression pattern of two genes encoding isoforms of ChOMT, an enzyme that plays a key role in the symbiosis but that so far has only been studied at the biochemical level. We confirmed expression of these genes in the nodule infection zone, and have extended this knowledge by showing that these genes are specifically induced in infected root hairs, and that one gene is expressed in the nodule vascular bundle. We show using NBT staining that superoxide is produced specifically in cells undergoing infection in root hairs and the nodule, further demonstrating the tight link between ROS production and the infection process.

# Acknowledgments

This work was supported by the Biotechnology and Biological Sciences Research Council Grants BB/G023832/1 and BB/L010305/1 and the John Innes Foundation.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00575

# References

*Phytophthora infestans* and to the hyphal wall components. *Physiol. Plant Pathol.* 23, 345–357. doi: 10.1016/0048-4059(83)90019-X


*meliloti–Medicago sativa* symbiosis and their crucial role during the infection process. *Mol. Plant Microbe Interact.* 16, 217–225. doi: 10.1094/MPMI.2003.16.3.217


activity of a diglycosyl diacylglycerol membrane glycolipid from *Rhizobium leguminosarum* biovar trifolii. *J. Bacteriol.* 176, 4338–4347.


and nodule development affects partner selection in the common bean-*Rhizobium etli* symbiosis. *Plant Cell* 22, 4142–4157. doi: 10.1105/tpc.110.079137


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Chen, Liu, Roy, Cousins, Stacey and Murray. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Spatially resolved *in vivo* plant metabolomics by laser ablation-based mass spectrometry imaging (MSI) techniques: LDI-MSI and LAESI

#### *Benjamin Bartels and Aleš Svatoš\**

*Research Group Mass Spectrometry/Proteomics, Max Planck Institute for Chemical Ecology, Jena, Germany*

#### *Edited by:*

*Marc Libault, University of Oklahoma, USA*

#### *Reviewed by:*

*Sixue Chen, University of Florida, USA Zhibo Yang, University of Oklahoma, USA*

#### *\*Correspondence:*

*Aleš Svatoš, Research Group Mass Spectrometry/Proteomics, Max Planck Institute for Chemical Ecology, Max-Planck-Gesellschaft, Hans-Knöll-Straße 8, Jena D-07745, Germany svatos@ice.mpg.de*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 06 May 2015 Accepted: 15 June 2015 Published: 10 July 2015*

#### *Citation:*

*Bartels B and Svatoš A (2015) Spatially resolved in vivo plant metabolomics by laser ablation-based mass spectrometry imaging (MSI) techniques: LDI-MSI and LAESI. Front. Plant Sci. 6:471. doi: 10.3389/fpls.2015.00471*

This short review aims to summarize the current developments and applications of mass spectrometry-based methods for *in situ* profiling and imaging of plants with minimal or no sample pre-treatment or manipulation. Infrared-laser ablation electrospray ionization and UV-laser desorption/ionization methods are reviewed. The underlying mechanisms of the ionization techniques, namely laser ablation of biological samples and electrospray ionization, as well as variations of the LAESI ion source for specific targets of interest are described.

Keywords: ambient, ionization, mass spectrometry, laser ablation, electrospray

## Introduction

Sample preparation is an important step that precedes acquisition of many kinds of data. However, often sample preparation is associated with artificially altering the biological or biochemical status of the system under study. In order to minimize this effect, we would like to have little to no sample preparation. If we can perform analysis directly *in vivo*, our data might fully represent the actual system. The usual workflow relies on sample dissection, solvent or thermal extraction and subsequent analysis using chromatographic methods connected to a detector with the needed selectivity. Minimal sample preparation facilitates the analytic process, by allowing people with minimal experience in analytical chemistry to perform the necessary steps without highly involved training. The sheer number of emerging ionization techniques involving minimal, ambient pressure sample preparation demonstrates the current interest, but, sadly, an alphabet soup of abbreviations has been created. Recent reviews (Bhardwaj and Hanley, 2014; El-Baba et al., 2014; Venter et al., 2014) summarize established techniques for most of the possible applications to date, providing an excellent guide for beginners to the field. These techniques are especially interesting for the life sciences (Alberici et al., 2010; Shrivas and Setou, 2012), due to the delicate nature of biological samples. Biological mass spectrometry imaging (MSI) is profoundly profiting from these developments.

Alongside minimal intrusiveness, spatial resolution is an important feature of any imaging technique. Secondary ion mass spectrometry (SIMS) is the ionization technique for mass spectrometry (MS) that offers the highest spatial resolution, with reported values below one micron (Svatoš, 2010). Because it uses an ion beam to create secondary ions from the sample (**Figure 1A**), SIMS is not considered a soft ionization technique. Molecules tend to fragment upon ionization, and the utilization of SIMS is intrinsically linked to extensive sample preparation. SIMS has successfully been used on biological samples for imaging (McMahon et al., 1995). In 2013, SIMS was successfully used to investigate the dynamics of nitrogen gas fixation of cyanobacteria at the level of a single cell (Mohr et al., 2013; **Figure 1D**). MSI of intact biomolecules, however, struggles to reach the level of a bacterial cell. In contrast, recent advances report single-cell resolution on eukaryotes with matrix-assisted ionization techniques, involving extensive sample preparation prior to analysis (Boggio et al., 2011). In early 2015, single-cell imaging was done within a tissue (Li et al., 2015b) utilizing laser ablation electrospray ionization (LAESI), which requires considerably less sample preparation.

A prominent ionization technique used in MSI of large biomolecules is matrix-assisted laser desorption/ionization (MALDI; Caprioli et al., 1997; Bjarnholt et al., 2014; El-Baba et al., 2014). MALDI instrumentation for MSI is commercially available with a spatial resolution of 10 µm (FLEX series, Bruker, Bremen, Germany). MALDI requires the samples to be pre-processed extensively by dissolution in and co-crystallization with a matrix. Originally restricted to vacuum application (Feigl et al., 1983; Karas et al., 1985, 1987), MALDI has since been adapted to work under atmospheric pressure (Laiko et al., 2000; Li et al., 2007). Desorption and ionization of samples co-crystallized with matrix is driven by an ultraviolet (UV) laser; more recently, infrared (IR) lasers have also been used. The matrix molecules absorb most of the energy deposited in the sample by the laser and transfer the energy to the sample analytes more gently than direct irradiation would (Caprioli et al., 1997; Karas and Kruger, 2003), as depicted in **Figure 1B**. With MALDI, very large molecules, e.g., proteins, can be ionized non-destructively, which is one of the reasons why MALDI is used in protein MSI analysis. The method requires reliable matrix deposition and high ion yield (Karas and Kruger, 2003; El-Baba et al., 2014). To image plant cells, some of which are as large as 50 µm, the spatial resolution of commercial instruments is sufficient. Laser desorption ionization (LDI) works similarly to MALDI but does not require an externally applied matrix. Because samples are not pre-treated with a matrix, spatial resolution is not compromised by matrix crystals, which could be larger than the studied cells.

Electrospray ionization (ESI) was originally designed to ionize long polymer chains (Dole et al., 1968) and has subsequently evolved (Yamashita and Fenn, 1984; Whitehouse et al., 1985) into a commonly used ion source in mass spectrometry. ESI has become very popular (Bhardwaj and Hanley, 2014), for example, in combination with liquid chromatography (Whitehouse et al., 1985), and has been used for MSI as well, especially in the form of desorption electrospray ionization (DESI; Bjarnholt et al., 2014) and the closely related nano-DESI (Lanekoff et al., 2012). These techniques have been shown to achieve 50 and 20 µm spatial resolution, respectively (Campbell et al., 2012; Lanekoff et al., 2012). Instead of extracting analytes prior to analysis, both techniques extract analytes *in situ* prior to ionization directly from the sample surface (Venter et al., 2014). Control over the area of sample surface that is wetted becomes imperative to avoid cross-contamination and maintain spatial resolution.

In 2007, LAESI was introduced (Nemes and Vertes, 2007). The basic principle of LAESI combines LDI and ESI: ablation with a laser, and ionization via ESI, as shown in **Figure 1C**. However, LAESI uses an IR laser and relies on water present in the sample as a makeshift matrix (Apitz and Vogel, 2005; Nemes et al., 2012), a condition that most samples in life sciences fulfill. This way the deposition of an external matrix is not required, sample handling is simplified and the need to manipulate the samples prior to analysis is reduced. In a LAESI source, IR-laser light of 2940 nm wavelength is used to irradiate samples. At this wavelength, water has a major peak in its absorption spectrum and thus acts as a chromophore absorbing the deposited energy (Hale and Querry, 1973; Downing and Williams, 1975). Essential work describing the physics of ablating biological tissue with a laser was done recently (Vogel and Venugopalan, 2003b). The event of sample ablation can be split into at least two different phases based on the tensile strength of the sample (Vogel and Venugopalan, 2003a; Apitz and Vogel, 2005). Initially, irradiated sample material is heated and vaporization of molecules from the surface takes place (Vogel and Venugopalan, 2003a). When the energy deposition of the laser is larger than the energy consumption of the vaporization process, the water content of the sample is further heated and driven into a superheated state, leading to phase explosion upon relaxation to a stable state (Vogel and Venugopalan, 2003a; Apitz and Vogel, 2005; Chen et al., 2006). This results in material expulsion as well as tissue rupture and is primarily responsible for ablation efficiency (Apitz and Vogel, 2005). The resulting ablation plume consists mostly of neutral matter in the form of nanoparticles, droplets, and large particulates. Experimental data suggest droplets from the electrospray plume intercept and fuse with the ablation plume nanoparticles, extracting analytes in the process (Nemes and Vertes, 2007). At this point, post-ionization by ESI takes over. A review of the research done on most of the aspects governing ESI (Kebarle and Verkerk, 2009) provides an excellent introduction to the field. Once ions have been generated from the sample, mass analyzers provide the means of detection.
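To put the 2940 nm laser wavelength quoted above into more familiar spectroscopic units, the following minimal sketch converts a wavelength into photon energy and wavenumber. The function names are illustrative only and are not taken from any of the cited works; the constants are the exact SI values.

```python
# Convert a laser wavelength into photon energy (eV) and wavenumber (cm^-1).
# Function names are hypothetical helpers for illustration only.

PLANCK_J_S = 6.62607015e-34             # Planck constant in J*s (exact SI value)
LIGHT_SPEED_M_S = 299_792_458.0         # speed of light in m/s (exact SI value)
JOULE_PER_EV = 1.602176634e-19          # joules per electronvolt (exact SI value)


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a wavelength given in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULE_PER_EV


def wavenumber_per_cm(wavelength_nm: float) -> float:
    """Return the wavenumber in cm^-1 for a wavelength given in nm."""
    return 1e7 / wavelength_nm


# IR wavelength used in LAESI sources, as stated in the text.
print(round(photon_energy_ev(2940.0), 4), "eV")
print(round(wavenumber_per_cm(2940.0), 1), "cm^-1")
```

The same helpers can be applied to the UV wavelengths mentioned for LDI (337 and 355 nm) to compare the energy delivered per photon.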

The following section provides examples of instrumentation to illustrate the capabilities of the LAESI technique. LAESI displays promising potential for application in animal and plant metabolomics (Stolee et al., 2012; Stopka et al., 2014) and MSI of living plant tissue (Nemes and Vertes, 2007; Li et al., 2015b). For more information on different types of MSI methods, refer to **Table 1**.

# Application of LAESI

The first realization of a LAESI ion source, as described by Nemes and Vertes (2007), consisted of a custom-built electrospray system, an Er:YAG laser tuned to a wavelength of 2940 nm, and a time-of-flight (TOF) mass spectrometer. One of the proof-of-concept experiments carried out was metabolic profiling of *Tagetes patula* seedlings *in vivo*. Several tentative assignments of metabolites from roots, leaves and stems were made. For that, accurate mass measurements, isotope patterns and metabolomic databases of model organisms such as *Arabidopsis thaliana* were considered. Cautious use of these databases was justified under the presumption that plants share certain metabolomic features (Nemes and Vertes, 2007; Nemes et al., 2008). Although LAESI is classified as a destructive method, seedlings subjected to the single-shot laser ablation were reported to survive the 350 µm wide ablation craters in roots, leaves, and stems.

Nemes et al. (2008) used a combination of LAESI and TOF mass analyzer techniques to show the usability of LAESI for MSI of plant tissues. Leaves of *Aphelandra squarrosa* with variegation patterns were subjected to two-dimensional imaging with a spatial resolution of 400 µm and depth profiling with a resolution of 50 µm. The actual spot size of the laser was reported as 350 µm, but a bigger step size was chosen to limit cross-talk in the acquisition of mass spectra. Nemes et al. (2008) were able to show that localization of the secondary metabolites kaempferol and luteolin, as well as certain derivatives with sugar moieties, coincides with the variegation pattern. The spatial distribution was then combined with the information gathered from depth profiling to visualize the spatial distribution of secondary plant metabolites in three dimensions. Depth profiling was realized by consecutive irradiation of the same spot (Nemes et al., 2008, 2009).

# TABLE 1 | Ionization techniques used for mass spectrometry imaging (MSI) of biological samples.

The work of Nemes et al. (2008, 2009) showed the feasibility of a LAESI ion source for analyzing and imaging metabolites in plant samples. Shrestha and Vertes (2009) improved upon the LAESI concept by using an etched, GeO2-based glass fiber to focus and deliver the laser to the sample. This made it possible to decrease the diameter of the ablation marks to slightly larger than 2R, with R being the radius of the glass fiber tip's curvature, reported as roughly 15 µm in size and as forming ablation craters of ca. 30 µm. The metabolome of single epithelial cells from *Allium cepa* and *Narcissus pseudonarcissus* bulbs was analyzed and compared both across species and within a particular sample tissue. Interestingly, the same cell type, *A. cepa* bulb epithelial cells and their *N. pseudonarcissus* equivalent, showed different metabolite contents, with oligosaccharides and alkaloids, respectively, being abundant (Shrestha and Vertes, 2009). By looking at epithelia from different layers of the same bulb, differently aged *A. cepa* cells were compared. Cells in an *A. cepa* bulb are older when located in the outer layers. The content of arginine was reported to decrease with increasing cell age, while the alliin gradient was oriented the other way around. Shrestha et al. (2011) also determined the influence of the ablation of single cells within a tissue on the surrounding cells and found no major disturbance compared to similar cells in undisturbed areas of the sampled tissue.

The same experimental set-up was also used to find biomarkers in the oil glands of *Citrus aurantium* leaves. For the initial mass spectra from achlorophyllous cells of *C. aurantium*, leaf oil glands and epidermal cells from distant parts of the same leaf were first measured and then compared. Different terpenes and terpenoids were found in the oil gland cells but were absent from the epidermal cells, which in turn contained flavonoid compounds not present in the gland cells (Shrestha et al., 2011).

The step to subcellular resolution was taken by Stolee et al. (2012). The LAESI set-up described previously (Shrestha and Vertes, 2009) was improved upon by adding a micro-dissection needle made out of tungsten. Prior to sample irradiation by the IR laser, the needle with a tip diameter of approximately 1 µm was used to cut open and peel back the cell wall of *A. cepa* epithelial cells. Metabolites such as hexose and alliin were reportedly found with higher abundance in cytosolic areas of a cell, whereas the amino acids arginine and glutamine were found more commonly in the area of the cell nucleus (Stolee et al., 2012). However, the improvement made by ablating the sample precisely goes hand in hand with the small sample volume from which ions can be generated. This limitation obviously reduces sensitivity of the method and poses a general problem of spatially confined ionization techniques.

Depending on the properties of the electrospray solution used, substances with strongly diverging polarities may be difficult to ionize simultaneously during imaging. A LAESI source was modified to address this problem (Vaikkinen et al., 2013). By adding a nebulizer chip blowing heated nitrogen gas toward the MS orifice, a more efficient ionization of both polar and nonpolar compounds was expected (Careri et al., 1999; Boscaro et al., 2002). Compared to an unmodified LAESI ion source, heat-assisted LAESI (HA-LAESI) has been shown to better ionize compounds with low polarity, as demonstrated on *Persea americana* mesocarp (Vaikkinen et al., 2013). A high abundance of signals assigned to triglycerides was observed in the MS spectrum measured with HA-LAESI. These particular peaks were less pronounced when using LAESI. To demonstrate imaging capabilities, Vaikkinen et al. (2013) used *Viola* flower petals and visualized the distribution of glycosides known to be present in *Viola* (Saito et al., 1983) as shown in **Figure 1F**. To further improve the ionization of low-polarity and nonpolar compounds, a krypton discharge lamp for photo-ionization was added to the LAESI set-up to ionize anisole molecules with UV light, which in turn ionize analytes in subsequent reactions taking place in the gas phase. The electrospray was exchanged for a nebulizer chip with an anisole and heated nitrogen gas flow (Vaikkinen et al., 2014), very similar to HA-LAESI. The technique was called laser ablation atmospheric pressure photoionization (LAAPPI). MSI was performed on *Salvia officinalis* leaves, and tentative assignment of multiple terpene and terpenoid compounds could be made (Vaikkinen et al., 2014). Because the IR light was focused using a lens instead of an etched glass fiber (Shrestha and Vertes, 2009) as described by Nemes et al. (2008), spatial resolution was reported as 400 µm.

Until recently, MSI was performed by measuring a sample step-wise using a predefined raster. Resolution of the mapping thus depended on the smallest possible step preventing pixel cross-talk. Li et al. (2015b) reported a procedure for LAESI-MSI, integrating light microscopy to assess and identify single cells within a sample tissue. An imaging raster consisting of cells defining that particular sample tissue was then created and used for systematic cell-by-cell imaging. Feasibility and proof-of-concept experiments on *A. cepa* bulb and *Lilium longiflorum* were performed using the precision of LAESI with an etched, GeO2-based glass fiber (Shrestha and Vertes, 2009). The capacity for separating isobaric and structurally isomeric ions in LAESI-MSI experiments was demonstrated by Li et al. (2015a) on *Pelargonium peltatum* leaves and mouse brain tissue.
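The conventional raster approach described above can be illustrated with a minimal sketch: sampling positions are generated on a regular grid, and the step size is checked against the laser spot diameter to flag possible pixel cross-talk. The function names and the sample dimensions are hypothetical and chosen for illustration; the 350 µm spot and 400 µm step are the values reported for Nemes et al. (2008) earlier in this section.

```python
# Sketch of a predefined imaging raster; not the acquisition software of any
# cited study. All names and the sample dimensions are illustrative.
from typing import List, Tuple


def raster_positions(width_um: float, height_um: float,
                     step_um: float) -> List[Tuple[float, float]]:
    """Return (x, y) ablation positions on a regular grid, row by row."""
    if step_um <= 0:
        raise ValueError("step size must be positive")
    n_x = int(width_um // step_um) + 1   # number of columns, including x = 0
    n_y = int(height_um // step_um) + 1  # number of rows, including y = 0
    return [(ix * step_um, iy * step_um)
            for iy in range(n_y) for ix in range(n_x)]


def spots_overlap(spot_diameter_um: float, step_um: float) -> bool:
    """True if neighbouring ablation spots would overlap (risk of cross-talk)."""
    return step_um < spot_diameter_um


# Hypothetical 2000 x 1200 um leaf area, sampled with a 400 um step.
grid = raster_positions(2000.0, 1200.0, 400.0)
print(len(grid), "pixels; overlap:", spots_overlap(350.0, 400.0))
```

In the cell-by-cell procedure of Li et al. (2015b), the regular grid would be replaced by positions derived from microscopy images of individual cells, which this sketch does not attempt to model.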

Trying to make LAESI more compatible with complementary methods such as light microscopy, Compton et al. (2015) tried to spatially separate laser ablation from ESI. After ablation, the produced plume was carried into transfer tubing with nitrogen gas, and analytes were ionized with ESI after emerging from the 60 cm long tubing. Parts of *Viola* and *Acer* sp. were analyzed using remote-LAESI as proof-of-principle experiments. Signal strength was reported to be 27% of the intensity detected using conventional LAESI (Compton et al., 2015).

Laser ablation electrospray ionization was recently used as one of the methods to confirm the quantitative MSI of surface-occurring glucosinolates on *A. thaliana* leaf surfaces (Shroff et al., 2015). Data obtained from LAESI and liquid extraction surface analysis (LESA; Kertesz and Van Berkel, 2010) unambiguously supported the data obtained using a 9-aminoacridine matrix sublimed on the leaves and imaged using vacuum MALDI-MSI.

In addition, LAESI has been applied to human- and animal-derived samples. The applicability of LAESI to blood and serum samples for medical purposes as well as antihistamine quantification directly from human urine samples has been shown (Nemes and Vertes, 2007). Since then, metabolomic and lipidomic analysis of the electric organ of *Torpedo californica* (Sripadi et al., 2009), rat and mouse brain (Nemes et al., 2010; Shrestha et al., 2010), fish gills (Shrestha et al., 2013), and other samples (Parsiegla et al., 2012; Shrestha et al., 2014) has been reported. A LAESI system, DP-1000 LAESI, is now available commercially from Protea Bioscience (Morgantown, WV, USA). The system has a spatial resolution of ca. 200 µm and can be attached to diverse mass spectrometers. Early data on MSI of pesticides, mycotoxins, and plant metabolites from lemon or rose leaves have recently been published (Nielen and van Beek, 2014) using this source.

# Application of LDI-MSI *in Planta*

Laser desorption ionization can be applied *in planta,* as many important secondary metabolites contain conjugated double-bond systems like aromatic/heteroaromatic rings and show strong UV absorption at 337 or 355 nm; both wavelengths are emitted by the most common UV lasers. Plant pigments and compounds of the polyketide family readily absorb UV light and serve to desorb/ionize themselves. Elimination of MALDI matrices makes MSI at cellular resolution possible; see, for example, hypericins in glandular pigment cells of *Hypericum perforatum* or quercetin glucosides in *A. thaliana* petals or sepals as demonstrated by Hölscher et al. (2009) and shown in **Figure 1E**. A vacuum MALDI system Ultraflex (Bruker) with smart beam technology provided 10 µm spatial resolution. Hypericins were shown to co-localize with dark pigment glands. A recent advance in developing systems with even higher spatial resolution as well as mass accuracy was commercialized in the AP-SMALDI imagine10 (TransMIT, Giessen, Germany) source attached to a Q-Exactive system with orbital mass analyzer (Thermo Scientific, San Jose, CA, USA). Laser spot sizes smaller than 5 µm are possible, and LDI measurements can be performed at ambient conditions, thus preventing plant sample desiccation and deformation. This method is not limited to plants, as was documented by MSI of nematodes ingesting plant toxins from infected banana roots (Hoelscher et al., 2014) and by MSI of antibiotics produced by actinomycetes on beewolf cocoons (Kroiss et al., 2010). LDI coupled with a plasma torch, also known as laser ablation inductively coupled plasma MS (LA-ICP-MS), is used for imaging distribution of metals *in planta* (Becker et al., 2010) or to localize proteins labeled with antibodies containing a metal-reporter ion (Bendall et al., 2011). This method shows extreme sensitivity, and as desorbed tissue debris undergoes post-ionization in a plasma torch, the technique is also quantitative.

# Conclusion

Although plant tissues have been employed to characterize LAESI since the introduction of the technique in 2007, its application in plant metabolomics and MSI is still limited to proof-of-concept experiments, for example, with onion (*A. cepa*) bulbs. This limited use may be a result of the apparent dominance of MALDI applications in imaging with high spatial resolution and the initial barrier of acquiring a LAESI source, since instrumentation with high spatial resolution is not yet commercially available. Even custom-built realizations do not reach the benchmark resolutions reported for MALDI. Advantages such as the absence of an external matrix and the potential for direct correlation with microscopically gathered data through the means of software evaluation may, however, promote the use of LAESI over time. Interdisciplinary work, in particular, which is usually characterized by a wide variety of methods and thus depends on data correlation, might profit from these ionization techniques. As the literature reviewed here shows, the performance of the LAESI ion source is sufficient for utilization in larger studies of plant metabolomes, especially in MSI of target metabolites, and for answering current biological questions. The same can be said about LDI. It is less intrusive than MALDI, because it does not require an externally applied matrix. Additionally, the spatial resolution is not compromised by the matrix crystals, which could be larger than the studied cells. Typically, using diverse orthogonal methods can be fruitful and is of help in reducing experimental bias.

## Acknowledgments

We thank Emily Wheeler for editorial assistance and the Max Planck Society for a stipend to BB and for financial support.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Bartels and Svatoš. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cytological and proteomic analyses of horsetail (*Equisetum arvense* L.) spore germination

Qi Zhao<sup>1†</sup>, Jing Gao<sup>2†</sup>, Jinwei Suo<sup>2†</sup>, Sixue Chen<sup>3</sup>, Tai Wang<sup>4</sup> and Shaojun Dai<sup>1</sup>\*

<sup>1</sup> Development Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai, China, <sup>2</sup> Key Laboratory of Saline-alkali Vegetation Ecology Restoration in Oil Field, Ministry of Education, Alkali Soil Natural Environmental Science Center, Northeast Forestry University, Harbin, China, <sup>3</sup> Department of Biology, Interdisciplinary Center for Biotechnology Research, Genetics Institute, Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL, USA, <sup>4</sup> Institute of Botany, Chinese Academy of Sciences, Beijing, China

#### *Edited by:*

Wagner L. Araújo, Universidade Federal de Viçosa, Brazil

#### *Reviewed by:*

Ján A. Miernyk, University of Missouri, USA Kazuhiro Takemoto, Kyushu Institute of Technology, Japan

#### *\*Correspondence:*

Shaojun Dai, College of Life and Environmental Sciences, Shanghai Normal University, Guilin Rd. 100, Shanghai 200234, China daishaojun@hotmail.com

> † These authors have contributed equally to this work.

#### *Specialty section:*

This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science

> *Received:* 19 March 2015 *Accepted:* 29 May 2015 *Published:* 17 June 2015

#### *Citation:*

Zhao Q, Gao J, Suo J, Chen S, Wang T and Dai S (2015) Cytological and proteomic analyses of horsetail (Equisetum arvense L.) spore germination. Front. Plant Sci. 6:441. doi: 10.3389/fpls.2015.00441

Spermatophyte pollen tubes and root hairs have been used as single-cell-type model systems to understand the molecular processes underlying polar growth of plant cells. Horsetail (Equisetum arvense L.) is a perennial herb species in Equisetopsida, which creates separately growing spring and summer stems in its life cycle. The mature chlorophyllous spores produced from spring stems can germinate without dormancy. Here we report the cellular features and protein expression patterns in five stages of horsetail spore germination (mature spores, rehydrated spores, double-celled spores, germinated spores, and spores with protonemal cells). Using 2-DE combined with mass spectrometry, 80 proteins were found to change in abundance upon spore germination. Among them, proteins involved in photosynthesis, protein turnover, and energy supply were over-represented. Thirteen proteins appeared as proteoforms on the gels, indicating the potential importance of post-translational modification. In addition, the dynamic changes of ascorbate peroxidase, peroxiredoxin, and dehydroascorbate reductase implied that reactive oxygen species homeostasis is critical in regulating cell division and tip-growth. The time course of germination and diverse expression patterns of proteins in photosynthesis, energy supply, lipid and amino acid metabolism indicated that heterotrophic and autotrophic metabolism were necessary in light-dependent germination of the spores. Twenty-six proteins were involved in protein synthesis, folding, and degradation, indicating that protein turnover is vital to spore germination and rhizoid tip-growth. Furthermore, the altered abundance of 14-3-3 protein, small G protein Ran, actin, and caffeoyl-CoA O-methyltransferase revealed that signal transduction, vesicle trafficking, cytoskeleton dynamics, and cell wall modulation were critical to cell division and polar growth. These findings lay a foundation toward understanding the molecular mechanisms underlying fern spore asymmetric division and rhizoid polar growth.

Keywords: spore germination, *Equisetum arvense* L., fern, proteomics, single cell, polar growth

### Introduction

Sexual reproduction is crucial in the plant life cycle. Spermatophyte seeds, pollen grains, and fern spores play central roles in sexual reproduction and share the capability of surviving harsh conditions and resuming growth upon emergence from dormancy. They resemble each other physiologically in their resurrection and in the polar growth that accompanies the emergence of roots, pollen tubes, and rhizoids, respectively. Previous reports have addressed the physiological and molecular characteristics of pollen and seed germination using genomics and proteomics approaches (Dai et al., 2007a,b; Tan et al., 2013), but comparable information is lacking for fern spore germination.

The similarity of germination among spermatophyte pollen grains, seeds, and fern spores includes mobilization and organization of limited reserves for polar growth within a short time frame. Previously, 33 and 123 genes were found to be commonly expressed when Ceratopteris richardii spores were compared with Arabidopsis thaliana pollen and seeds, respectively (Bushart and Roux, 2007; Salmi et al., 2007). Some proteins encoded by these genes (e.g., Rop GTPase, Mago nashi, calmodulin 2, No pollen germination 1, phospholipase D, synaptobrevin, and constitutive photomorphogenic 9) are mainly involved in calcium signaling, vesicle trafficking, and ubiquitin-mediated protein degradation. Although the functions of the proteins encoded by some of the important genes (e.g., Rop GTPase, calmodulin, and phospholipase D) have been well-studied in pollen grains (Malhó et al., 2006), their roles in fern spore germination are not clear. In spite of polar growth similarities, pollen and fern spores are evolutionarily distinct. Pollen grains are reduced male gametophytes of two or three cells that reach the ovary by means of a tip-growing pollen tube after recognition on an appropriate stigma, whereas mononuclear fern spores can generate independently living gametophytes through germination (Dai et al., 2008). Fern spore germination is somewhat more complex in that the spores undergo extensive cell division and differentiation during the emergence of the rhizoid and prothallus under various environmental conditions (Bushart and Roux, 2007). Fern spores represent a new single-celled model for investigating asymmetric cell division, differentiation, and polar growth (Chatterjee et al., 2000; Bushart and Roux, 2007).

Previous physiological studies have reported that spore germination of more than 200 fern species was modulated by various environmental factors, such as light, gravity, calcium, phytohormones, and temperature (Suo et al., 2015). Recently, gene function analyses revealed that several signaling pathways are crucial for fern spore germination. Phytochrome and cryptochrome signaling is important for the first spore cell mitosis (Suetsugu and Wada, 2003; Kamachi et al., 2004; Tsuboi et al., 2012). In addition, gravity and calcium signaling determine polarity establishment, asymmetric cell division, and the direction of rhizoid elongation (Chatterjee et al., 2000; Bushart and Roux, 2007; Bushart et al., 2014), and nitric oxide functions as a signal molecule in the regulation of gravity-directed fern spore polarity (Salmi et al., 2007). Moreover, gibberellin and antheridiogen can initiate and promote spore germination in many species, but abscisic acid, jasmonic acid, and ethylene have only minor promoting effects (Suo et al., 2015). Furthermore, the enzymes involved in the glyoxylate cycle (e.g., isocitrate lyase and malate synthase) and genes encoding aconitase, which converts citrate to isocitrate, were induced in germinating spores from Onoclea sensibilis (DeMaggio and Stetler, 1980), Anemia phyllitidis (Gemmrich, 1979), and C. richardii (Banks, 1999), indicating the importance of lipid reserve degradation and mobilization during spore germination. However, the molecular regulatory mechanisms in these processes are still unknown.

Germination of fern spores, especially the chlorophyllous spores (green spores), is a sophisticated signaling and metabolic process. Spores from Equisetum species are chlorophyll-bearing and of short viability (only a few weeks) (Ballesteros et al., 2011). Equisetum spores can germinate immediately under appropriate conditions with high humidity (Lebkuecher, 1997); therefore, they are good materials for studying chlorophyllous spore germination. Equisetum is the oldest living genus of vascular plants, containing 15–25 extant hollow-stemmed taxa (Guillon, 2007). Most Equisetum species are regarded as persistent weeds in wetlands (Large et al., 2006). The reproduction of Equisetum species depends mainly on the growth of the underground rhizomes rather than on spore germination. Large-scale comparative proteomics of developing rhizomes of Equisetum hyemale has identified 1911 and 1860 proteins in the rhizome apical tip and elongation zone, respectively (Balbuena et al., 2012). However, information on the cellular and proteomic features of spore germination is still lacking. In the present study, we carried out cellular and 2-DE based proteomics analysis of horsetail (Equisetum arvense L.) spore germination to reveal the signaling and metabolic characteristics.

## Materials and Methods

#### Collection and Germination of Mature Horsetail Spores

**Abbreviations:** APX, ascorbate peroxidase; ARF, ADP-ribosylation factor; CCoAOMT, caffeoyl-CoA O-methyltransferase; CDSP32, chloroplast drought-induced stress protein of 32 kDa; DAP, differentially abundant protein; DCS, double-celled spores; DHAR, dehydroascorbate reductase; EF, elongation factor; eIF, eukaryotic translation initiation factor; GO, Gene Ontology; GS, germinated spores; HAI, hours after illumination; HSP, heat shock protein; IPMDH, 3-isopropylmalate dehydrogenase; MS, mature spores; PDX, pyridoxal biosynthesis protein; Prx, peroxiredoxin; RAN, GTP-binding nuclear protein Ran; ROS, reactive oxygen species; RS, rehydrated spores; RuBisCO, ribulose-1,5-bisphosphate carboxylase/oxygenase; SPC, spores with protonemal cells; TCA, tricarboxylic acid; TCP, T-complex protein; Trx, thioredoxin; TypA, tyrosine phosphorylated protein A.

The mature fertile sporophylls (the separate stalks in the spring) of horsetail (E. arvense) were collected in a suburb of Harbin (45°27′ N, 127°52′ E), Heilongjiang province, China. The mature spores (MS) were released from the sporangia by the movement of their elaters during dehydration at room temperature (25 °C). To synchronize spore germination, the collected fresh MS were soaked in deionized H2O overnight in the dark. Then, the rehydrated spores (RS) were transferred into a liquid germination medium (4.6 mM Ca(NO3)2·4H2O, 2 mM KNO3, 1.5 mM KH2PO4, 0.8 mM MgSO4·7H2O) and cultured in an environmentally controlled chamber at 25 ± 2 °C under continuous (24 h) light at 60 µmol·m<sup>−2</sup>·s<sup>−1</sup>. The number of germinating spores at each of the stages of RS, double-celled spores (DCS), germinated spores (GS), and spores with protonemal cells (SPC) was examined under a microscope. Spores at all these stages were collected by centrifugation at 500 × g for 5 min and either used immediately for protein extraction or stored at −80 °C.
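For readers preparing the germination medium, the millimolar concentrations given in this subsection can be converted into weighed amounts. The following is a minimal sketch of that arithmetic; the molar masses are standard literature values that we assume here (they are not stated in the text), and the function and variable names are illustrative.

```python
# Convert the germination medium recipe from mM to grams for a given volume.
# Molar masses (g/mol) are standard values assumed for this sketch; they are
# not given in the methods text.
MOLAR_MASS_G_PER_MOL = {
    "Ca(NO3)2·4H2O": 236.15,
    "KNO3": 101.10,
    "KH2PO4": 136.09,
    "MgSO4·7H2O": 246.47,
}

# Concentrations in mM, as stated for the liquid germination medium.
MEDIUM_MM = {
    "Ca(NO3)2·4H2O": 4.6,
    "KNO3": 2.0,
    "KH2PO4": 1.5,
    "MgSO4·7H2O": 0.8,
}


def grams_needed(volume_l: float = 1.0) -> dict:
    """Return the mass in grams of each salt for the requested volume."""
    return {
        salt: conc_mm / 1000.0 * MOLAR_MASS_G_PER_MOL[salt] * volume_l
        for salt, conc_mm in MEDIUM_MM.items()
    }


for salt, grams in grams_needed(1.0).items():
    print(f"{salt}: {grams:.3f} g per litre")
```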

### Observation of Spore Morphology Upon Germination

Morphological characteristics of spores upon germination were examined under an Axioskop 40 fluorescence microscope (Zeiss, Oberkochen, Germany) with or without staining using 0.2 µg·µL<sup>−1</sup> 4′,6-diamidino-2-phenylindole (Molecular Probes, Carlsbad, USA). Observation of MS was performed under a HITACHI S-520 scanning electron microscope (Hitachi, Tokyo, Japan) (Dai et al., 2002). Living MS were prepared by standard techniques for scanning electron microscope observation. Spores were fixed, gradually dehydrated, and critical-point dried using a Balt-Tec CPD-030 critical point dryer (Balt-Tec AG, Balzers, Liechtenstein) according to Dai et al. (2002). Dry spores were mounted on aluminum stubs using double-sided tape and coated with gold-palladium using a Denton Vacuum Desk II sputter coater (Denton Vacuum Inc., Cherry Hill, USA).

#### Protein Extraction and Quantification

For protein preparation, 0.5 g of spores from each of the five stages of germination (MS, RS, DCS, GS, and SPC) were ground to powder in liquid nitrogen using a chilled mortar and pestle. Total protein of spores was extracted according to the method of Wang et al. (2010). Protein samples were prepared independently from three different batches of plants, considered as three biological replicates. Protein concentration was determined using a Quantkit according to the manufacturer's instructions (GE Healthcare, Salt Lake City, USA).

#### 2-DE, Gel Staining and Protein Abundance Analysis

The protein samples were separated and visualized using 2-DE according to Dai et al. (2006). An aliquot of 1.3 mg total protein was diluted with rehydration buffer (7 M urea, 2 M thiourea, 0.5% CHAPS, 20 mM DTT, 0.5% immobilized pH gradient (IPG) buffer 4–7, and 0.002% bromphenol blue) to a final volume of 450 µL and loaded onto an IPG strip holder containing a 24 cm, pH 4–7 linear gradient IPG strip (GE Healthcare, Salt Lake City, USA). Isoelectric focusing was performed in the Ettan IPGphor isoelectric focusing system (GE Healthcare, Salt Lake City, USA) following the protocol of the manufacturer. For SDS-PAGE, the equilibrated IPG strips were transferred onto 12.5% acrylamide gels by using an Ettan DALT Six Electrophoresis Unit (GE Healthcare, Salt Lake City, USA). The gels were stained with Coomassie Brilliant Blue. Gel image acquisition and analysis were conducted as previously described (Wang et al., 2010). Images were acquired by scanning each stained gel using an ImageScanner (GE Healthcare, Salt Lake City, USA) at a resolution of 300 dpi and 16-bit grayscale pixel depth, and then analyzed using ImageMaster 2D version 5.0 (GE Healthcare, Salt Lake City, USA). The experimental molecular weight of each protein was estimated by comparison with the co-separated molecular weight markers. The experimental pI of each protein was determined by its migration on IPG linear strips. For quantitative analysis, the average vol% values were calculated from three biological replicates. Protein spots with reproducible and statistically significant changes in intensity (greater than 1.5-fold and p < 0.05) were considered to be differentially abundant protein (DAP) spots.
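The DAP selection rule stated above (more than 1.5-fold change and p < 0.05) can be summarized in a short sketch. This is not the ImageMaster workflow itself: the function names are hypothetical, the fold change is assumed to be symmetric (an increase or a decrease both count), and the p-value is taken as an input because the statistical test is not specified in this paragraph.

```python
# Sketch of the DAP spot criterion: > 1.5-fold change in mean vol% and p < 0.05.
# Names are illustrative; the p-value must come from whichever statistical
# test the analyst applies to the three biological replicates.
from statistics import mean
from typing import Sequence


def fold_change(mean_a: float, mean_b: float) -> float:
    """Symmetric fold change (always >= 1) between two mean vol% values."""
    high, low = max(mean_a, mean_b), min(mean_a, mean_b)
    if high == 0:
        return 1.0           # spot absent in both stages: no change
    if low <= 0:
        return float("inf")  # spot present in only one stage
    return high / low


def is_dap(vol_a: Sequence[float], vol_b: Sequence[float], p_value: float,
           min_fold: float = 1.5, alpha: float = 0.05) -> bool:
    """Apply the fold-change and significance thresholds to one spot."""
    return fold_change(mean(vol_a), mean(vol_b)) > min_fold and p_value < alpha


# Hypothetical vol% values from three biological replicates at two stages.
print(is_dap([0.10, 0.12, 0.11], [0.20, 0.22, 0.21], p_value=0.01))
```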

#### Protein Identification Using Mass Spectrometry and Database Searching

The DAP spots were manually excised from the 2-DE gels, and the in-gel digestion was performed as described previously (Dai et al., 2006). Tandem mass spectrometry spectra were acquired on an ESI-Q-TOF mass spectrometer (QSTAR XL) and an ESI-Q-Trap mass spectrometer (Applied Biosystems, Foster City, USA) (Wang et al., 2010; Yu et al., 2011). The tandem mass spectrometry spectra were searched against the NCBI non-redundant protein database (http://www.ncbi.nlm.nih.gov/) using Mascot software (Matrix Science, London, UK), according to the search criteria described previously (Yu et al., 2011). The taxonomic category was green plants (3,019,757 sequence entries), mass accuracy was 0.3 Da, and the maximum number of missed cleavages was set to one. To obtain highly confident identifications, proteins had to meet the following criteria: (1) the top hit in the database search report, (2) a probability-based MOWSE score greater than 43 (p < 0.05), and (3) more than two peptides matched with nearly complete y-ion series and complementary b-ion series. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (Vizcaíno et al., 2014) via the PRIDE partner repository with the dataset identifier PXD002218.

#### Protein Classification and Hierarchical Clustering Analysis

For functional classification of DAPs, the peptide sequences of each DAP were aligned against the Gene Ontology (GO) protein database (http://geneontology.org) following the policies and procedures provided by the GO Consortium (http://geneontology.org/). The proteins were classified according to their cellular component, molecular function, and biological process. In addition, the PSI-BLAST and PHI-BLAST programs (http://www.ncbi.nlm.nih.gov/BLAST/) were used to search against the NCBI non-redundant protein database for protein functional domain annotation, and the biological function of each protein was obtained from the Kyoto Encyclopedia of Genes and Genomes pathway database (http://www.kegg.jp/kegg/). Conserved protein functions during spore germination were also predicted from previous publications on germinating seeds and pollen. Finally, by integrating all the information collected in these steps, each DAP was assigned to a functional category defined by us. The definitions of the functional categories follow the literature on pollen and seed germination (Supplementary Figure S1). Log (base 2) transformed ratios were used for hierarchical clustering analysis using Cluster 3.0 (http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm). The protein abundance ratio was calculated as protein abundance at the MS stage divided by abundance at each stage. Using a tree algorithm, the DAPs were organized based on similarities in their expression profiles. Proteins are joined by very short branches if they are very similar to each other, and by increasingly longer branches as their similarity decreases. Java TreeView (http://jtreeview.sourceforge.net/) was used for data visualization.
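The ratio transformation that feeds the clustering can be sketched as follows. This is only an illustration of the input preparation, not of Cluster 3.0 itself: the Euclidean distance and the optional pseudocount for spots with zero abundance are assumptions, and the abundance values are hypothetical.

```python
import math

STAGES = ["MS", "RS", "DCS", "GS", "SPC"]


def log2_ratios(abundance, reference="MS", pseudocount=0.0):
    """Return log2(reference abundance / stage abundance) for every stage."""
    ref = abundance[reference] + pseudocount
    ratios = []
    for stage in STAGES:
        value = abundance[stage] + pseudocount
        if ref <= 0 or value <= 0:
            raise ValueError("zero abundance: supply a positive pseudocount")
        ratios.append(math.log2(ref / value))
    return ratios


def euclidean_distance(profile_a, profile_b):
    """Distance between two log-ratio profiles; small values mean similar trends."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical mean vol% values for two spots.
spot_a = {"MS": 4.0, "RS": 2.0, "DCS": 4.0, "GS": 8.0, "SPC": 16.0}
spot_b = {"MS": 2.0, "RS": 1.0, "DCS": 2.0, "GS": 4.0, "SPC": 8.0}
print(log2_ratios(spot_a))
print(euclidean_distance(log2_ratios(spot_a), log2_ratios(spot_b)))
```

Because the ratio is MS divided by each stage, a protein that increases during germination has negative values. The two example spots differ in absolute abundance but share the same profile, so they would be joined by a very short branch.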

#### Protein Subcellular Location and Protein-protein Interaction Analysis

The subcellular location of the identified proteins was predicted using five internet tools: (1) YLoc (http://abi.inf.uni-tuebingen.de/Services/YLoc/webloc.cgi), confidence score ≥0.4; (2) LocTree3 (https://rostlab.org/services/loctree3/), expected accuracy ≥80%; (3) ngLOC (http://genome.unmc.edu/ngLOC/index.html), probability ≥80%; (4) TargetP (http://www.cbs.dtu.dk/services/TargetP/), reliability class ≤3; (5) Plant-mPLoc (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/), for which no threshold value applies. Only consistent predictions from at least two tools were accepted as a confident result. When the predictions of the five tools were inconsistent, the subcellular localization of the corresponding protein was assigned based on the literature.
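The consensus rule can be expressed as a short sketch. The thresholds are those listed above; the data layout, the function name, and the handling of ties (all compartments supported by at least two tools are returned) are assumptions made for illustration.

```python
from collections import Counter

# Acceptance rule per tool, taken from the thresholds listed in the text.
THRESHOLDS = {
    "YLoc": lambda score: score >= 0.4,     # confidence score
    "LocTree3": lambda score: score >= 80,  # expected accuracy (%)
    "ngLOC": lambda score: score >= 80,     # probability (%)
    "TargetP": lambda score: score <= 3,    # reliability class
    "Plant-mPLoc": lambda score: True,      # no threshold
}


def consensus_localization(predictions, min_agree=2):
    """predictions maps tool name -> (compartment, score).

    Returns the sorted compartments supported by at least min_agree tools
    whose prediction passes that tool's threshold. An empty list means the
    tools disagree and the localization has to come from the literature.
    """
    votes = Counter(
        compartment
        for tool, (compartment, score) in predictions.items()
        if THRESHOLDS[tool](score)
    )
    return sorted(c for c, n in votes.items() if n >= min_agree)


# Hypothetical predictions for one protein.
example = {
    "YLoc": ("Chl", 0.9),
    "LocTree3": ("Chl", 85),
    "ngLOC": ("Cyt", 60),  # below threshold, ignored
    "TargetP": ("Chl", 2),
    "Plant-mPLoc": ("Cyt", None),
}
print(consensus_localization(example))
```

In this example three tools pass their thresholds and agree on the chloroplast, while the cytoplasm has only one valid vote, so the protein is assigned to the chloroplast.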

The protein-protein interactions were predicted using the web tool STRING 9.1 (http://string-db.org). Arabidopsis homologs of the DAPs were identified by BLAST searches against the TAIR database (http://www.arabidopsis.org/Blast/index.jsp). The homologs were then submitted to the molecular interaction tool of STRING 9.1 to create the proteome-scale interaction network.

#### Statistical Analysis

All results are presented as means ± standard deviation of three biological replicates. Data were analyzed by One-Way ANOVA using the statistical software SPSS 17.0 (SPSS Inc., Chicago, USA). The mean values from different stages of spore germination were compared by the least significant difference post hoc test. A p-value less than 0.05 was considered statistically significant.
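For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the sketch below. It is a plain-Python illustration, not the SPSS procedure; it does not compute the p-value or the least significant difference comparisons, and the example values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical vol% values: five stages, three biological replicates each.
stages = [
    [0.10, 0.12, 0.11],  # MS
    [0.15, 0.16, 0.17],  # RS
    [0.19, 0.20, 0.21],  # DCS
    [0.20, 0.21, 0.22],  # GS
    [0.22, 0.23, 0.24],  # SPC
]
f_stat, df_b, df_w = one_way_anova_f(stages)
print(f_stat, df_b, df_w)
```

With five stages and three replicates per stage, the degrees of freedom are 4 (between stages) and 10 (within stages).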

# Results and Discussion

#### Horsetail Chlorophyllous Spore Germination Process

Chlorophyllous spores from fern species in a few unrelated taxa in Pteridophyta can germinate in less than 3 days and their viability lasts about 1 year or less (Lloyd and Klekowski, 1970). Equisetum spores are typical chlorophyllous spores with short viability, which can germinate immediately under humid conditions and remain viable for about 2 weeks (Lebkuecher, 1997). In this study, the spores were sown and cultured on Knop's medium under two different illumination levels (i.e., 30 µmol·m<sup>−2</sup>·s<sup>−1</sup> and 60 µmol·m<sup>−2</sup>·s<sup>−1</sup>) after 12 h dark imbibition in deionized H<sub>2</sub>O at room temperature. We found that the illumination level had an obvious effect on the horsetail spore germination rate. At 16 h after illumination (HAI), the germination rate of spores under the higher illumination level (60 µmol·m<sup>−2</sup>·s<sup>−1</sup>) was over 50%, but that of spores under the lower illumination level (30 µmol·m<sup>−2</sup>·s<sup>−1</sup>) was only 20%. The spores under the higher light intensity reached the maximum germination rate of 95% at 32 HAI, but the maximum germination rate of spores under 30 µmol·m<sup>−2</sup>·s<sup>−1</sup> was only 35% (Supplementary Figure S2). Thus, the 60 µmol·m<sup>−2</sup>·s<sup>−1</sup> light was used for horsetail spore germination. In fact, the horsetail MS had already initiated germination during dark imbibition. Eighty-seven percent of the RS had finished polar migration of the cell nucleus in the direction of gravity, and the spore nucleus started its division at 12 HAI in preparation for asymmetric cell mitosis (**Figure 1**). At 8 HAI, 82% of spores had finished the first mitosis to generate DCS containing a larger cell and a smaller cell, and the smaller cell of 4% of spores had elongated to form an obvious rhizoid. At this time point, we started to calculate the spore germination rate, defined as the ratio of GS number to total spore number.
At 18 HAI, more than 70% of spores had generated rhizoids and were defined as GS. At 32 HAI, spores reached the maximum germination rate of over 95%, and the larger cell (protonemal cell) of over 87% of spores had finished the second mitosis to give rise to the photosynthetic prothallus. The spores at this stage were defined as SPC (**Figure 1**).

#### Cytological Characteristics of Horsetail Germinating Spores

The horsetail chlorophyllous spores were generated in the sporangia of strobili. The MS were released from the partially dried strobilus. Horsetail spores are unusual in both morphology and physiology. The spores are typically about 25–35 µm in diameter with four flexible ribbon-like elaters of about 100 µm each (**Figures 2A,E,I,Q**). The elaters, which initially wrap around the spore body, deploy upon dehydration and fold back in humid air. Elater movement driven by humidity variations helps the spores exit the sporangium and, in particular, lets them catch the wind again when they jump from the ground, which is believed to be a novel type of efficient self-propelled dispersal mechanism (Marmottant et al., 2013). The single nucleus was clearly visible in the center of the spores (**Figures 2E,I**). The elaters were lost when spores were cultured in liquid medium, and the cell nucleus migrated in the direction of gravity (**Figures 2N,P**). After 12 h dark imbibition, the RS had swollen to a diameter of about 50 µm. In RS, the nucleus had completed the first mitosis, and two nuclei at the bottom side of the spore could be clearly observed (**Figures 2B,F,J**). Subsequently, at 8 HAI, a new cell wall was formed between the two nuclei to generate an asymmetric DCS containing a larger cell and a smaller cell (**Figures 2C,G,K**). The smaller cell started to elongate out of the spore, and a rhizoid of approximately 30 µm emerged at 18 HAI to form GS (**Figures 2D,H,L**). In the GS, the nucleus of the rhizoid cell still remained inside the spore (**Figures 2H,L**). As the rhizoid elongated, its nucleus moved out of the spore to the center of the rhizoid at 28 HAI (**Figures 2O,R**). At the same time, the larger cell inside the spore completed the second mitosis to form a new protonemal cell (**Figures 2O,R**). Thus, the spore finished its germination by the formation of spores with a rhizoid cell and two protonemal original cells (**Figures 2M,O,R**).

#### DAPs Upon Spore Germination

To determine DAPs in mature and germinating spores, protein samples collected at the five stages (MS, RS, DCS, GS and SPC) were subjected to 2-DE analysis. On the Coomassie Brilliant Blue-stained gels (pH 4–7, 24 cm IPG strip), 1243 ± 47, 1247 ± 34, 1254 ± 31, 1229 ± 57, and 1234 ± 40 spots were detected from MS, RS, DCS, GS, and SPC, respectively (**Figure 3**, Supplementary Figure S3). Among them, 139 protein spots showed differential abundances across the five stages of spore germination (>1.5-fold, p < 0.05). A total of 131 spots were identified using tandem mass spectrometry and Mascot database searching, and eight spots had no match in the database. Among the identified spots, 28 contained only a single peptide match and were considered unidentified according to our criteria. Of the remaining 103 spots, 80 contained a single protein each (**Table 1**, Supplementary Table S1) and 23 contained more than one protein each (Supplementary Table S2). Thus, the 80 proteins were taken as the DAPs during spore germination.

The 80 DAPs were classified into eleven groups: photosynthesis (17), carbohydrate and energy metabolism (9), other metabolisms (5), signaling and vesicle trafficking (8), cell structure (5), cell cycle (2), transcription related (1), protein synthesis (10), protein folding and processing (8), protein degradation (8), and stress and defense (7) (**Table 1**). Among them, proteins involved in photosynthesis (21%) and protein synthesis (13%) were over-represented (**Figure 4A**), indicating that active photosynthesis and de novo protein synthesis are pivotal for the germinating chlorophyllous spores.
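The category shares quoted above follow directly from the counts, as the short check below shows. The half-up rounding is an assumption about how the percentages were rounded; the counts are those given in the text.

```python
import math

CATEGORY_COUNTS = {
    "photosynthesis": 17,
    "carbohydrate and energy metabolism": 9,
    "other metabolisms": 5,
    "signaling and vesicle trafficking": 8,
    "cell structure": 5,
    "cell cycle": 2,
    "transcription related": 1,
    "protein synthesis": 10,
    "protein folding and processing": 8,
    "protein degradation": 8,
    "stress and defense": 7,
}

total = sum(CATEGORY_COUNTS.values())


def percent_half_up(count, total):
    """Percentage rounded half up (Python's round() would turn 12.5 into 12)."""
    return math.floor(100 * count / total + 0.5)


shares = {name: percent_half_up(n, total) for name, n in CATEGORY_COUNTS.items()}
print(total, shares["photosynthesis"], shares["protein synthesis"])
```

The eleven counts sum to 80, and the shares for photosynthesis (17/80) and protein synthesis (10/80) come out at 21% and 13%.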

Interestingly, the 80 DAPs represented 48 unique proteins. Thirteen proteins (16.3%) had multi-proteoforms, which were mainly involved in photosynthesis, tricarboxylic acid (TCA) cycle, vesicle trafficking, protein synthesis, folding and turnover, as well as reactive oxygen species (ROS) scavenging (**Table 1**, Supplementary Table S3). These proteoforms might be generated from alternative splicing and various post-translational modifications.

The subcellular localization of the 80 DAPs was predicted based on five internet tools (i.e., YLoc, LocTree3, ngLOC, TargetP, and Plant-mPLoc) and literature. In total, 32 DAPs were predicted to be localized in chloroplast, 27 in cytoplasm, one in Golgi apparatus, two in mitochondria, and six in nucleus. Besides, seven proteins were predicted to be localized in two organelles, and five proteins in four organelles (**Figure 4B**, Supplementary Table S4).

To better understand the protein expression characteristics during the five stages, hierarchical clustering analysis was applied to the 80 proteins, which revealed three main clusters. Cluster I contained 21 proteins, which were not expressed in MS but were induced in other stages of germination. They were involved in photosynthesis, glycolysis, protein folding and turnover, signal transduction, and ROS scavenging, implying that these pathways were specifically induced in certain germination stages (**Figure 4C**). Cluster II included 35 proteins that were mainly decreased during spore germination. These proteins were divided into two subclusters. Subcluster II-1 contained the proteins mainly decreased during germination, and subcluster II-2 included proteins induced at the stage of GS but obviously reduced in SPC. The proteins in cluster II covered ten functional categories, that is, all categories except transcription (**Figure 4D**). The remaining 24 proteins were grouped into cluster III, representing proteins increased during spore germination. Cluster III contained two subclusters. Subcluster III-1 contained five proteins induced in the stages of RS and DCS but reduced in GS and SPC, and subcluster III-2 included 19 proteins significantly increased at all stages of germination (**Figure 4D**).

FIGURE 3 | … (B) Rehydrated spores. (C) Double-celled spores. (D) Germinated spores. (E) Spores with protonemal cells. Proteins were separated on 24 cm IPG strips (pH 4–7 linear gradient) using IEF in the first dimension, followed by 12.5% SDS-PAGE gels in the second dimension. The 2-DE gel was … spectrometry are marked with numbers on the gels. Molecular weight (MW) in kDa and pI of proteins are indicated on the left and top of the gels, respectively. Detailed information can be found in Table 1 and Supplementary Table S1.

#### Photosynthesis and Reserve Mobilization are Active in the Germinating Spores

Mature chlorophyllous spores of Equisetum species contain chloroplasts and water, making them metabolically active and short-lived. After release from the sporangium, the green spores can survive desiccation for less than 2 weeks (Lebkuecher, 1997). It has been found that E. hyemale spores can tolerate desiccation at 2% relative humidity for 24 h. Upon rehydration, they can rapidly regain photosynthetic competence (Lebkuecher, 1997). In this study, we found that the viability of spores from E. arvense lasted less than 3 weeks, and the fresh spores germinated in 32 h (**Figure 1**). The extremely short viability and rapid germination were mainly due to active photosynthesis and respiration in the chlorophyllous spores. This is different from non-green spores with long dormancy and viability (Lloyd and Klekowski, 1970). Upon germination, chlorophyllous spores also exhibited features distinct from the non-green spores. In our proteomic results, we found that 17 DAPs were photosynthesis-related proteins, including a chlorophyll a/b-binding protein, 10 proteoforms of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) large subunit, a RuBisCO large subunit binding protein, three proteoforms of RuBisCO activase, and two proteoforms of chloroplast transketolase (**Table 1**). The increased chlorophyll a/b-binding protein indicated that photosystem II was enhanced upon spore germination. The multiple proteoforms of RuBisCO large subunit, RuBisCO activase, and transketolase changed in abundance, possibly due to protein phosphorylation (Guitton and Mache, 1987). This implies that carbon assimilation is active in germinating spores. Importantly, we also found nine enzymes involved in carbohydrate and energy metabolism. They were six TCA cycle enzymes (i.e., a pyruvate dehydrogenase and five malate dehydrogenases), an enolase involved in glycolysis, a cytosolic 6-phosphogluconate dehydrogenase (an enzyme in the pentose phosphate pathway), and a fructokinase involved in sucrose and fructose metabolism. All these enzymes, accounting for 32% of DAPs, were considered key enzymes for carbon and energy supply during spore germination. Their variations also indicate that heterotrophic metabolism is crucial for chlorophyllous spore germination.

#### TABLE 1 | Differentially abundant proteins during *E. arvense* spore germination.

<sup>a</sup>Assigned spot number as indicated in *Figure 3*.

<sup>b</sup>The name and functional category of the proteins identified by ESI-Q-TOF and ESI-Q-Trap tandem mass spectrometry. Protein names marked with an asterisk (\*) have been edited by us based on searches against the NCBI non-redundant protein database for functional domains. The abbreviations for the protein names are given in brackets after the protein names.

<sup>c</sup>Protein subcellular localization predicted by software tools (YLoc, LocTree3, Plant-mPLoc, ngLOC, and TargetP). Only consistent predictions from at least two tools were accepted as a confident result. Pound signs (#) indicate that the prediction results were inconsistent among the five tools; in these cases the subcellular localizations were assigned based on the literature listed in Supplementary Table S4. Chl, chloroplast; Cyt, cytoplasm; Gol, Golgi apparatus; Mit, mitochondria; Nuc, nucleus; Pox, peroxisome.

<sup>d</sup>The plant species from which the matched peptides originate.

<sup>e</sup>Database accession number from the NCBI non-redundant protein database.

<sup>f,g</sup>Theoretical (f) and experimental (g) molecular weight (Da) and pI of identified proteins. Theoretical values were retrieved from the protein database. Experimental values were calculated using ImageMaster 2D version 5.0.

<sup>h</sup>The amino acid sequence coverage for the identified proteins.

<sup>i</sup>The Mascot score obtained after searching against the NCBI non-redundant protein database.

<sup>j</sup>The number of matched peptides for each protein.

<sup>k</sup>The mean values of protein spot volumes relative to the total volume of all the spots. Five spore germination stages were analyzed: MS, mature spores; RS, rehydrated spores; DCS, double-celled spores; GS, germinated spores; and SPC, spores with protonemal cells. Error bars indicate ± standard deviation (SD). Letters indicate statistically significant differences (p < 0.05) among the five stages of spore germination as determined by One-Way ANOVA.

We also found several proteins involved in fatty acid synthesis, amino acid metabolism, sulfur assimilation, and secondary metabolism (**Table 1**). Among them, enoyl-acyl carrier protein reductase is a key enzyme of the type II fatty acid synthesis system in plastids. It has been shown to be important for fatty acid deposition in developing seeds (de Boer et al., 1999) and pollen grains (Poghosyan et al., 2005). In our results, enoyl-acyl carrier protein reductase was decreased at the stages of DCS, GS and SPC during germination. This indicates that fatty acid synthesis is reduced upon spore germination, which is consistent with the notion that mobilization of storage lipids is triggered for energy supply in germinating spores (DeMaggio and Stetler, 1985). Besides, we found three enzymes involved in amino acid metabolism that increased at certain stages, including glycine decarboxylase, 3-isopropylmalate dehydrogenase (IPMDH), and ATP-sulfurylase. Glycine decarboxylase participates in glycine, serine and threonine metabolism, and is abundant in the mitochondria of C3 leaves, where it functions in photorespiratory carbon recovery (Timm et al., 2012). Whether glycine decarboxylase is involved in photorespiration in germinating fern spores needs further investigation. IPMDH catalyzes the oxidative decarboxylation of 3-isopropylmalate in leucine biosynthesis, and has been shown to be essential for Arabidopsis pollen development (He et al., 2011). In our results, IPMDH was obviously induced in the stage of GS, suggesting that IPMDH is also pivotal for spore germination. Interestingly, ATP-sulfurylase acts as the metabolic entry point into the sulfur assimilation pathway. An increase of the ATP-sulfurylase level during soybean seed development was found to increase the availability of sulfur amino acids (Phartiyal et al., 2006). The induced ATP-sulfurylase in GS and SPC may facilitate the synthesis of sulfur-rich amino acids for spore germination. Additionally, pyridoxal biosynthesis protein (PDX) was specifically expressed in germinating spores, but not found in MS. The PDX family in plants is primarily known for its role in vitamin B6 biosynthesis. Recently, PDX1.2 was shown to be critically required for hypocotyl elongation and primary root growth (Leuendorf et al., 2014). The increase of PDX in germinating spores implies that it probably functions in rhizoid elongation.

#### Signaling and Vesicle Trafficking are Important for Spore Germination

The critical roles of hormone signaling and vesicle trafficking in germinating seeds and pollen grains have been well-studied (Ellis and Turner, 2002; Dai et al., 2007a), but little information is available for fern spore germination. In this study, we found some novel factors in signal transduction and vesicle trafficking upon fern spore germination, including chalcone isomerase, WD40, 14-3-3 protein, GTP-binding nuclear protein Ran (RAN), and ADP-ribosylation factor (ARF) (**Table 1**). Among them, chalcone isomerase, which is involved in flavonoid biosynthesis, was increased at the stage of RS. Flavonoids participate in auxin signaling and facilitate pollen-tube growth (Falcone Ferreyra et al., 2012). In maize (Zea mays) and Petunia hybrida mutants, the flavonoid-deficient pollen failed to produce a functional pollen tube (Mo et al., 1992). The increase of chalcone isomerase in spores implies that flavonoids may function in fern spore germination as well. Interestingly, flavonoid biosynthesis is regulated by conserved WD40 domain-containing transcription factors (Falcone Ferreyra et al., 2012). Members of the WD40 protein superfamily are known as key regulators of multiple cellular processes (e.g., cell division, light signaling, protein trafficking, cytoskeleton dynamics, nuclear export, RNA processing, chromatin modification, and transcriptional mechanism), acting as scaffolding molecules that assist the proper activity of other proteins (Stirnimann et al., 2010). Moreover, some WD40 proteins have been found to be crucial for Arabidopsis seed germination (Gachomo et al., 2014) and pollen viability in flax (Linum usitatissimum L.) (Kumar et al., 2013). In our results, the nucleus-localized WD40 was induced in horsetail germinating spores, suggesting a regulatory function in spore germination and rhizoid tip-growth. Similarly, the 14-3-3 proteins are a highly conserved family of eukaryotic proteins with multiple molecular and cellular functions, which bind to phosphorylated client proteins to modulate their function (Denison et al., 2011). In lily (Lilium longiflorum) pollen, 14-3-3 protein was involved in the regulation of plasma membrane H+-ATPase via modulation of its activity, which is essential for germination and tube elongation (Pertl et al., 2010). In this study, we found that a 14-3-3 protein was induced in the stages of RS and SPC, suggesting that it plays roles in cell nuclear migration and rhizoid tip-growth. Importantly, we also found that four proteoforms of RAN were significantly increased in germinating spores. RAN, a primarily nucleus-localized small GTPase, is essential for nuclear transport and assembly, mRNA processing, and cell cycle (Ciciarello et al., 2007). It was reported that overexpression of RAN1 in rice (Oryza sativa) and Arabidopsis altered mitotic progress and sensitivity to auxin (Wang et al., 2006). The multiple proteoforms of RAN induced in germinating horsetail spores were probably due to different phosphorylation levels, implying probable key roles in cell division and polar growth during spore germination. Besides RAN, we found that another important small GTPase, ARF, was induced at the stages of RS and DCS. ARF contributes to the regulation of multiple trafficking routes with respect to Golgi organization, endocytic cycling, cell polarity and cytokinesis (Yorimitsu et al., 2014). ARF was found to play essential roles in endosomal recycling during polarized tip growth of Arabidopsis pollen and root hairs (Richter et al., 2011). The induced ARF in horsetail spores may meet the specific requirement for early-secretory and polar vesicle recycling to facilitate rhizoid tip growth.

#### Cytoskeleton and Cell Wall Dynamics are Necessary for Spore Germination

The actin cytoskeleton directs the flow of vesicles to the apical domain, where they fuse with the plasma membrane and contribute their contents to the expanding cell wall (Hepler et al., 2013). Local changes in the contents and viscosity of the apical wall control the local expansion rate and cell elongation. Precise mechanisms in the organization of the actin cytoskeleton and cell wall dynamics have been well-studied in growing pollen tubes (Hepler et al., 2013; Qu et al., 2015), but the mechanisms remain to be further elucidated in fern spores. Here we found that the levels of actin, reversibly glycosylated polypeptide, rhamnose biosynthetic enzyme 1, and caffeoyl-CoA O-methyltransferase (CCoAOMT) were altered during horsetail spore germination. Actin is the main component of the actin cytoskeleton, and the other three proteins are essential for cell wall modulation. Reversibly glycosylated polypeptide is implicated in polysaccharide biosynthesis (Langeveld et al., 2002), and may function in cell wall construction in pollen from rice and Picea meyeri (Dai et al., 2006; Chen et al., 2009). Besides, rhamnose biosynthetic enzyme catalyzes the synthesis of L-rhamnose, an important constituent of pectic polysaccharides in the cell wall of the pollen tube (Yue et al., 2014). In addition, CCoAOMT has a proven role in lignin monomer biosynthesis, which is crucial for pollen wall development (Arnaud et al., 2012). In our results, rhamnose biosynthetic enzyme was increased in GS and decreased in SPC, CCoAOMT was induced in RS and reduced in the stages of DCS and SPC, and two proteoforms of reversibly glycosylated polypeptide displayed different changes. All these alterations would modulate the biosynthesis of cell wall components (e.g., polysaccharide, rhamnose, and lignin), leading to changes in rhizoid cell wall rigidity that sustain polarized growth.

#### Rapid Protein Synthesis, Processing, and Turnover are Essential for Fern Spore Germination

Germinating pollen and fern spores switch quickly from a metabolically quiescent state to an active state. The substance and energy supply for rapid cell division and polar tip-growth needs to be triggered within a short time period. Although it has been found that mature pollen grains have pre-synthesized mRNA and proteins for germination and tube growth (Taylor and Hepler, 1997; Dai et al., 2006), de novo protein synthesis is necessary for pollen tube elongation (Dai et al., 2007a,b). For fern species, the germination of green spores from O. sensibilis and non-green spores from A. phyllitidis, Marsilea vestita, Pteridium aquilinum, and Pteris vittata was not inhibited by actinomycin D (Raghavan, 1977, 1991, 1992; Kuligowski et al., 1991). In horsetail spores, similar metabolic features were discovered from our proteomics data. We found eleven DAPs involved in transcription and protein synthesis, including a threonyl-tRNA synthetase, a 40S ribosomal protein SA, three eukaryotic translation initiation factors (eIFs) (an eIF3 and two eIF4A), five elongation factors (EFs) (two proteoforms of EF2, an EF-G, an EF-Tu, and an EF-Ts), and a protein translation-related GTP-binding protein tyrosine phosphorylated protein A (TypA) (**Table 1**). The increases of threonyl-tRNA synthetase in GS and SPC and of 40S ribosomal protein SA in RS and DCS indicate that the protein synthesis machinery is enhanced during spore germination, while the changes in multiple proteoforms of eIF and EF imply that the synthesis of certain specific proteins is regulated in diverse modes at different stages of spore germination. It is interesting to note that these proteins are localized in the cytoplasm, mitochondria, and chloroplast, respectively (**Table 1**). This indicates that not only is the synthesis of nuclear gene-encoded proteins necessary, but the protein synthesis machineries in the chloroplast and mitochondria are also triggered for the active metabolism in germinating spores.

Molecular chaperones not only control house-keeping processes, but also regulate protein functional folding and assembly which are necessary for activating/inhibiting various signaling pathways. In horsetail germinating spores, we found eight chaperones, including five proteoforms of heat shock protein 70 (HSP70), HSP90, T-complex protein 1 (TCP1) subunit alpha, and TCP1 subunit gamma. Among these HSP members, HSP70 was found to bind microtubules and interact with kinesin in tobacco (Nicotiana tabacum) pollen tubes (Parrotta et al., 2013). Besides, Arabidopsis HSP90 was found to be active in mature and germinating pollen grains (Prasinos et al., 2005). Moreover, TCP1 plays a pivotal role in the folding and assembly of cytoskeleton proteins as an individual or complex with other subunits (Sternlicht et al., 1993; Bhaskar et al., 2012). The variations of chaperones in spores highlight that protein processing is important for diverse processes (e.g., photosynthesis, cytoskeleton, protein turnover, and various metabolisms) upon horsetail spore germination.

In addition to folding and assembly, active protein turnover occurred in germinating horsetail spores, which was reflected by the changes of eight degradation-related proteins (**Table 1**). These proteins included an alpha 7 proteasome subunit, three proteoforms of zinc dependent protease, a zinc metalloprotease, two proteoforms of FtsH protease, and an ATP-dependent Clp protease (**Table 1**). Among them, the zinc dependent proteases, zinc metalloprotease, and FtsH are all ATP-dependent metalloproteases in chloroplasts, which play a major role in the assembly and maintenance of the plastidic membrane system. The Clp protease system plays an essential role in plastid development through selective removal of misfolded, aggregated, or unwanted proteins (Nishimura and van Wijk, 2014). All these findings imply that active protein degradation and turnover in chlorophyllous spores are crucial for spore germination.

#### ROS Homeostasis is Crucial for Spore Germination

In tip-growing pollen tubes and root hairs, ROS act as regulators in diverse signaling and metabolic pathways, modulating kinase cascades, ion channels, cell wall properties, nitric oxide levels, and G-protein activities (Wilson et al., 2008; Swanson and Gilroy, 2010). Interestingly, in fern spores, nitric oxide was shown to be a positive regulator of tip growth (Bushart and Roux, 2007). However, the function of ROS in fern spores is poorly understood. Our proteomics results revealed that seven proteins in germinating spores function as ROS scavengers, including two proteoforms of 2-cys peroxiredoxin (Prx), a chloroplast drought-induced stress protein of 32 kDa (CDSP32), ascorbate peroxidase (APX), and three proteoforms of dehydroascorbate reductase (DHAR). The proteoforms of Prx, CDSP32, DHAR, and APX are predicted to be localized in chloroplasts and/or cytoplasm in horsetail spores (**Table 1**). CDSP32 is composed of two thioredoxin (Trx) modules and has been found to be involved in the protection of the photosynthetic apparatus against oxidative damage (Broin et al., 2002). The PrxR/Trx pathway is a central antioxidant defense system in plants, in which PrxRs employ a thiol-based catalytic mechanism to reduce H<sub>2</sub>O<sub>2</sub> and are regenerated using Trxs as electron donors. APX and DHAR are the key members of the glutathione-ascorbate cycle. In this cycle, H<sub>2</sub>O<sub>2</sub> is reduced to water by APX using ascorbate as the electron donor. The oxidized ascorbate is still a radical, which can be converted into dehydroascorbate spontaneously or reduced to ascorbate by monodehydroascorbate reductase. Dehydroascorbate is then reduced to ascorbate by DHAR at the expense of glutathione, yielding oxidized glutathione. Finally, oxidized glutathione is reduced by glutathione reductase using NADPH as the electron donor (Horling et al., 2003). Previous proteomics studies have revealed that the abundances of APX and Trx were changed in germinating pollen grains from O. sativa (Dai et al., 2007a), Arabidopsis (Ge et al., 2011), Brassica napus (Sheoran et al., 2009), P. meyeri (Chen et al., 2009), Picea wilsonii (Chen et al., 2012), and Pinus strobus (Fernando, 2005). This indicated that the PrxR/Trx pathway and the glutathione-ascorbate cycle were employed in germinating pollen grains. Thus, our results indicate that chlorophyllous fern spores with active photosynthesis and respiration modulate ROS homeostasis through these enzymes during germination.

#### Prediction of Protein-protein Interaction Upon Spore Germination

Fern spore germination is a fine-tuned process regulated by the temporal and spatial expression and interaction of a number of genes/proteins. To discover the relationships among DAPs during horsetail spore germination, protein-protein interaction networks were generated with the web tool STRING 9.1 (http://string-db.org). The Arabidopsis homologs of the DAPs were found by sequence BLASTing in the TAIR database (http://www.arabidopsis.org/Blast/index.jsp) (Supplementary Table S5), and the homologs were then subjected to the molecular interaction tool of STRING 9.1 to create a proteome-scale interaction network. Among all the 80 DAPs, 63 proteins identified in germinating spores, represented by 38 unique homologous proteins from Arabidopsis, were depicted in the STRING database (**Figure 5**), according to information from the published literature, genome analysis based on domain fusion, phylogenetic profiling/homology, gene neighborhood, co-occurrence, co-expression, and other experimental evidence (Supplementary Figure S4). Four functional modules, which form tightly connected clusters, are apparent in the network (**Figure 5**). In the protein networks, stronger associations are represented by thicker lines (**Figure 5**). In Module 1 (green nodes), photosynthetic proteins (chlorophyll a/b-binding protein, RuBisCO large subunit, RuBisCO activase, and RuBisCO large subunit binding protein), three members of the chloroplastic protein synthesis machinery (EF-Ts, EF-G, and TypA), chloroplast-located Prx/Trx and zinc-dependent protease, as well as a mitochondria-located glycine decarboxylase, appeared closely linked. This implies that photosynthesis, photorespiration, chloroplastic protein synthesis and turnover, as well as ROS scavenging are active and closely coordinated in horsetail chlorophyllous spores. 
Besides, multiple metabolic enzymes (e.g., two malate dehydrogenases, enolase, transketolase, pyruvate dehydrogenase, and 6-phosphogluconate dehydrogenase), cytoplasmic members of the protein synthesis machinery (40S ribosomal protein SA, eIF4A, EF2), a molecular chaperone (HSP70), a protease (zinc metalloprotease), and 14-3-3 protein were assigned to Module 2 (blue nodes). These linked proteins show that diverse metabolic pathways (e.g., TCA cycle, glycolysis, pentose phosphate pathway, and Calvin cycle) formed a synergistic system for carbon and energy supplies during germination. Moreover, these metabolic activities were controlled by the levels and activities of key enzymes that were modulated through protein synthesis, folding, and turnover, while 14-3-3 proteins acted as crucial regulators of these metabolic processes (e.g., TCA cycle) (Diaz et al., 2011). Interestingly, actin is linked with TCP1, proliferation-associated protein 2G4, and two members of the protein synthesis machinery (eIF3 and EF-Tu) in Module 3 (yellow nodes), indicating that the synthesis and processing of cytoskeletal proteins are pivotal for the rapid cell division and cell cycle upon spore germination. Furthermore, proteins involved in protein folding (HSP70 and HSP90) and degradation (FtsH and ATP-dependent Clp protease), as well as ROS scavengers (APX and DHAR), were assigned to Module 4 (red nodes). This indicates that protein conformational changes determine the fates of proteins and are regulated by ROS homeostasis in germinating spores.

in *E. arvense* spores based on STRING analysis. A total of 63 differentially abundant proteins represented by 38 unique homologous proteins from Arabidopsis are shown in PPI network. Nodes in different confidence view generated by STRING database. Strong associations are represented by thicker lines. Detailed information on protein names and abbreviations can be found in Table 1.

# Conclusion and Remarks

Although the molecular mechanism of pollen germination has been well-studied as a model for polar cell growth, our knowledge of fern chlorophyllous spore germination is lacking. In this proteomics study, we found that some pollen homologous proteins and several novel components are pivotal for fern spore germination. The dynamics of photosynthesis, TCA cycle, glycolysis, and pentose phosphate pathway, as well as the variations of reserve mobilization pathways (fatty acid synthesis, amino acid metabolism, sulfur assimilation, and secondary metabolism) indicate that both heterotrophic and autotrophic metabolisms are triggered in chlorophyllous spores, which is clearly distinct from non-green spores and pollen grains. Besides, a number of proteins are suspected to be necessary for nuclear migration, cytoskeleton dynamics, and cell wall modulation during fern spore germination. Importantly, the protein synthesis machinery, protein processing, and proteasome-dependent protein degradation in the cytoplasm and chloroplasts are active for rapid protein synthesis and turnover. In addition, several members of ROS signaling and G protein-involved vesicle trafficking are crucial for polar rhizoid growth. All these findings provide invaluable information; however, further validation and characterization of these proteins in a model system (i.e., C. richardii), as well as analysis of post-translational modifications, are still necessary to ultimately discover protein functions and interactions and to understand the underlying sophisticated cellular and molecular processes in germinating fern spores.

# Acknowledgments

The project was supported by the National Natural Science Foundation of China (No. 31071194), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (2011), Capacity Construction Project of Local Universities, Shanghai, China (No. 14390502700) to SD, and by Funding Program for Young Teachers at Universities and Colleges of Shanghai (2014), and General Scientific Research Project of Shanghai Normal University (No. SK201419) to QZ.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00441/abstract

## References

germination and tube growth reveals characteristics of germinated Oryza sativa pollen. Mol. Cell. Proteomics 6, 207–230. doi: 10.1074/mcp.M600146-MCP200


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Zhao, Gao, Suo, Chen, Wang and Dai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Proteasome targeting of proteins in Arabidopsis leaf mesophyll, epidermal and vascular tissues

Julia Svozil, Wilhelm Gruissem and Katja Baerenfaller\*

*Plant Biotechnology, Department of Biology, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland*

Protein and transcript levels are partly decoupled as a function of translation efficiency and protein degradation. Selective protein degradation via the Ubiquitin-26S proteasome system (UPS) ensures protein homeostasis and facilitates adjustment of protein abundance during changing environmental conditions. Since individual leaf tissues have specialized functions, their protein composition is different and hence also protein level regulation is expected to differ. To understand UPS function in a tissue-specific context we developed a method termed Meselect to effectively and rapidly separate *Arabidopsis thaliana* leaf epidermal, vascular and mesophyll tissues. Epidermal and vascular tissue cells are separated mechanically, while mesophyll cells are obtained after rapid protoplasting. The high yield of proteins was sufficient for tissue-specific proteome analyses after inhibition of the proteasome with the specific inhibitor Syringolin A (SylA) and affinity enrichment of ubiquitylated proteins. SylA treatment of leaves resulted in the accumulation of 225 proteins and identification of 519 ubiquitylated proteins. Proteins that were exclusively identified in the three different tissue types are consistent with specific cellular functions. Mesophyll cell proteins were enriched for plastid membrane translocation complexes as targets of the UPS. Epidermis enzymes of the TCA cycle and cell wall biosynthesis specifically accumulated after proteasome inhibition, and in the vascular tissue several enzymes involved in glucosinolate biosynthesis were found to be ubiquitylated. Our results demonstrate that protein level changes and UPS protein targets are characteristic of the individual leaf tissues and that the proteasome is relevant for tissue-specific functions.

Keywords: protein level regulation, ubiquitylation, proteasome inhibition, leaf tissues, epidermis, mesophyll, vasculature, *Arabidopsis thaliana*

#### *Edited by:*

*Marc Libault, University of Oklahoma, USA*

#### *Reviewed by:*

*Kazuhiro Takemoto, Kyushu Institute of Technology, Japan*

*Asdrubal Burgos, Max Planck Institute for Molecular Plant Physiology, Germany*

#### *\*Correspondence:*

*Katja Baerenfaller, Plant Biotechnology, Department of Biology, Swiss Federal Institute of Technology Zurich, Zurich Universitaetstrasse 2, 8092 Zurich, Switzerland kbaerenfaller@ethz.ch*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 24 March 2015 Accepted: 11 May 2015 Published: 28 May 2015*

#### *Citation:*

*Svozil J, Gruissem W and Baerenfaller K (2015) Proteasome targeting of proteins in Arabidopsis leaf mesophyll, epidermal and vascular tissues. Front. Plant Sci. 6:376. doi: 10.3389/fpls.2015.00376*

# Introduction

Plant organs are composed of different tissues that are specialized for particular biological processes, and the functionality of the organ is the sum of the functions of each of its tissue types. Just as each Arabidopsis organ has its own functional proteome map (Baerenfaller et al., 2008), we also expect that individual tissues have specific protein compositions for their specific functions. In fully grown leaves, the main leaf tissue types are epidermis, mesophyll and vasculature. In Arabidopsis, the epidermis is composed of stomata and trichomes that are embedded in the single cell layer of pavement cells, which are covered with the waxy cuticle. The adaxial and abaxial epidermal tissues enclose the palisade and spongy mesophyll, which represent the main photosynthetic capacity of the leaf. Embedded in the mesophyll tissue is the vascular tissue, which consists of phloem, xylem and cambial cells (Tsukaya, 2002).

Considering the specific functions of the different leaf tissues, their protein composition would be expected to vary; however, leaf tissue-specific proteome information is currently not available. Since the correlation of absolute protein and transcript levels and their dynamic changes over time is limited (Baerenfaller et al., 2008, 2012; Walley et al., 2013), regulatory processes such as protein degradation likely influence the composition of tissue-specific proteomes. The activity of the ubiquitin-26S proteasome system (UPS) should therefore have specific signatures for individual specialized tissues. We previously identified changes in leaf protein composition after proteasome inhibition and a large number of direct UPS targets in leaves and roots (Svozil et al., 2014). However, whole-organ information does not have the necessary resolution to understand UPS function in a tissue-specific context, because once organs are homogenized, functional protein signatures can no longer be attributed to individual tissues or cell types (Brandt, 2005).

To avoid this limitation and conserve specific information, different techniques for the separation of tissues or cell types have been developed. In plants, pollen and tissue culture cells can be collected effectively because they are not attached to other tissues. For example, it has been possible to obtain enough Arabidopsis pollen for a large-scale proteomics study (Grobei et al., 2009). Trichomes and root hairs are other specialized plant cells that can be easily collected for small-scale proteomics experiments (Wienkoop et al., 2004; Marks et al., 2008; Brechenmacher et al., 2009; Nestler et al., 2011; Van Cutsem et al., 2011). Microcapillaries have also been used to collect cell sap from epidermal cells (Wienkoop et al., 2004) and vascular S-cells (Koroleva and Cramer, 2011) for proteomic analyses. However, this method is restricted to accessible cell types and soluble proteins only. Alternatively, single cell or tissue types can be precisely excised from tissue sections using laser capture microdissection, for example epidermal and vascular cells of maize coleoptiles (Nakazono et al., 2003) and Arabidopsis vascular bundle cells (Schad et al., 2005). For any other approach of selecting specific cell types, the cells first need to be released from their tissue context, which in plants requires the degradation of cell walls to obtain protoplasts. Leaf mesophyll and guard cell protoplasts can be easily collected and analyzed (Zhao et al., 2008; Zhu et al., 2009). Enrichment of guard cell protoplasts expressing green fluorescent protein (GFP) has been achieved using fluorescence-activated cell sorting (FACS) (Gardner et al., 2009). However, FACS of leaf protoplasts is challenging because chlorophyll autofluorescence can interfere with the sorting process (Galbraith, 2007). Nevertheless, FACS in combination with enhancer trap lines expressing GFP specifically in the leaf epidermal, vasculature or guard cells has allowed enrichment of the respective cell type (Grønlund et al., 2012). FACS has also been employed for transcriptome and proteome analysis of specific cell types in Arabidopsis roots both under standard and stress conditions (Birnbaum et al., 2003, 2005; Brady et al., 2007; Dinneny et al., 2008; Petricka et al., 2012). Recently, a method of protoplasting followed by sonication and manual separation of cotyledon epidermal and vascular cells was reported for tissue-specific transcriptome analyses (Endo et al., 2014). In general, however, single cell and tissue type-specific high-throughput proteomics experiments are challenging because of the limited availability of material and the low protein content of plant cells (Wienkoop et al., 2004; Koroleva and Cramer, 2011).

To enable tissue-specific proteome analyses we developed a rapid and effective method for the separation of the different leaf tissue types (Meselect, mechanical separation of leaf compound tissues). Here we report the specificity of the method in separating leaf mesophyll, vasculature and epidermis tissues. Using Meselect we generated tissue type-specific functional protein maps and identified responses to inhibition of the proteasome that account for the diverse composition of the leaf tissue types. Because Meselect produces a high enough yield of tissue-specific protein extracts for affinity enrichment experiments, we found several tissue-specific UPS target proteins that provide new insights into the role of protein degradation in tissue functions.

# Materials and Methods

#### Plant Growth and Inhibition of the Proteasome

Seeds of Arabidopsis thaliana ecotype Col-0 were stratified for 2 days in darkness at 4°C and afterwards grown in short-day conditions with 8 h light and 16 h darkness for 55 days at 22°C and 70% humidity. The proteasome was inhibited by spraying the adaxial epidermis of leaves with 10 µM Syringolin A (SylA) in 0.02% (v/v) Tween 20. As a mock control, leaves were sprayed with 0.02% (v/v) Tween 20 only. The whole rosette was treated 30–60 min before the end of the night period and leaves were harvested 8 h after treatment. SylA was produced as described in Svozil et al. (2014) and was kindly provided by R. Dudler (University of Zurich, Switzerland). The experiment was performed in three biological replicates.

#### Separation of the Different Leaf Tissues with the Meselect Method

After harvesting of the leaf, the TAPE sandwich method (Wu et al., 2009) was applied to remove the abaxial epidermis from the leaf. For this, the leaf was positioned between two tape strips and the tape facing the abaxial epidermis was removed. The tape containing the abaxial epidermis was incubated for 15 min in protoplasting solution (1% cellulase Onozuka RS (Yakult, Japan), 0.25% macerozyme Onozuka R10 (Yakult, Japan), 0.4 M mannitol, 10 mM CaCl2, 20 mM KCl, 0.1% BSA and 20 mM MES, pH 5.7) under constant agitation at 50 rpm. During this process spongy mesophyll cells adhering to the epidermis are released into the solution, but the epidermis remains attached to the tape. The tape including the epidermis was washed twice in washing buffer (154 mM NaCl, 125 mM CaCl2, 5 mM KCl, 5 mM glucose, and 2 mM MES, pH 5.7) and frozen in liquid nitrogen. After freezing, the epidermis could be easily removed from the tape with precooled tweezers and scrapers. The epidermis was collected in a precooled mortar, ground, and the tissue powder was stored at −80°C until further processing. The remainder of the leaf without the abaxial epidermis, which was attached to the other tape, was incubated for about 1 h in protoplasting solution, which releases the palisade and spongy mesophyll cells into solution. The mesophyll protoplasts were collected by centrifugation at 200 × g for 2 min at 4°C and washed twice with washing buffer. After complete removal of the buffer, the protoplasts were frozen in liquid nitrogen and stored at −80°C until further processing. The tape with the remaining attached upper epidermal and vascular tissue was immersed in cold washing buffer. The vascular tissue network was removed with tweezers, washed twice in washing buffer, frozen in liquid nitrogen, and stored at −80°C until further processing. The remainder of the tape with the attached adaxial epidermis was discarded because the specificity of the enrichment could not be confirmed for this tissue.
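
The buffer recipes above become weighed amounts once a batch volume is chosen. The following minimal sketch does that arithmetic. The 100 ml batch volume, the use of anhydrous molar masses, and the reading of the percentage components as weight per volume are assumptions made for illustration and are not stated in the text; hydrated salts would need their own molar masses.

```python
# Molar masses in g/mol for the anhydrous compounds (assumption: the text
# does not say which salt or hydrate forms were used).
MOLAR_MASS = {
    "mannitol": 182.17,
    "CaCl2": 110.98,
    "KCl": 74.55,
    "MES": 195.24,
    "NaCl": 58.44,
    "glucose": 180.16,
}

# Molar components in mol/l, taken from the recipes in the text.
PROTOPLASTING_MOLAR = {"mannitol": 0.4, "CaCl2": 0.010, "KCl": 0.020, "MES": 0.020}
WASHING_MOLAR = {"NaCl": 0.154, "CaCl2": 0.125, "KCl": 0.005,
                 "glucose": 0.005, "MES": 0.002}

# Percentage components of the protoplasting solution, read here as
# g per 100 ml (assumption).
PROTOPLASTING_PERCENT = {"cellulase Onozuka RS": 1.0,
                         "macerozyme Onozuka R10": 0.25,
                         "BSA": 0.1}


def grams_for_molar(recipe, volume_ml):
    """Grams of each molar component needed for volume_ml of solution."""
    litres = volume_ml / 1000.0
    return {name: conc * litres * MOLAR_MASS[name] for name, conc in recipe.items()}


def grams_for_percent(recipe, volume_ml):
    """Grams of each percent (w/v) component needed for volume_ml of solution."""
    return {name: pct * volume_ml / 100.0 for name, pct in recipe.items()}


volume_ml = 100.0  # hypothetical batch volume
for name, grams in grams_for_molar(PROTOPLASTING_MOLAR, volume_ml).items():
    print(f"protoplasting solution, {name}: {grams:.3f} g")
for name, grams in grams_for_molar(WASHING_MOLAR, volume_ml).items():
    print(f"washing buffer, {name}: {grams:.3f} g")
```

The pH adjustment to 5.7 is not covered by this calculation and would still be done at the bench.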

To verify the enrichment of the different leaf tissue types, the Arabidopsis GAL4-GFP enhancer trap lines with specific GFP expression in the epidermis (KC464), the vasculature (KC274) and the mesophyll (JR11-2) were used (Gardner et al., 2009) (kindly provided by Alex Webb, UK). The leaves of the JR11-2 line in the C24 ecotype background were too fragile for the Meselect method. The line was therefore backcrossed four times into the Col-0 ecotype background to yield sufficiently robust leaves for the generation of the tissue type-specific protein preparations.

#### Preparation of the Tissue Type-Specific Protein Extracts and Western Blot Analysis

For the preparation of the whole leaf and tissue type-specific protein extracts, the frozen plant material was ground with a mortar and pestle and proteins were extracted by incubation with SDS buffer [4% SDS, 40 mM Tris-base, 5 mM MgCl2, 2x protease inhibitor mix (Roche)] for 20 min at room temperature (RT). Non-soluble material was pelleted by centrifugation at RT for 10 min at 16,200 × g. The supernatant was subsequently cleared by ultracentrifugation at RT for 45 min at 100,000 × g. The SDS extract of epidermal cells was processed quickly and could not be stored, either frozen or at RT, because the proteins precipitate.

For mass spectrometry measurements, 50–150 µg of protein, depending on the tissue type, were subjected to SDS PAGE on 10% SDS gels. After electrophoretic separation, the proteins were stained with Coomassie brilliant blue R250 and each lane was cut into 5 sections. Tryptic digest of the proteins and extraction of the peptides was carried out as described (Svozil et al., 2014).

For the Western blots, 15 µg of each extract were electrophoretically separated with SDS PAGE on 10% SDS gels. After blotting, the nitrocellulose membranes were cut horizontally into two halves. The lower parts were probed with α-GFP (1:1000 dilution, Roche) and α-mouse antibodies (1:5000, Roche), and the upper parts with α-HSP90 (1:3000, Agrisera) and α-rabbit (1:3000, Roche) antibodies. After detection of the immunofluorescence signal the membranes were stained with Coomassie R250.

#### Affinity Enrichment Experiment

Frozen plant material was ground with a mortar and pestle. Tissue was collected from 8 to 10 leaves each of three plants for mesophyll cells, of nine plants for vascular tissue, and of four plants for abaxial epidermal tissue. Proteins from mesophyll cells were extracted as described previously with a sequential extraction using native and urea buffer (Svozil et al., 2014). Proteins from vascular tissue or epidermal cells were extracted only with urea buffer. For the preclearing step, 500 µg protein of mesophyll and vascular tissue in a total volume of 700–1200 µl and 400 µg protein of epidermal tissue in a total volume of 1800 µl were each washed with 100 µl sepharose CL4B (Sigma-Aldrich). The remaining affinity enrichment method was performed as described previously (Svozil et al., 2014).

#### Mass Spectrometry Measurements

Mass spectrometry measurements were performed using an LTQ OrbiTrap XL mass spectrometer (Thermo Fisher) coupled to a NanoLC-AS1 (Eksigent) using electrospray ionization. For LC separation, a capillary column packed with 8 cm of C18 beads with a diameter of 3 µm and a pore size of 100 Å was used. Peptides were loaded on the column with a flow rate of 500 nl/min for 16 min and eluted by an increasing acetonitrile gradient from 3% acetonitrile to 50% acetonitrile for 60 min with a flow rate of 200 nl/min. One scan cycle comprised a survey full MS scan of spectra from m/z 300 to m/z 2000 acquired in the FT-Orbitrap with a resolution of R = 60,000 at m/z 400, followed by MS/MS scans of the five highest parent ions. CID was done with a target value of 1e4 in the linear trap. Collision energy was set to 28 V, Q value to 0.25, and activation time to 30 ms. Dynamic exclusion was enabled at a duration of 120 s.

#### Interpretation of MS/MS Spectra

The acquired raw spectra were transformed to mgf data format and searched against the TAIR10 database (28, download on January 17th, 2011) (Lamesch et al., 2012) with concatenated decoy database and supplemented with common contaminants (71,032 entries) using the Mascot algorithm (version 2.3.02) (Matrix Science). The search parameters used were: mass = monoisotopic, requirement for tryptic ends, 2 missed cleavages allowed, precursor ion tolerance = ±10 ppm, fragment ion tolerance = ±0.8 Da, variable modifications of methionine (M, PSI-MOD name: oxidation, mono Δ = 15.995) and static modifications of cysteine (C, PSI-MOD name: iodoacetamide derivatized residue, mono Δ = 57.0215). The diglycine tag that remains attached to ubiquitylated peptides after tryptic digest was not included as a variable modification because we had previously observed that identifying peptide ubiquitylation sites in complex peptide mixtures, in which only a small portion of the peptides carry the diglycine tag, results in unreliable identifications (Svozil et al., 2014). Peptide spectrum assignments with ionscore >30 and expect value <0.015, except those of known contaminants, were filtered for ambiguity. Peptides matching several proteins were excluded from further analyses. This does not apply to different splice variants of the same protein or to different loci sharing exactly the same amino acid sequence. All remaining spectrum assignments were inserted into the pep2pro database (Baerenfaller et al., 2011; Hirsch-Hoffmann et al., 2012). The false discovery rate (FDR) was calculated by dividing the number of reverse hits by the number of true hits times 100%. It was 0.55% for the tissue type-specific extracts and 1.6% for the affinity enrichments. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://www.proteomexchange.org) via the PRIDE partner repository (Vizcaíno et al., 2013) with the dataset identifiers PXD000941 and PXD000942. The data are also available in the pep2pro database at www.pep2pro.ethz.ch.
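
The score filter and the FDR calculation described above can be sketched in a few lines. This is an illustration only, not the pep2pro implementation: the `PSM` record, its field names, and the example values are hypothetical, and the ambiguity filter for peptides matching several proteins is left out.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """One peptide-spectrum match; field names are hypothetical."""
    peptide: str
    ion_score: float
    expect: float
    is_decoy: bool = False        # hit in the reversed (decoy) database
    is_contaminant: bool = False  # hit in the common contaminants list


def passes_thresholds(psm, min_ion_score=30.0, max_expect=0.015):
    # Thresholds from the text: ionscore > 30 and expect value < 0.015,
    # with known contaminants excluded.
    return (psm.ion_score > min_ion_score
            and psm.expect < max_expect
            and not psm.is_contaminant)


def fdr_percent(psms):
    """Number of reverse hits divided by number of true hits, times 100%."""
    accepted = [p for p in psms if passes_thresholds(p)]
    decoys = sum(1 for p in accepted if p.is_decoy)
    targets = len(accepted) - decoys
    if targets == 0:
        raise ValueError("no target hits passed the thresholds")
    return 100.0 * decoys / targets


# Made-up matches, used only to exercise the functions.
example = [
    PSM("AAK", 45.0, 0.001),
    PSM("CCK", 31.0, 0.010),
    PSM("DDK", 29.0, 0.001),                       # fails the ion score cut
    PSM("EEK", 50.0, 0.020),                       # fails the expect cut
    PSM("FFK", 40.0, 0.001, is_decoy=True),
    PSM("GGK", 60.0, 0.0001, is_contaminant=True),
]
print(f"FDR = {fdr_percent(example):.1f}%")
```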

#### Statistical Analyses and Selection Criteria

Proteins were quantified by normalized spectral counting according to Svozil et al. (2014): the expected contribution of each individual protein to the sample's total peptide pool was calculated, and the values were corrected using a normalization factor that balances for the theoretical number of tryptic peptides per protein and for sample depth. The statistical analyses and selection criteria were applied to each tissue type-specific dataset separately. For the total protein extracts, proteins with a minimum of 5 spectra within a tissue type were included in the quantification. After filtering, the normalized spectral counts for the proteins in each tissue type were re-normalized by scaling the average to a value of 1. Proteins were considered changing in abundance between SylA and mock treatment if the fold change of the average relative abundance over three biological replicates was more than 1.5 and the corresponding p-value in a paired t-test was smaller than 0.05, or if the fold change was an outlier. As outliers we considered fold changes that were higher than the upper adjacent value in boxplot statistics, which corresponds to the largest observation that is less than or equal to the upper quartile plus 1.5 times the length of the interquartile range. The upper adjacent values were: 2.94 for the mesophyll, 2.55 for the vasculature and 6.05 for the epidermis. Proteins that were identified in at least two of the three biological replicates in one treatment and not at all in the other treatment were also considered as changing in abundance. Proteins considered to be exclusively identified in a specific tissue type were detected with a minimum of 5 spectra in the respective tissue type, but not at all in the other two tissue types. In the affinity enrichment experiments we required a minimum of 5 different spectra to call a protein identified. With these data an index was calculated, where each protein identification in the UBA-domain affinity enrichment counts +1 and each identification in the sepharose control counts −1 toward the final index. As an example, an index of 2 indicates either that the protein was identified in two of the three UBA affinity enrichment replicates and never in the sepharose control, or that it was identified in all three UBA affinity enrichment replicates and only once in the sepharose control. For accepting a protein to be ubiquitylated we required that the protein had a minimum index of 2 and was either not detected at all in the sepharose background control or was at least 5-fold enriched. Analyses were done using R (R Core Team, 2012) (Supplemental Data Sheet 1).
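
The selection criteria for changing proteins can be written down compactly. The sketch below is in Python rather than the R used for the original analyses, and it rests on several assumptions: the paired t-test p-value is supplied from elsewhere, the fold change is expressed in the direction of the change, and the quartiles use the inclusive interpolation of Python's `statistics.quantiles`, which may differ slightly from the hinges in R's boxplot statistics. All function and parameter names are hypothetical.

```python
import statistics


def rescale_to_mean_one(values):
    """Re-normalize abundances so that their average equals 1."""
    mean = statistics.fmean(values)
    return [v / mean for v in values]


def upper_adjacent_value(fold_changes):
    """Largest observation <= upper quartile + 1.5 * interquartile range."""
    q1, _, q3 = statistics.quantiles(fold_changes, n=4, method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    return max(fc for fc in fold_changes if fc <= fence)


def is_changing(fold_change, p_value, upper_adjacent,
                n_found_syla, n_found_mock,
                min_fold=1.5, alpha=0.05):
    """Apply the three alternative criteria described in the text.

    n_found_syla and n_found_mock are the numbers of biological replicates
    (out of three) in which the protein was identified per treatment.
    """
    significant = fold_change > min_fold and p_value < alpha
    outlier = fold_change > upper_adjacent
    presence_absence = ((n_found_syla >= 2 and n_found_mock == 0)
                        or (n_found_mock >= 2 and n_found_syla == 0))
    return significant or outlier or presence_absence


# Made-up fold changes, used only to show the outlier threshold.
toy_fold_changes = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 9.0]
threshold = upper_adjacent_value(toy_fold_changes)
print(threshold, is_changing(9.0, 0.2, threshold, 3, 3))
```

Passing the p-value in as an argument keeps the sketch free of external dependencies; in practice it would come from a paired t-test over the three replicate pairs.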

#### GO Categorization

GO categorization was performed as described before (Svozil et al., 2014) using the Ontologizer software (http://compbio.charite.de/ontologizer) (Bauer et al., 2008) in combination with the Arabidopsis annotation file considering the aspect Biological Process. Annotations with GO evidence codes IEA (Inferred from Electronic Annotation) or RCA (Inferred from Reviewed Computational Analysis) were excluded from analyses. As background lists, the organ-specific protein maps for leaves of the pep2pro TAIR10 wos dataset were used (Baerenfaller et al., 2011). Over-representation was assessed with the Topology-weighted method and the p-values were corrected for multiple testing using the Bonferroni method. GO categories with a p < 0.01 were considered significant.
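
The multiple-testing step is simple enough to show directly. The sketch below applies a plain Bonferroni correction and the p < 0.01 cut-off to a set of raw p-values. It is not the Ontologizer implementation: the term names and p-values are invented, and the number of tests is assumed to equal the number of terms supplied.

```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capped at 1."""
    m = len(p_values)  # assumption: one test per supplied term
    return {term: min(1.0, p * m) for term, p in p_values.items()}


def significant_terms(p_values, alpha=0.01):
    """Terms whose Bonferroni-corrected p-value is below alpha."""
    adjusted = bonferroni(p_values)
    return sorted(term for term, p in adjusted.items() if p < alpha)


# Invented raw p-values for three hypothetical GO terms.
raw = {"term_a": 1e-6, "term_b": 0.002, "term_c": 0.2}
print(significant_terms(raw))
```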

# Results

#### Effective Separation of the Leaf Epidermal, Vascular, and Mesophyll Tissues

We developed Meselect (mechanical separation of leaf compound tissues) to specifically enrich the three main leaf tissue types. The method first utilizes the TAPE sandwich approach to separate the abaxial epidermis from the remainder of the leaf (Wu et al., 2009). Any attached mesophyll cells are released by rapid protoplasting, which leaves the epidermal cells intact. The tape is flash-frozen and the epidermal cells are collected from the tape with tweezers and scrapers. Mesophyll cells are released from the remainder of the leaf by rapid protoplasting. The vascular tissue embedded in the mesophyll tissue remains intact during protoplasting and is then isolated with tweezers. To confirm the specificity of the enrichment we applied Meselect to leaves of Arabidopsis lines that express GFP specifically in epidermal, mesophyll and vascular tissues (Gardner et al., 2009) (**Figure 1**). In the tissue type-specific protein extracts of leaves from the KC464 line, which expresses GFP exclusively in the epidermis, the GFP protein was found only in the epidermal protein extract and not in the protein extracts of vasculature and mesophyll tissues (**Figure 1A**). Similarly, in the tissue type-specific protein extracts of leaves of the KC274 line, which expresses GFP only in the vasculature, GFP was specifically enriched in the vasculature protein extract (**Figure 1C**). We noticed a small amount of GFP in the mesophyll but not in the epidermal protein extract. In the JR11-2 line backcrossed with Col-0, GFP is expressed in the spongy mesophyll, and GFP was detected only in the mesophyll protein extract, confirming that the vasculature and epidermal protein extracts do not contain mesophyll proteins (**Figure 1E**). Together, these results show that the Meselect method effectively and efficiently separates the abaxial epidermal, mesophyll and vascular leaf tissues.

#### Experimental Design to Detect Tissue-Specific Proteins and Differences in UPS Targeting in Different Leaf Tissues

We adopted the workflow in **Figure 2** to investigate tissue-specific differences in UPS targeting. For this we extracted total proteins from the three tissues of mock-treated leaves or leaves that were treated with Syringolin A (SylA), which specifically and effectively inhibits the UPS (Groll et al., 2008; Svozil et al., 2014). In total, we identified 1799 distinct proteins, including 1114 proteins that were identified with at least 5 spectra in at least one tissue (Supplemental Table 1). Of these, proteins were considered to change in abundance only if one of the following criteria was met: (i) the fold change between SylA and mock-treated samples was >1.5 with a p < 0.05, (ii) the fold change was large enough to be considered an outlier according to boxplot statistics, or (iii) the protein was identified in at least two of three biological replicates in one condition but not in the other condition (**Table 1**, Supplemental Table 2). Additionally, for tissue specificity we required that the protein was not identified in the other tissues (**Table 1**, Supplemental Table 3). In total, 225 distinct proteins were found that had increased and 30 that had decreased in the different leaf tissues after inhibition of the proteasome. As discussed previously, reduced protein levels could result from reduced transcription, translation or stabilization of the proteins after SylA treatment, or from proteasome-independent protein degradation (Svozil et al., 2014). In contrast, the proteins that increased after inhibition of the proteasome are likely targets of the UPS or participate in pathways that respond to SylA treatment.

Using the Meselect method the protein yield in the tissue type-specific extracts was adequate for affinity enrichment of ubiquitylated proteins from SylA-treated leaves using the ubiquitin-binding UBA-domain (Sutovsky et al., 2005; Manzano et al., 2008; Svozil et al., 2014). To call a UBA affinity-enriched protein identified we required a minimum of 5 spectra for the protein in the respective tissue extract. We also calculated an index, in which each protein identification in the UBA-domain affinity enrichment counts +1 and in the sepharose control −1 for the final index. We accepted a protein to be ubiquitylated in a tissue extract if it had a minimum index of 2 and was at least 5-fold enriched over the sepharose background control. Using these criteria we identified a total of 519 ubiquitylated proteins in the three different tissues (**Table 1**, Supplemental Table 4). When comparing the tissue-specific ubiquitylated proteins with the reported ubiquitylated proteins identified in whole leaves (Svozil et al., 2014) we identified 161 proteins that are unique to the leaf tissue-specific dataset (**Figure 3**). The vascular tissue extract revealed the highest number of newly identified ubiquitylated proteins. Together, the separation of leaf tissues indeed allows the sampling of proteins that remain undetected in the whole leaf context where they are masked by proteins from abundant cell types. In the following, the datasets of proteins that are ubiquitylated and/or accumulate in the mesophyll, vasculature and epidermis after SylA treatment, as well as proteins identified in only one of the three tissue extracts, will be analyzed in detail for their tissue-type specific functions.
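The acceptance rule for ubiquitylated proteins combines a spectral-count minimum, the +1/−1 identification index, and a fold-enrichment threshold. The sketch below shows one way these could be combined. It assumes that identifications are counted per replicate, that enrichment is computed from summed spectral counts, and that a protein absent from the sepharose control passes the enrichment test; the function name and the input layout are hypothetical.

```python
def is_ubiquitylated(uba_spectra, control_spectra,
                     min_spectra=5, min_index=2, min_fold=5.0):
    """Decide whether a protein is accepted as ubiquitylated in one tissue.

    uba_spectra     -- spectral counts per replicate in the UBA-domain enrichment
    control_spectra -- spectral counts per replicate in the sepharose control
    """
    uba_total = sum(uba_spectra)
    control_total = sum(control_spectra)

    # Minimum of 5 spectra required to call the protein identified.
    if uba_total < min_spectra:
        return False

    # Index: +1 per identification in the UBA enrichment, -1 per identification
    # in the sepharose control (assumed to be counted per replicate).
    index = sum(1 for s in uba_spectra if s > 0) - sum(1 for s in control_spectra if s > 0)
    if index < min_index:
        return False

    # At least 5-fold enrichment over the background control; a protein that is
    # absent from the control is treated as enriched (assumption).
    if control_total == 0:
        return True
    return uba_total / control_total >= min_fold


# Hypothetical spectral counts for three replicates each.
print(is_ubiquitylated([4, 3, 5], [1, 0, 0]))
print(is_ubiquitylated([3, 3, 3], [2, 0, 0]))
```

Proteins that pass this filter in any tissue could then be compared with the whole-leaf dataset by a simple set difference to find those unique to the tissue-specific data.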

#### Proteins Identified in the Tissue-Specific Protein Extracts have Specific Roles in the Respective Tissues

#### Proteins Identified Exclusively in Mesophyll have Functions in Photosynthesis

TABLE 1 | Summary of the number of identified proteins in total protein extracts of epidermal, vascular and mesophyll tissues, and in ubiquitin affinity enrichments of proteins from the respective tissue types.

Among the 85 proteins that were only identified in the mesophyll the GO categories that are most significantly overrepresented are photosynthesis, starch metabolic process and cellular polysaccharide catabolic process. The proteins in these categories comprise chloroplast proteases, components of the photosystems, as well as proteins that facilitate the incorporation of proteins into the photosystems and a kinase that phosphorylates them (**Table 2**). The proteins involved in starch metabolic process can be classified into two groups, starch biosynthesis and starch breakdown (Streb and Zeeman, 2012) (**Table 2**), and include proteins such as INOSITOL MONOPHOSPHATASE FAMILY PROTEIN (FBPase, AT1G43670) involved in cytosolic sucrose synthesis, the starch biosynthesis enzyme PHOSPHOGLUCOMUTASE (PGM, AT5G51820) and the starch breakdown enzyme PHOSPHOGLUCAN, WATER DIKINASES (GWD3, AT5G26570). This indicates that starch metabolism is mostly confined to the mesophyll tissue, although starch granules can also be observed in guard cells (Stadler et al., 2003), which are specialized cells in the epidermis.
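Over-representation of a GO category in a protein list is commonly assessed with a one-sided hypergeometric (Fisher-type) test. The text here does not state which test or background set was used, so the following is a generic sketch of that calculation rather than the authors' procedure. The counts in the usage example are illustrative: 85 and 1799 are taken from the text, whereas the category sizes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated proteins in a list
    of n proteins drawn without replacement from a background of N proteins,
    of which K carry the annotation."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Illustrative call: 85 mesophyll-only proteins against a background of 1799
# identified proteins; 60 background proteins and 12 list proteins annotated
# with the category are hypothetical numbers.
p = hypergeom_enrichment_p(k=12, n=85, K=60, N=1799)
print(f"p = {p:.3g}")
```

In practice the resulting p-values would also need correction for testing many GO categories at once.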

#### Epidermis-specific Proteins Function in Defense and Protection

The epidermis and cuticle form the outer barrier of the leaf to the environment. Therefore it can be expected that proteins involved in cuticle formation and immunoprotection are enriched in the epidermal tissue (Barel and Ginzberg, 2008; Kaspar et al., 2010; Yeats et al., 2010). Indeed, the overrepresented GO categories among the 113 proteins exclusively identified in the epidermis include innate immune response, immune response and immune system process, as well as lignin, cutin, and cell wall pectin biosynthetic process (**Table 3**). The proteins responsible for the over-representation of immune response categories also included HSP90.1, which in addition to its chaperone function is also required for the R-gene mediated defense response (Takahashi et al., 2003). HSP90.1 was the only defense-related protein that accumulated after SylA treatment. This suggests that most of the identified defense-related proteins are not targeted by the proteasome under these conditions and that the epidermal cells remain in a responsive state to counteract biotic stress. The categories cutin biosynthetic process and cell wall pectin biosynthetic process include proteins that were also identified to accumulate after inhibition of the proteasome (**Table 3**). These are the enzymes GLYCEROL-3-PHOSPHATE-SN-2-ACYLTRANSFERASE (GPAT4, AT1G01610), which is important for cutin biogenesis (Li et al., 2007; Yang et al., 2010), DEFICIENT IN CUTIN FERULATE (DCF, AT3G48720), which catalyzes the feruloylation of ω-hydroxy fatty acids, which are one type of cutin monomer (Rautengarten et al., 2012), as well as GALACTURONOSYLTRANSFERASE 1 (GAUT1, AT3G61130) and 8 (GAUT8/QUA1, AT3G25140). GAUT1 functions in an enzymatic complex that catalyzes the elongation step of homogalacturonan synthesis. 
GAUT8 may also function in homogalacturonan synthesis, and it was suggested to work in the hemicellulose pathway because gaut8 mutants have a marked reduction in xylan synthase activity (Atmodjo et al., 2013). GAUT1 and GAUT7 are part of a complex that also includes KORRIGAN 1 (KOR1, AT5G49720) (Atmodjo et al., 2011), which is involved in cellulose biosynthesis, may form a link between cellulose and pectin biosynthesis, and was also identified to accumulate after SylA treatment. The functions of the proteins identified exclusively in the epidermal protein extract are therefore important for establishing the special characteristics of the epidermal cell wall. Their accumulation after treatment with SylA might either point to their involvement in the general response to the treatment, or indicate that protein degradation by the UPS is one way to regulate epidermal cell wall composition.

The two cinnamyl alcohol dehydrogenases CAD7 and CAD9 that were identified with 94 and 168 spectra, respectively, are involved in lignin biosynthesis (Sibout et al., 2005; Eudes et al., 2006), but their enzymatic activities were reported to be either low or non-detectable (Kim et al., 2007). GUS staining in Arabidopsis showed that the expression or localization pattern of CAD7 and CAD9 changes during development. They are exclusively expressed in the vasculature of leaves in 2-week-old seedlings, but at a later developmental stage that corresponds to the age of the leaves used in our experiments, they are localized in trichomes and hydathodes (Kim et al., 2007). However, the analysis of CAD7 and CAD9 transcript accumulation over anatomy using Genevestigator (Hruz et al., 2008) revealed that their highest transcript levels were specifically detected in the translatome of the leaf epidermis (Mustroph et al., 2009). Additionally, the mRNAs of CAD7 and CAD9 and 20 additional epidermis-specific proteins were classified into the epidermis-specific translatome cluster (Mustroph et al., 2009) (Supplemental Table 3). The large overlap between exclusively identified epidermal proteins and the epidermis-specific translatome emphasizes that information on translated transcripts or proteins is important to assess tissue-specific functions.

TABLE 2 | Proteins that were exclusively identified in the mesophyll tissue and that were assigned to the significantly over-represented GO categories *photosynthesis* and *starch metabolic process.*

#### Comparison of Leaf and Root Epidermal and Vascular Tissue Proteins Reveals Tissue- and Organ-specific Processes

When comparing the identified leaf vascular tissue protein set with the root vascular tissue proteins previously reported by Petricka et al. (2012) we found an overlap of 365 proteins. In the set of 920 proteins that were identified in the root vascular tissue, but not in the leaf vascular tissue, GO biological processes such as protein targeting to mitochondrion and fatty acid β-oxidation were over-represented, corresponding well with root as a heterotrophic organ. In contrast, the 365 proteins that were identified in the leaf vascular tissue, but not in the root vascular tissue, were assigned to glucosinolate biosynthesis, processes related to photosynthesis, and response to light stimulus. The latter two were also over-represented in the 505 proteins that were identified in the leaf epidermis, but not in the root epidermis. In addition, indole catabolic process and defense to fungus were over-represented in the proteins identified in the leaf epidermis, but not in the root epidermis. In contrast, the 506 proteins that were identified in the root epidermis, but not in the leaf epidermis, were over-represented for fatty acid β-oxidation, amino acid, and sterol biosynthesis, and thalianol metabolism. Thalianol is a secondary metabolite that can be detected in Arabidopsis roots (Field and Osbourn, 2008). This demonstrates that the identified tissue type-specific proteins in different organs correspond to the specific functions of the respective tissue type and plant organ.
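The leaf-versus-root comparisons above are set operations on protein identifiers. A minimal sketch follows, using hypothetical placeholder identifiers rather than data from the study; the function name is also an assumption.

```python
def compare_protein_sets(leaf_ids, root_ids):
    """Split two identifier lists into shared, leaf-only and root-only sets."""
    leaf, root = set(leaf_ids), set(root_ids)
    return {
        "shared": leaf & root,      # identified in both organs
        "leaf_only": leaf - root,   # candidates for leaf-specific processes
        "root_only": root - leaf,   # candidates for root-specific processes
    }


# Hypothetical identifiers, for illustration only.
leaf_vascular = ["prot_A", "prot_B", "prot_C"]
root_vascular = ["prot_B", "prot_C", "prot_D", "prot_E"]

result = compare_protein_sets(leaf_vascular, root_vascular)
for name, members in result.items():
    print(name, len(members), sorted(members))
```

Each of the three resulting sets can then be tested separately for over-represented GO categories, which is how organ-specific processes such as fatty acid β-oxidation in roots would emerge from the comparison.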

#### SylA-dependent Accumulating and Ubiquitylated Proteins in the Mesophyll Reveal that UPS-mediated Protein Degradation Modulates Plastid Protein Composition

TABLE 3 | Proteins that were exclusively identified in the epidermis and that were assigned to the significantly over-represented GO categories *innate immunity* and *immune response,* as well as the categories related to cell wall biosynthesis *lignin biosynthetic process, cutin biosynthetic process,* and *cell wall pectin biosynthetic process.*

*In column "Accumulation after SylA" "*+*" indicates accumulation and "*−*" decreased protein amount after treatment with SylA.*

Altogether we found that more than 50% of the ubiquitylated and more than 67% of the accumulating proteins detected in the mesophyll are predicted to be localized in the plastid according to SUBAcon in the SUBA3 database (Tanz et al., 2013). Furthermore, three of the four most significantly
over-represented GO categories of the proteins that accumulated in the mesophyll after SylA treatment are protein targeting to chloroplast, establishment of protein localization to chloroplast, and protein localization to chloroplast. These categories are also over-represented in the identified ubiquitylated mesophyll proteins. Proteins of the plastid import complexes that accumulated after SylA treatment are TRANSLOCON COMPLEXES AT THE OUTER CHLOROPLAST ENVELOPE MEMBRANE 75-III (TOC75-III, AT3G46740), CHLOROPLAST SIGNAL RECOGNITION PARTICLE 54 (cpSRP54, AT5G03940), the translocase ALBINO 3 (ALB3; AT2G28800), the chloroplast chaperone CHLOROPLAST HEAT SHOCK PROTEIN 70-2 (cpHsc70-2, AT5G03940) and TRANSLOCON COMPLEXES AT THE INNER CHLOROPLAST ENVELOPE MEMBRANE 55 (TIC55, AT2G24820). TOC75-III is the channel protein in the TOC complex and TIC55 is a component of the redox regulon of the TIC complex (Kovács-Bogdán et al., 2010). Stromal Hsp70 is important for the import of precursor proteins (Su and Li, 2010), while the cpSRP consisting of cpSRP54 and cpSRP43 facilitates the passage of imported proteins through the stroma to the thylakoids (Falk and Sinning, 2010). The identified ubiquitylated mesophyll proteins included the TOC complex protein TOC159 (AT4G02510), cpSRP43 (AT2G47450) and HEAT SHOCK PROTEIN 90.5 (HSP90.5; AT2G04030). Since many of the accumulating and ubiquitylated mesophyll proteins are associated with plastid import, we compared the accumulating proteins with the set of proteins that were decreased in a TOC159-deficient mutant and are therefore putative TOC159-dependent substrates (Bischof et al., 2011). The 10 proteins in the overlap are likely proteins that are transported via the plastid import machinery and whose pre-proteins are targets of the UPS (Supplemental Table 2). 
Additional plastid-localized proteins that accumulated after SylA treatment are part of the plastid proteolysis system and include the proteases ORGANELLAR OLIGOPEPTIDASE (OOP; AT5G65620) and PRESEQUENCE PROTEASE 1 (PREP1; AT3G19170), as well as proteins involved in chloroplast protein degradation, namely the FtsH1 (AT1G50250), FtsH2 (AT2G30950), and FtsH8 (AT1G06430) subunits of the FtsH membrane-bound metalloprotease complex, the DEG2 (AT2G47940) subunit of the Degradation of periplasmic proteins (Deg)/high temperature requirement A (HtrA) protease, and the ClpB3 (AT5G15450) subunit of the CLP proteases. Another subunit of the CLP proteases, ClpP4 (AT5G45390), was found to be ubiquitylated. Together, the identification of the above proteins as accumulating or ubiquitylated proteins suggests that these proteins are subject to proteasome degradation and indicates that the UPS is involved in modulating the protein composition in the plastid.

#### Proteins that Accumulate in the Epidermis After SylA Treatment Function in the Non-cyclic Flux Mode of the TCA Cycle

In the list of proteins that accumulated in the epidermis after SylA treatment, GO categories related to the tricarboxylic acid (TCA) cycle and cell wall biosynthesis were over-represented. The plant TCA cycle produces not only reducing equivalents and ATP, but also carbon skeletons for the synthesis of amino acids (Millar et al., 2011; Nunes-Nesi et al., 2013). The non-cyclic flux mode of the TCA cycle produces 2-oxoglutarate as a precursor for the amino acids glutamate and glutamine, which are the products of nitrate assimilation (Sweetlove et al., 2010). We found a remarkably high overlap in the epidermis between proteins involved in the non-cyclic flux mode of the TCA cycle and proteins that accumulated after SylA treatment (**Figure 4**). In addition to mitochondrial proteins that are directly involved in the TCA cycle, cytosolic proteins from connected pathways also accumulated after SylA treatment. This suggests that inhibition of the proteasome might deplete the pool of free amino acids, thus leading to increased de novo synthesis of glutamine and glutamate. Consistent with this hypothesis, NITRITE REDUCTASE 1 (NIR1, AT2G15620), which produces the ammonium required for the biosynthesis of glutamine and glutamate in the plastid, and DICARBOXYLATE TRANSPORT 2.1 (DiT2.1, AT5G64290), which transports glutamate from the plastid into the cytosol in exchange for malate (Renné et al., 2003), also accumulated after SylA treatment. The two mitochondrial proteins NADPH:QUINONE OXIDOREDUCTASE (AT3G27890) and a subunit of ATP synthase (AT4G29480) are involved in metabolizing NADH and FADH2 and generating ATP, respectively, and are thus part of the full TCA cycle. Their decrease after SylA treatment therefore supports the hypothesis of the TCA cycle operating in a non-cyclic flux mode in the epidermis. 
On the other hand, ATP:citrate lyase (ACL), which utilizes citrate to produce acetyl-CoA and thereby removes substrate from the non-cyclic flux mode TCA cycle, accumulates after SylA treatment. Cytosolic acetyl-CoA is required for the biosynthesis of a range of metabolites including elongated fatty acids (Fatland et al., 2005). Increased levels of ACL might therefore suggest an increased requirement for acetyl-CoA after SylA treatment.

None of the described enzymes of the core TCA cycle was identified to be ubiquitylated. However, CYTOSOLIC NADP+-DEPENDENT ISOCITRATE DEHYDROGENASE (CICDH, AT1G65930) and the enzymes of glutamate metabolism, GLUTAMATE DEHYDROGENASE 1 and 2 (GDH1, AT5G18170; GDH2, AT5G07440), GLUTAMATE DECARBOXYLASE 2 (AT1G65960) and GLUTAMINE SYNTHETASE 2 (AT5G35630), were ubiquitylated and are therefore putative targets of the UPS.

FIGURE 4 | (beginning of caption missing) [...] treatment are indicated by a dotted line, and those for which the corresponding enzyme was ubiquitylated by a blue line. PPC2, PEP subunit B (AT5G49460) and ACLB-1 (AT3G06650); DiT2, dicarboxylate transport 2.1 (AT5G64290); OAA, oxaloacetate; PEP, phosphoenolpyruvate.

The expression of GDH subunits was reported to be limited to companion cells of roots and shoots for GDH1 and GDH2, or only roots for GDH3 (Fontaine et al., 2012). However, we identified GDH1 and GDH2 both in the affinity enrichments and in total extracts of epidermis and vasculature, and GDH3 was identified in the total vasculature extract. These localization data are supported by the cell-type specific leaf translatome data (Mustroph and Bailey-Serres, 2010), which indicates that the GDH subunit proteins are present in more tissues than previously reported and that their levels and the subunit composition of the GDH enzyme might be regulated by the UPS.

#### Glucosinolate Biosynthesis Occurs Primarily in the Vascular Tissue and is Regulated by UPS Targeting

In vascular tissue protein extracts we identified an exceptionally high number of 353 ubiquitylated proteins (**Table 1**). Apart from various response pathways, the GO category glucosinolate biosynthetic process was over-represented in this set of proteins. This category was also over-represented in the list of proteins that were exclusively identified in the vascular but not in the epidermal or mesophyll tissue protein extracts (**Table 4**). Glucosinolates are derived from different amino acids, including methionine and phenylalanine. Depending on the amino acid precursor, they are assigned to different groups. Here, we focus on the biosynthesis of aliphatic glucosinolates, since the identified ubiquitylated proteins mainly catalyze reactions in this pathway.


TABLE 4 | Proteins involved in glucosinolate metabolism identified in the tissue protein extracts and their ubiquitin affinity enrichments.

*Proteins were assigned to the different processes in glucosinolate biosynthesis and breakdown. In columns "Identified" and "Ubiquitylated" the numbers of spectra are indicated with which the protein was identified. "x" indicates exclusive identification of the respective protein in the vascular protein extract.*

The biosynthesis of aliphatic glucosinolates involves three major steps (Grubb and Abel, 2006; Sawada et al., 2009; Sønderby et al., 2010). The first step is methionine chain elongation, which includes methionine deamination, repeated condensation, isomerization, oxidative decarboxylation and transamination. The proteins involved in this step were exclusively identified in vascular tissue protein extract and in affinity-enriched ubiquitylated proteins from vascular tissue (**Table 4**, **Figure 5**). In the second step the glucone core structure is formed, which involves the incorporation of sulfur. All of the identified enzymes of the core biosynthetic process except for GGP1 were also identified exclusively in the vascular tissue protein extract or after affinity purification (**Table 4**, **Figure 5**). The third and last step in glucosinolate biosynthesis is secondary side chain modification. The identified enzymes in this process were found to be ubiquitylated exclusively in the vascular tissue (**Figure 5**; **Table 4**). In summary, we identified most of the enzymes that catalyze the core biosynthetic steps of glucosinolate biosynthesis only in vascular tissue extracts and in affinity-enriched ubiquitylated proteins from the vascular tissue.

Glucosinolate breakdown is catalyzed by myrosinases and produces toxic isothiocyanates and nitriles and other reactive products that are important in plant defense. Due to the toxicity of the degradation products glucosinolates and myrosinases are located in different cell types and only brought together after tissue damage or transport of glucosinolates (Grubb and Abel, 2006; Halkier and Gershenzon, 2006; Wittstock and Burow, 2010). While we found a strong tissue-type specificity in glucosinolate biosynthesis, the localization of the two Arabidopsis myrosinases TGG1 and TGG2 is more widespread, as they were identified in all tissue-specific protein extracts with highest levels in the epidermis, and also as ubiquitylated proteins (**Table 4**). This broad expression pattern of the TGG1 and TGG2 proteins corresponds very well with their transcript translation data in the Arabidopsis translatome atlas (Mustroph and Bailey-Serres, 2010), although previous studies had found their expression to be limited to guard cells and phloem myrosin cell idioblasts (Koroleva et al., 2000; Andréasson et al., 2001; Husebye et al., 2002; Thangstad et al., 2004; Zhao et al., 2008).

Following myrosinase action, the S-adenosyl-L-methionine-dependent methyltransferase HARMLESS TO OZONE LAYER 1 (ATHOL1) converts thiocyanate to methylthiocyanate (Nagatoshi and Nakamura, 2009). ATHOL1 was the only protein in the glucosinolate pathway that accumulated after SylA treatment. Together, the large number of identified ubiquitylated enzymes suggests that glucosinolate synthesis and metabolism are tightly controlled by the UPS.

## Discussion

Cellular processes and responses that are analyzed in the total leaf context cannot reveal the contribution of specific tissues. We therefore developed a method to effectively separate different leaf tissues and to investigate tissue-specific processes and the response to treatment of the leaf with SylA, which inhibits the proteasome. The Meselect method described here is an effective, high-yielding method to separate mesophyll, epidermal and vascular tissues. We demonstrated that the resulting tissue type-specific protein extracts had essentially no contaminations from other tissues. Since the separation of the three tissues takes only approximately 1 h, it provides a clear advantage over other methods such as FACS. In addition, Meselect does not require the use of specific fluorescent reporter lines and is therefore compatible with experimental workflows using wild type or any transgenic plants. Meselect furthermore allows the isolation of the three different tissues from the same leaves, permitting comparisons between the tissue types and the elucidation of tissue type-specific processes. Employing the Meselect method reduced the complexity of the total leaf protein samples and therefore increased the sensitivity of the analysis. Our approach resulted in the identification of proteins that were exclusively found in the respective tissue types, of proteins that accumulated after SylA treatment and of proteins that are ubiquitylated. Based on the resulting protein lists we could assign specific processes and UPS targets to each of the three tissues.

FIGURE 5 | (beginning of caption missing) [...] biosynthesis consisting of methionine chain elongation (A), glucone formation (B) and side chain modification (C), displaying the proteins that were exclusively identified in the vascular but not in the epidermal [...] ubiquitylated in the vascular but not in the epidermal or mesophyll tissue extracts (blue), or both (red). In brown the proteins that were not exclusively identified in vasculature and in gray those that were not identified.

The proteins identified in the tissue-type specific extracts are indicative of specific functions of the respective tissue. For example, proteins exclusively identified in mesophyll are related to photosynthesis as well as starch and sugar metabolic processes, and proteins in the epidermis function in defense and leaf protection. The tissue localization of other proteins that we detected in several tissue types, e.g., GDH subunits, CAD7/CAD9 and TGG1/TGG2 corresponded with the localization information in the cell-type specific leaf translatome atlas (Mustroph and Bailey-Serres, 2010). These proteins are therefore more ubiquitously expressed than previously reported. This supports our view that tissue-specific protein expression data enhance our understanding of tissue-specific processes and that tissue-type specific protein localization cannot be inferred from transcript expression data only.

Interestingly, many of the identified ubiquitylated and SylA-dependent accumulating proteins in the mesophyll are localized to the plastid. Cytosolic precursors of plastid-localized proteins are known to be degraded by the proteasome following ubiquitylation (Shen et al., 2007a,b; Lee et al., 2009, 2013). A direct interaction between the transit peptide and the 26S proteasome has been reported for certain plastid-localized proteins that may target plastid protein precursors for degradation if they are not imported (Sako et al., 2014). We therefore suggest that the ubiquitylated and SylA-dependent accumulating proteins we identified are pre-proteins that would normally be ubiquitylated and degraded in the cytosol. Furthermore, plastid import complexes themselves seem to be substrates of the UPS because the E3 ligase SP1 associates with TOC complexes and mediates the ubiquitylation and degradation of TOC subunits such as TOC159 (Ling et al., 2012). We identified TOC159 and additional proteins associated with plastid transport to be ubiquitylated or accumulating after SylA treatment. UPS-mediated protein degradation might therefore modulate plastid protein composition not only by the degradation of cytosolic precursor proteins, but also by reorganization of the protein import machinery.

The UPS may also affect plastid protein homeostasis by degrading enzymes of the plastid proteolysis system involved in transit peptide cleavage and recycling, protein maturation and degradation (Adam et al., 2006; Van Wijk, 2015). After import, transit peptides of plastid-localized proteins are cleaved off in the stroma and are subsequently degraded by proteases including PREP1 and OOP (Richter and Lamppa, 1998; Kmiec and Glaser, 2012), which accumulated after inhibition of the proteasome. Protein homeostasis in the chloroplast depends on protein degradation systems of the FtsH complex, the Clp protease system, and additional proteases (Adam et al., 2006; Van Wijk, 2015). For example, the soluble protease DEG2 in cooperation with FtsH subunits is involved in repair of photosystem II by degrading the D1 reaction center protein that is damaged by light and needs to be replaced (Zaltsman et al., 2005; Kato et al., 2012). ClpB3 was shown to be essential for normal chloroplast development (Lee et al., 2007; Singh and Grover, 2010). It has also been reported that the UPS targets the precursors of the plastid proteases FtsH1 and ClpP4 for degradation (Shen et al., 2007a,b). Our finding that several proteins of the chloroplast protein degradation systems accumulated after SylA treatment suggests that the UPS is involved in plastid protein homeostasis by controlling the cytosolic accumulation of precursor proteins that themselves have proteolytic activities.

Glucosinolate biosynthesis and metabolism is an excellent example for tissue type-specific localization of proteins. Glucosinolates are sulfur-containing secondary compounds that participate in plant defense. Upon mechanical wounding of leaves, glucosinolates are hydrolysed by myrosinases that produce various toxic breakdown products, especially isothiocyanates and nitriles (Grubb and Abel, 2006; Halkier and Gershenzon, 2006). Because of the toxicity of glucosinolate breakdown products glucosinolate biosynthetic enzymes and myrosinases should not co-localize. While glucosinolates are mainly stored in vascular S-cells, myrosinases were reported to be expressed in guard cells and in idioblasts, which are located in the phloem parenchyma (Koroleva et al., 2000; Andréasson et al., 2001; Husebye et al., 2002; Thangstad et al., 2004; Grubb and Abel, 2006; Zhao et al., 2008; Wittstock and Burow, 2010). Correspondingly, we identified most of the aliphatic glucosinolate biosynthesis enzymes specifically in vascular tissue extracts. In contrast to previous reports of results from promoter:GUS expression constructs (Husebye et al., 2002; Thangstad et al., 2004), we detected the myrosinases TGG1 and TGG2 in all tissue types with highest levels in the epidermis. In Arabidopsis, vegetative rosette leaves are the major site of aliphatic, methionine-derived glucosinolate biosynthesis and storage (Andersen et al., 2013). Accordingly, we found the GO category glucosinolate biosynthetic process over-represented for proteins that were identified in the leaf, but not in the root vasculature (Petricka et al., 2012). As many of the glucosinolate biosynthetic enzymes were found to be ubiquitylated, the levels of these enzymes seem to be controlled by the UPS. 
Further experiments are required to clarify if targeted degradation regulates the tissue-type specific protein expression of the glucosinolate biosynthetic enzymes and their increased accumulation after the plant defense response has been triggered.

In the epidermis we found that many proteins of the non-cyclic flux mode of the TCA cycle accumulated after inhibition of the proteasome. In normal and carbon starvation conditions the TCA cycle drives ATP synthesis by the oxidation of respiratory substrates. Also, energy and stress signaling are thought to converge because under conditions of carbon deprivation, protein degradation is a key process for recycling cellular molecules (Baena-González and Sheen, 2008). Furthermore,
UPS-mediated protein degradation has an important role in adaptation to carbon and nitrogen availability (Kang and Turano, 2003; Sato et al., 2011). However, in the non-cyclic flux mode, the TCA cycle can provide C skeletons for the biosynthesis of the primary amino acids glutamine and glutamate as well as other metabolites (Sweetlove et al., 2010). One of the functions of the proteasome system is to maintain the free amino acid pools that are needed for protein synthesis through continuous degradation of proteins (Vierstra, 1996; Zhang et al., 2014). Complete recycling of amino acids first involves the partial cleavage of proteins by the UPS followed by further degradation to free amino acids by various endo- and exopeptidases (Book et al., 2005). Inhibition of the proteasome therefore may deplete the pools of free amino acids. It was suggested that the supply of free amino acids is linked to the rate of proteolysis and that low amino acid supplies are monitored in the cell by the increasing levels of uncharged tRNAs (Vierstra, 1996). Depleted pools of free amino acids after proteasome inhibition may therefore lead to enhanced de novo synthesis of the primary amino acids glutamine and glutamate through activation of the non-cyclic flux mode of the TCA cycle. The mode of the TCA cycle might therefore switch between providing 2-oxoglutarate for amino acid synthesis or consuming 2-oxoglutarate for the provision of pyruvate, depending on the metabolic status of the cell and the availability of carbon, nitrogen, and amino acids. The coordination of carbon and nitrogen metabolism involves the inter-conversion of keto acids and amino acids, which is catalyzed by the enzyme glutamate dehydrogenase (GDH) (Melo-Oliveira et al., 1996; Forde and Lea, 2007; Miyashita and Good, 2008). We found that the GDH subunit proteins are present in all three leaf tissues and that they are possible targets of the UPS. 
In addition, CICDH that converts isocitrate to 2-oxoglutarate in the cytosol was also ubiquitylated. For darkened leaves, citrate is known to be catabolized through the TCA cycle, but in illuminated leaves, some of the produced citrate is exported from the mitochondrium to the cytosol, bypassing the mitochondrial aconitase and isocitrate dehydrogenase steps, to produce 2-oxoglutarate via cytosolic reactions (Hanning and Heldt, 1993; Lee et al., 2010). The ubiquitylation of the cytosolic enzymes for glutamate and glutamine synthesis indicates that the

# References


cytosolic part of this pathway is a direct target of the UPS. It will be interesting to determine which E3 ligases are responsible for the ubiquitylation of these proteins and what role targeted protein degradation has for regulating the mode of the TCA cycle.

In summary, the Meselect method effectively and rapidly separates epidermis, vasculature, and mesophyll tissues from the same leaf in approximately 1 h and yields high tissue amounts for proteomics. Analysis of the tissue type-specific protein localization revealed novel insights into tissuespecific processes. Moreover, the quantitative information on tissue-specific protein level changes and the types of ubiquitylated proteins that accumulated after inhibition of the proteasome by SylA has expanded our understanding of UPS-mediated control of protein accumulation in leaves. The data also support the view that protein degradation is an important mechanism for optimizing functional tissue-specific proteomes.

### Author Contributions

JS and KB designed research; JS performed research; JS and KB analyzed data; JS, WG, and KB wrote the paper and approved the final version to be published.

### Acknowledgments

We thank the Functional Genomics Center Zurich for infrastructure and technical support, Dr. Alex Webb (University of Cambridge, UK) for providing seeds of GAL4-GFP enhancer trap lines JR11-2, KC274 and KC464, and R. Dudler (University of Zurich, Switzerland) for providing Syringolin A. We also thank Johannes Fütterer for critical reading of the manuscript and helpful discussions. This work was supported by ETH Zurich and SNF grant 31003A awarded to KB.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00376/abstract

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Svozil, Gruissem and Baerenfaller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **The guard cell metabolome: functions in stomatal movement and global food security**

*Biswapriya B. Misra<sup>1</sup>, Biswa R. Acharya<sup>2</sup>, David Granot<sup>3</sup>, Sarah M. Assmann<sup>2</sup> and Sixue Chen<sup>1,4</sup>\**

*<sup>1</sup> Department of Biology, Genetics Institute, Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL, USA, <sup>2</sup> Department of Biology, Pennsylvania State University, PA, USA, <sup>3</sup> Department of Vegetable Research, Institute of Plant Sciences, Agricultural Research Organization, Bet-Dagan, Israel, <sup>4</sup> Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL, USA*

#### *Edited by:*

*Yariv Brotman, Max Planck Institute of Molecular Plant Physiology, Germany*

#### *Reviewed by:*

*Norihito Nakamichi, Nagoya University, Japan Maria F. Drincovich, Rosario National University, Argentina*

#### *\*Correspondence:*

*Sixue Chen, Department of Biology, Genetics Institute, Plant Molecular and Cellular Biology Program, University of Florida, Cancer and Genetics Research Complex, Room 438, 2033 Mowry Road, Gainesville, FL 32610, USA schen@ufl.edu*

#### *Specialty section:*

*This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science*

> *Received: 21 March 2015 Accepted: 28 April 2015 Published: 19 May 2015*

#### *Citation:*

*Misra BB, Acharya BR, Granot D, Assmann SM and Chen S (2015) The guard cell metabolome: functions in stomatal movement and global food security. Front. Plant Sci. 6:334. doi: 10.3389/fpls.2015.00334*

Guard cells represent a unique single cell-type system for the study of cellular responses to abiotic and biotic perturbations that affect stomatal movement. Decades of effort through both classical physiological and functional genomics approaches have generated an enormous amount of information on the roles of individual metabolites in stomatal guard cell function and physiology. Recent application of metabolomics methods has produced a substantial amount of new information on metabolome control of stomatal movement. In conjunction with other "omics" approaches, the knowledge-base is growing to reach a systems-level description of this single cell-type. Here we summarize current knowledge of the guard cell metabolome and highlight critical metabolites that bear significant impact on future engineering and breeding efforts to generate plants/crops that are resistant to environmental challenges and produce high yield and quality products for food and energy security.

**Keywords: stomata, primary metabolites, abscisic acid, phytohormones, lipids, specialized metabolites, food security**

# **Introduction**

Guard cells as a unique plant single cell-type perform many functions essential to plant growth and survival. Each pair of guard cells and the regulated pore they enclose, known as a stoma or stomate, provides a conduit for atmospheric photosynthetic gas exchange (CO<sub>2</sub> uptake and O<sub>2</sub> release) and transpirational release of water (H<sub>2</sub>O) in terrestrial plants, in addition to defense against pathogenic invasion. Stomatal opening and closing, in which the guard cells actively increase and decrease their volume via turgor changes to regulate the pore size in response to environmental stimuli, are vital processes in maintaining the balance of H<sub>2</sub>O loss and CO<sub>2</sub> fixation. While drought stress induces stomatal closure, pathogens exploit stomatal opening to facilitate entry into the leaf (Zeng and He, 2010). Abscisic acid (ABA), CO<sub>2</sub> and blue light mediated stomatal movements have generated tremendous interest in their signaling mechanisms. Each pathway/network has unique components such as distinct receptors and early signaling elements. They also have common components; for example, actual stomatal movement is caused by water influx/efflux mostly driven by solute fluxes through plasma membrane anion channels and K<sup>+</sup> channels. When the concentrations of solutes decrease in guard cells in the cases of ABA and elevated CO<sub>2</sub>, water potential increases in the cells and water flows out, causing a decrease of turgor pressure and closure of the pores. Blue light activates H<sup>+</sup>-ATPases and resultant membrane hyperpolarization drives K<sup>+</sup> influx, leading to decreased water potential, increased turgor pressure, and stomatal opening. Please refer to excellent articles published over the years on these signaling mechanisms (e.g., Zeiger and Zhu, 1998; Schroeder et al., 2001; Yu and Assmann, 2014; Zhang et al., 2014; Tian et al., 2015). 
Although stomatal movements in response to ABA, CO2, and blue light are well studied, the metabolome of the guard cell is far from catalogued. Endeavors in metabolomic approaches have led to deeper understanding of the biology inherent to several specialized important single-cell types including guard cells (Misra et al., 2014). Recent efforts to comprehend the guard cell metabolome (Jin et al., 2013) and systems biology approaches to identify the critical regulators in stomatal movement (Sun et al., 2014) have provided interesting leads into the intricate regulation of stomatal movement in response to environmental stimuli. Ongoing systems biology approaches, combining modeling and high-throughput experiments, will help to elucidate the mechanisms underlying stomatal control and unravel targets for modulation of stomatal responses to environment (Medeiros et al., 2015).
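
The link between guard cell solute content and water potential described above can be illustrated with the van 't Hoff approximation for dilute solutions, ψ<sub>s</sub> = −iCRT. The following is only a minimal sketch: the function name and the osmolyte concentrations are hypothetical values chosen for illustration, not measurements from guard cells, and real cells deviate from ideal-solution behavior.

```python
# Van 't Hoff approximation for an ideal dilute solution: psi_s = -i * C * R * T
R_L_MPA_PER_MOL_K = 0.008314  # gas constant expressed in L MPa mol^-1 K^-1


def osmotic_potential_mpa(conc_mol_per_l, temp_k=298.15, vant_hoff_i=1.0):
    """Return the solute (osmotic) potential in MPa; more solute gives a more negative value."""
    return -vant_hoff_i * conc_mol_per_l * R_L_MPA_PER_MOL_K * temp_k


# Hypothetical osmolyte concentrations (mol/L), for illustration only.
open_state = osmotic_potential_mpa(0.5)    # solute-rich guard cell
closed_state = osmotic_potential_mpa(0.2)  # after solute efflux

# Losing solutes raises (makes less negative) the solute potential,
# which favors water efflux and loss of turgor.
solute_loss_raises_potential = closed_state > open_state
```

Under these assumptions, solute loss moves the solute potential toward zero, which matches the direction of water flow described for ABA- and CO<sub>2</sub>-induced closure.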

Many factors pose immense challenges to global food and bioenergy security, including population growth, climate, and environmental changes coupled to land degradation and changes in hydrological resources, essential ecosystem services, and agricultural production systems. Urgent efforts are needed to enhance the resilience of crops to the adverse effects of climate change. Stomata are highly responsive to hormonal and environmental cues, including those associated with climate change: water availability, temperature, and CO<sub>2</sub> concentrations. Thus, understanding the basic biology, the concealed information content, and the connection to functional output of guard cells through multiple -omics approaches such as transcriptomics (Wang et al., 2011), proteomics (Zhao et al., 2008; Zhu et al., 2014) and metabolomics (Jin et al., 2013) is highly relevant to the goal of improving crop productivity and yield in ever-changing climatic regimens. Here we briefly review collective efforts to unravel the functional guard cell metabolome (**Figure 1**), to discover metabolites of convergence and divergence among various environmental cues, to examine the molecular mechanisms of guard cell metabolic regulation, and ultimately to highlight the potential of guard cell biology in harnessing possible solutions for global food and bioenergy security.

# **Primary Metabolites of Carbon Metabolism in Regulatory Roles of Stomatal Function**

Early hypotheses regarding guard cell osmoregulation suggested that sugar generated from starch degradation at dawn is the primary osmolyte that opens stomata (Lloyd, 1908). Upon the discovery that potassium (K<sup>+</sup>) ions, with chloride (Cl<sup>−</sup>) and malate<sup>2−</sup> as counter anions, are osmolytes that open stomata, a role for sugar in guard cell osmoregulation and stomatal movement was abandoned for several decades (Imamura, 1943; Yamashita, 1952; Fischer, 1968; Humble and Raschke, 1971; Allaway, 1973; Outlaw and Lowry, 1977; Asai et al., 2000). A later study reporting that blue and red (photosynthetic) light can open stomata, and that this opening is followed by sucrose accumulation in guard cells, revived the hypothesis of sucrose as an osmolyte that opens stomata (Talbott and Zeiger, 1993). A correlation between the decline of K<sup>+</sup> content in guard cells in the middle of the day and a concomitant increase in sucrose content further suggested that sucrose is an osmolyte that replaces K<sup>+</sup> and maintains stomatal opening (Amodeo et al., 1996; Talbott and Zeiger, 1996).

The origin of sucrose in guard cells is not yet clear. Potentially, sucrose could be obtained from guard cell starch degradation, guard cell photosynthesis, or import from mesophyll cells (Gotow et al., 1988; Talbott and Zeiger, 1988; Lawson et al., 2014). It is generally accepted though, that the contribution of sucrose produced from guard cell photosynthesis to the osmotic requirement for stomatal opening is minimal and that most of the sugar or the organic compounds from which sugar can be synthesized is obtained from the mesophyll cells (Outlaw, 1989; Reckmann et al., 1990). When exported out of the mesophyll cells for phloem loading, some sucrose accumulates in the guard cell apoplast (Lu et al., 1995, 1997; Outlaw and De Vlieghere-He, 2001). As a result, the concentration of sucrose in the guard cell apoplast increases as photosynthesis proceeds. This sucrose may be imported by guard cells and contribute to guard cell osmolarity and stomatal opening. But it also has been proposed that as sucrose accumulates in the apoplast, its osmotic effect drives water efflux from guard cells, resulting in a decrease in stomatal apertures in a mechanism that thus inversely coordinates photosynthesis and transpiration rates (Lu et al., 1997; Outlaw and De Vlieghere-He, 2001).

Apoplastic sucrose may enter the guard cells either via sucrose transporters, or via guard cell hexose transporters following sucrose cleavage by apoplastic invertase to yield the hexoses glucose and fructose (Stadler et al., 2003; Weise et al., 2008; Bates et al., 2012). Regardless of its origin, mesophyll or guard cells, sucrose must be cleaved to be metabolized, and the hexoses obtained from sucrose cleavage, glucose and fructose, must be phosphorylated by intracellular hexose phosphorylating enzymes, hexokinases (HXK) and fructokinases (FRK; Dennis and Blakeley, 2000). Glucose can be phosphorylated only by HXK, an enzyme demonstrated to exist in guard cells and to participate in sugar sensing (Moore et al., 2003; Rolland et al., 2006; Granot, 2008). A recent study has shown that sugars such as sucrose, glucose, and fructose, do not exert an apoplastic osmotic effect on guard cells, but rather are sensed within guard cells by HXK to stimulate stomatal closure, thus coordinating photosynthesis and sugar levels with transpiration (Kelly et al., 2013).

The phosphorylated hexoses (hexose-P) within guard cells may be converted to starch or enter glycolysis and the tricarboxylic acid (TCA) cycle to yield energy (ATP) and various metabolites including pyruvate and malate that regulate stomatal movement (Allaway, 1973; Pearson, 1973; Outlaw and Lowry, 1977). Glycolysis is a central metabolic pathway for cellular respiration and generation of energy in the form of ATP. In the glycolytic pathway, 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (iPGAM) catalyzes the interconversion of 3-phosphoglycerate and 2-phosphoglycerate. *Arabidopsis thaliana* double mutants of *iPGAM* genes show hyposensitivity in blue light, ABA, and low CO<sub>2</sub> regulated stomatal movements, confirming a role of glycolysis in guard cell function (Zhao and Assmann, 2011). ABA inhibition of stomatal opening in *Commelina benghalensis* is reversed by exogenous ATP and pyruvate (Raghavendra et al., 1976), suggesting a role of pyruvate in negative regulation of ABA signaling (Yu and Assmann, 2014). Recently, it was established that a putative mitochondrial pyruvate importer, NRGA1, negatively regulates ABA inhibition of K<sup>+</sup> inward channels, ABA activation of slow anion channels and drought tolerance in *A. thaliana* (Li et al., 2014). Altogether, these findings suggest that accumulation of pyruvate in mitochondria would oppose stomatal closure.

**FIGURE 1 | The guard cell metabolome.** Based on available literature, 109 known metabolites in guard cells are represented as a network based on their structural and biochemical relationships. Solid blue lines represent KEGG pathway-based biochemical relatedness and green dotted lines represent the Tanimoto structural-index based relatedness, which were inferred using the MetaMapR tool (http://dgrapov.github.io/MetaMapR/) and were visualized using Cytoscape (http://www.cytoscape.org/). The clusters of metabolites are highlighted based on metabolic categorization. Abbreviations of metabolites are as follows: ABA, Abscisic acid; ABA-GE, Abscisic acid glucose ester; ACC, 1-Amino cyclopropane-1-carboxylic acid; Ade, Adenine; ArchA, Arachidonic acid; Arg, L-Arginine; Asc, Ascorbic acid; Asn, L-Asparagine; Asp, L-Aspartic acid; ATP, Adenosine triphosphate; AllylITC, Allylisothiocyanate; BA, Benzoic acid; BL, Brassinolide; cADPR, Cyclic adenosine diphosphate ribose; CaffA, Caffeic acid; cAMP, Cyclic adenosine monophosphate; ChlA, Chlorogenic acid; CinnA, trans-Cinnamic acid; Cit, Citric acid; Cys, L-Cysteine; DAG, 1,2-Diacylglycerol; DAsc, Dehydroascorbate; DecA, Decanoic acid; DiHPA, Dihydrophaseic acid; eBL, Epibrassinolide; EdA, Eicosadienoic acid; EpA, Eicosapentaenoic acid; ET, Ethylene; EtA, Eicosatrienoic acid; FerA, Ferulic acid; F6P, Fructose-6-phosphate; Fruct, Fructose; GallA, Gallic acid; GA3, Gibberellic acid A3; GA4, Gibberellic acid A4; GentA, Gentisic acid; Glu, L-Glutamic acid; Gluc, Glucose; G6P, Glucose-6-phosphate; GSH, Glutathione (reduced); GSSG, Glutathione (oxidized); His, L-Histidine; H2S, Hydrogen sulfide; IAA, Indole-3-acetic acid; IBA, Indole-3-butyric acid; ICA, Indole-3-carboxylic acid; I3P, Inositol-1,4,5-trisphosphate; 2IPMal, 2-Isopropylmalic acid; 3IPMal, 3-Isopropylmalic acid; Isocit, DL-Isocitric acid; JA, Jasmonic acid; Leu, L-Leucine; LinA, Linoleic acid; LipA, α-Lipoic acid; Lys, L-Lysine; Mal, L-Malic acid; MeJA, Methyl jasmonate; Met, L-Methionine; mGlyox, Methylglyoxal; mIAA, Methyl indole-3-acetate; mSA, Methyl salicylate; mThioBut, α-keto-γ-(methyl-thio) Butyric acid; mSA, Methyl salicylic acid; NAD, Nicotinamide adenine dinucleotide; NDiHGuarA, Nordihydroguaiaretic acid; OA, Oleic acid; OAA, Oxaloacetic acid; OlAEE, Oleic acid ethyl ester; 12OPDA, 12-oxophytodienoic acid; cGMP, Guanosine-3′,5′-cyclic monophosphate; PA, Phaseic acid; PalA, Palmitic acid; PCA, Protocatechuic acid; pCoumA, p-Coumaric acid; PEP, Phosphoenolpyruvate; 3PG, 3-Phosphoglycerate; PhA, Phosphatidic acid; pGlu, L-Pyroglutamic acid; Phe, L-Phenylalanine; Pinitol, D-Pinitol; PI3P, Phosphatidylinositol-3-phosphate; PI4P, Phosphatidylinositol-4-phosphate; PIP2, Phosphatidylinositol-4,5-bisphosphate; Pro, L-Proline; Put, Putrescine; RosmA, Rosmarinic acid; SA, Salicylic acid; Ser, L-Serine; SinA, Sinapinic acid; S1P, Sphingosine-1-phosphate; Sph, D-erythro-Sphingosine; Spd, Spermidine; Spr, Spermine; Suc, Sucrose; SuccA, Succinic acid; SyrA, Syringic acid; Thr, L-Threonine; TraumA, Traumatic acid; Trp, L-Tryptophan; Tyr, L-Tyrosine; UA, Undecanoic acid; Val, L-Valine; Zeatin, trans-Zeatin; ZeatinGluc, trans-Zeatin glucoside; ZeatinRibo, trans-Zeatin riboside.
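
The structural edges in Figure 1 are described as resting on a Tanimoto index. As a rough illustration of that measure only (not a reproduction of how MetaMapR computes it), the Tanimoto coefficient of two binary fingerprints is the number of shared "on" bits divided by the number of bits set in either. The fingerprints and the 0.5 cutoff below are hypothetical toy values; real fingerprints would come from a cheminformatics resource.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as no similarity
    return len(a & b) / len(union)


# Hypothetical toy fingerprints keyed by the Figure 1 abbreviations.
toy_fingerprints = {
    "Mal": {1, 2, 3, 5},
    "SuccA": {1, 2, 3, 8},
    "ABA": {4, 6, 7, 9, 10},
}

THRESHOLD = 0.5  # assumed similarity cutoff for drawing a structural edge

# Keep only metabolite pairs whose similarity reaches the cutoff.
edges = [
    (m, n, tanimoto(toy_fingerprints[m], toy_fingerprints[n]))
    for m, n in combinations(sorted(toy_fingerprints), 2)
    if tanimoto(toy_fingerprints[m], toy_fingerprints[n]) >= THRESHOLD
]
```

With these toy inputs only the two dicarboxylic acids are linked, which is the kind of structural clustering the green dotted lines in the figure are meant to convey.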

Malate, an osmolyte that contributes to stomatal opening, can be generated from hexoses and phosphorylated hexoses obtained from guard cell starch degradation or from triose-phosphates produced in guard cell chloroplasts and exported to the cytoplasm where triose-P metabolism yields malate among other metabolites. ABA-stimulated stomatal closure is accompanied by malate disposal through release, gluconeogenesis, or consumption in the TCA cycle, supporting the role of malate as an osmolyte that opens stomata (Dittrich and Raschke, 1977). In the guard cell cytosol, malate can be metabolized into oxaloacetate (OAA) by malate dehydrogenase. Subsequently, phosphoenolpyruvate carboxykinase (PEPCK) can catalyze the production of PEP from OAA that in turn would enter into gluconeogenesis. An isoform of PEPCK, PCK1, is expressed in *A. thaliana* guard cells according to three experimental approaches: *PCK1* gene promoter analysis and analyses of the proteome and transcriptome of guard cell protoplasts (Leonhardt et al., 2004; Zhao et al., 2008; Penfield et al., 2012). Loss-of-function *PCK1* plants (*pck1-2*) show hyposensitivity in response to dark-induced (but not ABA-induced) stomatal closure, indicating the importance of malate metabolism for some stomatal responses (Penfield et al., 2012). Malate produced in photosynthetic tissues may also arrive at and enter the guard cells through malate transporters (Lee et al., 2008). Mesophyll-produced malate also coordinates stomatal behavior with mesophyll photosynthesis, as increasing apoplastic malate activates anion channels that reduce stomatal aperture (Hedrich and Marten, 1993; Fernie and Martinoia, 2009; Araujo et al., 2011). In addition, methylglyoxal, an oxygenated short aldehydic glycolytic intermediate, can induce stomatal closure in *A. thaliana* accompanied by extracellular reactive oxygen species (ROS) production mediated by SHAM-sensitive peroxidases, intracellular ROS accumulation, and suppression of free cytosolic Ca<sup>2+</sup> oscillations (Hoque et al., 2012). These results indicate a strong interconnectivity between central carbon metabolism and ABA signaling in guard cells.

# **Reactive Oxygen Species Related Metabolites in Guard Cell Signaling**

Reactive oxygen species and nitric oxide (NO) are central components of the signaling network regulating stomatal movement in response to ABA, jasmonic acid (JA), darkness, UV, pathogens, and high CO<sub>2</sub> concentrations (Zhang et al., 2001; Desikan et al., 2004, 2006; Zhu et al., 2012; Akter et al., 2013; He et al., 2013; Joudoi et al., 2013; Ou et al., 2014). Upon application of NO-releasing compounds, NO induces dose-dependent stomatal closure. In contrast, NO has also been implicated as a key component in negative feedback regulation of ABA guard cell signaling through S-nitrosylation of OST1 at cysteine 137 and subsequent inactivation of kinase activity that in turn blocks the positive regulatory role of OST1 in ABA signaling (Wang et al., 2015). NO-mediated negative feedback regulation may prevent complete stomatal closure, allowing some basal level of CO<sub>2</sub> uptake and photosynthesis. Hydrogen peroxide (H<sub>2</sub>O<sub>2</sub>) may also elicit stomatal movement in a similar manner through redox modification of guard cell signaling components. However, experimental data are lacking for this hypothesis. In addition, ascorbic acid (Asc) and glutathione (GSH) are critical in maintaining cellular ROS levels and redox homeostasis (Noctor and Foyer, 1998). Asc is a key antioxidant that scavenges ROS including H<sub>2</sub>O<sub>2</sub>. Dehydroascorbate reductase (DHAR) is the key regulatory enzyme that catalyzes the generation of Asc (reduced form) from dehydroascorbate (DAsc, oxidized form) in a reaction that requires GSH. Tobacco *DHAR* overexpression lines that have elevated levels of reduced Asc in guard cells show hyposensitivity in stomatal response to ABA and H<sub>2</sub>O<sub>2</sub> and these plants are drought susceptible. In contrast, *DHAR* antisense tobacco lines show drought tolerance (Chen and Gallie, 2004). These findings indicate that Asc redox state plays an important regulatory role in ABA and H<sub>2</sub>O<sub>2</sub> mediated stomatal responses. 
Altered redox state and stomatal aperture in mutants defective in GSH synthesis are well established (Okuma et al., 2011; Munemasa et al., 2013). Negative regulation of methyl jasmonate (MeJA)-induced stomatal closure by GSH in *A. thaliana* has been demonstrated (Akter et al., 2013). In addition, GSH peroxidases are known to function as redox transducers as well as scavengers in ABA-mediated stress responses (Miao et al., 2006). Thus, understanding redox changes and their regulation and coordination with stomatal functions would provide new insights into guard cell signaling networks.

Stomatal guard cells have a thick cuticular layer containing high concentrations of wax-bound phenolics that provide protection against UV radiation (Karabourniotis et al., 2001; He et al., 2013) and form a constitutive defense barrier against pathogens and insects. Intracellular phenolics and flavonoids synthesized from the phenylpropanoid pathway are also responsible for cellular defense and pigmentation among other functions. Flavonoids protect plants from UV-B irradiation (Li et al., 1993) and also function as stress-induced antioxidants (Dixon and Paiva, 1995). Flavonols accumulate in guard cells of *A. thaliana*, but not in the surrounding pavement cells (Watkins et al., 2014). Enhanced flavonol content and decreased ROS levels upon ethylene (ET) treatment in guard cells were correlated with a reduction in the rate of stomatal closure in response to ABA. The results suggest that flavonols may quench the ABA-dependent ROS burst (Watkins et al., 2014). Moreover, some flavonoids, such as quercetin, apigenin, and kaempferol, have functions similar to synthetic auxin transport inhibitors, so changes in the synthesis or deposition of specific flavonoids within cells may act to change the rate or direction of auxin transport (Winkel-Shirley, 2002). Given that ABA reduces guard cell auxin concentrations (Jin et al., 2013), it would be interesting to further investigate the interrelationships between flavonoids, ROS, ABA, and guard cell auxin transport.

# **Role of Lipid Signaling in Stomatal Movement and Development**

Lipids are essential for membrane formation and energy storage. In addition, lipids and their metabolites are also important cellular signaling molecules, including in stomatal regulation. For instance, lipid-based secondary messengers that positively regulate guard cell ABA signaling and stomatal closure include phosphatidic acid (PhA), phosphatidylinositol-3-phosphate (PI3P), inositol-1,4,5-trisphosphate (IP3), inositol hexakisphosphate (IP6), and sphingolipids (Kim et al., 2010; **Figure 1**).

Phosphoinositides play important roles in guard cell signaling. Phospholipase C (PLC) hydrolyses phosphatidylinositol 4,5-bisphosphate (PIP2) to produce 1,2-diacylglycerol (DAG) and IP3. ABA induced the production of IP3 in *Vicia faba* guard cells (Lee et al., 1996), and cytosolic Ca<sup>2+</sup> elevation and subsequent stomatal closure occurred upon experimental elevation of cytosolic IP3 in *Commelina communis* guard cells (Gilroy et al., 1990). Increases in guard cell PI3P and PI 4-phosphate (PI4P; the products of PI 3-kinase (PI3K) and PI 4-kinase (PI4K) activities, respectively) induce stomatal closure mediated by ABA-induced ROS generation (Jung et al., 2002; Park et al., 2003). IP6 is generated in guard cells in response to ABA. IP6 is an endomembrane-acting Ca<sup>2+</sup>-release signal that inhibits the inwardly rectifying K<sup>+</sup> channel, which would then inhibit stomatal opening (Lemtiri-Chlieh et al., 2003).

PhA, a product of phospholipase Dα1 (PLDα1) activity, is a positive regulator in ABA-induced ROS and NO production that promotes stomatal closure (Zhang et al., 2009). In *A. thaliana* guard cells, NO synthesis is positively regulated by both ABA and ROS, and interaction of PhA with the two NADPH oxidases, AtrbohD and AtrbohF (Kwak et al., 2003), is necessary for ABA-induced ROS production (Zhang et al., 2009). The NADPH oxidase-deficient double mutant *atrbohD/F* shows impaired ABA induction of NO production and stomatal closure, indicating that ROS production is necessary for NO production. Application of NO scavengers can inhibit ROS-mediated stomatal closure, indicating that NO is required for ROS-promoted stomatal closure. In contrast, application of NO cannot induce ROS production in *A. thaliana* guard cells (Bright et al., 2006). These findings indicate that PhA functions upstream of ROS production and ROS function upstream of NO production.
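
The ordering inferred in the paragraph above (PhA upstream of ROS, ROS upstream of NO) can be checked against the cited perturbations with a toy Boolean model. This is a deliberately simplified sketch: the function and its parameter names are hypothetical, and real signaling is quantitative, branched, and subject to feedback.

```python
def guard_cell_response(aba=False, nadph_oxidase=True, no_scavenger=False,
                        exogenous_ros=False, exogenous_no=False):
    """Toy Boolean reading of the linear order PhA -> ROS -> NO -> stomatal closure."""
    pha = aba                                         # ABA stimulates PLD-alpha-1-derived PhA
    ros = exogenous_ros or (pha and nadph_oxidase)    # PhA acts through AtrbohD/F
    no = (ros or exogenous_no) and not no_scavenger   # NO lies downstream of ROS
    return {"ROS": ros, "NO": no, "closure": no}


wild_type = guard_cell_response(aba=True)
atrboh_df = guard_cell_response(aba=True, nadph_oxidase=False)        # atrbohD/F double mutant
scavenged = guard_cell_response(exogenous_ros=True, no_scavenger=True)
no_donor = guard_cell_response(exogenous_no=True)
```

In this sketch the *atrbohD/F* case loses both NO and closure, the scavenger blocks ROS-driven closure, and an NO donor closes stomata without raising ROS, which is the pattern of observations that supports the proposed order.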

The lipid metabolite sphingosine-1-phosphate (S1P) is a product of sphingosine kinase (SPHK) activity, which uses the long-chain amine alcohol sphingosine as a substrate. S1P induced increases of cytosolic Ca<sup>2+</sup> (Ng et al., 2001) and stimulated stomatal closure in *C. communis* and *A. thaliana* (Ng et al., 2001; Coursol et al., 2003). The *A. thaliana* genome encodes two functional SPHK genes, *SPHK1* and *SPHK2* (Worrall et al., 2008; Guo et al., 2011). Both SPHKs can use sphingosine and phyto-sphingosine as substrates to produce S1P and phyto-S1P, respectively. Both S1P and phyto-S1P induce stomatal closure in *A. thaliana* (Coursol et al., 2005). S1P inhibits inward K<sup>+</sup> channels and promotes slow anion channel activity in *A. thaliana* guard cell protoplasts, which in turn cause inhibition of stomatal opening and promotion of stomatal closure, respectively (Coursol et al., 2003). In *A. thaliana*, a functional G-protein α-subunit (GPA1) is required for S1P regulation of ion channels (Coursol et al., 2003). In *A. thaliana*, PhA interacts with SPHKs, promoting substrate binding, which in turn increases SPHK activity. Phyto-S1P induces PhA production in wild type (WT) *A. thaliana*, but not in the *pldα* mutant, indicating a positive regulatory role of phyto-S1P in PLDα-mediated PhA production. It has been suggested that phyto-S1P promotes PLDα activity by increasing cytoplasmic Ca<sup>2+</sup> concentration (Guo and Wang, 2012). These findings indicate that phyto-S1P and PhA are dependent on each other via positive feedback regulation.

A guard cell-specific and ABA-independent oxylipin pathway was recently reported (Montillet and Hirt, 2013). Unesterified fatty acids derived from complex membrane lipids are converted by lipoxygenase (LOX) into various oxylipin products, such as JA, fatty acid hydroperoxides, and reactive electrophile species (RES) oxylipins, and these can induce stomatal closure at nanomolar concentrations (Montillet et al., 2013). *A. thaliana lox1* mutants were as sensitive to exogenously applied ABA as WT plants, suggesting that LOX1 activity is not involved in ABA-induced stomatal closure. In addition, a transgenic SA-deficient NahG line, and the two SA biosynthesis mutant lines, *sid1-1* and *sid2-1*, responded normally to ABA, but were non-responsive to RES oxylipins. In addition, *lox1* mutant lines were as sensitive to SA (100 µM) as WT, demonstrating that exogenously applied SA compensated for the *LOX1* deficiency. The results indicate that SA is required to convey the RES oxylipin signal, but not the ABA-mediated signal, leading to stomatal closure.

Naturally occurring saturated short, straight chain fatty acids, such as decanoic and undecanoic acids, can inhibit stomatal opening and cause stomatal closure in epidermal strips of *C. communis* (Willmer et al., 1978). In contrast, some polyunsaturated fatty acids, such as linolenic and arachidonic acid, enhance stomatal opening and inhibit stomatal closing, consistent with their promotion of inward K<sup>+</sup> channel activity and inhibition of outward K<sup>+</sup> channel activity (Lee et al., 1994). Very-long-chain polyunsaturated fatty acids (VLCPUFAs), such as eicosapentaenoic acid (20:5 Δ<sup>5,8,11,14,17</sup>), are abundant lipids in several key plant pathogens (Sun et al., 2013), and may elicit plant defense responses, including stomatal closure. Interestingly, it was shown that exogenous application of eicosadienoic and eicosatrienoic acids to WT plants or endogenous production in the transgenic plants could reduce water loss from excised leaves and confer ABA hypersensitivity to stomatal responses (Yuan et al., 2014). Some fatty acids have been shown to regulate stomatal development, thus affecting the overall plant response to the environment. The *A. thaliana* gene *HIC* (high carbon dioxide) encodes a putative 3-keto acyl coenzyme A synthase (KCS), an enzyme involved in the synthesis of very-long-chain fatty acids (VLCFA) and is a negative regulator of stomatal development in response to CO<sub>2</sub> (Gray et al., 2000). Mutant *hic* plants exhibit up to a 42% increase in stomatal density in response to a doubling of CO<sub>2</sub>, possibly by preventing the synthesis of component(s) of the extracellular matrix found at the guard cell surface, such as waxes, glycerolipids, sphingolipids, and cutin (Gray et al., 2000). 
FATTY-ACID DESATURASE4 (FAD4) is required to desaturate palmitic acid (16:0), and the *fad4* mutant is unable to change stomatal index (defined as the percentage of stomata as compared to all the epidermal cells (including stomata) in a unit area of leaf) in response to elevated CO<sub>2</sub> (Lake et al., 2002). Metabolic profiling of *sdd1* (*STOMATAL DENSITY AND DISTRIBUTION1*) plants, which have three- to fourfold higher stomatal density than WT plants, showed a fivefold reduction of unsaturated C16 fatty acids compared to WT, and a concomitant rise in saturated fatty acid 16:0 species (i.e., palmitic acid; Fiehn et al., 2000). The fates of these fatty acids are scarcely known, although it is assumed that some are incorporated into the cutin layers. In *A. thaliana*, mutations of the VLCFA-producing enzymes CER6, CER1, and HIC that are involved in cuticle biosynthesis result in increased stomatal index (Gray et al., 2000). Whether stomatal index/density affects stomatal movement is not clear. Nevertheless, the aforementioned roles of fatty acid metabolites and their metabolic enzymes offer new avenues to elucidate lipid signaling networks in guard cells, which will facilitate engineering of fatty acid metabolism in crops for enhanced stress tolerance and productivity.
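
The stomatal index defined above is a simple ratio, and a short worked example may help distinguish it from stomatal density (stomata per unit area). The sketch below follows the definition given in the text; the function name and the cell counts are hypothetical and chosen only for illustration.

```python
def stomatal_index(n_stomata, n_other_epidermal_cells):
    """Percentage of stomata among all epidermal cells (stomata included) in a counted area."""
    total_cells = n_stomata + n_other_epidermal_cells
    if total_cells == 0:
        raise ValueError("no epidermal cells were counted")
    return 100.0 * n_stomata / total_cells


# Hypothetical counts from one microscope field.
example_index = stomatal_index(n_stomata=20, n_other_epidermal_cells=80)
```

Because the index is normalized to total epidermal cell number, it can stay constant while density changes if epidermal cells simply expand or shrink, which is why the two measures are reported separately.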

# **Phytohormone Cross-talk in Stomatal Function**

The phytohormone ABA, first reported in plants in the 1960s (Eagles and Wareing, 1963; Ohkuma et al., 1963), is the single most studied metabolite in guard cell physiology owing to its distinct stress (e.g., drought) responsiveness and strong effect on stomatal closure. ABA causes stomatal closure, prevents opening of closed stomata, and reduces transpiration in the leaves of a wide range of species. Stomata accumulate (Cornish and Zeevaart, 1986), catabolize (Grantz et al., 1985), and conjugate exogenously supplied ABA (Grantz et al., 1985; Lee et al., 2006), but to date it is unclear if stomatal opening initially includes or requires depletion of endogenous guard cell ABA (Tallman, 2004). The biosynthesis of ABA from carotenoids in plastids and its catabolism and storage in the cytosol and endoplasmic reticulum in plant cells are well characterized (Nambara and Marion-Poll, 2005). The regulatory network of ABA sensing involves three major components: PYRABACTIN RESISTANCE1 (PYR1)/PYR1-LIKE (PYL)/REGULATORY COMPONENTS OF ABA RECEPTORS (RCAR; i.e., PYR/PYL/RCAR; an ABA receptor; Ma et al., 2009; Park et al., 2009; Joshi-Saha et al., 2011), type 2C protein phosphatase (PP2C; a negative regulator) and SNF1 related protein kinase 2 (SnRK2; a positive regulator). Together they form a double-negative regulatory system (PYR/PYL/RCAR --| PP2C --| SnRK2), which has been well studied (Klingler et al., 2010; Umezawa et al., 2010). PP2Cs inactivate SnRK2 kinases by physical interaction and direct dephosphorylation. Upon ABA binding, PYLs change their conformations and then physically interact with and inhibit PP2Cs. However, PYLs inhibit PP2Cs in both the presence and absence of ABA and activate SnRK2s (Zhang et al., 2015). Several natural and artificial compounds interacting with the ABA receptor PYR/PYL/RCAR family are now known (Hitomi et al., 2013). 
Evolutionary insights obtained from studies on components of the ABA signaling network indicate that PYR/RCAR ABA receptor and ABF-type (ABA-responsive element binding factors) transcription factor families arose during land colonization by plants, while the ABA biosynthesis enzymes have evolved in different plant and fungal specific pathways (Hauser et al., 2011). The structural insights provided from the three-dimensional structures of module PYR/PYL/RCAR-ABA-PP2C pave the way to the design of ABA agonists able to modulate the plant stress response (Santiago et al., 2012).
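
The double negative logic of the core ABA module (receptor inhibits PP2C, PP2C inhibits SnRK2) can be illustrated with a toy Boolean model. This is only a sketch under a strong simplifying assumption, namely that the receptor is active solely when ABA is bound; it therefore ignores the ABA-independent PP2C inhibition noted above, and the function name is hypothetical.

```python
def aba_core_module(aba_present: bool) -> dict:
    """Toy Boolean model of PYR/PYL/RCAR -| PP2C -| SnRK2.

    Assumption (simplification): the receptor is active only with ABA bound.
    """
    receptor_active = aba_present
    pp2c_active = not receptor_active   # active receptor inhibits PP2C
    snrk2_active = not pp2c_active      # active PP2C dephosphorylates SnRK2
    return {
        "PYR/PYL/RCAR": receptor_active,
        "PP2C": pp2c_active,
        "SnRK2": snrk2_active,
    }


for aba in (False, True):
    print(aba, aba_core_module(aba))
```

Two successive inhibitions yield a net positive relationship: in this simplified model SnRK2 is active exactly when ABA is present.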

ABA is transported over short and long distances in plants. Plasma membrane-localized ABA transporters belonging to ATP-binding cassette (ABC; Kang et al., 2010) and nitrate transporter 1/peptide transporter (NRT1/PTR) families are established (Boursiac et al., 2013) and ABA-perception sites were visualized on the plasma membrane of stomatal guard cells (Yamazaki et al., 2003), in addition to internal sites of perception. For instance, application of ABA into the cytosol of *V. faba* guard-cell protoplasts via patch-clamp techniques inhibited inward K<sup>+</sup> currents thus inhibiting stomatal opening (Schwartz et al., 1994). Although ABA synthesis in guard cells and vascular tissues has been shown (Seo and Koshiba, 2011; Bauer et al., 2013; Boursiac et al., 2013), the relative extent to which guard cells and vascular tissues contribute to the ABA dynamics in guard cells is a topic of ongoing interest. For instance, the recent design, engineering and use of ABAleons with ABA affinities in the range of 100–600 nM to map ABA concentration changes in plant tissues with spatial and temporal resolution in distinct cell types, and in response to low humidity and NaCl in guard cells (Waadt et al., 2014) has promising future applications.

ABA causes alkalization of the guard cell cytosol (Blatt and Armstrong, 1993), which directly enhances outward K<sup>+</sup> channel activity (Blatt and Armstrong, 1993; Ilan et al., 1994; Miedema and Assmann, 1996), and a sustained efflux of both anions and K<sup>+</sup> from guard cells contributes to loss of guard cell turgor, thus facilitating stomatal closing. In addition, ABA-induced stomatal closing can be Ca<sup>2+</sup>-dependent or -independent (Schroeder et al., 2001). ABA-mediated inhibition of stomatal opening is a process distinct from ABA-induced stomatal closure, and it is unclear if H<sub>2</sub>O<sub>2</sub> and NO are involved in the ABA inhibition of stomatal opening (Desikan et al., 2004). Even after half a century of research, the role of ABA in guard cell signaling continues to be elucidated (Kim et al., 2010; Yu and Assmann, 2014). ABA content can be decreased via catabolism to phaseic acid (PA), sequestration in the form of an ABA-glucose ester (ABA-GE), which is thought to be physiologically inactive, or deposition in vacuoles (Nambara and Marion-Poll, 2005). Studies on sugar-response mutants indicate that ABA and sugar-response pathways overlap extensively (León and Sheen, 2003). It is known that the sugar sensing effects mediated by HXK are dependent on production of and signaling by ABA (Rolland et al., 2006; Rognoni et al., 2007; Ramon et al., 2008); for example, these interactions take place in mesophyll cells where sugar and HXK inhibit expression of photosynthesis genes (Rolland et al., 2006). Recently, it has been shown that sugar and HXK stimulate the ABA signaling pathway within guard cells, promoting stomatal closure (Kelly et al., 2013).
These effects were also observed in epidermal peels, suggesting that sugar and HXK stimulate production of ABA, release of biologically active ABA from inactive ABA pools, and/or inhibition of ABA degradation within guard cells (Koiwai et al., 2004; Christmann et al., 2005; Melhorn et al., 2008; Wasilewska et al., 2008; Zhu et al., 2011). These observations also imply that ABA is probably essential for daily regulation of stomatal aperture even in the absence of water stress (Kelly et al., 2013).

A comprehensive and comparative metabolomics study undertaken in guard and mesophyll cells of *A. thaliana* revealed that following ABA treatment, metabolites are clustered into different temporal modules in guard cells and mesophyll cells (Jin et al., 2013). Guard cell modules differ in WT plants as compared to the modules in the heterotrimeric G-protein α subunit null mutant (*gpa1*), with fewer metabolites showing ABA-altered profiles in *gpa1*, consistent with hyposensitivity of *gpa1* K<sup>+</sup>, anion, and Ca<sup>2+</sup> channels to ABA (Wang et al., 2001; Fan et al., 2008; Zhang et al., 2011). For instance, the Ca<sup>2+</sup>-mobilizing metabolites S1P and cyclic adenosine 5′-diphosphoribose (cADPR) exhibited weaker ABA-stimulated increases in *gpa1* than in WT guard cells. Phytohormone-related metabolites, including the ABA catabolites ABA glucose ester, PA, and dihydrophaseic acid (DiHPA), as well as indole-3-acetic acid (IAA), JA, MeJA, and methyl salicylate, were responsive to ABA, with greater responsiveness in WT than in the *gpa1* guard cells. In particular, IAA concentrations in guard cells declined following ABA treatment in WT guard cells but not in *gpa1* guard cells. These findings are consistent with the observation that exogenous application of IAA activates the guard cell H<sup>+</sup>-ATPase and impairs ABA-inhibition of stomatal opening, and suggest that endogenous ABA in guard cells functions upstream to regulate other endogenous hormones, particularly IAA, consistent with G proteins modulating multiple hormonal signaling pathways. Most phytohormones also showed differential ABA responses in guard cells as compared to mesophyll cells (Jin et al., 2013). In support of the idea that multiple hormones regulate guard cell responses, in *V. faba*, cytokinin and auxin induced stomatal opening (Levitt et al., 1987; Song et al., 2006) in conjunction with decreased H<sub>2</sub>O<sub>2</sub> production (Song et al., 2006).
Salicylic acid (SA) is a ubiquitous phenolic phytohormone involved in stomatal movement. Addition of 1 mM SA to fully opened stomata resulted in a significant reduction (75%) in stomatal aperture (Lee, 1998) in *C. communis*. SA is known to induce stomatal closure accompanied by extracellular ROS production, intracellular ROS accumulation and inward K<sup>+</sup> channel inactivation (Khokon et al., 2011a). Although both ABA and SA were reported to be needed for stomatal closure in response to pathogens, with SA action upstream of ABA (Melotto et al., 2006), a recent study using the ABA biosynthesis mutant *aba2* and a mutant of JA biosynthesis reported no differences in SA-induced stomatal closure in the mutants as compared to WT. The authors concluded that neither ABA nor JA is involved in SA, yeast elicitor, or chitosan-induced stomatal closure in *A. thaliana* (Issak et al., 2013). These results appear to indicate the presence of an ABA-independent SA signaling pathway in guard cells, but more research is needed to fully resolve the contradictory conclusions in the literature.

In *V. faba* (Zhang et al., 2001) and *Pisum sativum* (Suhita et al., 2004), ABA-mediated stomatal closure is preceded by cytoplasmic alkalization and H<sub>2</sub>O<sub>2</sub> production, events that also occur during MeJA-mediated stomatal closure. In fact, ABA- and MeJA-mediated stomatal closure share several characteristic signaling components, such as Ca<sup>2+</sup> involvement, protein phosphorylation, cytoplasmic alkalization, ROS production, and modulation of plasma membrane K<sup>+</sup> channels in the guard cells (Suhita et al., 2004). Extremely low levels of the phytotoxin coronatine (COR), secreted by virulent strains of *Pseudomonas syringae* pv. *tomato* (*Pst*), act as a JA mimic, activate the JA signaling pathway, and enable the strains to reopen stomata, thereby circumventing host stomatal defense (Montillet and Hirt, 2013). However, unlike COR, exogenous MeJA does not appear to antagonize ABA-induced stomatal closure (Melotto et al., 2006). In fact, the ability of MeJA to regulate stomatal apertures remains controversial (Montillet et al., 2013). Allene oxide synthase (AOS) is a key enzyme in the oxylipin pathway and plays a vital role in production of 12-oxo-phytodienoic acid (12-OPDA, a JA precursor) and JA. Recently, it has been proposed that 12-OPDA, rather than MeJA, acts in promotion of stomatal closure (Savchenko et al., 2014).

The role of brassinosteroids (BRs) in stomatal movements is less established. Brassinolide (BL), the most bioactive BR form, has been shown to promote stomatal closure in *V. faba* (Haubrick et al., 2006), where BL-induced stomatal opening was not observed. Interestingly, low concentrations of epibrassinolide (eBL) promoted stomatal opening in epidermal peels of *Solanum lycopersicum* in the dark, whereas high concentrations of eBL promoted stomatal closure in the light (Xia et al., 2014). Exogenous (apoplastic) and endogenous (cytosolic) BR may act differently, and guard cells of different species may respond differently to BL application. In *S. lycopersicum*, transient H<sub>2</sub>O<sub>2</sub> production was deemed essential for poising the cellular redox status, which played an important role in BR-induced stomatal opening (Xia et al., 2014). BR promoted stomatal closure through apparent biosynthesis of ABA, while stomatal opening was dependent on the GSH redox status of the guard cells. It was proposed that GSH regeneration and/or biosynthesis, leading to a reduced redox status, strictly controls the ROS level and negatively regulates the ABA response pathway, and that BR can directly induce ROS production independently of ABA via NADPH oxidase.

ET and its precursor 1-aminocyclopropane-1-carboxylic acid (ACC) activate the production of H<sub>2</sub>O<sub>2</sub> in guard cells and induce stomatal closure in *V. faba* (Song et al., 2014), and the closure was preceded by elevated ROS generated by NADPH oxidases (Desikan et al., 2006). However, the ET effect varies depending on species and conditions. For example, ET promotes stomatal closure in *Arachis hypogaea* (Pallas and Kays, 1982) and *A. thaliana* (using intact leaves; Desikan et al., 2006), but evokes stomatal opening in *Dianthus caryophyllus* and *S. lycopersicum* (Madhavan et al., 1983), *V. faba* (Levitt et al., 1987), and *A. thaliana* (using epidermal peels; Tanaka et al., 2005). The ET effect on stomatal opening was attributed to its impairment of ABA regulation of stomatal closure (Tanaka et al., 2005), but recently, it was shown in *A. thaliana* that ET mediated BR-induced stomatal closure via Gα protein-activated AtrbohF-dependent H<sub>2</sub>O<sub>2</sub> production and subsequent Nia1-catalyzed NO production (Ge et al., 2015; Shi et al., 2015). Nonetheless, the exact mechanisms underlying the different ET effects are unknown.

# **Nitrogen and Sulfur Rich Metabolites in Guard Cell Signaling**

Nitrogenous bases in the form of purines and pyrimidines form an essential pool of nitrogen in plant cells. Nitrogenous metabolites have been extensively studied as metabolic intermediates and signaling molecules in stomatal movement and guard cell function. Important nitrogenous signaling molecules, such as cADPR, a metabolite derived from nicotinamide adenine dinucleotide (NAD; Wu et al., 1997), play important roles in guard cell ABA signaling. Injection of cADPR into guard cells resulted in [Ca<sup>2+</sup>]<sub>cyt</sub> increases and turgor reduction. When guard cells were preloaded with the cADPR antagonist 8-NH<sub>2</sub>-cADPR, a slowing of stomatal closure was observed in response to ABA (Leckie et al., 1998). Recently, it was established that inhibition of the poly(ADP-R) polymerase activity correlated with increased number of stomata in *A. thaliana* leaves (Schulz et al., 2014), highlighting the role of poly(ADP-R) metabolism in stomatal development. Another nucleotide-related metabolite, cyclic guanosine monophosphate (cGMP), has been implicated in ABA-induced stomatal closure by acting downstream of H<sub>2</sub>O<sub>2</sub> and NO in the signaling pathway by which ABA induces stomatal closure (Dubovskaya et al., 2011). H<sub>2</sub>O<sub>2</sub>- and NO-induced increases in cytosolic calcium ([Ca<sup>2+</sup>]<sub>cyt</sub>) were cGMP-dependent, positioning cGMP upstream of [Ca<sup>2+</sup>]<sub>cyt</sub>, and involved the action of the type 2C protein phosphatase, ABI1. Increases in cGMP were mediated through the stimulation of guanylyl cyclase by H<sub>2</sub>O<sub>2</sub> and NO (Dubovskaya et al., 2011). The nitrated form of cGMP (8-nitro-cGMP) is a positive regulator in promotion of stomatal closure (Joudoi et al., 2013). NO and cGMP induce the synthesis of 8-nitro-cGMP in guard cells in the presence of ROS, leading to the hypothesis that NO-dependent guanine nitration of cGMP may occur in plants and the resulting 8-nitro-cGMP acts as a signaling molecule that activates cADPR production in guard cells. By contrast, a positive role for cGMP in kinetin- and natriuretic peptide-induced stomatal opening in *Tradescantia albiflora* (Pharmawati et al., 1998) and in auxin-induced stomatal opening in *C. communis* and *A. thaliana* (Cousson and Vavasseur, 1998; Cousson, 2003) also has been recognized. Furthermore, application of 8-bromo-cGMP, a membrane-permeant cGMP analog, causes stomata to open in the dark (Cousson, 2003; Joudoi et al., 2013), but 8-nitro-cGMP does not. These results lead to the conclusion that cGMP and its nitrated derivative play different roles in guard cell signaling, wherein cGMP promotes stomatal opening in the dark, while 8-nitro-cGMP promotes stomatal closure in the light (Joudoi et al., 2013).

Another important group of N-containing specialized metabolites in plants are polyamines. Exogenous application of polyamines, such as 1 mM spermine, inhibits stomatal opening by inhibiting inwardly rectifying K<sup>+</sup> channels (Liu et al., 2000). Application of spermidine also promotes stomatal closure but the mechanism is unknown as outward K<sup>+</sup> channels and anion channels are not affected (Liu et al., 2000). On the other hand, (acetyl-)1,3-diaminopropane (DAP), a product of oxidative deamination of spermidine and spermine, suppresses anionic currents, increases those of inwardly rectifying K<sup>+</sup> channels, and may induce membrane hyperpolarization and extracellular acidification by activating the H<sup>+</sup>-ATPase, thus restraining stomatal closing (Jammes et al., 2014). These mechanisms act antagonistically to ABA. It is thought that during acclimation to low soil-water availability, acetyl-DAP prevents complete stomatal closure (Jammes et al., 2014). Moreover, DAP and similar amine oxidase reaction products are precursors of γ-aminobutyric acid, alkaloids, β-alanine, and other uncommon polyamines that play significant roles in stress tolerance and defense (Bouchereau et al., 1999). In fact, based on proteome analysis in *Brassica napus* guard cells, ABA-responsive proteins that decrease in abundance include those involved in spermidine synthesis, purine metabolism, and alkaloid biosynthesis pathways (Zhu et al., 2010).

Glucosinolates are N- and S-containing specialized metabolites in plants that have been shown to be present in guard cells. Glucosinolate-myrosinase systems in Brassicales, especially *A. thaliana*, are well understood in plant-herbivore interactions and defense against pathogens (Yan and Chen, 2007; Andersson et al., 2015). However, recent evidence from proteomic investigations has indicated that glucosinolates are required for ABA responses of guard cells (Zhao et al., 2008; Zhu et al., 2010, 2014). THIOGLUCOSIDE GLUCOHYDROLASE1 (TGG1), encoding a myrosinase that catalyzes the production of isothiocyanates (ITC) from glucosinolates, is highly abundant in guard cell proteomes. In fact, myrosinases are proposed to redundantly function downstream of ROS production and upstream of cytosolic Ca<sup>2+</sup> elevation in ABA and MeJA signaling in guard cells (Islam et al., 2009). *A. thaliana tgg1* mutants are hyposensitive to ABA inhibition of guard cell inward K<sup>+</sup> channels and stomatal opening. In addition, thiol-reagents such as ITCs have been shown to be potent inducers of stomatal closure, possibly via covalent reactions with RES oxylipin targets (Montillet et al., 2013). Some of the glucosinolate-producing plant species, such as *Brassica juncea*, produce 2-propenylglucosinolate, which can be hydrolyzed to allylisothiocyanate (allyl-ITC). Exogenous application of allyl-ITC was found to induce stomatal closure (Khokon et al., 2011b). The stomatal closure by allyl-ITC was induced via production of ROS and NO, and elevation of cytosolic Ca<sup>2+</sup>. In addition, other ITCs, nitriles, and thiocyanates (e.g., 3-butenenitrile and ethyl thiocyanate) have also been shown to induce foliar ROS generation and stomatal closure (Hossain et al., 2013). Manipulation of glucosinolate metabolic pathways by plant metabolic engineering and breeding approaches may lead to development of crop varieties with combined disease and drought resistance.
Recently, another sulfur-containing compound, hydrogen sulfide (H<sub>2</sub>S), generated by L-cysteine desulfhydrase, was shown to act upstream of NO to modulate ABA-dependent stomatal closure (Scuffi et al., 2014).

# **Conclusion**

The plant leaf metabolome can boast as many as 5,000 different metabolites (Bino et al., 2004). Considering the roles of established metabolites in guard cell functions, we have entered the heyday of functional genomics, fluxomics, and systems biology toward understanding of this highly sophisticated single cell type model system. Although studies on guard cell metabolism are highly biased toward ABA and osmolytes owing to their primary importance in stomatal movement, the identification of additional critical metabolites (as shown in **Figure 1**) underlying or correlated with stomatal movement will form a solid foundation toward a broader understanding of optimal plant adaptation to environmental changes. For example, although progress in the study of stomatal movement in plant immunity has been made (Zhang et al., 2008), a deeper mechanistic understanding is required to harness the potential for generation of disease-resistant crops. Information currently available has revealed universal and diverse metabolites and pathways leading to stomatal responses.

Many years of traditional breeding have unknowingly selected varieties with cool leaf temperature in some species, i.e., larger stomatal opening for higher yield. For instance, in Pima cotton (*Gossypium barbadense* L.) and bread wheat, increased stomatal conductance led to lint and grain yield increases, respectively (Lu et al., 1998). Furthermore, in cotton, stomatal conductance and leaf cooling were significantly correlated with fruiting prolificacy and yield during the hottest period of the year (Radin et al., 1994). Understanding the functions and molecular networks of the regulatory metabolites of stomatal functions would open avenues for development of "smart" crops, providing a unique platform for endeavors at the genetic level to favor food security and human nutrition. Although we did not focus on the roles of stomatal ontogeny, shape, size, and distribution, which can also significantly affect plant water balance, growth and biomass, the engineering of stomatal development and response as a means to improve water use efficiency is an attractive approach to improve drought tolerance in crops (Schroeder et al., 2001). Guard cell metabolomics and systems biology hold the potential to unravel key molecular networks that control plant productivity and defense in a changing climate.

# **Acknowledgments**

Work on guard cell signaling in the Assmann laboratory is supported by BARD grant IS-4541-12 and by NSF grants MCB-1121612, MCB-1157921, and MCB-1412644 to SA. Research on guard cell signaling in the Chen laboratory is supported by NSF grants MCB-0818051, MCB-1158000, and MCB-1412547 to SC. Work on guard cell signaling in the Granot laboratory is supported by BARD grant IS-4541-12.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Misra, Acharya, Granot, Assmann and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Single cell-type comparative metabolomics of epidermal bladder cells from the halophyte Mesembryanthemum crystallinum

#### Bronwyn J. Barkla<sup>1</sup> \* and Rosario Vera-Estrella<sup>2</sup>

<sup>1</sup> Southern Cross Plant Science, Southern Cross University, Lismore, NSW, Australia, <sup>2</sup> Departamento de Biologia Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Mexico

#### Edited by:

Sixue Chen, University of Florida, USA

#### Reviewed by:

Steffen Neumann, Leibniz Institute of Plant Biochemistry, Germany

Jedrzej Jakub Szymanski, Weizmann Institute of Science, Israel

#### \*Correspondence:

Bronwyn J. Barkla, Southern Cross Plant Science, Southern Cross University, Military Road, PO Box 157, Lismore, NSW 2480, Australia bronwyn.barkla@scu.edu.au

#### Specialty section:

This article was submitted to Plant Systems and Synthetic Biology, a section of the journal Frontiers in Plant Science

Received: 02 May 2015; Accepted: 27 May 2015; Published: 10 June 2015

#### Citation:

Barkla BJ and Vera-Estrella R (2015) Single cell-type comparative metabolomics of epidermal bladder cells from the halophyte Mesembryanthemum crystallinum. Front. Plant Sci. 6:435. doi: 10.3389/fpls.2015.00435

One of the remarkable adaptive features of the halophyte Mesembryanthemum crystallinum is the specialized modified trichomes called epidermal bladder cells (EBC), which cover the leaves, stems, and peduncle of the plant. They are present from an early developmental stage but upon salt stress rapidly expand due to the accumulation of water and sodium. This particular plant feature makes it an attractive system for single cell type studies, with recent proteomics and transcriptomics studies of the EBC establishing that these cells are metabolically active and have roles other than sodium sequestration. To continue our investigation into the function of these unusual cells we carried out a comprehensive global analysis of the metabolites present in the EBC extract by gas chromatography Time-of-Flight mass spectrometry (GC-TOF) and identified 194 known and 722 total molecular features. Statistical analysis of the metabolic changes between control and salt-treated samples identified 352 significantly differing metabolites (268 after correction for FDR). Principal components analysis provided an unbiased evaluation of the data variance structure. Biochemical pathway enrichment analysis suggested significant perturbations in 13 biochemical pathways as defined in KEGG. More than 50% of the metabolites that show significant changes in the EBC can be classified as compatible solutes and include sugars, sugar alcohols, protein and non-protein amino acids, and organic acids, highlighting the need to maintain osmotic homeostasis to balance the accumulation of Na<sup>+</sup> and Cl<sup>−</sup> ions. Overall, the comparison of metabolic changes in salt-treated relative to control samples suggests large alterations in M. crystallinum epidermal bladder cells.

Keywords: salinity, salt-tolerance, metabolomics, halophyte, trichomes, epidermal bladder cells, Crassulacean acid metabolism, Mesembryanthemum crystallinum

## Introduction

Single cell-type studies allow insight into the highly specific processes and functions of differentiated cells. In plants, studies into the role of specialized cells have been undertaken with a particular focus on those cells in which the isolation procedure is relatively uncomplicated. This includes free moving cell-types such as pollen grains, but also cells of the epidermis, such as root hairs, guard cells, and trichomes (Misra et al., 2014). Trichomes are widely conserved across the plant kingdom but are highly divergent in relation to morphology and function. They can be simple unicellular structures like the hairs of Arabidopsis, or complex multicellular appendages made up of differentiated basal, stalk, and apical cells; with the possibility of multiple trichome types co-existing on the same organ, as is observed for tomato (Kang et al., 2009). They produce, store and, in the case of glandular trichomes, secrete a diverse range of chemical compounds including polar and non-polar metabolites, which have been implicated in adaptive responses of the plant to their biotic and abiotic environments (Schilmiller et al., 2008). Trichomes can function to deter insects and herbivores, attract pollinators, aid in seed dispersal, regulate leaf temperature, but also to store unwanted xenobiotics including heavy metals and salts (Wagner et al., 1994).

In the Aizoaceae, specialized single cell trichomes called epidermal bladder cells (EBC) are a distinctive feature of the family (Klak et al., 2003). EBC of the succulent desert halophyte and facultative Crassulacean acid metabolism plant Mesembryanthemum crystallinum are abundant on leaves and stems from an early developmental age; however, cell morphology changes with development and metabolic/stress state of the plant (Adams et al., 1992). In young unstressed plants the EBC are small and tightly appressed to the leaf surface, whereas in adult salt-treated plants the EBC become enlarged and can be balloon- or sausage-like (Oh et al., 2015). They have been shown to accumulate high concentrations of sodium and are thought to be important salt-adaptive features of the plant (Adams et al., 1998; Barkla et al., 2002). "Omics" approaches have helped to define a more encompassing role for the EBC, with proteomics and transcriptomics analysis suggesting these cells are metabolically active (Barkla et al., 2012), and show pronounced alterations in response to salt in a number of precisely defined pathways including significant changes in transcripts from networks representing Gene Ontology (GO) terms for ion transport, osmolyte accumulation, and stress signaling (Oh et al., 2015). Here we continue our systems-wide integrative investigation of EBC in the facultative CAM plant M. crystallinum by carrying out non-targeted metabolite profiling of EBC extracts from plants under control and salinity treatment regimens to obtain a snapshot of EBC metabolism. Overall, the comparison of metabolic changes in salt-treated relative to control samples suggested very large perturbations in metabolites between the treatment conditions and highlighted 13 significantly enriched biochemical pathways.

# Material and Methods

#### Plant Materials and Growth Conditions

M. crystallinum L. plants were grown from seed in soil (MetroMix 510; Sun Gro Horticulture, Bellevue, WA) in a propagation tray as previously described (Barkla et al., 2009). Three weeks following germination, individual seedlings were transplanted to pots containing the soil mixture, with two plants per 15-cm-diameter pot. Plants were watered daily and one-half strength Hoagland's medium (Hoagland and Arnon, 1938) was supplied weekly. NaCl (200 mM) treatment was initiated 6 weeks after germination for a period of 14 d. Plants were grown in a glasshouse under natural irradiation and photoperiod, with the greenhouse photosynthetic photon flux density reaching a peak value of 1300 µmol m<sup>−2</sup> s<sup>−1</sup> during the middle of the day. Temperature was maintained at 25°C ± 3°C.

### Extraction of Bladder Cell Extract

Bladder cell extract was obtained from individual cells on the leaf abaxial epidermal surface and stems by vacuum aspiration of the cell contents using a fine-gauge insulin needle (27 G, 13 mm) attached to a collection reservoir maintained on ice. The needle was oriented horizontally to the leaf or stem axis to avoid removing sap from underlying tissue and the procedure was visualized using a Nikon SMZ645 stereo microscope equipped with a dual arm Nikon MKII fiber optic light source (Nikon, México). The extracted liquid from approximately 2000 EBC (estimated by counts of cell yield per leaf/stem section) from a single control or salt-treated plant was pooled to obtain a single biological replicate. Seven biological replicates were collected per treatment. Plants were maintained in the dark prior to extraction and collection was undertaken early in the morning.

#### Sample Preparation for Metabolomics Analysis

Bladder cell extract was aliquoted into 1.5 ml Eppendorf tubes and the extract was evaporated in a Labconco Centrivap concentrator (Kansas City, MO, USA) to complete dryness. The dried extract was then resuspended in the indicated prechilled (to −20°C) nitrogen-degassed extraction solution. Samples were vortexed for 10 s, placed on an orbital shaker at 4°C for 6 min, and finally dried in the Labconco Centrivap cold trap concentrator to complete dryness prior to derivatization. Two experimental replicates were obtained for each biological replicate.

#### Gas Chromatography Time-of-flight Mass Spectrometry (GC-TOF)

GC-TOF was performed by West Coast Metabolomics at the UC Davis Genome Centre (Davis CA, USA). Mass spectrometry was performed using a Pegasus III Time Of Flight (TOF) mass spectrometer (LECO, St. Joseph, MI, USA) coupled to an Agilent 6890N gas chromatograph (Agilent Technologies) equipped with a Gerstel autosampler, including a MPS PrepStation and Automated Liner EXchange (ALEX) (Gerstel, Muehlheim, Germany). The liner was changed after every 10 samples (using the Maestro1 Gerstel software, version 1.1.4.18). Before and after each injection, the 10 µl injection syringe was washed three times with 10 µl ethyl acetate. The gas chromatograph was equipped with a 30 m long, 0.25 mm i.d. Rtx-5Sil MS column (0.25 µm 95% dimethyl 5% diphenyl polysiloxane film) with an additional 10 m integrated guard column (Restek, Bellefonte PA, USA). Pure helium (99.9999%) with built-in purifier (Airgas, Radnor PA, USA) was set at a constant flow of 1 ml/min. The oven temperature was held constant at 50°C for 1 min and then ramped at 20°C/min to 330°C, at which it was held constant for 5 min. The injector was set to a temperature of 250°C and was used in split mode with a 10:1 ratio. The injection volume was 1 µl. The transfer line temperature was set to 280°C (GC re-entrant temperature). The mass spectrometer, controlled by Leco ChromaTOF software, version 2.32 (St. Joseph, MI, USA), was operated in positive EI mode. All EI spectra were collected using an electron energy of 70 eV, trap current of 250 mA, emission current of 600 mA, filament current of 4.5 A, and source temperature of 250°C. Acquisition rate was 17 spectra/s, with a scan mass range of 85–500 Da. Spectra were deconvoluted from co-eluting peaks with ChromaTOF; this software detects peaks in an unbiased way and exports one deconvoluted spectrum per peak.
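
As a worked check on the oven program above, the run time and the approximate number of spectra per injection can be derived from the stated parameters. The sketch below assumes that acquisition spans the entire oven program with no solvent delay, which the text does not state; the function name is hypothetical.

```python
def oven_program_minutes(start_c, hold_start_min, ramp_c_per_min, end_c, hold_end_min):
    """Total duration of a single-ramp GC oven program, in minutes."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return hold_start_min + ramp_min + hold_end_min


# Values taken from the method description: 50 C for 1 min,
# 20 C/min to 330 C, then a 5 min hold.
run_min = oven_program_minutes(50, 1, 20, 330, 5)

# Assumption: spectra are acquired at 17 spectra/s for the whole run.
spectra_per_run = run_min * 60 * 17

print(run_min)          # 20.0
print(spectra_per_run)  # 20400.0
```

The ramp from 50°C to 330°C takes 14 min, so each injection occupies about 20 min of oven time before cool-down.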

#### Metabolite Identifications

Following acquisition of data, a BinBase relational database system was used to allow for automated metabolite annotation. Metabolites were grouped into either a list of "identified compounds" or a list of "unknown compounds" and assigned as previously described (Lee and Fiehn, 2008). Accordingly, both groups were unambiguously assigned to BinBase identifier numbers using the identification criteria of retention index and mass spectrum. Additional confidence criteria were given by mass spectral metadata, using the combination of unique ions, apex ions, peak purity, and signal/noise ratios. Each BinBase identifier was routinely matched against the Fiehn lab mass spectral library (Fiehn et al., 2005). For named BinBase compounds, PubChem numbers and KEGG identifiers were added. In addition, for all reported compounds (identified and unknown metabolites) the quantification ion and the full mass spectrum encoded as a string are reported.

#### Data Analysis and Accessibility

Statistical analyses were conducted on natural logarithm transformed metabolic parameters, and data summaries are presented for raw data values. HCA of samples and PCA were implemented on autoscaled data (mean centered and scaled by the standard deviation, z-scaled). The data obtained in this study will be accessible at the NIH Common Fund's Data Repository and Coordinating Center (supported by NIH grant U01-DK097430) website, http://www.metabolomicsworkbench.org.
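The two transformations named above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names and the peak intensities are made up, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def ln_transform(values):
    """Natural-logarithm transform, as used before statistical testing."""
    return [math.log(v) for v in values]


def autoscale(values):
    """Mean-center and scale by the standard deviation (z-scaling).

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical peak intensities for one metabolite across six samples.
raw = [1200.0, 950.0, 1430.0, 5100.0, 4700.0, 6050.0]
logged = ln_transform(raw)
scaled = autoscale(raw)
```

After autoscaling, every metabolite has mean 0 and unit standard deviation, so abundant and trace metabolites contribute on the same scale to HCA and PCA.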

# Results

#### Extraction of Metabolites

Untargeted, global metabolite profiling studies aim to analyse as many small-molecular-weight species as possible with a single sample extraction protocol. However, due to the large variety of metabolites with different chemical stability, solubility, and polarity, the choice of sample-preparation method can greatly influence the observed metabolite content of a cellular extract. Ideally the method should be as non-selective as possible to ensure maximal depth of coverage. We first established the optimal sample extraction protocol for GC-TOF of EBC extracts that would allow us to obtain the greatest coverage of the EBC metabolome. Three solvent extraction methods were tested: 9:1 methanol/chloroform, 5:2:2 methanol/chloroform/water and 10:3:1 methanol/chloroform/water. Varying the ratio of aqueous and organic solvents can influence the recovery of polar and nonpolar metabolites. Supplementary Table 1 shows the metabolite profile of EBCs extracted using the different solvent regimes and two different volumes (200 and 500 µl). Across all extracts we were able to identify 668 different molecular features (175 known and 493 unknown). However, the amounts extracted varied between the different procedures (Supplementary Table 1), suggesting a variation in recovery efficiency similar to what was shown for yeast metabolite extraction (Tambellini et al., 2013). Based on this information we selected the 5:2:2 methanol/chloroform/water solvent mixture, which has been shown to extract comparatively high levels of both polar metabolites and non-polar fatty acids (Lee and Fiehn, 2008).

#### Alteration in Metabolites in Salt-Treated EBC Samples

To explore the effect of salinity on EBC metabolite levels, samples from 7 independent biological replicates for untreated and salinity-treated plants were collected, extracted with 5:2:2 methanol/chloroform/water, and metabolic parameters (194 known and 722 total molecular features) were compared (Supplementary Table 2). A two-sample Student's t-test was used to assess the significance of the difference between the control and salt-treated samples. The p-values were adjusted for multiple testing to allow for a maximum 5% probability (q = 0.05) of false positives (Benjamini and Hochberg, 1995). The FDR was also directly estimated as the q-value for all comparisons (Klaus and Strimmer, 2012). Statistical test results are reported along with metabolite averages, standard deviations and fold changes of means in Supplementary Table 2. A total of 352 significantly differing metabolites (268 after correction for FDR) were identified (Supplementary Table 2). Of the known metabolites, 28 showed a statistically significant down-regulation of more than 1.5-fold (**Table 1**) and 28 showed a statistically significant up-regulation of more than 1.5-fold (**Table 1**). The top 10 known, significantly down- and up-regulated metabolites are depicted in box plots (**Figures 1A,B**). Of the significantly changing unknown metabolites, nine showed more than a 5-fold increase in the salt-treated samples compared to the control samples, while an additional 13 were shown to be significantly down-regulated by more than 5-fold (Supplementary Table 2).
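To make the multiple-testing step concrete, the following sketch applies the Benjamini and Hochberg (1995) step-up adjustment to a list of p-values and computes a fold change as the ratio of group means (salt-treated over control, as defined in the Table 1 footnote). The function names and the example numbers are hypothetical, and the t-test itself is not reproduced here; this is an illustration of the procedure, not the code used in the study.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fold_change(treated, control):
    """Fold change of means: mean treated value / mean control value."""
    return (sum(treated) / len(treated)) / (sum(control) / len(control))


# Made-up p-values for four metabolites.
example_p = [0.01, 0.04, 0.03, 0.20]
example_adjusted = benjamini_hochberg(example_p)
significant = [p_adj <= 0.05 for p_adj in example_adjusted]
```

With these made-up values only the first metabolite stays below the 0.05 threshold after adjustment, which mirrors how the count of significant features drops once the FDR correction is applied.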

#### Hierarchical Cluster Analysis

Hierarchical cluster analysis (HCA) was used to group samples and metabolites based on similarities in autoscaled values and correlations, respectively (Sugimoto et al., 2012). Distance was calculated based on the Euclidean method, linkage was done using the Ward method (Ward, 1963), and variable similarities were based on Spearman correlations. Multivariate sample similarities, displayed as a heatmap, are shown in **Figure 2**. Separation of experimental treatments or sample classes into non-overlapping clusters suggests large metabolic differences between the comparisons, whereas separation of similar classes of samples into different clusters may suggest high biological or analytical variability.
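The two similarity measures that feed the clustering can be sketched as follows: Euclidean distance between sample profiles and Spearman rank correlation between metabolite profiles. All function names are hypothetical and the sketch stops short of the Ward linkage step, which in practice would normally be delegated to a clustering library rather than written by hand.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long sample profiles."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Spearman correlation works on ranks, two metabolites that rise together monotonically across samples score 1 even when the relationship is not linear.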

TABLE 1 | Metabolites which were significantly down- or up-regulated by more than 1.5-fold in EBC extracts from salt-treated plants compared to control plants.



<sup>a</sup>Fold change: mean salt-treated value/mean control value.

<sup>b</sup>p-values: significance levels for the two-sample Student's t-test were computed to assess the significance of the difference between salt-treated samples and control samples.

<sup>c</sup>Adjusted p-values: the significance levels of the test statistics were adjusted for the multiple hypotheses tested to allow for a maximum 5% probability of false positives (pFDR-adjusted p-values).

<sup>d</sup>q-values: estimate of the FDR for all comparisons.

FIGURE 1 | Box plots of the top 10 known, significantly down- and up-regulated metabolites (A,B). The test statistics (i.e., p-values) were adjusted for the multiple hypotheses tested to allow for a maximum 5% probability (q = 0.05) of false positives. Box and whisker plots showing natural logarithm transformed values are used to visualize the group means and standard deviations.

FIGURE 2 | Hierarchical cluster analysis (HCA) displayed as a heatmap in which columns represent samples and rows represent metabolites. The heatmap visualization is used to encode individual measurements for each sample (autoscaled) as colors (red, relative increase; blue, relative decrease), as indicated by the color bar at the top right of the figure, all of which are organized using HCA. Metabolites are ordered on the heatmap in the order they appear in the list in Supplementary Table 2.

#### Principal Component Analysis

Principal component analysis (PCA) provided an unbiased evaluation of the data variance structure. **Figure 3** shows the PCA for the dataset of EBC extracts from control and salt-treated plants for the first two principal components. The samples separate clearly into two distinct groups that reflect the treatment conditions. The first principal component (PC1) explained the greatest variance (35%) across the data and separates the samples based on treatment. The second principal component (PC2) separated the samples based on sample replicates and accounted for 16% of the variance.
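The percentage attached to PC1 is the largest eigenvalue of the covariance matrix divided by the total variance. The sketch below computes that fraction with plain power iteration so that it needs no external libraries; the function names are hypothetical, and power iteration is our choice for illustration, not the method used in the study. It can converge slowly when the two largest eigenvalues are close, so a dedicated linear algebra routine would be preferable for real data.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix; rows are samples, columns are variables."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    p = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            return 0.0
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the converged unit vector.
    mv = [sum(matrix[a][b] * vec[b] for b in range(p)) for a in range(p)]
    return sum(v * w for v, w in zip(vec, mv))


def pc1_variance_fraction(rows):
    """Fraction of total variance explained by the first principal component."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    return leading_eigenvalue(cov) / total
```

For autoscaled data the total variance equals the number of variables, so a PC1 value of 35% means the leading eigenvalue is 0.35 times the number of metabolic parameters.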

#### Biochemical Pathway Enrichment

Pathway enrichment analysis was used to determine whether the metabolites that changed significantly between control and salt-treated EBC samples were significantly enriched in members of biochemical pathways as defined by the KEGG Database. Significant enrichment in KEGG pathways was identified using MBRole (Chagoyen and Pazos, 2011). The hypergeometric test coupled with adjustment for FDR (Benjamini and Hochberg, 1995) was used to identify 13 significantly enriched pathways (**Table 2**), which included aminoacyl-tRNA biosynthesis, glutathione metabolism, biosynthesis of alkaloids derived from histidine and purine, C5-branched dibasic acid metabolism, biosynthesis of alkaloids derived from ornithine, lysine and nicotinic acid, beta-alanine metabolism, biosynthesis of plant hormones, biosynthesis of phenylpropanoids, butanoate metabolism, galactose metabolism, cyanoamino acid metabolism and glyoxylate and dicarboxylate metabolism.
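The hypergeometric test behind this kind of enrichment analysis asks how likely it is to see at least the observed number of pathway members among the changed metabolites if they had been drawn at random from the background set. The sketch below is a generic illustration with a hypothetical function name and made-up counts; it is not a description of MBRole's internal implementation, and the resulting p-values would still need the FDR adjustment mentioned above.

```python
from math import comb


def hypergeom_enrichment_p(background, in_pathway, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of metabolites in the background set
    in_pathway: background metabolites annotated to the pathway
    selected:   number of significantly changed metabolites
    overlap:    changed metabolites that belong to the pathway
    """
    upper = min(in_pathway, selected)
    total = comb(background, selected)
    tail = sum(comb(in_pathway, k) * comb(background - in_pathway, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Made-up example: 10 background metabolites, 4 in the pathway,
# 3 changed metabolites, 2 of which are pathway members.
example_p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

In this toy case the probability is 40/120, or one third, so an overlap of two would not count as enriched.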

# Discussion

Within the plant kingdom there are believed to be an estimated 200,000 metabolites (Fiehn, 2002). However, species specificity, specialized organ, tissue and cellular distribution, spatiotemporal factors, and rapid metabolite turnover, combined with the difficulty of capturing and detecting the metabolites, severely limit the number of metabolites that can be detected in an organism at any one time. Employing multiple analytical approaches can increase the number of metabolites detected, but simply optimizing the extraction protocol can also influence the success of detection. In this study, following extraction optimization steps, we were able to identify a total of 722 metabolites in single cell-type EBC extracts, consisting of 194 known and 528 unidentified molecular features. The number of known metabolites detected in EBC extracts was higher than in previous reports of single cell-type global metabolite profiling of leaf trichomes, including glandular trichomes from Solanum (119 known compounds/LC-MS/3:3:2 propanol:acetonitrile:water) and simple trichomes from Arabidopsis (119 compounds/GC-TOF/1:2.5:1 chloroform:methanol:water) (Ebert et al., 2010; McDowell et al., 2011). This may be a result of selecting a more optimized extraction protocol, or related to cell-type specific metabolic diversity, but could also be due to improvements in bioinformatics pipelines for the chemical assignment of unknown ions in metabolome data.

TABLE 2 | Pathways which are significantly enriched for the metabolites.


Significant using adjusted p-value <= 0.05. Arabidopsis thaliana was used as a background set.

In order to gain greater insight into the biochemical function and underlying biological roles of the EBC we studied the effect of salt treatment on the EBC metabolites. Statistical analysis detected significant changes in 37% of the total metabolites (Supplementary Table 2). Of the 194 known metabolites identified we were able to detect significant changes in 57 (30%) of these upon salt treatment (**Table 1**), suggesting an extensive alteration in the metabolome in response to salinity. The most highly increased metabolite (more than 12-fold) was the non-protein amino acid pipecolic acid, shown to be a critical regulator of inducible plant immunity and resistance to bacterial pathogens (Návarová et al., 2012), but also known to accumulate upon salt stress in Limonium vulgare (Stewart and Larher, 1980). Pipecolic acid is an analog of proline, sharing a similar chemical structure (Ganapathy et al., 1983). Proline also showed high accumulation in salt-stressed EBC (**Table 1**). In fact, more than 50% of the metabolites that show significant changes in the EBC can be classified as compatible solutes and include sugars, sugar alcohols, protein and non-protein amino acids, and organic acids (**Figures 4A,B**, **Table 1**) (Slama et al., 2014). These data correlate well with gene ontology enrichment analysis of differentially expressed transcripts in the EBC, with the GO term networks "small molecule metabolic process" and "carboxylic acid metabolic process" significantly enriched in the salt-treated EBC transcripts as compared to the control (Oh et al., 2015). The accumulation of osmotically active non-toxic organic compounds in the cytosol increases the cellular osmotic potential to provide a balance between the cytoplasm and the vacuolar lumen, which in M. crystallinum can accumulate up to 1 M Na<sup>+</sup> (and Cl<sup>−</sup>) (Adams et al., 1992). Both proline and pinitol have previously been shown to increase in EBC from salt-treated plants (Paul and Cockburn, 1989; Adams et al., 1998), and RNAseq results suggested that salt treatment induces metabolic pathways that lead to synthesis, accumulation, transport and conversion of compatible solutes (Oh et al., 2015). Specifically, increases in pinitol can be linked to increases in transcripts encoding myo-inositol-1-phosphate synthase and myo-inositol O-methyltransferase 1, two key enzymes in the pathway leading to pinitol synthesis via ononitol, which were highly abundant and significantly upregulated in the EBC transcriptome (up to 170-fold) following salinity treatment (Oh et al., 2015). Expression of transcripts encoding two key proline biosynthesis enzymes, delta-1-pyrroline-5-carboxylate synthase and pyrroline-5-carboxylate reductase, was also salt-induced in the EBC (Oh et al., 2015), linking changes in other metabolites, including ornithine and alpha-ketoglutaric acid (**Table 1**), to proline accumulation (**Figure 4C**).

A high number of organic acids were identified as significantly altered in the EBC following salt treatment (**Table 1**, **Figure 4B**). As well as their possible function as metabolically active solutes for osmotic adjustment, organic acid metabolism is of fundamental importance at the cellular level for several biochemical pathways, including as photosynthetic intermediates in CAM (Cushman and Bohnert, 1997), and for the formation of precursors for amino-acid biosynthesis (López-Bucio et al., 2000). It is likely that changes in abundance of organic acids reflect all these roles. One of the most highly altered organic acids was the C<sub>4</sub> dicarboxylic acid maleate (maleic acid, 9.41-fold increase), the cis-isomer of fumaric acid. It is considered an uncommon plant metabolite (Fiehn et al., 2000), and its biological role is not clear, other than as a possible citric acid cycle intermediate. It was shown to be negatively correlated with Na<sup>+</sup> concentration in roots from salt-treated barley seedlings (Wu et al., 2013), and was a cold shock inducible metabolite identified in Arabidopsis shoots (Kaplan et al., 2004), as well as being identified in Arabidopsis and cotton seed trichomes (Ebert et al., 2010; Naoumkina et al., 2013), but these reports give little insight into its function in the plant.

Amino acids are the predominant form of solute accumulated by a phylogenetically diverse range of salt-tolerant organisms such as salt-tolerant bacteria, halophytes, marine invertebrates, and hagfishes (Yancey et al., 1982). However, only selected protein amino acids appear to be utilized. In EBC there was an overall net increase in amino acids (**Table 1**), with the highest increase (approx. 3-fold) observed for asparagine and valine. Both these amino acids have been reported to increase upon salt-treatment (Rabe, 1990; Martinelli et al., 2007; Nedjimi, 2011; Zhang et al., 2011) and are implicated as nitrogen stores during stress to maintain metabolic homeostasis (Mansour, 2000; Rabe, 1990).

Both ascorbic acid and its oxidized form dehydroascorbic acid (DHA) are significantly altered in salt-treated EBC, with ascorbate decreasing and dehydroascorbic acid increasing. In plants ascorbate is the most abundant water-soluble antioxidant (Smirnoff, 2000) and participates in cellular redox state homeostasis by maintaining ROS below toxic levels but also through temporal–spatial coordination of ROS for a multitude of signaling pathways (Baxter et al., 2013). Salinity stress results in the increased production of ROS (Abogadallah, 2010) and the recycling of ascorbic acid may be critical for maintaining ROS at a level within the EBC that minimizes oxidative damage but permits signaling function (Gallie, 2013). Ascorbic acid reacts with ROS and is oxidized to the monodehydroascorbate (MDHA) radical, which can rapidly disproportionate non-enzymatically to produce dehydroascorbic acid and recycled ascorbic acid (Smirnoff, 2000). The accumulation of dehydroascorbic acid in the EBC may reflect this recycling process. Oxidation of ascorbate has also been implicated in cell growth through the generation of MDHA radicals, and downstream stimulation of the plasma membrane proton ATPase, which acidifies the extracellular space resulting in cell wall loosening (Kato and Esaka, 2008).

Sinapyl alcohol, one of the main building blocks of lignin (Vanholme et al., 2010), is increased in the EBC from salt-treated plants. Lignin is deposited in the secondary cell walls of all vascular plants, forming cross-links with other cell wall components to create rigidity. Increased lignification may be essential for the structural integrity of the swelling bladder cells to provide increased support as their volume increases with the accumulation of ions and water. Additionally, increases in the fatty alcohol dodecanol, a component of the waxy cuticular film that is present on the aerial surface of plants (Kunst and Samuels, 2009), would help to protect the swelling bladder cells, limiting non-stomatal water loss, while increases in fatty acids, such as is observed for monopalmitin under salt stress, would serve as precursors to fatty alcohol synthesis in the EBC (Kunst et al., 2006). Salt-induced increases in transcripts for enzymes involved in biosynthesis of plant cuticular wax, including fatty acid hydrolase, were also observed in the EBC transcriptome (Oh et al., 2015).

Comparison of global metabolic changes upon salt stress with other halophytes, including the legume Lotus creticus (Sanchez et al., 2011), and the salt-tolerant Arabidopsis relative Thellungiella salsuginea (Lugan et al., 2010), highlights a similarity in the classes of metabolites which show highest alteration upon salinity treatment, in particular, sugars, sugar alcohols, organic acids, and amino acids. This is not surprising as compatible solute synthesis is a well-documented tolerance response in most plants to salinity (Hasegawa et al., 2000), but there was little agreement in the specific metabolites that change, implying the existence of differential metabolic arrangements between species and possibly cell types, to compensate for ion imbalance (Flowers and Colmer, 2008).

Of the chemically unidentified metabolites, 21 showed changes of more than 5-fold in the salt-treated EBC samples compared to the control samples. These could potentially make important contributions to the cell-type specific adaptations to salt in M. crystallinum EBC. Recent advances in linking functionally identified genes to unknown metabolites by genome-wide association studies, integrating the data from genetic associations and metabolic networks with biochemical pathway information, have been successful in both humans (Krumsiek et al., 2012) and plants (Luo, 2015) for annotation of unidentified metabolites and will provide a means in the future to accelerate research in this area.

### References


#### Acknowledgments

Funding for this work was provided in part by DGAPA IN202514 and CONACYT P-178232 to R.V-E.

#### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00435/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Barkla and Vera-Estrella. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.